
Creating a ‘Talking AI Avatar’ from a Single Photo

ETRI Develops Next-Generation Interaction Technology

The Electronics and Telecommunications Research Institute (ETRI) announced on the 15th that it has developed an artificial intelligence (AI) avatar that speaks naturally, like a real person, and is generated from just a single portrait photograph.

Conventional AI voice assistants and navigation systems simply recognize and execute commands. The newly developed technology, by contrast, closely reproduces mouth movements and facial expressions, so that interacting with it feels like conversing with an actual person. This opens up scenarios such as an in-vehicle AI driving assistant that converses naturally with the driver, or that makes eye contact and communicates with pedestrians.

The research team built the AI avatar on a proprietary algorithm that selectively learns and synthesizes the areas of the face most closely associated with speech, such as the lips and jaw. This approach avoids learning from unnecessary data and allows details such as mouth movements, teeth, and skin wrinkles to be rendered precisely.
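
ETRI has not published the details of its algorithm, so the following is only an illustrative sketch of the general idea of focusing learning on speech-related regions, not the team's actual method. It assumes a training error that is computed only inside a rectangular "mouth" region of a grayscale frame; the names `region_mask` and `masked_l1_loss`, the box coordinates, and the tiny 4x4 frames are all hypothetical.

```python
from typing import List, Tuple

Image = List[List[float]]          # grayscale frame: rows of intensities in [0, 1]
Box = Tuple[int, int, int, int]    # (top, left, bottom, right), bottom/right exclusive


def region_mask(height: int, width: int, box: Box,
                inside: float = 1.0, outside: float = 0.0) -> Image:
    """Build a per-pixel weight map: `inside` within the box, `outside` elsewhere."""
    top, left, bottom, right = box
    return [
        [inside if top <= r < bottom and left <= c < right else outside
         for c in range(width)]
        for r in range(height)
    ]


def masked_l1_loss(pred: Image, target: Image, mask: Image) -> float:
    """Weighted mean absolute error; pixels with zero weight do not contribute."""
    total = 0.0
    weight_sum = 0.0
    for pred_row, target_row, mask_row in zip(pred, target, mask):
        for p, t, w in zip(pred_row, target_row, mask_row):
            total += w * abs(p - t)
            weight_sum += w
    return total / weight_sum if weight_sum > 0 else 0.0


# Hypothetical 4x4 example: the "mouth" box covers rows 2-3, columns 1-2.
target = [[0.0] * 4 for _ in range(4)]
pred = [
    [1.0, 1.0, 1.0, 1.0],   # large error outside the mouth region (ignored)
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.5, 0.5, 0.0],   # error inside the mouth region (counted)
    [0.0, 0.5, 0.5, 0.0],
]
mouth = region_mask(4, 4, (2, 1, 4, 3))
mouth_loss = masked_l1_loss(pred, target, mouth)
```

In this sketch, errors outside the box carry zero weight, so a model trained against such a loss would spend its capacity on the lips and jaw rather than on the static parts of the portrait. A real system would presumably derive the region from facial landmarks rather than a fixed box, which is again an assumption rather than something the announcement states.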

According to ETRI, the technology outperformed methods presented at major international conferences such as CVPR and AAAI in clarity, naturalness, and lip synchronization.

Beyond autonomous vehicles, the technology has potential applications in a variety of industries, including kiosks, bank counters, news presentation, and advertising. It holds promise as a core technology for the digital human industry, capable of emotional interaction rather than simple information delivery.

Yoon Dae-seop, director of ETRI’s Mobility UX Laboratory, said, “As mobility technology advances, there is a risk of neglecting the elderly and socially disadvantaged,” and expressed hope that AI avatar technology will evolve into a smart mobility service accessible to all.

Choi Dae-woong, a senior researcher at ETRI and the project lead, said the team plans to further enhance the technology so that AI avatars can converse and move naturally like real people, with the aim of eventually supporting interactions robust enough to take over some human roles in order-taking and consultation.

Currently, this technology is registered on ETRI’s technology transfer site as “Realistic Person Speech Video Generation Framework Technology.” The research team is actively pursuing technology transfer and commercialization strategies for various industrial applications.
