Hear ‘Mona Lisa’ recite a famous Shakespeare monologue: Chinese engineers get a still picture to sing and talk using an AI app called Emote Portrait Alive

Chinese engineers at the Institute for Intelligent Computing, Alibaba Group, have developed an AI app called Emote Portrait Alive (EMO) that animates a still photo of a face and synchronizes it to an audio track.

The technology relies on the generative capabilities of diffusion models (generative AI models that learn to create images or video by gradually removing noise from a random starting point), which can directly synthesize character head videos from a provided image and any audio clip. This process bypasses the need for complex pre-processing or intermediate representations such as 3D face models or facial landmarks, thus simplifying the creation of talking head videos.
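
For readers curious about what "removing noise" means in practice, here is a minimal sketch of the core arithmetic behind diffusion models in general. It is not the Alibaba team's code, and it assumes a common textbook setup (a linear noise schedule over 1,000 steps). The function names `make_alpha_bars`, `add_noise`, and `recover_clean` are made up for illustration, and the three-number list merely stands in for image pixels. In a real system, a trained neural network would estimate the noise; here the true noise is reused to show that the arithmetic inverts cleanly.

```python
import math
import random


def make_alpha_bars(num_steps, beta_start=1e-4, beta_end=0.02):
    """Return how much of the original signal survives at each step.

    Assumes a linear noise schedule; the values fall from near 1 toward 0.
    """
    alpha_bars = []
    running = 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / max(num_steps - 1, 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(x0, alpha_bar, rng):
    """Forward process: blend clean values with Gaussian noise."""
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [
        math.sqrt(alpha_bar) * x + math.sqrt(1.0 - alpha_bar) * n
        for x, n in zip(x0, noise)
    ]
    return xt, noise


def recover_clean(xt, predicted_noise, alpha_bar):
    """Undo the blend, given an estimate of the noise that was added."""
    return [
        (x - math.sqrt(1.0 - alpha_bar) * n) / math.sqrt(alpha_bar)
        for x, n in zip(xt, predicted_noise)
    ]


rng = random.Random(0)                # fixed seed so runs are repeatable
alpha_bars = make_alpha_bars(1000)
pixels = [0.2, -0.5, 0.9]             # hypothetical stand-in for pixel values
noisy, true_noise = add_noise(pixels, alpha_bars[500], rng)
restored = recover_clean(noisy, true_noise, alpha_bars[500])
```

Training a diffusion model amounts to teaching a network to guess `true_noise` from `noisy`. A talking-head system would additionally feed that network the reference photo and the audio so the recovered frames match both.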