The line between the real and the unreal is blurring thanks to a new AI tool from Microsoft.
The technology, called VASA-1, converts a still image of a person’s face into an animated video of that person speaking or singing.
The software company claims that the lip movements are “highly synchronized” with the audio, making the subject appear to come to life.
In one example, Leonardo da Vinci’s Mona Lisa, painted in the 16th century, begins to speak in a rough American accent.
Microsoft is not releasing the technology to the public, saying that it could be misused to impersonate people.
VASA-1 takes a still image of a face, whether it is a photograph of a real person or a fictional character depicted in art.
It then “carefully” synchronizes this with speech audio from “anyone” to animate the face.
Because it was trained on a database of facial expressions, the AI can even animate a still image in real time, as the speech is being spoken.
Microsoft researchers described VASA in a blog post as “a framework for creating a variety of speech for virtual characters.”
“Our method not only produces valuable lip-voice synchronization, but is also capable of capturing various emotions and nuances of expressive faces and natural head movements that contribute to the perception of realism and liveliness.”
The team believes that VASA-1 can enable digital AI avatars to interact with us in a way that is “natural and intuitive like interacting with real people.”
Fraud is another concern, as people may receive calls online that appear to come from someone they trust.
According to ESET security expert Jake Moore, “seeing is not believing.”
VASA-1 “is not intended to create content that is used to mislead or deceive,” said Microsoft experts, anticipating concerns from the public.
“However, like other methods of content creation, it can still be misused to target people,” they said.
“We want to use technology to detect fraud and challenge actions that create false or harmful content of real people.”
Microsoft admits that current methods fall short of “achieving the authenticity of natural speaking faces,” but artificial intelligence (AI) is advancing rapidly.