Microsoft introduces AI VASA-1 that can help portraits talk and sing

Microsoft Research Asia recently revealed a new AI tool called VASA-1 that is capable of converting a still photo or drawing of a person into a realistic video of that person talking and singing.

The technology can generate facial expressions, emotions, subtle facial nuances, and natural head movements from an existing image. It also creates lip movements that match the accompanying audio.

The VASA-1 tool was trained on the VoxCeleb2 dataset, which includes "over 1 million utterances from 6,112 celebrities". Microsoft has successfully tested VASA-1 on both real images and art such as the Mona Lisa.

VASA-1 can produce high-resolution video (512 x 512 pixels) at high frame rates, researchers said: 45 frames per second in offline mode and 40 frames per second in online mode.

However, many users are concerned that Microsoft's new AI toolkit could be abused to create deepfake videos.

To prevent this, researchers from Microsoft Research Asia have decided not to release any products related to this technology until appropriate liability protection measures are in place.

Still, researchers are hopeful about the potential of the new AI technology to help enhance educational experiences, assist people with communication difficulties, and provide companionship and therapeutic support to those in need. It also opens the door for programs to convey information through talking AI characters.

Update 22 April 2024