Microsoft AI creates talking deepfakes from single photo

Microsoft Research Asia has unveiled an AI model that can generate realistic, talking deepfake videos from a single still image and an audio track.

The model was trained on footage of approximately 6,000 talking faces from the VoxCeleb2 dataset, and can animate a still image so that it lip-syncs to a supplied voice track, producing realistic facial expressions and natural head movements.

The technology, called VASA-1, can reportedly generate lip-synced video at 512x512 resolution and up to 40 frames per second with negligible starting latency.
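As a rough back-of-the-envelope illustration (not part of Microsoft's announcement), the short Python sketch below works out the per-frame time budget and uncompressed frame size implied by those reported figures; the 8-bit RGB assumption is ours.

    # Back-of-the-envelope check of the reported VASA-1 throughput figures.
    # Uses the stated 512x512 resolution and 40 fps; everything else is derived.

    RESOLUTION = 512          # pixels per side, as reported
    FPS = 40                  # frames per second, as reported
    BYTES_PER_PIXEL = 3       # uncompressed 8-bit RGB (assumption for illustration)

    frame_budget_ms = 1000 / FPS
    frame_size_mb = (RESOLUTION * RESOLUTION * BYTES_PER_PIXEL) / 1e6

    print(f"Per-frame generation budget: {frame_budget_ms:.1f} ms")   # 25.0 ms
    print(f"Uncompressed frame size:     {frame_size_mb:.2f} MB")     # ~0.79 MB

In other words, sustaining 40 frames per second leaves about 25 milliseconds to generate each 512x512 frame, which is why the claimed real-time, low-latency performance is notable.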

Photo credit: Microsoft Research Asia