A new technique using artificial intelligence to manipulate video content gives new meaning to the expression “talking head.”
An international team of researchers showcased the latest advancement in synthesizing facial expressions—including mouth, eyes, eyebrows, and even head position—in video at this month’s 2018 SIGGRAPH, a conference on innovations in computer graphics, animation, virtual reality, and other forms of digital wizardry.
The project is called Deep Video Portraits. It relies on a type of AI called generative adversarial networks (GANs) to modify a “target” actor based on the facial and head movement of a “source” actor. As the name implies, GANs pit two opposing neural networks against one another to create a realistic talking head, right down to the sneer or raised eyebrow.
In this case, the adversaries are actually working together: One neural network generates content, while the other rejects or approves each effort. The back-and-forth interplay between the two eventually produces a realistic result that can easily fool the human eye, including reproducing a static scene behind the head as it bobs back and forth.