The BBC, one of the most trusted sources of news in the world, recently raised the issue of video manipulation and face mapping as part of the growing rise in so-called “fake news”.
First, a video showed newsreader Matthew Amroliwala, who speaks only English, apparently speaking fluent Spanish, Mandarin and Hindi. This was made possible by extremely convincing face mapping technology, which transplants one person's words onto another's face as it moves to speak.
Amroliwala called the resulting video “incredible and unsettling” – probably a good summary of the conflicting feelings that many will have upon seeing this type of technology in use. The techniques that go into it are highly advanced and very impressive, but in a case like this, it’s not hard to imagine the nefarious uses that people could conjure up for it.
The video is truly convincing – you would almost certainly believe it if you didn’t know any better. The technology, the video says, “maps and manipulates [Amroliwala’s] lips to match the dubbed voice”, and as a result, there are very few obvious gaps and inconsistencies.
Next, the BBC’s media editor Amol Rajan – someone with a good reason to worry about the rise of misinformation and “fake news” – tested out similar software. For those looking to deceive others, he said, it could be “devastatingly effective”. It could also, Rajan pointed out, be extremely useful for the creative industries, where existing dubbing for foreign language films is “not exactly subtle”.
Face mapping and AI
As Rajan points out, this technology is in its infancy, meaning it is expensive and quite time-consuming. But as we have seen, the end product is already very effective, so with further development it may soon reach a point where even the best-trained eye can't tell fake from real.
And a year ago, the BBC also highlighted a video made by university researchers in which they created fake footage of former US president Barack Obama. The technology has clearly moved on in that time; but even then, we have seen how easily people can fall for what many would consider obviously “fake” news, and the impact that it can have on our faith in the media and politics more generally.
But how does it work, and why have companies decided it is worth investing in? Synthesia, the startup that makes the technology used by the BBC to make Amroliwala appear to speak several different languages, says it “empowers storytellers with AI”.
“As early pioneers of video synthesis technology,” the company’s website says, “we are excited about the possibilities that synthetic media will bring to visual content creation. Generative AI will reduce the cost, skill and language barriers to creating, distributing and consuming content.”
It calls its technique "native dubbing", which, it says, is a "new method of translating video content that utilises AI to synchronise the lip movements of an actor to a new dialogue track". The company's aim is to stop language from being a barrier to people all over the world enjoying video content.
Two faces of the same coin
Its creators, then, see a simple, innocent, and commercially viable use for the product. The BBC, on the other hand, has expressed fears about how this type of technology could be used to fool people.
But others think that the ability to accurately map people's faces and identify them could create a more secure, safer world. 3D face mapping, biometrics consultant Steve Cook writes, could be a "true differentiator" in biometric liveness detection.
3D face mapping, Cook says, “contains 100 times more data points than a 2D photo”, and is “required to accurately recognise the correct user’s face while concurrently verifying their human liveness”. This “liveness check”, he says, is particularly important in “unsupervised authentication scenarios such as confidential account access management”.
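The intuition behind such a liveness check can be made concrete with a toy sketch. This is purely illustrative and not any vendor's actual algorithm: a flat photo or screen held up to a depth sensor shows almost no depth variation across the face, whereas a real face has tens of millimetres of relief between, say, the nose tip and the ears. The function name, landmark values, and threshold below are all hypothetical.

```python
def is_live_face(landmark_depths_mm, min_depth_range_mm=20.0):
    """Crude liveness check: a real, three-dimensional face should show
    substantial depth spread across its landmarks, while a flat printed
    photo or phone screen presents landmarks at nearly the same depth."""
    depth_range = max(landmark_depths_mm) - min(landmark_depths_mm)
    return depth_range >= min_depth_range_mm

# A flat photo held up to the sensor: all landmarks at roughly one depth.
flat_photo = [400.0, 401.2, 399.5, 400.8, 400.1]
# A live face: the nose tip sits tens of millimetres closer than the ears.
live_face = [400.0, 355.0, 420.0, 362.0, 415.0]

print(is_live_face(flat_photo))  # False
print(is_live_face(live_face))   # True
```

Real systems combine depth with texture, motion, and reflectance cues rather than a single threshold, but the sketch shows why 3D data gives an anti-spoofing signal that a 2D photo simply cannot carry.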
Taking the good with the bad
Of course, the most widely recognised use of this type of technology is the iPhone's Face ID, and similar face unlock features are now standard on many new phones on the market. This, much like fingerprint scanning on phones, is undoubtedly a step forward.
Clearly there are important applications that can be used for what most of us would see as good. But the nagging issue of misuse often returns. A Wall Street Journal article looked at the issue of facial recognition in police databases – an authoritarian concern that’s also been raised in China.
It's obvious, then, that how the technology is implemented depends a lot on the people who use it. That may seem like a responsibility that falls entirely on customers and policymakers, but recent experience has shown that tech workers, such as those at Google, can genuinely influence the way technology is used. Maybe they should face that responsibility here, too.