Commercial "Deep Fake" Technology. It's Coming.
How will we adapt to a world of AI-driven video?
At the recent Digital Publishing Innovation Summit, speaker Francesco Marconi, Head of R&D at The Wall Street Journal, noted that "The next generation of false news will be powered by artificial intelligence (AI)." The technology is already far more advanced than most people realize, and if misused it could cause significant social and political turmoil.
That downside is real, but the technology also has significant commercial potential. Modifying Francesco's statement slightly, replacing "false news" with "media", provides an interesting lens for understanding a major new chapter in the world of content:
"The next generation of media will be powered by artificial intelligence."
I think this prediction is as likely to become reality as Francesco's original warning. AI will revolutionize the digital content space; the open questions are how soon and with what impact.
Earlier in his presentation, Francesco shared examples of existing automation in the newsroom: financial news, like quarterly earnings coverage, has increasingly been automated. He noted that "plug-and-play" templates allowed his prior company, the Associated Press, to jump from covering the earnings of dozens of companies to covering thousands. Below is a demonstration of the solution used by the AP, a product called Wordsmith by Automated Insights.
So is the next step to have an AI editorial team writing articles from scratch? Possibly, but so far, long-form AI-generated writing hasn't proven to be very compelling to read. On the other hand, AI-generated video is already extremely convincing. Here’s a video from University of Washington researchers that Francesco used to highlight AI-generated "deep fake" technology:
You can also watch comedian Jordan Peele use AI to make Barack Obama deliver a PSA about fake news here:
It's scary, because seeing is believing (or at least has been until now), and the technology is already out there and working. Now that the Pandora’s box of deep fake video technology has been opened, there’s no going back to a world where video evidence can be taken as fact.
But let's approach this new technology a different way. What if media companies used this technology to generate and deliver a more scalable and tailored content experience? There are myriad possibilities, but for the sake of this blog post, let's restrict the discussion to digital news.
What if I prefer to get all of my news from CNN's Anderson Cooper? Normally, he could only cover a few key stories a day, but with this technology, he could cover every story in every market, all day, every day, even hyper-local “dog bites man” stories.
Or maybe I prefer the voice of Morgan Freeman for the most distressing news, the voice of Sigourney Weaver for all science news, and former President Barack Obama for everything else.
I could have a fully tailored experience, while behind the scenes the media company produces content in a fraction of the time at a fraction of the cost. A fixed-cost investment in this kind of technology would greatly reduce the variable cost of scaling content that is expensive to produce.
To do this, a media company would just need to deploy existing AI technology and secure the rights to the personalities they mobilize.
Until recently, available AI and computer-generated imagery (CGI) technology was only able to assemble voices and basic mouth movements (popularized by free tools like FakeApp), but now it's becoming possible to generate realistic body movement too. Too much is at stake to let bad actors abuse this technology: the potential consequences of misuse are serious, and smart regulation will be needed.
But the underlying technology is too valuable to be stopped. Someone will commercialize it, and when they do, it will begin to completely change digital content creation.
I doubt traditional news organizations will be early adopters; doing so would be a complicated, risky endeavor for them and could threaten their core business models. The technology is more likely to be commercialized first by new venture-backed startups, and adopted first by resource-strapped teams that want to jumpstart video production without the usual staffing and infrastructure costs. Teams could leap from having zero video content to a deep portfolio of monetizable content in a matter of days.
I believe AI-powered video generation will be commercialized at large scale in the next few years. This could all lead to a horrible "Black Mirror"-esque future, but that dystopian projection doesn’t need to become a reality.
It remains to be seen whether traditional media will lead the charge in embracing this technology for good, and how information consumption will change as a result. This new technology has the power to change the world profoundly; whether the impact is beneficial or devastating depends entirely on how we choose to manage its adoption.