Since Roadrunner, a documentary about the late TV host and traveler Anthony Bourdain, hit theaters last month, its director, Morgan Neville, has spiced up promotional interviews with an unconventional disclosure for a documentary maker: some of the words viewers hear Bourdain speak in the film were fabricated by artificial intelligence software used to mimic the star’s voice.
Accusations by Bourdain fans that Neville had acted unethically quickly dominated coverage of the film. Despite all that attention, it remained unclear where Bourdain’s fake voice appeared in the two-hour movie and what it said – until now.
In the interview that made his film infamous, Neville told The New Yorker that he had generated three fake Bourdain clips with permission from his estate, all from words the host had written or said but which were not available in audio. He revealed only one, an email that Bourdain “reads” in the movie’s trailer, but boasted that the other two clips would go undetected. “If you watch the film,” The New Yorker quoted the Oscar-winning Neville as saying, “you probably don’t know what the other lines are that were spoken by the AI, and you’re not going to know.”
The audio experts at Pindrop, a startup that helps banks and others fight phone fraud, think they know. If the company’s analysis is correct, the deepfake Bourdain controversy is rooted in less than 50 seconds of audio in the 118-minute film.
Pindrop’s analysis pointed to the email quote disclosed by Neville, along with a clip at the start of the film apparently taken from an essay Bourdain wrote about Vietnam titled “The Hungry American,” collected in his 2006 book The Nasty Bits. It also flagged audio midway through the film in which the chef observes that many chefs and writers have a “relentless instinct to screw up a good thing.” The same sentences appear in an interview Bourdain gave to the food site First We Feast on the occasion of his 60th birthday in 2016, two years before his death by suicide.
All three clips sound recognizably like Bourdain. On close listening, however, they appear to carry signatures of synthetic speech, such as odd prosody and strangely rendered fricatives like the “s” and “f” sounds. A Reddit user independently flagged the same three clips as Pindrop, writing that they were easy to pick out when watching the movie a second time. The film’s distributor, Focus Features, did not respond to requests for comment; Neville’s production company declined to comment.
When Neville predicted that his use of AI-generated media, sometimes referred to as deepfakes, would be undetectable, he may have overestimated the sophistication of his own fakery. He probably also hadn’t anticipated the controversy, or the attention his use of the technique would attract from fans and audio experts. When the furor reached the ears of the Pindrop researchers, they saw the perfect test case for the software they built to detect audio deepfakes; they put it to work when the film debuted on streaming services earlier this month. “We are always looking for ways to test our systems, especially in real-world conditions. This was a new way to validate our technology,” says Collin Davis, CTO of Pindrop.
Pindrop’s findings may have solved the mystery of Neville’s hidden deepfakes, but the episode portends future controversies as deepfakes become more sophisticated and accessible for both creative and malicious projects.