Midjourney images used to be easy to spot. Until very recently, the telltale sign that an image had been generated on the AI platform could usually be found in the appendages of its central subjects, afflicted with over-stretched wrists, a superfluity of veins or one spindly, accusatory digit too many. But what was a flaw two months ago is now a thing of the past. In March, Midjourney released a software update that fixed some of the many-fingered flaws of its previous outputs — thereby closing a straightforward avenue for the detection of deepfake imagery.
This alone is testament to the incredible pace at which AI is transforming digital image generation. Tools like Midjourney, Dall-E, and Stable Diffusion are all capable of producing complex photo-realistic outputs from simple textual prompts within seconds — and constantly learning from their mistakes. Most of the time, this activity is pretty harmless: a vision of Harry Potter by Pixar here, a casino-dwelling Albert Einstein there. Increasingly, though, sights like the infamous ‘Balenciaga Pope’ are entering the mix: an image so uncanny and yet so mundane in comparison to the surreal cinematography of contemporary social media that it might, just might, turn out to be real.
It was quickly established, of course, that the photographs of Pope Francis clad in a stylish puffer jacket were, indeed, fake. (“Call him the Supreme Pontiff,” wrote one over-awed journalist. “The vicar of drip.”) But the fact that so many were fooled by a set of images created by someone who later confessed he’d made them after taking hallucinogenic substances is something that worries experts in computer vision like Hany Farid. “Right now,” says the UC Berkeley professor, “any knucklehead who’s high on mushrooms can break the internet.”
Midjourney mania
People have been manipulating photographs for almost as long as they’ve been taking photographs. Nineteenth-century adjustments were more hands-on than today’s software, involving ink, paint, corrosive chemicals, and scratched negatives in a lonely darkroom. The development of Adobe Photoshop in 1987 ushered in a new era of digital image manipulation, albeit one that was still relatively laborious. Recent years, however, have seen the advent of automated tools that make complex transmutations straightforward and almost instantaneous.
Filters are now embedded into social media platforms like TikTok and Instagram — smoothing skin tone and adjusting facial proportions. Tools like FaceApp, which was launched in 2016, are explicitly designed to make lifelike real-time alterations to human faces. And it’s not just about editing real images. Platforms like Midjourney, Dall-E, and Stable Diffusion have given ordinary people access to generative tools that the top studios in Hollywood wouldn’t have even been able to dream of just five years ago.
Deepfakes aren’t anything new. The term was coined in 2017, when a Reddit user known as ‘deepfakes’ shared a series of videos with celebrities’ faces swapped onto the bodies of pornographic performers. What’s changed is the accessibility of this technology. Midjourney and Stable Diffusion made their debut in 2022 alongside the updated Dall-E 2 — democratising the creation of artificial imagery. Anyone with access to the internet can now use these tools to “defraud you, change elections, and inspire civil unrest” with a simple textual prompt, says Farid.
There were 41.4 million visits to Midjourney in March 2023, according to data from Similarweb. With its newfound ability to draw lifelike human hands, the tool was notably used to produce realistic-looking viral visuals of former President Donald Trump being arrested and Pope Francis wearing a stylish white puffer jacket. Reporter Ryan Broderick dubbed the latter “the first real mass-level AI misinformation case.”
Detection is an uphill battle
Can technology help us figure out what’s real? Developers have been hard at work trying to design machine-learning models to take on generative AI by detecting small ‘fingerprints’ in artificial images. These tools can spot subtle nuances that might be invisible to the human eye “like minute texture patterns or differences in neighbouring pixels,” says Yong Jae Lee, a computer science professor at UW-Madison. But existing tools have hefty — and perhaps insurmountable — shortcomings.
Most detection models are trained on images from a specific generator and struggle to correctly identify those produced by other platforms, says Lee. They’re also constantly scrambling to keep up with generative models, which can use detection tools to improve — and ultimately conceal — their own output. “These are learning models. They are learning what’s detecting them and then fixing for it,” says Mounir Ibrahim, VP at image verification startup Truepic.
This means that any recognisable artificial footprints, like Midjourney’s distorted hands, are now quickly removed. That’s why some experts have completely given up on the prospect of individual detection. “I have stopped telling people what to look for because I think it’s a false sense of security and it’s got a shelf life that is relatively short,” says Farid.
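To get a feel for the kind of pixel-level signal Lee describes, consider the minimal sketch below. It summarises each image by statistics of its high-frequency residual (roughly, the fine texture left over after blurring) and fits a simple classifier on hypothetical folders of real and generated photos. Production detectors are deep networks trained on millions of images; this is only an illustration of the principle.

```python
# A toy "fingerprint" detector: high-frequency residual statistics
# fed into a simple binary classifier. Folder paths are hypothetical.
from pathlib import Path

import numpy as np
from PIL import Image, ImageFilter
from sklearn.linear_model import LogisticRegression


def residual_features(path: Path) -> np.ndarray:
    """Summarise an image by statistics of its high-frequency residual."""
    img = Image.open(path).convert("L").resize((256, 256))
    blurred = img.filter(ImageFilter.GaussianBlur(radius=2))
    residual = np.asarray(img, np.float32) - np.asarray(blurred, np.float32)
    return np.array([residual.std(), np.abs(residual).mean(), (residual ** 2).mean()])


def load_dataset(real_dir: str, fake_dir: str):
    """Label 0 for camera photos, 1 for generated images (hypothetical folders)."""
    X, y = [], []
    for label, folder in ((0, real_dir), (1, fake_dir)):
        for path in sorted(Path(folder).glob("*.jpg")):
            X.append(residual_features(path))
            y.append(label)
    return np.array(X), np.array(y)


if __name__ == "__main__":
    X, y = load_dataset("data/real", "data/generated")   # hypothetical paths
    clf = LogisticRegression(max_iter=1000).fit(X, y)
    print("training accuracy:", clf.score(X, y))
```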
There are also several techniques that can be used to compromise the effectiveness of detection tools. One of these is simply to “compress the hell out of an image,” says deepfake expert Henry Ajder. If you lower the resolution, he explains, there’s “a lot less data for these tools to look for.” Post-production is already a standard part of our media ecosystem, and visual effects or filters can obscure some of the signals that detection models look out for. “The generative side of things is always going to have the upper hand,” says Ajder.
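Ajder’s compression trick is easy to demonstrate. The sketch below, which assumes a hypothetical input file, downscales an image and re-encodes it as a low-quality JPEG, then measures how much of the high-frequency residual energy that detectors feed on actually survives.

```python
# Illustration only: aggressive downscaling plus low-quality JPEG
# re-encoding strips much of the fine detail detectors rely on.
import io

import numpy as np
from PIL import Image, ImageFilter


def high_freq_energy(img: Image.Image) -> float:
    """Mean squared high-frequency residual of a greyscale copy of the image."""
    grey = img.convert("L")
    blurred = grey.filter(ImageFilter.GaussianBlur(radius=2))
    residual = np.asarray(grey, np.float32) - np.asarray(blurred, np.float32)
    return float((residual ** 2).mean())


def degrade(img: Image.Image, scale: float = 0.5, quality: int = 10) -> Image.Image:
    """Downscale the image and re-encode it as a low-quality JPEG."""
    small = img.resize((max(1, int(img.width * scale)), max(1, int(img.height * scale))))
    buffer = io.BytesIO()
    small.convert("RGB").save(buffer, format="JPEG", quality=quality)
    buffer.seek(0)
    return Image.open(buffer)


original = Image.open("suspect_image.png")                # hypothetical input
print("high-frequency energy before:", high_freq_energy(original))
print("high-frequency energy after: ", high_freq_energy(degrade(original)))
```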
Indeed, Ajder is alarmed by the fact that some companies are still dreaming of a quick fix like a browser plugin that could automatically detect fake images. He thinks this kind of approach will likely never be viable — and might even create further risks. Even if we developed detection software that could spot 99.9% of deepfakes, Ajder argues, the 0.1% that slips through “is still a huge amount of content” given the sheer volume of images uploaded to social media. That means that even a highly reliable detection tool would allow a large volume of artificial content to fly under the radar or — perhaps even more dangerously — falsely implicate ‘innocent’ images.
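The scale problem becomes obvious with some back-of-the-envelope arithmetic. Every figure below is an assumption chosen purely for illustration; the point is that tiny error rates multiply into enormous absolute numbers once billions of images are in play.

```python
# All of these figures are assumptions for illustration, not statistics.
daily_uploads = 3_000_000_000      # assumed images posted to social media per day
fake_share = 0.01                  # assume 1% of those are AI-generated
miss_rate = 0.001                  # a "99.9% accurate" detector misses 0.1% of fakes
false_positive_rate = 0.001        # ...and wrongly flags 0.1% of genuine images

fakes = daily_uploads * fake_share
missed_fakes = fakes * miss_rate
falsely_flagged = (daily_uploads - fakes) * false_positive_rate

print(f"fakes slipping through per day:  {missed_fakes:,.0f}")     # 30,000
print(f"genuine images flagged per day:  {falsely_flagged:,.0f}")  # 2,970,000
```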
“I think the place for detection is in the hands of well-trained digital forensics experts who might be able to corroborate the signals that these detection systems provide,” says Ajder.
But while digital forensics experts might — at least for now — have a decent shot at figuring out if an image is an artificially-generated fake, they’re facing an overwhelming volume of content. There’s also the question of speed. “We measure the half-life of a social media post in minutes,” says Farid. “If I create a fake video of a CEO saying that their stock is down, the market has moved billions of dollars before I even get out of bed in the morning.”
Taking the offensive
Defence is fundamentally harder than offence, says Farid. That’s why he’s keen to pressure the biggest generative platforms to inject a digital watermark or fingerprint into any content that they synthesise — flagging artificial content from the outset rather than playing the “cat and mouse” game of detection.
This won’t solve everything, he says. Motivated individuals will likely still be able to develop their own watermark-free platforms and skirt around any regulations. But Farid believes that watermarking content from the most popular generative platforms might solve a large chunk of the problem — eliminating the risk of mass unintentional deception and leaving experts with more time to devote to genuinely malicious images.
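The idea can be illustrated with a toy example. The sketch below hides a short, hypothetical provenance tag in the least significant bits of an image’s pixels and reads it back out. Real watermarking schemes are far more sophisticated and are designed to survive compression and editing; this is only meant to show the principle of flagging content at the point of generation.

```python
# Toy least-significant-bit watermark; the tag and scheme are illustrative only.
import numpy as np
from PIL import Image

TAG = "GENERATED-BY-MODEL-X"   # hypothetical provenance tag


def embed(img: Image.Image, tag: str = TAG) -> Image.Image:
    """Hide the tag's bits in the least significant bit of the red channel."""
    pixels = np.array(img.convert("RGB"))
    bits = np.unpackbits(np.frombuffer(tag.encode(), dtype=np.uint8))
    red = pixels[..., 0].reshape(-1).copy()
    red[: bits.size] = (red[: bits.size] & 0xFE) | bits
    pixels[..., 0] = red.reshape(pixels.shape[:2])
    return Image.fromarray(pixels)


def extract(img: Image.Image, length: int = len(TAG)) -> str:
    """Read the hidden tag back out of a losslessly stored copy of the image."""
    red = np.array(img.convert("RGB"))[..., 0].reshape(-1)
    bits = (red[: length * 8] & 1).astype(np.uint8)
    return np.packbits(bits).tobytes().decode(errors="replace")


marked = embed(Image.new("RGB", (64, 64), "white"))
print(extract(marked))   # -> GENERATED-BY-MODEL-X
```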
Ajder and Ibrahim argue that a similar approach should also be used to authenticate real content. This could be achieved by embedding cryptographically secured metadata — which Ibrahim compares to “nutritional labels” — in an image or video at the point of capture. That’s what Ibrahim’s startup, Truepic, promises to achieve through a cryptographic signature covering the critical details behind a digital photo, including the time, date, and location of its origin, as well as the true pixels captured.
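Here is a simplified sketch of that ‘nutritional label’ idea, using an off-the-shelf signature scheme rather than Truepic’s actual format: the capture device hashes the pixels together with the metadata and signs the digest, so any later change to either breaks verification.

```python
# Illustrative only: sign (pixels + metadata) at capture, verify later.
import hashlib
import json

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# A key pair that, in practice, would live in the capture device's secure hardware.
private_key = Ed25519PrivateKey.generate()
public_key = private_key.public_key()


def sign_capture(pixels: bytes, metadata: dict) -> bytes:
    """Bind the raw pixels and the capture metadata together under one signature."""
    digest = hashlib.sha256(pixels + json.dumps(metadata, sort_keys=True).encode()).digest()
    return private_key.sign(digest)


def verify_capture(pixels: bytes, metadata: dict, signature: bytes) -> bool:
    """Return True only if neither the pixels nor the metadata have changed."""
    digest = hashlib.sha256(pixels + json.dumps(metadata, sort_keys=True).encode()).digest()
    try:
        public_key.verify(signature, digest)
        return True
    except InvalidSignature:
        return False


metadata = {"time": "2023-03-26T09:41:00Z", "lat": 48.8566, "lon": 2.3522}  # hypothetical
signature = sign_capture(b"raw sensor bytes", metadata)
print(verify_capture(b"raw sensor bytes", metadata, signature))    # True
print(verify_capture(b"edited pixels", metadata, signature))       # False
```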
“We have digitised our existence,” says Ibrahim. Most decisions of consequence — from dating to hiring — start out with a digital image, so “there has to be a foundation of transparency in that digitisation.”
Most fakes aren’t being made at the behest of people like Vladimir Putin or Xi Jinping, says Farid. They’re being made by ordinary people playing around on sites like Midjourney, Dall-E, or Stable Diffusion. Even if they don’t have any deceptive intent, these people — and the artificial images they create — can erode confidence and spread misinformation. Ajder agrees. “This won’t necessarily be what a lot of people have anticipated, which is that one big deepfake completely flips an election or causes a war,” he says. “This is going to be the slow chipping away of trust.”
The infamous ‘Balenciaga Pope’ that circulated on social media in March wasn’t designed to deceive. Its creator, Pablo Xavier, told BuzzFeed that he created the image on Midjourney while tripping on shrooms and shared it in a specialist Facebook group called AI Art Universe. Xavier himself was shocked by how quickly the images went viral. “I was just blown away,” he said. “I didn’t want it to blow up like that.”
Farid thinks this saga is a prime example of the positive potential of watermarking, which he believes might have prevented Xavier’s unintentional mass deception. “He wasn’t trying to fool people,” he says. “He was just high and having fun… It got out of control, which is what happens on the internet.”