Is client-side scanning the future of content moderation?


How do you prevent the spread of illegal imagery without compromising the privacy of users? That was the question facing engineers at Apple. Under pressure from Western governments to crack down on the dissemination of child sexual abuse material (CSAM), the tech giant was also keen to preserve the ability of its users to communicate privately using end-to-end encryption.

Their solution was a perceptual hashing algorithm called NeuralHash. The software harnessed a process known as ‘hashing,’ wherein images are broken up into segments and given an identifying signature – usually a garbled series of letters or numbers – known as a ‘hash.’

In most cases, this signature is compared between the sender’s device and that of the receiver, to ensure that the image sent has not been corrupted in transit. A perceptual hashing algorithm, meanwhile, is capable of matching a known image to its hash, even when that image has been manipulated in such a way as to elude conventional scanning.
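The difference between the two kinds of hash can be made concrete with a minimal sketch. It does not reproduce NeuralHash or PhotoDNA; it uses a far simpler 'average hash' over a tiny grayscale grid as a stand-in, and the function names and pixel values are invented for illustration.

```python
import hashlib

def crypto_hash(pixels):
    """SHA-256 of the raw pixel bytes: changing any pixel changes the digest."""
    flat = bytes(v for row in pixels for v in row)
    return hashlib.sha256(flat).hexdigest()

def average_hash(pixels):
    """Toy perceptual hash: one bit per pixel, set when the pixel is brighter than the mean."""
    flat = [v for row in pixels for v in row]
    mean = sum(flat) / len(flat)
    return "".join("1" if v > mean else "0" for v in flat)

# Hypothetical 4x4 grayscale image (values 0-255): dark left half, bright right half.
original = [
    [10, 20, 200, 210],
    [15, 25, 205, 215],
    [12, 22, 202, 212],
    [18, 28, 208, 218],
]
# The same picture, slightly brightened, as a filter or re-encoding might do.
brightened = [[min(255, v + 5) for v in row] for row in original]

same_crypto = crypto_hash(original) == crypto_hash(brightened)
same_perceptual = average_hash(original) == average_hash(brightened)
print(same_crypto, same_perceptual)
```

In this toy case the conventional hash no longer matches after the brightness change, while the perceptual hash is unchanged because the overall light-and-dark structure of the picture is the same.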

The method is computationally expensive, meaning that it usually works best when embedded inside the servers where the images are stored. By extension, perceptual hashing algorithms require access to the hash associated with the original image to know what they’re looking for.

Up until the late 2010s that wasn’t a problem, with millions of people content to send, receive and report images over unencrypted channels like MSN and Facebook Messenger. The rise of encrypted messaging services like WhatsApp and Signal, however, vastly reduced the volume of imagery that perceptual hashing algorithms could analyse.

Where Apple’s engineers innovated with NeuralHash, however, was to place their software inside users’ iPhones – a technique known as ‘client-side scanning’. Wary of creating a backdoor into its customers’ private messaging channels, the company instead tasked the algorithm with vetting images just before they were encrypted and uploaded to the user’s iCloud account. If the number of hashes pertaining to suspected CSAM exceeded 30, Apple would then decrypt the images, manually review them and, if necessary, call law enforcement.
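The threshold rule described above can be sketched in a few lines. This is only an illustration of the counting logic, assuming hashes are plain strings and the list of known hashes is a simple set; the names `count_matches` and `needs_human_review` are hypothetical, and whatever safeguards the real system placed around the match count are not modelled here.

```python
MATCH_THRESHOLD = 30  # the figure quoted in the article

def count_matches(upload_hashes, known_hashes):
    """Count how many of a user's image hashes appear in the known-hash set."""
    return sum(1 for h in upload_hashes if h in known_hashes)

def needs_human_review(upload_hashes, known_hashes, threshold=MATCH_THRESHOLD):
    """Flag an account only once matches exceed the threshold, not on a single hit."""
    return count_matches(upload_hashes, known_hashes) > threshold

# Hypothetical data: 100 known hashes, two photo libraries.
known = {f"known-{i}" for i in range(100)}
library_a = [f"known-{i}" for i in range(30)] + [f"holiday-{i}" for i in range(500)]
library_b = [f"known-{i}" for i in range(31)] + [f"holiday-{i}" for i in range(500)]

flag_a = needs_human_review(library_a, known)
flag_b = needs_human_review(library_b, known)
print(flag_a, flag_b)
```

Here the first library sits exactly at 30 matches and is not flagged, while the second, with 31, is. Setting the bar well above one match is a way of tolerating the occasional false positive from the hash itself.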

Not enough thought has been given to the precedent that legally mandated client-side scanning would establish, say critics. (Photo by Vadim Zhakupov / iStock)

Is NeuralHash secure?

When NeuralHash was first announced in August 2021, security researchers cried foul. While all critics acknowledged the gravity of the problem the software was designed to solve, many believed that Apple’s solution came at far too high a price.

Aside from the sinister implications of a corporate entity scanning private content on users’ devices, some argued the company was introducing new and unnecessary security vulnerabilities, too. “Apple is compromising the phone that you and I own and operate, without any of us having a say in the matter,” wrote tech analyst Ben Thompson. “The capability to reach into a user’s phone now exists, and there is nothing an iPhone user can do to get rid of it.”

Acknowledging the severity of public criticism of NeuralHash, Apple scaled back its plans before indefinitely delaying the release of the software the following month. That did not, however, end the debate over client-side scanning. The UK and the EU, for example, remain enthusiastic about its capabilities in thwarting the spread of CSAM, with both publicly stating their willingness to compel platforms to institute much stronger content moderation policies aimed at its eradication.

If anything, NeuralHash was taken as an example of a tech company finally doing something proactive about this issue – one that could define the future of content moderation.

The roots of client-side scanning

In some ways, client-side scanning is simply the next iteration of a tried and tested content moderation technique that’s been in place for decades. Hany Farid was there from the start. A specialist in computer vision and a professor at the University of California, Berkeley, Farid helped create PhotoDNA, a server-based perceptual hashing algorithm used to root out CSAM from services provided by Microsoft, Reddit, Adobe and others.

“The real innovation with PhotoDNA was not necessarily the underlying technology – it was thinking about how to combat this problem within an existing infrastructure,” recalls Farid. “When we looked around back in 2008, this infrastructure existed, right? We were doing it for copyright infringement, we were doing it for spam and malware, ransomware, all forms of online security threats. And we thought, ‘Well, this fits very nicely within that bucket.’”

It worked. Since its introduction, millions of reports of CSAM can be attributed to the use of PhotoDNA. Its effectiveness, however, has been dulled with the rise of E2EE messaging services.

“We should acknowledge there are many good things about end-to-end encryption,” says Farid, not least the fact that the use of such platforms shields the private messages of users from mass surveillance by oppressive governments. Even so, “we should also acknowledge that bad people are doing really bad things within end-to-end encrypted systems”.

In that regard, argues Farid, NeuralHash is an equitable compromise, better safeguarding a user’s privacy than server-side scanning solutions and limiting its verification to whenever the device syncs with iCloud instead of peering into individual messages (although Apple does use AI to scan messages for nudity in the UK).

Its effectiveness, however, is open to debate. Within days of its announcement, security researchers had figured out ways of triggering so-called ‘hash collisions,’ instances of two or more photos being conflated with one hash. While Apple was quick to point out that this version of NeuralHash was a prototype, other variations are just as vulnerable, explains Yves-Alexandre de Montjoye, a professor in computer science at Imperial College London and a special adviser to the EU’s justice commissioner.
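What a hash collision means can be shown with a toy example. The sketch below uses a simple 'average hash' rather than NeuralHash, and the two images are invented; the collisions researchers reported against Apple's prototype were produced by other means, so this only illustrates the concept of two different pictures sharing one signature.

```python
def average_hash(pixels):
    """Toy perceptual hash: one bit per pixel, set when the pixel is brighter than the mean."""
    flat = [v for row in pixels for v in row]
    mean = sum(flat) / len(flat)
    return "".join("1" if v > mean else "0" for v in flat)

# Two hypothetical 4x4 grayscale images with visibly different pixel values.
image_a = [
    [10, 20, 200, 210],
    [15, 25, 205, 215],
    [12, 22, 202, 212],
    [18, 28, 208, 218],
]
image_b = [
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [0, 0, 255, 255],
]

images_differ = image_a != image_b
hashes_collide = average_hash(image_a) == average_hash(image_b)
print(images_differ, hashes_collide)
```

Because the hash deliberately throws away detail so that near-duplicates still match, unrelated images that share the same coarse structure can end up with the same value.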

A major problem in client-side scanning, says Montjoye, is that “the algorithm is very hard to keep secret,” since it lives in the device rather than a secure server. That also limits the amount of computational power the system can use to recognise an image, markedly raising the chance of collisions – a fact Montjoye and a team of researchers at Imperial demonstrated when they mounted adversarial attacks against five common perceptual hashing algorithms in late 2021.

The images the team used were, “for all intents and purposes, equivalent to one another,” says Montjoye. “And yet, every single one of the modified images is evading detection.”
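The opposite failure, a near-identical image that escapes detection, can be sketched in the same spirit. This is a contrived example against a toy 'average hash', not a reproduction of the Imperial team's attacks; the image, the matching threshold `max_distance` and the helper names are all assumptions made for illustration.

```python
def average_hash(pixels):
    """Toy perceptual hash: one bit per pixel, set when the pixel is brighter than the mean."""
    flat = [v for row in pixels for v in row]
    mean = sum(flat) / len(flat)
    return "".join("1" if v > mean else "0" for v in flat)

def hamming(h1, h2):
    """Number of bit positions in which two equal-length hashes differ."""
    return sum(a != b for a, b in zip(h1, h2))

def is_match(h1, h2, max_distance=2):
    """Treat two hashes as the same image if they differ in only a few bits."""
    return hamming(h1, h2) <= max_distance

# Hypothetical low-contrast 4x4 image: every pixel sits close to the mean of 128.
original = [
    [124, 132, 124, 132],
    [132, 124, 132, 124],
    [124, 132, 124, 132],
    [132, 124, 132, 124],
]
# Nudge each pixel by 5 levels (out of 255) across the mean.
modified = [[v + 5 if v < 128 else v - 5 for v in row] for row in original]

max_change = max(abs(a - b) for ra, rb in zip(original, modified) for a, b in zip(ra, rb))
distance = hamming(average_hash(original), average_hash(modified))
detected = is_match(average_hash(original), average_hash(modified))
print(max_change, distance, detected)
```

No pixel moves by more than five brightness levels, yet every bit of the toy hash flips and the modified image no longer matches. Real perceptual hashes are more robust than this, but the example shows why small, targeted changes can matter more than large, random ones.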

The end of encryption?

Defeating perceptual hashing algorithms using adversarial attacks is nothing new: it’s the reason why, after all, so many film and TV clips on YouTube feature strange variations in speed, aspect ratio or sound quality to elude copyright takedowns. Would criminals use the same techniques to avoid being caught distributing CSAM and other illegal content? Montjoye thinks so. “I think this is definitely something that people could try to do,” he says.

The possibility doesn’t seem to frighten the UK government, which has repeatedly signalled its intent to force online platforms to tighten their content moderation efforts, including through client-side scanning.

In July, a report by Dr Ian Levy and Crispin Robinson, technical directors at the National Cyber Security Centre and GCHQ respectively, endorsed client-side scanning as a straightforward (if technically imperfect) way of achieving this goal without endangering user privacy.

But proponents of client-side scanning have not given enough thought to what might happen were it to be legally mandated, argues Ross Anderson, a professor of security engineering at Cambridge University. A regime implemented in the name of stopping the spread of CSAM would create a precedent for the policing of other types of content.

Prominent politicians have been, at best, flippant about what kind of speech should and shouldn’t be permissible online, says Anderson. “You may recall a couple of days ago, for example, that Rishi Sunak said that action should be taken against people who denigrate Britain, that don’t believe in Britain, and oppose the existence of Britain,” he says.

Farid, by contrast, believes that platforms could be using perceptual hashing algorithms to moderate much larger volumes of imagery and video, arguing recently in The Guardian that such software could be used to more effectively police violent or extremist content.

That also comes with the recognition of the method’s limitations. Perceptual hashing algorithms, after all, can only identify content they have already been trained on. Only once you realise that, explains Farid, does it become clear how modest a proposal NeuralHash really was.

“It says nothing about live streaming, it says nothing about self-generated content, it says nothing about new content, it says nothing about grooming,” says Farid. “The only thing it does is, it says, ‘We’ve identified this [CSAM], we want to stop the redistribution and the re-victimisation.’ That is it.”

Online Safety Bill

As such, Farid rejects the argument that client-side scanning is the start of a slippery slope toward political censorship. End-to-end encrypted platforms like WhatsApp, he adds, constitute relatively small islands of privacy in a world where so many pieces of content are routinely pored over by algorithms for signs of malware, spam, faces and objects.

“The fact is, you’re walking around with a GPS tracker and a microphone and a camera,” says Farid. “Every place you go on the internet is being tracked by any number of third parties and cookies.”

Neither do E2EE platforms provide suspects with an effective shield against criminal investigations: metadata analysis of encrypted messages, for example, can help messaging services like WhatsApp to root out, ban and report entire CSAM distribution networks – some 300,000 accounts per month, according to the platform.

Police, too, are often empowered to seize and open individual devices, or use co-location analysis to trace the movements of suspects using encrypted phones. What worries Anderson, though, is that client-side scanning will be used as a cheaper, less effective cure-all for the spread of CSAM – and one that ignores the root causes of its creation.

“People who are engaged in this fight at the coalface [say] it’s nothing to do with cryptography, or the internet, or anything like that,” says Anderson. The real driver, he argues, is poverty. As such, says the professor, one simple measure the UK government could take tomorrow is to raise the level of child benefit, giving single-parent families a better chance of avoiding horrific cycles of abuse and neglect.

Nothing is likely to change on that front in the near future. A confrontation pitting the UK and the EU against online platforms over client-side scanning, by contrast, seems imminent. Although the Online Safety Bill is in legislative limbo until the Conservative Party chooses a new leader and prime minister, that’s due to change in the autumn. WhatsApp, for its part, seems to be girding itself for a prolonged public debate about its future in the UK and Europe, warning last month that it will never abandon end-to-end encryption.

In the meantime, illegal and heinous content continues to be shared across a variety of messaging services, peer-to-peer platforms, video games and other online services. The creation of PhotoDNA was in large part, explains Farid, an attempt to stop the survivors of child abuse from reliving the worst day of their lives.

“They will tell you that knowing that the image of their assault is circulating around the internet every single day and being watched by thousands, tens of thousands, of people every day, injures them,” he says.

The corollary that proceeded from this, adds Farid, was that the internet should become a hostile environment for child abusers. It’s plain to see that governments and platforms have the tools available to prosecute this goal. How they use them will define the future of content moderation for decades to come.
