The UK’s Information Commissioner’s Office (ICO) has launched a consultation on the legality of using personal data scraped from the internet to train new generative AI models. In a statement published earlier today, the data protection watchdog said that it would be seeking views from a variety of stakeholders on this and other issues where data protection law and the development of generative AI models intersect from now until 1 March 2024.
“The impact of generative AI can be transformative for society if it’s developed and deployed responsibly,” said Stephen Almond, the ICO’s executive director for regulatory risk. “This call for views will help the ICO provide industry with certainty regarding its obligations and safeguard people’s information rights and freedoms.”
ICO shows increasing interest in generative AI
The ICO is the UK’s principal data protection regulator, tasked not only with investigating and punishing those who breach UK data protection law but also with publishing guidance on ambiguities that arise in that legislation as new technologies emerge. The watchdog has therefore taken a keen interest in the rise of generative AI. In April 2023, it published guidance for major companies seeking to process personal data to build new generative AI models. Two months later, the ICO also launched an advice service for “AI innovators” with questions about data protection issues, which aimed to answer their queries in under a fortnight.
Its new consultation will investigate several questions that have arisen surrounding the application of data protection law to the development of generative AI models. These include the expectations placed on developers in complying with data subject rights and the accuracy principle in the UK GDPR, as well as the lawful basis for training new models in the first place. Those interested in participating in the first round of consultations, “The lawful basis for web scraping to train generative AI models,” can contribute via email or an online portal.
The watchdog’s success in defining its jurisdiction over AI developers, however, has been mixed. In October 2023, the ICO issued a preliminary enforcement notice against the owner of Snapchat for failing to adequately assess the risks posed to its user base by its new AI chatbot. In May 2022, the ICO issued an enforcement notice and fine against Clearview after concluding that the US company had scraped the personal data of British individuals to train its facial recognition algorithms in violation of UK data protection law. However, this ruling was overturned in October 2023 on the grounds that the ICO lacked jurisdiction over the US company, with the watchdog losing an appeal against the decision a month later.
Trust in generative AI paramount, says ICO chief
Since then, the UK Information Commissioner himself has been vocal about the need to maintain high levels of trust and transparency in AI products and services. “2024 cannot be the year that consumers lose trust in AI,” John Edwards told the techUK Digital Ethics Summit in December. “We know there are bad actors out there who aren’t respecting people’s information and who are using technology like AI to gain an unfair advantage over their competitors. Our message to those organisations is clear – non-compliance with data protection will not be profitable.”
Edwards went on to warn that the ICO would take a dim view of persistent misuse of personal data like customer information by AI developers. “Where appropriate,” he added, “we will seek to impose fines commensurate with the ill-gotten gains achieved through non-compliance.”
The ICO’s latest intervention on AI is well-timed, says generative AI expert Henry Ajder. “Fundamentally, I feel that the current system doesn’t seem to work very well,” says Ajder, who argues that the practice of training LLMs on data acquired through indiscriminate web scraping does not chime with public expectations that the creators of that data should be acknowledged for their contributions. That said, he continues, major AI developers are increasingly sensitive to this feeling. “The fact that they are now trying to set up licensing agreements with the big news companies and so on shows that they know that [broader regulatory frameworks] are inevitable.”