In 2020, Thomson Reuters sued the AI-powered legal research platform Ross Intelligence, accusing it of copying content from Reuters’ own legal research platform, Westlaw. The case is set to go before a jury in May 2024, in what will be one of the first jury trials to emerge from the AI boom and a bellwether for what is to come.
The question of AI regulation has been on the lips of government executives and tech CEOs alike for months. The debate between AI innovation and regulation has seen legally binding measures come into play in several countries, including the UK, the US, China and EU member states.
But all around the world, AI regulations – however varied – are focused on guarding against the threat that the technology could pose to humanity. Perhaps inevitably, the focus on finding solutions to new problems has left behind the conversation about the threats that AI poses to laws already in place.
From copyright infringement to defamation and misinformation, the way generative AI works – learning from vast amounts of training data – has drawn accusations that the technology unlawfully harms businesses and individuals.
AI copyright infringement cases
Generative AI models are trained on large amounts of data – the larger the dataset, the better the model performs. Based on this input, the models identify patterns and generate text, images, video or sound as prompted. This means the (often copyrighted) data used to train generative models can sometimes be recognised in AI-generated content, without credit or licence. This has sparked waves of litigation, notably from artists and writers who have recognised their work in AI-generated content – or worry that they might.
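The memorisation problem at the heart of these cases can be illustrated with a toy example. The sketch below is not how production LLMs work – it is a word-level bigram model, the same statistical idea in miniature – but it shows why training data can resurface in output: when the model has seen a passage (here, a single tiny “document”), generation can simply replay that source text verbatim.

```python
import random
from collections import defaultdict

def train_bigram_model(text):
    """Build a word-level bigram table: each word maps to the list of
    words that followed it in the training text."""
    model = defaultdict(list)
    words = text.split()
    for current_word, next_word in zip(words, words[1:]):
        model[current_word].append(next_word)
    return model

def generate(model, start, length=12, seed=0):
    """Sample a continuation from the bigram table, one word at a time."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length):
        followers = model.get(out[-1])
        if not followers:  # dead end: no word ever followed this one
            break
        out.append(rng.choice(followers))
    return " ".join(out)

# One tiny training "document": with so little data, the model can only
# replay its source, so the generated text reproduces the training text.
corpus = "the quick brown fox jumps over the lazy dog"
model = train_bigram_model(corpus)
print(generate(model, "quick"))
```

Real models train on billions of documents, which usually blurs individual sources together – but passages that appear repeatedly in the training set can still resurface near-verbatim, which is precisely the behaviour plaintiffs cite as evidence in several of the suits below.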
Some companies, including Shutterstock, the Associated Press and Axel Springer, have recently signed licensing deals with AI makers (in these cases, OpenAI) that let them train generative models on their content without risking copyright infringement. In return, OpenAI gives them access to its products and AI-powered technology. Other companies and organisations that have recognised their copyrighted work in LLM-generated answers, however, have filed lawsuits against some of the most prominent AI companies.
The New York Times v. OpenAI and Microsoft
On 27 December 2023, the New York Times announced it was suing OpenAI and Microsoft, saying “millions of articles” had been used to train chatbots “that now compete with it”. After eight months of failed negotiations with OpenAI over the use of its content, the New York Times said the suit stems from seeing its investment in journalism taken and used “at no cost”. The Grey Lady accused the two companies of avoiding spending “the billions of dollars The Times invested in creating that work, by taking it without permission or compensation”.
The lawsuit document claims that “the defendants’ GenAI tools can generate output that recites Times content verbatim, closely summarises it, and mimics its expressive style”. It then states that “these tools also wrongly attribute false information to The Times.”
OpenAI replied in a statement that it remains hopeful it will “find a mutually beneficial way to work together, as we are doing with many other publishers.”
Music publishers v. Anthropic
Music publishers Universal Music, ABKCO and Concord Publishing sued AI company and LLM maker Anthropic in October 2023, accusing it of “systematic and widespread” copyright infringement. The three music companies state in the suit that Anthropic “unlawfully” uses their copyrighted works – mostly song lyrics – to train its AI chatbot Claude, which they allege then generates identical copies of those lyrics without permission.
This is the first case to address AI’s use of song lyrics. The publishers are seeking financial damages as well as a court order to stop the alleged copyright infringement.
In January 2024, however, Anthropic filed a motion opposing the copyright infringement claims on several grounds, including that the case was filed in the wrong jurisdiction, as the company does not operate in Tennessee, where the suit was brought. Anthropic also argued that because its model only reproduced lyrics after the publishers deliberately prompted it to do so, copyright liability rests with the publishers rather than the AI company – under a copyright-law principle known as volitional conduct.
Getty Images v. Stability AI
While most legal battles between content creators and AI firms are playing out in the US, UK courts saw one of the biggest AI copyright cases of 2023. Stock image platform Getty Images has sued Stability AI for allegedly infringing its copyright by using its images to train the image generation model Stable Diffusion.
One of the central pieces of evidence presented by Getty Images is the ability of Stable Diffusion to generate images with Getty Images’ recognisable grey and white watermark.
While the case is still at an early stage, Mrs Justice Smith has made it clear that one of the core issues at play is whether the training and development of Stability AI’s model happened in the UK – referred to in the case documents as “the location issue”. Because copyright is a territorial right, both parties have to provide evidence of location. Stability AI claims to have developed Stable Diffusion in the US (which would not constitute a breach of the UK’s Copyright, Designs and Patents Act), while Getty Images maintains that the servers and computers involved were based in the UK.
In December 2023, the High Court ruled that the case would proceed to trial. In the meantime, Getty Images has also sued Stability AI in the US over similar allegations.
Authors Guild class-action lawsuit
The creative industries are clearly among those most affected by the deployment of generative AI. In reaction to seeing their work used without permission or credit, America’s biggest professional organisation for writers, the Authors Guild, organised an open letter signed by more than 15,000 writers calling for generative AI developers “to obtain consent, credit and fairly compensate writers for the use of copyrighted materials in training AI”. Signatories include the likes of Dan Brown, Margaret Atwood and Celeste Ng.
In 2023, the Authors Guild and 17 authors filed a class-action lawsuit against OpenAI and Microsoft for using their copyrighted work to train their LLMs. In January 2024, The Hollywood Reporter reported that the Authors Guild is exploring the possibility of a blanket licence for AI companies – although artists could decide to opt out of the model. The move reflects a sense of inevitability: as the organisation’s CEO, Mary Rasenberger, told The Hollywood Reporter, “we have to be proactive because generative AI is here to stay”.
Defamation and misinformation
Although LLMs are trained on large quantities of data, they frequently generate inaccurate information. And because generative AI models tend to present their answers in an authoritative, factual tone, false or misleading output can create potential civil liability.
Mark Walters v. OpenAI
In June 2023, Mark Walters – an American radio host whose website introduces him as “the loudest voice in America fighting for gun rights” – sued OpenAI in what became the first defamation lawsuit stemming from AI-generated information.
After a journalist prompted ChatGPT to summarise a court case (by providing a link to the suit), the bot responded that Walters was accused of “defrauding and embezzling funds” from a major pro-gun rights group, the Second Amendment Foundation (SAF). However, Walters’ name is not even mentioned in the lawsuit document, and no such accusations exist against him.
Walters’ lawsuit against OpenAI states that “every statement of fact in the summary pertaining to Walters is false”, and goes on to claim that the allegations tend to “injure Walters’ reputation and [expose] him to public hatred, contempt or ridicule”.
OpenAI filed a motion to dismiss in July 2023, pointing to the disclaimers its chatbot displays notifying users of potentially false information. The motion also claimed that Walters suffered no actual harm as a result of the event and that OpenAI “is not subject to general jurisdiction in Georgia”, where the case was filed. A Georgia state judge rejected the motion to dismiss in January 2024.
The fair use defence
In 2013, a court ruled that Google was not infringing the Authors Guild’s copyrights by scanning and uploading books and excerpts to its online book search database. The ruling rested on the fair use doctrine, which, as the U.S. Copyright Office’s Fair Use Index explains, “promotes freedom of expression” by permitting the unlicensed use of copyright-protected works in certain circumstances. A fair use defence is commonly invoked when a copyrighted work is used for activities such as “criticism, comment, news reporting, teaching, scholarship and research”, the index says.
It should therefore come as no surprise that since the 2013 ruling, AI companies have relied heavily on fair use defences. “When it comes to copyright infringement allegations, a defendant’s best friend is the fair use doctrine,” Michael Rosen, an intellectual property lawyer and nonresident senior fellow at the American Enterprise Institute, told Tech Monitor. Rosen explains that in The New York Times v. OpenAI case, for example, “the generative AI giant contends that it only uses the Times’s content for training purposes, which should be considered a fair use, and that the Times may have manipulated ChatGPT to obtain the ‘regurgitations’”.
In these cases, the defence often compares AI training to studying. Just as students learn and take inspiration from other people’s work, lawyers argue, AI doesn’t steal training data but consumes and learns from it – as a human would.
But many interpretations are possible. Litigator Justin Kingsolver told Tech Monitor that several legal defences can be applied to AI lawsuits, including the argument that the generation of false claims – “hallucinations” – is not made with malice and therefore does not constitute defamation. Kingsolver also points to the widespread use of disclaimers by generative AI models, which undercut the idea that a bot’s answers are statements of fact – and therefore that they could be defamatory.
However, fair use and other defences can prove insufficient or inadequate. Matthew Butterick, a lawyer who specialises in AI lawsuits, told Wired in November 2023 that “the point of the Google Books project was to point you to the original books”, referring to the Authors Guild v. Google case. But “generative AI doesn’t do any of that”, Butterick said. “It doesn’t point you to the original work. The opposite—it competes with that work.”
Butterick, who is co-counsel in class-action lawsuits over GitHub Copilot and Stability AI’s Stable Diffusion, told Wired that he is “just one piece of this – I don’t want to call it a campaign against AI, I want to call it the human resistance”.
What does this mean for the AI industry?
A study by McKinsey & Company found that generative AI could add trillions of dollars in value to the global economy by 2032. However, the growing number of lawsuits against AI companies exposes legal limits that could slow the technology’s growth. The fundamental principle behind training generative models is at stake, potentially creating an existential crisis for the technology.
AI companies’ measures
Given the significant threat that AI lawsuits pose to companies, Kingsolver told Tech Monitor that many firms already “take proactive steps” to address legal concerns. In addition to clear copyright deals with content owners and disclaimers about the potential for hallucination, Kingsolver explains that the biggest AI companies are going even further. He says that “they’ve expended substantial resources to improve their product to […] reduce the risk of hallucination, reportedly to great effect”.
“Whether or not little attention has been paid to this issue in the past, I think it is unquestionable that much more attention will be paid in the future”, Kingsolver says. “The law is, for better or worse, reactive”.
Influence on European policymaking
While most of these lawsuits are playing out in the US – where generative AI models are largely developed – class-action suits have the potential to influence policy in London and Brussels. For both Rosen and Kingsolver, there is no doubt that European policies and regulations will be shaped by the wave of litigation. “Europeans are watching very closely how Americans deal with these issues, and vice versa,” Rosen says.
Given the scale of The New York Times lawsuit, Kingsolver believes that “this litigation […] will inevitably influence policymakers both in the US and around the world as they consider effective means to regulate AI.”