The question of AI regulation has been on world leaders’ lips for months, from government executives to tech CEOs. The debate over how to balance AI innovation with regulation has seen legally binding measures come into force in several jurisdictions, including the UK, the US, China and EU member states.
But all around the world, AI regulation – however varied – is focused on guarding against the threats that the technology could pose to humanity. Perhaps inevitably, the focus on finding solutions to new problems has crowded out the conversation about the threats AI poses under laws that are already in place.
From copyright infringement to defamation and misinformation, generative AI’s core mechanism – learning from training data – stands accused of causing unlawful harm to businesses and individuals.
AI copyright infringement cases
Generative AI models are trained on large amounts of data – the larger the dataset, the more capable the model. From this input, the models identify patterns and generate text, images, videos or sounds as prompted. This means the (often copyrighted) data used to train generative models can sometimes be recognised in AI-generated content, without credit or licence. This has sparked waves of litigation, notably from artists and writers who have recognised their work in AI-generated content – or are worried they might.
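To illustrate why training data can resurface verbatim, consider a deliberately tiny, hypothetical sketch – a toy word-level Markov model in Python, not any vendor’s actual architecture. With a corpus this small relative to the model’s capacity, “learning patterns” and “memorising the text” collapse into the same thing:

```python
# Toy sketch (hypothetical, for illustration only): a word-level Markov
# model that "trains" on a text and can then reproduce it near-verbatim.
import random
from collections import defaultdict

# Stand-in for copyrighted training data.
training_text = (
    "the quick brown fox jumps over the lazy dog "
    "while the lazy dog dreams of jumping over the quick brown fox"
)

# "Training": record which word follows each word in the corpus.
transitions = defaultdict(list)
words = training_text.split()
for current_word, next_word in zip(words, words[1:]):
    transitions[current_word].append(next_word)

def generate(prompt: str, max_words: int = 12) -> str:
    """Generate text by repeatedly sampling a learned follower word."""
    output = prompt.split()
    for _ in range(max_words):
        followers = transitions.get(output[-1])
        if not followers:
            break
        output.append(random.choice(followers))
    return " ".join(output)

# With so little data, the model has effectively memorised its corpus:
# prompting with an opening phrase regurgitates the source almost word
# for word, e.g. "the quick brown fox jumps over the lazy dog ..."
print(generate("the quick"))
```

Production LLMs are vastly larger and usually paraphrase rather than copy, but the same dynamic explains why distinctive passages from a training set can occasionally be reproduced word for word.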
Some companies, including Shutterstock, Associated Press, Axel Springer and, more recently, Reddit and the Financial Times, have signed licensing deals with AI makers that let them train generative models on their content without risking copyright infringement. In return, OpenAI – the counterparty in most of these deals – gives them access to its products and AI-powered technology. However, other companies and organisations that have recognised their copyrighted work in LLM-generated answers have filed lawsuits against some of the most prominent AI companies.
The New York Times v. OpenAI and Microsoft
On 27 December 2023, The New York Times announced it was suing OpenAI and Microsoft, saying “millions of articles” had been used to train chatbots “that now compete with it”. After eight months of failed negotiations with OpenAI over the use of its content, The New York Times said the suit against the two tech companies results from seeing its investment in journalism taken and used “at no cost”. The Grey Lady accused the two companies of trying to avoid spending “the billions of dollars The Times invested in creating that work, by taking it without permission or compensation”.
The lawsuit document claims that “the defendants’ GenAI tools can generate output that recites Times content verbatim, closely summarises it, and mimics its expressive style”. It then states that “these tools also wrongly attribute false information to The Times.”
OpenAI said in a statement that it remains hopeful it can “find a mutually beneficial way to work together, as we are doing with many other publishers.”
However, in March 2024, OpenAI filed a motion arguing that the New York Times “hacked” ChatGPT to create the “highly anomalous results” that formed the basis for the lawsuit and that the bot “is not in any way a substitute for a subscription to the New York Times.” The newspaper denied the claims.
For its part, Microsoft filed a motion to dismiss part of the lawsuit, calling the newspaper’s claims a false narrative of “doomsday futurology”.
The Grey Lady is not the only one pointing a finger at OpenAI. In March 2024, Elon Musk filed a lawsuit against the AI maker, accusing it of abandoning its founding principle, which was to “benefit humanity”, in favour of commercial interests. The company called Musk’s allegations “frivolous” and said they were driven by his own commercial interests.
Further lawsuits from news organisations against AI companies followed: in February 2024, The Intercept, AlterNet and Raw Story sued OpenAI, and in April eight daily newspapers, including the Chicago Tribune and the New York Daily News, filed copyright suits against OpenAI and Microsoft.
Music publishers v. Anthropic
Music publishers Universal Music, ABKCO and Concord Publishing sued LLM maker Anthropic in October 2023 over accusations of “systematic and widespread” copyright infringement. The three music companies state in the suit that Anthropic “unlawfully” uses their copyrighted works – mostly lyrics – to train Claude, its AI-powered chatbot, which as a result generates identical copies of those lyrics without permission.
This is the first case to address AI’s use of music lyrics. The publishers are seeking financial damages as well as a court order to stop the alleged copyright infringement.
However, in January 2024, Anthropic filed a motion opposing the copyright infringement claims on several grounds, including that the case was filed in the wrong jurisdiction, as the company does not operate in Tennessee, where the suit was brought. Moreover, Anthropic argued that because its AI model only repeated lyrics after the music publishers prompted it to do so, copyright liability lies with the publishers rather than the AI company – under a copyright law principle called volitional conduct. The plaintiffs contested the latter argument in a Tennessee federal court in February 2024.
Sony Music warning to AI companies
Sony Music, one of the few music industry giants not to have sued AI makers over copyright infringement, has warned more than 700 companies not to use its content without permission. In May 2024, the label sent letters to hundreds of tech firms reminding them that they would need explicit consent to use Sony artists’ music, art or lyrics to train AI models.
Getty Images v. Stability AI
While most legal battles between content creators and AI firms are playing out in the US, UK courts are hearing one of the biggest AI copyright cases of 2023. Stock image platform Getty Images has sued Stability AI for allegedly infringing copyright by using the platform’s copyrighted images to train the image generation model Stable Diffusion.
One of the central pieces of evidence presented by Getty Images is Stable Diffusion’s ability to generate images bearing the platform’s recognisable grey-and-white watermark.
While the case is still at a very early stage, Mrs Justice Smith has made it clear that one of the core issues at play is whether the training and development of Stability AI’s model happened in the UK – referred to in the case documents as “the location issue”. As copyright is a territorial right, both parties have to provide evidence of location. Stability AI claims to have developed Stable Diffusion in the US (which would not constitute a breach of the UK’s Copyright, Designs and Patents Act), while Getty Images maintains that the servers and computers involved were based in the UK.
In December 2023, the High Court ruled that the case would proceed to trial. In the meantime, Getty Images has also sued Stability AI in the US over similar allegations.
Authors Guild class-action lawsuit
It is clear that the creative industries are particularly affected by the deployment of generative AI. In reaction to seeing their work used without permission or credit, America’s biggest professional organisation for writers, the Authors Guild, organised an open letter, signed by more than 15,000 writers, calling on generative AI developers “to obtain consent, credit and fairly compensate writers for the use of copyrighted materials in training AI”. Signatories include the likes of Dan Brown, Margaret Atwood and Celeste Ng.
In 2023, the Authors Guild and 17 authors filed a class-action lawsuit against OpenAI and Microsoft for using their copyrighted work to train their LLMs. In January 2024, The Hollywood Reporter reported that the Authors Guild is exploring the possibility of a blanket licence for AI companies – although individual authors could opt out of the model. The move comes across almost as a do-or-die measure: the organisation’s CEO, Mary Rasenberger, told The Hollywood Reporter that “we have to be proactive because generative AI is here to stay”.
Defamation and misinformation
Although LLMs are trained on vast quantities of data, they frequently generate inaccurate information. Given the often authoritative, factual tone of generative AI models, false or misleading answers can create potential civil liability.
Mark Walters v. OpenAI
In June 2023, Mark Walters – an American radio host whose website introduces him as “the loudest voice in America fighting for gun rights” – sued OpenAI in what became the first defamation lawsuit stemming from information generated by AI.
After a journalist prompted ChatGPT to summarise a court case (by providing a link to the suit), the bot responded that Walters was accused of “defrauding and embezzling funds” from a major pro-gun rights group, the Second Amendment Foundation (SAF). However, Walters’ name is not mentioned anywhere in that case, and no such accusations exist against him.
Walters’ lawsuit against OpenAI states that “every statement of fact in the summary pertaining to Walters is false”, and goes on to claim that the allegations tend to “injure Walters’ reputation and [expose] him to public hatred, contempt or ridicule”.
OpenAI filed a motion to dismiss in July 2023, citing the disclaimers its bot displays to notify users of potential false information. The document also claimed that Walters suffered no actual harm as a result of the incident and that OpenAI “is not subject to general jurisdiction in Georgia”, where the case was filed. A Georgia state judge rejected the motion to dismiss in January 2024.
Legal defences
In 2013, a court ruled that Google was not infringing the Authors Guild’s copyrights by scanning and uploading books and excerpts to its online book search database. The ruling was based on the fair use doctrine, which permits the unlicensed use of copyright-protected works in certain circumstances in order to “promote freedom of expression”, as the U.S. Copyright Office Fair Use Index puts it. A fair use defence is commonly employed when a copyrighted work is used for activities such as “criticism, comment, news reporting, teaching, scholarship and research”, the index says.
It should therefore come as no surprise that, since the 2013 ruling, AI companies have relied heavily on fair use defences. “When it comes to copyright infringement allegations, a defendant’s best friend is the fair use doctrine,” Michael Rosen, an intellectual property lawyer and nonresident senior fellow at the American Enterprise Institute, told Tech Monitor. In The New York Times v. OpenAI, for example, Rosen explains that “the generative AI giant contends that it only uses the Times’s content for training purposes, which should be considered a fair use, and that the Times may have manipulated ChatGPT to obtain the ‘regurgitations’”.
In these cases, the defence is often based on comparing AI training to studying. Lawyers argue that, just as students learn from and take inspiration from other people’s work, AI doesn’t steal training data but consumes and learns from it – as a human would.
But many interpretations are possible. Litigator Justin Kingsolver told Tech Monitor that several legal defences can apply to AI lawsuits, including the argument that the generation of false claims – “hallucinations” – is not made with malice and therefore does not constitute defamation. Kingsolver also points to generative AI models’ frequent use of disclaimers, which undercuts the idea that bots’ answers are statements of fact – and therefore that they could be defamatory.
However, fair use and other defences can prove insufficient or inadequate. Matthew Butterick, a lawyer specialising in AI lawsuits, told Wired that “the point of the Google Books project was to point you to the original books”, referring to the Authors Guild v. Google case. But “generative AI doesn’t do any of that”, Butterick said. “It doesn’t point you to the original work. The opposite – it competes with that work.”
Butterick, who is a co-counsel in class-action lawsuits over GitHub Copilot and Stability AI’s Stable Diffusion, told Wired that he is “just one piece of this – I don’t want to call it a campaign against AI, I want to call it the human resistance”.
What does this mean for the AI industry?
A study by McKinsey & Company found that generative AI could add trillions of dollars of value to the global economy by 2032. However, the growing number of lawsuits against AI companies points to legal constraints that could slow the technology down. The fundamental principle behind training generative models is at stake, potentially creating an existential crisis for the technology.
AI companies’ measures
Given the significant threat that AI lawsuits pose to companies, Kingsolver told Tech Monitor that many firms already “take proactive steps” to address legal concerns. In addition to striking clear copyright deals with content owners and adding disclaimers about the potential for hallucination, Kingsolver explains that the biggest AI companies are going even further: “they’ve expended substantial resources to improve their product to […] reduce the risk of hallucination, reportedly to great effect”.
“Whether or not little attention has been paid to this issue in the past, I think it is unquestionable that much more attention will be paid in the future”, Kingsolver says. “The law is, for better or worse, reactive”.
Influence on European policymaking
While most of these cases are being heard in the US – where generative AI models are largely developed – class-action lawsuits have the potential to influence policy in London and Brussels. For both Rosen and Kingsolver, there is no doubt that European policies and regulations will be affected by the wave of lawsuits. “Europeans are watching very closely how Americans deal with these issues, and vice versa,” Rosen says.
Given the scale of The New York Times lawsuit, Kingsolver believes that “this litigation […] will inevitably influence policymakers both in the US and around the world as they consider effective means to regulate AI.”
In fact, in March 2024, the EU passed the first-ever set of legally binding rules around AI. One of the central provisions of this EU AI Act, which will be fully in force by 2027, is that AI systems must comply with EU copyright law.