Three news publications have filed separate lawsuits against Microsoft and OpenAI for copyright issues. This allegedly includes author, title and other key copyright information being removed during the training of artificial intelligence (AI) models. All three cases are being litigated by the same law firm in Southern District of New York.
The Intercept, AlterNet and Raw Story said that ChatGPT, OpenAI’s most prominent generative AI tool, “some of the time” reproduces “verbatim or nearly verbatim copyright-protected works of journalism without providing author, title, copyright or terms of use information contained in those works.” If ChatGPT had been using material that included the necessary copyright information to train AI, the chatbot “would have learned to communicate that information when providing responses.”
No strangers to copyright infringement
The lawsuits claim OpenAI and Microsoft are aware of the potential copyright infringement, with the news organisations pointing to OpenAI’s opt-out system as evidence which allows website owners to block content from web crawlers.
The three US news organisations did not provide examples of the stories that have been reproduced, but claim that regurgitations of articles from AlterNet, Raw Story and The Intercept appear in examples that ChatGPT has leveraged to train bots. These lawsuits reflect a media industry-wide trend of copyright infringement cases against OpenAI and other AI developers, some of which involve the supposed removal of copyright management metadata.
A series of allegations
In December 2023, The New York Times filed a lawsuit against ChatGPT, claiming it reproduces journalistic work without permission. OpenAI, the developer of ChatGPT, asked the court to dismiss parts of the copyright lawsuit, arguing the publication “hacked” its chatbot ChatGPT and other AI systems to regenerate articles as misleading evidence for the lawsuit.
“Raw Story feels that news organisations must stand up to OpenAI, which is violating the Digital Millennium Copyright Act and profiting from the hard work of journalists whose jobs are under siege,” AlterNet and Raw Story’s CEO John Byrne said in a joint statement. “It’s important to democracy that a diverse array of news sites continue to thrive. OpenAI’s violations, if not checked, will further decimate the news industry, and with it, the critical news reporters who affect positive change.”
These latest developments reflect a wider issue of copyright infringement, misinformation, bias and plagiarism reported by corporations in the training of generative AI models using copyrighted data. The proliferation of data and the increased use of generative AI models are prompting ethical debate for corporates and individuals who demand increased regulation as they fear risks of financial and IP losses.