Any copyrighted material used to train the foundation artificial intelligence models that power tools like ChatGPT and Midjourney will have to be disclosed publicly under new European Union rules. The new proposed legislation forms part of the comprehensive Artificial Intelligence Act finally passed in draft form yesterday. It was described as a “compromise” solution by lawmakers.

EU legislators decided to focus on transparency rather than outright banning general use of generative AI tools for text and images. (Photo by Kaspars Grinvalds/Shutterstock)

Work started on the EU AI Act in April 2021 as a way to regulate the use of artificial intelligence technology. Since drafting began, the landscape has changed with an explosion in use of generative and general purpose AI tools, something not widely considered an issue at the time.

As part of the final draft agreement, lawmakers came to a compromise between ignoring copyright and banning the use of copyrighted content in training AI models. The new regulations require disclosure of any copyrighted material, such as images or novels, used to train a foundation model.

This isn’t the final stage for the act, as it still needs to go to ‘trilogue’. This is the point where lawmakers from the EU and those from the various member states will work out the final details before the bill becomes law. The fundamental “risk-based” approach is unlikely to change, though specific details of how it works in practice may.

The copyright provision was only drawn up within the past two weeks, following announcements from companies including Microsoft, Google and Salesforce of major deployments of generative AI tools within existing and new product lines. Meta also confirmed it would be adding generative AI tools to Facebook, Instagram and WhatsApp in the coming months.

EU takes a ‘middle ground’ approach to generative AI

“Against conservative wishes for more surveillance and leftist fantasies of over-regulation, parliament found a solid compromise that would regulate AI proportionately, protect citizens’ rights, as well as foster innovation and boost the economy,” Svenja Hahn, a European Parliament deputy, told Reuters.

The copyright provision comes under the wider regulations on general purpose AI, that is, systems that have no specific use case but could be deployed across multiple areas and tasks, such as image recognition models or large language models. This has been a subject of heated debate in the EU, the UK and around the world, with calls for stricter obligations on foundation models.

As well as declaring copyrighted content, developers of foundation AI models will have to ensure those models are designed in accordance with EU law and fundamental rights such as freedom of expression. It isn’t clear what the penalty for failing to do so will be, but it could include being made to retrain the model.

The wider act takes a risk-based approach to regulating artificial intelligence, where tools will be classified based on perceived risk level. For example, very low-risk uses with no impact on individuals will barely be regulated, but those in high-risk areas such as biometric surveillance or making decisions on health or legal issues will be under intense scrutiny. The act doesn’t prevent high-risk uses of AI but requires much higher levels of transparency over how they work and are trained.
