
‘We feel awful about this’ – OpenAI fixes ChatGPT bug that may have breached GDPR

The company says it has fixed the problem with titles appearing in other user accounts.

By Ryan Morrison

OpenAI could be in breach of GDPR legislation after the titles assigned to users’ ChatGPT conversations were randomly exposed to other users without consent. The company described it as a “significant issue” with a third-party open-source library that has since been fixed. A legal expert said that any action would depend on the level of harm caused by the titles appearing in the account of another user, and what that information includes.

ChatGPT generates titles automatically for each chat session that can be adapted by the user. (Photo: Ascannio/Shutterstock)

Co-founder and CEO Sam Altman disclosed the problem on Twitter, saying: “we feel awful about this”.

In ChatGPT, starting a new conversation with the chatbot creates a note in the sidebar, which is given an AI-generated title as the conversation progresses. Users can edit the title or delete the note entirely. A small group of users were mistakenly shown other users’ titles.

Since its launch in November 2022, ChatGPT has become one of the fastest-growing consumer apps in history, hitting 100 million unique monthly users in January alone. It has sparked a flurry of activity with companies like Microsoft, a major investor in OpenAI, and Google launching their own chatbots and integrating generative AI tools into products.

It has also sparked calls for regulation and clarity on where the technology falls within legislation such as GDPR and the upcoming EU AI Act. ChatGPT is built on top of OpenAI’s GPT-4 multi-modal large language model, which was trained on data scraped from the internet, massive datasets from the likes of Wikipedia and law libraries, and other information not disclosed by the company.

Altman says there will be a “technical postmortem” into what caused the glitch. Information used in prompts and responses may be used to train the model, but only after personally identifiable information has been removed.


Need for regulation of AI

Countries around the world are actively exploring the impact of this type of technology, how to regulate it and how to ensure user data is protected. The UK is also setting up a new task force to examine the impact of large language models on society, the economy and individuals.

Lilian Edwards, professor of law at Newcastle University, says the Information Commissioner’s Office (ICO) may examine the type of breach experienced by OpenAI to see if UK data was exposed. In the event of a breach, the regulator will most likely ask the company to ensure it doesn’t happen again rather than take any further action. Tech Monitor has asked the ICO for comment.

Caroline Carruthers, CEO and co-founder of Carruthers and Jackson, says protecting user data is a core requirement of any organisation, particularly a data-rich one like OpenAI, and that breaches such as this could erode confidence in its business. Worse, she says, it also highlights the potential data pitfalls of AI.

“Platforms like ChatGPT rely on user data to function, but acquiring that data means users have to be able to trust that their information will be secure,” Carruthers says. “This should serve as a lesson to be learned to other businesses looking to utilise AI: you need to get your data governance basics right before you can graduate on to AI and ML.”

Ali Vaziri, legal director in the data and privacy team at Lewis Silkin, says whether the titles being shared with other users amounts to a data protection issue depends on whether the original user can be identified from the titles alone. “If the only information available to those other users are the conversation history titles, unless the titles themselves contain information from which the original user can be identified, it probably won’t be a personal data breach as far as a loss of confidentiality is concerned.”

Even if the titles were to contain personally identifiable information, whether it becomes a regulatory issue would depend on the level of harm. “If harm to users is likely, then that will be the trigger for any regulatory notifications which might need to be made,” said Vaziri.

“However, data protection law also requires controllers to ensure the accuracy of personal data they process, so displaying the wrong conversation history titles to a user might amount to a breach of that principle; and since doing so may have affected the integrity of personal data in that user’s account, the incident might constitute a personal data breach on that basis,” he added.

Data privacy and control

Vlad Tushkanov, lead data scientist at Kaspersky, told Tech Monitor users should have had “zero expectation of privacy”, as OpenAI warns that any conversation could be viewed by AI trainers and urges users not to share sensitive information in conversations. He urged users to “treat any interaction with a chatbot (or any other service, for that matter) as a conversation with a complete stranger: you don’t know where the content will end up, so refrain from revealing any personal or sensitive information about yourself or other people.”

Despite the warnings, some users have responded to Altman on Twitter claiming they had titles that included personal and “highly sensitive” information. The bigger issue, says Edwards, is the potential for sensitive information scraped from the internet to leak out in responses.

“It is well known these models leak personal data like sieves,” she warned, adding that “their training datasets contained infinite amounts of personal and often sensitive data and it may emerge randomly in response to a prompt at any point.”

Read more: These companies are creating ChatGPT alternatives
