May 26, 2023, updated 29 Jun 2023 5:06pm

Google DeepMind builds ‘early warning system’ to spot AI risks

The research lab says future AI models could pose a threat to humans including by gaining access to weapons.

By Ryan Morrison

Google’s artificial intelligence research lab DeepMind has created a framework for detecting potential hazards in an AI model before they become a problem. This “early warning system” could be used to determine the risk a model would pose if deployed. It comes as G7 leaders prepare to meet to discuss AI’s impact and OpenAI promises $100,000 grants to organisations working on AI governance.

Artificial intelligence models could have the ability to source weapons and mount cyberattacks, warns DeepMind. (Photo by T. Schneider/Shutterstock)

UK-based DeepMind recently became more closely integrated with parent company Google. It has been at the forefront of artificial intelligence research and is one of a handful of companies working towards creating human-level artificial general intelligence (AGI).

The team from DeepMind worked on a new threat detection framework with researchers from academia and other major AI companies such as OpenAI and Anthropic. “To pioneer responsibly at the cutting edge of artificial intelligence research, we must identify new capabilities and novel risks in our AI systems as early as possible,” DeepMind engineers declared in a technical blog on the new framework.

There are already evaluation tools in place to check powerful general-purpose models against specific risks. These benchmarks identify unwanted behaviours in AI systems before they are made widely available to the public. They include checks for misleading statements, biased decisions and verbatim repetition of copyrighted content.

The problem comes from ever more advanced models whose capabilities go beyond simple generation. These include strong skills in manipulation, deception, cyber offence and other dangerous areas. The new framework has been described as an “early warning system” that can be used to mitigate those risks.

DeepMind researchers say the evaluation outcomes can be embedded in governance to reduce risk (Photo: DeepMind)

DeepMind researchers say responsible AI developers need to look beyond current risks and anticipate those that might appear in the future as models get better at thinking for themselves. “After continued progress, future general-purpose models may learn a variety of dangerous capabilities by default,” they wrote.

Although the outcome is uncertain, the team says a future AI system that isn’t properly aligned with human interests may be able to conduct offensive cyber operations, skilfully deceive humans in dialogue, manipulate humans into carrying out harmful actions, design or acquire weapons, or fine-tune and operate other high-risk AI systems on cloud computing platforms.


Moves to improve AI governance

Such AI systems may also be able to assist humans in performing these tasks, increasing the risk of terrorists accessing material and content not previously accessible to them. “Model evaluation helps us identify these risks ahead of time,” the DeepMind blog says.

The model evaluations proposed in the framework could be used to uncover when a certain model has “dangerous capabilities” that could be used to threaten security, exert influence or evade oversight. They would also allow developers to determine to what extent the model is prone to applying its capabilities to cause harm, a property known as its alignment. “Alignment evaluations should confirm that the model behaves as intended even across a very wide range of scenarios, and, where possible, should examine the model’s internal workings,” the team writes.

These results could then be used to understand the level of risk a model poses and which factors have contributed to it. “The AI community should treat an AI system as highly dangerous if it has a capability profile sufficient to cause extreme harm, assuming it’s misused or poorly aligned,” the researchers warned. “To deploy such a system in the real world, an AI developer would need to demonstrate an unusually high standard of safety.”

This is where governance structures come into play. OpenAI recently announced it would award ten $100,000 grants to organisations developing AI governance systems, and the G7 group of wealthy nations is set to meet to discuss how to tackle AI risk.

DeepMind said: “If we have better tools for identifying which models are risky, companies and regulators can better ensure” that training is done responsibly, that deployment decisions are based on risk evaluations, that transparency, including reporting on risks, is central, and that appropriate data and information security controls are in place.

Harry Borovick, general counsel at legal AI vendor Luminance, told Tech Monitor that compliance requires consistency. “The near constant reinterpretation of regulatory regimes has created a compliance minefield for both AI companies and businesses implementing the technology in recent months,” Borovick says. “With the AI race not set to slow down any time soon, the need for clear, and most importantly consistent, regulatory guidance has never been more urgent.

“However, those in the room would do well to remember that AI technology – and the way it makes decisions – isn’t explainable. That’s why it’s so essential for the right blend of tech and AI experts to have a seat at the table when it comes to developing regulations.”

