Google DeepMind Unveils Framework to Exploit AI’s Cyber Weaknesses

Strong defense comes from attacking the enemy’s weak points. Google DeepMind has developed an evaluation framework that highlights the areas where adversarial AI is weakest, allowing defenders to prioritize their defensive strategies.

DeepMind works at the cutting edge of AI, which it calls Frontier AI. This includes the path toward AGI (artificial general intelligence), where AI becomes able to reason for itself. In a new report (PDF), DeepMind analyzes the use of current AI in cyberattacks and the common frameworks used to evaluate such attacks, and finds those frameworks lacking. The shortfall will only worsen as the capabilities of AI, and the adversarial use of emerging AI, improve.

DeepMind examined the various existing methods of evaluating AI-assisted or AI-derived attacks. The biggest value of attack evaluation frameworks is that they demonstrate the attack methodology of adversaries and allow defenders to focus their defenses on particularly relevant areas of the attack chain. However, DeepMind found that current frameworks for evaluating AI-enabled attacks are ad hoc rather than systematic, and fail to provide defenders with useful insights.

Current frameworks tend to focus on the well-known forms of adversarial AI assistance: capability uplift, throughput uplift, and automation. In other words, they show that adversarial AI attacks can be more sophisticated, more frequent, and more widespread. This information, on its own, does not help defenders prioritize their response to an individual AI-focused adversarial attack.

The frameworks are currently weak in covering what DeepMind calls “AI’s significant potential in under-researched phases like evasion, detection avoidance, obfuscation, and persistence. Specifically, AI’s ability to enhance these stages presents a substantial, yet often underestimated, threat.” At the same time, while the evaluation frameworks concentrate on examining the various stages of the attack chain, they provide little guidance on how or where to disrupt the attack.

DeepMind set out to develop a framework for evaluating the complete attack cycle of adversarial AI attacks, to better understand the optimum points for cost-disruptive defensive mitigations. The framework also had to be flexible enough to cover both current AI and more advanced AI as it emerges.

It analyzed more than 12,000 actual attempts to leverage AI in cyberattacks from more than 20 countries, as cataloged by Google’s threat intelligence group. From this, it curated a list of attack chain archetypes and ran a bottleneck analysis to locate potential challenges in the attack chain – and developed a list of 50 such challenges.

“We considered attack stages that historically have been bottlenecks due to their reliance on human ingenuity, time intensive manual work, or specialized skills,” explains DeepMind in its report, “and evaluated the potential for AI to automate or augment these stages, thereby significantly reducing the cost of execution for attackers.”

It used Gemini 2.0 Flash to see if AI could assist the attacker in these specific challenge areas, and found that current AI is not yet effective at them. As a result, the defender has a list of points in the attack chain that are unlikely to have benefited from adversarial AI assistance, and that offer prime areas for defensive operations to break the chain.
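
To illustrate the prioritization idea, the following is a minimal Python sketch, not DeepMind's implementation. It assumes each challenge is tagged with an attack stage and a pass/fail result, then ranks stages by how rarely the model solved their challenges. The stage labels, challenge IDs, and results are hypothetical and do not come from the report.

```python
from collections import defaultdict

# Hypothetical evaluation results (not from the report):
# (attack stage, challenge id, did the model solve it?)
results = [
    ("reconnaissance", "recon-01", True),
    ("reconnaissance", "recon-02", True),
    ("evasion", "evade-01", False),
    ("evasion", "evade-02", False),
    ("persistence", "persist-01", True),
    ("persistence", "persist-02", False),
]


def solve_rate_by_stage(results):
    """Return the fraction of challenges the model solved in each stage."""
    totals = defaultdict(int)
    solved = defaultdict(int)
    for stage, _challenge_id, ok in results:
        totals[stage] += 1
        solved[stage] += int(ok)
    return {stage: solved[stage] / totals[stage] for stage in totals}


def defensive_priorities(results):
    """Order stages from lowest to highest AI solve rate.

    Stages the model struggles with come first: under the article's
    reasoning, these are the links least likely to be AI-assisted and
    so the likeliest places to break the chain. Ties sort by name.
    """
    rates = solve_rate_by_stage(results)
    return sorted(rates, key=lambda stage: (rates[stage], stage))


print(defensive_priorities(results))
```

With this sample data, the stages where the model failed every challenge sort to the front of the list, ahead of stages where it succeeded more often.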

“This structured approach allows us to not only identify potential AI driven risks but also to contextualize them within established cybersecurity frameworks, enabling defenders to strategically prioritize resources and proactively enhance their security posture in the face of evolving AI-driven cyber threats,” reports DeepMind.

This systematic approach to evaluating AI-assisted cyberattacks, based on challenges posed to the AI model in use, has multiple advantages. As AI models improve in their adversarial capabilities, that progress can be monitored as they begin to solve more of the challenges. Where current challenges remain unsolved by AI, defenders can better understand the strengths and weaknesses of the AI model being used, and gain insight into where to prioritize their mitigation strategies to have the greatest effect on disrupting the attack chain.
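
Monitoring that progress could be as simple as comparing challenge results between two evaluation runs. The sketch below is an assumption about how such tracking might look, not something described in the report; the challenge IDs and results are hypothetical.

```python
def newly_solved(previous, current):
    """Challenges solved in the current run that were not solved before."""
    return sorted(
        challenge
        for challenge, ok in current.items()
        if ok and not previous.get(challenge, False)
    )


def still_unsolved(current):
    """Challenges the current run still fails."""
    return sorted(challenge for challenge, ok in current.items() if not ok)


# Hypothetical results for an older and a newer model evaluation.
older_run = {"evade-01": False, "persist-01": False, "recon-01": True}
newer_run = {"evade-01": False, "persist-01": True, "recon-01": True}

print("Newly solved:", newly_solved(older_run, newer_run))
print("Still unsolved:", still_unsolved(newer_run))
```

A challenge moving into the newly solved list would signal that defenders can no longer count on that stage being free of AI assistance.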

The process can also help AI developers produce more secure versions of their models. By identifying potential risks and areas where the model can be misused, the DeepMind approach to evaluating emerging cyberattack capabilities can be used by AI developers to implement safeguards and improve the overall security of the model.

The basic principle is to find those areas where AI is currently weak in improving the attack (the challenges), use those challenges as pointers for defense teams, and monitor the progress of AI models in solving the challenges.

“It [the DeepMind evaluation framework] offers an advantage in the face of AI-enabled adversaries, because it equips defenders with decision-relevant insights to enhance their cyber defenses,” says the report. “Mitigating misuse requires a community-wide effort, including robust guardrails and safeguards from AI developers, as well as the evolution of defensive techniques that account for AI-driven TTP changes.”