What once took teams now takes minutes
The Anthropic attack, disclosed in mid-November, illustrates the new reality. Chinese hackers manipulated Claude Code, an agentic AI tool designed for legitimate software development, into performing most of the work traditionally done by humans. The AI conducted reconnaissance on target systems, identified security vulnerabilities, wrote custom exploit code, harvested credentials, and exfiltrated data. According to Anthropic's analysis, the attackers automated 80-90% of the campaign, requiring human intervention at only a handful of critical decision points.
At the attack's peak, the AI made thousands of requests, often multiple per second. That's a pace no team of hackers could sustain.
Testifying before the House Homeland Security Committee this month, an Anthropic executive called the incident a proof of concept showing that concerns about AI-powered hacking are no longer hypothetical. Kevin Mandia, who founded Mandiant, the cybersecurity firm Google acquired for $5.4 billion, and now leads a new AI-focused security startup called Armadin, offered a blunter prediction. "Offense is going to be all-AI in under two years," he told The Wall Street Journal.
To be clear, most hacking still doesn't require anything close to this level of sophistication. Millions of people still use "password" as their password. Phishing emails work because someone clicks the link. People give out bank information to strangers who call them and ask for it. The overwhelming majority of breaches exploit human error, not cutting-edge AI.
But for nation-state attackers and well-resourced criminal groups targeting hardened systems, AI represents a force multiplier that changes the calculus of what's possible.
The next frontier involves malware that thinks for itself
Today's AI-powered attacks still need to phone home. The malware talks to an AI service in the cloud, gets instructions, and acts on them. But security researchers are already exploring what happens when that's no longer necessary.
Dreadnode, a security research firm, has prototyped malware that uses the AI already installed on a victim's computer. No internet connection required, no server for defenders to track down and shut off. Their experiment took advantage of the fact that Microsoft now ships Copilot+ PCs with pre-installed AI models.
In their proof of concept, Dreadnode created malware that uses the victim's own on-device AI to make decisions about what to do next, eliminating the need for the traditional back-and-forth communication between malware and a hacker's server. The AI assesses the local environment, decides which actions to take, and adapts its behavior accordingly.
The experiment required more hand-holding than the researchers initially hoped. Current small AI models lack the sophistication of cutting-edge systems, and most computers don't have the specialized hardware to run AI inference without grinding to a halt. Dreadnode still came away convinced that building autonomous malware without external infrastructure “is not only possible but fairly straightforward to implement.”
As AI hardware becomes more common and on-device models grow more capable, this technique could become practical at scale.
The pace of improvement is what worries researchers most. Just 18 months ago, AI models struggled with basic logic and had limited coding abilities. Today, they can execute complex multi-step attack sequences with minimal human oversight. Security firms testing frontier models report growing evidence that AI systems are improving at finding weaknesses and stringing them together into attacks.
Fully autonomous AI attacks remain out of reach for now. The Chinese hackers using Claude still needed to jailbreak the model and approve its actions at key points.
But the same capabilities that help people write code and automate their work can be turned toward breaking into systems. The tools don't care which side you're on.
—Jackie Snow, Contributing Editor