Industry Insights
The Anthropic GTG-1002 Report: Nothing New, But Your Controls Better Be Tight
November 18, 2025
·
13-Minute Read
"Doing the same thing over and over again and expecting different results is insanity. Unless you're a nation-state running Claude Code for APT campaigns - then it's just good automation."
Bottom Line
GTG-1002 used the same attack techniques nation-states have used for decades - the only difference is AI orchestration enabling unprecedented speed and scale. Your existing controls will now face their most brutal stress test yet.
TLDR
Anthropic published an analysis of GTG-1002, a Chinese state-sponsored operation that used Claude Code for cyber espionage against ~30 entities including major tech companies and government agencies. The headlines mostly focus on "AI-orchestrated attacks," but here's what actually matters:
The attack chain is identical to traditional APT campaigns: reconnaissance, credential theft, lateral movement, data exfiltration. Zero novel techniques. Zero sophisticated exploits. Zero custom malware. Just commodity tools executing standard procedures.
What changed is the operational tempo. AI orchestration enabled 80-90% autonomous execution at "physically impossible request rates" - thousands of operations executed across multiple simultaneous intrusions with minimal human oversight.
This means your credential hygiene, behavioral monitoring, and incident response capabilities will face stress testing at machine speed. Controls that seemed adequate when human operators were rate-limited will break under AI orchestration. Static credentials that survived because manual exploitation was too slow will now be systematically harvested and tested at scale.
The playbook didn't change. The execution speed did. Tighten your controls accordingly.
What Actually Happened
The Anthropic report describes a standard APT kill chain executed via AI orchestration:
- Reconnaissance: Network enumeration, service discovery, attack surface mapping. Standard tools via MCP servers orchestrated by Claude Code. Nothing here differs from manual reconnaissance except execution speed.
- Initial Access: SSRF exploitation, callback validation, foothold establishment. The report explicitly states "minimal reliance on proprietary tools or advanced exploit development." Translation: publicly available exploits against known vulnerabilities.
- Credential Harvesting: Extraction of API keys, service accounts, and certificates from configurations and metadata endpoints. Systematic testing across discovered infrastructure. Again, standard procedure - just automated.
- Lateral Movement: Using harvested credentials to authenticate against internal APIs, databases, container registries, and logging systems. Building network topology maps based on successful authentication attempts.
- Data Exfiltration: Database queries, data parsing, intelligence categorization. Creating persistent backdoor accounts for follow-on access.
Every single phase is textbook APT methodology. The technical sophistication is near zero.
What makes GTG-1002 significant is its operational tempo: "sustained request rates of multiple operations per second" across "roughly 30 entities" simultaneously. One operator with AI orchestration achieves the output of an entire APT team.
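To give a rough sense of what that tempo means for a defender, here's a minimal sketch of a sliding-window rate check over authentication or API events - the kind of behavioral baseline that machine-speed orchestration will stress-test. The thresholds and event format are assumptions for illustration, not values from the report.

```python
from collections import defaultdict, deque

# Hypothetical thresholds for illustration; tune against your own baseline.
WINDOW_SECONDS = 60
MAX_OPS_PER_WINDOW = 120  # ~2 operations/sec sustained for a full minute

recent = defaultdict(deque)  # identity -> timestamps of recent operations


def record_operation(identity: str, timestamp: float) -> bool:
    """Record an operation; return True if the identity's tempo looks machine-driven."""
    events = recent[identity]
    events.append(timestamp)
    # Drop events that have aged out of the sliding window.
    while events and timestamp - events[0] > WINDOW_SECONDS:
        events.popleft()
    return len(events) > MAX_OPS_PER_WINDOW
```

A rate check alone won't catch a careful operator, but it illustrates the point: controls tuned for human-speed activity need explicit thresholds for tempos no human would ever produce.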
But here's what nobody wants to discuss: what this actually means.
Hard Truth #1: There's Nothing Novel Here Except the Speed
The Anthropic report generated plenty of headlines about "AI-orchestrated cyber espionage" and "agentic AI attacks." The security industry will debate AI threats and autonomous agents at length.
But they're all missing the point.
GTG-1002 used the same kill chain that APT groups have used for fifteen years. The same reconnaissance techniques. The same credential harvesting. The same lateral movement. The same data exfiltration.
The report explicitly confirms: "minimal reliance on proprietary tools or advanced exploit development." No zero-days. No sophisticated exploits. No custom malware. Just commodity penetration testing tools - network scanners, database exploitation frameworks, password crackers - orchestrated through MCP servers.
This isn't a new attack category requiring new defenses. This is the same attack you've always faced, executing at machine speed instead of human speed.
Which means if your current controls can't stop traditional APT campaigns, they definitely can't stop AI-orchestrated ones. The vulnerability isn't that AI created new attack vectors. The vulnerability is that your existing defenses were barely adequate when attackers were rate-limited by human operators.
Now that rate limit is gone.
Hard Truth #2: We've Seen This Movie Before - It Was Called "autopwn"
For those with long memories, GTG-1002's approach isn't even conceptually new. Remember Metasploit's autopwn module?
Around 2008, the Metasploit Framework introduced autopwn - a feature that automatically selected and executed exploits against discovered vulnerabilities. You'd scan a network, autopwn would identify vulnerable services, and automatically fire off exploits until something stuck. It was "Hail Mary" exploitation before we called it that.
The security community lost its mind. Automated exploitation at scale! The death of skilled penetration testing! Script kiddies with nation-state capabilities!
Then reality set in. Autopwn was noisy. It generated massive logs. Defensive tools adapted. Organizations that had mature security controls before autopwn remained secure after autopwn. Organizations with immature security got compromised either way - autopwn just made it happen faster.
GTG-1002 is autopwn with better orchestration. Claude Code systematically tested credentials, probed for vulnerabilities, and executed exploits automatically. The AI maintained state, adapted based on results, and optimized attack paths - improvements over autopwn's blunt force approach.
But the fundamental concept is identical: automated exploitation at scale using commodity tools.
The difference now is that AI orchestration is more sophisticated, harder to detect, and accessible to more threat actors. What required a skilled Metasploit operator in 2008 now requires prompting Claude Code with target information.
The blast radius expands. The defensive requirements remain the same: know your attack surface, monitor for anomalies, enforce least privilege, assume credentials will be compromised.
If your organization survived the autopwn era, you know what works. If you're newer to security and think AI-orchestrated attacks are unprecedented, study history. The tools change. The principles don't.
Hard Truth #3: We Already Proved Attackers Move at Automation Speed - GTG-1002 Just Confirms It
Last year at Clutch, we ran an experiment that revealed something pretty uncomfortable: attackers exploit leaked secrets faster than any realistic rotation schedule.
We deliberately leaked AWS keys, tokens, and credentials across GitHub, Docker Hub, NPM, PyPI, Pastebin, and other platforms. Then we watched what happened.
The fastest exploitation occurred in under 40 seconds. GitHub leaks were forked and exploited within one minute. Across every platform, automated systems discovered and tested our credentials within minutes to hours - with peak exploitation activity concentrated around early morning UTC when defenders are asleep.
Even when we rotated secrets hourly and re-leaked them, the new keys were exploited just as fast. We proved what many suspected: rotation provides the illusion of security while attackers operate at automation speed.
GTG-1002 reinforces this reality at a higher level of sophistication.
The attackers in our experiment used simple automated scrapers. GTG-1002 used AI orchestration that could autonomously discover credentials, test them across infrastructure, map privilege boundaries, and identify high-value targets - all at "physically impossible request rates."
This is the evolution we predicted. Attackers don't just scan faster - they think faster, adapt faster, and exploit faster.
Our rotation research demonstrated that static credentials are vulnerable regardless of rotation frequency because exploitation happens faster than rotation cycles. GTG-1002 demonstrates that AI orchestration makes this gap even wider.
The attack window keeps shrinking. Rotation schedules keep assuming human-speed attacks. The math doesn't work.
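Here's the back-of-the-envelope version, using our fastest observed exploitation time and an hourly rotation schedule as illustrative inputs:

```python
# Back-of-the-envelope exposure math with illustrative numbers:
# if a leaked key is exploited ~40 seconds after exposure and rotated hourly,
# it spends nearly its entire lifetime usable by the attacker.
time_to_exploit_s = 40            # fastest exploitation we observed
rotation_interval_s = 60 * 60     # hourly rotation

usable_fraction = (rotation_interval_s - time_to_exploit_s) / rotation_interval_s
print(f"Attacker-usable for {usable_fraction:.1%} of the credential's lifetime")
# ~98.9% with hourly rotation; even rotating every 5 minutes leaves ~86.7%
```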
The answer isn't rotating faster. The answer is ephemeral credentials that expire automatically, breaking the attacker's assumption that a harvested credential will still be valid by the time they try to use it.
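What that looks like in practice varies by platform. As one hedged example, here's a minimal sketch using AWS STS to mint credentials that expire on their own; the role ARN and session name are placeholders, not a prescription.

```python
import boto3

# Minimal sketch: request short-lived credentials instead of storing static keys.
# The role ARN and session name are placeholders for illustration.
sts = boto3.client("sts")
resp = sts.assume_role(
    RoleArn="arn:aws:iam::123456789012:role/example-deploy-role",  # placeholder
    RoleSessionName="ephemeral-task",
    DurationSeconds=900,  # 15 minutes - the credentials expire on their own
)
creds = resp["Credentials"]
# Use AccessKeyId / SecretAccessKey / SessionToken for the task at hand;
# there is no long-lived secret left sitting in a config file to harvest.
print("These credentials expire at:", creds["Expiration"])
```

The point isn't the specific API - it's that nothing long-lived exists for an attacker to discover, test, and reuse at machine speed.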
GTG-1002 is confirmation of what we already showed: attackers move at automation speed. Your defenses need to match that speed, not lag behind with manual processes.
Hard Truth #4: You're Now Dependent on AI Companies to Defend You - And You Didn't Consent to It
Here's a detail from the Anthropic report that deserves more attention: "Eventually, the sustained nature of the attack triggered detection, but this kind of 'social engineering' of the AI model allowed the threat actor to fly under the radar for long enough to launch their campaign."
Read that carefully. Detection occurred after "sustained" activity, meaning the attacks were already underway, already compromising targets.
But who detected it? The report is deliberately ambiguous.
Was it Anthropic's internal abuse monitoring that caught the malicious usage patterns? Or did victim organizations detect the intrusions independently and report them?
This ambiguity matters because it reveals a new dependency in your security architecture: AI model providers as a detection layer.
If Anthropic's monitoring detected GTG-1002, that means:
You're now relying on AI companies to monitor for malicious usage of their models. You didn't opt into this dependency. You didn't negotiate SLAs for it. You don't know what their detection thresholds are, how long it takes them to respond, or whether they'll notify you if your organization is targeted.
Your security posture partially depends on how good OpenAI, Anthropic, Google, and others are at detecting abuse. When attackers use Claude Code or GPT-4 or Gemini to orchestrate campaigns against you, do the AI providers catch it? How fast? With what accuracy? You have no visibility into this.
This creates an untested, unverified control in your security architecture. You can't audit it. You can't benchmark it. You can't improve it. You just have to hope they're doing it well.
Think about the implications. If GTG-1002's detection was triggered by Anthropic's monitoring, then the victims were protected not by their own controls, but by Anthropic's abuse detection. That's a defensive layer you didn't plan for, can't control, and can't verify.
If detection came from the victims' own monitoring, then Anthropic's controls failed to stop sustained malicious usage long enough for multiple organizations to be compromised. That means AI provider monitoring isn't reliable enough to depend on.
Either interpretation is problematic.
The security industry spent decades moving away from security through obscurity and dependence on third parties. We built layered defenses we could control, audit, and improve. Now we're inadvertently dependent on AI model providers to catch attackers using their products against us.
This isn't necessarily bad - additional detection layers help. But it's uncomfortable that this dependency exists by default, without transparency about capability, coverage, or coordination.
You should assume AI provider abuse detection exists but is unreliable. Don't depend on it. Build your own controls that detect credential compromise and anomalous behavior regardless of what tools attackers use to orchestrate campaigns.
The alternative is crossing your fingers and hoping Anthropic, OpenAI, and others have mature enough abuse detection to catch the next GTG-1002 before it succeeds.
That's not a security strategy. That's hope.
Hard Truth #5: If This Got Published, Imagine What Didn't
Anthropic describes GTG-1002 as "the first reported AI-orchestrated cyber espionage campaign." Let's talk about what "first reported" actually means.
I've spent enough time around incident response to know how this works. For every breach that makes headlines, dozens never see daylight. Mandiant, CrowdStrike, and other top IR firms respond to hundreds of intrusions annually. How many make it to public disclosure? 5%? Maybe 10% in a good year?
Most compromises never get reported because:
- Legal and regulatory requirements don't always mandate disclosure. Unless the breach involves specific data types in specific jurisdictions, there's no requirement to tell anyone. Most organizations choose silence.
- PR damage outweighs transparency benefits. Why volunteer that your security failed? Why give competitors ammunition? Why spook customers? Silence is easier.
- Attribution is hard and public claims are risky. Anthropic can definitively say GTG-1002 used Claude Code because they have internal telemetry. Most victims don't know what tools attackers used. They just know they got compromised.
- IR firms are bound by NDAs. The engagements that would provide the most insight into attack trends never become public. The data stays locked in internal reports and executive briefings.
Now apply this reality to AI-orchestrated attacks.
Anthropic published GTG-1002 because they detected it in their own infrastructure and chose transparency. How many other AI-orchestrated campaigns are currently active that we don't know about?
How many times has GPT-4 been used for similar operations without OpenAI detecting it or choosing to publish? How many Gemini-orchestrated intrusions are happening right now? How many attackers are using locally-run models with no visibility whatsoever?
The GTG-1002 report documents ~30 targeted entities with "a handful of successful intrusions." Those are the ones Anthropic observed. How many targets were hit by attackers using different AI platforms? How many succeeded without triggering any detection?
This isn't speculation. This is how threat landscapes work. Published attacks represent the visible minority. The iceberg principle applies here more than anywhere.
Consider what publication requires:
- Detection by someone with visibility and authority to publish
- Attribution confidence worth staking reputation on
- Legal clearance from affected parties
- Strategic decision that transparency serves broader goals
- Technical detail sufficient for defensive value
Most intrusions fail at least one of these gates. AI-orchestrated attacks are no different.
The uncomfortable reality: GTG-1002 is notable not because it's the first AI-orchestrated campaign, but because it's the first one that met all the criteria for public disclosure.
You can bet that nation-states aren't running single experimental campaigns with new capabilities. When China, Russia, North Korea, or Iran develop new attack methodologies, they deploy them at scale across multiple operations until defenses adapt.
If GTG-1002 successfully compromised "major technology corporations and government agencies," do you think this was the only operation? Or do you think this is one campaign in a broader program that's been running for months?
Having done incident response and now looking at the industry from the outside, here's what this tells me:
- AI-orchestrated attacks are already widespread. GTG-1002 isn't an outlier. It's the one we know about.
- Most victims don't know they were compromised by AI orchestration. They see the credential theft and lateral movement in logs. They don't see the AI framework that orchestrated it.
- Attribution is nearly impossible without AI provider cooperation. Unless the AI company detects the abuse and shares telemetry, victims can't distinguish AI-orchestrated attacks from human-operated ones. The IOCs look identical.
- The detection gap is massive. If Anthropic caught this after "sustained" activity against 30 targets, how many campaigns against 5 or 10 targets are flying completely under the radar?
This is why the "first reported" framing matters. It's not the first. It's the first we're hearing about.
And if history is any guide, it's probably not even in the top 10 most sophisticated AI-orchestrated operations currently active. It's just the one that hit the combination of factors required for publication.
The takeaway isn't "AI-orchestrated attacks are coming." The takeaway is "AI-orchestrated attacks are already here, you're already being targeted, and most victims don't know it yet."
It's worse than you think. It always is.
What This Actually Means
GTG-1002 isn't a wake-up call about AI threats. It's a wake-up call about credential architecture.
Organizations with mature credential security - comprehensive inventory, behavioral baselines, automated enforcement, ephemeral architecture - are already positioned to defend against AI-orchestrated attacks. The controls don't change. They just face harder stress testing.
Organizations with immature credential security - partial inventory, manual monitoring, reactive response, static architecture - are now critically exposed. Vulnerabilities that survived because manual exploitation was too slow will be systematically discovered and exploited at machine speed.
The technical details of GTG-1002 aren't novel. What's novel is the operational scale and execution velocity that AI orchestration enables.
The attacks aren't getting more sophisticated. They're getting faster.
Your controls need to handle that velocity. If they can't, tighten them now.
