Anthropic, NSA Cyber Offense, and the Contradictions of Its AI Pause Call

Anthropic Is Quietly Powering U.S. Cyber Offense While Calling for an AI Pause

Anthropic, the company behind the Claude family of large language models, has embedded a small team of engineers inside the U.S. National Security Agency (NSA) to help adapt its most advanced AI system, known as Mythos, for offensive cyber operations, according to a report from the Financial Times.

Around half a dozen Anthropic staff are described as “forward‑deployed” at the agency. Their role is not simply to provide technical support, but to tune and customize Mythos for highly specialized use cases. One source familiar with the arrangement suggested the system could assist with infiltrating adversary networks, naming China and Iran as examples of potential targets.

It remains unclear whether those engineers are currently involved in live cyber operations or whether they are still in the prototyping and integration phase. What is confirmed, however, is that Mythos is the same highly capable model Anthropic has refused to open up to the public, citing fears of large‑scale misuse.

Instead of making Mythos broadly accessible, Anthropic has confined it to a tightly controlled initiative called Project Glasswing. Under this program, only a handful of heavily vetted partners are allowed to experiment with the model. That select group reportedly includes some of the world’s largest technology firms, such as Microsoft, Apple, and Amazon, companies that are themselves racing to embed AI into cloud infrastructure, productivity software, and consumer devices.

The decision to keep Mythos locked behind institutional gates reflects Anthropic’s own messaging about AI risk. Company leaders have repeatedly warned that powerful general‑purpose systems could enable automated cyberattacks, rapid biological weapons research, and large‑scale information operations if they fall into the wrong hands. In internal and public documents, Anthropic has raised the specter of models that can iteratively improve themselves, shortening development loops and pushing humans “out of the loop” in key decision‑making processes.

At the same time, Anthropic finds itself in an increasingly fraught relationship with the U.S. security establishment. The company has initiated legal action against the Pentagon, challenging aspects of the Department of Defense’s AI procurement and deployment strategy. In late February, Defense Secretary Pete Hegseth designated a major AI initiative as a core national security priority, accelerating funding, contracting, and deployment timelines for military‑grade AI tools. Anthropic’s lawsuit argues that parts of this process are opaque, potentially distorted by incumbent defense contractors, and insufficiently attentive to long‑term safety concerns.

This dual posture captures the deep contradictions shaping the industry: Anthropic is supplying cutting‑edge AI to one of the world’s most secretive intelligence agencies while suing another arm of the same government over AI governance. On one side, Anthropic is leaning into U.S. national security partnerships as a way to shape how frontier AI models are weaponized. On the other, it is publicly urging policymakers and competitors to slow down, pause certain kinds of development, or at least build in stronger safety brakes before unleashing the next wave of systems.

The company has already released detailed research on the ways large models can be “fine‑tuned” or coaxed into bypassing guardrails, even when those guardrails appear robust in casual testing. Its own red‑teaming exercises have shown that seemingly harmless models can, under persistent pressure and with specialized prompting, generate step‑by‑step instructions for cyber intrusions, vulnerability exploitation, and other offensive techniques. Mythos, as Anthropic’s most capable system, sits at the center of those concerns.

From a national security perspective, the NSA collaboration is easy to understand. Offensive cyber operations have historically relied on human analysts and bespoke tooling built over months or years. A system like Mythos promises faster reconnaissance on target networks, automated code analysis, rapid generation of exploit proof‑of‑concepts, and faster adaptation to defensive patches. Even if the model is used only as an assistant that surfaces options, drafts scripts, or summarizes technical intelligence, it could make U.S. cyber units more agile and more dangerous.

Yet the same properties that make Mythos attractive to the NSA are precisely what Anthropic and other safety‑focused labs warn about when they call for a pause or slowdown in the frontier AI race. A model that can help an allied intelligence agency map out vulnerabilities can, in principle, be repurposed by a hostile actor to do the same. Once techniques, datasets, or configurations leak, they are difficult to contain. The line between “defensive” and “offensive” tooling is increasingly blurry: tools designed to probe and harden systems can often be turned around to break them.

Critics argue that this creates an uncomfortable form of ethical outsourcing. Instead of preventing AI weaponization, companies like Anthropic are choosing to channel it through “trusted” institutions. The assumption is that democratic governments, however flawed, will ultimately use these tools more responsibly than authoritarian regimes. But that premise is not universally accepted, especially in light of past surveillance abuses, mass data collection, and controversial cyber operations carried out by Western intelligence agencies.

For Anthropic, the bet seems to be that working closely with U.S. agencies provides leverage: influence over how powerful models are integrated into sensitive workflows, a seat at the table when new rules are drafted, and a measure of control over who gets access to the most dangerous capabilities. Restricting Mythos to programs like Project Glasswing is framed as a harm‑reduction strategy: better to keep the crown jewels behind limited‑access doors than to release them into the wild and hope for the best.

However, this strategy also amplifies global AI tensions. Rival states see U.S. partnerships with Anthropic and other frontier labs as part of a broader attempt to consolidate AI military advantage. Reports that Mythos could be used to infiltrate Chinese or Iranian networks inevitably feed into threat perceptions in Beijing and Tehran. Those governments are already investing heavily in domestic AI for cyber, intelligence, and military use. The perception that American firms are effectively serving as force multipliers for U.S. intelligence could accelerate their own offensive AI programs and reduce any appetite for international arms‑control‑style agreements.

This stands in sharp contrast to Anthropic’s public rhetoric about multilateral governance, shared safety standards, and international coordination on high‑risk AI research. On paper, the company supports ideas like joint monitoring of frontier models, compute caps, and binding transparency rules for the most powerful systems. In practice, its tight alignment with select Western governments, together with the secrecy surrounding Mythos and Project Glasswing, makes it harder to persuade rival blocs that AI norms will be applied fairly across geopolitical lines.

Domestically, the company’s legal challenge to the Pentagon underscores another tension: who gets to decide what “responsible AI” looks like in areas like warfare, intelligence, and cyber operations. By questioning how defense AI contracts are awarded and evaluated, Anthropic is implicitly arguing that technical expertise in model safety should carry more weight than traditional procurement practices or lobbying muscle. The outcome of that fight could shape not only which companies dominate military AI, but also how deeply safety constraints are baked into those systems from the outset.

The broader debate about an “AI pause” is increasingly entangled with these security realities. Calls to halt or slow training of more powerful models often collide with arguments that the U.S. and its allies cannot afford to cede ground to China, Russia, or other rivals. Proponents of acceleration claim that offensive and defensive AI capabilities are now a core part of deterrence. Those urging restraint counter that racing forward without guardrails heightens the risk of catastrophic accidents, escalatory cyber incidents, or the eventual loss of human oversight over critical systems.

Anthropic’s situation illustrates that this is no longer an abstract philosophical conflict. The same company that warns about models capable of self‑directed improvement and autonomous action is now tailoring its most advanced system for one of the world’s most aggressive cyber operators, while simultaneously litigating against the military over how such tools should be governed. Its engineers are writing code inside top‑secret facilities even as its policy teams advocate for tighter global rules and, at times, for temporary halts in certain lines of AI development.

For the public, businesses, and policymakers trying to make sense of this landscape, the key questions are becoming sharper. Can frontier AI labs genuinely balance commercial incentives, national security demands, and long‑term safety? Is restricting powerful models to elite coalitions like Project Glasswing a responsible way to reduce risk, or merely a way to concentrate power? And can a credible case for global AI restraint be made while those same models are integrated into offensive cyber arsenals aimed at geopolitical rivals?

How Anthropic and its competitors answer these questions over the next few years will determine whether “responsible AI” remains a branding slogan or evolves into a real set of constraints that meaningfully shape what gets built, who gets access, and how these systems are used in the most sensitive domains, from strategic hacking campaigns to the future of automated warfare itself.