Anthropic limits access to AI that finds security flaws, realizing hackers may use it for exactly that

Share This Post

Anthropic Project Glasswing

The story goes like this: Anthropic creates Claude Mythos, an AI model that’s great at identifying security flaws in software. Due to security issues within Anthropic’s own content management system software, details about Mythos leak to the public ahead of time. After some thinking, Anthropic decides not to release Mythos to the public over concerns that hackers might use it for nefarious purposes.

Yeah, that was probably the right decision.

On Tuesday, Anthropic announced Claude Mythos Preview, a yet-unreleased AI model that could “reshape cybersecurity.” According to the company, it has already found “thousands of high-severity vulnerabilities, including some in every major operating system and web browser.” While this is a good thing, Anthropic also said that bad actors might use Mythos for evil, with potentially “severe” consequences for “economies, public safety, and national security.”

So, instead of just launching Mythos like it would other models, Anthropic decided to only give access to a small number of select companies: Amazon Web Services, Anthropic, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, the Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks. Anthropic calls this “Project Glasswing,” an initiative to strengthen critical software infrastructure with the help of AI. The company also extended access to a group of over 40 additional organizations that “build or maintain critical software infrastructure.”

In an interview with CNBC on Tuesday, Dianne Penn, Anthropic’s head of research product management, said that the move came after a lot of “internal deliberation,” and that it was about giving “a lot of cyber defenders a head start.”

Meanwhile on X, Anthropic CEO Dario Amodei wrote in a post, “Rather than release Mythos Preview to general availability, we’re giving defenders early controlled access in order to find and patch vulnerabilities before Mythos-class models proliferate across the ecosystem.”

“The dangers of getting this wrong are obvious, but if we get it right, there is a real opportunity to create a fundamentally more secure internet and world than we had before the advent of AI-powered cyber capabilities,” he added.

While Claude Mythos Preview is particularly good at finding cybersecurity flaws, it’s actually a general purpose model. But the company does not currently plan to make it broadly available; instead, it will try to figure out how it can safely deploy Mythos-class models to everyone.

Subscribe The Newsletter

Get updates and learn from the best

More To Explore

Do You Want To Stay Connected?

drop a line and keep in touch