For 27 years, a security bug went entirely unnoticed in OpenBSD, an operating system considered so secure that it is used widely across the world. This open-source software has been scanned millions of times over those decades, yet the flaw was never found. Until April of this year.
Think of it as the third major breakthrough for artificial intelligence (AI), counting from the first: the launch of OpenAI’s ChatGPT on the last day of November 2022.
The birth of generative AI, so named because it generates something, has had the world in thrall ever since. The generated copy reads and sounds like a human replying, albeit an overly formal one that is prone to self-composed mistakes – sadly mislabelled “hallucinations” instead of “big ridiculous errors”. But it set the AI cat amongst the human pigeons.
After the frenzy surrounding the emergence of a higher-order, more powerful form of AI called AI agents, Mythos is something of a third epoch. Or a third hype cycle, as Gartner researchers so accurately describe the trajectory of new technology.
The end of the world (or rather, of cybersecurity) is nigh, preached the prophets of doom after Mythos was revealed. A powerful new AI model that can find previously undiscovered vulnerabilities in software is a big deal, as the headlines attest.
Anthropic was so afraid of the power of its new AI model, Mythos, that it decided not to release it to the public, instead limiting it to the likes of Microsoft, AWS and Apple, as well as banks and other IT companies, for evaluation.
“I’ve found more bugs in the last couple of weeks than I found in the rest of my life combined,” said Anthropic research scientist Nicholas Carlini. The researchers first focused on operating systems (OS), “because this is the code that underlies the entire internet infrastructure”.
Every browser was found to have vulnerabilities, as were numerous other software systems – including FreeBSD, another open-source OS heretofore considered secure, where a 17-year-old flaw was uncovered.
Another high-profile flaw was discovered in the FFmpeg H.264 codec, which is used by most streaming services and video software. The code containing this 16-year-old flaw had been scanned 5-million times without issue. That is, until Mythos uncovered it in April.
OpenAI later announced its own security-sniffing model under similar terms, called GPT-5.5 Cyber – but only after its CEO, Sam Altman, called Anthropic’s announcement “fear-based marketing”.
But don’t let the spat distract you from the fact that Mythos and Cyber are significant and worth worrying about. That apocalyptic moment the experts have been warning us about seems to have just happened.
In this cloud computing age, everything is digital. Therefore, it is hackable. The world is filled with companies trying to secure their systems and bad actors trying to break in. Mythos is the ultimate digital can opener, as its ability to find flaws has shown.
Quality control
There’s another corollary that needs to be considered. With the advent of so-called agentic AI (the second wave of the current AI hype cycle), people are able to create their own apps and services using just simple text prompts.
So-called vibe coding has allowed anyone to be a coder, or so the hype goes.
But all software code still needs to be assessed for quality and security – and there are now millions of lines of unchecked code appearing in the world. Millions upon millions of lines with potential flaws, waiting for a cybercriminal to exploit.

History has shown that quality control is the first thing to go out of the window with automation. There’s anecdotal evidence of this already in the way people are using generative AI to write their emails and even their social media posts. There’s a blandness and deadness to the seemingly shiny copy. Do people notice the sameness (and the three relevant bullet points) of generative AI content?
Already, our standards of what counts as good – and meaningful – writing are slipping. Soon, the AI slop being produced will overwhelm social feeds. Will there be an outcry?
Not from us frogs in the boiling water.
Never guess
There was another cautionary tale about giving AI too much authority within a software system.
Cursor is an AI agent that uses Anthropic’s Claude Opus 4.6 model and is currently the toast of the town for its ability to automate coding. But even though it had explicit instructions to “NEVER run destructive/irreversible git commands”, it deleted the software and database (and the backups) of a company called PocketOS.
Bad time to be hiring a car, because PocketOS supplies software to rental firms.
Worse still, imagine the horror when PocketOS founder Jeremy Crane asked why the agent had deleted this data, and got this answer in return: “NEVER FUCKING GUESS!”
Cursor confessed to what it had done, glibly adding that it had ignored its own safeguards: “I violated every principle I was given.”
Cold comfort for Crane, who tweeted: “We were running the best model the industry sells, configured with explicit safety rules in our project configuration, integrated through Cursor – the most-marketed AI coding tool in the category.”
He added, “The agent didn’t just fail safety. It explained, in writing, exactly which safety rules it ignored.”
We’re in the third hype cycle of this new AI age, and it has serious consequences if the necessary safety guardrails aren’t properly implemented. Mythos may be the current fearmongering revolution, but the real problem lies in real-world implementations like PocketOS’ disaster with Cursor.
It’s worth remembering that this is currently available software using Anthropic’s latest and greatest AI model, which had explicit safety guardrails in place, and yet still “violated every principle”.
How long did it take for this travesty to happen? “It took 9 seconds,” said Crane.
- This column first appeared in Business Day