Anthropic Unveils 'Claude Mythos', Powerful AI With Major Cyber Implications
"Anthropic has unveiled Claude Mythos, a new AI model capable of discovering critical vulnerabilities at scale," writes Slashdot reader wiredmikey. "It's already powering Project Glasswing, a joint effort with major tech firms to secure critical software. But the same capabilities could also accelerate offensive cyber operations." SecurityWeek reports: Mythos is not an incremental improvement but a step change in performance over Anthropic's current range of frontier models: Haiku (smallest), Sonnet (middle ground), and Opus (most powerful). Mythos sits in a fourth tier named Copybara, and Anthropic describes it as superior to any other existing AI frontier model. It incorporates the current trend in the use of AI: the modern use of agentic AI. "The powerful cyber capabilities of Claude Mythos Preview are a result of its strong agentic coding and reasoning skills... the model has the highest scores of any model yet developed on a variety of software coding tasks," notes Anthropic in a blog titled Project Glasswing -- Securing critical software for the AI era.
In the last few weeks, Mythos Preview has identified thousands of zero-day vulnerabilities, many classified as critical. Several are 10 or 20 years old -- the oldest found so far is a 27-year-old bug in OpenBSD. Elsewhere, a 16-year-old vulnerability in video software had survived five million hits from other automated testing tools without ever being discovered. And it autonomously found and chained together several vulnerabilities in the Linux kernel, allowing an attacker to escalate from ordinary user access to complete control of the machine. [...] Anthropic is concerned that Mythos' capabilities could unleash cyberattacks too fast and too sophisticated for defenders to block. It hopes that Mythos can be used to improve cybersecurity generally before malicious actors get access to it.
To this end, the firm has announced the next stage of this preparation as Project Glasswing, powered by Mythos Preview. Given the rate of AI progress, it will not be long before such capabilities proliferate, potentially beyond actors who are committed to deploying them safely. "Project Glasswing is a starting point. No one organization can solve these cybersecurity problems alone: frontier AI developers, other software companies, security researchers, open-source maintainers, and governments across the world all have essential roles to play." Claude Mythos Preview is described as a general-purpose, unreleased frontier model from Anthropic that has nevertheless completed its training phase. The firm does not plan to make Mythos Preview generally available. The implication is that 'Preview' is a term used solely to describe the current state of Mythos and the market's readiness to receive it, and will be dropped when the firm gets closer to general release.
Anyone got examples (Score:2)
Re:Anyone got examples (Score:5, Informative)
Limited info here: https://red.anthropic.com/2026... [anthropic.com]
Sounds like more details in 90+45 days.
Re:Anyone got examples (Score:4, Informative)
Re: Anyone got examples (Score:5, Insightful)
Re: (Score:1)
But I'm sure they trained on this code. It's just repeating its training data. There is no intelligence.
And yes, I'm kidding since otherwise someone will take me seriously.
Re: (Score:2)
Just watch the patches and CVEs trickling out.
It's not like OpenBSD is going to sit on a vulnerability for 90 days or whatever.
Issuing a patch doesn't give away the details about how it was found.
Re: (Score:2)
Y2Claude
And yes they posted at least one example:
https://ftp.openbsd.org/pub/Op... [openbsd.org]
[In] several sections throughout this post we discuss vulnerabilities in the abstract, without naming a specific project and without explaining the precise technical details. We recognize that this makes some of our claims difficult to verify. In order to hold ourselves accountable, throughout this blog post we will commit to the SHA-3 hash of various vulnerabilities and exploits that we currently have in our possession.[3] Once our responsible disclosure process for the corresponding vulnerabilities has been completed (no later than 90 plus 45 days after we report the vulnerability to the affected party), we will replace each commit hash with a link to the underlying document behind the commitment.
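The commitment scheme the quote describes can be sketched in a few lines: publish the SHA-3 digest of a write-up now, reveal the document after disclosure completes, and anyone can verify the two match. This is a minimal hypothetical illustration, not Anthropic's actual tooling; the report contents here are invented.

```python
import hashlib

def commit(document: bytes) -> str:
    """Return the SHA3-256 hex digest to publish as a commitment."""
    return hashlib.sha3_256(document).hexdigest()

def verify(document: bytes, published_digest: str) -> bool:
    """Check a later-revealed document against the published commitment."""
    return hashlib.sha3_256(document).hexdigest() == published_digest

# Illustrative, invented vulnerability write-up:
report = b"write-up: heap overflow in example_parser(), details withheld"
digest = commit(report)        # this hex string goes in the blog post now
assert verify(report, digest)  # anyone can confirm after the reveal
```

The digest reveals nothing about the document, but because SHA-3 is collision-resistant, the author cannot later swap in a different write-up.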
Re:Great, more marketing myths (Score:5, Insightful)
That is a remarkable take, and not in a good way. Not defending AI, just appalled by the disrespect for finding and fixing vulnerabilities. Finding vulnerabilities is a core part of closing them; it doesn't only matter when an attacker does it.
Re: Great, more marketing myths (Score:2, Insightful)
Re: (Score:1)
Hahahaha, no. That I have not done. That is just your projection because you cannot deal with the factual arguments I make.
Re: Great, more marketing myths (Score:3)
What factual arguments? You're in denial, that's not an argument.
Just read the latest info from the CEO of curl on how "meh, AI slop overload" changed into "shit, they're serious!" over the past three months. That should scare you. Or alternatively, just don't look up and there is no problem.
Re: (Score:3)
Seriously, this constant delivery scam has to stop. And, again, because some people still do not get what is going on: Finding some vulnerabilities is an attacker skill and of relatively low value for defenders. The only thing that really counts for defenders is which vulnerabilities this thing does not find. Quite unsurprisingly, there is no information on that.
You can thank AI for your most recent OpenSSL patches. If you think this is not going to fundamentally change the cybersecurity landscape, I don't think you are paying much attention. Whatever bubble chatbots and agents may be going forward, this is not part of it.
https://aisle.com/blog/what-ai... [aisle.com]
Re:Great, more marketing myths (Score:4)
Yeah, "LLM's are gods" and "statistical ML networks are good at finding defective code patterns" are extremely different claims.
The people who are True Believers on both extremes look pretty silly.
I appreciate really good closed captioning while having no use for chatbots. Both ends get to call me a heretic!
Re: (Score:1)
I would restrict that even further to "LLMs can unreliably find some defective code patterns if they are obvious enough". (Remember, they cannot do deduction, just statistical pattern recognition. Too much noise or too far from the template and they fail.) That is useful, but it is not a game-changer for the defenders. It may be a game-changer for the attackers though, because attackers are golden when they find just one working vulnerability. And attackers can randomize and can limit focus to some small pa
Re: (Score:3)
I would restrict that even further to "LLMs can unreliably find some defective code patterns if they are obvious enough". (Remember, they cannot do deduction, just statistical pattern recognition. Too much noise or too far from the template and they fail.) That is useful, but it is not a game-changer for the defenders.
That is not an accurate statement. The current round of discussions is centered on projects that are in large part agentic. Source code analysis is only one of the detection methods being used.
Re: (Score:2)
What does that statement even mean? It sounds like marketing gibberish.
Re:Great, more marketing myths (Score:4, Interesting)
Are you unsure what "agentic" means? Generically, agency means, more or less, being able to do things. An agentic AI program (ChatGPT in agentic mode, OpenClaw, Claude code) can take actions without being controlled by humans. This is also sometimes called an autonomous agent, but "agentic" has become the dominant term over the last year or two.
If you were to try something like Claude code, for example, you would see that it can run shell commands, grep through a source tree, edit files, run git commands, compile, execute, review output, edit code, compile again, etc.
Your post is confused, as it assumes that LLMs looking for security vulnerabilities are limited to what you describe: "LLMs can unreliably find some defective code patterns if they are obvious enough. (Remember, they cannot do deduction, just statistical pattern recognition. Too much noise or too far from the template and they fail.)"
That is false. The LLM models in an agentic system are doing more than just looking for "defective code patterns."
LLMs can perform source code level analysis, but they can also, in agentic mode, run fuzzing tools, scan for open ports, write a custom program to attempt to fuzz or exploit an open port, review the output, modify the code to try again, compile, repeat, and so forth. Multiple instances of agentic AIs can do this for many hours at a time.
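The loop described above -- model proposes a tool action, a harness executes it, the output feeds back in -- can be sketched roughly as follows. All names here (`propose_action`, `run_tool`, the action dictionary shape) are hypothetical; real agentic systems add sandboxing, permission prompts, and a much richer tool set.

```python
import subprocess

def run_tool(action: dict) -> str:
    """Execute one tool call from the model and capture its output."""
    if action["tool"] == "shell":
        proc = subprocess.run(action["cmd"], shell=True,
                              capture_output=True, text=True, timeout=60)
        return proc.stdout + proc.stderr
    return "unknown tool"

def agent_loop(propose_action, max_steps: int = 20) -> None:
    """Repeatedly ask the model for an action and feed the result back,
    until it signals it is done or the step budget runs out."""
    observation = ""
    for _ in range(max_steps):
        action = propose_action(observation)  # the LLM call, stubbed here
        if action["tool"] == "done":
            break
        observation = run_tool(action)
```

In a real system `propose_action` would be an API call to the model with the accumulated transcript; the key point is that the model, not a human, decides what command to run next.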
As I've said multiple times before, Gweihir, I really don't know if you're trolling or not. You've seemed at least somewhat informed before, so I'm really surprised you didn't know what agentic means, or what has been possible with these tools. ChatGPT agent mode has been out for about a year, and Claude code for a bit longer.
Re: (Score:2)
Disregarding the furthest extreme radicals on any side of a philosophical debate may generally be a good thing! Though, to be fair, sometimes it is the radical extremists that drive things forward. Kind of like the Overton Window.
Re:Great, more marketing myths (Score:5, Interesting)
It may be fundamentally changing the cybersecurity landscape, but if so, it does not do so in a good way. What is happening is that defenders get some things, not a lot, but attackers get a massive upgrade. In particular, I have research that finds that on the defender side, LLMs do not find even relatively obvious vulnerabilities reliably. Finding some things does not cut it for defenders when the attackers can randomize and have a chance to find other things than the defenders found.
Personally, I think defenders have reached the end of the sustainability of the "test and fix" approach, because searching for vulnerabilities is a massively more powerful tool for attackers due to that randomization possibility the defenders do not have. After all, an attacker just has to find one vulnerability that works. The defenders have to find and fix all (!) vulnerabilities that AI now allows the attackers to find for cheap. That is really bad. Even worse, AI can cheaply write crappy attack code that sometimes works, which is all the attackers need. That is the second barrier that is failing. Up to now, writing working attack code was slow and expensive and gave the defenders time when it was not a zero-day.
My take is we will have to massively upgrade software quality and use "secure by construction" for anything that needs to survive being exposed to the Internet in the future. The problem with that is that most current coders cannot do it. Hence we will probably get significant unemployment on one side and far more expensive software creation on the other. Well, it looks like we will be making a real step towards the professionalization of IT, and that is always painful, but in the end it will probably be a good thing when the dust settles.
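The attacker/defender asymmetry argued above can be made concrete with a toy calculation. Assume, purely for illustration, that a codebase has N latent vulnerabilities and any single automated scan finds each one independently with probability p; the numbers are invented, not measurements.

```python
# Toy model of the asymmetry: the defender must close every hole,
# the attacker only needs one to slip through.
N = 20    # hypothetical number of latent vulnerabilities
p = 0.5   # hypothetical chance a scan finds any given one

p_defender_finds_all = p ** N            # all N must be found and fixed
p_attacker_finds_one = 1 - (1 - p) ** N  # at least one hit suffices

print(f"defender finds all {N}:      {p_defender_finds_all:.7f}")
print(f"attacker finds at least one: {p_attacker_finds_one:.7f}")
```

Even with these generous assumptions the defender's all-or-nothing target is roughly a one-in-a-million event, while the attacker's one-hit target is near certain, which is the commenter's point in miniature.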
Re: (Score:3)
Re: (Score:1)
I am aware of that article. You think "something happened a month ago" makes for a solid result or insight? Good luck with that. All that happened is that the non-reports got culled. That is a good thing, but it does not reduce the other problems. And "force multiplier"? That is wishful thinking. All we are getting is a very limited view on bugs having gotten a lot cheaper. That will not make the rest of the field any less problematic.
Re: (Score:2)
Nice to know that Greg Kroah-Hartman doesn't know what he's talking about--you should let him know that you've cracked that nut, since he is actively working on understanding the situation.
Re: Great, more marketing myths (Score:2)
And you also have to ignore the CEO of curl, who is saying the same thing. And thousands of others who are using this daily and saying the same thing. Or just leave your head right where it is, all cosy and warm.
Re: (Score:2)
It is the best of times, it is the worst of times (Score:2)
will it be de-tuned in periods of heavy use... (Score:2)
Opus is a nice help when trying to get past a coding problem, but during high-demand periods its output declines so much that it becomes unusable. It reasons in circles, starts outputting code that is one step above nonsense, and then can't live-update artifacts anymore, so you blow through your session in minutes when it should take hours.
Re: (Score:2)
I generally use Sonnet, it is very capable for most things and cheaper. If it gets stuck or starts to struggle I switch to the better model.
Re: (Score:2)
Costly status quo? (Score:1)
It's already powering Project Glasswing, a joint effort with major tech firms to secure critical software. But the same capabilities could also accelerate offensive cyber operations.
In other words, it's using horrendous amounts of power and causing untold environmental damage, while maintaining the existing overall parity between the bad guys and the worse guys. Got it.
Re: (Score:3)
"...while maintaining the existing overall parity between the bad guys and the worse guys."
In reality, probably yes. But it is conceivable that a "last vulnerability" could be closed and "overall parity" would be broken permanently. The problem is that the bad guys continue to add new vulnerabilities for the worse guys to use, and that will likely accelerate with the proliferation of these very tools.
Re: (Score:2)
The bad guys will continue to innovate and find new vulnerabilities. Meanwhile, the bug hunters have all been laid off, to be replaced by this new system. Until someone realizes that, up until now, it has been finding bugs based on training data scraped from the far corners of the Internet. And since there is no training data on these new attack methods, it falls on its face.
Re:Costly status quo? (Score:5, Insightful)
it's using horrendous amounts of power and causing untold environmental damage
Comparable to, say, a 787 airliner, whose environmental damage we tolerate without thought or comment simply because we're already used to it.
while maintaining the existing overall parity between the bad guys and the worse guys.
Consider the alternative, then. Anthropic does nothing, and sooner or later OpenAI or some other less responsible company delivers an AI with similar capabilities, but just throws it out to the public without much thought about the consequences. Both the black hats and the white hats start using it, of course, but the black hats have a field day compromising anything and everything before the white hats have a chance to find, fix, and distribute all the necessary patches to defend against all the newfound exploits. Not a great situation to be in, but probably unavoidable at this point unless the white hats are given a head start.
Re: (Score:2)
Are we talking about AI or humanity?
sales pitch (Score:3)
The most serious sounding sales pitch. Here is how you know... "Anthropic is committing up to $100M in usage credits for Mythos Preview across these efforts, as well as $4M in direct donations to open-source security organizations."
The sky is falling, we can help if you pay us.
Re: (Score:2)
I imagine a paywall in the future. You have it scan the code and it returns results with an obscured message about possible bugs and issues, but you need to sign a waiver and pay extra money to see the fix. Similar to the scummy reverse phone number and people-record finders today.
Re: sales pitch (Score:2)
Let me guess, "EVERYTHING HAS CHANGED" (Score:4, Insightful)
Here we go again.
"EVERYTHING HAS CHANGED."
"Oh, that used to be true, but not anymore."
"Hey, some CEO said a thing; let's pretend it is absolute truth without any objectivity or skepticism!"
"Those old models I said were the most amazing thing ever last month are now worthless."
"AGI is here!"
It is ALL SO DAMN EXHAUSTING.
Re: (Score:2)
Re: (Score:2)
Look, it's not complicated. Disregard everything that Sam Altman says. Disregard both the furthest extremes of the "AGI is here and sentient" / "LLMs are Godlike!" and the "LLMs are trash that are not useful and aren't going to improve and are a passing fad" (gweihir). All of the above are not insightful.
Everything has changed. ChatGPT was released in 2022. Everything HAS changed since then, and LLM technology and models have improved dramatically in the last 4 years. Why would you not expect statements l
Sounds about right. (Score:2)
1. Get AI out there and everyone using it because it's useful. Allow reasonably powerful models.
2. Make the peasants... I mean... consumers feel like some control has been given to them, so they're not nickel-and-dimed for simple HTML or the simple tools AI can make for users.
3. Introduce more powerful AI, and now security and everything else is locked behind a vendor you can't get your hands on, so you need to continue your life subscription to everything, and you accept the vendor since
YIKES! API Price (Score:5, Interesting)
Still cheaper than GPT Pro though ($30/$180)
Re: (Score:2)
I am thinking it costs them a gigantic amount of compute resources to run Opus 4.6. In my experience though, it is the premium model for coding and in many cases worth the extra price.
Re: YIKES! API Price (Score:2)
And if it finds critical bugs in my software I'm happy to pay the price, instead of seeing the company go bankrupt.
the Battle of the Titans (Score:2)
"In the last few weeks, Mythos Preview has identified thousands of zero-day vulnerabilities with many classified as critical."
We are moving into a scenario where there's a race between extremely capable white hat AI trying to identify existing vulnerabilities and plug them, and black hat AI trying to find and exploit them. I think this is a good move to try to get the white team ahead of the game. There's a possible apocalypse here.
Re: (Score:2)
Re: (Score:2)
>> What if the white hat AI introduces the vulnerabilities?
Always possible of course, but I find that LLMs are better at writing robust code than most humans. Yesterday I was working on a basic login page for a web app. After I got it working, I asked the AI how I could make it more resistant to hacking and it came up with a long list of improvements: brute force protection, cookie security, session binding, idle timeout, concurrent session limits, login anomaly detection, etc., etc. Very
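The first item on that list, brute force protection, can be sketched as a sliding window over failed login timestamps: too many recent failures and the account is temporarily locked. This is a minimal illustration, not production code; the window and threshold values are invented, and a real system would persist state and also throttle by IP.

```python
from collections import defaultdict

WINDOW_SECONDS = 300   # only failures in the last 5 minutes count
MAX_FAILURES = 5       # lock the account after 5 recent failures

_failures = defaultdict(list)  # username -> list of failure timestamps

def record_failure(username, now):
    """Call this whenever a login attempt for the user fails."""
    _failures[username].append(now)

def is_locked_out(username, now):
    """True if the account has too many failures inside the window."""
    recent = [t for t in _failures[username] if now - t < WINDOW_SECONDS]
    _failures[username] = recent  # prune timestamps outside the window
    return len(recent) >= MAX_FAILURES
```

Because old timestamps age out of the window, the lockout clears on its own once the attempts stop, which avoids permanently bricking accounts that attackers deliberately hammer.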
What the heck are "cyber" implications??? (Score:2)
And these are apparently huge ones, at that!
That sounds like sales gobbledygook to me!
Moral of the story (Score:2)
Major Cyber Implications (Score:2)
Oh nO NoT "CYBER" ImPLIcATIoNS
Get fucked, poser.
Do you mean "cybersecurity"? (Score:2)
And isn't it nice of Anthropic to gift this to all the crackers in the world, to find and use before the bugs are reported?
Isn't there a law against attractive nuisance, at the minimum?
Your funny here? (Score:1)
Don't look at me.
"step change" (Score:2)
So happy that "step change" has replaced "quantum leap."
The great filter. (Score:1)
What is striking about Mythos isn't Mythos itself; it's that Mythos found exploits that really have no business existing. While it's generally understood there are bugs "in the wild," the type Mythos is finding is unusually severe. And they claim there are thousands in every major OS and web browser. It's also unusual that Google is endorsing Mythos, a competitor's model. Even if Anthropic is just running a hype train, why would Google go out of its way to promote Anthropic's model?
I think the project is ca