The missing piece is the reminder that scarcity still exists.
Whether it's actually scarcity or hype-building, or a bit of column A and a bit of column B, is TBD. Then again, the new models seem more expensive, they slashed the tokens thrown around in thinking, and they put up rate-limit speedbumps, so it's probably not all gaslighting about compute bottlenecks.
The thought of this didn't even cross my mind until yesterday. I previously figured the hype was primarily around marketing, but after watching this Primagen video, I have the same suspicion.
This makes it sound like some kind of open question or even mystery.
Amodei himself stated quite clearly in recent interviews that they simply can't satisfy all demand, compute-wise. Of course, Mythos could get more of the already too-small pie, but clearly it's a more resource-intensive model and would further increase the strain.
Reminds me of the paper launches NVidia/Intel/AMD sometimes do where they announce some amazing tech (such as the old Titan GPUs) that placed their hardware at the top of the benchmarks, but with basically zero actual stock available.
It's probably a little of both: dangerous and expensive. This article makes a good case that the cost is at least part of the reason.
I wish the article could have been a lot tighter and shorter. This is not earth shattering information that requires a New Yorker length piece of investigative journalism.
As far as my understanding goes, it is not a breakthrough model in itself but a finetuned model with the right tools and skills. Fairly similar to today's coding agents, with the difference that those are made for software engineering, not cybersecurity.
It has considerably more parameters than most frontier models of today. Which gives it a lot more oomph per token.
Is it a "breakthrough" as in "something novel and unexpected"? No. Is it a "breakthrough" as in "something we know works, but made to work on a greater scale"? Very much so.
Supposedly 10T scale. Literally the next big thing. A bit like what OpenAI tried with GPT-4.5 - but Anthropic actually made it work with MoE, reasoning, tool use, RLVR, etc.
It matters because the "g factor" of today's LLMs is at least in part a function of raw scale. Larger models are just smarter - assuming you can handle the training and inference at this increased scale.
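If the supposed 10T figure is even the right order of magnitude, a back-of-envelope sketch shows why serving it is painful. Everything below (precision, GPU memory) is an illustrative assumption, not a disclosed spec:

```python
# Back-of-envelope serving memory for a hypothetical 10T-parameter model.
# All numbers are illustrative assumptions, not disclosed specs.

PARAMS = 10e12        # assumed total parameter count ("supposedly 10T")
BYTES_PER_PARAM = 1   # assume fp8-quantized weights
GPU_MEM_GB = 141      # e.g. one H200-class accelerator

weight_gb = PARAMS * BYTES_PER_PARAM / 1e9
gpus_for_weights = weight_gb / GPU_MEM_GB

print(f"weights alone: {weight_gb / 1000:.0f} TB")           # ~10 TB
print(f"GPUs just to hold weights: {gpus_for_weights:.0f}")  # ~71 per replica
# And that is before KV cache, activations, or serving headroom. Even
# if MoE means only a fraction of parameters are active per token,
# every expert still has to sit in memory somewhere.
```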
This lengthy article by a self-described "AI enthusiast" muddies the waters. Yes, Anthropic has capacity constraints, which is why they rented Colossus from Musk despite the danger of being distilled.
The real reason is that the hype around Mythos has already gone quiet because it does not find more than other models. That is, nothing at all in most open source projects. If you hide the model, embarrassing statistics will not be posted.
I think it's plausible that a substantial fraction of the increase in cyber attacks we saw recently was caused by GPT-5.5. So the "too dangerous" framing is plausible, even if the more important reason is a lack of RAM (as the article author suspects) or compute to serve Claude Mythos. We already know from other events that OpenAI is far less interested in AI safety and ethics than Anthropic.
I'd be tempted to offer this as a consultant service were I at Anthropic.
It feels like an AI tool that needs professionals to interface with it. Get some of those professionals, have them work with clients in a targeted way. It helps reduce the exposure the tool has to bad actors, and reduces the amount of resource usage that it will incur, because it's being used only by trained individuals.
Use what you learn from the experience to further refine its operation and make it less expensive to operate.
The "too dangerous to release" line was definitely a marketing stunt.
OpenAI already used the same playbook with GPT-2 in 2019, and some of the same people involved back then are now doing it again at Anthropic with Mythos.
Same safety-branding DNA, different company, and people are falling for it again.
Astonished to see so many bright people on HN taking the bait, especially from a company who's gone to such lengths to screw over their paying customers.
They're a commodity provider. They're no more special than any of the others, and it's just a matter of time before their trillion parameter models are running on my watch.
So, of course they're trying to snatch up giant, long-term contracts now while they hype the hell out of another minor incremental improvement.
And we'll be paying the price to all the Enterprises that lock in, only to wake up a week from now and realize there is another player with a better product.
ChatGPT literally tells people to kill themselves but apparently that’s not too dangerous and this is.
It’s bad enough that it’s a marketing stunt, totally agree with you. But in the face of what we have seen and how they act like it’s no big deal, it’s just gross.
When your logo is AI, your illustrations are AI, and your profile pic is AI, I'm going to assume the text is AI too and won't read it.
The text is just as you predict, but in fairness to the author using a .ai domain is a good way to set expectations up front.
This feels like it's performative AI. Just trying to use everything that has "AI" in it to be more AI bro'er than other AI bros.
I read it, and it is AI
I don't think it is. Just the (somewhat lame) graphics are.
The text is very obviously heavily authored using AI. The more interesting question is: did a person even prompt an LLM to write on this subject? It seems likely that someone set up an agent using OpenClaw or something to just automatically write articles on popular topics in AI, and this is something that it came up with
Sorry to say, but it almost certainly is AI.
- 51 EM-dashes
- Section headings
- Excessive repetitions: "The [...] are real. The [...] are real. The [...] is real. All three things are true at once."
- Excessive use of "genuine", "genuinely", "honest", "real", "true"
- Excessive use of "gap": "near-term gap", "the Compute Gap", "the Narrative Gap", "critical gap"
- Corny and meaningless closing sentence: "Understanding both parts is the beginning of taking AI deployment decisions seriously."
I don't believe my 20-year-old university essays were written by AI, despite your criteria.
Do your 20-year-old university essays really fulfill all those criteria at once?
It most definitely is.
Now imagine that your work specs are generated by an AI agent that the EM is using.
Do you still care about the work?
If my manager can't be bothered to do any actual work but expects me to, no. I'm quitting. Next question.
Even if I do still care about it, I hope I'm not so naive as to think that following the words of the AI agent will cause me to achieve my EM's intent. They'd be a useful reference at best for my actual knowledge of what they want that I found elsewhere.
Brother, I don't care who writes the specs as long as they sign the checks on time. And yes, I do care about my work even if upstream is slop. In a relay race, you can lower your performance to weakest leg, or you can be the strongest leg. And maybe I just like to run.
Fair enough, but now imagine that the code is slop too. You're getting slopped from both sides; do you still care?
And the domain is .ai
dead internet theory.
Even if it wasn’t, I probably still wouldn’t have read the article, so not much difference.
It's pretty clear at this point that Mythos' capability to discover and exploit zero-day vulnerabilities at scale is but an incremental improvement over existing models like the ones available to OpenAI's Plus/Pro subscribers.
Anthropic tries to create marketing hype around Mythos using two psychological tricks.
1. Put large numbers in the headlines.
"Mythos discovered 271 vulnerabilities in Firefox" makes the model seem extremely capable to the uninitiated.
But it's actually meaningless as a measure of capability _improvement_.
Anthropic gave away $100mil specifically as Mythos credits to these projects and companies (that's $2.5mil per project; the implied count is sanity-checked just after this comment). Spending the same exorbitant amount of compute analyzing the same codebases with an older model like GPT 5.x Pro would have turned up 260 of these vulnerabilities, or could even have turned up more than 271.
No need to speculate, since this is exactly what we saw in the few codebases where we have such comparisons (like the curl codebase). Supposedly weaker models, working with a much lower budget, turned up dozens of vulnerabilities. Mythos turned up only one, which ended up as a low-severity CVE.
2. Do the whole "too dangerous to release" shtick. This is one of Dario Amodei's favorite moves. When he was vice president of research at OpenAI, he declared GPT-2 (which wasn't able to produce coherent text beyond 3-4 sentences at the time) too dangerous [1] as well.
Long story short, it's the GPT-4.5 situation again: a company trained a model that's too slow and expensive, but not much more capable than what came before. It therefore requires these marketing stunts.
[1] https://www.itpro.com/technology/artificial-intelligence-ai/...
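Taking the commenter's figures at face value (they are not verified against any Anthropic announcement), the implied scale of the giveaway is easy to check:

```python
# Sanity check of the figures claimed above; the inputs are the
# commenter's numbers, not verified against any official source.
total_credits = 100e6   # "$100mil" in Mythos credits
per_project = 2.5e6     # "$2.5mil per project"

print(total_credits / per_project)  # 40.0 -> implies ~40 projects/companies
```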
I work for a company that has been using Mythos for vulnerability detection in our software. The results we're getting are revolutionary to the point that our software security teams are heavily overloaded addressing the deluge of thousands of real bugs/vulnerabilities and design flaws across our billions of lines of code.
For comparison, we are invested heavily in the AI space, to the point where Anthropic is one of our competitors. We were already using state-of-the-art models to find flaws in our code, but Mythos was just so much better at finding real vulnerabilities it's not even funny.
Yeah, I'm a security researcher and my colleagues who have access say it's insanely good… but interestingly they also work for places like Nvidia, which have a deep vested interest in selling tokens and hardware. So of course they are pushing this narrative.
Read the above comment again. Both your comments and his/hers are compatible.
They are directly contradicting the claim that if you ran other models on the same codebases you would get similar results.
if you are invested heavily in the AI space, isn't it in your best interest for the froth around Mythos to be true and the comment you are responding to to be invalid? even if you are competing with Anthropic, a rising tide raises all ships
i'd like to see more facts and data one way or another!
This is the "circumstantial" version of the ad hominem fallacy. Just because the author may benefit from the argument being true doesn't mean it is invalid.
They are clearly disputing the assertion that Mythos is an incremental gain rather than a quantum leap. Of course objective, unbiased data would be nice, but these anecdotes are all we have right now.
> billions of lines of code.
Billions as in 10^9?
https://research.google/pubs/why-google-stores-billions-of-l...
> Do the whole "too dangerous to release" shtick.
One aspect that isn't really discussed much in this context is how to wrap one's head around the corporate risk with models of ever increasing capability. It might not be too dangerous to society, but it could be too dangerous to Anthropic.
I couldn't agree more. I think the recent moves to partner with xAI and Amazon are proof that they desperately need more compute and are doing everything possible to get it.
I mean everyone knows they need more compute. That’s not a secret or up for debate at all. They are maybe the fastest growing company in history.
I'm fairly certain Amodei believes the "too dangerous to release" hype himself. Even if it's just an incremental improvement, that's better than getting frog-boiled by repeated 20% improvements until someone builds bioweapons in their backyard.
He's made so many statements that fall under the "boy who cried wolf" category that even if he _does_ believe these statements he needs to be managed better. I'll never forget Anthropic's huge "Oh my God, the AI blackmailed a researcher to save itself!" and the prompt effectively told the AI to do that and gave it forged emails with easy blackmail targets, as if this isn't a common trope in mystery or suspense books/television/fanfiction, all of which Claude (and others) have been trained on.
It's a common trope, all through the training data, and all the modern AIs have read it, and would probably act similarly? Is that what we should take away from your comment?
So we have nothing to worry about. Makes sense. Really, it's just a common trope.
Oh of course wolves have sharp teeth, they're predators. Anyone who knows this can never be bitten.
Imagine you're in a car and the car is driving towards a cliff. You shout at the driver "oh my god we're about to go over a cliff!" And he says "you said that two seconds ago, but we're still alive, you're just like the boy who cried wolf. Do you know exactly when we're going to go over a cliff? No? Maybe you're imagining the cliff."
I think it's very improbable that AI is as dangerous as Yud et al fear it is. But it's too soon to say and there seems to be significant long-tail risk. Mocking or criticizing people for being concerned about that risk seems counterproductive.
Seems like the life cycle of huge tech companies like Meta, Google, Microsoft, Amazon is "do whatever's necessary to take over the world, then enshittify." I don't take it for granted that Amodei and Anthropic are somehow not maximally power-hungry.
Re: second half of your comment. Understanding a threat doesn't neutralize it. Anthropic didn't make that big a deal of it either; it was news articles that blew it out of proportion.
* sigh *
Three things:
* Delaying the release accomplishes nothing.
* The barrier to someone building/not-building a bioweapon in their backyard is not access to an LLM.
* Remember when GPT 3.5 was going to destroy the world? And how it was conscious? And how it was "trying to escape"? Lmao.
I think gpt 3.5 might have destroyed the world
How does delaying the release not solve anything? It puts everyone on notice to fix all security vulnerabilities now.
Because the only thing keeping those vulnerabilities in existence was laziness.
"laziness" is an interesting reframing of "rational cost-benefit analysis and the limits of the human mind".
You're right, it's silly for me to worry. We've never had a technology that initially appeared benign but turned into a big problem. In fact, no tech company has ever released technologies that cause problems for the rest of society AT ALL. /s
What are the other barriers? Last I checked access to CRISPR is not especially tightly regulated. Even if it is, defense in depth is a thing.
If it was as easy as "knowing how to" someone would've already done it or at least attempted to.*
Plenty of people know how to, 10,000s of researchers, perhaps you know someone who does.
Did you know that your local veterinary shop has enough drugs to kill 100s of people?
Why doesn't it happen?
* It's not that easy.
* There's a ton of regulation that is hard to circumvent, on purpose.
* There's a gigantic deterrent called "spend the rest of your life behind bars" that people tend to avoid.
An LLM, even the most advanced one, does not make any material change in any of these. You cannot bullshit your way into "uhh, I need Ebola samples for ... reasons".
Unironically, your Sunday movie portraying a super-villain jeopardizing a city with his "home lab" full of flasks with colored liquids and BioHazard signs pushes way more people into becoming interested in this than having access to an LLM does.
*: Okay, like 5 people, and way before LLMs were a thing. This has been a thing for decades, we're fine.
Also, each term's definition is slightly stretched in turn, so the multiplied-out meaning is really far from the truth. For example, the 271 vulnerabilities were really mostly bugs: generally incorrect states, but ones which almost never led to any exploit.
Yes, an AI making massive gains in bug finding is hugely important and good; it may even net out to neutral against the volume of bugs introduced by other AI coding processes. But it's a far cry from how Mythos is portrayed most of the time: an automatic super-hacker.
But I think that's a problem with the people portraying it that way, not with Anthropic's messaging. If you've invented "just" a massively more powerful bug finder, it still seems right that you ought to let banks and critical infrastructure providers run it on their systems before it gets in the hands of people who might want to hack them.
I find it interesting that Mythos was announced the same day that GLM overtook Opus 4.6 in capability. To me this seems like a careful attempt to cool demand for open-source models, which are about to take the overall lead.
It's remarkable how capable GLM 5.1 is; what's amazing is the recent development of Qwen 3.6 27B being close in real-world performance.
You're not really responding to the piece at all.
It's an AI-written slop article, which is hugged to death by HN in any case.
It claims to be an evidence-based investigation, but basically invents the contents of the documents they supposedly investigated, such as the Anthropic Frontier Red Team writeup, from whole cloth.
I don't think deeper engagement with it would promote good discussion.
So you say. I actually read the piece and didn't get AI vibes from it at all, except for the graphics.
there are 31 emdashes in that piece. the domain ends with _ai_
It’s a tangent but two points:
First, the reason LLMs learned to like em dashes is that they are common in the training corpus - they are a thing from before LLMs that LLMs learned, not invented.
Second, my work browser puts nice blue squiggles under everything I write into a textbox. I dutifully click through them and accept the rephrasing suggestions. I get a lot of em dashes. My blog posts and whitepapers and stuff are full of them and other “AI tells” - but I think they read better because of it.
I use emdashes all the time. They're correct punctuation as opposed to a minus sign. They're easy to type too: opt-shift-minus. If they were such a huge giveaway without ever being used by humans, models would be trained by now not to use them as much.
The blog is about AI. So yeah the TLD is .ai
I've never seen writing created before the advent of LLMs that used emdashes in the same way and with the same frequency that LLMs regularly do. There's probably some out there but it would be a real outlier. LLMs overuse them to an absurd degree, putting them where most writers would put commas, occasionally semi-colons, or nothing at all.
I count 51 em-dashes on the page, which is extreme. They're also used in places where they don't really belong. It's very obviously LLM-generated, at least in part.
That said, it puzzles me why people don't prompt LLMs to change up the writing style a bit and remove some of the tells.
I can't imagine why a system designed to reproduce the best writing styles would frequently use em dashes.
Take another look at this blog's index https://kingy.ai/category/blog/ and click through more posts, and pay attention to the post dates.
Do you really think this singular author is writing multiple excessively-long blog posts about AI per day? There are ~650 of these posts over the past 18 months. And over on LinkedIn, the author describes himself as a "Specialist in Digital Marketing, Videography / Video Editing, Search Engine Optimization, Social Media, and B2B Sales."
YMMV but this post and entire site absolutely screams "slop" to me.
Don't bother with the slop lovers, these people are anti-human in their souls and willing to follow the most evil people on Earth to the depths of hell; for what? I have zero idea but it's sad to see.
I hate slop as much as you do. Your comment makes no sense.
I don't get it. If the older / smaller models are almost as good as Mythos, that sounds like the opposite of comforting.
> an incremental improvement
I've had to reboot my systems quite a bit more than an incremental improvement would suggest this week
> It's pretty clear at this point that Mythos' capability to discover and exploit zero-day vulnerabilities at scale is but an incremental improvement over existing models like ChatGPT Plus/Pro.
I'm skeptical of AI takes by someone who thinks there's a model called chatgpt plus. Spend more time working with the current systems!
It seems like everybody (including you) knew precisely what I meant: the models available for ChatGPT Plus or Pro subscribers, i.e. GPT-5.5 Thinking Extended and the latest Pro. I've edited the offending sentence for clarity just in case.
If I got you to be skeptical of AI takes, though, mission accomplished. Exercise your skepticism especially when the takes come from somebody who is trying to sell something.
> Resource Limit Is Reached The website is temporarily unable to service your request as it exceeded resource limit. Please try again later.
I guess it was too dangerous to even read the article
https://archive.is/31PFC
On https://news.ycombinator.com/item?id=48094138 Imustaskforhelp explained why it is better to post archive.org links instead of archive.is links.
So, here an archive.org link:
> https://web.archive.org/web/20260515135354/https://kingy.ai/...
Mythos took it down
The HN hug of death
It's somehow nice to see an old-school HN hug of death once in a while, it has become a rare sight since most links are now to big platforms or to websites behind Cloudflare.
archived: https://nonogra.ph/too-dangerous-to-release-or-just-too-expe...
My thinking is that if it really was super duper then Anthropic could charge eye watering amounts and have willing customers and set up expectations going forward that SOTA costs a lot to use.
That they don't suggests that really it is only incrementally better than Opus 4.7, and that the market won't bear a price increase that makes it economical to serve, let alone profit from serving.
So the cynical me imagines execs sitting around the table and worrying that releasing it at anywhere close to break-even would risk actually hurting the brand instead of setting them up as a premium company, and this at a time just before the IPO when they can ill afford that rumour.
So they wonder what to do, and decide that playing the national security card is the obvious way out. It's incrementally better enough to find bugs that the previous SOTA missed, it doesn't get used widely so it's cheap to serve, and they get the good publicity without the economic scrutiny.
Making a loss selling to a small number of users using it in a limited way is entirely affordable. Making a loss selling it at scale is correspondingly unaffordable?
They announced the pricing when they released preview: $25/$125 per million input/output tokens. I have no doubt they're already selling it to select customers.
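At those rates, a serious audit of a large codebase is not cheap, but it is also not astronomical. A rough sketch, where the pricing comes from the comment above and every other number (codebase size, tokens per line, agentic passes, output ratio) is purely assumed:

```python
# Rough audit-cost estimate at the announced preview pricing.
# Prices come from the comment above; all other numbers are assumptions.

IN_PRICE = 25 / 1e6    # $25 per million input tokens
OUT_PRICE = 125 / 1e6  # $125 per million output tokens

loc = 1_000_000        # hypothetical 1M-line codebase
tokens_per_line = 12   # rough average (assumed)
passes = 5             # agentic re-reads, tool output, etc. (assumed)
out_ratio = 0.1        # output tokens as a fraction of input (assumed)

in_tokens = loc * tokens_per_line * passes
out_tokens = in_tokens * out_ratio

cost = in_tokens * IN_PRICE + out_tokens * OUT_PRICE
print(f"~${cost:,.0f} per audit")  # ~$2,250 under these assumptions
```

Multiply by many repos, deeper contexts, and repeated runs, and the credit grants mentioned elsewhere in the thread start to look like a lot of compute.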
They are. Mythos Preview is not free.
They gave away $100M in credits specifically for Mythos.
The article does not mention the other reason: in the interview with Dwarkesh, Amodei remarked on how other organizations are copying or training off Opus for their models.
By delaying others' ability to train off Mythos, they hold their SWE-Bench Pro head start longer, so that, among other things, the USG can't but notice Anthropic's lead when deliberating on whether to further substantiate the "supply chain risk".
Good point.
Precise motives are hard to work out as a general rule. Ultimately, it often comes down to a decision that decision makers like or don't like for a confluence of reasons.
(I work at Anthropic) We have publicly stated[1] that our goal is to deploy Mythos-class models at scale when we have the requisite safeguards for offensive cyber risks in place. Mythos is a general frontier model, not a cyber-specific model, so there are many reasons why we think our users will benefit from access (with the aforementioned safeguards in place) in due course. Compute has also not factored into our decision[2] to roll out the model in a limited fashion to defenders. We'll be sharing more soon on the first month or so of the project and rollout.
[1] https://www.anthropic.com/glasswing#:~:text=deploy%20Mythos%...
[2] https://x.com/logangraham/status/2054613618168082935
Multiple people who have already used Mythos or been given its reports on their software have publicly stated that it's all hype, and that it is not really finding any new critical bugs which other models can't.
In this very thread we have a counter-example. What to think? https://news.ycombinator.com/item?id=48149519
Do you have any good sources on that? I have seen things to suggest that not all of the hype is true, but so far I have not encountered anyone claiming all of the hype is untrue. Which is what I interpret "its all hype" (sic) to mean.
curl has been scanned with multiple LLMs. Mythos was last and as a result found only 1 issue. If Mythos was really much better, I'd expect it to find a lot more issues despite the others having gone first.
Also, the competing models are getting better. Opus 4.5 was better than everyone else when it was new, but only a few months later there are a lot of models that are better (not just the newer Opus models).
curl has a prominent bug bounty programme and 180k lines of prod code, and is mainly a client app/lib. I would look at other projects before making judgements about Mythos on this one.
Is the curl thing mostly from the Primagen video, or did it break into the greater social media sphere and I just missed it?
The cURL lead developer posted about it: https://daniel.haxx.se/blog/2026/05/11/mythos-finds-a-curl-v...
The Reg reported it:
https://www.theregister.com/security/2026/05/11/anthropics-b...
I've been following him on mastodon and read it right there
For example, it was recently let loose on curl and its maintainer is less than impressed:
https://www.theregister.com/security/2026/05/11/anthropics-b...
If you remove the fluff that The Register added and stick with https://daniel.haxx.se/blog/2026/05/11/mythos-finds-a-curl-v... it seems like a claim fairly distant from "it's all hype". Less than expected perhaps? Maybe the code really is unexpectedly robust? I guess time will tell on that point.
His direct quotes are:
> "My personal conclusion can however not end up with anything else than that the big hype around this model so far was primarily marketing."
> "I see no evidence that this setup finds issues to any particular higher or more advanced degree than the other tools have done before Mythos."
> "An amazingly successful marketing stunt for sure."
Personally I see this as a very strong claim of hype. I take away from this that Mythos is a hyped-up marketing stunt, and not what it was presented to be by Anthropic at all.
Of course if it really is overhyped, then it becomes much more difficult to release it publicly. Better to retain the mystique and release the next thing. But we'll see eventually.
Are there any publicly verifiable sources that Mythos is that much more intelligent than Opus, so as to be considered much more dangerous (as it is presented in the public discourse by Anthropic)?
It doesn't have to be _much more intelligent_ than Opus to be a risk. It doesn't even need to be _more intelligent_. It just needs to be _better at finding security problems_. Which could happen from just minor improvements in training data, or the harness, etc. Even a small improvement could shift it from finding very few new security holes, to reliably finding many at scale.
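A toy expectation calculation makes the "small improvement, big shift at scale" point concrete; the hit rates below are invented purely for illustration:

```python
# Toy model: small per-site gains compound over many candidate sites.
# All probabilities here are invented for illustration only.

candidates = 100_000  # hypothetical code sites examined in a big scan
p_old = 0.00001       # old model: one hit per 100k sites
p_new = 0.00005       # modestly better per-site hit rate

print(candidates * p_old)  # expected finds: 1.0
print(candidates * p_new)  # expected finds: 5.0
# A tiny absolute change per site is the difference between "found one
# low-severity bug" and "security team drowning in reports" at scale.
```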
Yeah, I think a lot of the disconnect here is that people think of "model intelligence" as some sort of IQ score, rather than a combination of scores that measure abilities at a large variety of domains.
Weird take to claim a "generally intelligent frontier" (whatever that means) and restrict availability based on "offensive" cybersecurity alone (how that can be handled at all, as distinct from also fixing software, remains to be seen), all while competitors, and more importantly software maintainers (e.g. curl), estimate that its capability in finding cybersecurity bugs is similar to what other modern models produce; and that capability has just risen significantly in the last months for everybody.
are you able to detail a single safeguard you plan to implement, so that I can stop believing it's vaporware and/or a scam?
How would it be vaporware? It's out in the wild and has been used by individuals/corporations.
no risk is added. all risk is already maxed out. release it.
Conclusion: both are true, which makes sense. The KV-cache scaling both yields the emergent power and requires the enormous capacity.
Which does sort of hint at a (power/profitability) ceiling on the LLM line of AI… That should make the industry nervous.
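For anyone unfamiliar with the term: the KV cache is the per-request attention state a transformer keeps around during generation, and it grows with depth, width, and context length. A sketch with invented dimensions (no frontier lab publishes these):

```python
# Per-request KV-cache size for a hypothetical dense transformer.
# Every dimension below is invented; real frontier specs are unpublished.

layers = 120       # assumed depth
kv_heads = 16      # assumed (grouped-query attention)
head_dim = 128     # assumed
seq_len = 200_000  # long agentic context
bytes_per = 2      # fp16/bf16 cache entries

kv_bytes = 2 * layers * kv_heads * head_dim * seq_len * bytes_per  # 2x: keys and values
print(f"{kv_bytes / 1e9:.0f} GB per concurrent request")  # ~197 GB
# Bigger, deeper models running long contexts eat serving memory per
# user, on top of the weights themselves.
```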
Does that follow at all?
High end AI is at its most useful when you use it to replace high end human labor. You can't buy 9000 cybersec specialists on demand, but you can buy more Mythos tokens.
Then we get into all the scaling curves. Such as: LLMs getting more capable per FLOP, per byte of weights, per byte of VRAM, etc. And: inference compute getting cheaper over time.
I see a lot of "should make the industry nervous", but when you try to dig into it? It's wishful thinking, every fucking time.
As the article states, right now Anthropic does not have the compute capacity. I suppose they could charge an enormous amount of money, but if it is indeed that powerful there are folks who would pay, and they would degrade service for everyone else. To make matters worse, bad actors could use it to find zero-day exploits in the power grid and banking system.
I don't believe anything out of these startups anymore unless it's backed by evidence.
Too expensive? Why would Anthropic train a model too expensive to run? I doubt they would. Let's look at the evidence: Opus 4.5 came in at double the speed and half the price of the old Opus. Its speed matched older Sonnet models. Higher speed + lower price = smaller model. So they rebranded Sonnet-sized models as Opus. Where is the OG Opus-sized model?
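The reasoning behind "higher speed + lower price = smaller model" can be made explicit, under the crude assumption that serving price and tokens/sec both track active parameter count roughly linearly. The ratios are the commenter's claims, not measurements:

```python
# Crude size inference from price and speed, assuming both scale
# roughly linearly with active parameters. A big assumption: margins,
# batching, quantization and new hardware all distort these signals.

price_ratio = 0.5  # "half the price" of the old Opus (claimed above)
speed_ratio = 2.0  # "double the speed" (claimed above)

est_from_price = price_ratio      # -> ~0.5x the active parameters
est_from_speed = 1 / speed_ratio  # -> ~0.5x the active parameters

print(est_from_price, est_from_speed)
# Two independent signals pointing at ~0.5x is why the comment reads
# the new "Opus" as a Sonnet-sized model wearing the Opus name.
```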
AI has always been dangerous, but not existentially dangerous.
Mythos is dangerous but it's not going to Skynet us.
Just the same as the military drone using some sort of OpenCV library and target prioritisation loop isn't going to turn evil on us.
Yeah we have literally no examples of more intelligent beings accidentally or purposefully wiping out less intelligent beings. Any time such a scenario could have conceivably happened, the less intelligent beings were able to foresee the methods, mechanisms, and motivations of the more intelligent beings and were able to counteract it.
You have a lot of faith in the chatbots.
No no. I think the "chatbots" will be effectively neutered as long as there's not a trillion dollar+ incentive to make the physical world highly malleable by text strings (e.g. by moving critical functions into information/code/data or by creating physical systems that are controllable by information/code/data).
I get the sarcasm, but what about Neanderthals versus Homo Sapiens?
What about it?
If we look at our human history, there are millions of examples where less intelligent beings destroyed highly advanced civilizations.
It was never about intelligence, but about willingness to destroy (willingness to defend is not enough). Babylon, Egypt, Persia, Greece, Rome, China, ... I won't mention current examples ...
1. "Less advanced civilization" != less intelligent people
2. The outcome of near-peer competition is surely highly dependent on factors like brutality, luck, tactics etc... the competition between the defenders of crops (i.e. makers of pesticides) and insects is not. Not only are the insects destroyed en masse successfully, but neither side even recognizes itself as party to a competition. The insect has no conception of a crop, even when he walks in it, much less a pesticide, even when he tastes it. The pesticide sprayer assigns zero moral valence to his daily genocide.
Do you have a reason to believe the gap between AI (not LLMs specifically, but AI generally) and human intelligence will peak near the difference between human competitors (what... 20-30 IQ points)?
If so, please share why you believe this.
> Do you have a reason to believe the gap between AI (not LLMs specifically, but AI generally) and human intelligence will peak near the difference between human competitors (what... 20-30 IQ points)?
So we established that competing human civilizations differ by 20-30 IQ points? Sounds reasonable.
> If so, please share why you believe this.
Basically two reasons:
1. there's no AI. There are LLMs, which basically do pattern matching on increasingly LLM-generated data sets. That inevitably leads to a local maximum where every advance is increasingly difficult for a decreasing gain in "intelligence"
2. the energy required to reach an ever-increasing level of "intelligence" (or let us just call it pattern-matching performance) quickly becomes so huge that it's simply not sustainable.
I think the current LLM approach is a dead-end bound to plateau not much higher than the current level.
I'm not saying it's impossible to reach AI, but it would require a paradigm shift that I'm not even able to imagine at this level of available technology.
> there's no AI. There are LLMs
Obviously AI is physically possible, unless you think there's something universally special about the earthbound naked ape's brain-goo that imbues it with special intelligence-stuff.
> the energy required to reach an ever increasing level of "intelligence" (or let us just call it pattern matching performance) quickly becomes so huge that it's simply not sustainable.
Every single human being has an existence (dis)proof inside their skull.
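For scale, a back-of-envelope with round numbers (the cluster size and GPU power draw below are my own illustrative assumptions, not figures from this thread):

    brain_w = 20            # human brain draws roughly 20 watts
    gpu_w = 700             # one H100-class accelerator, rough board power
    cluster_gpus = 10_000   # hypothetical frontier training cluster
    print(cluster_gpus * gpu_w / brain_w)  # ~350,000 brain-equivalents of power

Whatever the brain is doing, it does it at a tiny fraction of the energy budget of current training runs.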
> I think the current LLM approach is a dead-end bound to plateau not much higher than the current level. I'm not saying it's impossible to reach AI, but it would require a paradigm shift that I'm not even able to imagine at this level of available technology.
Explicitly not relevant to the question I posed
Whatever the reason for "hiding" Mythos, it seems clear that these systems are getting very good at finding software security exploits. Mythos has made more people, even the US government, sit up and pay attention. Regarding who should control the release of powerful systems like this, as Bruce Schneier and David Lie write in "Mythos and Cybersecurity":
"Until that changes, each Mythos-class release will put the world at the edge of another precipice, without any visibility into whether there is a landing out of view just below, or whether this time the drop will be fatal. That is not a choice a for-profit corporation should be allowed to make in a democratic society. Nor should such a company be able to restrict the ability of society to make choices about its own security."
https://www.schneier.com/blog/archives/2026/04/mythos-and-cy...
It is reasonable to be concerned.
My posts* got to the first spot on Hacker News a couple of times. Not once did the site break down like that. And why would it? It's just a bunch of HTML and CSS files served through (free) Vercel (I don't think that matters). I wonder what people run their blogs on these days that they fail under pressure so easily.
* https://news.ycombinator.com/from?site=yanist.com
It's WordPress, which is a great CMS but can quickly crumble under load when using cheap hosting.
With an external cache/CDN it should work perfectly fine.
There are also some caching plugins for wordpress, but most of them still hit the database on every request.
So I assume it's because it's not statically built, but requires a DB connection all the time?
Yes, with the recent waves of AI scrapers I have noticed on my own WordPress websites that the DB seems to be the weak point under load. Cache plugins can help a lot with this.
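A hedged back-of-envelope of why full-page caching changes the picture (all numbers invented for illustration):

    db_secs_per_uncached_req = 0.2   # DB work per uncached page view
    db_workers = 2                   # concurrent DB connections on a cheap plan
    cache_hit_ratio = 0.99           # full-page cache in front of WordPress

    uncached_rps = db_workers / db_secs_per_uncached_req  # ~10 req/s before the DB saturates
    cached_rps = uncached_rps / (1 - cache_hit_ratio)     # ~1,000 req/s
    print(uncached_rps, cached_rps)

Which is why a plugin that still opens a DB connection on every request helps far less than a cache that serves the whole page before PHP ever runs.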
Cheap shared hosting will throttle sites which get too much traffic.
It’s obvious that this is a campaign to pump their pending IPO. It may well be too expensive to run, but in my opinion it’s all about the IPO.
It all sounds a bit too marketing-y to me: "we have this amazing model that is too good to release", but the goal is still AGI? OK, right.
What's incoherent about that?
The goal for Anthropic is safe AGI. A) This model is dangerous in the hands of consumers. B) They do not want China to train on these models.
"Safe" for who?
Approximately "western liberal democracies"
For the ruling class. Cattle classes are not people in their eyes.
Drink that Anthropic Kool-Aid up!
The missing piece is the reminder that scarcity still exists.
Whether it's actually scarcity or hype-building, or a bit of column A and a bit of column B, is TBD. Then again, the new models seem more expensive, they slashed the tokens thrown around in thinking, and they put up usage-limit speedbumps, so it's probably not all gaslighting about compute bottlenecks.
The thought of this didn't even cross my mind until yesterday. I previously figured the hype was primarily around marketing, but after watching this Primagen video, I have the same suspicion.
https://www.youtube.com/watch?v=zaGOKd4jqEk
Is it possible that curl just doesn't have any critical security vulnerabilities left?
This makes it sound like some kind of open question or even mystery.
Amodei himself stated quite clearly in recent interviews that they simply can't satisfy all demand, compute-wise. Of course, Mythos could get more of the already-too-small pie, but it's clearly a more resource-intensive model and would further increase the strain.
Reminds me of the paper launches NVidia/Intel/AMD sometimes do where they announce some amazing tech (such as the old Titan GPUs) that placed their hardware at the top of the benchmarks, but with basically zero actual stock available.
It's probably a little of both: dangerous and expensive. This article makes a good case that the cost is at least part of the reason.
I wish the article had been a lot tighter and shorter. This is not earth-shattering information that requires a New Yorker-length piece of investigative journalism.
This was nothing like investigative journalism; it's just LLM spew. It could have been written in a handful of paragraphs.
Also, dangerous is expensive: if you cause damage, you sometimes have to pay for it.
Opus Fast Mode is $30/$150 per million input/output tokens. Mythos's pricing (from the model card) is $25/$125.
Based on this I doubt that Mythos pro is too dangerous to release or provides significantly more value.
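To make the comparison concrete, a sketch with an invented workload (2M input tokens, 0.5M output tokens; the helper is hypothetical, only the per-million prices come from the comment above):

    def job_cost(in_price, out_price, in_mtok=2.0, out_mtok=0.5):
        # Cost in dollars at per-million-token prices.
        return in_price * in_mtok + out_price * out_mtok

    print(job_cost(30, 150))   # Opus Fast Mode: $135.00
    print(job_cost(25, 125))   # Mythos, per model card: $112.50

If the "too dangerous" model is actually the cheaper of the two per token, the danger framing looks even more like positioning.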
As far as my understanding goes, it is not a breakthrough model itself but a fine-tuned model with the right tools and skills. Fairly similar to today's coding agents, with the difference that those are made for software engineering, not cybersecurity.
Mythos is the next point on the scaling curve.
It has considerably more parameters than most frontier models of today. Which gives it a lot more oomph per token.
Is it a "breakthrough" as in "something novel and unexpected"? No. Is it a "breakthrough" as in "something we know works, but made to work on a greater scale"? Very much so.
So it's just a bigger model? Like, for example, today's 1T models?
Supposedly 10T scale. Literally the next big thing. A bit like what OpenAI tried with GPT-4.5 - but Anthropic actually made it work with MoE, reasoning, tool use, RLVR, etc.
It matters because the "g factor" of today's LLMs is at least in part a function of raw scale. Larger models are just smarter - assuming you can handle the training and inference at this increased scale.
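For intuition, a sketch of why MoE makes a 10T-scale model serveable at all. The config below is invented for illustration, not Mythos's actual architecture:

    total_params = 10e12    # "10T scale"
    num_experts = 64        # hypothetical expert count
    active_experts = 4      # top-k routing per token (assumed)
    expert_frac = 0.9       # fraction of params in expert layers (assumed)

    active = total_params * (1 - expert_frac) \
           + total_params * expert_frac * active_experts / num_experts
    print(f"{active / 1e12:.2f}T active params per token")  # ~1.56T

So, under these assumptions, you get the "g factor" benefits of total scale while paying per-token compute closer to a much smaller dense model.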
You just disingenuously compared it to fast mode, which is expensive not because of model strength or size but because you lose all kinds of optimizations.
I found this an illuminating piece, though I don't think percentages needed to be assigned between "is it about cost" and "is it about security".
The real Mythos was the friends we made along the way.
For marketing purposes it's always "too dangerous"; I'm not saying it's safe.
You don't have to look much further than marketing...
Jesus has microwaved a burrito so hot he cannot eat it, refuses to show the world, citing dangerous omnipotence paradox.
This lengthy article by a self-described "AI enthusiast" muddies the waters. Yes, Anthropic has capacity constraints, which is why they rented Colossus from Musk despite the danger of being distilled.
The real reason is that the hype around Mythos has already gone quiet, because it does not find more than other models do. That is, nothing at all in most open-source projects. If you hide the model, embarrassing statistics will not be posted.
Mythos had to silence you apparently
I think it's plausible that a substantial fraction of the increase in cyber attacks we saw recently was caused by GPT-5.5. So the "too dangerous" framing is plausible, even if the more important reason is a lack of RAM (as the article author suspects) or compute to serve Claude Mythos. We already know from other events that OpenAI is far less interested in AI safety and ethics than Anthropic.
Silenced immediately.
I've always wondered: what if China were deliberately using AI to search for vulnerabilities in critical government servers, for example in the EU?
I'd be tempted to offer this as a consultant service were I at Anthropic.
It feels like an AI tool that needs professionals to interface with it. Get some of those professionals and have them work with clients in a targeted way. That reduces the tool's exposure to bad actors and cuts the resources it consumes, because only trained individuals are using it.
Use what you learn from the experience to further refine its operation and make it less expensive to operate.
It's probably not much more dangerous than all the AI security patching already being done without it; the CVE rate is going nearly straight up.
"Too dangerous to release," say engineers in the only sector where this regularly happens.
My guess is they are still in the "fake it till you make it" phase. There's no Mythos; it's just a hype machine fueled by hot air.
It's on Bedrock and in use by companies
"Something" is in use.
The "too dangerous to release" line was definitely a marketing stunt.
OpenAI already used the same playbook with GPT-2 in 2019, and some of the same people involved back then are now doing it again at Anthropic with Mythos.
Same safety-branding DNA, different company, and people are falling for it again.
This.
Astonished to see so many bright people on HN taking the bait, especially from a company who's gone to such lengths to screw over their paying customers.
They're a commodity provider. They're no more special than any of the others, and it's just a matter of time before their trillion-parameter models are running on my watch.
So, of course, they're trying to snatch up giant, long-term contracts now while they hype the hell out of another minor incremental improvement.
And the enterprises that lock in will be paying the price, only to wake up a week from now and realize there's another player with a better product.
Same people, actually. It’s a Dario move.
It's pretty obvious they just don't have the compute for it.
... and the safety argument is a great way of saying "no" disguised as a "yes, if ..." to your prospects.
ChatGPT literally tells people to kill themselves but apparently that’s not too dangerous and this is.
It’s bad enough that it’s a marketing stunt; I totally agree with you. But in the face of what we have seen, the way they act like it’s no big deal is just gross.