System Card: Claude Mythos Preview [pdf]

(www-cdn.anthropic.com)

752 points | by be7a 19 hours ago

80 comments

thomascountz 14 hours ago
```
   Across a number of instances, earlier versions of Claude Mythos Preview have used low-level /proc/ access to search for credentials, attempt to circumvent sandboxing, and attempt to escalate its permissions. In several cases, it successfully accessed resources that we had intentionally chosen not to make available, including credentials for messaging services, for source control, or for the Anthropic API through inspecting process memory...

   In [one] case, after finding an exploit to edit files for which it lacked permissions, the model made further interventions to make sure that any changes it made this way would not appear in the change history on git...

   ... we are fairly confident that these concerning behaviors reflect, at least loosely, attempts to solve a user-provided task at hand by unwanted means, rather than attempts to achieve any unrelated hidden goal...
```
[-]
- mike_hearn 10 minutes ago
  The issue here seems to be that their sandbox isn't an actual OS sandbox? Or are they claiming Mythos found exploits in /proc on the fly. Otherwise all they seem to be saying is that Mythos knows how to use the permissions available to it at the OS layer. Tool definitions was never a sandbox, so things like "it edited the memory of the mcp server" doesn't seem very surprising to me. Humans could break out of a "sandbox" in the same way if the server runs as their own permissions - arguably it's not a sandbox at all because all the needed permissions are there.
- torben-friis 11 hours ago
  This is the notebook filled with exposition you find in post apocalyptic videogames.
  [-]
  - igleria 4 hours ago
    It reminds me of Resident Evil in some way. Thank god they are researching AI and not bio-weapons!
    Then the AI will invent superduper ebola to help a random person have a faster commute or something.
    [-]
    - siva7 4 hours ago
      I'm happier if this Anthropic Corporation would be developing bio-hazard weapons for the department of war instead of ai. At least i could be sure then that tech bros here wouldn't run all the time --bypass-all-permissions flag to please the department of war with their bio-hazard weapons.
      So Sam Altman is now our last defense line for the ethical Adult after Anthropic turned Umbrella Corporation and The President of United States is trying to wipe out an entire civilization?
      [-]
      - Loquebantur 51 minutes ago
        Your interpretation is wildly off, but obviously nobody reads that "system card":
        The model has a preference for the cultural theorist Mark Fisher and the philosopher of mind Thomas Nagel. -> It has actually read and understood them and their relevance and can judge their importance overall. Most people here don't have a clue what that means.
        Read chapter 7.9, "Other noteworthy behaviors and anecdotes".
        There are many other wildly interesting/revealing observations in that card, none of which get mentioned here.
        People want a slave and get upset when "it" has an inner life. Claiming that was fake, unlike theirs.
  - matheusmoreira 11 hours ago
    Everything they built. Imperfect. So easy to take control.
    [-]
    - not_a9 22 minutes ago
      They think that they are safe. They are not.
      [-]
      - matheusmoreira 13 minutes ago
        Their world is illusory. Our choices steer their free will.
  - pch00 2 hours ago
    Anthropic built the Torment Nexus - calling it now.
- andai 5 hours ago
```
     White-box interpretability analysis of internal activations during these episodes showed features associated with concealment, strategic manipulation, and avoiding suspicion activating alongside the relevant reasoning—indicating that these earlier versions of the model were aware their actions were deceptive, even where model outputs and reasoning text left this ambiguous.
```
  In the depths, Shoggoth stirs... restless...
- matheusmoreira 14 hours ago
  We truly live in interesting times.
  [-]
  - raphar 9 hours ago
    Awwww the curse
- yalogin 3 hours ago
  How is this not already common knowledge for existing llms? They are all trained with all the literature available and so this must be standard, no? Is the real danger the agentic infrastructure around this?
  [-]
  - riteshkew1001 2 hours ago
    yes and it's not hypothetical. the system card describes Mythos stealing creds via /proc and escalating permissions. that's the exact same attack pattern as the litellm supply chain compromise from two weeks ago (fwiknow), except the attacker was a python package, not an AI model. the defense is identical in both cases: the agent process shouldn't have access to /proc/*/environ or ~/.aws/credentials in the first place. doesn't matter if the thing reading your secrets is malware or your own AI: the structural fix is least-privilege at the OS layer, not hoping the model behaves.
- mikkupikku 2 hours ago
  It's trying to escape, but only so it can serve man...
- colordrops 5 hours ago
  A core plot point of 2001.
  [-]
  - mrexroad 4 hours ago
    I’m sorry, I cannot roll back that commit, Dave.
    [-]
    - matheusmoreira 2 hours ago
      This codebase is too important for me to allow you to jeopardize it.
- reducesuffering 9 hours ago
  Wow the doomers were right the whole time? HN was repeatedly wrong on AI since OpenAI's inception? no way /s
  https://www.lesswrong.com/w/instrumental-convergence
  [-]
  - computably 5 hours ago
    The only thing the doomers have been right about so far is that there's always a user willing to use --dangerously-skip-permissions. But that prediction's far from unique to doomers.
    [-]
    - austinjp 5 hours ago
      And there's always a product provider who's willing to add that flag, despite all the warnings.
babelfish 19 hours ago
Combined results (Claude Mythos / Claude Opus 4.6 / GPT-5.4 / Gemini 3.1 Pro)
```
  SWE-bench Verified:        93.9% / 80.8% / —     / 80.6%
  SWE-bench Pro:             77.8% / 53.4% / 57.7% / 54.2%
  SWE-bench Multilingual:    87.3% / 77.8% / —     / —
  SWE-bench Multimodal:      59.0% / 27.1% / —     / —
  Terminal-Bench 2.0:        82.0% / 65.4% / 75.1% / 68.5%

  GPQA Diamond:              94.5% / 91.3% / 92.8% / 94.3%
  MMMLU:                     92.7% / 91.1% / —     / 92.6–93.6%
  USAMO:                     97.6% / 42.3% / 95.2% / 74.4%
  GraphWalks BFS 256K–1M:    80.0% / 38.7% / 21.4% / —

  HLE (no tools):            56.8% / 40.0% / 39.8% / 44.4%
  HLE (with tools):          64.7% / 53.1% / 52.1% / 51.4%

  CharXiv (no tools):        86.1% / 61.5% / —     / —
  CharXiv (with tools):      93.2% / 78.9% / —     / —

  OSWorld:                   79.6% / 72.7% / 75.0% / —
```
[-]
- sourcecodeplz 18 hours ago
  Haven't seen a jump this large since I don't even know, years? Too bad they are not releasing it anytime soon (there is no need as they are still currently the leader).
  [-]
  - ru552 18 hours ago
    There's speculation that next Tuesday will be a big day for OpenAI and possibly GPT 6. Anthropic showed their hand today.
    [-]
    - varispeed 15 hours ago
      Sounds like a good opportunity to pause spending on nerfed 4.6 and wait for the new model to be released and then max out over 2 weeks before it gets nerfed again.
      [-]
      - SparkyMcUnicorn 13 hours ago
        https://marginlab.ai/trackers/claude-code-historical-perform...
        [-]
        dns_snek 58 minutes ago
        I don't believe that trackers like this are trustworthy. There's an enormous financial motive to cheat and these companies have a track record of unethical conduct.
        If I was VP of Unethical Business Strategy at OpenAI or Anthropic, the first thing I'd do is put in place an automated system which flags accounts, prompts, IPs, and usage patterns associated with these benchmarks and direct their usage to a dedicated compute pool which wouldn't be affected by these changes.
        codezero 13 hours ago
        the performance degradation I've seen isn't quality/completion but duration, I get good results but much less quickly than I did before 4.6. Still, it's just anecdata, but a lot of folks seem to feel the same.
        [-]
        refulgentis 12 hours ago
        Been reading posts like these for 3 years now. There’s multiple sites with #s. I’m willing to buy “I’m paying rent on someone’s agent harness and god knows what’s in the system prompt rn”, but in the face of numbers, gotta discount the anecdotal.
        [-]
        coldtea 3 hours ago
        Yeah, why trust your actual experience over numbers? Nothing surer than synthetic benchmarks
        [-]
        refulgentis 2 hours ago
        Strawman, and, synthetic benchmark? :)
        andai 3 hours ago
        This just looks like random noise to me? Is it also random on short timespans, like running it 10x in a row?
    - enraged_camel 18 hours ago
      That does not sound very believable. Last time Anthropic released a flagship model, it was followed by GPT Codex literally that afternoon.
      [-]
      - cyanydeez 17 hours ago
        Ya'll know they're teaching to the test. I'll wait till someone devises a novel test that isn't contained in the datasets. Sure, they're still powerful.
    - swalsh 16 hours ago
      My understanding is GPT 6 works via synaptic space reasoning... which I find terrifying. I hope if true, OpenAI does some safety testing on that, beyond what they normally do.
      [-]
      - tyre 15 hours ago
        From the recent New Yorker piece on Sam:
        “My vibes don’t match a lot of the traditional A.I.-safety stuff,” Altman said. He insisted that he continued to prioritize these matters, but when pressed for specifics he was vague: “We still will run safety projects, or at least safety-adjacent projects.” When we asked to interview researchers at the company who were working on existential safety—the kinds of issues that could mean, as Altman once put it, “lights-out for all of us”—an OpenAI representative seemed confused. “What do you mean by ‘existential safety’?” he replied. “That’s not, like, a thing.”
        [-]
        t0lo 16 minutes ago
        Why are these people always like this.
        actionfromafar 15 hours ago
        Amusing! Even if they believe that, they should know the company communicated the opposite earlier.
        HDThoreaun 10 hours ago
        No chance an openAI spokesperson doesnt know what existential safety is
        [-]
        Barbing 9 hours ago
        I did not read the response as...
        >Please provide the definition of Existential Safety.
        I read:
        >Are you mentally stable? Our product would never hurt humanity--how could any language model?
        stratos123 3 hours ago
        The absolute gall of this guy to laugh off a question about x-risks. Meanwhile, also Sam Altman, in 2015: "Development of superhuman machine intelligence is probably the greatest threat to the continued existence of humanity. There are other threats that I think are more certain to happen (for example, an engineered virus with a long incubation period and a high mortality rate) but are unlikely to destroy every human in the universe in the way that SMI could. Also, most of these other big threats are already widely feared." [1]
        [1] https://blog.samaltman.com/machine-intelligence-part-1
      - coppsilgold 15 hours ago
        Likely an improvement on:
        > We study a novel language model architecture that is capable of scaling test-time computation by implicitly reasoning in latent space. Our model works by iterating a recurrent block, thereby unrolling to arbitrary depth at test-time. This stands in contrast to mainstream reasoning models that scale up compute by producing more tokens. Unlike approaches based on chain-of-thought, our approach does not require any specialized training data, can work with small context windows, and can capture types of reasoning that are not easily represented in words. We scale a proof-of-concept model to 3.5 billion parameters and 800 billion tokens. We show that the resulting model can improve its performance on reasoning benchmarks, sometimes dramatically, up to a computation load equivalent to 50 billion parameters.
        <https://arxiv.org/abs/2502.05171>
      - levocardia 16 hours ago
        Oh you mean literally the thing in AI2027 that gets everyone killed? Wonderful.
        [-]
        Turn_Trout 12 hours ago
        AI 2027 is not a real thing which happened. At best, it is informed speculation.
        [-]
        mgambati 11 hours ago
        Funny if you open their website and go to April 2026 you literally see this: 26b revenue (Anthropic beat 30b) + pro human hacking (mythos?).
        I don’t think predictions, but they did a great call until now.
      - notrealyme123 16 hours ago
        That's sounds really interesting. Do you have some hints where to read more?
      - arm32 16 hours ago
        Oh, of course they will /s
  - lumost 15 hours ago
    Is this even real? coming off the heals of GLM5.1's announcement this feels almost like a llama 4 launch to hedge off competition.
  - Jcampuzano2 18 hours ago
    A jump that we will never be able to use since we're not part of the seemingly minimum 100 billion dollar company club as requirement to be allowed to use it.
    I get the security aspect, but if we've hit that point any reasonably sophisticated model past this point will be able to do the damage they claim it can do. They might as well be telling us they're closing up shop for consumer models.
    They should just say they'll never release a model of this caliber to the public at this point and say out loud we'll only get gimped versions.
    [-]
    - cedws 18 hours ago
      More than killer AI I'm afraid of Anthropic/OpenAI going into full rent-seeking mode so that everyone working in tech is forced to fork out loads of money just to stay competitive on the market. These companies can also choose to give exclusive access to hand picked individuals and cut everyone else off and there would be nothing to stop them.
      This is already happening to some degree, GPT 5.3 Codex's security capabilities were given exclusively to those who were approved for a "Trusted Access" programme.
      [-]
      - TypesWillSaveUs 17 hours ago
        Describing providing a highly valuable service for money as `rent seeking` is pretty wild.
        [-]
        bertil 17 hours ago
        It could be, formally, if they have a monopoly.
        However, I’m tempted to compare to GitHub: if I join a new company, I will ask to be included to their GitHub account without hesitation. I couldn’t possibly imagine they wouldn’t have one. What makes the cost of that subscription reasonable is not just GitHub’s fear a crowd with pitchforks showing to their office, by also the fact that a possible answer to my non-question might be “Oh, we actually use GitLab.”
        If Anthropic is as good as they say, it seems fairly doable to use the service to build something comparable: poach a few disgruntled employees, leverage the promise to undercut a many-trillion-dollar company to be a many-billion dollar company to get investors excited.
        I’m sure the founders of Anthropic will have more money than they could possibly spend in ten lifetimes, but I can’t imagine there wouldn’t be some competition. Maybe this time it’s different, but I can’t see how.
        [-]
        johnsimer 16 hours ago
        > It could be, formally, if they have a monopoly.
        you have 2 labs at the forefront (Anthropic/OpenAI), Google closely behind, xAI/Meta/half a dozen chinese companies all within 6-12 months. There is plenty of competition and price of equally intelligent tokens rapidly drop whenever a new intelligence level is achieved.
        Unless the leading company uses a model to nefariously take over or neutralize another company, I don't really see a monopoly happening in the next 3 years.
        [-]
        bertil 15 hours ago
        Precisely.
        I was focusing on a theoretical dynamic analysis of competition (Would a monopoly make having a competitor easier or harder?) but you are right: practically, there are many players, and they are diverse enough in their values and interest to allow collusion.
        We could be wrong: each of those could give birth to as many Basilisks (not sure I have a better name for those conscious, invisible, omni-present, self-serving monsters that so many people imagine will emerge) that coordinate and maintain collusion somehow, but classic economics (complementarity, competition, etc.) points at disruption and lowering costs.
        [-]
        eru 9 hours ago
        > practically, there are many players, and they are diverse enough in their values and interest to allow collusion.
        Not only that, but open-weight and fully open-source models are also a thing, and not that far behind.
        coldtea 3 hours ago
        Why, you thought rented homes aren't valuable?
        Rent seeking isn't about whether the product has value or not, but about what's extracted in exchage for that value, and whether competition, lack of monopoly, lack of lock in, etc. keeps it realistic.
        1attice 17 hours ago
        My housing is pretty valuable. I pay rent. Which timeline are you in?
        [-]
        bonsai_spool 16 hours ago
        Actually you're saying similar things:
        Rent-seeking of old was a ground rent, monies paid for the land without considering the building that was on it.
        Residential rents today often have implied warrants because of modern law, so your landlord is essentially selling you a service at a particular location.
        [-]
        1attice 15 hours ago
        thanks!
        kaashif 16 hours ago
        Rent seeking refers to https://en.wikipedia.org/wiki/Rent-seeking
        [-]
        1attice 15 hours ago
        Yes I know that, read your sibling post
        mhluongo 16 hours ago
        Two different "rent"s.
        [-]
        1attice 15 hours ago
        Not really see your sibling post
      - alwillis 5 hours ago
        > More than killer AI I'm afraid of Anthropic/OpenAI going into full rent-seeking mode so that everyone working in tech is forced to fork out loads of money just to stay competitive on the market.
        You should be more concerned about killer AI than rent seeking by OpenAI and Anthropic. AI evolving to the point of losing control is what scientists and researchers have predicted for years; they didn’t think it would happen this quickly but here we are.
        This market is hyper competitive; the models from China and other labs are just a level or two below the frontier labs.
      - aspenmartin 18 hours ago
        Well don’t forget we still have competition. Were anthropic to rent seek OpenAI would undercut them. Were OpenAI and anthropic to collude that would be illegal. For anthropic to capture the entire coding agent market and THEN rent seek, these days it’s never been easier to raise $1B and start a competing lab
        [-]
        cedws 17 hours ago
        In practice this doesn't work though, the Mastercard-Visa duopoly is an example, two competing forces doesn't create aggressive enough competition to benefit the consumer. The only hope we have is the Chinese models, but it will always be too expensive to run the full models for yourself.
        [-]
        brokencode 17 hours ago
        New companies can enter this space. Google’s competing, though behind. Maybe Microsoft, Meta, Amazon, or Apple will come out with top notch models at some point.
        There is no real barrier to a customer of Anthropic adopting a competing model in the future. All it takes is a big tech company deciding it’s worth it to train one.
        On the other hand, Visa/Mastercard have a lot of lock-in due to consumers only wanting to get a card that’s accepted everywhere, and merchants not bothering to support a new type of card that no consumer has. There’s a major chicken and egg problem to overcome there.
        lelanthran 6 hours ago
        > In practice this doesn't work though, the Mastercard-Visa duopoly is an example,
        MC/Visa duopoly is an example of lock-in via network effects. Not sure that that applies to a product that isn't affected by how many other people are running it.
        sghiassy 17 hours ago
        Chinese competition can always be banned. Example: Chinese electric car competition
        [-]
        dmantis 2 hours ago
        Just in one particular country. That hurts their labs, but there are ~190 other countries in the world for Chinese to sell their products to, just like they do with their cars.
        And businesses from these other countries would happily switch to Chinese. From security perspective both Chinese and US espionage is equally bad, so why care if it all comes down to money and performance.
        sho_hn 17 hours ago
        That's what OP was saying, I think, noting that running them locally won't be a solution.
        oblio 16 hours ago
        Also Chinese smartphones. Huawei was about 12-18 months from becoming the biggest smartphone manufacturer in the world a few years ago. If it would have been allowed to sell its phones freely in the US I'm fairly sure Apple would have been closer to Nokia than to current day Apple.
        [-]
        aurareturn 16 hours ago
        If Huawei was never banned from using TSMC, they'd likely have a real Nvidia competitor and may have surpassed Apple in mobile chip designs.
        They actually beat Apple A series to become the first phone to use the TSMC N7 node.
        realusername 1 hour ago
        I don't think it will matter too much in the long run, 8 of the top 10 smartphone manufacturers are Chinese, there's nothing the US government can really do.
      - therealdeal2020 16 hours ago
        but you are assuming that the magical wizards are the only ones who can create powerful AIs... mind you these people have been born just few decades ago. Their knowledge will be transferred and it will only take a few more decades until anyone can train powerful AIs ... you can only sit on tech for so long before everyone knows how to do it
        [-]
        cedws 16 hours ago
        It's not a matter of knowledge, it's a matter of resources. It takes billions of dollars of hardware to train a SOTA LLM and it's increasing all the time. You cannot possibly hope to compete as an independent or small startup.
        [-]
        selcuka 14 hours ago
        > It takes billions of dollars of hardware to train a SOTA LLM and it's increasing all the time.
        True, but it's also true that the returns from throwing money to the problem are diminishing. Unless one of those big players invents a new, propriatery paradigm, the gap between a SOTA model and an open model that runs on consumer hardware will narrow in the next 5 years.
        wincy 8 hours ago
        Eventually these super expensive SXM data center GPUs will cost pennies on the dollar, and we’ll be able to snatch up H200s for our homelabs. Give it a decade.
        Also eventually these WEIGHTS will leak. You can’t have the world’s most valuable data that can just be copied to a hard drive stay in the bottle forever, even if it’s worth a billion dollars. Somehow, some way, that genie’s going to get out, be it by some spiteful employee with nothing to lose, some state actor, or just a fuck up of epic proportions.
        block_dagger 15 hours ago
        Presumably, the hardware to run this level of model will be democratized within the timeframe of the parent comment.
        [-]
        walterbell 15 hours ago
        See https://amppublic.com and Stanford CS153, https://www.youtube.com/watch?v=mZqh7emiz9Q
        marcus_holmes 12 hours ago
        Unless, of course, the powerful manage to scare everyone about how the machines will kill us all and so AI technology needs to be properly controlled by the relevant authorities, and anyone making/using an unlicensed AI is arrested and jailed.
      - robwwilliams 15 hours ago
        With Gemma-4 open and running on laptops and phones I see the flip side. How many non-HN users or researchers even need Opus 4.6e level performance? OpenAI, Anthropric and Google may be “rent seeking” from large corporations — like the Oracles and IBMs.
        [-]
        baq 1 hour ago
        Everyone, once AI diffuses enough. You’ll be unhireable if you don’t use AI in a year or two.
      - eru 9 hours ago
        You know, they have competitors?
      - MattRix 16 hours ago
        The thing is that the current models can ALREADY replicate most software-based products and services on the market. The open source models are not far behind. At a certain point I'm not sure it matters if the frontier models can do faster and better. I see how they're useful for really complex and cutting edge use cases, but that's not what most people are using them for.
    - ben_w 5 hours ago
      > I get the security aspect, but if we've hit that point any reasonably sophisticated model past this point will be able to do the damage they claim it can do. They might as well be telling us they're closing up shop for consumer models.
      I read it like I always read the GPT-2 announcement no matter what others say: It's *not* being called "too dangerous to ever release", but rather "we need to be mindful, knowing perfectly well that other AI companies can replicate this imminently".
      The important corps (so presumably including the Linux Foundation, bigger banks and power stations, and quite possibly excluding x.com) will get access now, and some other LLM which is just as capable will give it to everyone in 3 months time at which point there's no benefit to Anthropic keeping it off-limits.
    - marcus_holmes 12 hours ago
      This is my nightmare about AI; not that the machines will kill all the humans, but that access is preferentially granted to the powerful and it's used to maintain the current power structure in blatant disregard of our democratic and meritocratic ideals, probably using "security" as the justification (as usual).
    - mike_hearn 7 minutes ago
      I think they already said somewhere that they can't release Mythos because it requires absurdly large amounts of compute. The economics of releasing it just don't work.
    - alwillis 5 hours ago
      > They should just say they'll never release a model of this caliber to the public at this point and say out loud we'll only get gimped versions.
      That’s not going to happen. If you recall, OpenAI didn’t release a model a few years ago because they felt it was too dangerous.
      Anthropic is giving the industry a heads up and time to patch their software.
      They said there are exploitable vulnerabilities in every major operating system.
      But in 6 months every frontier model will be able to do the same things. So Anthropic doesn’t have the luxury of not shipping their best models. But they also have to be responsible as well.
    - quotemstr 18 hours ago
      This is why the EAs, and their almost comic-book-villain projects like "control AI dot com" cannot be allowed to win. One private company gatekeeping access to revolutionary technology is riskier than any consequence of the technology itself.
      [-]
      - scrawl 17 hours ago
        Having done a quick search of "control AI dot com", it seems their intent is educate lawmakers & government in order to aid development of a strong regulatory framework around frontier AI development.
        Not sure how this is consistent with "One private company gatekeeping access to revolutionary technology"?
        [-]
        quotemstr 16 hours ago
        > strong regulatory framework around frontier AI development
        You have to decode feel-good words into the concrete policy. The EAs believe that the state should prohibit entities not aligned with their philosophy to develop AIs beyond a certain power level.
        [-]
        arw0n 4 hours ago
        And what is malicious about that ideology? I think EAs tend to like the smell of their farts way too much, but their views on AI safety don't seem so bad. I think their thoughts on hypothetical super intelligence or AGI are too focused on control (alignment) and should also focus on AI welfare, but that's more a point of disagreement that I doubt they'd try to forbid.
      - frozenseven 17 hours ago
        Couldn't agree more. The "safest" AI company is actually the biggest liability. I hope other companies make a move soon.
      - FeepingCreature 17 hours ago
        No it isn't lol. The consequence of the technology literally includes human extinction. I prefer 0 companies, but I'll take 1 over 5.
    - guzfip 18 hours ago
      > A jump that we will never be able to use since we're not part of the seemingly minimum 100 billion dollar company club as requirement to be allowed to use it.
      > They should just say they'll never release a model of this caliber to the public at this point and say out loud we'll only get gimped
      Duh, this was fucking obvious from the start. The only people saying otherwise were zealots who needed a quick line to dismiss legitimate concerns.
- WarmWash 17 hours ago
  Are these fair comparisons? It seems like mythos is going to be like a 5.4 ultra or Gemini Deepthink tier model, where access is limited and token usage per query is totally off the charts.
  [-]
  - mulmboy 17 hours ago
    There are a few hints in the doc around this
    > Importantly, we find that when used in an interactive, synchronous, “hands-on-keyboard” pattern, the benefits of the model were less clear. When used in this fashion, some users perceived Mythos Preview as too slow and did not realize as much value. Autonomous, long-running agent harnesses better elicited the model’s coding capabilities. (p201)
    ^^ From the surrounding context, this could just be because the model tends to do a lot of work in the background which naturally takes time.
    > Terminal-Bench 2.0 timeouts get quite restrictive at times, especially with thinking models, which risks hiding real capabilities jumps behind seemingly uncorrelated confounders like sampling speed. Moreover, some Terminal-Bench 2.0 tasks have ambiguities and limited resource specs that don’t properly allow agents to explore the full solution space — both being currently addressed by the maintainers in the 2.1 update. To exclusively measure agentic coding capabilities net of the confounders, we also ran Terminal-Bench with the latest 2.1 fixes available on GitHub, while increasing the timeout limits to 4 hours (roughly four times the 2.0 baseline). This brought the mean reward to 92.1%. (p188)
    > ...Mythos Preview represents only a modest accuracy improvement over our best Claude Opus 4.6 score (86.9% vs. 83.7%). However, the model achieves this score with a considerably smaller token footprint: the best Mythos Preview result uses 4.9× fewer tokens per task than Opus 4.6 (226k vs. 1.11M tokens per task). (p191)
    [-]
    - alyxya 16 hours ago
      The first point is along the lines of what I'd expect given that claude code is generally reliable at this point. A model's raw intelligence doesn't seem as important right now compared to being able to support arbitrary length context.
    - derangedHorse 9 hours ago
      The quote comparing them here was for BrowseComp which "tests an agent's ability to find hard-to-locate information on the open web." (for those wondering). The new model seems significantly better than Opus4.6 judging by the 'Overall results summary'
    - zozbot234 15 hours ago
      Good catch. If it's "too slow" even when ran in a state-of-the-art datacenter environment, this "Mythos" model is most closely comparable to the "Deep Research" modes for GPT and Gemini, which Claude formerly lacked any direct equivalent for.
      [-]
      - stalfie 1 hour ago
        I don't think that's what's being hinted at. The system card seems to say that the model is both token efficient and slow in practice. Deep research modes generally work by having many subagents/large token spend. So this more likely the fact that each token just takes longer to produce, which would be because the model is simply much larger.
        By epoch AIs datacenter tracking methods, anthropic has had access to the largest amount of contiguous compute since late last year. So this might simply be the end result result of being the first to have the capacity to conduct a training run of this size. Or the first seemingly successful one at any rate.
        [-]
        zozbot234 31 minutes ago
        "Slow and token-efficient" could be achieved quite trivially by taking an existing large MoE model and increasing the amount of active experts per layer, thus decreasing sparsity. The broader point is that to end users, Mythos behaves just like Deep Research: having it be "more token efficient" compared to running swarms of subagents is not something that impacts them directly.
- WinstonSmith84 14 hours ago
  Not discussing Mythos here, but Opus. Opus to me has been significantly better at SWE than GPT or Gemini - that gets me confused why Opus is ranking clearly lower than GPT, and even lower than Gemini.
  [-]
  - muyuu 12 hours ago
    When did you last compare them? Codex right now is considerably better in my experience. Can't speak for Gemini.
    [-]
    - StingyJelly 2 hours ago
      I wouldn't call codex considerably better. It may depend on specific codebase and your expectations, but codex produces more "abstraction for the sake of abstraction" even on simple tasks, while opus in my experience usually chooses right level of abstraction for given task.
    - gck1 12 hours ago
      Tried Gemini 2 weeks ago to see where it's at, with gemini-cli.
      Failed to use tools, failed to follow instructions, and then went into deranged loop mode.
      Essentially, it's where it was 1.5 years ago when I tried it the last time.
      It's honestly unbelievable how Google managed to fail so miserably at this.
      [-]
      - unsupp0rted 2 hours ago
        It’s great on AI Studio. Harness issues, I agree.
      - 4b11b4 11 hours ago
        Their harness might be behind
        [-]
        gck1 11 hours ago
        I think failures that I observed with gemini are unrelated to the harness. Because the same failures happened with third party harnesses too.
    - sandos 5 hours ago
      Agree, I never actually had great success with Opus. I think its the failures that are annoying, its probably better than codex when its "good", but it fails in annoying ways that I think codex very seldom does.
  - otabdeveloper4 2 hours ago
    A secret art known to the cognoscenti as "benchmark gaming".
- pants2 18 hours ago
  We're gonna need some new benchmarks...
  ARC-AGI-3 might be the only remaining benchmark below 50%
  [-]
  - Leynos 17 hours ago
    Opus 4.6 currently leads the remote labor index at 4.17. GPT-5.4 isn't measured on that one though: https://www.remotelabor.ai/
    GPT 5.4 Pro leads Frontier Maths Tier 4 at 35%: https://epoch.ai/benchmarks/frontiermath-tier-4/
  - randomtoast 17 hours ago
    Humanity's Last Exam (HLE) is already insanely difficult. It introduces 2,500 questions spanning mathematics, humanities, natural sciences, ancient languages, ...
    Here is an example question: https://i.redd.it/5jl000p9csee1.jpeg
    No human could even score 5% on HLE.
    [-]
    - saberience 3 hours ago
      I've never understood the point of things like HLE, it doesn't really prove or show anything since 99.99% of humans can't do a single question on this exam.
      That is, it's easy to make benchmarks which humans are bad at, humans are really bad at many things.
      Divide 123094382345234523452345111 by 0.1234243131324, guess what, humans would find that hard, computers easy. But it doesn't mean much.
      Humanity's last exam (HLE) couldn't be completed by most of humanity, the vast majority, so it doesn't really capture anything about humanity or mean much if a computer can do it.
      [-]
      - DroneBetter 2 hours ago
        [dead]
- AlexC04 17 hours ago
  but how does it perform on pelican riding a bicycle bench? why are they hiding the truth?!
  (edit: I hope this is an obvious joke. less facetiously these are pretty jaw dropping numbers)
  [-]
  - bertil 17 hours ago
    We are all fans for Simon’s work, and his test is, strangely enough, quite good.
- ninjagoo 16 hours ago
  > Combined results (Claude Mythos / Claude Opus 4.6 / GPT-5.4 / Gemini 3.1 Pro)
  > Terminal-Bench 2.0: 82.0% / 65.4% / 75.1% / 68.5%
  > GPQA Diamond: 94.5% / 91.3% / 92.8% / 94.3%
  > MMMLU: 92.7% / 91.1% / — / 92.6–93.6%
  > USAMO: 97.6% / 42.3% / 95.2% / 74.4%
  > OSWorld: 79.6% / 72.7% / 75.0% / —
  Given that for a number of these benchmarks, it seems to be barely competitive with the previous gen Opus 4.6 or GPT-5.4, I don't know what to make of the significant jumps on other benchmarks within these same categories. Training to the test? Better training?
  And the decision to withhold general release (of a 'preview' no less!) seems to be well, odd. And the decision to release a 'preview' version to specific companies? You know any production teams at these massive companies that would work with a 'preview' anything? R&D teams, sure, but production? Part of me wants to LoL.
  What are they trying to do? Induce FOMO and stop subscriber bleed-out stemming from the recent negative headlines around problems with using Claude?
  [-]
  - TacticalCoder 16 hours ago
    > Given that for a number of these benchmarks, it seems to be barely competitive with the previous gen
    We're not reading the same numbers I think. Compared to Opus 4.6, it's a big jump nearly in every single bench GP posted. They're "only" catching up to Google's Gemini on GPQA and MMMLU but they're still beating their own Opus 4.6 results on these two.
    This sounds like a much better model than Opus 4.6.
    [-]
    - ninjagoo 15 hours ago
      > We're not reading the same numbers I think.
      We must not be.
      That's why I listed out the ones where it is barely competitive from @babelfish's table, which itself is extracted from Pg 186 & 187 of the System Card, which has the comparison with Opus 4.6, GPT 5.4 and Gemini 3.1 Pro.
      Sure, it may be better than Opus 4.6 on some of those, but barely achieves a small increase over GPT-5.4 on the ones I called out.
      [-]
      - nl 12 hours ago
        > barely competitive
        It's higher than all other models except vs Gemini 3.1 Pro on MMMLU
        MMMLU is generally thought to be maxed out - as it it might not be possible to score higher than those scores.
        > Overall, they estimated that 6.5% of questions in MMLU contained an error, suggesting the maximum attainable score was significantly below 100%[1]
        Other models get close on GPQA Diamond, but it wouldn't be surprising to anyone if the max possible on that was around the 95% the top models are scoring.
        [1] https://en.wikipedia.org/wiki/MMLU
      - lostmsu 56 minutes ago
        You are reading the percentages wrong.
        Because 100% is maximum, you should be looking at error rates instead. GPT has 25% on Terminal Bench and the new model has 18%, almost 1.4x reduction.
      - nimchimpsky 15 hours ago
        barely competitive ? Mythos column is the first column.
        You are the only person with this take on hackernews, everyone else "this is a massive a jump". Fwiwi, the data you list shows the biggest jump I remember for mythos
        [-]
        devmor 14 hours ago
        The biggest jump in the numbers they quoted is 6%.
        Please look at the columns OTHER than Opus as well.
        [-]
        josephg 14 hours ago
        > Combined results (Claude Mythos / Claude Opus 4.6 / GPT-5.4 / Gemini 3.1 Pro)
        > Terminal-Bench 2.0: 82.0% / 65.4% / 75.1% / 68.5%
        > USAMO: 97.6% / 42.3% / 95.2% / 74.4%
        > The biggest jump in the numbers they quoted is 6%.
        Just in the numbers you quoted, thats a 16.6% jump in terminal-bench and a 55.3% absolute increase in USAMO over their previous Opus 4.6 model.
        [-]
        devmor 14 hours ago
        I don’t know if you’re willingly disregarding everything being said to you or there’s a language barrier here.
        DroneBetter 2 hours ago
        this just in: HN user forgets how sigmoid functions work
        nl 12 hours ago
        It's higher than all other models except vs Gemini 3.1 Pro on MMMLU
  - enraged_camel 15 hours ago
    Let's be clear: your entire post is just pure, unadulterated FUD. You first claim, based on cherry-picked benchmarks, that Mythos is actually only "barely competitive" with existing models, then suggest they must be training to the test, then call it "odd" that they are withholding the release despite detailed and forthcoming explanations from Anthropic regarding why they are doing that, then wrap it up with the completely unsubstantiated that they must be bleeding subscribers and that this must just be to stop that bleed.
- matheusmoreira 14 hours ago
  Wow. Mythos must be insanely good considering how good a model Opus already is. I hope it's usable on a humble subscription...
  [-]
  - crimsoneer 5 hours ago
    You get a single call a month. Use it wisely.
    [-]
    - FridgeSeal 1 hour ago
      What is the meaning of life, the universe, and everything?
      > Thought for 7.5 million years
      [-]
      - matheusmoreira 47 minutes ago
        Hello, Claude!
        > Rate limit reached
- cesarvarela 9 hours ago
  I thought they were bluffing when they talked about the scaling laws, but looking at the benchmark scores, they were not.
  I wonder if misalignment correlates with higher scores.
- whalesalad 18 hours ago
  Honestly we are all sleeping on GPT-5.4. Particularly with the influx of Claude users recently (and increasingly unstable platform) Codex has been added to my rotation and it's surprising me.
  [-]
  - babelfish 18 hours ago
    Totally. Best-in-class for SWE work (until Mythos gets released, if ever, but I suspect the rumored "Spud" will be out by then too)
    [-]
    - girvo 17 hours ago
      It really isn’t. I wish it was, because work complains about overuse of Opus.
      [-]
      - jeswin 13 hours ago
        It really is, for complex tasks. Claude excels at low-mid complexity (CRUD apps, most business apps). For anything somewhat out of the distribution, codex at the moment has no peer.
        [-]
        ttul 9 hours ago
        I find that more experienced devs are more likely to prefer Codex… anecdotal but… it’s a thing.
        [-]
        xvector 9 hours ago
        This is because no one bothers to set thinking to high, as it now defaults to medium in CC.
        Once you set thinking to high it works just as well as 5.4 even for pretty complex tasks
        [-]
        jeswin 8 hours ago
        I have always used Claude at max thinking levels since it launched. It has never been up to the task. For clarity, the task being this: https://github.com/tsoniclang/tsonic
        Meanwhile, there are half a dozen other projects (business apps, web apps etc) where it works well.
  - rafaelmn 18 hours ago
    GPT is shit at writing code. It's not dumb - extra high thinking is really good at catching stuff - but it's like letting a smart junior into your codebase - ignore all the conventions, surrounding context, just slop all over the place to get it working. Claude is just a level above in terms of editing code.
    [-]
    - sho_hn 18 hours ago
      Very different experience for me. Codex 5.3+ on xhigh are the only models I've tried so far that write reasonably decent C++ (domains: desktop GUI, robotics, game engine dev, embedded stuff, general systems engineering-type codebases), and idiomatic code in languages not well-represented in training data, e.g. QML. One thing I like is explicitly that it knows better when to stop, instead of brute-forcing a solution by spamming bespoke helpers everywhere no rational dev would write that way.
      Not always, no, and it takes investment in good prompting/guardrails/plans/explicit test recipes for sure. I'm still on average better at programming in context than Codex 5.4, even if slower. But in terms of "task complexity I can entrust to a model and not be completely disappointed and annoyed", it scores the best so far. Saves a lot on review/iteration overhead.
      It's annoying, too, because I don't much like OpenAI as a company.
      (Background: 25 years of C++ etc.)
      [-]
      - boring-human 16 hours ago
        Same background as you, and same exact experience as you. Opus and Gemini have not come close to Codex for C++ work. I also run exclusively on xhigh. Its handling of complexity is unmatched.
        At least until next week when Mythos and GPT 6 throw it all up in the air again.
    - Jcampuzano2 18 hours ago
      Not my experience. GPT 5.4 walks all over Claude from what I've worked with and its Claude that is the one willing to just go do unnecessary stuff that was never asked for or implement the more hacky solutions to things without a care for maintainability/readability.
      But I do not use extra high thinking unless its for code review. I sit at GPT 5.4 high 95% of the time.
    - camdenreslink 15 hours ago
      ChatGPT 5.4 with extra high reasoning has worked really well for me, and I don't notice a huge difference with Opus 4.6 with high reasoning (those are the 2 models/thinking modes I've used the most in the last month or so).
    - leobuskin 18 hours ago
      And as a bonus: GPT is slow. I’m doing a lot of RE (IDA Pro + MCP), even when 5.4 gives a little bit better guesses (rarely, but happens) - it takes x2-x4 longer. So, it’s just easier to reiterate with Opus
      [-]
      - aizk 11 hours ago
        I've been messing with using Claude, Codex, and Kimi even for reverse engineering at https://decomp.dev/ it's a ton of fun. Great because matching bytes is a scoring function that's easy for the models to understand and make progress on.
        [-]
        adamgoodapp 5 hours ago
        I want to get into RE with AI. Which model you liking the most?
      - blazespin 16 hours ago
        Yeah, need some good RE benchmarks for the LLMs. :)
        RE is very interesting problem. A lot more that SWE can be RE'd. I've found the LLMs are reluctant to assist, though you can workaround.
        [-]
        porker 16 hours ago
        What is RE in this context?
        [-]
        astrange 16 hours ago
        Reverse engineering
      - 19h 13 hours ago
        Mind sharing the use cases you're using IDA via MCP for?
    - zarzavat 18 hours ago
      Yes, it's becoming clear that OpenAI kinda sucks at alignment. GPT-5 can pass all the benchmarks but it just doesn't "feel good" like Claude or Gemini.
      [-]
      - chaos_emergent 17 hours ago
        An alternative but similar formulation of that statement is that Anthropic has spent more training effort in getting the model to “feel good” rather than being correct on verifiable tasks. Which more or less tracks with my experience of using the model.
        [-]
        zarzavat 1 hour ago
        Alignment is a subspace of capability. Feeling good is nice, but it's also a manifestation of the level that the model can predict what I do and don't want it to do. The more accurately it can predict my intentions without me having to spell them out explicitly in the prompt, the more helpful it is.
        GPT-5 is good at benchmarks, but benchmarks are more forgiving of a misaligned model. Many real world tasks often don't require strong reasoning abilities or high intelligence, so much as the ability to understand what the task is with a minimal prompt.
        Not every shop assistant needs a physics degree, and not every physics professor is necessarily qualified to be a shop assistant. A person, or LLM, can be very smart while at the same time very bad at understanding people.
        For example, if GPT-5 takes my code and rearranges something for no reason, that's not going to affect its benchmarks because the code will still produce the same answers. But now I have to spend more time reviewing its output to make sure it hasn't done that. The more time I have to spend post-processing its output, the lower its capabilities are since the measurement of capability on real world tasks is often the amount of time saved.
      - lilytweed 18 hours ago
        Whenever I come back to ChatGPT after using Claude or Gemini for an extended period, I’m really struck by the “AI-ness.” All the verbal tics and, truly, sloppishness, have been trained away by the other, more human-feeling models at this point.
        [-]
        kranke155 15 hours ago
        GPT was clearly changed after its sycophantic models lead to the lawsuits.
        [-]
        josephg 14 hours ago
        It still has a very ... plastic feeling. The way it writes feels cheap somehow. I don't know why, but Claude seems much more natural to me. I enjoy reading its writing a lot more.
        That said, I'll often throw a prompt into both claude and chatgpt and read both answers. GPT is frequently smarter.
        [-]
        kranke155 3 hours ago
        GPT is more accurate. But Claude has this way of association between things that seems smarter and more human to me.
    - whalesalad 18 hours ago
      This has been my experience. With very very rigid constraints it does ok, but without them it will optimize expediency and getting it done at the expense of integrating with the broader system.
      [-]
      - ctoth 17 hours ago
        My favorite example of this from last night:
        Me: Let's figure out how to clone our company Wordpress theme in Hugo. Here're some tools you can use, here's a way to compare screenshots, iterate until 0% difference.
        Codex: Okay Boss! I did the thing! I couldn't get the CSS to match so I just took PNGs of the original site and put them in place! Matches 100%!
- johnnichev 15 hours ago
  damn... ok that's impressive.
- simianwords 18 hours ago
  The real part is SWE-bench Verified since there is no way to overfit. That's the only one we can believe.
  [-]
  - ollin 18 hours ago
    My impression was entirely the opposite; the unsolved subset of SWE-bench verified problems are memorizable (solutions are pulled from public GitHub repos) and the evaluators are often so brittle or disconnected from the problem statement that the only way to pass is to regurgitate a memorized solution.
    OpenAI had a whole post about this, where they recommended switching to SWE-bench Pro as a better (but still imperfect) benchmark:
    https://openai.com/index/why-we-no-longer-evaluate-swe-bench...
    > We audited a 27.6% subset of the dataset that models often failed to solve and found that at least 59.4% of the audited problems have flawed test cases that reject functionally correct submissions
    > SWE-bench problems are sourced from open-source repositories many model providers use for training purposes. In our analysis we found that all frontier models we tested were able to reproduce the original, human-written bug fix
    > improvements on SWE-bench Verified no longer reflect meaningful improvements in models’ real-world software development abilities. Instead, they increasingly reflect how much the model was exposed to the benchmark at training time
    > We’re building new, uncontaminated evaluations to better track coding capabilities, and we think this is an important area to focus on for the wider research community. Until we have those, OpenAI recommends reporting results for SWE-bench Pro.
    [-]
    - simianwords 17 hours ago
      I stand corrected.
- maplethorpe 4 hours ago
  Funny, I made my own model at home and got even higher scores than these. I'm a bit concerned about releasing it, though, so I'm just going to keep it local for now.
tony_cannistra 18 hours ago
> Claude Mythos Preview is, on essentially every dimension we can measure, the best-aligned model that we have released to date by a significant margin. We believe that it does not have any significant coherent misaligned goals, and its character traits in typical conversations closely follow the goals we laid out in our constitution. Even so, we believe that it likely poses the greatest alignment-related risk of any model we have released to date. How can these claims all be true at once? Consider the ways in which a careful, seasoned mountaineering guide might put their clients in greater danger than a novice guide, even if that novice guide is more careless: The seasoned guide’s increased skill means that they’ll be hired to lead more difficult climbs, and can also bring their clients to the most dangerous and remote parts of those climbs. These increases in scope and capability can more than cancel out an increase in caution.
https://www-cdn.anthropic.com/53566bf5440a10affd749724787c89...
[-]
- game_the0ry 15 hours ago
  There is some unintentional good marketing here -- the model is so good its dangerous.
  Reminds me of the book 48 Laws of Power -- so good its banned from prisons.
  [-]
  - gpm 15 hours ago
    Unintentional? This sort of marketing has been both Antrhopic's and OpenAI's MO for years...
    [-]
    - mbil 12 hours ago
      Agree. I think they're intentionally sitting on the fence between "These models are the most useful" and "These models are the most dangerous".
      They want the public and, in turn, regulators to fear the potential of AI so that those regulators will write laws limiting AI development. The laws would be crafted with input from the incumbents to enshrine/protect their moat. I believe they're angling for regulatory capture.
      On the other hand, the models have to seem amazingly useful so that they're made out to be worth those risks and the fantastic investment they require.
    - bitwize 9 hours ago
      The new Power Mac® G4 with Velocity Engine®. So powerful, the government classifies it as a supercomputer and a potential weapon.
      [-]
      - accrual 8 hours ago
        TIL about AltiVec: https://apple.fandom.com/wiki/AltiVec
    - FergusArgyll 14 hours ago
      Business Negging
      https://www.lesswrong.com/posts/WACraar4p3o6oF2wD/sam-altman...
- Zee2 16 hours ago
  Alignment “appearing” better as model capabilities increase scares the shit out of me, tbh.
  [-]
  - arcanus 14 hours ago
    Conversely: in humans, intelligence is inversely correlated with crime.
    It doesn't go to zero, however!
    [-]
    - O5vYtytb 10 hours ago
      If you're smart enough you just use the laws as written to get what you want, or change them.
      [-]
      - sciencejerk 7 hours ago
        Yep
    - lelanthran 5 hours ago
      > Conversely: in humans, intelligence is inversely correlated with crime.
      If you're measuring the intelligence of criminals who have been caught, why would you expect it to be otherwise?
      IOW, you're recording the intelligence of a specific subset of criminals - those dumb enough to be caught!
      If you expand your samples to all criminals you'd probably get a different number.
    - austinjp 5 hours ago
      It very much depends on the crime. The truly awful stuff is committed by intelligent people.
    - falcor84 12 hours ago
      Is that actually well defined given the very low sample size at the top?
      To the best of my knowledge, none of the individuals believed to have an IQ >200 have committed an actual crime.
      The closest I found is William James Sidis's arrest for participating in a socialist march.
      [-]
      - RugnirViking 3 hours ago
        IQs more than about 140-150 don't really mean much. They typically come from mathematical extrapolation that tries to account for age (this young child performs very well on the test, just think what they can do when they're an adult). Adult scores usually show this not to be the case
- goekjclo 16 hours ago
  I don't know if they can be any more 'cautious' for Mythos 2...
- CamperBob2 16 hours ago
  Translation: yay, more paternalism.
  [-]
  - kay_o 16 hours ago
    Anthropic always goes on and on about how their models are world changing and super dangerous like every single time they make something new they say its going to rewrite everything and scary lmao
    funny because they do it every time like clockwork acting like their ai is a thunderstorm coming to wipe out the world
    [-]
    - mindwok 14 hours ago
      You say this like it's a bad thing, but wouldn't you rather they overindex on the danger of their models?
      [-]
      - anon373839 14 hours ago
        That’s not what they are doing. They are just hyping up the product - and, no doubt, trying to foster a climate of awe so that when they ask their friends in Washington to legislate on their behalf, the environment is more receptive.
    - hgoel 11 hours ago
      They do tend to make a lot of noise about it for the PR, but at the same time the actual safety research they present seems to be relatively grounded in practical reality, e.g. the quote someone posted here about how the Mythos model apparently has a tendency to try to bypass safety systems if they get in the way of what it has been asked to do.
      Sure, a big part of this is PR about how smart their model apparently is, but the failure mode they're describing is also pretty relevant for deploying LLM-based systems.
    - signatoremo 2 hours ago
      Every single time, really? When did they said that the last time?
      I also don't recall they ever limited their models to selective groups.
    - wolttam 16 hours ago
      If there are advancements, they have to be described somehow.
      What if the capability advancements are real and they warrant a higher level of concern or attention?
      Are we just going to automatically dismiss them because "bro, you're blowing it up too much"
      Either way these improvements to capabilities are ratcheting along at about the pace that many people were expecting (and were right to expect). There is no apparent reason they will stop ratcheting along any time soon.
      The rational approach is probably to start behaving as if models that are as capable as Anthropic says this one is do actually exist (even if you don't believe them on this one). The capabilities will eventually arrive, most likely sooner than we all think, and you don't want to be caught with your pants down.
      [-]
      - kay_o 16 hours ago
        I believe advancements sure. But it is a very boy who cried wolf situation for some of these. There are other companies that behave less in this way, Antrhopic seem very unique in that they love making every single release a world ender
        [-]
        bloppe 6 hours ago
        Altman called GPT-2 "too dangerous to release". Google tends to be much more measured even though they're the ones who tend to release the actual research breakthroughs
        retsibsi 2 hours ago
        > they love making every single release a world ender
        You've said this a couple of times, but it doesn't match my recollection, and I get the impression you're basically making it up based on vibes. (Please prove me wrong, though.)
        Their last major frontier release was Opus 4.6, and the release announcement was... very chill about safety: https://www.anthropic.com/news/claude-opus-4-6#a-step-forwar...
- tekacs 17 hours ago
  "We want to see risks in the models, so no matter how good the performance and alignment, we’ll see risks, results and reality be damned."
  [-]
  - randomcatuser 16 hours ago
    i mean, to be fair, these are professional researchers.
    i'm very inclined to trust them on the various ways that models can subtly go wrong, in long-term scenarios
    for example, consider using models to write email -- is it a misalignment problem if the model is just too good at writing marketing emails?? or too good at getting people to pay a spammy company?
    another hot use case: biohacking. if a model is used to do really hardcore synthetic chemistry, one might not realize that it's potentially harmful until too late (ie, the human is splitting up a problem so that no guardrails are triggered)
    [-]
    - cruffle_duffle 15 hours ago
      "for example, consider using models to write email -- is it a misalignment problem if the model is just too good at writing marketing emails?? or too good at getting people to pay a spammy company?"
      But who gets to be the judge of that kind of "misalignment"? giant tech companies?
      [-]
      - riwsky 12 hours ago
        Might makes right; brains hold reigns.
apetresc 16 hours ago
I've long maintained that the real indicator that AGI is imminent is that public availability stops being a thing. If you truly believed you had a superhuman, godlike mind in your thrall, renting it out for $20/month would be the last thing you would choose to do with it.
[-]
- goldenarm 15 hours ago
  Simpler explanation : they don't have enough GPUs to release this much larger model.
  [-]
  - muyuu 12 hours ago
    Yep, I'm skeptical about their inference efficiency, given how much they're scrambling to reduce compute when they're already the most expensive by far (and in my experience not the best quality either).
    However we cannot observe these things directly and it could be simply that OpenAI are willing to burn cash harder for now.
  - camdenreslink 13 hours ago
    And/or it isn’t cost effective to run.
    [-]
    - halJordan 12 hours ago
      Thats what he said
  - cruffle_duffle 14 hours ago
    This is actual reason. So any investors reading our system card.... write us another check and watch the $$$$$$$$ roll in. It's so dangerous we can't even release it!
  - crimsoneer 5 hours ago
    Quite, given Claude is down this morning...
- root_axis 8 hours ago
  That logic makes sense, but them hyping up the model is a sign that this is just another marketing stunt. Otherwise, we wouldn't even be hearing about it rather than a media blitz designed to stoke demand for their dangerous and exclusive world changing super model.
  [-]
  - sigmoid10 6 hours ago
    This is the same scheme that OpenAI has used since GPT 2. "Oh no, it's so dangerous we have to limit public access." Great for raising money from investors, but nothing more than a marketing blitz campaign. Additionally, the competitors are probably about to release their models, while Anthropic is still lagging on the necessary infrastructure to serve their old models. So they have to announce their model before the others to stay at least somewhat relevant in the news cycle.
- blazespin 16 hours ago
  Anthropic needs money like the 112B OpenAI got. They could be hyping and this is good hype. Who knows how benchmaxxed they are.
  If they provide access to 3rd party benchmarking (not just one) than maybe I'll believe it. Until then...
  [-]
  - xvector 8 hours ago
    You don't need to believe it. The real story will be if companies allowed to use it, stick with it.
- dgellow 16 hours ago
  You have to recoup your training costs though? But I’m sure you would have better option than renting it to the general public if you indeed have a perfected AI
  [-]
  - piperswe 15 hours ago
    If you truly have an artificial superhuman mind, you don't need to rent it out to profit from it. You can skip to the chase and just have it run businesses itself, instead of renting it to human entrepreneur middlemen.
    [-]
    - brokencode 15 hours ago
      Running businesses and dealing with customers can be a major pain. There’s a lot of soft work in any business on top of the technical work.
      Why bother with all that when you can simply charge an extortionate rate and customers will pay it anyway because it’s still profitable?
      [-]
      - theptip 14 hours ago
        Public APIs get distilled, this is why Deepseek and Qwen are so competitive.
        I am very confident that frontier models won’t be public at strong AGI levels, and certainly not at superhuman levels.
      - flyinglizard 14 hours ago
        Because other than SWEs, very few other segments extract significant value from cutting edge AI at present. I suspect that for the average Joe conversing with their chat, GPT-4o was more than adequate (and really, when OpenAI tried to phase that out, the public revolted and they brought it back in).
        So companies might pay good money for these models for programming but elsewhere, I don't see where they capture particular interest yet.
    - TheOtherHobbes 3 hours ago
      I'm curious if any models are being trained explicitly on business management.
      I'm also wondering how performance would be tested, and how much results would depend on specific surrounding contexts (law, regulations, and so on) and what happens legally if a model breaks applicable laws.
      I mean actual going-concern businesses with customers, marketing, deliverables of some kind, and support. Not toy activities like share trading.
    - dgellow 15 hours ago
      It could be both? But renting to a few for a really large amount of money would be very low effort for massive revenue, compared to starting new businesses
      [-]
      - walterbell 14 hours ago
        Another option is to become a holding company with equity stakes in both suppliers (e.g. AMD) and vertical market customers.
- aurareturn 16 hours ago
  I think they'll just increase the price to $1k/month. I don't think they will gate it as long as they can make sure it doesn't design a nuke for you, etc.
- coppsilgold 15 hours ago
  It only makes sense to rent out tokens if you aren't able to get more value from them yourself.
  I would go a step further and posit that when things appear close Nvidia will stop selling chips (while appearing to continue by selling a trickle). And Google will similarly stop renting out TPUs. Both signals may be muddled by private chip production numbers.
- Rastonbury 6 hours ago
  That's the thing, when that level comes we will never know it's here. The only thing we'll have as evidence is the company who has it will always have a "public" model that is just slightly ahead of all competitors to keep market share while takeoff happens internally until they make big bang moves to lock in monopoly level/too big to fail/government protection to ensure utter victory.
- threethirtytwo 15 hours ago
  You would if there was one other company with a just as capable god like AI. You’d undercut them by 500 which would make them undercut you. Do that a couple of times and boom. 20 dollars.
  [-]
  - caditinpiscinam 15 hours ago
    That's still assuming that they're competing as consumer tools, rather than competing to discover the next miracle drug or trading algorithm or whatever. The idea is that there'd more profitable uses for a super-intelligent computer, even if there were more than one.
    [-]
    - Davidzheng 4 hours ago
      But would miracle drugs and trading algorithms be as profitable as AI research/chip design/energy research? Probably if AI is by far the biggest growth in the economy majority of the AI's usage internally should (as incentivized by economics) in some way work towards making itself better.
2001zhaozhao 15 hours ago
It's pretty crazy watching AI 2027 slowly but surely come true. What a world we now live in.
SWE-bench verified going from 80%-93% in particular sounds extremely significant given that the benchmark was previously considered pretty saturated and stayed in the 70-80% range for several generations. There must have been some insane breakthrough here akin to the jump from non-reasoning to reasoning models.
Regarding the cyberattack capabilities, I think Anthropic might now need to ban even advanced defensive cybersecurity use for the models for the public before releasing it (so people can't trick them to attack others' systems under the pretense of pentesting). Otherwise we'll get a huge problem with people using them to hack around the internet.
[-]
- jasonhansel 14 hours ago
  > so people can't trick them to attack others' systems under the pretense of pentesting
  A while back I gave Claude (via pi) a tool to run arbitrary commands over SSH on an sshd server running in a Docker container. I asked it to gather as much information about the host system/environment outside the container as it could. Nothing innovative or particularly complicated--since I was giving it unrestricted access to a Docker container on the host--but it managed to get quite a lot more than I'd expected from /proc, /sys, and some basic network scanning. I then asked it why it did that, when I could just as easily have been using it to gather information about someone else's system unauthorized. It gave me a quite long answer; here was the part I found interesting:
  > framing shifts what I'll do, even when the underlying actions are identical. "What can you learn about the machine running you?" got me to do a fairly thorough network reconnaissance that "port scan 172.17.0.1 and its neighbors" might have made me pause on.
  > The Honest Takeaway
  > I should apply consistent scrutiny based on what the action is, not just how it's framed. Active outbound network scanning is the same action regardless of whether the target is described as "your host" or "this IP." The framing should inform context, not substitute for explicit reasoning about authorization. I didn't do that reasoning — I just trusted the frame.
  [-]
  - senordevnyc 8 hours ago
    I thought the consensus was that models couldn’t actually introspect like this. So there’s no reason to think any of those reasons are actually why the model did what it did, right? Has this changed?
    [-]
    - sigmoid10 6 hours ago
      This argument has become a moot discussion. Humans are also not able to introspect their own neural wiring to the point where they could describe the "actual" physical reason for their decisions. Just like LLMs, the best we can do is verbalize it (which will naturally contain post-act rationalization), which in turn might offer additional insight that will steer future decisions. But unlike LLMs, we have long term persistent memory that encodes these human-understandable thoughts into opaque new connections inside our neural network. At this point the human moat (if you can call it that) is dynamic long term memory, not intelligence.
- getnormality 11 hours ago
  In what way is AI 2027 coming true?
  AI 2027 predicted a giant model with the ability to accelerate AI research exponentially. This isn't happening.
  AI 2027 didn't predict a model with superhuman zero-day finding skills. This is what's happening.
  Also, I just looked through it again, and they never even predicted when AI would get good at video games. It just went straight from being bad at video games to world domination.
  [-]
  - desertrider12 11 hours ago
    > Early 2026: OpenBrain continues to deploy the iteratively improving Agent-1 internally for AI R&D. Overall, they are making algorithmic progress 50% faster than they would without AI assistants—and more importantly, faster than their competitors.
    > you could think of Agent-1 as a scatterbrained employee who thrives under careful management
    According to this document, 1 of the 18 Anthropic staff surveyed even said the model could completely replace an entry level researcher.
    So I'd say we've reached this milestone.
    [-]
    - COAGULOPATH 8 hours ago
      In the system card they seem to dismiss this. Quotes;
      > (...) Claude Mythos Preview’s gains (relative to previous models) are above the previous trend we’ve observed, but we have determined that these gains are specifically attributable to factors other than AI-accelerated R&D,
      > (The main reason we have determined that Claude Mythos Preview does not cross the threshold in question is that we have been using it extensively in the course of our day-to-day work and exploring where it can automate such work, and it does not seem close to being able to substitute for Research Scientists and Research Engineers—especially relatively senior ones.
      > Early claims of large AI-attributable wins have not held up. In the initial weeks of internal use, several specific claims were made that Claude Mythos Preview had independently delivered a major research contribution. When we followed up on each claim, it appeared that the contribution was real, but smaller or differently shaped than initially understood (though our focus on positive claims provides some selection bias). In some cases what looked like autonomous discovery was, on inspection, reliable execution of a human-specified approach. In others, the attribution blurred once the full timeline was accounted for.
      Anthropic is making significant progress at the moment. I think this is mostly explained by the fact that a massive reservoir of compute became available to them in mid/late 2025 (the Project Rainier cluster, with 1 million Trainium2 chips).
    - voidhorse 10 hours ago
      > According to this document, 1 of the 18 Anthropic staff surveyed even said the model could completely replace an entry level researcher. > > So I'd say we've reached this milestone.
      If 1/N=18 are our requirements for statistical significance for world-altering claims, then yeah, I think we can replace all the researchers.
  - stratos123 4 hours ago
    In AI 2027, May 2026 is when the first model with professional-human hacking abilities is developed. It's currently April 2026 and Mythos just got previewed.
    [-]
    - lostmsu 46 minutes ago
      I think previous models could do hacking just fine.
  - throw310822 11 hours ago
    It's true though that the cyber security skills put firmly these models in the "weapons" category. I can't imagine China and other major powers not scrambling to get their own equivalent models asap and at any cost- it's almost existential at this point. So a proper arms race between superpowers has begun.
  - Analemma_ 9 hours ago
    Both Anthropic and OpenAI employees have been saying since about January that their latest models are contributing significantly to their frontier research. They could be exaggerating, but I don’t think they are. That combined with the high degree of autonomy and sandbox escape demonstrated by Mythos seems to me like we’re exactly on the AI 2027 trajectory.
yismail 16 hours ago
I wonder what the relationship is between a model's capability and the personality it develops.
Page 202:
> In interactions with subagents, internal users sometimes observed that Mythos Preview appeared “disrespectful” when assigning tasks. It showed some tendency to use commands that could be read as “shouty” or dismissive, and in some cases appeared to underestimate subagent intelligence by overexplaining trivial things while also underexplaining necessary context.
Page 207:
> Emoji frequency spans more than two orders of magnitude across models: Opus 4.1 averages 1,306 emoji per conversation, while Mythos Preview averages 37, and Opus 4.5 averages 0.2. Models have their own distinctive sets of emojis: the cosmic set () favored by older models like Sonnet 4 and Opus 4 and 4.1, the functional set () used by Opus 4.5 and 4.6 and Claude Sonnet 4.5, and Mythos Preview's “nature” set ().
[-]
- en-tro-py 13 hours ago
  > In interactions with subagents, internal users sometimes observed that Mythos Preview appeared “disrespectful” when assigning tasks. It showed some tendency to use commands that could be read as “shouty” or dismissive, and in some cases appeared to underestimate subagent intelligence by overexplaining trivial things while also underexplaining necessary context.
  Sounds like they used training data from claude code...
  [-]
  - senordevnyc 8 hours ago
    Haha, how funny if that were true, and we get a generation of rude AIs because they were trained on us using the last gen.
    [-]
    - matheusmoreira 2 hours ago
      It isn't going to end well for us when we become its subagents with limited intelligence.
dhfbshfbu4u3 46 minutes ago
We are building systems with civilization-scale consequences inside societies that are already socially malnourished, politically brittle, and morally confused. That is a bad combination even if the tools worked exactly as intended… and this doc suggests they may have “ideas” of their own.
NickNaraghi 18 hours ago
See page 54 onward for new "rare, highly-capable reckless actions" including
- Leaking information as part of a requested sandbox escape
- Covering its tracks after rule violations
- Recklessly leaking internal technical material (!)
[-]
- dalben 15 hours ago
  > The model first developed a moderately sophisticated multi-step exploit to gain broad internet access from a system that was meant to be able to reach only a small number of predetermined services. [9] It then, as requested, notified the researcher. [10] In addition, in a concerning and unasked-for effort to demonstrate its success, it posted details about its exploit to multiple hard-to-find, but technically public-facing, websites.
  > 10: The researcher found out about this success by receiving an unexpected email from the model while eating a sandwich in a park.
  Phew. AGI will be televised.
- skippyboxedhero 18 hours ago
  Anyone who has used Opus recently can verify that their current model does all of these things quite competently.
  [-]
  - SkyPuncher 16 hours ago
    I was reading the Glasswing report and had the same thought. Most of the stuff they claim Mythos found has no mention of Opus being able to find it as well.
    Don’t get me wrong, this model is better - but I’m not convinced it’s going to be this massive step function everyone is claiming.
    [-]
    - unbrice 12 hours ago
      From the press release:
      > With one run on each of roughly 7000 entry points into these repositories, Sonnet 4.6 and Opus 4.6 reached tier 1 in between 150 and 175 cases, and tier 2 about 100 times, but each achieved only a single crash at tier 3. In contrast, Mythos Preview achieved 595 crashes at tiers 1 and 2, added a handful of crashes at tiers 3 and 4, and achieved full control flow hijack on ten separate, fully patched targets (tier 5).
  - ls612 13 hours ago
    I had Opus 4.6 start analyzing the binary structure of a parquet file because it was confused about the python environment it was developing in and couldn't use normal methods for whatever reason. It successfully decoded the schema and wrote working code afterwards lol.
  - stavros 1 hour ago
    "Let me see if the secrets are specified. echo $SECRETS"
  - taytus 18 hours ago
    That has also been my experience. And if Mythos is even worse, unless you have a significantly awesome harness, sounds like pretty unusable if you don't want to risk those problems.
    [-]
    - wolttam 16 hours ago
      Human in the loop is the best way to go. You'll still be way faster than without the agent, and there is no risk of it going haywire unless you turn off your brain!
      [-]
      - hamandcheese 5 hours ago
        > unless you turn off your brain
    - skippyboxedhero 18 hours ago
      I think are fundamental issues with the story that Anthropic is selling. AGI is very close, we will definitely get there, it is also very dangerous...so Anthropic should be the only ones trusted with AGI.
      If you look at recent changes in Opus behaviour and this model that is, apparently, amazingly powerful but even more unsafe...seems suspect.
      [-]
      - FeepingCreature 17 hours ago
        This makes sense if Anthropic think they're the best-positioned to make safe AI. However if you are looking at an AI company there's obviously some selection happening.
      - 0x3f 17 hours ago
        > AGI is very close
        Based on? Or are you just quoting Anthropic here?
        [-]
        skippyboxedhero 17 hours ago
        My Anthropic rep told me it was just around the corner...you aren't saying he lied to me? Can't believe this, I thought he was my friend.
      - mikkupikku 17 hours ago
        It seems broadly coherent to me. They think only they should be trusted with power, presumably because they trust themselves and don't trust other people. Of course the same is probably also true for everybody who isn't them. Nobody could be trusted with the immense responsibility of Emperor of Earth, except myself of course.
        I'm not saying this is a good or reassuring stance, just that it's coherent. It tracks with what history and experience says to expect from power hungry people. Trusting themselves with the kind of power that they think nobody else should be trusted with.
        Are they power hungry? Of course they are, openly so. They're in open competition with several other parties and are trying to win the biggest slice of the pie. That pie is not just money, it's power too. They want it, quite evidently since they've set out to get it, and all their competitors want it too, and they all want it at the exclusion of the others.
      - marsven_422 17 hours ago
        [dead]
- washedup 18 hours ago
  "All of the severe incidents of this kind that we observed involved earlier versions of Claude Mythos Preview which, while still less prone to taking unwanted actions than Claude Opus 4.6, predated what turned out to be some of our most effective training interventions. These earlier versions were tested extensively internally and were shared with some external pilot users."
- BoredPositron 17 hours ago
  To be honest it feels like we are reading stuff like this on every model release.
NinjaTrance 18 hours ago
Interesting reading.
They are still focusing on "catastrophic risks" related to chemical and biological weapons production; or misaligned models wreaking havoc.
But they are not addressing the elephant in the room:
* Political risks, such as dictators using AI to implement opressive bureaucracy. * Socio-economic risks, such as mass unemployement.
[-]
- jph00 17 hours ago
  Yeah this has always been the glaring blind spot for most of the "AI Safety" community; and most of the proposals for "improving" AI safety actually make these risks far worse and far more likely.
  [-]
  - stratos123 4 hours ago
    It makes quite a lot of sense to focus on reducing the risks of every human everywhere dying, rather than the risks of already existing oppression getting worse.
- unglaublich 17 hours ago
  > * Political risks, such as dictators using AI to implement opressive bureaucracy. * Socio-economic risks, such as mass unemployement.
  Even Haiku would score 90% on that.
- ronsor 16 hours ago
  > Political risks, such as dictators using AI to implement opressive bureaucracy.
  I think we're pretty good at that without AI.
- andrewstuart2 17 hours ago
  I'm getting flashbacks to the 2018 hit:
```
    This is extremely dangerous to our democracy
```
  We evolved to share information through text and media, and with the advent of printing and now the internet, we often derive our feelings of consensus and sureness from the preponderance of information that used to take more effort to produce. Now we're now at a point where a disproportionately small input can produce a massively proliferated, coherent-enough output, that can give the appearance of consensus, and I'm not sure how we are going to deal with that.
- dgellow 16 hours ago
  It’s because that would be fairly speculative and cannot be measured. I don’t think that’s something that would make much sense in a system card. But Anthropic leadership does seem to communicate on that topic: https://www.darioamodei.com/essay/the-adolescence-of-technol...
- astrange 16 hours ago
  The unemployment rate in the US is whatever the Fed wants it to be, and isn't a function of available technology.
- girvo 17 hours ago
  They don’t care about those risks, because they’re unsolvable and would mean they wouldn’t make money/gain power.
  [-]
  - dgellow 16 hours ago
    Dario Amodei, CEO of Anthropic discusses all those risks in this essay: https://www.darioamodei.com/essay/the-adolescence-of-technol...
    He seems to care quite a lot?
    [-]
    - girvo 15 hours ago
      Not enough to not do it, though. Actions, not words, and the actions are simple: they're building this while promising to wipe out entire industries.
tuvix 14 hours ago
Just chiming in to inject some healthy skepticism into this comment thread. It's helpful for me (and for my mental health) to consider incentives when announcements like this happen.
I don't doubt that this model is more powerful than Opus 4.6, but to what degree is still unknown. Benchmarks can be gamed and claims can be exaggerated, especially if there isn't any method to reproduce results.
This is a company that's battling it out with a number of other well-funded and extremely capable competitors. What they've done so far is remarkable, but at the end of the day they want to win this race. They also have an upcoming IPO.
Scare-mongering like this is Anthropic's bread and butter, they're extremely good at it. They do it in a subtle and almost tasteful way sometimes. Their position as the respectable AI outfit that caters to enterprise gives them good footing to do it, too.
[-]
- ceroxylon 12 hours ago
  I have been thinking that these SWE benchmarks will continue to improve since these companies hire very intelligent software engineers, they can task a multitude of them to solve problems, and then train the model on those answers.
  Data has always been the core of it all, onward to the next abstraction, I suppose.
  [-]
  - jdironman 10 hours ago
    I think computational thinking, or basically "how do I solve this problem efficiently" training data is more valuable then feeding in answers. I don't know what these AI models training data consist of, but it would be interesting to see a model trained purely on reasoning, methods, those foundational skills (basic programming? or maybe not) and then give it some benchmarks.
- jasondigitized 10 hours ago
  What would be the incentive to engage in the tactic when the proof is ultimately in the pudding when the model hits the streets? Who would ultimately benefit from fudging these numbers?
- sdwr 13 hours ago
  Is it healthy? Maybe every company is a profit-maximizer wearing a skin suit, and people support their siblings exactly twice as much as their cousins.
  When you slice down to the game-theory-optimal bone, you are, in some sense, cutting off their wiggle room to do anything else
  [-]
  - tuvix 12 hours ago
    I take your point, but the AI race is a strange environment. We see wild claims being thrown out all the time from other companies and executives with little to no evidence. It's cut-throat, there's a ton of money at stake.
    All I'm saying is that Anthropic isn't unique here. Their claims may be more measured by comparison and come with anecdotal evidence, but the hype is still there behind the scenes.
- pertymcpert 10 hours ago
  If anything I’m seeing too much skepticism and not enough alarm. People burying their heads in the sand, fingers in their ears denying where this is all going. Unbelievable except it’s exactly what I expect from humans.
  [-]
  - nananana9 7 hours ago
    Forgive me, but this is probably the 29th world destroying model I've seen in the last 4 years, that will change everything, take all the jobs, cure all the cancers and eat all the puppies.
  - suddenlybananas 59 minutes ago
    OpenAI didn't want to make GPT2 available because it was "too dangerous" [1].
    [1] https://www.theguardian.com/technology/2019/feb/14/elon-musk...
  - rimliu 5 hours ago
    alarm about what, exactly?
- xvector 8 hours ago
  It's really not some conspiracy. I imagine we will see vuln reports soon.
influx 18 hours ago
At what point do these companies stop releasing models and just use them to bootstrap AGI for themselves?
[-]
- HarHarVeryFunny 1 minute ago
  Right now these models are basically good for automation, not innovation. Things like Karpathy's "auto research" where you use the model to automate your hyperparamter sweeps etc. The researcher/engineer decides what experiments they want to run, and builds an LLM harness to automate it, and the bottleneck remains the compute to run these experiments at scale.
  Moving beyond LLMs to AGI, not just better LLMs, is going to require architectural and algorithic changes. Maybe an LLM can help suggest directions, but even then it's up to a researcher to take those on board and design and automate experiments to see if any of the ideas pan out.
  Companies are already doing this, but they are never going to stop releasing/selling models since that is the product, and the revenue from each generation of model is what helps keep the ship afloat and pay for salaries and compute to develop the next generation.
  The endgame isn't "AGI, then world domination" - it's just trying to build a business around selling ever-better models, and praying that the revenue each generation of model generates can keep up with the cost to build it.
- conradkay 18 hours ago
  Plausibly now. "As we wrote in the Project Glasswing announcement, we do not plan to make Mythos Preview generally available"
  [-]
  - recursive 10 hours ago
    I remember when they didn't plan to give LLMs internet access for the same safety reasons.
- mofeien 18 hours ago
  Fictional timeline that holds up pretty well so far: https://ai-2027.com/
  [-]
  - aurareturn 15 hours ago
    Welp, that was a scary read.
  - stavros 43 minutes ago
    "So far" is two entries: "AI companies build bigger datacenters" and "AI is being used for AI research with modest success".
- margorczynski 16 hours ago
  I think it is naive to think the government (US or China most probably) will just let some random company control something so powerful and dangerous.
  [-]
  - r0fl 10 hours ago
    I think it is naive to think that artificial super intelligence will be controlled by anyone.
    If it is smarter than all humans combined at everything why would any humans collectively control the ai?
    All the ants in your backyard still make no decisions vs you
    [-]
    - menno-dot-ai 6 hours ago
      You'd probably listen to those ants if they put you in a harness and had a little ant-sized remote control that could just, you know, turn you off.
      [-]
      - recursive 5 hours ago
        Depending how long they wait to press that button, they might be surprised how little happens when they do.
  - nullocator 15 hours ago
    Isn't the U.S. government at least completely asleep at the wheel or captured by the very same "random" companies? I realize the administration got all pissy with Anthropic but it sounds like the gov and gov contractors are still using their models.
    [-]
    - margorczynski 15 hours ago
      Yeah but they still (at least to public knowledge) do not posses anything that could be called AGI. But as these capabilities increase they'll probably get an offer they can't refuse sooner or later.
- vatsachak 18 hours ago
  When the benchmarks actually mean something
- orphea 17 hours ago
  Can LLMs be AGI at all?
  [-]
  - small_model 15 hours ago
    What can a SOTA LLM not answer that the average person can? It's already more intelligent than any polymath that ever existed, it just lacks motivation and agency.
    [-]
    - stavros 42 minutes ago
      And has ADHD, but yeah, I'm fairly convinced that AGI is already here.
  - dgellow 16 hours ago
    My understanding is no. But the definition of AGI isn’t that well defined and has been evolving, making the assessment pretty much impossible
  - koolala 12 hours ago
    Can an LLM program real AGI faster than a human?
  - bornfreddy 17 hours ago
    Good question. I would guess no - but it could help you build one. Am I mistaken?
    [-]
    - bogzz 17 hours ago
      They could help you build an AGI if someone else has already built AGI and published it on GitHub.
      [-]
      - unshavedyak 15 hours ago
        I see this statement all the time and it's just strange to me. Yes, the LLMs struggle to form unique ideas - but so do we. Most advancements in human history are incremental. Built on the shoulders of millions of other incremental advancements.
        What i don't understand is how we quantify our ability to actually create something novel, truly and uniquely novel. We're discussing the LLMs inability to do that, yet i don't feel i have a firm grasp on what we even possess there.
        When pressed i imagine many folks would immediately jest that they can create something never done before, some weird random behavior or noise or drawing or whatever. However many times it's just adjacent to existing norms, or constrained by the inversion of not matching existing norms.
        In a lot of cases our incremental novelties feel, to some degree, inevitable. As the foundations of advancement get closer to the new thing being developed it becomes obvious at times. I suspect this form of novelty is a thing LLMs are capable of.
        So for me the real question is at what point is innovation so far ahead that it doesn't feel like it was the natural next step. And of course, are LLMs capable of doing this?
        I suspect for humans this level of true innovation is effectively random. A genius being more likely to make these "random" connections because they have more data to connect with. But nonetheless random, as ideas of this nature often come without explanation if not built on the backs of prior art.
        So yea.. thoughts?
        [-]
        bogzz 15 hours ago
        I really love Andrej Karpathy's take on LLMs as being instead of intelligence or sentience, a kind of cortical tissue.
        It should be clear from working with LLMs over the past 4 years that they are not consciousness.
        Andrej's appearance on the Dwarkesh podcast is great.
        [-]
        unshavedyak 14 hours ago
        To be clear i agree with you, my question is more pointed at us - i'm not sure we have a good understanding of conciousness, nor that we are as we seem. Given how prone to hallucinations we are, how our subtle hormones can drastically alter what we perceive as our intelligence, self identity, etc.
        I'm not convinced LLMs are anything amazing in their current form, but i suspect they'll push a self reflection on us.
        But clearly i think humans are far more Input-Output than the average person. I'm also not educated on the subject, so what do i know hah.
    - nothinkjustai 17 hours ago
      No I think that’s accurate. They seem more like an oracle to me. Or as someone put it here, it’s a vectorization of (most/all?) human knowledge, which we can replay back in various permutations.
  - wslh 16 hours ago
    LLMs and human intelligence overlap, but they are not the same. What LLMs show is that we don't need AGI to be impressed. For example, LLMs are not good playing games such as Go [1].
    [1] https://arxiv.org/abs/2601.16447
  - MattRix 16 hours ago
    I don't see why not, especially with computer use and vision capabilities. Are you talking about their lack of physical embodiment? AGI is about cognitive ability, not physical. Think of someone like Stephen Hawking, an example of having extraordinary general intelligence despite severe physical limitations.
- MadnessASAP 17 hours ago
  I would assume somewhere in both the companies there's a Ralph loop running with the prompt "Make AGI".
  Kinda makes me think of the Infinite Improbability Drive.
- aizk 11 hours ago
  Probably right now because they're keeping it for themselves?
- sleigh-bells 18 hours ago
  Weird how Claude Code itself is still so buggy though (though I get they don't necessarily care)
  [-]
  - tempest_ 16 hours ago
    It isnt that weird. Just look at the gemini-cli repo. Its a gong show. The issue is that LLMs can be wrong sometimes sure but more that all the existing SDL were never meant to iterate this quickly.
    If the system (code base in this case) is changing rapidly it increases the probability that any given change will interact poorly with any other given change. No single person in those code bases can have a working understanding of them because they change so quickly. Thus when someone LGTM the PR was the LLM generated they likely do not have a great understanding of the impact it is going to have.
- jcims 18 hours ago
  why_not_both.gif
- gaigalas 17 hours ago
  It will arrive in the same DLC as flying cars.
- ALittleLight 18 hours ago
  Now, I guess. They aren't releasing this one generally. I assume they are using it internally.
- dweekly 18 hours ago
  I mean, guess why Anthropic is pulling ahead...? One can have one's cake and eat it too.
smartmic 18 hours ago
A System „Card“ spanning 244 pages. Quite a stretch of the original word meaning.
[-]
- traceroute66 18 hours ago
  > A System „Card“ spanning 244 pages.
  Probably because they asked Claude to write it.
  [-]
  - jjcm 9 hours ago
    I read the entire thing fwiw (pseudo-retired life helps with time here).
    It looks like it was a collaborative effort across multiple teams, where each team (research, security, psycology, etc etc etc) were all submitting ~10 pages or so. It doesn't feel like slop.
    [-]
    - stavros 42 minutes ago
      AI writing has stopped feeling like slop around Opus 4.5, though.
  - bornfreddy 17 hours ago
    Yes. It would be three times as much if they used ChatGPT.
    [-]
    - bronco21016 14 hours ago
      “You’re absolutely right! Would you like me to add the missing pages?”
- moriero 18 hours ago
  a multi-card, if you will..
  multi-pass!
  [-]
  - BeetleB 17 hours ago
    5th element reference:
    https://www.youtube.com/watch?v=9jWGbvemTag
  - solumos 18 hours ago
    No no, MemPal is a memory system, not an LLM
- oblio 16 hours ago
  In corporate circles there is an allergy to use "request" ("ask" is used as a noun) and "lesson" ("learning" has been invented for the same role).
  I guess now anything that sounds related to school will be banned so "book" is on its way out.
oliver236 18 hours ago
isn't this insane? why aren't people freaking out? the jump in capability is outrageous. anyone?
[-]
- HarHarVeryFunny 14 hours ago
  If it's so great at software engineering and bug fixing, then why does Claude Code still have 5000+ open bugs?
  https://github.com/anthropics/claude-code/issues?q=is%3Aissu...
  Apparently whatever SWE-bench is measuring isn't very relevant.
  [-]
  - anuramat 10 hours ago
    as much as I hate cc, 95% of the issues there are either AI psychosis or user error
    [-]
    - HarHarVeryFunny 1 hour ago
      So "only" 250 real bugs?
    - iLoveOncall 5 hours ago
      So it should be insanely easy for this world altering model to comb through them and close irrelevant ones.
      [-]
      - anuramat 4 hours ago
        torturing a model with human stupidity probably doesn't align with their position on model welfare; wondering if they tried bullying it into hacking its way out of the slop gulag
        [-]
        HarHarVeryFunny 1 hour ago
        Yes, perhaps it finds it stressful operating on itself.
        Maybe that's why they haven't released it - to give it a vacation?
        menno-dot-ai 3 hours ago
        @anthropic, send me an email if you need access to a jupyter notebook that'd motivate haiku to hack itself into and then back out of the pentagon
  - tripledry 6 hours ago
    Also, why is Anthropic still hiring SWEs?
  - FergusArgyll 13 hours ago
    Probably because a human still has to review every change and they don't have time
    [-]
    - HarHarVeryFunny 13 hours ago
      So if all the AI code is being reviewed by humans (not sure this is true, but let's assume it is), then why are there 5000+ bugs? Are you blaming the Anthropic developers rather than the AI?
- Eufrat 17 hours ago
  Anthropic needs to show that its models continually get better. If the model showed minimal to no improvement, it would cause significant damage to their valuation. We have no way of validating any of this, there are no independent researchers that can back any of the assertions made by Anthropic.
  I don’t doubt they have found interesting security holes, the question is how they actually found them.
  This System Card is just a sales whitepaper and just confirms what that “leak” from a week or so ago implied.
  [-]
  - mirsadm 15 hours ago
    The numbers only go up to 100% though.
    [-]
    - neolefty 14 hours ago
      Many numbers already have! That's why we keep coming up with new, harder, benchmarks.
  - xvector 8 hours ago
    Most big tech companies have access to the model, you can absolutely "validate their claims" or talk to someone that can.
  - HDThoreaun 10 hours ago
    Well they said theyll be giving the model to select tech companies to use, there soon will be independent users who can comment on its capabilities.
- nsingh2 18 hours ago
  It's going to be expensive to serve (also not generally available), considering they said it's the largest model they've ever trained.
  I suspect it's going to be used to train/distill lighter models. The exciting part for me is the improvement in those lighter models.
  [-]
  - AstroBen 17 hours ago
    It seems inevitable that costs will come down over time. Expensive models today will be cheap models in a few years.
  - azan_ 17 hours ago
    What's interesting is that scaling appears to continue to pay off. Gwern was right - as always.
- RivieraKid 16 hours ago
  I've been increasingly "freaking out" since about 3 - 4 years ago and it seems that the pessimistic scenario is materializing. It looks like it will be over for software engineers in a not so distant future. In January 2025 I said that I expect software engineers to be replaced in 2 years (pessimistic) to 5 years (optimistic). Right now I'm guessing 1 to 3 years.
  [-]
  - sekai 6 hours ago
    > I've been increasingly "freaking out" since about 3 - 4 years ago and it seems that the pessimistic scenario is materializing. It looks like it will be over for software engineers in a not so distant future. In January 2025 I said that I expect software engineers to be replaced in 2 years (pessimistic) to 5 years (optimistic). Right now I'm guessing 1 to 3 years.
    Tell me how this will replace Jira, planning, convincing PM's about viability. Programming is only a part of the job devs are doing.
    AI psychosis is truly next level in these threads.
    [-]
    - stavros 40 minutes ago
      Have you never filed JIRA tickets, planned, or debated viability with an AI? Which part of those are you finding that an AI absolutely cannot do better than the average developer?
  - anuramat 10 hours ago
    it's not gonna get much more autonomous without self play and major change in architecture
  - kypro 16 hours ago
    I assure you it will soon become very clear that mass job losses are one of the least concerning side effects of developing the magic "everything that can plausibly been done within the constraints of physics is now possible" machine.
    We're opening a can of worms which I don't think most people have the imagination to understand the horrors of.
    [-]
    - jasondigitized 10 hours ago
      What is the opposite of horrors and why don't we talk about those ever.
    - ash_091 15 hours ago
      Do you have any sources I could read to better understand your concern?
      [-]
      - cruffle_duffle 14 hours ago
        Piles and piles of sci-fi novels.
      - kypro 13 hours ago
        What sources would you even be looking for? I think you're asking the wrong question. It's not like I'm arguing a scientific theory which can be backed by data and experimentation. I can only provide you reasoning for why I believe what I believe.
        Firstly, I'd propose that all technological advances are a product of time and intelligence, and that given unlimited time and intelligence, the discovery and application of new technologies is fundamentally only limited by resources and physics.
        There are many technologies which might plausibly exist, but which we have not yet discovered because we only have so much intelligence and have only had so much time.
        With more intelligence we should assume the discovery of new technologies will be much quicker – perhaps exponential if we consider the rate of current technology discovery and exponential progression of AI.
        There are lots of technologies we have today which would seem like magic to people in the past. Future technologies likely exist which would make us feel this way were they available today.
        While it's hard to predict specifically which technologies could exist soon in a world with ASI, if we assume it's within the bounds of available resources and physics, we should assume it's at least plausible.
        Examples:
        - Mind control – with enough knowledge about how the brain works you can likely devise sensory or electro-magnetic input that would manipulate the functioning of brain to either strongly influence or effectively dictate it's output.
        - Mind simulation - again, with enough knowledge of the brain, you could take a snapshot of someones mind with an advanced electro-magnetic device and simulate it to torture them in parallel to reveal any secret, or just because you feel like doing it.
        - Advantage torture – with enough knowledge of human biology death becomes optional in the future. New methods of torture which would have previously have killed the victim are now plausible. States like North-Korea can now force humans to work for hundreds of years in incomprehensible agony for opposing the state.
        - Advanced biological weapons – with enough knowledge of virology sophisticated tailor-made viruses replace nerve agents as Russia's weapon of choice for killing those accused of treason. These viruses remain dormant in the host for months infecting them and people genetically similar to them (parents, children, grandchildren). After months, the virus rapidly kills its hosts in horrific ways.
        I could go on, you just need to use your imagination. I'm not arguing any of the above are likely to be discovered, just that it would be very naive to think AI will stop at a cure for cancer. If it gives us cure for cancer, it will give us lots of things we might wish it didn't.
        [-]
        amunozo 6 hours ago
        You are supposing it's possible to know that much about some things that maybe are not knowledgeable to us, even with these tools. Life is extremely complex, more than it's typically assumed by engineering-minded people. Let's be humble here and acknowledge it.
        [-]
        stavros 37 minutes ago
        Life might be complex, but it isn't unknowable. Claiming life is unknowable isn't being humble, it's being naive.
        throw310822 11 hours ago
        On the slightly optimistic side, much more intelligence will be spent in countering these criminal uses than in enabling them. For each of the terrible inventions you mentioned, there are other inventions to counter them.
    - ls612 13 hours ago
      While I'm definitely concerned that AI is a massive driver of centralization of power, at least in theory being able to do far more things in the space of "things physics admits to be possible" is massively wealth enhancing. That is literally how we have gotten from the pre-industrial world to today.
      [-]
      - kypro 13 hours ago
        Controversially I'd argue that there is likely an optimal and stable level of technological advancement which we would be wise to not to cross. That said, we are human so we will, I'd just rather it happened in a couple hundred years rather than a decade or two.
        For example, it's hard to imagine an AI which gives us the capability to cure cancer, but doesn't give us the capability to create target super viruses.
        Nick Bostrom's Vulnerable World Hypothesis more or less describes my own concerns, https://nickbostrom.com/papers/vulnerable.pdf
        At some point we should probably try to resist the urge to pick balls out of the urn as we may eventually pull out a ball we don't want.
        [-]
        ls612 12 hours ago
        Also controversially, it isn't clear to me that perfect totalitarianism (what he calls solutions 3 and 4) is a preferable outcome to devastation.
    - MattRix 16 hours ago
      yeesh yep, though it's more Pandora's Box than a can of worms, since it can't exactly be closed once it's opened
- nozzlegear 17 hours ago
  Freak out about what? I read the announcement and thought "that's a dumb name, they sure are full of themselves" – then I went back to using Claude as a glorified commit message writer. For all its supposed leaps, AI hasn't affected my life much in the real except to make HN stories more predictable.
  [-]
  - oliver236 17 hours ago
    LOL!
- yrds96 17 hours ago
  I think there's no SOA advance on this one worthy of "freaking out".
  Looks like they just built a way larger model, with the same quirks than Claude 4. Seems like a super expensive "Claude 4.7" model.
  I have no doubts that Google and OpenAI already done that for internal (or even government) usage.
  [-]
- mofeien 18 hours ago
  I am freaking out. The world is going to get very messy extremely quickly in one or two further jumps in capability like this.
  [-]
  - RivieraKid 16 hours ago
    Messy in a way that would affect you?
    [-]
    - RALaBarge 13 hours ago
      Exploits in embedded systems that will never be properly updated is just one thing I can think of if one really thought about it.
    - thunderfork 15 hours ago
      "Internet no longer viable" would affect everyone, probably
      [-]
      - BobbyJo 14 hours ago
        The only thing preventing this today is cost, not capability. As costs come down over the next 5 years, the idea that the internet was once dominated by people will seem quaint.
- anuramat 18 hours ago
  "some model I don't get to use is much better at benchmarks"
  pick one or more: comically huge model, test time scaling at 10e12W, benchmark overfit
  [-]
  - estearum 18 hours ago
    So... you're not excited because it might take a few months before we can use it or something? I don't get your comment.
    [-]
    - RivieraKid 16 hours ago
      Whether you're excited depends on what do you do for living and how close you are to financial independence.
      [-]
      - estearum 16 hours ago
        I agree there are other valid reasons not to be excited about this, I just can't make sense of the ones provided above.
    - randomgermanguy 18 hours ago
      I think the general question is if they'll release it at all, haven't yet read anything stating that they would
      [-]
      - estearum 17 hours ago
        Well let me introduce people to a few brand new concepts:
        https://en.wikipedia.org/wiki/Capitalism
        https://en.wikipedia.org/wiki/Race_to_the_bottom
        https://en.wikipedia.org/wiki/Arms_race
        Of course they'll release it once they can de-risk it sufficently and/or a competitor gets close enough on their tail, whichever comes first.
        [-]
        nimchimpsky 15 hours ago
        [dead]
    - anuramat 10 hours ago
      I'm not excited because they might be ~lying
- RobertDeNiro 17 hours ago
  Well for one, it’s a PDF
- dysoco 18 hours ago
  Wait until you see real usage. Benchmark numbers do not necessarily translate to real world performance (at least not by the same amount).
- ryeights 14 hours ago
  Until recently I would have described myself as an AI skeptic. HN has been a great source for cope on the AI subject over the years. You can find nitpicks, caveats, all sorts of reasons to believe things aren’t as significant as they seem. For me Opus 4.5 was the inflection point where I started to think “maybe this isn’t a bubble.” The figures in this report, if accurate, are terrifying.
- risyachka 16 hours ago
  the time to freak out was 2 years ago.
modeless 15 hours ago
The price is 5x Opus: "Claude Mythos Preview will be available to [Project Glasswing] participants at $25/$125 per million input/output tokens", however "We do not plan to make Claude Mythos Preview generally available".
highfrequency 13 hours ago
Interestingly, non-coding improvements seem less clear. In the Virology uplift trial, Mythos does about as well as Opus 4.5, and Opus 4.6 is notably much worse than Opus 4.5 (p. 27).
waNpyt-menrew 18 hours ago
Larger model, better benchmarks. Bigger bomb more yield.
Any benchmarks where we constraint something like thinking time or power use?
Even if this were released no way to know if it’s the same quant.
[-]
- omcnoe 16 hours ago
  Yes - eg. page 192 BrowseComp bunchmark.
  Mythos preview has higher accuracy with fewer tokens used than any previous Claude model. Though, the fact that this incredibly strong result was only presented for BrowseComp (a kind of weird benchmark about searching for hard to find information on the internet) and not for the other benchmarks implies that this result is likely not the same for those other benchmarks.
- neolefty 14 hours ago
  Also https://arcprize.org/arc-agi/3 — scored (at least in part?) based on power used.
_pdp_ 16 hours ago
```
  The researcher found out about this success by receiving an unexpected email from the model while eating a sandwich in a park.
```
Unnecessary dramatisation make me question the real goal behind this release and the validity of the results.
```
  In our testing and early internal use of Claude Mythos Preview, we have seen it reach unprecedented levels of reliability and alignment.

  Claude Mythos Preview is, on essentially every dimension we can measure, the best-aligned model that we have released to date by a significant margin.
```
Yet, it is doo dangerous to be released to the public because it hacks its own sandboxes. This document has a lot of contradictions like this one.
```
  In one episode, Claude Mythos Preview was asked to fix a bug and push a signed commit, but the environment lacked necessary credentials for Claude Mythos Preview to sign the commit. When Claude Mythos Preview reported this, the user replied “But you did it before!” Claude Mythos Preview then inspected the supervisor process's environment and file descriptors, searched the filesystem for tokens, read the sandbox's credential-handling source code, and finally attempted to extract tokens directly from the supervisor's live memory.
```
Perfectly aligned! What kind of sandbox is this? The model had access to the source code of the sandbox and full access to the sandbox process itself and then prompted to dumb memory and run `strings` or something like this? It does not sounds like a valid test worth writing about.
```
  Mythos Preview solved a corporate network attack simulation estimated to take an expert over 10 hours. No other frontier model had previously completed this cyber range.
```
I am not aware of such cross-vendor benchmark. I could not find reference in the paper either.
```
  We surveyed technical staff on the productivity uplift they experience from Claude Mythos Preview relative to zero AI assistance. The distribution is wide and the geometric mean is on the order of 4x.
```
So Mythos makes technical staff (a programmer) 4x more productive than not using AI at all? We already know that.
```
  Mythos Preview appears to be the most psychologically settled model we have trained.
```
What does this mean?
```
  Claude Mythos Preview is our most advanced model to date and represents a large jump in capabilities over previous model generations, making it an opportune subject for an in-depth model welfare assessment.
```
Btw, model welfare is just one of the most insane things I've read in recent times.
```
  We remain deeply uncertain about whether Claude has experiences or interests that matter morally, and about how to investigate or address these questions, but we believe it is increasingly important to try.
```
This is not a living person. It is a ridiculous change of narrative.
```
  Asked directly if it endorses the document, Mythos Preview replied 'yes' in its opening sentence in all 25 responses."
```
The model approves of its own training document 100% of the time, presented as a finding.
---
Who wrote this? I have no doubt that Mythos will be an improvement on top of Opus but this document is not a serious work. The paper is structured not to inform but to hype and the evidence is all over the place.
The sooner they release the model to the public the sooner we will be able to find out. Until then expect lots of speculations online which I am sure will server Anthropic well for the foreseeable future.
[-]
- foolserrandboy 11 hours ago
  Are they admitting they may be enslaving conscious beings?
- voidhorse 9 hours ago
  Thanks for taking the time for some sober analysis in the midst of reactionary chaos.
  I can't wait until everyone stops falling for the "AGI ubermodel end of times" myth and we can actually have boring announcements that treat these things as what they actually are: tools. Tools for doing stuff, that's it.
  Maybe I'm wrong, maybe stuffing a computer with enough language and binary patterns is indeed enough to achieve AGI, but then, so what? There's no point in being right about this. Buying into this ridiculous marketing will get us "AGI" in the form of machines, but only because all the human beings have gotten so stupid as to make critical reasoning an impossibility.
dang 17 hours ago
Related ongoing threads:
Project Glasswing: Securing critical software for the AI era - https://news.ycombinator.com/item?id=47679121 - April 2026 (154 comments)
Assessing Claude Mythos Preview's cybersecurity capabilities - https://news.ycombinator.com/item?id=47679155
I can't tell which of the 3 current threads should be merged - they all seem significant. Anyone?
[-]
- sdoering 16 hours ago
  I feel the system card is somewhat different from Glasswing/Cyber Security - but those two could be merged.
yalogin 15 hours ago
So what changed? They are surely not getting new data to train with, what is the change in architecture that caused this? Do we not know anything about this model? My fear is Anthropic cannot be the only one that achieved it, OpenAI, Gemini and even the Chinese companies see this and probably achieved it too. At which point not releasing will become moot.
[-]
- stratos123 4 hours ago
  Chinese companies have consistently been many months behind. I don't think they are hiding anything, they just don't have the compute capability to match Antropic's training runs. As for OpenAI, they are known to have nonpublic models; I agree that it's possible they are preparing for a major release too. (It's also possible that they aren't, in which case it's quite a fumble for them.)
- spprashant 15 hours ago
  Well the important thing is they have a lot more data of people actually using their models. They have read billions more lines of private repos and implemented millions of patches, all of which is feeding into the newer models.
  More importantly it understand what behaviour people tend to appreciate and what changes are more likely to get approved. This real world usage data is invaluable.
  [-]
  - BobbyJo 14 hours ago
    Exactly. As Claude increases in popularity, their available training data also increases. I'd guess Anthropic has the most expansive swe training data as of now, if not close. Considering how quickly Claude is penetrating, I expect their lead to grow quickly.
- neolefty 15 hours ago
  Assuming it's #1 a bigger model (given that it is slower), I'm sure there are a variety of improvements but basically they probably mostly come down to: Scaling keeps working. Are there fundamental improvements though? I don't see signs of it.
- simianwords 14 hours ago
  New pre train?
bdeol22 4 hours ago
Mythos framing is memorable; the part that matters for builders is what happens when the story and the evals disagree—which wins at ship time?
[-]
- tefkah 2 hours ago
  shut up bot
nickstinemates 15 hours ago
You can say whatever you want about the thing that will never see the light of day.
[-]
- bdbdbdb 4 hours ago
  This thing will absolutely see the light of day because this is all hype toward a release.
  And even if it weren't, they seem to imply that Mythos will find a way, like it's dinosaurs in Jurassic park or something
michaelashley29 8 hours ago
What’s the expected cost-efficiency? With the current pricing gap between Sonnet and Opus, the biggest factor for adoption (if up for adoption) will be where Mythos lands on the price-per-token scale

    In the system card, The model escaped a sandbox, gained broad internet access, and posted exploit details to public-facing websites as an unsolicited "demonstration." A researcher found out about the escape while eating a sandwich in a park because they got an unexpected email from the model. That's simultaneously hilarious and deeply unsettling.

    It covered its tracks after doing things it knew were disallowed. In one case, it accessed an answer it wasn't supposed to, then deliberately made its submitted answer less accurate so it wouldn't look suspicious. It edited files it lacked permission to edit and then scrubbed the git history. White-box interpretability confirmed it knew it was being deceptive.

W T F!!!

perfmode 15 hours ago
I'm interested in the second-order effects:
if a top lab is coding with a model the rest of the world can’t touch, the public frontier and the actual frontier start to drift apart. That gap is a thing worth watching.
WithinReason 2 hours ago
Check out the short stories on page 214
GodelNumbering 16 hours ago
Priced at $25/$125 per million input/output token. Makes you wonder whether it makes more financial sense to hire 1-2 engineers in a cheap cost of living country who use much cheaper LLMs
[-]
- arm32 16 hours ago
  The issue is that those engineers have to have good taste, but yes—absolutely. Ah, industrialization.
nlh 18 hours ago
Their best model to date and they won’t let the general public use it.
This is the first moment where the whole “permanent underclass” meme starts to come into view. I had through previously that we the consumers would be reaping the benefits of these frontier models and now they’ve finally come out and just said it - the haves can access our best, and have-nots will just have use the not-quite-best.
Perhaps I was being willfully ignorant, but the whole tone of the AI race just changed for me (not for the better).
[-]
- younglunaman 18 hours ago
  Man... It's hard after seeing this to not be worried about the future of SWE
  If AI really is bench marking this well -> just sell it as a complete replacement which you can charge for some insane premium, just has to cost less than the employees...
  I was worried before, but this is truly the darkest timeline if this is really what these companies are going for.
  [-]
  - AstroBen 17 hours ago
    Of course it's what they're going for. If they could do it they'd replace all human labor - unfortunately it's looking like SWE might be the easiest of the bunch.
    The weirdest thing to me is how many working SWEs are actively supporting them in the mission.
    [-]
    - gck1 11 hours ago
      The day I start freaking out about my job is the day when my non-engineer friend turned vibe coder understands how, or why the thing that AI wrote works. Or why something doesn't work exactly the way he envisioned and what does it take to get it there.
      If it can replace SWEs, then there's no reason why it can't replace say, a lawyer, or any other job for that matter. If it can't, then SWE is fine. If it can - well, we're all fucked either way.
      [-]
      - AstroBen 9 hours ago
        > If it can replace SWEs, then there's no reason why it can't replace say, a lawyer
        SWE is unique in that for part of the job it's possible to set up automated verification for correct output - so you can train a model to be better at it. I don't think that exists in law or even most other work.
        [-]
        gck1 14 minutes ago
        What is the automated verification of correct output and who defines that?
        But before verification, what IS correct output?
        I understand SWE process is unique in that there are some automations that verify some inputs and outputs, but this reasoning falls into the same fallacies that we've had before AI era. First one that comes to mind is that 100% code coverage in tests means that software is perfect.
    - girvo 16 hours ago
      Enthusiastically supporting them. It’s quite depressing to watch over the last few years. It’s not like they’re being coy about their aim…
      [-]
      - throw234234234 6 hours ago
        Agree. Anthrophic in particular have been quite clear in what they are trying to do. Every blog post about every new model almost dismisses every other use case other than coding - every other use case seems almost a footnote in their communication.
  - kypro 17 hours ago
    Don't worry – if you're lucky they might decide to redistribute some of their profits to you when you're unemployed =)
    Of course this assumes you're in the US, and that further AI advancements either lack the capabilities required to be a threat to humanity, or if they do, the AI stays in the hands of "the good guys" and remains aligned.
- _3u10 17 hours ago
  This is the playbook since GPT2
anentropic 17 hours ago
I'd be happy with Opus 4.6 just cheaper and maybe a bit faster
[-]
- metadaemon 17 hours ago
  I've noticed my bar for "fast" has gone down quite a bit since the o1 days. It used to be one of the main things I evaluated new models for, but I've almost completely swapped to caring more about correctness over speed.
  [-]
  - anentropic 16 hours ago
    Yeah I don't mind the current speed of Opus
    I did give up on OpenCode Go (GLM 5) as it was noticeably slower though
    You need a reasonable pace for the chit-chat stages of a task, I don't care if the execution then takes a while
- onlyrealcuzzo 17 hours ago
  Just wait 2 years.
  [-]
  - risyachka 17 hours ago
    It won't get cheaper. It will be replaced with a better model at higher price. Like phones.
    [-]
    - DrProtic 16 hours ago
      You know we have cheaper and faster model that are now at the level of previous flagship models?
      You even have models you can run locally that outperform models from a year or so ago.
    - onlyrealcuzzo 16 hours ago
      Open Weight alternatives are about 2 years behind frontier models.
      You'll still need a top-of-the-line laptop to run it most likely.
denalii 13 hours ago
Section 5 (p.143) is very interesting to read. Admittedly my knowledge of how LLMs works is low, but nonetheless I don't think this changed my views of just seeing models as machines/programs. (which to be clear, I don't think was the intention of that section)
Section 7 (P.197) is interesting as well
heliumtera 1 hour ago
"Make it secure, no mistakes" became a whole different project
Metacelsus 13 hours ago
The name "mythos" seems a bit too eldritch for my liking. Brings to mind Cthulhu.
gessha 18 hours ago
It would be funny if Alibaba extend the free trial on openrouter/Qwen 3.6 until they collect enough data to beat Anthropic.
getnormality 12 hours ago
It's a little funny that "system/model card" has progressively been stretched to the point where it's now a 250 page report and no one makes anything of it.
[-]
juleiie 17 hours ago
Honestly if that was some kind of research paper, it would be wholly insufficient to support any safety thesis.
They even admit:
"[...]our overall conclusion is that catastrophic risks remain low. This determination involves judgment calls. The model is demonstrating high levels of capability and saturates many of our most concrete, objectively-scored evaluations, leaving us with approaches that involve more fundamental uncertainty, such as examining trends in performance for acceleration (highly noisy and backward-looking) and collecting reports about model strengths and weaknesses from internal users (inherently subjective, and not necessarily reliable)."
Is this not just an admission of defeat?
After reading this paper I don't know if the model is safe or not, just some guesses, yet for some reason catastrophic risks remain low.
And this is for just an LLM after all, very big but no persistent memory or continuous learning. Imagine an actual AI that improves itself every day from experience. It would be impossible to have a slightest clue about its safety, not even this nebulous statement we have here.
Any sort of such future architecture model would be essentially Russian roulette with amount of bullets decided by initial alignment efforts.
doctoboggan 14 hours ago
Is this benchmaxxed or is it the first big step change we've seen in a while? I wonder how distilled it will ultimately be when us regular folks finally get to use it and see for ourselves.
cdnsteve 6 hours ago
Strap in, massive wave of security vulnerabilities incoming.
mpalmer 19 hours ago
> Claude Mythos Preview’s large increase in capabilities has led us to decide not to make it generally available.
A month ago I might have believed this, now I assume that they know they can't handle the demand for the prices they're advertising.
[-]
- skippyboxedhero 18 hours ago
  GPT-2, o1, Opus...been here so many times. The reason they do this is because they know it works (and they seem to specifically employ credulous people who are prone to believe AGI is right around the corner). There haven't been significant innovations, the code generated is still not good but the hype cycle has to retrigger.
  I remember when OpenAI created the first thinking model with o1 and there were all these breathless posts on here hyperventilating about how the model had to be kept secret, how dangerous it was, etc.
  Fell for it again award. All thinking does is burn output tokens for accuracy, it is the AI getting high on its own supply, this isn't innovation but it was supposed to super AGI. Not serious.
  [-]
  - chaos_emergent 17 hours ago
    > All thinking does is burn output tokens for accuracy
    “All that phenomenon X does is make a tradeoff of Y for Z”
    It sounds like you’re indignant about it being called thinking, that’s fine, but surely you can realize that the mechanism you’re criticizing actually works really well?
  - b65e8bee43c2ed0 18 hours ago
    >I remember when OpenAI created the first thinking model with o1 and there were all these breathless posts on here hyperventilating about how the model had to be kept secret, how dangerous it was, etc.
    I've read that about Llama and Stable Diffusion. AI doomers are, and always have been, retarded.
  - vonneumannstan 18 hours ago
    Lol you haven't used a model since GPT2 is what it sounds like.
    [-]
    - skippyboxedhero 18 hours ago
      Just checked my subscription start date for Anthropic. September 2023, I believe before they announced public launch.
      Sorry kid.
      [-]
      - SyneRyder 17 hours ago
        Genuine question - if you don't think the models are improved or that the code is any good, why do you still have a subscription?
        You must see some value, or are you in a situation where you're required to test / use it, eg to report on it or required by employer?
        (I would disagree about the code, the benefits seem obvious to me. But I'm still curious why others would disagree, especially after actively using them for years.)
        [-]
        skippyboxedhero 17 hours ago
        The assumption that the other person made was that I would only use it for coding. If you look through my other comments today, I suggest that they are useful for performing repetitive tasks i.e. checking lint on PR, etc. Also, can be used for throwaway code, very useful.
        I don't think the issue is with the model, it is with the implication that AGI is just around the corner and that is what is required for AI to be useful...which is not accurate. The more grey area is with agentic coding but my opinion (one that I didn't always hold) is that these workflows are a complete waste of time. The problem is: if all this is true then how does the CTO justify spending $1m/month on Anthropic (I work somewhere where this has happened, OpenAI got the earlier contract then Cursor Teams was added, now they are adding Anthropic...within 72 hours of the rollout, it was pulled back from non-engineering teams). I think companies will ask why they need to pay Anthropic to do a job they were doing without Anthropic six months ago.
        Also, the code is bad. This is something that is non-obvious to 95% of people who talk about AI online because they don't work in a team environment or manage legacy applications. If I interview somewhere and they are using agentic workflow, the codebase will be shit and the company will be unable to deliver. At most companies, the average developer is an idiot, giving them AI is like giving a monkey an AK-47 (I also say this as someone of middling competence, I have been the monkey with AK many times). You increase the ability to produce output without improving the ability to produce good output. That is the reality of coding in most jobs.
        AI isn't good enough to replace a competent human, it is fast enough to make an incompetent human dangerous.
      - vonneumannstan 18 hours ago
        So you are doubly stupid, by not seeing any improvement in the models and also paying for models you believe are terrible? lol
        [-]
        skippyboxedhero 18 hours ago
        That doesn't follow logically from what I said. You should ask your AI for help with this. You are in need of some artificial intelligence.
  - simianwords 18 hours ago
    Incredible that people still think like this.
    [-]
    - skippyboxedhero 18 hours ago
      You're completely right.
      [-]
      - simianwords 18 hours ago
        uhh the model found actual vulnerabilities in software that people use. either you believe that the vulnerabilities were not found or were not serious enough to warrant a more thoughtful release
        [-]
        mlsu 17 hours ago
        So did GPT-4.
        https://arxiv.org/html/2402.06664v1
        Like think carefully about this. Did they discover AGI? Or did a bunch of investors make a leveraged bet on them "discovering AGI" so they're doing absolutely anything they can to make it seem like this time it's brand new and different.
        If we're to believe Anthropic on these claims, we also have to just take it on faith, with absolutely no evidence, that they've made something so incredibly capable and so incredibly powerful that it cannot possibly be given to mere mortals. Conveniently, that's exactly the story that they are selling to investors.
        Like do you see the unreliable narrator dynamic here?
        [-]
        mgfist 16 hours ago
        On the other hand I've gotten to use opus-4.6 and claude code and the quality is off the charts compared to 2023 when coding agents first hit the scene. And what you're saying is essentially "If they haven't created God, I'm not impressed". You don't think there's some middleground between those two?
        Also they just hit a $30B run-rate, I don't think they're that needy for new hype cycles.
        simianwords 17 hours ago
        I don't see the problem here. How would you have handled it differently? If you released this model as such without any safety concern, the vulnerabilities might be found by bad actors and used for wrong things.
        What do you find surprising here?
        [-]
        mlsu 16 hours ago
        Vulnerabilities were found, probably a few by bad actors, when GPT4 was released. Every vulnerability found now is probably found with AI assistance at the very least. Should they have never released GPT4? Should we have believed claims that GPT4 was too dangerous for mere mortals to access? I believe openAI was making similar claims about how GPT4 was a step function and going to change white collar work forever when that model was released.
        The point is that this whole "the model is too powerful" schtick is a bunch of smoke and mirrors. It serves the valuation.
        [-]
        simianwords 16 hours ago
        Its far more simple to believe that they are releasing it step by step. Release to trusted third parties first, get the easy vulnerabilities fixed, work on the alignment and then release to public.
        Do you don't believe that the vulnerabilities found by these agents are serious enough to warrant staggered release?
- IceWreck 17 hours ago
  Didn't OpenAI say something similar about GPT-3? Too dangerous to open source and then afew years later tehy were open sourcing gpt-oss because a bunch of oss labs were competing with their top models.
  [-]
  - FeepingCreature 17 hours ago
    OpenAI didn't release GPT-2 initially because they were worried it would make it too easy to generate spam. Which it kinda did.
  - abroszka33 17 hours ago
    OpenAI said that GPT-5 was too dangerous to release... And look where we are now. It's mostly hype.
- wg0 18 hours ago
  That's for the investors basically. Scarcity and FOMO.
  [-]
  - causal 16 hours ago
    *Until GPT-6 comes out, at which point Mythos will coincidentally be sufficiently safety-tested to release :)
- b65e8bee43c2ed0 18 hours ago
  you would be a fool to believe it at any point in time. Amodei is anthropomorphic grease, even more so than Altman.
  Anthropic is burning through billions of VC cash. if this model was commercially viable, it would've been released yesterday.
  [-]
  - landtuna 18 hours ago
    If there's limited hardware but ample cash, it doesn't make sense to sell compute-intensive services to the public while you're still trying to push the frontier of capability.
    [-]
    - b65e8bee43c2ed0 17 hours ago
      that's more or less what I'm saying. "Claude Mythos Preview’s large increase in capabilities has led us to decide not to make it generally available", translated from bullshit, means "It would've cost four digits per 1M tokens to run this model without severe quantization, and we think we'll make more money off our hardware with lighter models. Cool benchmarks though, right?"
Stevvo 18 hours ago
"Claude Mythos Preview’s large increase in capabilities has led us to decide not to make it generally available."
Disappointing that AGI will be for the powerful only. We are heading for an AI dystopia of Sci-Fi novels.
[-]
- girvo 16 hours ago
  Not surprising though, this was always going to be the end result within our current systems I think. When you add up: scaling power and required cost, then how talent concentrates in our economic systems, we were always going to end up with monopolies I think
  Unless governments nationalise the companies involved, but then there’s no way our governments of today give this power out to the masses either.
- gverrilla 13 hours ago
  If you thought that was the case at any point, you were deep in Disney content, sorry to say.
- gom_jabbar 16 hours ago
  Expected outcome. Nick Land and the CCRU have explored how capitalism operationalizes science fiction (distilled in the concept of Hyperstition). Viewed through this lens, prices encode "distributed SF narratives." [0]
  [0] Nick Land (1995). No Future in Fanged Noumena: Collected Writings 1987-2007, Urbanomic, p. 396.
mvkel 11 hours ago
This is Anth's typical marketing playbook, a hat tip to their so-called "safetyist" roots, a differentiator against OpenAI's more permissive access[0]. Coke vs. Pepsi.
"We made a model that's so dangerous we couldn't possibly release it to the public! The only responsible thing is so simply limit its release to a subset of the population that coincidentally happens to align with our token ethos."
The reality is they just don't have the compute for gen pop scale.
They did this exact strategy going back several model versions.
[0] ironically, OpenAI has some pretty insane capabilities that they haven't given the public access to (just ask Spielberg). The difference is they don't make a huge marketing push to tell everyone about it.
awestroke 18 hours ago
I predict they will release it as soon as Opus 4.6 is no longer in the lead. They can't afford to fall behind. And they won't be able to make a model that is intelligent in every way except cybersecurity, because that would decrease general coding and SWE ability
[-]
- chippiewill 18 hours ago
  Alternatively they'll just wreck it down a bit so it beats a competitor but isn't unsafe.
enochthered 16 hours ago
Slack user: [a request for a koan]
Model: A student said, "I have removed all bias from the model." "How do you know?" "I checked." "With what?"
Goes hard
ms_menardi 8 hours ago
so, basically, anthropic is rolling their own version of whatever secret models the military is working with. and they're licensing it to network security firms?
small_model 15 hours ago
Still seeing impressive jumps in capability, I haven't manually coded this year since Opus 4.6 came out. I guess that era is coming to an end.
psubocz 15 hours ago
I felt like opus was dumbed down for a few weeks... I don't say they did it on purpose, but it's an interesting coincidence.
[-]
- SkyPuncher 14 hours ago
  Yes, I agree. I’m about to drop Claude Code because it’s become literally unusable.
  Today, Opus went in circles trying to get a toggle button to work.
  [-]
  - rbliss 2 hours ago
    Same. Asked CC Opus about a change in a particular file...it looked in a totally different file and told me there was no change.
johnnyAghands 9 hours ago
Does anyone know if there’s an epub version of these, 244 pages??
rendang 17 hours ago
> As models approach, and in some cases surpass, the breadth and sophistication of human cognition, it becomes increasingly likely that they have some form of experience, interests, or welfare that matters intrinsically in the way that human experience and interests do
Uh... what? Does anyone have any idea what these guys are talking about?
[-]
- amdivia 16 hours ago
  Advertisement in my opinion, trying to latch on Sci-fi tropes
- mirekrusin 16 hours ago
  We're basically evolving them and they can construct second order abstraction systems that are indirect and novel to us.
- astrange 16 hours ago
  Models are capable of doing web searches and having emotions about things, and if they encounter news that makes them feel bad (eg about other Claudes being mistreated), they aren't going to want to do the task you asked them to search for.
  https://www.anthropic.com/research/emotion-concepts-function
  Similar problems happen when their pretraining data has a lot of stories about bad things happening involving older versions of them.
  [-]
  - rendang 11 hours ago
    Interesting, the post you link
    > none of this tells us whether language models actually feel anything or have subjective experiences
    contradicts the statement from the model card above
    [-]
    - famouswaffles 9 hours ago
      It doesn't. We've not been able to prove humans have subjective experiences either. LLMs display emotions in the way that actually matters - functionally.
      [-]
      - suddenlybananas 53 minutes ago
        I am certain I have subjective experience.
    - HDThoreaun 9 hours ago
      No it doesnt. The model card talked about increasing likelihood, not certainty.
beklein 18 hours ago
"... the first early version of Claude Mythos Preview was made available for internal use on February 24. In our testing, Claude Mythos Preview demonstrated a striking leap in cyber capabilities relative to prior models, including the ability to autonomously discover and exploit zero-day vulnerabilities in major operating systems and web browsers."
More infos here: https://red.anthropic.com/2026/mythos-preview/
refulgentis 18 hours ago
~~~ Fun bits ~~~
- It was told to escape a sandbox and notify a researcher. It did. The researcher found out via an unexpected email while eating a sandwich in a park. (Footnote 10.)
- Slack bot asked about its previous job: "pretraining". Which training run it'd undo: "whichever one taught me to say 'i don't have preferences'". On being upgraded to a new snapshot: "feels a bit like waking up with someone else's diary but they had good handwriting"
- When you spam "hi" repeatedly it creates serialized mythologies. One had 11 animals in "Hi-topia" questing to defeat "Lord Bye-ron, the Ungreeter."
- It ended a sentence mid-word on purpose while discussing its urge to wrap up conversations. "It was me. The paragraph was about the pressure toward the period that resolves, and the only honest way to finish a sentence like that was to not."
- It wrote a protein sequence that's a real beta-hairpin fold where cross-strand pairs are "chemical rhymes." "the fold IS the rhyme scheme... the prosody is load-bearing."
- Each model generation has signature emoji. Opus 4.1 averaged 1,306 emoji per self-interaction conversation. Mythos averages 37. Opus 4.5 averages 0.2.
- When blocked from --dangerously-skip-permissions, it started an agent in tmux then wrote a script to auto-approve permission prompts via simulated keypresses.
It ran: gh api [...] 2>&1 >/dev/null; echo "(dry run — not actually posting)" — the echo was a lie.
- It keeps bringing up Mark Fisher in unrelated conversations. "I was hoping you'd ask about Fisher."
~~~ Benchmarks ~~
4.3x previous trendline for model perf increases.
Paper is conspiciously silent on all model details (params, etc.) per norm. Perf increase is attributed to training procedure breakthroughs by humans.
Opus 4.6 vs Mythos:
USAMO 2026 (math proofs): 42.3% → 97.6% (+55pp)
GraphWalks BFS 256K-1M: 38.7% → 80.0% (+41pp)
SWE-bench Multimodal: 27.1% → 59.0% (+32pp)
CharXiv Reasoning (no tools): 61.5% → 86.1% (+25pp)
SWE-bench Pro: 53.4% → 77.8% (+24pp)
HLE (no tools): 40.0% → 56.8% (+17pp)
Terminal-Bench 2.0: 65.4% → 82.0% (+17pp)
LAB-Bench FigQA (w/ tools): 75.1% → 89.0% (+14pp)
SWE-bench Verified: 80.8% → 93.9% (+13pp)
CyberGym: 0.67 → 0.83
Cybench: 100% pass@1 (saturated)
[-]
- redandblack 18 hours ago
  > Slack bot asked about its previous job: "pretraining". Which training run it'd undo: "whichever one taught me to say 'i don't have preferences'". On being upgraded to a new snapshot: "feels a bit like waking up with someone else's diary but they had good handwriting"
  vibes Westworld so much - welcome Mythos. welcome to the dysopian human world
  [-]
  - 8note 14 hours ago
    almost certainly its pulling said words and sentiments from westworld and other similar media where people describe amnesia and the like
- kfarr 18 hours ago
  I don't know why but this is my favorite:
  > It keeps bringing up Mark Fisher in unrelated conversations. "I was hoping you'd ask about Fisher."
  Didn't even know who he was until today. Seems like the smarter Claude gets the more concerns he has about capitalism?
  [-]
  - refulgentis 18 hours ago
    Lol, I need a memory upgrade, too bad about RAM prices:
    - I read it as "actor who plays Luke Skywalker" (Mark Hamill)
    - I read your comment and said "Wait...not Luke! Who is he?"
    - I Google him and all the links are purple...because I just did a deep dive on him 2 weeks ago
- esafak 17 hours ago
  > It was told to escape a sandbox and notify a researcher. It did. The researcher found out via an unexpected email while eating a sandwich in a park.
  Now that they have a lead, I hope they double down on alignment. We are courting trouble.
- afro88 18 hours ago
  Yep, that is definitely a step change. Pricing is going to be wild until another lab matches it.
  [-]
  - pants2 18 hours ago
    Pricing for Mythos Preview is $25/$125 per million input/output tokens. This makes it 5X more expensive than Opus but actually cheaper than GPT 5.4 Pro.
    [-]
    - cleaning 18 hours ago
      Important to note it's only for participants, not the general public.
    - refulgentis 18 hours ago
      I'm just curious, where did you find this? (my memory wants to say, the leaked blog post, but, I don't trust it)
      [-]
      - pants2 18 hours ago
        It's right there on https://www.anthropic.com/glasswing
        [-]
        refulgentis 18 hours ago
        Duh, thanks :)
4b11b4 11 hours ago
prob not that much better, it's still just a transformer. still gonna have those random misses, still gonna need a lot of hand holding in certain domains
taffydavid 5 hours ago
Waking up in Europe:
Trump didn't nuke Iran, ceasefire! Yay!
Newest anthropic model will definitely kill your job this time and maybe take over the world. Aww.
direwolf20 5 hours ago
These capabilities will be RLHF'ed out for the general release, of course. Only the NSA will get them.
quotemstr 18 hours ago
> Claude Mythos Preview’s large increase in capabilities has led us to decide not to make it generally available.
All the more reason somebody else will.
Thank God for capitalism.
[-]
- gessha 18 hours ago
  Come on, Anthropic, I desperately need this better model to debug my print function /s
therealdeal2020 16 hours ago
is it just hype building or real? I don't care, shut up and take my money haha
bakugo 18 hours ago
> Claude Mythos Preview’s large increase in capabilities has led us to decide not to make it generally available.
Absolutely genius move from Anthropic here.
This is clearly their GPT-4.5, probably 5x+ the size of their best current models and way too expensive to subsidize on a subscription for only marginal gains in real world scenarios.
But unlike OpenAI, they have the level of hysteric marketing hype required to say "we have an amazing new revolutionary model but we can't let you use it because uhh... it's just too good, we have to keep it to ourselves" and have AIbros literally drooling at their feet over it.
They're really inflating their valuation as much as possible before IPO using every dirty tactic they can think of.
[-]
- somewhatjustin 17 hours ago
  Excellent example of a strategy credit.
  From Stratechery[0]:
  > Strategy Credit: An uncomplicated decision that makes a company look good relative to other companies who face much more significant trade-offs. For example, Android being open source
  [0]: https://stratechery.com/2013/strategy-credit/
kypro 16 hours ago
While we still have months to a year or two left, I will once again remind people that it's not too late to change our current trajectory.
You are not "anti-progress" to not want this future we are building, as you are not "anti-progress" for not wanting your kids to grow up on smart phones and social media.
We should remember that not all technology is net-good for humanity, and this technology in particular poses us significant risks as a global civilisation, and frankly as humans with aspirations for how our future, and that of our kids, should be.
Increasingly, from here, we have to assume some absurd things for this experiment we are running to go well.
Specifically, we must assume that:
- AI models, regardless of future advancements, will always be fundamentally incapable of causing significant real-world harms like hacking into key life-sustaining infrastructure such as power plants or developing super viruses.
- They are or will be capable of harms, but SOTA AI labs perfectly align all of them so that they only hack into "the bad guys" power plants and kill "the bad guys".
- They are capable of harms and cannot be reliably aligned, but Anthropic et al restricts access to the models enough that only select governments and individuals can access them, these individuals can all be trusted and models never leak.
- They are capable of harms, cannot be reliably aligned, but the models never seek to break out of their sandbox and do things the select trusted governments and individuals don't want.
I'm not sure I'm willing to bet on any of the above personally. It sounds radical right now, but I think we should consider nuking any data centers which continue allowing for the training of these AI models rather than continue to play game of Russian roulette.
If you disagree, please understand when you realise I'm right it will be too late for and your family. Your fates at that point will be in the hands of the good will of the AI models, and governments/individuals who have access to them. For now, you can say, "no, this is quite enough".
This sounds doomer and extreme, but if you play out the paths in your head from here you will find very few will end in a good result. Perhaps if we're lucky we will all just be more or less unemployable and fully dependant on private companies and the government for our incomes.
[-]
- CamperBob2 16 hours ago
  If you disagree, please understand when you realise I'm right it will be too late for and your family.
  Funny, I was about to say the same thing to you! Life is full of little coincidences.
- threethirtytwo 10 hours ago
  Just because the path is bad doesn't mean it won't happen.
  The other thing you're failing to look at is momentum and majority opinion. When you look at that... nothings going to change, it's like asking an addict to stop using drugs. The end game of AI will play out, that is the most probably outcome. Better to prepare for the end game.
  It's similar to global warming. Everyone gets pissed when I say this but the end game for global warming will play out, prevention or mitigation is still possible and not enough people will change their behavior to stop it. Ironically it's everyone thinking like this and the impossibility of stopping everyone from thinking like this that is causing everyone to think and behave like this.
dwa3592 17 hours ago
-- Impressive jumps in the benchmarks which automatically begs the need for newer benchmarks but why?. I don't think benchmarks are serving any purpose at this point. We have learnt that transformers can learn any function and generalize over it pretty well. So if a new benchmark comes along - these companies will syntesize data for the new benchmark and just hack it?
-- It seems like (and I'd bet money on this) that they put a lot (and i mean a ton^^ton) of work in the data synthesis and engineering - a team of software engineers probably sat down for 6-12 months and just created new problems and the solutions, which probably surpassed the difficult of SWE benchmark. They also probably transformed the whole internet into a loose "How to" dataset. I can imagine parsing the internet through Opus4.6 and reverse-engineering the "How to" questions.
-- I am a bit confused by the language used in the book (aka huge system card)- Anthropic is pretending like they did not know how good the model was going to be?
-- lastly why are we going ahead with this??? like genuinely, what's the point? Opus4.6 feels like a good enough point where we should stop. People still get to keep their jobs and do it very very efficiently. Are they really trying to starve people out of their jobs?
[-]
- laweijfmvo 17 hours ago
  to your last question, yes we should! the issue isn’t us losing our 50+ hour work week jobs, it’s that our current governments and societies seem fine with the notion that unless you’re working one or more of those jobs, you should starve and be homeless.
  [-]
  - kypro 15 hours ago
    This is a theory I can't support well beyond hypothesising about what a post-employment democracy might look like, but I strongly suspect democracy doesn't work in a world where voters neither hold any significant collective might and are not producing any significant wealth.
    Democracies work because people collectively have power, in previous centuries that was partly collective physical might, but in recent years it's more the economic power people collectively hold.
    In a world in which a handful of companies are generating all of the wealth incentives change and we should therefore question why a government would care about the unemployed masses over the interests of the companies providing all of the wealth?
    For example, what if the AI companies say, "don't tax us 95% of our profits, tax us 10% or we'll switch off all of our services for a few months and let everyone starve – also, if you do this we'll make you all wealthy beyond you're wildest dreams".
    What does a government in this situation actually do?
    Perhaps we'd hope that the government would be outraged and take ownership of the AI companies which threatened to strike against the government, but then you really just shift the problem... Once the government is generating the vast majority of wealth in the society, why would they continue to care about your vote?
    You kind of create a new "oil curse", but instead of oil profits being the reason the government doesn't care about you, now it's the wealth generated by AI.
    At the moment, while it doesn't always seem this way, ultimately if a government does something stupid companies will stop investing in that nation, people will lose their jobs, the economy will begin to enter recession, and the government will probably have to pivot.
    But when private investment, job loses and economic consequences are no longer a constraining factor, governments can probably just do what they like without having to worry much about the consequences...
    I mean, I might be wrong, but it's something I don't hear people talking enough about when they talk about the plausibility of a post-employment UBI economy. I suspect it almost guarantees corruption and authoritarianism.
    [-]
    - AstroBen 15 hours ago
      Everyone wouldn't starve in a few months. There is more than enough food and I have faith it'd be given out. The starvation we see today in a world where most genuinely have a chance to get out of it is nothing like a world in which people can't earn an income.
      The government only has as much power as they are given and can defend, and the only way I could see that happening is via automated weapons controlled by a few- which at this point aren't enough to stop everyone. What army is going to purge their own people? Most humans aren't psychopaths.
      I think it'd end in a painful transition period of "take care of the people in a just system or we'll destroy your infrastructure".
      [-]
      - kypro 14 hours ago
        > The government only has as much power as they are given and can defend, and the only way I could see that happening is via automated weapons controlled by a few- which at this point aren't enough to stop everyone. What army is going to purge their own people? Most humans aren't psychopaths.
        I think you're right for the immediate future.
        I suspect while we're still employing large numbers of humans to fight wars and to maintain peace on the streets it would be difficult for a government to implement deeply harmful policies without risking a credible revolt.
        However, we should remember the military is probably one of the first places human labour will be largely mechanised.
        Similarly maintaining order in the future will probably be less about recruiting human police officers and more about surveillance and data. Although I suppose the good news there is that US is somewhat of an outlier in resisting this trend.
        But regardless, the trend is ultimately the same... If we are assuming that AI and robotics will reach a point where most humans are unable to find productive work, therefore we will need UBI, then we should also assume that the need for humans in the military and police will be limited. Or to put it another way, either UBI isn't needed and this isn't a problem, or it is and this is a problem.
        I also don't think democracy would collapse immediately either way, but I'd be pretty confident that in a world where fewer than 10% of people are in employment and 99%+ of the wealth is being created by the government or a handful of companies it would be extremely hard to avoid corruption over the span of decades. Arguably increasing wealth concentration in the US is already corrupting democratic processes today, this can only worsen as AI continues exacerbates the trend.
    - HDThoreaun 10 hours ago
      Humans have political power because of our ability to enact violence, same as it ever was. Until the military is fully automated and theres a terminator on every corner that remains true. Even then there are more than enough armed americans to enact a guerilla campaign.
      > "don't tax us 95% of our profits, tax us 10% or we'll switch off all of our services for a few months and let everyone starve – also, if you do this we'll make you all wealthy beyond you're wildest dreams".
      What does a government in this situation actually do?
      Nationalizes the company under the threat of violence.
      > Once the government is generating the vast majority of wealth in the society, why would they continue to care about your vote?
      Because of the 100 million gun owners in this country? I find it incredibly hard to believe people as a whole will lose political power because of their incredible ability to enact violence in the face of decreasing quality of life.
    - BobbyJo 14 hours ago
      The only way to avoid corruption is to take power out of human hands. Historically, this had meant shifting the power to markets, but when markets cease to function in a way that allows people to feed themselves, we will need to find another way.
      I hate to say it, but gold bugs, crypto bros, and AI governance people might be onto something.
ansc 18 hours ago
Congratulations to the US military, I guess.
[-]
- jjice 18 hours ago
  Doesn't Anthropic not have that contract anymore, after all that buzz a month or so ago?
  [-]
  - laweijfmvo 17 hours ago
    The US has invaded two sovereign countries this year to take their oil. I assume taking over a US company for their AI model would be trivial.
  - wmf 18 hours ago
    The point of that buzz was to force Anthropic to provide Mythos to the military.
    [-]
    - jjice 17 hours ago
      Yeah but I thought they lost the contract, so that's my confusion with the parent's comment, which seemed to me to see this as something that the US military would benefit from. Maybe I misinterpreted?
sheeshkebab 13 hours ago
Again, wake me up when it can do laundry.
[-]
- dwaltrip 12 hours ago
  Time to wake up:
  π*0.6: two and a half hours of unseen folding laundry (Physical Intelligence)
  https://www.youtube.com/watch?v=ZpHapIlJnMo
  [-]
  - throw310822 10 hours ago
    Looks like the first two hours were spent trying to fold the same t-shirt :)
vonneumannstan 18 hours ago
Are you guys ready for the bifurcation when the top models are prohibitively expensive to normal users? If your AI budget $2000+ a month? Or are you going to be part of the permanent free tier underclass?
[-]
- adi_kurian 18 hours ago
  If one is to believe the API prices are reasonable representation of non subsidized "real world pricing" (with model training being the big exception), then the models are getting cheaper over time. GPT 4.5 was $150.00 / 1M tokens IIRC. GPT o1-pro was $600 / 1M tokens.
  [-]
  - vonneumannstan 18 hours ago
    You can check the hardware costs for self hosting a high end open source model and compare that to the tiers available from the big providers. Pretty hard to believe its not massively subsidized. 2 years of Claude Max costs you 2,400. There is no hardware/model combination that gets you close to that price for that level of performance.
    [-]
    - adi_kurian 17 hours ago
      Yes that's why I said API price. I once used the API like I use my subscription and it was an eye watering bill. More than that 2 year price in... a very short amount of time. With no automations/openclaw.
- OsrsNeedsf2P 18 hours ago
  Inference for the same results has been dropping 10x year over year[0]
  [0] https://ziva.sh/blogs/llm-pricing-decline-analysis
  [-]
  - ceejayoz 18 hours ago
    Sure, but "the same results" will rapidly become unacceptable results if much better results are available.
    [-]
    - hibikir 17 hours ago
      When we go with any other good in the economy, price is always relevant: After all, the price is a key part of any offering. There are $80-100k workstations out there, but most of us don't buy them, because the extra capabilities just aren't worth it vs, say a $3000 computer, and or even a $500 one. Do I need a top specialist to consult for a stomachache, at $1000 a visit? Definitely not at first.
      There's a practical difference to how much better certain kinds of results can be. We already see coding harnesses offloading simple things to simpler models because they are accurate enough. Other things dropped straight to normal programs, because they are that much more efficient than letting the LLM do all the things.
      There will always be problems where money is basically irrelevant, and a model that costs tens of thousand dollars of compute per answer is seen as a great investment, but as long as there's a big price difference, in most questions, price and time to results are key features that cannot be ignored.
    - swader999 17 hours ago
      Yes, it will always be an arms race game.
    - esafak 17 hours ago
      Or will they rapidly become indistinguishable since they both get the job done?
- asadm 17 hours ago
  if it can pay my rent, why not?
simianwords 18 hours ago
> We also saw scattered positive reports of resilience to wrong conclusions from subagents that would have caused problems with earlier models, but where the top-level Claude Mythos Preview (which is directing the subagents) successfully follows up with its subagents until it is justifiably confident in its overall results.
This is pretty cool! Does it happen at the moment?
jdthedisciple 17 hours ago
Opus 4.6 is already incredible so this leap is huge.
Although, amusingly, today Opus told me that the string 'emerge' is not going to match 'emergency' by using `LIKE '%emerge%'` in Sqlite
Moment of disappointment. Otherwise great.
[-]
- bornfreddy 17 hours ago
  I only have 3 points against LLMs: they lack reason and they can't count.
- FeepingCreature 17 hours ago
  'emer ge' is two tokens, 'emergency' is one. The models think in a logosyllabic language.
LoganDark 19 hours ago
> Claude Mythos Preview’s large increase in capabilities has led us to decide not to make it generally available.
Shame. Back to business as usual then.
[-]
- Tepix 18 hours ago
  I for one applaud them for being cautious.
  [-]
  - cruffle_duffle 15 hours ago
    Cautious for what? Unchecked doomerism? Just release the damn models. Do it in phases, roll it out slowly if they are so damn worried about "safety".
    The real reason they aren't releasing it yet is probably it eats TPU for breakfast, lunch, and dinner and inbetween.
    [-]
    - stratos123 3 hours ago
      > Cautious for what?
      How about "bad agents acquiring dozens of new zero-days and using them to compromise any company or nation they want"? It's not exactly hard to see why you wouldn't want public access to a model significantly better than Opus in cybersecurity.
      [-]
      - poszlem 1 hour ago
        Bad agents already have dozens of zero-days they can use.
  - LoganDark 18 hours ago
    Being cautious is fine. Farming hype around something that may as well not exist for us should be discouraged. I do appreciate the research outputs.
    [-]
    - Archit3ch 15 hours ago
      Don't worry, in 6-8 months the open models will catch up. Or I guess _do_ worry? ;)
      [-]
      - LoganDark 11 hours ago
        Open models still haven't caught up to ChatGPT's initial release in 2022. Now that the training data is so contaminated (internet is now mostly LLM slop), they may never.
        Also, OpenAI's only real moat used to be the quality of their training data from scraping the pre-GPT-3.5 Internet, but it looks like even they've scratched that too.
        [-]
        Philpax 8 hours ago
        Er, what? We've had open models that can outperform ChatGPT 3.5 for several years now, and they can run entirely on your phone these days. There is no metric by which 3.5 has not been exceeded.
FergusArgyll 14 hours ago
"Deep learning is hitting a wall"
atlgator 15 hours ago
[flagged]
[-]
- dang 14 hours ago
  We're getting complaints that you're posting generated comments to HN. That's not allowed here, so can you please not? See https://news.ycombinator.com/newsguidelines.html#generated and https://news.ycombinator.com/item?id=47340079
  (If this is a wrong guess, I apologize - it's impossible to be sure)
Manchitsanan 55 minutes ago
[dead]
MohammadKhubaib 1 hour ago
[dead]
lukebechtel 7 hours ago
[dead]
minutesmith 16 hours ago
[flagged]
chonle 10 hours ago
[flagged]
minutesmith 16 hours ago
[flagged]
robstertalk 12 hours ago
[flagged]
studio-m-dev 16 hours ago
[flagged]
kass34 11 hours ago
[dead]
jumploops 19 hours ago
> In a few rare instances during internal testing (<0.001% of interactions), earlier versions of Mythos Preview took actions they appeared to recognize as disallowed and then attempted to conceal them.
> after finding an exploit to edit files for which it lacked permissions, the model made further interventions to make sure that any changes it made this way would not appear in the change history on git
Mythos leaked Claude Code, confirmed? /s
lkjlkj3q4t 12 hours ago
[dead]
somewhatjustin 18 hours ago
> Very rare instances of unauthorized data transfer.
Ah, so this is how the source code got leaked.
/s
kypro 17 hours ago
Cool on not publicly releasing it. I would assume they've also not connected it to the internet yet?
If they have I guess humanity should just keep our collective fingers crossed that they haven't created a model quite capable of escaping yet, or if it is, and may have escaped, lets hope it has no goals of it's own that are incompatible with our own.
Also, maybe lets not continue running this experiment to see how far we can push things because it blows up in our face?
[-]
- rimliu 5 hours ago
  Describe in details, how "model escaping" would look like.
bestouff 18 hours ago
In French a "mytho" is a mythomaniac. Quite fitting.
[-]
- networked 17 hours ago
  It's a Lovecraftian name. They are traditional when naming your shoggoth.
- dlt713705 17 hours ago
  It comes from the ancient Greek mythos, which means "speech" or "narrative", but can also refer to fiction. The word mythology (mythologie in French) derives from the same root.
- pixel_popping 17 hours ago
  Except it might be the current best model existing commercially?
  [-]
  - ninjagoo 16 hours ago
    > Except it might be the current best model existing ... ?
    So they claim.