10 comments

  • lebovic 9 hours ago
    The post is light on details, and I agree with the sentiment that it reads like marketing. That said, Opus 4.6 is actually a legitimate step up in capability for security research, and the red team at Anthropic – who wrote this post – are sincere in their efforts to demonstrate frontier risks.

    Opus 4.6 is a very eager model that doesn't give up easily. Yesterday, Opus 4.6 took the initiative to aggressively fuzz a public API of a frontier lab I was investigating, and it found a real vulnerability after 100+ uninterrupted tool calls. That would have required lots of prodding with previous models.

    If you want to experience this directly, I'd recommend recording network traffic while using a web app, and then pointing Claude Code at the results (in Chrome, this is Dev Tools > Network > Export HAR). It makes for hours of fun, but it's also a bit scary.
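
    A minimal sketch of a first pass over such a capture, assuming the standard HAR layout (`log.entries[].request` with `method` and `url` fields). The function names `summarize_har` and `load_har` are hypothetical, and this is only a way to see which endpoints were recorded, not a description of how Claude Code reads the file:

```python
import json
from collections import Counter
from urllib.parse import urlsplit


def summarize_har(har: dict) -> list[tuple[str, str, str, int]]:
    """Return (method, host, path, count) rows, most frequent first.

    Assumes the usual HAR structure: har["log"]["entries"][i]["request"].
    """
    counts: Counter = Counter()
    for entry in har.get("log", {}).get("entries", []):
        request = entry.get("request", {})
        parts = urlsplit(request.get("url", ""))
        # Drop the query string so repeated calls to one endpoint group together.
        counts[(request.get("method", "?"), parts.netloc, parts.path)] += 1
    return [(m, h, p, n) for (m, h, p), n in counts.most_common()]


def load_har(path: str) -> dict:
    """Load an exported HAR file (hypothetical helper; HAR is plain JSON)."""
    # utf-8-sig tolerates a leading byte-order mark if the exporter wrote one.
    with open(path, encoding="utf-8-sig") as f:
        return json.load(f)
```

    Calling `summarize_har(load_har("capture.har"))` on an export (the filename is a placeholder) gives a short endpoint inventory to skim before handing the full capture to the model.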

  • samfundev 20 hours ago
    Glad to see that they brought in humans to validate and patch vulnerabilities. I really wish they had linked to the actual patches, though. Here's what I could find:

    https://cgit.ghostscript.com/cgi-bin/cgit.cgi/ghostpdl.git/c...

    https://github.com/OpenSC/OpenSC/pull/3554

    https://github.com/dloebl/cgif/pull/84

    • shoo 11 hours ago
      Yeah, having a layer of human experts to sanity check and weed out hallucinated false positive issues seems like an important part of this process:

      > To ensure that Claude hadn’t hallucinated bugs (i.e., invented problems that don’t exist, a problem that increasingly is placing an undue burden on open source developers), we validated every bug extensively before reporting it. [...] for our initial round of findings, our own security researchers validated each vulnerability and wrote patches by hand. As the volume of findings grew, we brought in external (human) security researchers to help with validation and patch development.

      Based on the experiences shared by curl's maintainers over the last couple of years, resulting in them ending their bug bounty program [1] [2] [3], I'd suggest the "growing risk of LLM-discovered [security issues]" is primarily maintainers being buried under a deluge of low-effort zero-value LLM-hallucinated false positive security issue reports, where the reporter copy-pastes LLM output without validation.

      [1] https://daniel.haxx.se/blog/2026/02/03/open-source-security-...

      [2] https://daniel.haxx.se/blog/2026/01/26/the-end-of-the-curl-b...

      [3] https://daniel.haxx.se/blog/2025/07/14/death-by-a-thousand-s...

      • sublinear 5 hours ago
        Ending a bug bounty program seems like a mistake.

        Why not just change the incentives? Don't pay for patches. Move the money over to human review of the infinite cesspool with an emphasis on how the findings are presented. Maintainers rank and filter by how concise the reviews are and how critical the bugs are. Stop allowing wide open pull requests for bugs and make that its own new workflow.

        Bugs rarely happen in isolation and many are regressions. Many are related to features added or refactors. Fixing bugs should be more about understanding the nature of the project than just playing whack-a-mole. LLMs don't have as good a memory as humans and much of the meta discussion would be out-of-band for the LLMs. We shouldn't be paying for monkey work. We should be paying the humans that deeply understand "the lore" of the project and can apply it in a meaningful way.

        In the first place, it's a long time coming that some maintainers feel the pressure to take the direction of the projects more seriously, and in some cases let others step up. So many open source projects need to stop being the stereotype of lone genius pet projects or cultish power grabs. When people whine about open source not getting paid, this is the real reason why. It's not that the money or value isn't there, but a lack of confidence in the maintainers.

  • jsnell 2 hours ago
  • throwa356262 2 hours ago
    I just tested this using Claude and at least with 4.5 this does not seem to be possible. The context grows very quickly and the LLM gets lost and starts hallucinating. Maybe I am missing some key ingredient here?

    Of course, if you have a large team of AI and security experts and an unlimited token budget, things can look different.

  • tznoer 12 hours ago
    Grepping for strcat() is at the "forefront of cybersecurity"? The other one that applied a GitHub comment to a different location does not look too difficult either.

    Everything that comes out of Anthropic is just noise but their marketing team is unparalleled.

  • nielsbot 4 hours ago
    Wondering how many of these memory errors would be caught by running the Clang Static Analyzer (or similar) on them.

    https://clang-analyzer.llvm.org

    Alternatively, testing these projects with ASan enabled:

    https://clang.llvm.org/docs/AddressSanitizer.html

  • octoberfranklin 11 hours ago
    This reads like an advertisement for Anthropic, not a technical article.
    • blackqueeriroh 11 hours ago
      Okay, so if that’s the case, what do you have that’s constructive to say about it?
      • irishcoffee 10 hours ago
        Their comment was constructive for me: now I’m not going to read the article.
  • cyanydeez 11 hours ago
    Is there a Polymarket on the first billion-dollar AI company to go to $0 because of its own insecure model deployment?
  • username223 9 hours ago
    "Evaluating and mitigating the growing risk of LLM-developed 0-days" would be much more interesting and useful. Try harder, guys.
  • catlifeonmars 6 hours ago
    > Our view is this is a moment to move quickly—to empower defenders and secure as much code as possible while the window exists.

    Yawn.