> I think we're moving towards humans no longer needing to understand a codebase, and letting AI drive it.
Hard disagree. Even the best frontier models generate output that's not what I asked for. Sometimes I realize that I get lazy in my prompting and the lack of specificity winds up showing up in the output. Just the other day, a coworker built a huge feature using frontier models and it slipped an IDOR in.
I just don't see a world in which we completely cede control of the codebase to AI because it's still my ass on the line if I ship something that completely borks production. If I'm not reading code regularly, then I lose the ability to read code, and if I lose that ability, then I'm no longer a developer.
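For anyone who hasn't run into an IDOR (insecure direct object reference): it's the bug where an endpoint looks up a record by id without checking that the caller owns it. Here's a minimal sketch of what that looks like and what the fix looks like. The `Invoice` type, the in-memory `INVOICES` store, and both function names are made up for illustration; this is not the code from my coworker's feature.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Invoice:
    id: int
    owner_id: int
    total_cents: int


# Hypothetical in-memory store standing in for a database table.
INVOICES = {
    1: Invoice(1, owner_id=10, total_cents=4200),
    2: Invoice(2, owner_id=20, total_cents=999),
}


class NotFound(Exception):
    """Raised for missing records and for records the caller does not own."""


def get_invoice_vulnerable(invoice_id: int) -> Invoice:
    # IDOR: any authenticated caller can read any invoice by guessing ids.
    return INVOICES[invoice_id]


def get_invoice(current_user_id: int, invoice_id: int) -> Invoice:
    # The lookup is scoped to the caller. A foreign id gets the same error as
    # a missing id, so the response does not reveal which ids exist.
    invoice = INVOICES.get(invoice_id)
    if invoice is None or invoice.owner_id != current_user_id:
        raise NotFound(invoice_id)
    return invoice
```

The vulnerable version reads fine in a diff, which is exactly why it tends to slip through when nobody reads the code closely.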
This “short leash” seems like more of a crutch to me, and a sign of not giving the AI enough detail on the problem to begin with, or not reviewing and iterating on its output.
Hand-holding great models like Fable through implementation is a waste of time, and a waste of Fable. You can have increasingly nuanced discussions with stronger models, and they write a lot better code than they used to. The process of discussing designs and their implementations, questioning things that look weird to you, and actually reading the AI’s responses also helps to find better solutions.
For example, one time I wanted to write a greedy solver for a problem, and in my discussion with Opus on the idea it suggested using an existing MILP library to solve the problem exactly. I’d never even heard of MILP, but my final implementation ended up being better and simpler than what I’d have done alone.
If you have invested significantly in the planning phase and there is momentum in the architecture and conventions that already exist in the project, the implementation phase might not need as much oversight as is suggested here.
> You can discover that your initial idea was dumb and a better one exists
The planning and architecture phase is usually where I make these kinds of discoveries at a high level.
> Your agent might go “off the rails” and start doing something you don’t want it to do
Candidly, these orthogonal, inadvertent edits aren't as bad as they once were, and for impactful changes there should be at least some test coverage, even if that coverage just "freezes" what was implemented.
As you mentioned, the final review discussion is a good chance to verify beyond what review or adversarial review agents find.
yep. got a laptop i dont care about that claude can play with in wsl.
its the fun of funemployment.
starting work again is gonna be an interesting change though. its currently straightforward letting it run, then giving a broad critique and setting up new introspection/closed loop feedback for an hour over a beer, then letting it run wild again after
>>You never use “YOLO” mode (aka “dangerously skip permissions”)
Do you mean this?
I'm curious how people are using Claude in any way other than bypass-permissions. I've tried for so long to maintain a curated list of things Claude can use, but inevitably I would come back only to find it stuck because it decided to pipe the output of one tool into another, and that's not explicitly allowed, so it stopped even though it was just grepping or whatever. I found it infuriating. In bypass-permissions it "just works", but then again I only use it to analyze existing code and suggest new changes (and even if it breaks something, that's what source control is for?)
It does do this to frustrate you, save 30 tokens, and then waste a few thousand more when it didn't get all the context it needed by grep'ping. You have to be involved in the process though. It frequently wants to do things that are so incorrect, that even if it would be more convenient to just totally ignore it, it would be insane to actually ignore it. Do you trust it to not accidentally rm -rf the .git/ right after it helpfully force pushes to remote? I don't. Even if I don't expect it to do that, why would I ALLOW it to be able to?
I did it by making a huge database of allowlisted bash and having hooks check each one against the list. It makes a recursively parsed tree so it can handle gnarly blocks of bash. And then it outputs to the agent what failed and tells it to break it up next time. Then, in agent instructions, I impress on it strongly to use composable bash tools rather than trying to write python/ruby/perl scripts.
It was a bit of work, admittedly, but it's picked up a few users and I learned a lot from designing the research process and parsing the syntax trees.
I actually want to be alerted about everything that's not auto-approved, though. With safe commands auto-approved, it's much less noisy. I think it's important to read your code, as it develops, not just at the end, and understand what agents are doing.
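To make the allowlist-hook idea above concrete, here's a rough sketch of the core check. It is much simpler than what I described: it splits a command line flatly on `|`, `||`, `&&` and `;` with the standard-library `shlex` module instead of building a recursive tree, and it refuses anything it can't see into (command substitution, redirects, newlines). The `ALLOWED_COMMANDS` set and the `check_command` name are hypothetical, and checking only the first word of each segment is not real security on its own.

```python
import shlex

# Hypothetical allowlist of read-only commands, for illustration only.
ALLOWED_COMMANDS = {"cat", "grep", "head", "ls", "rg", "wc"}
SEPARATORS = {"|", "||", "&&", ";"}
PUNCTUATION = set("();<>|&")


def check_command(command: str) -> list[str]:
    """Return a list of problems; an empty list means every segment is allowlisted."""
    # Fail closed on syntax this flat splitter cannot see into.
    for marker in ("`", "$(", "\n"):
        if marker in command:
            return [f"unsupported syntax {marker!r}: run the commands separately"]

    lexer = shlex.shlex(command, posix=True, punctuation_chars=True)
    lexer.whitespace_split = True
    try:
        tokens = list(lexer)
    except ValueError as exc:  # e.g. unbalanced quotes
        return [f"could not parse command: {exc}"]

    problems: list[str] = []
    segment: list[str] = []
    for token in tokens + [";"]:  # trailing separator flushes the last segment
        if token in SEPARATORS:
            if segment and segment[0] not in ALLOWED_COMMANDS:
                problems.append(f"{segment[0]!r} is not allowlisted")
            segment = []
        elif token and set(token) <= PUNCTUATION:
            # Redirects, subshells, background jobs: rejected rather than analyzed.
            problems.append(f"operator {token!r} is not supported")
        else:
            segment.append(token)
    return problems
```

With this, `check_command("rg TODO src | head -n 5")` returns an empty list, while `check_command("ls && rm -rf .git")` reports that `'rm'` is not allowlisted. The returned messages are what a hook would feed back to the agent so it can break the command up next time.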
Build your own MCP of allowed tools. Cargo. Ripgrep. File read and write, including directory listing and find. some git commands. Then block everything else.
I’ve found unexpected success in using ephemeral NixOS VMs for local development… once you authenticate your agent you can let it run wild without worrying about permissions.
AI is a junior to mid-level engineer. If you treat it as such, you get the best of both vibe coding and rigorous engineering without all this paranoia.
Since the very beginning I've run Claude from an isolated VM in YOLO mode. This is just like giving an engineer their own laptop. Claude works on a feature up to a PR-worthy point. I review the diff, just like I would with another engineer, and massage it to get it in the right shape and move on.
Inexperienced engineers make the same mistakes described. I've even seen rm -rf, albeit not from root! I would have lost my mind micromanaging someone with all permissions denied.
LLMs are still next-token predictors. Just because you can give one vaguer instructions and it still finds the right steps to follow doesn't mean it's intelligent. It means you're speaking the same language as the harness they trained your model on.
And that has a limit. If you are stuck at PoC level or simple apps, you have no idea how limited the current models still are. There you really need to break tasks down, not just trust a token predictor to list steps that sound good. There has to be a human in the loop somewhere, because once you start skipping permissions, the best case is you hit the jackpot; more likely you get a suboptimal solution and wasted tokens; and what's genuinely still terrifying is when the model ignores instructions and does some stupid nonsense, ruining your day. It really is as sharp as a CNC machine. It's not not useful, but it could be dangerous, so maybe don't try to carve wood with a monster machine, or park your Ferrari in that cramped neighbourhood, if you don't know how to parallel park.
"Next token prediction" is an interface, not an algorithm. A process that "predicts next tokens" can be arbitrarily complex or simple, and arbitrarily capable or incapable of performing a given task.
Saying that an LLM can or can't do something because it's a "token predictor" is a category error. The interface isn't a hard limit.
I'm not sure how you're defining "intelligent", but I'd like to know how it is able to exclude a language model, while still including humans, without simply defining it with an axiom that predefines LLMs as lacking intelligence.
An LLM has a fixed number of ways it can express itself. We can give it an array of 14 billion options, but it still has to choose one to output. Humans have no such limitation.
An LLM does not persist in consciousness from one token to the next. Each generation, happening hundreds of times a second, will be initialized, generate an output, and terminate. Humans are not stateless like an LLM.
You're conflating a singular model with a much larger system, but I want to address some of your points anyway.
> An LLM has a fixed number of ways it can express itself
While the model itself is deterministic, there is not a fixed number of ways it can express itself, given that we can use settings like temperature to inject randomness into the output.
> An LLM does not persist in consciousness from one token to the next
While a model alone does not update itself to persist some form of history, there are a number of ways to overcome this: episodic memory, fine-tuning, and other self-improvement systems can indeed carry forward what you've called "consciousness".
> Humans are not stateless like an LLM.
Again, you're conflating a singular model with a very large system. An LLM might be stateless, but an agentic system that relies on LLMs is very often not.
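On the temperature point, here's a small sketch of how sampling with temperature works, in plain Python with no ML library. The logits are made-up numbers and the function names are mine; real inference stacks do the same thing over a vocabulary of many thousands of tokens. The set of tokens stays fixed, but temperature reshapes the probability of picking each one, and the sampled sequence varies from run to run.

```python
import math
import random


def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Turn raw scores into probabilities; lower temperature sharpens the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits: list[float], temperature: float, rng: random.Random) -> int:
    """Draw one token index at random according to the tempered probabilities."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


logits = [2.0, 1.0, 0.1]  # hypothetical scores for three candidate tokens
rng = random.Random(0)
for t in (0.1, 1.0, 10.0):
    probs = softmax_with_temperature(logits, t)
    draws = [sample_token(logits, t, rng) for _ in range(10)]
    print(t, [round(p, 3) for p in probs], draws)
```

At a low temperature nearly all of the probability sits on the top-scoring token, so the output is close to greedy decoding; at a high temperature the three options approach equal odds.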
It's impossible for someone to doubt their own sentience. The literal act of doubting is enough to dissipate all doubt. Solipsism is essentially the one certainty that every mind out there has.
Doubting the sentience of machines and even other humans is perfectly fine though. Only empathy allows people to make the leap and assume other humans have souls.
While the how is different, the what has many parallels. E.g. both the brain and LLMs appear to learn distributions of representations, they both develop a hierarchy of those representations, both have early layers that process simple features, with later ones processing more abstract concepts, both predict missing information...
Even if you could understand human cognition to the level required to say, confidently, that it’s done one word at a time, it’s likely not! Natural language is not a prerequisite for human intelligence, as evidenced by the fact that we went from primates to commenting on HN.
Natural language is, however, a prerequisite for the existence of LLMs. It’s more similar to methods for storing and retrieving information, like the printing press or a database, than it is to a sentient being.
That’s not to say that LLMs can’t do crazy things, because they already have. Our language can encode a whole lot of information, and it’s incredible that we’ve found a way to distill that so effectively.
Maybe I'm too optimistic, but given appropriate skills and references (not just for writing but also reviewing) and intelligent use of subagents for isolated reviews and checks, you can lengthen the leash a bit.
But you still need to properly review plans and PRs to keep a good mental model of the codebase. This effectively limits the number of tasks being done in parallel to maybe 2-3. Though you'll be mentally exhausted and probably start to make mistakes or take shortcuts in reviews yourself.
Last year it was, “AI is just a stochastic parrot.”
This year it’s, “AI can write the code, but a human still has to review it!” (Using AI, of course.)
Give it another year and the narrative will be: “Only AI is capable of reviewing code, and only AI can review the AI’s review. Humans just need to read the AI’s final opinion so they still have meaningful oversight.”
The goalposts keep moving. The certainty never does.
The regress ends somewhere, because (barring some pretty sharp changes to the way the law works basically everywhere) ultimately someone has to certify the outcomes as acceptable. This might be in the form of the market (though AI-adjacent stuff seems extremely prone to prolonged market failures), this might be regulatory in nature. This might be the executive management of the companies involved.
Personally I think that if you cranked the capability up high enough the first person you'd run into who absolutely demanded more than vibes and didn't care about your singularity thesis would be the representative of a reinsurance firm: mostly to do serious stuff without bending the law, you need insurance, and I am unaware of anyone writing serious policies (certainly not ones that make any economic sense) that underwrite the risk of AI autonomy outcomes financially.
When Swiss Re writes a policy that Anthropic Cinematic Universe or whatever iteration we're on won't fuck it up?
Now maybe we're talking. Until then you ask three practitioners and get nine answers, no one knows what they're talking about unless they're doing a really good job keeping it quiet (and that's probably what you'd do!).
This post seems like some decent advice mixed in with a lot of overconfidence and unverifiable claims.
“expert developers whose skills have reached the point where they outclass any and all “frontier AI models” in their area of expertise”
Are any developers saying they outclass any and all frontier models? I’d say at best it’s mixed at this point. The best developers still do certain things better, but not even close to all things.
“The problem is that even code written and/or reviewed by Fable 5, will stink”
I'm curious whether Opus 4.8 or similar can attain Mythos level through good system prompting and steering. You would expect this to work if it's true that the strength of Mythos is its unwillingness to quit before it gets a desired outcome.
As a Mythos user (I’m part of Project Glasswing), I would say that abliterated models [1][2] produce similar, if not identical, results. While good prompting and steering won’t give Claude Opus 4.8 the same capabilities as Mythos (preview 1), using abliterated models (if you have the computational power to run the larger ones) will get you close to the same goals as people who have access to Mythos (preview 1) [3].
[3] I specifically refer to “preview 1” because the newer versions (Fable 5 / Mythos 5) don’t appear to offer the same level of freedom as the very first version that I was able to use through Project Glasswing. This is one of the reasons why I continue running our massive security scans with “preview 1”, or at least I was running them until June 30, when the program’s policy changed.
I think that Anthropic is gaslighting us with their new model releases. Specifically, I think they have some good base model and are just fine-tuning it until they achieve the desired outcome, or the desired outcome is achieved accidentally as part of fine-tuning. My theory is based on the fact that as a long-term (if you can call it that) Claude user I keep noticing the same patterns in its output. It's not trivial but certainly possible to see when something has been written by Claude, because it has a different style than GPT.
However, they have quite a good harness in their backend, which is the actual model.
I <3 how everyone and their brother feels qualified to write advice to hundreds? thousands? of other developers about AI ... based on a couple months of experience as a personal user.
I mean, it's like writing a book about how to use React or Django or some other major software ... after you used it for one project for a month!
Authors: I know this is the Internet, and I know bloggers blog about whatever pops into their head ... but if you are going to act like an authority, how about you learn more than the average reader before you start telling them authoritatively what to do?
People are doing what they've always done with any other new technology, and sharing what, personally, works for them. People can take or leave the advice.
Right but there's a marked difference between a "I just tried this new tech and here's what I think" vs. "I've used this tech for a few months and now I'm going to speak like I know everything about it".
I have no beef with people writing about new tech, but I do have beef with claiming that "____ is the correct way to do it" ... based on nothing except "I feel proud of the last three months I spent with Claude".
How to get reliably useful and trustworthy outcomes from AI systems in many domains is an open problem of clearly large value; software is maybe the signal example of that. If one had solved it resoundingly and scalably, one could in fact "get rich quick".
It is unsurprising that a lot of people claim to know how to get rich quick.
I believe it is possible to solve this problem, and I have my own horses in the race which I won't threadjack to promote here, but it's the central problem of our profession at the moment. We've all seen the truly discontinuous outcomes and we've all seen allegedly national-security-dangerous models (which at one time was GPT-3) faceplant with their shoelaces tied together. I wanted to see if Fable was really all that, so I left it overnight on some fairly straightforward C++ (code DSv4 Flash works on with moderate supervision), and the result is pretty roast-worthy. I gave it a chance to redeem itself this morning and it's ticked up a bit (I still think it's roughly Opus 4.8 with a Project Zero fine tune and DRO trained off the constant gratuitous yield tic which is pretty clearly an intentional gimp).
I give all such claims 30 seconds of my time because someone is going to actually be right one of these days.
A lot of people with long careers in the old way of doing things are feeling incredibly threatened and defensive, and are desperate to virtue signal about AI.
This is probably slower than writing the code yourself. Doesn't make sense to me. Using an agent without YOLO mode is not worth it.
The way I'd rather do it is to tightly control the output with skills you write yourself, prompts, plans, etc., and get as close as possible to what you would write yourself.
Not really. If it takes you 15 minutes to write a 50-line function but it takes the AI 90 seconds, then you are already at a 10x speedup just for this task.
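The arithmetic, spelled out: 15 minutes is 900 seconds, and 900 / 90 = 10. Here's a tiny sketch that also lets you add review time, since that's where the rest of this thread says the hours go. The 5-minute review figure is purely a made-up number to show the shape of the trade-off.

```python
def speedup(human_seconds: float, ai_seconds: float, review_seconds: float = 0.0) -> float:
    """Ratio of hand-writing time to AI generation time plus any review time."""
    return human_seconds / (ai_seconds + review_seconds)


print(speedup(15 * 60, 90))          # generation only
print(speedup(15 * 60, 90, 5 * 60))  # with a hypothetical 5-minute review
```

The first call gives 10.0; with the hypothetical review added, the ratio drops to roughly 2.3, which is still a win but a much smaller one.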
This (non-yolo mode AI coding) is actually how we used to code in the old days (2023).
A better method starts with realizing that everything every program does is data transformation and/or movement.
Then you ask the LLM to subdivide the data into a tree along the domain model, classifying streaming vs. storing nodes.
Then, for each node, you discuss the best data structure with the AI.
Then you ask for an interface that fully encapsulates the structure, where every mutation only goes from a valid state to a valid state and nothing else is allowed to touch the state.
And that's mostly it: just connect all the interfaces until the input goes to a monitor, to storage, to an API, or wherever the destination is.
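As a sketch of what "every mutation only goes from a valid state to a valid state" can look like for one storing node: the `Stock` type and its invariant below are invented for illustration, not taken from any real project. The invariant is checked on construction and every mutation returns a new instance, so there is no code path that yields an invalid value short of deliberately bypassing the dataclass.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stock:
    """Hypothetical domain node: units on hand and units reserved for orders."""
    on_hand: int = 0
    reserved: int = 0

    def __post_init__(self) -> None:
        # Invariant checked on every construction, so no invalid Stock can exist.
        if self.on_hand < 0 or not 0 <= self.reserved <= self.on_hand:
            raise ValueError(f"invalid stock state: {self}")

    def receive(self, units: int) -> "Stock":
        return Stock(self.on_hand + _positive(units), self.reserved)

    def reserve(self, units: int) -> "Stock":
        return Stock(self.on_hand, self.reserved + _positive(units))

    def ship(self, units: int) -> "Stock":
        # Shipping consumes reserved units only.
        units = _positive(units)
        return Stock(self.on_hand - units, self.reserved - units)


def _positive(units: int) -> int:
    if units <= 0:
        raise ValueError("units must be positive")
    return units


stock = Stock().receive(5).reserve(2).ship(2)
print(stock)
```

Reserving more than is on hand, or shipping more than is reserved, raises `ValueError` instead of producing a bad state, and because the dataclass is frozen, ordinary attribute assignment from outside fails too.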
Efficient != effective, and the author outlines as much. Regardless, while you're technically correct, it's kinda like saying the Fantasy Land Specification[1] (aka the "Algebraic JavaScript Specification") is pure. The problem is that purely functional fantasy lands rarely exist outside of fairytales. In other words, life is a lot like JavaScript and never that simple.
> The AI will have gone off the rails multiple times and you will only notice it later when you actually try to use the software.
Except that the AI can now use your software itself, find and fix bugs, and even drive new features.
>Your agent might go “off the rails” and start doing something you don’t want it to do
This happens but far less often than it used to, and the case for full autonomous agents is getting stronger, not weaker.
>It is humanly impossible to build your own understanding of a codebase
This again feels outdated. I think we're moving towards humans no longer needing to understand a codebase, and letting AI drive it.
Am I wrong? Are you guys just YOLOing everything these days?
I run mine in a container, so it doesn't have access to the SSH key I use to push.
I’m skeptical. Example prompt and output please.
[1] https://huggingface.co/search/full-text?q=abliterated&type=m...
[2] https://webdecoy.com/blog/wtf-are-abliterated-models-uncenso...
• https://huggingface.co/huihui-ai/Huihui-GLM-5.2-abliterated-...
• https://huggingface.co/huihui-ai/Huihui-Kimi-K2.5-BF16-ablit...
• https://huggingface.co/huihui-ai/Huihui-Qwen3.5-397B-A17B-ab...
• https://huggingface.co/huihui-ai/Huihui-DeepSeek-V4-Flash-ab...
• https://huggingface.co/huihui-ai/Huihui-Qwen3-VL-235B-A22B-I...
• https://huggingface.co/huihui-ai/Huihui-Qwythos-9B-Claude-My...
• … so on and so forth.
if you want to beat it, give it more turns before it has to "wrap up a session"
[1] - https://github.com/fantasyland/fantasy-land