Use boring languages with LLMs

(jry.io)

143 points | by evakhoury 4 days ago

42 comments

jryio 3 hours ago
Author here, wasn't expecting this piece of writing to show up on HN.
The specifics of Python were chosen only due to the language ecosystem being fragmented and inconsistent while Python remains an essential learning, research, and now ML programming language (it was my first language and I still love it).
My thoughts on LLM generated code have changed immensely in the last 9 months as I've taken on teams and projects through my consulting work [1] as a fractional CTO. Python remains a difficult, flakey, and inconsistent programming language for complex production systems. Most other programming languages suffer from fragmented toolchains and ecosystems: JavaScript (famously), PHP, and even C/C++ to a degree.
Languages with a single way to do things benefit the most: Ruby, Rust, Swift (even). Low entropy is the way to go and convention > configuration seems to pay off with LLMs.
Mean cost of management is more important than specific edge examples like "X company runs on Y language". I think that 'boring' languages with rock-solid compilers, toolchains, testing frameworks, and package managers make for high return on engineering time and production maintenance.
[1]: sancho.studio
gertlabs 1 hour ago
When you're working on something difficult that requires a model to reason intelligently, lower level and strongly typed languages often outperform on the same problems [0]. We have a few hypotheses about why, with a moderately high correlation between performance and token density of the output program -- i.e. more token dense languages are more difficult for programs to reason about.
Most models come up with the least effective solutions when writing Python.
[0] https://gertlabs.com/rankings
- ajb 1 hour ago
  That's a very interesting page, but the language ranking is wildly different for "average percentage" (Python bottom) and "success rate" (Python second). Sounds like there is some subtlety about this.
  - gertlabs 46 minutes ago
    Success rate is essentially loading/compilation success + ability to adhere to the environments' rules.
    For one-shot responses, the majority of failures are environmental/syntax, which naturally favors interpreted languages. For longer agentic coding sessions, models solve the environment issues quickly and it becomes a fair comparison of who comes up with the smarter solution. You can filter for that here: https://gertlabs.com/rankings?mode=agentic_coding
- christophilus 1 hour ago
  I ran a little test with Go, TypeScript, Clojure, F#, Haskell, and Rust. Token count was roughly in the same ballpark, but it used the fewest for TypeScript, then Go. The rest required a bit more. Clojure always won in terms of lines of code though, generally coming in at 1/2 the size of the Go or TypeScript solutions.
keithnz 15 minutes ago
I think you should use any language that achieves, or comes close to, native speed and has a reasonable ecosystem of significant libraries around it. Trivial libs are pretty much dead, as AI will implement what you need, but if you need something like MQTT, it's much easier when you have a mature lib that handles it. I've experimented with a bunch of languages with LLMs, like Go, Rust, C, C++, C#, Kotlin. All work fine. My decision on what to use depends on what the larger ecosystem provides and what I'm programming for (embedded, backend, Web, GUI, App etc). I'd probably add in Swift if I get around to doing iOS stuff. There's no real "best" here; multiple options are likely going to be fine choices. Crazy thing is, if you don't like your language choice you can use AI to change it (ideally early on). Just for fun I got AI to convert one of my TUI apps to various languages. Went reasonably well.
jaen 2 hours ago
LLMs have a limited context window - similar to the limited attention span and memory of humans. LLMs also have trouble attending to many constraints at once.
Therefore the best language for agents is likely the one that, on one hand erases all irrelevant details (ie. raises the level of abstraction and does not force focusing on eg. memory management), and on the other hand encodes any domain-relevant details in the code (eg. using advanced type systems, annotations, contracts, spec-like tests eg. property-based).
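One way to picture "encoding domain-relevant details in the code" is the rough Python sketch below. It is only an illustration under stated assumptions: the `Cents` type and the `split_evenly` function are hypothetical names invented for this example, and the property-style test is hand-rolled with the standard library rather than taken from a dedicated property-based testing framework.

```python
import random
from typing import NewType

# Hypothetical domain type: money is always integer cents, never a float.
Cents = NewType("Cents", int)


def split_evenly(total: Cents, parts: int) -> list[Cents]:
    """Split `total` into `parts` shares that differ by at most one cent."""
    if parts <= 0:
        raise ValueError("parts must be positive")
    base, remainder = divmod(total, parts)
    # The first `remainder` shares each receive one extra cent.
    return [Cents(base + (1 if i < remainder else 0)) for i in range(parts)]


def check_split_properties(trials: int = 1000, seed: int = 0) -> None:
    """Spec-like test: assert invariants over many random inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        total = Cents(rng.randint(0, 1_000_000))
        parts = rng.randint(1, 50)
        shares = split_evenly(total, parts)
        assert len(shares) == parts            # one share per participant
        assert sum(shares) == total            # no money created or lost
        assert max(shares) - min(shares) <= 1  # shares are as even as possible


check_split_properties()
```

The invariants in `check_split_properties` act as a compact specification that an agent can re-run after every change, without needing to hold the whole intent in its context.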
Human readability is a separate concern and still relevant, but the two mentioned properties actually generally improve on that as well (at least for engineers persistent enough to scale the tower of abstraction).
Based on this, it seems Go is certainly not that "agent endgame" language. It has large amounts of boilerplate, a general lack of safety around concurrency features, a pretty middling static safety story overall with a generally underpowered type system.
I don't think the perfect language exists, yet, but just wildly imagining, it would probably be something like a cross between Scala, Elixir and Lean (or equivalents). Unfortunately, none of those languages has the large training corpus required to make it perform well in all agentic engineering situations (yet).
For any language comparison, one must separate the expressiveness of the language, which limits the long-term possibilities for agents, and the training corpus, which is what mostly gives it the current standing. I think we are still in the phase where the languages are separated by essentially random non-design factors such as the amount of training environments the frontier labs are willing to create for them.
Given that, the syntax does not matter all that much, as long as the base language itself is flexible enough - as another wild idea, it's also possible that eg. Python could mostly swallow all these features through external tools (eg. the pre-existing type checkers or linters), and if the frontier labs bother to RL on those tools, that would also work (see also: Mojo).
- devin 2 hours ago
  I know Clojure misses the mark for you in some major areas, but having a real, proper REPL for an agent to interact with makes for an extremely strong feedback loop. IMO, a strong candidate for an "agent endgame" language has this.
  - jaen 2 hours ago
    I love REPL-driven development, and exploratory "programming" via small snippets is likely part of the endgame (the more the agent strays outside its comfort zone, the more it's needed). It also looks like it paradoxically saves context, since piling up many small snippets is still better than trying to fix a one-shot gone horribly wrong.
    And yeah, if Clojure had a better static safety story, it would actually rank high, since it's high-abstraction with metaprogramming capabilities, excellent runtime specifications, and has a good "garden path" ie. natural code is good code.
    (As a devil's advocate though, REPLs can be replaced with gluing together scripts with ad-hoc state/caching via the file system, which LLMs seem to be pretty good at already...)
    So as an addendum, introspectability is also important ie. allowing to discover the state and composition of the system at any moment - though that's partially a language (eg. reflection) and partially a tooling (eg. debuggers) issue.
    - iLemming 13 minutes ago
      It sounds almost squarely written by someone without practical experience of REPL-driven development. Specifically with Lisp REPLs. Sure, other languages also have REPLs, but if you dig just a bit deeper, you'd learn that every single step there in R[ead] E[val] P[rint] L[oop] has differences. That makes the entire holistic experience of using the language drastically different. Not universally better for every domain and every case, just different.
      Here's one of many practical examples I can give you - my WM on Mac is Yabai, it is hooked up to Hammerspoon, which can be scripted with Lua, which means I can use Fennel, which means I can have Lispy REPL. And from here, it is a bit difficult to explain the difference. The challenge is that the magic is invisible to people who haven't felt it.
      The key insight is "the image vs the file". In most languages, your program is a description that gets turned into a running thing. The REPL is bolted on - a convenience wrapper around that same lint/compile/run/restore-the-state cycle. Python, Ruby, Lua, C#, etc. REPLs work that way. You're still fundamentally working with files that produce processes.
      In a Lisp, the running system is the environment. There's no gap between "the code" and "the live thing". When I connect to Hammerspoon's Lua runtime via Fennel, I'm not sending scripts to a subprocess - I'm reaching into a living system and reshaping it while it runs.
      The missing vocabulary here is "liveness", not "fast feedback" - that's a pale shadow of it. Liveness means the environment has no opinion about what's "done" versus "in progress". Everything is always mid-flight and accessible.
      So I can actually reach out in a live REPL session to, let's say, the Slack app window and extract the data about every single element in the app, get the content, compare and continuously iterate, without having to restart anything, without even saving the code - just pure data extraction without compiling, dealing with state changes, etc. I can interactively move the window, resize it, hide, or maximize it - all that programmatically. Imagine DevTools on steroids, only it works for everything, not just web apps.
      Learning Lisp and Clojure allowed me to truly experience the genuine joy of programming, because it makes it feel like you're playing a video game. And now, can you even imagine what happens when you open up access to all that awesomeness and grant it to an LLM? Most people have zero idea what it feels and looks like, when you can point an agent to a REPL running in a k8s cluster and it introspects things on the fly, while you let another agent poke through the UI and they work as a team to fix something or develop a new feature.
      > if Clojure had a better static safety story, it would actually rank high
      This framing reveals an assumption - the primary value of a type system is catching errors early, and that Clojure is just a dynamically typed language that would be improved by adding that. In a Lisp image, "early" and "late" barely exist as meaningful categories. The feedback loop isn't compile-time vs runtime - it's just... now. You evaluate a form and you know immediately. The error is right there, in context, with the live data that caused it.
      Static types are, in a real sense, a compensation for the gap I just described - the gap between the description and the running thing. When you can't easily inspect or reshape the live system, you want the compiler to tell you as much as possible before you cross that gap.
      Clojure has Spec (which can do things most other type systems would struggle to express), it has instrumentation, it has a rich data inspection story. You said "debugger", Clojure has Flowstorm, which is one of the best debugger experiences I have ever encountered in any language, and I have used more than a few.
keepamovin 13 hours ago
I agree with the idea that boringly predictable should be what is preferred, but anecdotally my experience in using Go with LLMs is that they trip up a lot on the races and locking from Go’s thread model. I haven’t seen the same problem in Rust, which is now why I’m doing all my LLM work for tooling in Rust.
The parallelism issue in particular was also not something I noticed agents struggling with in JavaScript, although JavaScript's concurrency model is clearly fundamentally different.
The concurrency issues that I saw LLMs face were one reason why I created freelang, which uses a very boring and auditable concurrency model of OS processes that use the file system to talk instead of IPC, shared state, or anything like that. Higher overhead, lower throughput, but more boring and hopefully fewer bugs: https://github.com/DO-SAY-GO/freelang
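To illustrate the general idea of "OS processes that talk through the file system", here is a small Python sketch. It is not freelang code and makes no claim about how freelang is implemented; the worker script, the file names, and the `run_workers` helper are all assumptions made up for this example.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Worker source, run as a separate OS process. It only knows two file paths.
WORKER = """
import sys
from pathlib import Path
inbox, outbox = Path(sys.argv[1]), Path(sys.argv[2])
numbers = [int(x) for x in inbox.read_text().split()]
outbox.write_text(str(sum(numbers)))
"""


def run_workers(jobs: list[list[int]]) -> list[int]:
    """Sum each job in its own process; files are the only channel."""
    with tempfile.TemporaryDirectory() as tmp:
        started = []
        for i, numbers in enumerate(jobs):
            inbox = Path(tmp) / f"job{i}.in"
            outbox = Path(tmp) / f"job{i}.out"
            inbox.write_text(" ".join(str(n) for n in numbers))
            # All workers are started before any is awaited, so they overlap.
            proc = subprocess.Popen(
                [sys.executable, "-c", WORKER, str(inbox), str(outbox)]
            )
            started.append((proc, outbox))
        results = []
        for proc, outbox in started:
            if proc.wait() != 0:
                raise RuntimeError("worker failed")
            results.append(int(outbox.read_text()))
        return results
```

There are no locks or shared variables to get wrong here: each worker owns its own input and output file, which is the trade of throughput for auditability that the comment describes.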
- nathannaveen 13 hours ago
  Personally I generally try to avoid concurrency when writing code with AI since I feel AI makes concurrency unnecessarily complex in Golang.
  - dnautics 5 hours ago
    I don't have this issue in Elixir.
  - quietbritishjim 11 hours ago
    Probably reasonable, but that means you're disagreeing with the article's point about goroutines being good for LLMs (since goroutines are a form of concurrency). I've never seriously used Go so I don't know how easy it is to avoid using them.
  - keepamovin 13 hours ago
    Right. But for things like GUI or orchestration tooling it’s unavoidable
itpragmatik 2 hours ago
Java 21, Spring Boot 4.x, Spring AI 2.x - probably the most boring stack, and it is working fantastically for me to generate solid, reliable code for agents and MCP servers using Claude Code or Cursor.
sheepianka 14 hours ago
I disagree. "Boring" languages leave a lot of assumptions in code, which will start to compound the more changes the model (and programmers) make to the code.
The more assumptions I can move to compile time the better models are at dealing with emerging complexity.
I would go the other way with LLMs and I wish for liquid types and effects in Rust to make type specifications even more strict.
P.S. effects and liquid types and type specifications in general add a lot of busywork, but models have a higher tolerance for busywork than developers do.
- tikhonj 2 hours ago
  Sounds like OxCaml is pretty close to what you want. You get access to similar capabilities as Rust, but also stricter typing and an (optional) effect system. I don't know of an equivalent to Liquid Types, but it seems like the same approach that worked for Haskell would work naturally in OxCaml.
  - SwellJoe 59 minutes ago
    You need a huge corpus of training data for the language for models to be good at using it. Rust has that (so does Go). OxCaml probably does not (unless there's some iceberg of open code out there that I'm unaware of). I'll take a slightly sub-optimal language with excellent training data coverage of my use case over a perfect language that the models have barely seen 100% of the time.
    Which is why I think it's silly to suggest creating a new language "for agents". Unless one or more of the frontier AI companies commit to creating a language and the training corpus for a new language, there's no good way to bootstrap a language that is ideal for agents. You need the huge pile of high quality code as a prerequisite for a language being good for agents. And, the argument applies similarly poorly for some language that looks like it has a good shape for agents, if it doesn't have a lot of human written code from the past decade or whatever. It's not a good language for agents unless agents already know the language really, really, well because of a huge pile of code in that language in its training data.
- bluegatty 3 hours ago
  The problem is that most of Rust's annotations are related to memory management in special conditions, aka solving problems that don't even exist in JS or Java. That's not going to help the AI solve problem space issues; it just helps the AI (and us) do things in the solution space, aka solve lower level things we consider important.
  - zozbot234 2 hours ago
    Memory management annotations are actually quite rare in Rust. It mostly uses plain RAII as seen in C++. You do need to annotate whenever an object may have multiple "owners" extending its lifecycle (which requires refcounting) but that's often directly visible in the problem domain.
    - bluegatty 1 hour ago
      & and &mut are all over the code in Rust. &T is by far the most common arg type.
      In the OP article, they mention 'don't need to worry about thread cause the concept does not exist' - well, & does not exist in Python.
      Those things are related to low-level computational issues (memory management), not problem space issues (moving money, transcoding the file, checking the spreadsheet), so a lot of &/&mut etc. and all that extra thinking slows down AI for the same reason it slows down you and me.
      In particular, building in Rust requires us to think a bit differently about how we create the program in the first place, and I don't think AI is very good at architecture yet.
      Probably ... eventually none of this will really matter though, it will just be like 'compiler pedantry' for the small number of people who work on those things.
dnautics 5 hours ago
I don't think the Python package manager is the high level difficulty for LLMs doing Python. I think the high level difficulty is nonlocal effects. At any given callsite, it might be difficult to know exactly what is going to happen to the data you pass into the call.
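A tiny Python sketch of the kind of nonlocal effect being described. Both functions are hypothetical and written only for this illustration: they return the same value, but one of them silently reorders the caller's list, and nothing at the call site reveals which is which.

```python
def normalize(scores: list[float]) -> list[float]:
    """Scale scores so they sum to 1. Side effect: sorts the caller's list."""
    scores.sort()  # mutates the argument in place
    total = sum(scores)
    return [s / total for s in scores]


def normalize_pure(scores: list[float]) -> list[float]:
    """Same result, but the caller's list is left untouched."""
    ordered = sorted(scores)  # sorted() builds a new list
    total = sum(ordered)
    return [s / total for s in ordered]


raw = [3.0, 1.0]
normalize(raw)
# raw has been reordered to [1.0, 3.0], even though the call site
# `normalize(raw)` looks identical to `normalize_pure(raw)`.
```

To know which behaviour applies, a reader (human or model) has to open the callee, and every function it calls in turn, which is exactly the context cost the comment points at.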
- AustinDev 3 hours ago
  A sibling mentioned that LLMs benchmark better on Elixir. Immutability and functional programming are likely the reason why it benches better.
lmm 14 hours ago
Rather than "boring", this seems to be reaching for something like the concept of a "pit of success", or https://haskellforall.com/2016/04/worst-practices-should-be-... . I don't think the fact that the most common pitfalls in Go are well known should be taken as a sign that it doesn't have more esoteric pitfalls as well; it's just that the common cases (like nil) are the ones that everyone sees all the time.
TheGRS 3 hours ago
I can probably fix package manager issues by hand, and quickly with a little rubber ducking with the LLM itself. I'm not sure that's a huge problem in the grand scheme.
There's a lot of stuff in Python's favor in regard to coding with LLMs: it's wildly popular so there are a lot of references for the right and wrong ways to use it, it can be typed using included libraries (it's as simple as telling the LLM "use typing for this"), and there are several great lint and unit testing tools to cover the hallucinations and poor decisions. The flexibility seems like an advantage to me personally, but I've always been a Python stan.
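As a rough idea of what "use typing for this" can yield, here is a short sketch using only the standard library. The `User` class and both functions are hypothetical; the point is that the `Optional[User]` return type makes the "not found" case explicit, so an external checker such as mypy or pyright can flag code that forgets to handle `None`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class User:
    id: int
    email: str


def find_user(users: dict[int, User], user_id: int) -> Optional[User]:
    """Return the user, or None if the id is unknown."""
    return users.get(user_id)


def email_domain(user: Optional[User]) -> str:
    """Return the part of the email after '@', or '' if there is none."""
    if user is None:
        # Without this branch a type checker would reject `user.email`.
        return ""
    return user.email.partition("@")[2]
```

Python itself does not enforce these annotations at runtime; the benefit comes from running a checker as part of the lint step the comment mentions.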
- mountainriver 2 hours ago
  Why choose a significantly worse language when you are writing it in the same English either way?
  It’s increasingly obvious that whole swaths of developers will just continue using the language they did before LLMs “just cause”.
  It’s more identity based at this point. My LLMs write Rust for me and I couldn’t tell you the difference outside of it being way faster and more reliable.
  - TheGRS 2 hours ago
    For the immediate term you should stick with what you know. I think that makes for much better prompting where you are coming in with experience with the language and the general style you'd like to see.
    Rust is a language I would like to adopt long term, but it's not one I can easily grok and so my output would be worse for it.
znnajdla 3 hours ago
Instead of empty theorizing, we should have benchmarks for this. There is at least one benchmark which suggests that LLMs are better at writing Elixir than most other languages: see the AutoCodeBenchmark.
- jryio 3 hours ago
  Send me an email I'd love to make one
- justinhj 2 hours ago
  Thanks for the tip. If they are representative, the results are quite surprising. Anecdotally, I always thought Go would be one of the languages LLMs were best at.
  https://github.com/Tencent-Hunyuan/AutoCodeBenchmark/blob/ma...
  - vunderba 2 hours ago
    In my experience, LLMs are pretty good at writing Go code. In fact, that’s one of my principal use cases: I have a number of bespoke Node programs from many years ago that usually run in the background on my computer.
    Each process (with the Node runtime engine) consumes about 50–100 MB of RAM. One of the first things I tried was using large language models to help port them over to Go. Since the model has a point of reference (the original Node program plus its unit tests) it’s been easy to use a test-driven approach and ensure the Go version maintains parity with the originals.
    Memory usage usually drops from around 50 MB to about 5 MB per process, which really adds up when you’ve got a dozen of these programs running at once.
quietbritishjim 13 hours ago
> Python is the same story but sung in a different key. Asking a simple question like “which package manager are you using?”
This is annoying but only needs to be solved once at the start, either by the LLM or the human guiding it. A single prompt of "Set up a uv project in this directory with Python 3.13" is enough that it's never an issue again for that repo.
> Goroutines are a far more tractable primitive for coding agents than threads, callbacks, async/await, or any of the colored-function regimes that dominate elsewhere.
I disagree with this. Goroutines, along with threads, callbacks, and traditional async, are all in the same category: spaghetti of unbounded background tasks. Structured concurrency [1] on the other hand is dramatically easier to reason about. Python has support for this (in Trio and asyncio.TaskGroup) as do other languages like Kotlin and Swift. Function colouring is a red herring; if anything, it's useful because it highlights the scheduling/cancellation points in your code.
[1] https://vorpus.org/blog/notes-on-structured-concurrency-or-g...
-----
This really does read as "Go is my favourite language". In fairness, that's a good reason to choose a language to use with an LLM (so long as it's powerful enough and not too obscure). But let's not pretend it's the best language for everyone.
- pyrale 12 hours ago
  > This really does read as "Go is my favourite language".
  Because it always is that.
  People advocating for boring languages always advocate for their boring language. For instance, if you tell a gopher that you agree with the point, and therefore the project is going to use java, they won’t be happy about it.
  - jshen 4 hours ago
    Except for the part where he says that he found Go infuriating to use as a developer prior to LLMs.
- lionkor 12 hours ago
  RE the article you've linked:
  > everyone knows goto was bad.
  Absolutely hard disagree. You can write extremely clean and resilient C with C89, goto, and a handful of rules. Telling people `goto` is bad is how we get shitty C programs and paradigms where goto would have been better.
  Goto isn't bad, its misuse is bad. Beginners will write shit code regardless of whether you tell them they can or can't use goto. That's also exactly what Dijkstra was arguing, if you read past the much misquoted "goto considered harmful", which he never said (it was an editorialized title, and not even the full version).
  - quietbritishjim 11 hours ago
    > > everyone knows goto was bad.
    > Absolutely hard disagree. You can write extremely clean and resilient C with C89, goto, and a handful of rules.
    That's a different goto. The one in C89 can only jump around within functions, but the article is talking about goto that can jump between any two points in the whole codebase arbitrarily. It stresses that point a bit more later on in the article, but you can already see it from the FLOW-MATIC code quoted above (which doesn't even have functions).
    Your point actually still stands: it's theoretically possible to write clean code using even the more general goto. (Probably by building abstractions with it like "function" and "for loop".) But would you be happy doing that with someone else - or especially with a coding agent? It's better that the "handful of rules" are enforced by the language, in my opinion.
    ---
    Edit:
    > That's also exactly what Dijkstra was arguing, if you read past the much misquoted "goto considered harmful", which he never said
    I just re-read the original "GOTO considered harmful" article (it's short and clear) and, while the title might not have been his, Dijkstra was definitely making a very plain argument that goto is bad for everyone and should be scrapped. He says in the introduction:
    > I [have] become convinced that the go to statement should be abolished from all "higher level" programming languages (i.e. everything except, perhaps, plain machine code).
    And in the conclusion:
    > The go to statement as it stands is just too primitive; it is too much an invitation to make a mess of one's program.
    - Nevermark 2 hours ago
      Here is a structured cross-codebase GOTO for C++, with optional declared & typed continuation values, sub-tasks/states/state-machines, and optional delegated typed termination values.
        // Syntax: { ...; y = go_to state1(x, ...); }
        // Meaning: Cross-codebase GOTO w/continuation values
        // Implementation: tail call
        #define go_to return
        // Syntax: { ... y = go_do state2(x, ...); ... }
        // Meaning: Cross-codebase sub-task/sub-state
        // Implementation: normal call
        #define go_do
        // Syntax: { ... go_terminate(y); }
        // Meaning: State machine termination
        // Implementation: normal return
        #define go_terminate return
        // Syntax: int state3 state(int x, ...) { ... }
        // Meaning: Structured state definition
        // Implementation: normal function
        #define state
        // Syntax: if (GOTO_NOT_HARMFUL) { ... };
        // Meaning: GOTO is now cleaned up
        // Derivation: Achieved
        #define GOTO_NOT_HARMFUL true
      Example:
        int state1 state() { ...; go_to state2(m); }
        int state2 state(int m) { ...; y = go_do substate2a(); go_to state3(); }
        int substate2a state() { ... ; go_to substate2b(q); }
        int substate2b state() { ... ; go_terminate(q); }
        int state3 state() { ...; switch (...) { case 1: go_to state4(v); case 2: go_to state5(); ...} }
        int state4 state(int v) { ...; go_terminate(r); }
        int state5 state() { ...; go_to state3(); } // State cycle
      So now you can have your #include <go_to>
      EDIT: Compressed/cleaned up my mess
  - jmiskovic 12 hours ago
    Bad code is the result of not following enough good practices rather than following one too many. Goto creates too much cognitive load. It had to be phased out to make room for better ideas on describing the intention with code more clearly.
  - waffletower 2 hours ago
    I have written C code for several decades. I hate a priori, divorced from context, rules such as "never use goto". "Avoid premature optimization" as a rule is much worse. I used goto a few times in that timespan, surrounded it with rebellious comments, but largely the advice has more than weakly held for me. I don't think goto somehow liberates C developers and avails superior architectures.
- kibwen 12 hours ago
  > Function colouring a red herring
  Any time you see anyone overly fixating on "function coloring" for any context other than ancient versions of Javascript it's a clue that the speaker has no idea what they're talking about.
- dminik 13 hours ago
  > Goroutines are a far more tractable primitive for coding agents than threads
  Goroutines are literally threads. Yeah, this really is a "go is my fav" article.
  - bfivyvysj 13 hours ago
    I use both go and python in my day job. The code that Sonnet produces for Go is much better than the Python it creates.
    This could be because our go code is typically smaller more defined services but I don't really believe that since even the isolated python services are pretty spaghetti looking.
  - Retr0id 13 hours ago
    Goroutines are not directly equivalent to threads.
    - quietbritishjim 11 hours ago
      If 100 goroutines are handled by 10 threads, the effect on correctness is identical: any two can be running in parallel with each other (not just concurrently). From the point of view of this discussion, that's all that matters.
      - Retr0id 10 hours ago
        Correctness is nice but performance characteristics exist too.
    - IshKebab 12 hours ago
      They used to not be, because they were cooperatively scheduled and threads can be preempted. But they added goroutine preemption in Go 1.14 so in practice there aren't really any significant differences to threads, at least in semantics. (At least as far as I remember; been a while since I wrote any Go.)
      You can be pedantic and say they aren't technically threads but that doesn't really matter from a programming perspective.
      - masklinn 4 hours ago
        Even when goroutines were cooperatively scheduled, because the cooperation was mostly hidden and every function call was a yield point, the average developer would treat them as being preemptive... until they spawned too many goroutines with a tight loop (and no function call) and the runtime locked up.
        > You can be pedantic and say they aren't technically threads but that doesn't really matter from a programming perspective.
        They are technically threads: they are independently scheduled, concurrent units of execution sharing an address space. They're just not OS (or kernel) threads. Hell, technically userspace threads (generally cooperatively scheduled) are the original, they predate kernel threads by a decade or two.
  - p2detar 11 hours ago
    As someone below said, they might be from a programming perspective, but technically they are not. See GOMAXPROCS for more info.
    That being said the whole `tractable primitive` thing used in the article sounds somewhat sloppy to me. I don't quite get it. Yeah, they could be easier for an agent to write than async/await, but threads are also trivial in that matter, and you'd still need a mutex with goroutines.
- jshen 4 hours ago
  > This is annoying but only needs to be solved once at the start, either by the LLM or the human guiding it. A single prompt of "Set up a uv project in this directory with Python 3.13" is enough that it's never an issue again for that repo.
  This isn't true. I'm mostly using python and UV with claude and it periodically decides to try to run scripts directly instead of using UV.
- cyanydeez 11 hours ago
  no, i believe it needs to be solved in finetuning. collectively, the OSS community should be pooling local LLM resources and providing a route to models that innately choose best practices as a "full stack" engineer would see them.
  in your mind, you think a harness and prompt is sufficient framing to keep the LLM output to design goals. but no matter your context size, as it grows, anomalous gradients appear that try to normalize competing patterns of development.
  the only real way is directly training out unwanted crosswise options.
  - quietbritishjim 11 hours ago
    I think you replied to the wrong comment. (Or, at least, I have no idea what you're saying.)
badlibrarian 3 hours ago
My experience a year ago (back when half of HN was still in denial about what was already working, let alone what was to come) was that Python was the lingua franca of LLMs. You could achieve almost anything that fit in 700 lines or less if you told it to write it in Python.
Times change, and I work more in R&D space than on legacy codebases, but I still ask it to write something in Python then convert it to the actual language on occasion. I don't know if I'm tricking the context window, forcing alternate pathways, or both, but it works.
- roughly 2 hours ago
  > Times change, and I work more in R&D space than on legacy codebases, but I still ask it to write something in Python then convert it to the actual language on occasion. I don't know if I'm tricking the context window, forcing alternate pathways, or both, but it works.
  My experience with LLMs is that they perform best in one of two modes - either one carefully scoped context or translating between two different contexts without modification - so this modality lines up with that fairly nicely: think in the programming language the LLM thinks "best" in and then translate that to the one you want.
  That said, there's often enough structural and conceptual differences between languages that a direct "transliteration" between, say, Python and Go is going to result in some fairly crummy Go, so I'm curious what you see in terms of the fidelity of that translation - do you mostly get "Python written in Go," or does the LLM really do a proper conversion from one language to the other?
  [-]
  - badlibrarian 2 hours ago
    I have strict context in place on what I expect from the final language (C# or C++) and I'm frequently left with my jaw open. Used my preferred json library on C++, used LINQ appropriately in C#. Mapped AWS libraries appropriately and used existing credential stores. Certainly better than what I got when I asked for the native version first, which is why I do the hurdle. It feels hacky but it works. In a year it probably won't be necessary.
sunshowers 2 hours ago
In my experience, LLMs benefit greatly from the existence of sum types and exhaustive pattern matching.
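A minimal Rust sketch of what that can look like, assuming a hypothetical `Shape` type (none of these names come from a real codebase): the enum is a sum type, and the `match` has no wildcard arm, so adding a variant later makes every unhandled site a compile error that an agent can read and fix.

```rust
// Hypothetical sum type: a value is exactly one of these variants.
#[derive(Debug)]
enum Shape {
    Circle { radius: f64 },
    Rect { width: f64, height: f64 },
    Square(f64),
}

// No wildcard (`_`) arm on purpose: if a variant is added to `Shape`,
// this function stops compiling until the new case is handled.
fn area(shape: &Shape) -> f64 {
    match shape {
        Shape::Circle { radius } => std::f64::consts::PI * radius * radius,
        Shape::Rect { width, height } => width * height,
        Shape::Square(side) => side * side,
    }
}

fn main() {
    let shapes = [
        Shape::Circle { radius: 1.0 },
        Shape::Rect { width: 2.0, height: 3.0 },
        Shape::Square(4.0),
    ];
    for shape in &shapes {
        println!("{:?} has area {}", shape, area(shape));
    }
}
```

The feedback loop is the point here: the compiler, not a runtime failure, tells the model which cases it forgot.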
sorenjan 1 hour ago
Why are we having computer programs generate source code in the first place? Shouldn't they generate something lower level, like an AST or some computational graph or something? Source code is made to be written and read by humans, and is then translated into machine code via various transformations. In theory a program should look the same to a computer no matter which language it started out as.
We have decades of compiler research, static code analysis etc, why do these extremely complicated black boxes of billions of parameters have to produce readable source code as their main output?
[-]
- carlostkd 1 hour ago
  In 5–10 years there will be no more code written by humans.
- JadeNB 1 hour ago
  > Why are we having computer programs generate source code in the first place? Shouldn't they generate something lower level, like an AST or some computational graph or something? Source code is made to be written and read by humans, and is then translated into machine code via various transformations. In theory a program should look the same to a computer no matter which language it started out as.
  Presumably because LLMs are trained on corpora read, and for now still probably mostly written, by humans, rather than on corpora consisting mostly of ASTs or graphs?
jmull 12 hours ago
This is an interesting idea, but I'd want to see something solid before acting on it.
From what I can tell, LLMs know/use three things: patterns above the level of any specific language, the syntax and idioms of specific languages, and how to apply the former to the latter.
The bottleneck isn't what languages the LLM can handle, but what I can handle coming out of the LLM. The general advice, then, is to use the language (and related setup/environment) you're familiar with.
[-]
- embedding-shape 12 hours ago
  Yeah, that last point probably depends on how you actually use LLMs. If you don't review the code at all, not even glancing at it (aka "vibe coding"), then it probably doesn't matter so much: yolo and deploy.
  But for the rest of us who aren't vibe coding, but pairing with LLMs and actively steering them, correcting things and iterating while deeply reviewing the code, design and architecture, then yes, I agree a lot: it matters more that you're familiar with the language than that the LLM is. They can pick up new programming languages in a message, so it doesn't really matter; the knowledge seems to come from programming, not to be locked into a specific language's syntax.
bluegatty 3 hours ago
"The concurrency model is the first of these. Goroutines are a far more tractable primitive for coding agents than threads, callbacks, async/await, or any of the colored-function regimes that dominate elsewhere. They are simple, type-safe, and ubiquitously used in the corpus the model was trained on. There is no question of what color your function is, because the question does not exist."
I don't really buy the intuition (aka Goroutines are more 'clear' than 'coloured' functions or threads), and there's no evidence presented for this either.
Although this could very well be true, I'm doubtful without seeing some real world data points.
The 'general premise' aka 'cosine similarity' may have been true before, but it may not be true anymore.
AI is just pretty good at anything it's 'seen enough' of, and that's it. I think it's more likely a 'threshold' problem than an ability problem, at least for most things.
'Rust' may represent a different domain, given the very detailed nature of notation and the vast possibilities that arise from that.
wryoak 4 days ago
Contradictory anecdote: there’s basically only one way to write Elm, as it is a very trend-resistant language with minimal updates over long timespans, but most agents in my experience will throw Haskell syntax and Prelude functions into their Elm output. Compiler or LSP will often set them right but they still try it initially
[-]
- amarant 4 hours ago
  I think you touch on a language feature I think is very important: compile time errors over runtime ones.
  I'm biased, I preferred it this way before AI. But even so I think there is real merit. Firm guardrails and clear feedback seem to benefit AI.
  Anecdotally, the worst AI performance I've seen was with GDScript, which is basically Python minus the huge corpus of training data. The best results I'm getting are with Rust, which is at the opposite end of the strictness spectrum.
- Weebs 1 hour ago
  I keep asking LLMs if I can define an interface implementation as an expression or without creating a new type in C#
  Every single one I ask always happily says yes, and starts claiming C# has local classes (a feature of Java)
- gampleman 13 hours ago
  I do agentic Elm development every day (it's my job). I feel like what you describe was a problem with models perhaps two years ago. Today's models don't seem to struggle with it at all and in fact do seem to benefit from what the author describes.
- BoumTAC 14 hours ago
  I’ve just started a new app with an Elm frontend. I’m using Grok Build, and it integrates really well.
  The compiler is incredibly helpful because it catches errors and gives clear explanations and the LLM can iterate over it. I’ve also added the elm-review package with the default configuration, which is fantastic for ensuring code quality.
- djohnmustard 14 hours ago
  Interesting, what models are you using? My use with sonnet 4.6 has been a breeze for the most part
- epolanski 15 hours ago
  Interesting, I have a different experience.
  I have worked on extending the Elm compiler, and Opus 4.6, GPT 5.4 and GLM 5 all had no issues with either the Elm compiler (written in Haskell) or my extended Elm.
  I didn't see them hallucinate much, not more than on mainstream languages.
tlonny 2 hours ago
I think _some_ but not _too much_ typechecking is the sweet spot for LLMs.
Without any typechecking, LLMs obviously find it harder to work agentically and validate their work.
With too much typechecking (I'm looking at you, rust), I've found agents get themselves stuck in local "architectural minima" and end up doing insane shit to mitigate ownership/borrow-checker issues inherent in the design they ended up with.
That said, if you're hands-on I think rust is a fantastic language for pairing with an LLM.
Animats 2 hours ago
I wonder what we end up with as an LLM-friendly programming language. It's likely to be something rather formal, with entry and exit assertions. Humans hate writing those, but LLMs need them to keep them on track and give them goals.
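As a rough illustration of entry and exit assertions in an existing language, here is a small Rust sketch. The function `split_evenly` and its contract are hypothetical, chosen only to show a precondition checked on entry and postconditions checked before returning.

```rust
// Hypothetical example: split `total` into `parts` shares that differ by at most 1.
fn split_evenly(total: u64, parts: usize) -> Vec<u64> {
    // Entry assertion (precondition): the caller must ask for at least one share.
    assert!(parts > 0, "entry: parts must be positive");

    let base = total / parts as u64;
    let remainder = (total % parts as u64) as usize;
    // The first `remainder` shares get one extra unit.
    let shares: Vec<u64> = (0..parts)
        .map(|i| if i < remainder { base + 1 } else { base })
        .collect();

    // Exit assertions (postconditions): these state the goal of the function.
    assert_eq!(shares.len(), parts, "exit: wrong number of shares");
    assert_eq!(shares.iter().sum::<u64>(), total, "exit: shares must add up to total");
    let max = shares.iter().copied().max().unwrap();
    let min = shares.iter().copied().min().unwrap();
    assert!(max - min <= 1, "exit: shares must differ by at most 1");

    shares
}

fn main() {
    println!("{:?}", split_evenly(10, 3));
}
```

The exit assertions double as a machine-checkable statement of intent: an implementation that drifts from the goal fails loudly at the boundary instead of silently downstream.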
haolez 2 hours ago
This reminded me of a benchmark that I saw a few months ago about LLMs being unexpectedly _very good_ with Perl when compared to any other language. I couldn't find it right now. If someone knows what I'm talking about, please post it here :)
suis_siva 1 hour ago
My experience is that languages with higher-kinded types (e.g. Haskell) allow for "controlled chaos". I design the type system, the higher-kinded types, the interfaces (though it's getting rarer that I need to do this) and I let Claude slop the implementation.
Additionally, fault-tolerant languages such as Erlang/Elixir allow me to not worry about the billions of edge-cases, and let Claude aggressively implement a mostly good-enough application. With LLMs, accepting a limited amount of failure may be a necessity (depending on the business/domain), and that's exactly what the BEAM enables.
janpeuker 2 hours ago
Great post showing the ironic revenge of opinionated architecture in times of cheap code. Exactly what LLMs can’t deliver: they always seem to be biased towards added complexity, not simplification.
ChrisMarshallNY 13 hours ago
I program in two languages: Swift (my main language), for client work, and PHP, for backend work. It’s overwhelmingly Swift.
In the last year or so, I have been using LLMs to assist my work, with generally excellent results.
I have noticed that the LLM delivers much better PHP, than Swift. I seldom need to rewrite or correct, the PHP code I get from it, and am constantly correcting the Swift. Part of the reason, may be that I am a much better Swift programmer, than PHP programmer, and there’s just a lot more Swift code. I haven’t really taken the time to analyze it.
I have my theories, as to why, but it’s not something I’m really into researching. I’ve just noted the trend.
[-]
- CharlesW 3 hours ago
  > I have noticed that the LLM delivers much better PHP, than Swift.
  If you ever have the time and inclination to try Axiom (https://charleswiltgen.github.io/Axiom/), I'd really appreciate knowing if you feel it quantitatively changes the Swift experience with your LLM/coding harness of choice, especially in regards to Swift concurrency.
  [-]
  - ChrisMarshallNY 3 hours ago
    Looks like a labor of love.
    Honestly, not sure that I'll be able to use it, right now (no telling, in the future). Looks like you really did a good job, though!
- sleepy_keita 13 hours ago
  I think it's pretty simple - there's just so much more open source PHP code out there in the LLM's training dataset. Swift has been around for much less time, and most Swift is closed-source - not that many years have passed since Swift became able to run on non-Apple platforms, too.
  [-]
  - iamcalledrob 13 hours ago
    I would also bet that 90% of Swift training data is UI code.
    And UI code quality tends to be technically pretty crummy/low-discipline. Your UI code doesn't need much consideration around data races, for example.
    [-]
    - ChrisMarshallNY 12 hours ago
      You wouldn't know that, if you looked at the UI code the LLM gives me. It loves threads; often mixing GCD and async/await, in one function.
      A lot of the code I need to tear out, looks like that.
      Most of the code I write is UI. It's actually fairly intense work, but relies on the underlying SDK, rather than language tricks.
      I find the UIKit code I get, is a lot more robust and performant, than the SwiftUI code.
  - psychoslave 12 hours ago
    PHP has changed a lot since its early days, to the point where anything old would be considered bad practice from the contemporary ecosystem's point of view. So duration is not everything. By comparison, C seems to have stuck to the same idioms all the way. That's not to praise C's style, to be clear, which is ridiculously obfuscating and feels like a pointless collection of awkward linguistic atrocities. But at least it's consistent about it.
- dust-jacket 13 hours ago
  > Part of the reason, may be that I am a much better Swift programmer, than PHP programmer
  Hard not to think that's a major part of it. IME you make loads more corrections in languages you're more opinionated about (and opinionated usually follows more experience & confidence).
  I correct AI Python all the time. When it cranks out TypeScript I just check it works.
  [-]
  - ChrisMarshallNY 12 hours ago
    Yes and no. I have been working with PHP for over 25 years, but I've never loved the language. It's always been a "necessary evil." I know it fairly well. I've only been writing Swift for half that time, but I like the language, and play with it more.
    I feel that the reason posited by another poster is more likely. There's a ton of mature, well-written, shipping, PHP out there, due to the open nature of most PHP, as opposed to Swift; where the more robust and mature implementation is likely behind proprietary walls. Most of the public Swift that I see, tends to be folks showing off its fancier features, in relatively small code samples.
tracker1 7 hours ago
For myself, I've generally set up a few boundaries... for JS projects, I tend to use Deno for tooling, even targeting npm lately. Similarly, I've favored modern TypeScript over JS. Often Hono + OpenAPI + Zod as a set for services.
I've also been doing quite a bit of Rust for web services and wasm targets, which has worked exceedingly well... similarly with Tokio + Axum, etc.
I have seen very few issues with either of the above... that said, C# has been a bit more painful by comparison... I often rely on FastEndpoints for services and Grate for database migrations, and LLMs often get a bit tangled with those libraries in practice.
eithed 5 hours ago
I think that this applies not only to languages, but to the general patterns that you use. Don't mix functional with OO. Don't mix repositories with DAOs. Don't mix MVC and MVVM. Code should be predictable in what it does and in how you expect other developers to write it. If you don't have that, then you shouldn't blame the LLM when it goes haywire and starts doing whatever.
Yokohiii 13 hours ago
> Languages and ecosystems with low variance in their training corpus are represented better and executed more reliably by coding agents.
So I think the author is saying that Go is a simple language that tends to have fewer solutions to the same problem. I personally agree with that to a degree.
What I don't agree on is that we can choose what "low variance" is. There is a lot of Go code out there; its shape may have little "noise", but the variance is massive.
zitterbewegung 14 hours ago
I haven’t had an issue using Python with LLMs where I have to decide “Should one use pip, poetry, or uv?” There is enough training data using pip, so just choose that: it is the most boring solution, and many of its commands map to uv, since uv has a superset of features. Not that Go is a bad solution; honestly, I would just say use what you know best.
citbl 13 hours ago
We are at the point now where we let the LLM dictate the language?
[-]
- badgersnake 13 hours ago
  Yeah, but then generally you pick the language your dev team are most familiar with.
  Or you hire a team of specialists for the language you want. Perhaps niche languages should have fine-tuned LLMs in the same way.
gregman1 3 hours ago
Haskell is I think a great language for llms - just make everything as pure as possible and you are golden.
sd9 13 hours ago
I wonder if the training data for some languages has higher quality code. I can imagine some niche languages having a higher standard than, for example Python, which surely has a bunch of random buggy scripts in the mix.
On the other hand, even if that were true, I don’t know how important it would actually be since LLMs can generalise across languages well.
It might be best to pick languages where it’s just harder to screw up, the canonical example being to prefer typescript over JavaScript.
Havoc 14 hours ago
> From a model’s standpoint, there are simply too many ways to write any of this
They seem quite good at figuring this out in my experience
wewewedxfgdf 14 hours ago
Has Go become a "boring language"?
[-]
- pjmlp 14 hours ago
  It has been boring from the start.
  It would be an interesting language, had it been released at the time of any of its influences, Oberon in 1987, Limbo in 1995.
  Back when the type system ideas from CLU, Standard ML, Cedar were still taking off among industrial programming languages.
- ncruces 14 hours ago
  The most non boring thing you can say about it is that it's terrible for ignoring most of what we've learned in the past couple decades of programming language design.
  That generates plenty of excitement.
- dist-epoch 14 hours ago
  Become? Always has been.
  It was intentionally designed for programmers with limited skill.
  Go language creator Rob Pike:
  > The key point here is our programmers are Googlers, they’re not researchers. They’re typically, fairly young, fresh out of school, probably learned Java, maybe learned C or C++, probably learned Python. They’re not capable of understanding a brilliant language but we want to use them to build good software. So, the language that we give them has to be easy for them to understand and easy to adopt.
  [-]
  - p2detar 11 hours ago
    > It was intentionally designed for programmers with limited skill.
    No. That is not true. It was designed as a language so programmers of all levels can be productive at the scale of what Google does and across possibly many different teams, no matter your prior background. Google does a lot at scale and a language that is easy to pick up and handles concurrency seamlessly is definitely a helpful tool.
    [-]
    - pjmlp 11 hours ago
      Nope, it was designed by one Oberon, and two UNIX heads, disgruntled to be faced with C++ at their work, they happened to be working at Google, and got support from their managers for developing it further.
      Thanks to Docker pivoting from Python into Go, and Kubernetes from Java into Go, while it was still pre-1.0, it managed to take off, and has more users outside Google than at Google itself, where Java, Kotlin, C++, Python still dominate most projects outside Kubernetes ecosystem.
      There is a certain irony that Google would need a language like Go, given their hiring process.
      [-]
      - anthk 10 hours ago
        Go is just the condensed Plan9 C philosophy (and Limbo/Alef) for legacy Unix users. If we had ditched Unix in the '90s, with Inferno and Plan9 under libre licenses, GNU would be running Emacs under an Inferno kernel.
        Also, proper namespaces from the start, Unicode, 9p even under Emacs and who knows what. Oh, and for sure far fewer exploits, and no Kubernetes or Docker nonsense. Half of the VCs would be bankrupt today because damn namespaces would handle 90% of today's backends seamlessly.
        And maybe we would be using some Inferno based smartphone with custom UI's and programming in Limbo for that. Oh, and batteries lasting a week for sure.
        [-]
        - pjmlp 8 hours ago
          Except that it throws away many nice things about Inferno and Limbo, and also proves the point that technology needs good sponsors to make it out into the market.
  - antonvs 13 hours ago
    > They’re not capable of understanding a brilliant language
    What a coincidence, since Rob Pike wasn't capable of designing a brilliant language.
  - anthk 10 hours ago
    Not that true; it was designed as the C successor for Unix. The Go counterpart for the Post-Unix OS would be Inferno and Limbo. 9front has a C and compiler suite (among CSP) which is like the link between their Unix/C and modern Go. Literally, as the first Go releases used the Plan9 C compilers.
  - boxed 13 hours ago
    Such a weird take from him since Go has a bunch of pitfalls that are not needed at all.
perlgeek 12 hours ago
Try to apply first principles to LLM coding:
* Chances are that fewer people (maybe even none) will look at the code when it's LLM-generated
* Amount of code being written isn't all that critical anymore
* Keeping patches small isn't that big of a deal anymore (because it's now the LLM's job to maintain it, not the human's)
All of this implies: boilerplate isn't a good reason to avoid a language anymore. (I hate this conclusion, because I hate boilerplate).
Then the question is: what kind of language can you use that buys safety with boilerplate? Probably a statically typed one, possibly with lots of asserts... Eiffel? I don't know if there's enough Eiffel code around the Internet to train LLMs, so maybe a more popular one would be better.
Maybe Java or C#? Haskell? OCaml?
The article suggests golang, and I think there are use cases where golang would be a good candidate.
It would be quite interesting to run an experiment: give separate instances of the same LLM coding agent the task to implement a specific application, and use different languages. Then compare quality, code size, runtime performance and token cost. Ideal would be a multi-stage development that better simulates a real development workflow (bug reports and new feature requests come in over time).
dionian 3 hours ago
Large codebases are much easier to manage with type safety. Not a fan of Go but definitely much better than python in this regard.
snikeris 12 hours ago
I was a little surprised to find when I gave an LLM REPL access to the running program, it readily started using it during development and debugging.
xnx 12 hours ago
Also use boring languages without LLMs
tern 12 hours ago
Rust, Elixir, and Go are the way to go for LLMs in my testing and experience, for this and other reasons
[-]
- mohamedkoubaa 12 hours ago
  I have no professional experience in Rust and over 10 years in C++, but to me the decision to use Rust in a greenfield project written by agents was obvious.
- cpursley 12 hours ago
  Here are some of the reasons it’s so good with Elixir:
  https://dashbit.co/blog/why-elixir-best-language-for-ai
dogleash 3 hours ago
>Languages and ecosystems with low variance in their training corpus are represented better and executed more reliably by coding agents.
Just narrow your window of thought to easier problems for the LLM, and all of a sudden the LLMs do everything you want!
Reminds me of playing around with image generation models. Someone who's been practicing can crank out prompts for really impressive images back to back. But you try to use an everyday object or concept the model isn't trained on? Everybody will race to show off how smart they are by saying "just don't hold it like that."
shantnutiwari 12 hours ago
What??
Python __is__ a boring language (it is mature and well supported) with a somewhat convoluted package manager that has gotten a lot better since that xkcd came out.
Yeah, I get it, Go is better for distributing your code-- just one binary you can copy. But what does that have to do with "boring"?