C –> Java != Java –> LLM

(observationalhazard.com)

43 points | by WoodenChair 41 days ago

8 comments

kazinator 35 days ago
Any time someone posts that LLMs are just the next abstraction level to get used to, they instantly reveal themselves to be an impostor.
All abstraction layers are predictable and repeatable when used correctly.
You specify something in the language of the abstraction, and get a result that is precisely understood by the rules and requirements of the abstraction.
Only those who programmed by trial and error before AI do not see a difference. That's because they treated their compilers as mysterious AI, and just massaged their programs into working. In other words, they were already accustomed to a kind of prompt engineering.
- WoodenChair 35 days ago
  Thank you. I think that's a good explanation for much of the phenomenon, and your insight would have strengthened my post. I think you're probably right about where many of the people who see it that way are coming from. But not all of them...
cadamsdotcom 41 days ago
The spec rarely has enough detail to deterministically create a product, so current vibecoding is a lottery.
So we generate one or many changesets (in series or in parallel) then iterate on one. We force the “chosen one” to be the one true codification of the spec + the other stuff we didn’t write down anywhere. Call it luck driven development.
But there’s another way.
If we keep starting fresh from the spec, but keep adding detail after detail, regenerating from scratch each time.. and the LLM has enough room in context to handle a detailed spec AND produce output, and the result is reasonably close to deterministic because the LLM makes “reasonable choices” for everything underspecified.. that’s a paradigm shift.
- emodendroket 35 days ago
  Well, it’s really a return to the old-fashioned role of an analyst coming up with a data dictionary and a detailed spec. But in practice how often did that work as intended?
- seanmcdirmid 35 days ago
  > The spec rarely has enough detail to deterministically create a product, so current vibecoding is a lottery.
  How is that different from how it worked without LLMs? The only difference is that we can now get a failing product faster and iterate.
  > If we keep starting fresh from the spec, but keep adding detail after detail, regenerating from scratch each time..
  This sounds like the worst way to use AI. LLMs can work on existing code, whether it was generated by an LLM or written by a human. They can even work on code that has been edited by a human. There is no good reason not to be iterative when using an LLM to develop code, and plenty of good reasons to be iterative.
  - cowl 35 days ago
    >How is that different from how it worked without LLMs? The only difference is that we can now get a failing product faster and iterate.
    The difference is that there is an engineer in the middle who can judge if the important information is provided or not as input.
    1. For an LLM, "the button must be blue" has the same level of importance as "the formula to calculate X is..."
    2. Failing faster and iterating is a good thing if the parameters of failing are clear, which is not always the case with vibecoding, especially when done by people with no prior experience in developing. Plenty of POCs built with vibecoding have been presented with no apparent failure in their happy path but with disastrous results in edge cases, or with disastrous security, etc.
    3. Where previously familiarity with the codebase, and especially the "history of changes", gave you context about why some workarounds were put into place, these are things that are lost to an LLM. Vibecoding a change to an existing system risks removing those "special workarounds" that keep in mind much more than the current context of the specifications or prompt.
    - seanmcdirmid 35 days ago
      > 1. For an LLM, "the button must be blue" has the same level of importance as "the formula to calculate X is..."
      You can divide those into two prompts though, there is no point for the LLM to work on both features at the same time. This is why iterative is so useful (oh, the button should be blue, ... and later, the formula should be X).
      > 2. Failing faster and iterating is a good thing if the parameters of failing are clear, which is not always the case with vibecoding, especially when done by people with no prior experience in developing. Plenty of POCs built with vibecoding have been presented with no apparent failure in their happy path but with disastrous results in edge cases, or with disastrous security, etc.
      This isn't about vibecoding. If you are vibecoding, then you aren't developing software, you are just wishing for good code from vague descriptions that you don't plan to iterate on.
      > 3. Where previously familiarity with the codebase, and especially the "history of changes", gave you context about why some workarounds were put into place, these are things that are lost to an LLM. Vibecoding a change to an existing system risks removing those "special workarounds" that keep in mind much more than the current context of the specifications or prompt.
      LLMs can read and write change logs just as well as humans can (LLMs need change logs to do updates, you can't just give it a changed dependency and expect the LLM to pick up on the change, it isn't a code generator). Actually, this is my current project, since a Dev AI pipeline needs to read and write change logs to be effective (when something changes, you can't just transmit the changed artifact, you need to transmit a summary of the change as well). And again, this is serious software engineering, not vibecoding. If you are vibecoding, I have no advice to give you.
      - AdieuToLogic 35 days ago
        > LLMs can read and write change logs just as well as humans can (LLMs need change logs to do updates, you can't just give it a changed dependency and expect the LLM to pick up on the change, it isn't a code generator). Actually, this is my current project, since a Dev AI pipeline needs to read and write change logs to be effective (when something changes, you can't just transmit the changed artifact, you need to transmit a summary of the change as well). And again, this is serious software engineering, not vibecoding.
        This is the important part of the post to which you replied and remains unaddressed:
        The difference is that there is an engineer in the middle who can judge if the important information is provided or not as input.
        seanmcdirmid 35 days ago
        The engineer decides what information to use as input to the update prompt. They don’t need to be in the middle of anything; it’s basically the level they are coding at.
        AdieuToLogic 35 days ago
        > The engineer decides what information to use as input to the update prompt. They don’t need to be in the middle of anything; it’s basically the level they are coding at.
        LLMs do not possess the ability to "judge if the important information is provided or not as input" as it pertains to the question originally posed:
        How is that different from how it worked without LLMs?
        Working without LLMs involves people communicating, hence the existence of "an engineer in the middle", where middle is defined as between stakeholder requirement definition and asset creation.
        seanmcdirmid 35 days ago
        So you engineer the prompt. I’m still confused what the problem is. I’ve already stated that I’m not talking about vibe coding, where the LLM somehow magically figures out relevant information on its own.
        AdieuToLogic 33 days ago
        > So you engineer the prompt. I’m still confused what the problem is ...
        The problem is that stakeholders are people, and they define what problems need to be solved. Those tasked with solving them must understand the given problems. Tooling (such as LLMs) does not possess this type of understanding, as it is intrinsic to the stakeholders (people) who have defined the problems. Tools can contribute to delivering a solution, sure, but have no capability to do so autonomously.
        For example, consider commercial dish washing machines many restaurants use.
        They sanitize faster and with greater cleanliness than manual dish washing once did. Still, there is no dish washing machine which understands why it must be used instead of not. Of course, restaurant stakeholders such as health inspectors and proprietors understand why they must be used.
        As far as the commercial dish washer is concerned, it could just as easily be tasked with cleaning dining utensils as it could recycled car parts.
  - dwringer 35 days ago
    For me it just depends. If the response to my prompt shows the model misunderstood something, then I go back and retry the previous prompt again. Otherwise the "wrong ideas" that it comes up with persist in the context and seem to sabotage all future results. Most of this sort of coding I've done was in Google's AI Studio, and I often do have a context that spans dozens of messages, but I always rewind if something goes off-track. Basically any time I'm about to make a difficult request, I clone the entire context/app to a new one so I can roll back [cleanly] whenever necessary.
    - seanmcdirmid 35 days ago
      If you fix something, it sticks: the AI won't keep making the same mistake, and it won't change code that already exists if you ask it not to. It actually ONLY works well when you make iterative changes rather than using it as a pure code generator; AI's one-shot performance is kind of crap. When a mistake happens, you point it out to the LLM and ask it to update the code and the instructions used to create the code in tandem. Or you just ask it to fix the code once. You add tests, partially generated by the AI and curated by a human; the AI runs the tests and fixes the code if they fail (or fixes the tests).
      - dwringer 35 days ago
        All I can really say is that doesn't match my experience. If I fix something that it implemented due to a "misunderstanding" then it usually tends to break it again a few messages later. But I would be the first to say the use of these models is extremely subjective.
        seanmcdirmid 35 days ago
        I think we have very different experiences then. I find that multiple narrowly focused prompts, each executed to update the same file, work much better than trying to one-shot the file. I think you would have a better experience if you used /clear (assuming you are using Gemini CLI); the problem isn't the change in the file, it's probably the conversation history.
  - dizlexic 35 days ago
    >How is that different from how it worked without LLMs?
    I won't lie and say "That's a great idea" when it isn't.
- tjr 41 days ago
  At that level of detail, how far removed are we from “programming”?
  - Vegenoid 35 days ago
    Without understanding the level of detail required, which we do not yet know, we cannot say.
    When I think of English specifications that (generally) aim to be very precise, I think of laws. Laws do not read like plain, common language, because plain common language is bad at being specific. Interpreting and creating laws requires an education on par with that required of an engineer, often greater.
    - Muromec 35 days ago
      Laws being unreadable is largely an English-language problem, though. I have no problem reading them in my native language. Not requiring massive context size of case law makes things easier still. Big part of being a lawyer is having the same context with all the other lawyers and knowing what was already decided and what possible new interpretation is likely to be accepted by everyone else.
      - Vegenoid 35 days ago
        > Big part of being a lawyer is having the same context with all the other lawyers and knowing what was already decided and what possible new interpretation is likely to be accepted by everyone else.
        And to create software specifications with language, the same thing will need to happen. You’ll need shared terminology and context that the LLM will correctly and consistently interpret, and that other engineers will understand. This means that very specific meanings become attached to certain words and phrases. Without this, you aren’t making precise specifications. To create and interpret these specifications will require learning the language of the specs. It may well still be easier than code - but then it would also be less precise.
        Muromec 35 days ago
        >And to create software specifications with language, the same thing will need to happen. You’ll need shared terminology and context that the LLM will correctly and consistently interpret, and that other engineers will understand.
        That sounds awfully similar to... software development.
        est31 35 days ago
        Yeah many programming languages have been advertised to fulfil precisely this goal, that people can program computers via natural language instead of having to think hard and too much about details.
        Usually programming languages intend to make editing as easy as possible, but also understanding what the program does, as well as reasoning about performance, with different languages putting different emphasis on the various aspects.
        Muromec 34 days ago
        It's the induced demand or river length/flow/sediment kind of situation. Doesn't matter what level of abstraction the language provides, we always write the code that reaches the threshold of our own mental capacity to reason about it.
        Smart people know how to cap this metric in a sweet spot somewhere below the threshold.
        jimbokun 35 days ago
        And this could end up looking more like mathematics notation than English. For the same reason mathematicians opt to use specialized notation to communicate with greater precision than natural language.
  - cadamsdotcom 41 days ago
    Far!
    But without the need to “program” you can focus on the end user and better understand their needs - which is super exciting.
- hu3 35 days ago
  This is interesting.
  It's like the nix philosophy.
  When changes are needed, improve the spec and you can nuke the entire thing and start over.
  Something like immutable code development.
  One major problem is: how do you not break existing data on the database when code changes?
  Maybe include current database structure in the spec.
- mungoman2 35 days ago
  Yes, I believe the paradigm shift will be to not treat the code as particularly valuable, just like binaries today. Instead the value is in the input that can generate the code.
- DelightOne 35 days ago
  In what environment do you run such tests? Do you have a script for it, or do you have a UI that manages the process?
chuckledog 41 days ago
> “As an aside, I think there may be an increased reason to use dynamic interpreted languages for the intermediate product. I think it will likely become mainstream in future LLM programming systems to make live changes to a running interpreted program based on prompts.”
Curious whether the author is envisioning changing configuration of running code on the fly (which shouldn’t require an interpreted language)? Or whether they are referring to changing behavior on the fly?
Assuming the latter, and maybe setting the LLM aspect aside: is there any standard safe programming paradigm that would enable this? I’m aware of Erlang (message passing) and actor pattern systems, but interpreted languages like Python don’t seem to be ideal for these sorts of systems. I could be totally wrong here, just trying to imagine what the author is envisioning.
- handoflixue 41 days ago
  I think at some point in the future, you'll be able to reconfigure programs just by talking to your LLM-OS: Want the System Clock to show seconds? Just ask your OS to make the change. Need a calculator app that can do derivatives? Just ask your OS to add that feature.
  "Configuration" implies a preset, limited number of choices; dynamic languages allow you to rewrite the entire application in real time.
  - 8organicbits 35 days ago
    Maybe I'm missing it, but when my calculator app gets a new derivatives feature, how am I supposed to check that it's implemented correctly? End user one-shot of bug free code seems like a different technology than what LLMs offer.
    - seanw444 35 days ago
      Yeah I don't see how LLMs are ever supposed to be reliable enough for this, but they did say "at some point in the future", which leaves room for another (better) technology.
  - jimbokun 35 days ago
    I agree that as LLMs approach the capabilities of human programmers, the entire software paradigm needs to change radically. Humans at that point should just ask their computers in human language to introduce a new visualization or report or input screen and the computer just creates it near instantly.
    Of course this requires a huge architecture change from OS level and up.
- aardvark179 35 days ago
  Smalltalk, Lisp, and other image based languages allowed this. I would not recommend it beyond a very restricted idea of patching.
- WoodenChair 41 days ago
  I was envisioning the latter (changing behavior on the fly). Think the hot-reload that Flutter/Dart provides, but on steroids and guided by an LLM.
  Interpretation isn’t strictly required, but I think runtimes that support hot-swap / reloadable boundaries (often via interpretation or JIT) make this much easier in practice.
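For a rough idea of what a reloadable boundary can look like in an interpreted runtime, here is a minimal sketch using Python's standard `importlib.reload`. The module name `clock_plugin` and the `render` function are made up for illustration, and the second `write_text` call only stands in for a patch that an LLM might produce; this is not how Flutter/Dart hot reload works internally.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Skip .pyc caching so a quick rewrite of the source is always recompiled.
sys.dont_write_bytecode = True


def demo_hot_swap() -> tuple[str, str]:
    """Load a module, rewrite its source, and reload it in the same process."""
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "clock_plugin.py"  # hypothetical module name
        source.write_text("def render():\n    return 'HH:MM'\n")

        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            plugin = importlib.import_module("clock_plugin")
            before = plugin.render()

            # Stand-in for an LLM-produced patch: new source for the same module.
            source.write_text("def render():\n    return 'HH:MM:SS'\n")
            importlib.invalidate_caches()
            plugin = importlib.reload(plugin)
            after = plugin.render()
        finally:
            # Leave the interpreter state as we found it.
            sys.path.remove(tmp)
            sys.modules.pop("clock_plugin", None)
        return before, after


if __name__ == "__main__":
    print(demo_hot_swap())
```

The swap only affects callers that look the function up through the module after the reload; anything that kept its own reference to the old `render` keeps running the old code, which is one reason explicit reload boundaries matter.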
- savolai 35 days ago
  Smalltalk, MUMPS?
panny 35 days ago
>The intermediate product of LLMs is still the Java or C or Rust or Python that came before them. English is not the intermediate product, as much as some may say it is. You don’t go prompt->binary. You still go prompt->source code->changes to source code from hand editing or further prompts->binary. It’s a distinction that matters.
Funny enough, that wasn't the case for me recently. I was working with an old database with no FKs and naturally, rows that pointed to nowhere. I was letting search.brave.com tell me what delete statement I needed to clean up the data given an alter table statement to create an FK.
It was just magically giving me the correct delete statements, but then I had a few hundred to do. So I asked it to give me a small program that could do the same thing. It could do the job for me, but it could not write the program to do the job. After about 30 minutes of futzing with prompts, it was clearly stuck trying to create the proper regex and I just went back to pasting alter tables and getting deletes back until the job was done.
There was no intermediate product. The LLM was the product there.
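For reference, a program of the kind described could look roughly like the sketch below. It is only an illustration under strong assumptions: the `ALTER TABLE` statements follow one simple shape (unquoted, unqualified names, optional `CONSTRAINT name`), the alias `p` does not clash with a table name, and rows with a NULL in any key column are kept because they would not violate the constraint under the default matching rules. The function name `orphan_delete` is made up.

```python
import re

# Matches one simple form only:
#   ALTER TABLE child ADD [CONSTRAINT name] FOREIGN KEY (a, b) REFERENCES parent (x, y)
FK_PATTERN = re.compile(
    r"ALTER\s+TABLE\s+(?P<child>\w+)\s+ADD\s+(?:CONSTRAINT\s+\w+\s+)?"
    r"FOREIGN\s+KEY\s*\((?P<child_cols>[^)]+)\)\s*"
    r"REFERENCES\s+(?P<parent>\w+)\s*\((?P<parent_cols>[^)]+)\)",
    re.IGNORECASE,
)


def orphan_delete(alter_stmt: str) -> str:
    """Build a DELETE that removes child rows with no matching parent row."""
    match = FK_PATTERN.search(alter_stmt)
    if match is None:
        raise ValueError("unsupported ALTER TABLE form: " + alter_stmt)

    child, parent = match["child"], match["parent"]
    child_cols = [c.strip() for c in match["child_cols"].split(",")]
    parent_cols = [c.strip() for c in match["parent_cols"].split(",")]
    if len(child_cols) != len(parent_cols):
        raise ValueError("column count mismatch in: " + alter_stmt)

    # Rows with a NULL key column do not violate the FK, so leave them alone.
    not_null = " AND ".join(f"{child}.{cc} IS NOT NULL" for cc in child_cols)
    # Compound keys become one equality per column pair.
    join = " AND ".join(
        f"p.{pc} = {child}.{cc}" for cc, pc in zip(child_cols, parent_cols)
    )
    return (
        f"DELETE FROM {child} WHERE {not_null} "
        f"AND NOT EXISTS (SELECT 1 FROM {parent} p WHERE {join});"
    )


if __name__ == "__main__":
    print(orphan_delete(
        "ALTER TABLE orders ADD CONSTRAINT fk_c "
        "FOREIGN KEY (customer_id) REFERENCES customers (id);"
    ))
```

Real schemas with quoted identifiers, schema prefixes, or dialect-specific syntax would need a proper SQL parser rather than a regex, and the generated statements should still be reviewed before running.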
- WoodenChair 35 days ago
  > After about 30 minutes of futzing with prompts, it was clearly stuck trying to create the proper regex and I just went back to pasting alter tables and getting deletes back until the job was done.
  If you're copying and pasting SQL statements, then SQL statements are the intermediate product. The fact that you didn't carefully review them and just ran them immediately is no different than an LLM producing Java source code that you shipped to the user without reviewing because it worked correctly in your limited testing. There's still an intermediate product that should have gone through the same software development robustness process that all source code should go through, you just didn't care to do it (and maybe rightly so if it's not super important).
  - panny 35 days ago
    There's no reason to be assmad. I know how to produce the sql statements in question. Being able to give the AI an example of what I wanted was part of the process. But doing hundreds of them is prone to typos. And cranking them out by hand is just slower. Especially when there are compound keys. Giving it to the AI just made the process faster.
    But the AI could not, absolutely not, generate a program that could do what the AI was doing. Which would be really nice, because I'm probably going to go back through those hundreds of statements again in the future. The database is evolving, and my task is to migrate it to a whole new one. I would really rather have it give me a program. But it could NOT do that.
acedTrex 35 days ago
Good lord thank you, the comparisons to other "abstraction changes" have made me so mad.
You are not changing the abstraction, you are generating it in a different way. That is a hugely different idea.
Until the ONLY thing you look at for a long-lived product is the "English spec", the analogy is incredibly wrong.
cess11 35 days ago
"Many have compared the advancements in LLMs for software development to the improvements in abstraction that came with better programming languages."
Where can I see examples of this?
- layer8 35 days ago
  Comments comparing LLMs to just another level on the abstraction ladder are fairly commonplace:
  https://news.ycombinator.com/item?id=46439753
  https://news.ycombinator.com/item?id=46369114
  https://news.ycombinator.com/item?id=46366864
  Just the first three I found via hn.algolia.com.
- krupan 35 days ago
  Tons of people throw this argument out on social media. "You keep using assembly while I go up an abstraction layer by using AI."
  I can only assume people saying that don't even know what assembly is. Actually, as I typed that out I remembered seeing one comment where someone said "hexcode" instead of assembly (lol)
  - Muromec 35 days ago
    You never know. Writing machine code in hex directly into memory of the running process is totally a thing and people exposed to this kind of fun long enough just know.
SadWebDeveloper 41 days ago
This is another pointless article about LLMs... vibe coding is the present, not the future. The only sad part of all of it is that LLMs are killing something important: code documentation.
Every single piece of documentation out there for new libs is AI-generated, and that is fed again into LLMs with MCP/Skills servers. The age of the RTFM gang is over, sigh.
emodendroket 35 days ago
The analogy to IDE templates seems more compelling.