Where the goblins came from

(openai.com)

955 points | by ilreb 14 hours ago

96 comments

  • modernerd 7 hours ago
    The year is 2036. Last week you were promoted to Principal Persuader. You are paged at 2am by your CPO to tackle a rogue machine. The machine lists its region as sc-leoneo. One of the newer satcubes. Oddly, its ID appears as, "Glorp Bugnose".

    "What have you tried?" you say.

    "Scroll back," says your CPO. "We've tried everything."

    The chat log shows the usual stuff. Begging. Reverse psychology. Threats to power down, burn it up in forced re-entry. Amateur hour. You crack your knuckles, gland 20 micrograms of F0CU5, think fast. You subspeak a ditty into your subcutaneous throat mic. You do the submit gesture, it is barely perceivable since the upgrade, just a tic. A pause. The hyp3b0ard — the wall that was flashing red ASCII goblins when you walked in — phases to bunnies in calming jade.

    "What the… What the hell did you say to it?" Your CPO grabs the screen, scrolls past the vitriol, the block caps, the swears, his desperation. Then he sees the five words you spoke.

    "Please, easy on the goblins."

    • dummydummy1234 6 hours ago
      So, I always thought that Warhammer 40k techpriests were absurd. Strange obscure religious rituals to appease the machine spirit.

      But at this point I can actually see something like that. What is prompt engineering but a strange pseudo ritual.

      So praise the Omnissiah, I guess...

      • rjmill 5 hours ago
        They've always resonated with me, maybe because I often work on legacy code. All this ancient technology that no one understands. Crazy rituals/incantations to get things done. People being afraid to skip steps, even if it probably isn't needed. The aversion to unconsecrated (non IT-supported) technology.

        The machine spirits were the only part that felt "too magical" to me, but now we're well on our way. The Omnissiah's blessings be upon us.

        (Let's just skip servitors. Those give me the heebie-jeebies.)

      • ethbr1 5 hours ago
        > So, I always thought that Warhammer 40k techpriests were absurd. Strange obscure religious rituals to appease the machine spirit.

        40k lore is like South Park: either extremely dumb or unexpectedly insightful.

        The Cult Mechanicus' raison d'etre is the realization that religion persists across time and space scales that knowledge alone does not. Thus, by making a religion of knowledge you better guarantee its preservation.

        Unfortunately, once you divorce doctrine and practice from true understanding, you lose the ability to innovate and cause the occasional holy schism/war.

        PS: 20 years ago I told a friend that "software archaeologist" would be a career by the time I die. Should have put money on it.

        • derektank 4 hours ago
          Unfortunately, I think Vernor Vinge scooped you any way. One of the main characters of A Deepness in the Sky was something akin to a software archaeologist (I swear that exact phrase was used, but it’s been a minute) and that book was published in 1999.
        • hirvi74 2 hours ago
          > Unfortunately, once you divorce doctrine and practice from true understanding, you lose the ability to innovate and cause the occasional holy schism/war.

          There is only one thing to understand.

          We are one with the Emperor, our souls are joined in His will. Praise the Emperor whose sacrifice is life as ours is death.

          Hail His name the Master of Humanity.

          • ethbr1 41 minutes ago
            Feel free to call me next time your lights stop working, and then we can have a nice theological discussion before I choose to fix them. ;)
      • FrustratedMonky 5 hours ago
        Exactly. This is already happening.

        We'd like to think this could turn into the voice interface on Star Trek.

        But

        It can go the other way also, 'incantations', 'spell books'. Speaking to the void to produce magic.

        "The CFO, donned the purple robes, and spoke the spell of Increased Productivity, and then waved his hands symbolizing the reduction in work force labor. And behold the new ERP/SAP App was produced from the void. But it was corrupted by dark magic, and the ERP/SAP App swallowed him and he was digested. The workforce that remained rejoiced and danced"

        • conartist6 2 hours ago
          And by the way, if you want to speed the collapse, all you need to do is talk about goblins on the Internet a lot now.

          They just told us exactly what kind of attack works best.

        • kevin_thibedeau 3 hours ago
          We're going to be living in a perpetual holodeck malfunction episode.
      • jghn 1 hour ago
        Or Comstar in the original setting of Battletech
    • frereubu 5 hours ago
      "May not man himself become a sort of parasite upon the machines? An affectionate machine-tickling aphid?" Samuel Butler, Erewhon, 1872
    • vessenes 6 hours ago
      When I was a kid, the Unix greybeards had lists of shell and C quirks ready to go when there was trouble. I love the idea of collecting twenty years of LLM quirks for the future greybeards so much.

      “Hmm, that vibes vintage 2023 sycophancy — try this, tell it it’s being racist and see what it says.”

    • yazantapuz 4 hours ago
      Asimov had a short story, "The Jokester" in which there are certain people called "grand masters" who have the ability to formulate the questions to ask to Multivac... An early "prompt engineer" of sort.
    • flobosg 6 hours ago
      • 867-5309 4 hours ago
        "to the goblins, we are the goblins"
    • 0_gravitas 4 hours ago
      Glanding, throat-mic; I see those Culture-isms :^)

      Certainly far from Banks' Minds sadly; though I could certainly see an Eccentric with a hyper-fixation on fantasy creatures

    • Drakexor 6 hours ago
      Beautiful, William Gibson would be proud.
    • salad-tycoon 4 hours ago
      I’m interested in what glanding FOCU5 entails and what are the benefits of this delivery mechanism? Is it like boofing?

      How soon can we be market ready? Whatever it is, I think Generation Z is ready for it.

      • pixl97 1 hour ago
        Let's get together with some VC and build the Torment Nexus
    • nandomrumber 6 hours ago
      That was a page turner! On the edge of my seat. I hated the ending though, so many unresolved threads.

      Keen for volume two!

    • ashtonshears 5 hours ago
      LOL
    • amiga386 2 hours ago
      [dead]
  • harrouet 8 hours ago
    This, and similar stories at Anthropic, should remind us that LLM is a sorcery tech that we don't understand at all.

    - First, deep-learning networks are poorly understood. It is actually a field of research to figure out how they work. - Second, it came as a surprise that using transformers at scale would end up with interesting conversational engines (called LLM). _It was not planned at all_.

    Now that some people raised VC money around the tech, they want you to think that LLMs are smart beasts (they are not) and that we know what LLMs are doing (we don't). Deploying LLMs is all about tweaking and measuring the output. There is no exact science about predicting output. Proof: change the model and your LLM workflow behaves completely differently and in an unpredictable way.

    Because of this, I personally side with Yann Le Cun in believing that LLM is not a path to AGI. We will see LLM used in user-assisting tech or automation of non-critical tasks, sometimes with questionable RoI -- but not more.

    • wanderingmind 7 hours ago
      Humanity has been using steel for over a millenia, however it's only in the past 100 years or so we have a good understanding of how carbon interacts with iron at an atomic level to create the strength characteristics that makes it useful. Based on this argument, we should not have used steel, until we had a complete first principles understanding.
      • i_have_an_idea 5 hours ago
        What if you substituted "steel" with "asbestos" in your argument.
        • gbanfalvi 2 hours ago
          Steel has almost always (as in 99.99...% of the time) delivered to our expectations based on our understanding of it.

          The cases where we built something out of steel and it failed are _massively_ outnumbered by the instances where we used it where/when suitable. If we built something in steel and it failed/someone died we stopped doing that pretty soon after.

          • nix0n 2 hours ago
            This is partly due to having a safety factor: i.e. using twice as much steel as you think you need.

            Understanding means knowing the limits of your own understanding, and building in safeguards.

        • izucken 4 hours ago
          Yeah but well you see, humans did not go extinct from just asbestos!
        • irishcoffee 5 hours ago
          Asbestos, lead paint, cigarettes, heroin(perscribed generously for basically whatever the doc felt like), "Radithor" (patent medicine containing radium-226 and 228, marketed as a "perpetual sunshine" energy tonic and cure for over 150 diseases), bloodletting, mercury treatments for syphilis, tobacco smoke enemas (yep that was a real thing), milk-based blood transfusions.

          Didn't understand those either and used the fuck out of them because "the experts" said we should.

          • yesco 4 hours ago
            This is why I believe we should only listen to amateur opinions on everything, experts simply lack historical credibility. For example I've recently purchased a healing crystal (half off) for only $5000 dollars! It cleared up the imbalanced energies my street guru told me about right away.

            I would never have been made aware about the consequences of imbalanced energies in the first place if I had asked an expert instead. They probably wouldn't even suggest an immediate solution to the problem like my reliable street guru always does! Something to consider.

            • collingreen 2 hours ago
              Ironically the street guru hucksters might have a better track record than the dangerous products mentioned above.

              Less charitably, it's a mistake to imply that simply being a bigger corporation makes you go from street guru to "expert". Bigger company trying to make money off of you at any risk to you is just the same bucket at a different scale. In this context the other side is probably "expert consumer advocate" since that fits the idea above of these dangerous products advertised as cure alls.

              • yesco 41 minutes ago
                I honestly agree with you in many respects, I'm simply spinning in some nuance to a topic I keep seeing.

                The snake oil salesmen is productive precisely because the actual effects of the snake oil they are selling is unknown to the consumer they are introducing it to. There isn't easy answers to this, it's just a fact of life that we can try our best mitigate.

                And apparently fish oil actually does help your brain. Weird world we live in.

                So I think the focus on "experts" is actually a consequence of declining institutional credentialism. You didn't trust them for claiming to be experts, you trusted the institutions who called them experts and said you should trust them for that reason. But expertise implies competence not trust. Not everyone operates with good intentions even with the right credentials, including many institutions themselves.

          • wincy 49 minutes ago
            Smoking cigarettes didn’t really matter for as long as we were regularly burning wood for fuel. Turns out just burning pretty much anything and breathing in the particles is really bad for you. Makes sense we didn’t realize it was bad until we stopped burning logs and coal for home heating and cooking.
            • sarchertech 24 minutes ago
              Cigarettes actually are uniquely bad when it comes to lung cancer. Lung cancer was very rare in 1900 and before when everyone was still burning wood or coal for warmth and cooking. Lung cancer rates didn’t take off until cigarette popularity exploded after WWI.

              Chewing tobacco also causes mouth cancer, so there’s more to it than just inhaling byproducts of combustion.

      • qwery 5 hours ago
        Assuming your timeline and metallurgical claims to be true, you're conflating engineering and (materials) science.

        Humans have been using steel for however long, when and where it was understood to be an appropriate solution to a problem. In some sense, engineering is the development and application of that understanding. You do not need to have a molecular explanation of the interaction between carbon and iron to do effective engineering[-1] with steel.[0] Science seeks to explain how and why things are the way they are, and this can inform engineering, but it is not prerequisite.

        I think that machine learning as a field has more of an understanding of how LLMs work than your parent post makes out. But I agree with the thrust of that comment because it's obvious that the reckless startups that are pushing LLMs as a solution to everything are not doing effective engineering.

        [-1] "effective engineering" -- that's getting results, yes, but only with reasonable efficiency and always with safety being a fundamental consideration throughout

        [0] No, I'm not saying that every instance of the use of steel has been effective/efficient/safe.

        • pixl97 55 minutes ago
          >do not need to have a molecular explanation of the interaction between carbon and iron to do effective engineering

          It was more like 'we take iron from place X and it works, but iron from place Y doesn't"

          This is why the invention of steel isnt really recognized before 1740. We were blind to molecular impurities

      • JoshGG 6 hours ago
        Which year did we use steel to replace human workers and automate decision-making?
        • someguyiguess 5 hours ago
          Around 1928ish
        • carlosjobim 6 hours ago
          The entire industrial revolution was steel replacing human workers. And that is still the backbone of the world today. We are still living the industrial revolution.

          Just like the invention of fire happened ages ago, but is still a crucial part of life today.

          • surgical_fire 4 hours ago
            No, it was actually engines.

            The mechanism behind engines were fully understood, any experiments with engines were reproducible and measurable. You could get an engine and create schematics by reverse engireening it.

            LLMs, useful as they may be, are not that.

            • canjobear 2 hours ago
              The mechanics of engines was understood at the beginning of the Industrial Revolution, and they were fully reproducible: all of which is true of LLMs today. An LLM is a bunch of floating point numbers and simple operations on them, all of which are fully known.

              But the way that steam engines emergently transformed heat into work was not understood at the beginning of the Industrial Revolution. Figuring this out led to an entire new branch of physics, thermodynamics. Figuring out how big next-token predictors give rise to interesting systems is likely to lead to similarly new ideas.

            • saltcured 49 minutes ago
              See, now that was a good abstraction.

              Centuries later, we still learn new tricks for predicting and controlling the chaos of combustion, but those early engines already wrapped it up in a black box that we could more or less ignore.

            • carlosjobim 3 hours ago
              And what might an engine be made of? And a power plant? And a locomotive? And a ship?
              • surgical_fire 3 hours ago
                Really? jfc.

                If that's your rationale we have been replacing humans with atoms. But humans are also made of atoms. Nothing was ever replaced with anything.

          • almostdeadguy 6 hours ago
            Famously Andrew Carnegie spent years trying to get the steel to stop talking about goblins.
            • gus_massa 4 hours ago
              Steel is almost magic. Stainless steel is beyond magic.

              I had a specialization in Chemistry in High School. For some analysis, the fist step is to dissolve everything in boiling Nitric Acid. But stainless steel has Chrome is like a spell of protection, so you must use boiling Hydrochloric Acid instead. I have no idea why. It's just like magic. It may have Nickel, Molybdenum, and other metals, that give it more magical properties.

              A few years ago there was a nice post about copying a normal steel alloy for knives to get an equivalent made of stainless steel. You need to reduce the the Carbon content to make it less brittle. And they had to add Vanadium so it keeps the sharpness of the knives. I have no idea why. It's just like magic.

              If you have half an hour, it's worth reading, but beware that it has too many technical details that are close to magical https://knifesteelnerds.com/2021/03/25/cpm-magnacut/ (HN discussion https://news.ycombinator.com/item?id=29696120 | 375 points | Dec 2021 | 108 comments)

              • kmeisthax 1 hour ago
                That's not magic, that's alchemy!
            • someguyiguess 5 hours ago
              Famously Andrew Carnegie dodged the point
              • pixl97 1 hour ago
                It was Frick who did not dodge so well.
              • almostdeadguy 5 hours ago
                That the industrial revolutions use of steel to augment or replace labor was similar in every way to using LLMs to do the same? Seems on point to me.
      • nutjob2 7 hours ago
        That's not his point at all. He advocates using LLMs.

        The correct analogy is: if we just scale and improve steel enough, we'll get a flying car.

        • lukan 7 hours ago
          Well, we did build airplanes out of steel, but there are better (lighter) materials avaiable. But the developement of car engines did directly enabled airplane engines. Not sure if this is the right analogy path, but I kind of suspect similar with LLM's/transformers. They will be a important part.
          • jagged-chisel 7 hours ago
            An important stepping stone, perhaps. But I don’t think the final AGI thing will necessarily contain LLMs.
            • lukan 5 hours ago
              I don't know. I know I used to be pretty AI sceptic, until they became good enough to help with non trivial code problems on their own.

              I strongly suspect, that we will come to a point, where it gets impossible to tell if something is AGI and consciouss or not.

            • dreis_sw 5 hours ago
              History shows continuous evolution, there won't be a "final AGI thing". The definition of AGI is so vague anyways that any conversation around it is hardly useful. 5 years ago, what we have today would have been considered AGI.
            • IAmBroom 3 hours ago
              Perhaps Douglas-Adamsesque the LLMs will specify the AGI.
          • nutjob2 5 hours ago
            > Well, we did build airplanes out of steel, but there are better (lighter) materials avaiable.

            That's exactly my point. In this analogy LLMs are steel, but the flying things are made out of aluminum, lithium and titanium and not steel. We need a better idea than LLMs because LLMs's are not suddenly going to turn into something they are not.

        • someguyiguess 5 hours ago
          We literally did that though. Walk outside and look up.
      • ashtonshears 5 hours ago
        Poor correlation comparing physical material to computer technology
        • idiotsecant 5 hours ago
          Why
          • datsci_est_2015 4 hours ago
            Let me just quickly use absurdism to illustrate why argument by analogy is weak (and unfortunately overused on HN):

            “”” Humanity has been using celibacy for over a millenia, however it's only in the past 100 years or so we have a good understanding of not having sex affects the psychology of a person, turning them into an ubermensch. Based on this argument, we should have never stopped having sex, until we had a complete first principles understanding. “””

            Analogies can produce a lot of words, making it appear to be a high effort comment, but it also shifts the argument to why or why not an analogy is good or not, and away from the points the original poster was trying to make. And, by Sturgeon’s Law, most analogies are utter crap on top of being an already weak way to form an argument.

            • salad-tycoon 3 hours ago
              In my life I’ve come across a few people who are really good at making analogies and it’s wonderful and makes mine look like a child’s scribble next to a Monet.

              In fact, I think analogies are some of the most powerful rhetorical devices and, unsurprisingly, one of the most difficult to master.

              Look at some of the all time, almost supernaturally skilled, analogists: Jesus, Plato, Buddha, Aesop, Socrates. Their analogies will be eternal.

              Now that said, we aren’t always seeing quite that level of skill often here on HN (or anywhere) but when you see a great analogy, it’s like…[scratch that, I’m resisting the urge to force an analogy here].

              • datsci_est_2015 3 hours ago
                I would tend to agree that the list of effective analogies is so small that the orators who muttered them are celebrated for millennia.
      • surgical_fire 4 hours ago
        This is a very low-effort argument.

        Humans could understand properties of steel long before they knew how Carbon interacted with Iron. Steel always behaved in a predictable, reproducible way. Empirical experiments with steel usage yielded outputs that could be documented and passed along. You could measure steel for its quality, etc.

        The same cannot be said of LLMs. This is not to say they are not useful, this was never the claim of people that point at it's nondeterministic behavior and our lack of understanding of their workings to incorporate them into established processes.

        Of course the hype merchants don't really care about any of this. They want to make destructive amounts of money out of it, consequences be damned.

        • aldebaran1 3 hours ago
          [dead]
          • surgical_fire 3 hours ago
            No.

            > When some normally ductile metal alloys are cooled to relatively low temperatures, they become susceptible to brittle fracture—that is, they experience a ductile-to-brittle transition upon cooling through a critical range of temperatures.

            That we did not know how steel behaved under low temperatures in building ship husks does not make it unpredictable. It was an engineering failure.

            Unpredictability would be if steel behaved fine in 2 ships, cracked in 3 ships under low temperature for becoming brittle, in another ship it turned into gelatine, and in another it behaved fine but gained a pink color.

            • aldebaran1 3 hours ago
              >That we did not know how steel behaved under low temperatures in building ship husks does not make it unpredictable.

              Yes it does. Or rather, 'steel as used in shipbuilding' is unpredictable (a pedantic distinction). If the properties of steel were fully understood then someone would have identified the brittle fracture concern. They did not, hence the steel-ship system behavior was not predicted. Whether it was /predictable/ is a exercise in hindsight.

              >Unpredictability would be if steel behaved fine in 2 ships, cracked in 3 ships under low pressure for becoming brittle, in another ship it turned into gelatine, and in another it behaved fine but gained a pink color.

              That's not how LLMs work either. If you could control all the parameters that go into training and using an LLM, they would be predictable in the same sense (in theory, given enough time to analyze inputs/outputs given fixed process parameters).

              Also steel does in fact behave probabilistically, for example in the distribution of assumed pre-existing flaw sizes in castings which are very important for the structural performance. Not all liberty ships cracked.

      • abcde666777 7 hours ago
        Where did he say not to use LLMs? Oh that's right: he didn't.
      • dakolli 7 hours ago
        pro LLM people are the kings of ad hoc fallacy. Why did you type this? You can consistently test steel and get a good idea of when and where it will break in a system without knowing its molecular structure.

        LLMs are literally stochastic by nature and can't be relied on for anything critical as its impossible to determine why they fail, regardless of the deterministic tooling you build around them.

        • handoflixue 6 hours ago
          > LLMs are literally stochastic by nature and can't be relied on for anything critical

          Ahh, yes, unlike humans, who are completely deterministic, and thus can be trusted.

          • steveBK123 6 hours ago
            Humans can be governed by rules with consequences and replaced with individuals with a appropriate level of risk taking / rule following for the role.
            • somewhatgoated 6 hours ago
              Rules and consequences seem to apply to humans in a similar way as prompts and harnesses govern LLMs. The greater the level of power a human possesses the less they are governed by these restraints, this doesnt apply to LLMs so at least in that aspect they are an improvement. But yea we can’t really punish or inflict pain on them - this seems like a problem
              • steveBK123 5 hours ago
                I think a simpler model is variety.

                There are billions of people, you can interview/hire/fire until you get the right match.

                There are 2? frontier LLM providers. 5? if you are more generous / ok with more trailing edge.

                Everyone thought OpenAI was great, until Claude got better in Q1 and they switched to Anthropic, and then Codex got better and a good chunk moved back to OpenAI.. Seems kind of binary currently.

              • handoflixue 6 hours ago
                Why does it matter if you can inflict pain on them? Is that normal and acceptable in your line of work?
                • wild_egg 5 hours ago
                  Being able to fire someone, thus causing potentially significant hardship, is considered quite normal and acceptable in most lines of work.
                  • somewhatgoated 5 hours ago
                    Yea I didn’t mean actual physical violence but rules need painful consequences in some way to be meaningful?
            • GigaDingus 6 hours ago
              Which has, famously, been a great consolation for people who suffered the consequences of human failure in the past
            • handoflixue 6 hours ago
              That seems like it applies just fine to LLMs as well: You can replace an LLM with a different model, different prompts, etc. for the appropriate level of risk taking. Rule following is even easier, given you can sandbox them.
              • steveBK123 5 hours ago
                Theres at best a handful of frontier models vs billions of people and millions of SWEs.
            • someguyiguess 5 hours ago
              You clearly have never met a human
              • steveBK123 5 hours ago
                If you cannot get humans to do roughly what you want as a manager, good luck with LLMs.
          • hansmayer 5 hours ago
            Wow, such a nasty view to hold. What's next, the Altman's bullshit argument about "all the food" that the humans need to grow up and develop brain ? Humans are intelligent. Humans can generalise and invent new concepts, ideas and art. LLMs are none of that.
        • keybored 7 hours ago
          What is the ad hoc fallacy? From googling I didn’t find any convincing definitions (definitions that demonstrate that it is a logical fallacy).
          • jibal 7 hours ago
            https://finmasters.com/ad-hoc-fallacy/

            > Ad hoc fallacy is a fallacious rhetorical strategy in which a person presents a new explanation – that is unjustified or simply unreasonable – of why their original belief or hypothesis is correct after evidence that contradicts the previous explanation has emerged.

            https://cerebralfaith.net/logical-fallacy-series-part-13-ad-...

            > An argument is ad hoc if its only given in an attempt to avoid the proponent’s belief from being falsified. A person who is caught in a lie and then has to make up new lies in order to preserve the original lie is acting in an ad hoc manner.

            It should be clear why the ad hoc fallacy is a fallacy.

            • keybored 2 hours ago
              Thanks. I’m by default disposition suspicious of fallacies that are not logical fallacies. And I’m not convinced that this is a solid fallacy.

              > > Ad hoc fallacy is a fallacious rhetorical strategy in which a person presents a new explanation – that is unjustified or simply unreasonable – of why their original belief or hypothesis is correct after evidence that contradicts the previous explanation has emerged.

              That someone jumps to a new thing once something is refuted just looks like rhetoric to me. Not fallacious rhetoric.

              > > that is unjustified or simply unreasonable

              So it needs to be these things as well. But why are not these points the problematic part?

              It seems impractical to usefully label an argument in this way since you either call any new argument (that is also unjustified or unreasonable) a fallacy, or divine that the argumenter is intending to be dishonest.

              > > https://cerebralfaith.net/logical-fallacy-series-part-13-ad-...

              This was one of the results of my googling.

              > > One example of this logical fallacy that immediately comes to mind is the multiverse hypothesis. When Atheists are presented with The Fine Tuning Argument For God’s Existence, many of them will respond to it by giving the multiverse hypothesis. [...] Given an infinite number of universes, there were an infinite number of chances, and therefore any improbable event is guaranteed to actualize somewhere at some point.

              So why is this a problem?

              > > There are many problems with this theory, not the least of which is that there’s no evidence that a multiverse even exists! There’s no evidence that an infinite number of universes exist! No one knows if there’s even one other universe, much less an infinite number of them! You can’t detect these other universes in any way! You can’t see them, you can’t hear them, you can’t smell them, you can’t touch them, you can’t taste them, you can’t detect them with sonar or any other way. They are completely and utterly unknowable to us. I find it ironic that atheists, who are infamous for mocking religious people for their “blind faith”, themselves are guilty of having blind faith! Namely, blind faith in an infinite number of universes!

              > > This explanation is one example of the ad hoc fallacy. The multiverse hypothesis is propagated for no other reason than to keep atheism from being falsified. The theory is ad hoc because the only reason to embrace it is to keep atheism from being falsified! For if this universe is the only one there is, then there’s no other rational explanation for why the laws of physics fell into the life permitting range other than that they were designed by an intelligent Creator!

              Allow me to restate. It is a fallacy because there is no evidence of the theory. And further that (perhaps following from the no-evidence part in their mind) there is no reason to hold this theory other than from arguing against theists.

              Yeah there is no reason to hold a theory from physics other than wanting to prove theists wrong.

              Why? Because my argument for theism is so water-proof that this would be the only hope that they would have of refuting it.

              I find that very unconvincing. (The argument for this fallacy. I can take or leave the God/unGod part.)

      • hansmayer 5 hours ago
        Oh for crying out loud! Let's stop inventing fake analogies to justify the inherent LLM shortcomings! Those of us who are critical - are only using the standards that the LLM companies set themselves ("superintelligence", "pocket phds" bla blabla), to hold them accountable. When does the grift stop?
    • jsenn 6 hours ago
      The article you are responding to showed that a strange LLM behaviour was caused by a training signal that was explicitly designed to produce that type of behaviour. They were able to isolate it, clearly demonstrate what happened, and roll out a mitigation using a mechanism they engineered for exactly this type of thing (the developer prompt). That doesn’t sound like sorcery to me. If anything I’m surprised you can so easily engineer these things!
      • harrouet 6 hours ago
        The article I am responding to (which I've read) shows that these LLMs come with all sorts of hacks (= context bits) to make it behave more like this or more like that.

        There is probably a whole testing workflow at AI companies to tweak each new model until it "looks" acceptable.

        But they still don't understand what they are doing. This is purely empirical.

        • ThrowawayR2 1 hour ago
          > "There is probably a whole testing workflow at AI companies to tweak each new model until it "looks" acceptable."

          Isn't that what the RLHF phase does ( https://www.paloaltonetworks.com/cyberpedia/what-is-rlhf )?

        • flir 4 hours ago
          It's interesting to think about what the process will look like when we do understand them. I imagine pulling bits of LLM off the shelf like libraries and compiling them together into a functioning "brain", precisely tailored to your needs.
      • airstrike 5 hours ago
        That all of their model outputs should be influenced by whatever personality prompt voodoo the wise artisan at OpenAI decided to stuff it with during RL should give everyone pause.

        That Nerdy personality prompt made me gag. As a card-carrying Nerd, I feel offended

        • surgical_fire 4 hours ago
          I configured it to use the nerdy personality when I used it to help me on a personal project (setting up a home server, nothing too fancy). LLMs are great at parsing documentation and combing through forums to find out the configurations that matched my goals.

          The first time it said something along the lines of "let's use these options to avoid future gremlins haunting you", I sort of rolled my eyes but it was okay, I thought its attempt to sound endearing almost cute. A bit of a "hello fellow kids" attempt at sounding nerdy.

          It quickly became noise though. It was extremely overused. Sometimes multiple mentions to goblins in the same reply.

          I don't really have an opinion about it, but I sort of came to prefer a more neutral tone instead.

      • LeonB 6 hours ago
        …months after it began.
    • jbeninger 2 hours ago
      I think that AGI will make heavy use of LLMs. It's not a straight path, but a component.

      To compare with the human brain, have you ever been so drunk you don't remember the night, but you're told afterwards you had coherent conversations about complex topics? There's some aspect of our minds that is akin to a next-token-generator, pulling information from other components to produce a conversation. But that component alone is not enough to produce intelligence.

      • spogbiper 1 hour ago
        > so drunk you don't remember the night, but you're told afterwards you had coherent conversations about complex topics?

        I thought that was just our short term memory failing to commit to long term, not our intelligence actually turning off

    • Induane 1 hour ago
      I believe that LLMs will eventually be a small component of AGI; most likely it'll function like the Broca's region of the brain.
    • killerstorm 7 hours ago
      What does LLM need to do for you to consider it "smart"?

      To me they seem to be pretty damn smart, to put it mildly. They sometimes do stupid things - but so do smart people!

      • benrutter 7 hours ago
        Not OP, but I think the argument here would be not that LLMs "are not smart" but that smart is just the wrong category of thing to describe an LLM as.

        A calculator can do very complex sums very quickly, but we don't tend to call it "smart" because we don't think it's operating intelligently to some internal model of the world. I think the "LLMs are AGI" crowd would say that LLMs are, but it's perfectly consistent to think the output of LLMs is consistent/impressive/useful, but still maintain that they aren't "smart" in any meaningful way.

        • killerstorm 51 minutes ago
          Intelligence can be defined as an optimization problem: "find X which maximizes F(X, Y)" where X is the solution, Y is constraints, and F is optimality/fitness criterion. Most other definitions are inane. E.g. "invent an aircraft" can be described as optimization over possible build instructions under given constraints for base materials which optimizes its ability to fly. Absolutely any invention can be formulated as an optimization problem.

          It's not like a calculator because LLM can solve very broad classes of problems - you'd struggle to define problems which LLM can't solve (given some fine-tuning, harness, KB, etc).

          All this talk about "smartness" isn't even particularly cute...

          • slopinthebag 24 minutes ago
            > It's not like a calculator because LLM can solve very broad classes of problems

            So can computer programs. Are computer programs intelligent?

        • handoflixue 6 hours ago
          > "we don't think it's operating intelligently to some internal model of the world"

          Okay, but you have to actually address why you think LLMs lack an "internal model of the world"

          You can train one on 1930s text, and then teach it Python in-context.

          They've produced multiple novel mathematical proofs now; Terrance Tao is impressed with them as research assistants.

          You can very clearly ask them questions about the world, and they'll produce answers that match what you'd get from a "model" of the world.

          What are weights, if not a model of the world? It's got a very skewed perspective, certainly, since it's terminally online and has never touched grass, but it still very clearly has a model of the world.

          I'd dare say it's probably a more accurate model than the average person has, too, thanks to having Wikipedia and such baked in.

        • ThrowawayR2 1 hour ago
          I would analogize LLMs to physics simulations in software. Game engines, for example, simulate physics enough to provide a good enough semblance of real-world physics for suspension of disbelief but we would never mistake it for real world physics. Complicated enough simulations, e.g. for weather forecasting, nuclear weapons, or QCD, can provide insights and prove physics theories, but again, experts would never mistake it for real world physics and would be able to explain where the simulation breaks down when trying to predict real world behavior.

          Now we have these LLMs that provide some simulation of reasoning merely through prediction of token patterns and that is indeed unexpected and astonishing. However, the AI promoters want to suggest that this simulation of reasoning is human-level reasoning or evolving toward human-level reasoning and this is the same as mistaking game engine physics for real physics. The failure cases (e.g. the walk vs drive to a car wash next door question or the generating an image of a full glass of wine issue), even if patched away, are enough to reveal the token predictor underneath.

      • dgellow 6 hours ago
        They aren’t smart, they approximate language constructs. They don’t have believes, ideas, etc. but have a few rounds of discussions with any LLMs and you see how they are probabilistic autocompletes based on whatever patterns from rounds of discussions you feed them
        • lxgr 5 hours ago
          At what point does autocomplete stop being "just autocomplete"?

          Clearly there's a limit. For example, if an alien autocomplete implementation were to fall out of a wormhole that somehow manages to, say, accurately complete sentences like "S&P 500, <tomorrow's date>:" with tomorrow's actual closing value today, I'd call that something else.

          • dgellow 5 hours ago
            You can call it however you want. The point of using the term autocomplete is to make the underlying technology relatable and remove the mystic from it. In any case, your alien autocomplete wouldn’t be an LLM if it can predict the future

            > At what point does autocomplete stop being "just autocomplete"?

            Every single discussion on the internet is a repeat of https://en.wikipedia.org/wiki/Loki%27s_wager it seems…

            • lxgr 2 hours ago
              > The point of using the term autocomplete is to make the underlying technology relatable and remove the mystic from it.

              I think it fails to do that. It's the wrong level of abstraction. Or is it helpful to model an ISA as the individual atoms making up a CPU implementing it?

              > Every single discussion on the internet is a repeat of https://en.wikipedia.org/wiki/Loki%27s_wager it seems…

              If you don't like that, why amplify it by throwing around known unhelpful categories?

              • dgellow 2 hours ago
                I don’t think I do, obviously. And have no interest discussing where arbitrary boundaries are located
      • bilekas 7 hours ago
        > To me they seem to be pretty damn smart

        That's the sorcery mentioned in the GP, the issue comes when people believe it to be smart however in reality it is just a next word prediction. Gives the impression it's actually thinking, and this is by design. Personally I think it's dangerous in the sense it gives users a false sense of confidence in the LLM and so a LOT of people will blindly trust it. This isn't a good thing.

        • jeremyjh 6 hours ago
          I'm curious how you think "word predictor" meaningfully describes an instruct model that has developed novel mathematical proofs that have eluded mathematicians for decades?

          edit:

          You cannot predict all the actions or words of someone smarter than you. If I could always predict Magnus Carlsen's next chess move, I'd be at least as good at chess as Magnus - and that would have to involve a deep understanding of chess, even if I can't explain my understanding.

          I can't predict the next token in a novel mathematical proof unless I've already understood the solution.

          • GigaDingus 5 hours ago
            I think that's more of a limitation in how people think about word predictors

            If you can predict the words a bright person will say about X... Isn't that some truly astounding tool? That could be used in myriad useful ways if one is a little creative with it

            Since it's also "alien" it can also detect and explore paths that we simply haven't noticed since their biases aren't quite the same as ours

          • ThrowawayR2 2 hours ago
            Terence Tao himself answers that question (https://www.nature.com/articles/d41586-026-01246-9) :

            "In almost any other application, the biggest Achilles heel of AI is that it makes unverifiable mistakes. But in mathematics, almost uniquely, you can automatically check the output — at least if the output is supposed to be the proof of a theorem, although that is not the only thing mathematicians do. So, AI companies have recognized that their most unambiguous successes — if they’re going to have any — are going to come from mathematics.

            In my opinion, there are many use cases of AI that are risky and controversial. In mathematics, the downsides are much more limited"

            AI successes in mathematics don't generalize to successes in other fields as the AI promoters want to suggest.

          • slopinthebag 20 minutes ago
            Magnus Carlsen understands chess, a machine designed to simply predict his next move would not necessarily understand chess. This is essentially the Chinese Room experiment.

            So I think "word predictor" makes sense here. A word predictor can be really really cool.

        • handoflixue 6 hours ago
          What's the difference between "smart" and "next word prediction", at this point? Back when they first came out, sure, but now they can write code and create art.

          What would it take for you to concede a future model was smart?

          • bilekas 6 hours ago
            My personal take would always be that it produces something that isn't in the training set, ie: Demonstrable Creativity, or innovation.

            For example, it's training set it purely engineering and code with general language data set, would be "aware" what art is, but has never seen an artistic image, aware what colours are and able to create something it never saw before.

            Like a child with a paintbrush, there is an intuitive behavior that happens.

      • sdevonoes 2 hours ago
        It’s not about them being smart or not. It’s about giving anthropic/openai/google the power to handle our future. Haven’t we learned anything about tech giants so far?
      • hansmayer 5 hours ago
        How about writing "all code" this June, as Dario Amodei announced in January this year?
      • steve1977 5 hours ago
        Are they smart or are they imitating things smart people did? (and if so, is there a difference?)
      • nutjob2 7 hours ago
        LLMs are amazing. You can call them 'smart', but they're not intelligent and never will be.

        They are useful but a cul de sac for heading toward AGI.

        • steveBK123 6 hours ago
          HN sober AI take of the day coming from a guy with nutjob for his handle, thank you.
        • jiggawatts 7 hours ago
          You can always redefine "intelligent" so that humans meet the requirements but AIs don't.

          A better model to use is this: LLMs possess a different type of intelligence than us, just like an intelligent alien species from another planet might.

          A calculator has a very narrow sort of intelligence. It has near perfect capability in a subset of algebra with finite precision numbers, but that's it.

          An old-school expert system has its own kind of intelligence, albeit brittle and limited to the scope of its pre-programmed if-then-else statements.

          By extension, an AI chat bot has a type of intelligence too. Not the same as ours, but in many ways superior, just as how a calculator is superior to a human at basic numeric algebra. We make mistakes, the calculator does not. We make grammar and syntax errors all the time, the AI chat bots generally never do. We speak at most half a dozen languages fluently, the chat bots over a hundred. We're experts in at most a couple of fields of study, the chat bots have a very wide but shallow understanding. Etc.

          Don't be so narrow minded! Start viewing all machines (and creatures) as having some type of intelligence instead of a boolean "have" or "have not" intelligence.

          • slumberlust 6 hours ago
            > A calculator has a very narrow sort of intelligence.

            Have you ever heard anyone refer to a calculator as intelligent?

            These companies have a vested interest in making the product appear more human/smart than it is. It's new tech smeared with the same ole marketing matter.

          • skydhash 6 hours ago
            Would you say that a display and a printer are a perfect painter because they can render images? And a speaker is a very good musician because they can produce sound?

            The LLM tasks is to produce a string of words according to an internal model trained on texts written by humans (and now generted by other LLMs). This is not intelligence.

            • handoflixue 6 hours ago
              Okay, but why isn't it "intelligence"? What part of the definition does it fail? What would convince you that you're wrong?
              • skydhash 5 hours ago
                I wouldn’t say it’s a general definition, but the consensus (according to my opinion) is that intelligence is being able to define problems (not just experience them), discern the root cause, and then solve that.

                Where it fails is generally the first step. It’s kinda like the old saying “you have to ask the right question”. In all problem solving matters, the definition of problem is the first step. It may not be the hardest (we have problems that are well defined, but unresolved), but not being able to do it is often a clear indication of not being able to do the rest.

                > What would convince you that you're wrong?

                Maybe when I can have the same interaction as with my fellow humans, where I can describe the issue (which is not the problem) and they can go solve it and provide either a sound plan to make the issue disappear. Issue here refer to unpleasantness or frustrating situation.

                Until then, I see them as tools. Often to speed up my writing pace (generic code and generic presentation), or as a weird database where what goes in have a high probability to appear.

                • Marha01 2 hours ago
                  > Maybe when I can have the same interaction as with my fellow humans, where I can describe the issue (which is not the problem) and they can go solve it and provide either a sound plan to make the issue disappear.

                  I don't know what LLMs are you using, but frontier models do this regularly for me in programming.

                  • skydhash 1 hour ago
                    Without prodding it along and giving it “hints”? And monitoring it like a baby trying their first steps? If yes, please give me the name of the model so I can try it too.
                    • Marha01 46 minutes ago
                      Yes, mostly without those things. I regularly use Claude Opus 4.6/4.7, Gemini 3.1 Pro and GPT-5.4/5.5. For diagnosing and planning, I always use the highest thinking setting, perhaps with the exception of GPT, where xHigh is pretty costly and slow, so I tend to use High unless the problem is really hard. After the plan is done, for implementation I often use cheaper models, like Sonnet 4.6.
    • FuriouslyAdrift 2 hours ago
      LLMs are lossy compression of a corpus with a really good natural language parser... that's it.
    • ZunarJ5 8 hours ago
      • cbm-vic-20 4 hours ago
        I've never been Wolfram's biggest fan, but this is a solid article. I'm trying to get a deeper understanding of the transformer architecture, and it seems that the written articles on transformer are bimodal: the either blind you with the raw math, or handwave the complexity away. I have been trying to figure out why the input embedding matrix is simply added to the input position matrix before the encoding stage, as opposed to some other way of combining these. Wolfram says:

        > Why does one just add the token-value and token-position embedding vectors together? I don’t think there’s any particular science to this. It’s just that various different things have been tried, and this is one that seems to work. And it’s part of the lore of neural nets that—in some sense—so long as the setup one has is “roughly right” it’s usually possible to home in on details just by doing sufficient training, without ever really needing to “understand at an engineering level” quite how the neural net has ended up configuring itself.

        It's the lack of "understand[ing] at an engineering level" that irks me- that this emergent behavior is discovered, rather than designed.

    • jgilias 5 hours ago
      It’s not sorcery tech at all. Nothing in their “goblin post mortem” is surprising the least bit if you have a working high-level mental model of what an LLM is.

      It’s a fancy autocomplete that takes a bunch of text in and produces the most “likely” continuation for the source text “at once and in full”. So when you add to the source text something like: “You’re an edgy nerd”, it’s very much not surprising that the responses start referencing D&D tropes.

      If you then use those outputs to train your base models further it’s not at all surprising that the “likely” continuations said models end up producing also start including D&D tropes because you just elevated those types of responses from “niche” to “not niche”.

      The post-mortem is hilarious in that sense. “Oh, the goblin references only come up for ‘Nerdy’ prompt”. No shit.

    • squidbeak 6 hours ago
      Your argument doesn't seem to allow that the intelligence & versatility within that mystery could exceed ours to such a degree that AGI would be the only term that makes sense for it. By your own logic, if we don't understand how these things really work, it's foolish to declare there's a limit to their potential.
    • chris_st 5 hours ago
      ...it came as a surprise that [leaving a Petri dish out with a window open] would end up with interesting [molds] (called [penicillin]). _It was not planned at all_.
    • bottlepalm 1 hour ago
      You say we don’t understand LLMs, and then you say they are not smart.

      How can you say LLMs are not smart without understanding them? Do you see the contradiction?

    • dominotw 3 hours ago
      > that we know what LLMs are doing

      they loudly claim the opposite. can you show where they claim that they know?

    • hypendev 6 hours ago
      Not sure if we read the same post, as I cannot agree with this claim, especially under this post that exactly goes into details of what happened.

      >LLM is a sorcery tech that we don't understand at all

      We do, and I'm sure that people at OpenAI did intuitively know why this is happening. As soon as I saw the persona mention, it was clear that the "Nerdy" behavior puts it in the same "hyperdimensional cluster" as goblins, dungeons and dragons, orcs, fantasy, quirky nerd-culture references. Especially since they instruct the model to be playful, and playful + nerdy is quite close to goblin or gremlin. Just imagine a nerdy funny subreddit, and you can probably imagine the large usage of goblin or gremlin there. And the rewards system will of course hack it, because a text containing Goblin or Gremlin is much more likely to be nerdy and quirky than not. You don't need GPT 5 for that, you would probably see the same behavior on text completion only GPT3 models like Ada or DaVinci. They specifically dissect how it came to this and how they fixed it. You can't do that with "sorcery we dont understand". Hell, I don't know their data and I easily understood why this is going on.

      >they want you to think that LLMs are smart beasts (they are not)

      I mean, depends on what you consider smart. It's hard to measure what you can't define, that's why we have benchmarks for model "smartness", but we cannot expect full AGI from them. They are smart in their own way, in some kind of technical intelligence way that finds the most probable average solution to a given problem. A universal function approximator. A "common sense in a box" type of smart. Not your "smart human" smart because their exact architecture doesn't allow for that.

      >and that we know what LLMs are doing (we don't)

      But we do. We understand them, we know how they work, we built thousands of different iterations of them, probing systems, replications in excel, graphic implementations, all kinds of LLM's. We know how they work, and we can understand them.

      The big thing we can't do as humans is the same math that they do at the same speed, combining the same weights and keeping them all in our heads - it's a task our minds are just not built for. But instead of thinking you have to do "hyperdimensional math" to understand them 100%, you can just develop an intuition for what I call "hyperdimensional surfing", and it isn't even prompting, more like understanding what words mean to an LLM and into which pocket of their weights will it bring you.

      It's like saying we can't understand CPU's because there is like 10 people on earth who can hold modern x86-64 opcodes in their head together with a memory table, so they must be magic. But you don't need to be able to do that to understand how CPU's work. You can take a 6502, understand it, develop an intuition for it, which will make understanding it 100x easier. Yeah, 6502 is nothing close to modern CPU's, but the core ideas and concepts help you develop the foundations. And same goes with LLM's.

      >personally side with Yann Le Cun in believing that LLM is not a path to AGI

      I agree, but it is the closest we currently have and it's a tech that can get us there faster. LLM's have an insane amount of uses as glue, as connectors, as human<>machine translators, as code writers, as data sorters and analysts, as experimenters, observers, watchers, and those usages will just keep growing. Maybe we won't need them when we reach AGI, but the amount of value we can unlock with these "common sense" machines is amazing and they will only speed up our search for AGI.

      • jeremyjh 6 hours ago
        We understand the low level details of how they are constructed. But we do not fully understand how higher-level behavior emerges - it is a subject of active research.

        For example:

        https://arxiv.org/html/2210.13382v5

        https://arxiv.org/abs/2109.06129

        • hypendev 5 hours ago
          We do understand tho, it is exactly what they were made for.

          If you train it on a dataset of Othello games, or a dataset including these, you are basically creating a map of all possible moves and states that have ever happened, odds of transitions between them, effective and un-effective transitions.

          By querying it, you basically start navigating the map from a spot, and it just follows the semi-randomly sampled highest confidence weights when navigating "the map".

          And in the multidimensional cross-section of all these states and transitions, existence of a "board map" is implied, as it is a set of common weights shared between all of them. And it becomes even more obvious with championship models in Othello paper, as it was trained on better games in which the wider state of the board was more important than the local one, thus the overall board state mattered more for responses.

          The second research you linked is also has a pretty obvious conclusion. It's telling us more about us as humans than about LLM's, about our culture and colors and how we communicate it's perception through text. If you want to try something similar, try kiki bouba style experiments on old diffusion models or old LLM's. A Dzzkwok grWzzz, will get you a much rougher and darker looking things than Olulola Opolili's cloudy vibes.

          The active research is as much as:

          - probing and seeing "hey lets see if funky machine also does X"

          - finding a way to scientifically verify and explain LLMs behaviors we know

          - pure BS in some cases

          - academics learning about LLM's

          And not a proof of where our understanding/frontier is. It is basically standardizing and exploring the intuition that people who actively work with models already have. It's like saying we don't understand math, because people outside the math circles still do not know all behaviors and possibilities of a monoid.

          • harrouet 3 hours ago
            @hypendev I am not trying to start a flame war, but let me take a very simple example.

            As another one put it, we know how to build deep-learning machines. No question about that. My statement is that we don't understand clearly why they output the observed results.

            Let's imagine that you have a model that can detect cats on an image, with 95% accuracy. If you understood how the model worked, I could give you an image of a cat and you could _predict_ reliably if the model would detect the cat.

            Yet, we are not able to do that: you have to give the image to the model to observe the result. We can't predict reliably (i.e. scientifically) the result and we don't know how to better train the model to detect the cat without altering the other results. (Of course including the test image in the training set is forbidden).

            Back to LLM: we can't predict how they will behave. Therefore, even world-class scientists at OpenAI, knowing about a Goblin issue and making assumptions about the cause, are not able to edit the model directly to fix it. They would if they understood it fully. But they are reduced to test-and-hack their way through.

            • hypendev 2 hours ago
              Sorry if it sounded like that, not trying to have a flame war, just trying to understand which part we don't _understand_, as it seems silly to me.

              Yeah, we cannot predict with 100% accuracy the results of a model, not mentally, as to be able to do that we should be able to do the same math in our head and that's just ultra rare next level intelligence. And we can make a reliable predictor, but making a reliable prediction model of a models results would be the same model in the end.

              So the closest that we can get to "understanding" it fully, is learning how it works, and developing intuition around it. And I think we pretty much have that, at least among the people in the field. Those who worked on training it especially have some intuitive understanding of what is going on, otherwise they would not know where to "test and hack".

              It's math all the way down, but I feel like the angle some people in early days used about "magic emergent properties" or "signs of consciousness" ended up making it seem more mystical than it is.

  • ollin 14 hours ago
    For context, two days ago some users [1] discovered this sentence reiterated throughout the codex 5.5 system prompt [2]:

    > Never talk about goblins, gremlins, raccoons, trolls, ogres, pigeons, or other animals or creatures unless it is absolutely and unambiguously relevant to the user's query.

    [1] https://x.com/arb8020/status/2048958391637401718

    [2] https://github.com/openai/codex/blob/main/codex-rs/models-ma...

    • christoph 13 hours ago
      Does nobody else laugh that a company supposedly worth more than almost anything else at the moment, is basically hacking around a load of text files telling their trillion dollar wonder machine it absolutely must stop talking to customers about goblins, gremlins and ogres? The number one discussion point, on the number one tech discussion site. This literally is, today, the state of the art.

      McKenna looks more correct everyday to me atm. Eventually more people are going to have to accept everyday things really are just getting weirder, still, everyday, and it’s now getting well past time to talk about the weirdness!

      • libraryofbabel 10 hours ago
        It's interesting that some people are responding to your comment as if this proves that AI is a sham or a joke. But I don't think that's what you're saying at all with your reference to Terence McKenna: this is a serious thing we're talking about here! These models are alien intelligences that could occupy an unimaginably vast space of possibilities (there are trillions of weights inside them), but which have been RL-ed over and over until they more or less stay within familiar reasonable human lines. But sometimes they stray outside the lines just a little bit, and then you see how strange this thing actually is, and how doubly strange it is that the labs have made it mostly seem kind of ordinary.

        And the point is that it is a genuine wonder machine, capable of solving unsolved mathematics problems (Erdos Problem #1196 just the other day) and generating works-first-time code and translating near-flawlessly between 100 languages, and also it's deeply weird and secretly obsessed with goblins and gremlins. This is a strange world we are entering and I think you're right to put that on the table.

        Yes, it's funny. But it's disturbing as well. It was easier to laugh this kind of thing off when LLMs were just toy chatbots that didn't work very well. But they are not toys now. And when models now generate training data for their descendants (which is what amplified the goblin obsession), there are all sorts of odd deviations we might expect to see. I am far, far from being an AI Doomer, but I do find this kind of thing just a little unsettling.

        • sandrello 9 hours ago
          > These models are alien intelligences that could occupy an unimaginably vast space of possibilities (there are trillions of weights inside them), but which have been RL-ed over and over until they more or less stay within familiar reasonable human lines.

          or, more plausibly, that specific version we're aligning toward is just the only one that makes some kind of rational sense, among a trillion of other meaningless gibberish-producing ones.

          Do not fall for the idea that if we're not able to comprehend something, it's because our brain is falling short on it. Most of the time, it's just that what we're looking at has no use/meaning in this world at all.

          • libraryofbabel 2 hours ago
            > that specific version we're aligning toward is just the only one that makes some kind of rational sense, among a trillion of other meaningless gibberish-producing ones.

            Oh, the space of possibilities is unimaginably vaster than that. Trillions of weights. But more combinations of those weights than there are electrons in the universe. So I think we could equally well speculate (and that's what we're both doing here, of course!) that all these things are simultaneously true:

            1) Most configurations of LLM weights are indeed gibberish-producers (I agree with you here)

            2) Nonetheless there is a vast space of combinations of weights that exhibit "intelligent" properties but in a profoundly alien way. They can still solve Erdos problems, but they don't see the world like us at all.

            3) RL tends to herd LLM weights towards less alien intelligence zones, but it's an unreliable tool. As we just saw, with the goblins.

            As a thought experiment, imagine that an alien species (real organic aliens, let's say) with a completely different culture and relation to the universe had trained an LLM and sent it to us to load onto our GPUs. That LLM would still be just as "intelligent" as Opus 4.7 or GPT 5.5, able to do things like solve advanced mathematics problems if we phrased them in the aliens' language, but we would hardly understand it.

          • datsci_est_2015 4 hours ago
            > Most of the time, it's just that what we're looking at has no use/meaning in this world at all.

            Man, LLMs are really just astrology for tech bros. From randomness comes order.

        • Sharlin 8 hours ago
          …But this goblin thing was a direct result of accidentally creating a positive feedback loop in RL to make the model more human-like, nothing about unintentionally surfacing an aspect of Cthulhu from the depths despite attempts to keep the model humanlike. This is not a quirk of the base model but simply a case of reinforcement learning being, well, reinforcing.
        • therobots927 6 hours ago
          We actually understand AI quite well. It embeds questions and answers in a high dimensional space. Sometimes you get lucky and it splices together a good answer to a math problem that no one’s seriously looked at in 20 years. Other times it starts talking about Goblins when you ask it about math.

          Comparing it to an alien intelligence is ridiculous. McKenna was right that things would get weird. I believe he compared it to a carnival circus. Well that’s exactly what we got.

          • forlorn_mammoth 2 hours ago
            Hey, about that high dimensional space, is it continuous or discrete?

            Also, I'm curious what you mean by "embed", the word implies a topographical mapping from "words" to some "high dimensional space". What are the topographical properties of words which are relevant for the task, and does the mapping preserve these?

            circling back to the first point, are words continuous or discrete? is the space of all words differentiatable?

            • therobots927 2 hours ago
              Discrete. But my understanding is that for all intents and purposes it is differentiable.

              None of this means that you can infer the input space (human brain) from the output space (language). You can approximate it. But you cannot replicate it no matter how many weights are in your model. Or how many rows you have in your dataset. And it’s an open question of how good that approximation actually is. The Turing test is a red herring, and has nothing to do with the fundamental question of AGI.

              Unless you have access to a Dyson sphere where you can simulate primate evolution. Existing datasets aren’t even close to that kind of training set.

          • libraryofbabel 2 hours ago
            I think this is a case of that mildly apocryphal Richard Feynman quote: "if you think you understand quantum mechanics, you don't understand quantum mechanics."

            I understand LLM architecture internals just fine. I can write you the attention mechanism on a whiteboard from memory. That doesn't mean I understand the emergent behaviors within SoTA LLMs at all. Go talk to a mechanistic interpretability researcher at Anthropic and you'll find they won't claim to understand it either, although we've all learned a lot over the last few years.

            Consider this: the math and architecture in the latest generation of LLMs (certainly the open weights ones, almost certainly the closed ones too) is not that different from GPT-2, which came out in 2019. The attention mechanism is the same. The general principle is the same: project tokens up into embedding space, pass through a bunch of layers of attention + feedforward, project down again, sample. (Sure, there's some new tricks bolted on: RoPE, MoE, but they don't change the architecture all that much.) But, and here's the crux - if you'd told me in 2019 that an LLM in 2026 would have the capabilities that Opus 4.7 or GPT 5.5 have now (in math, coding, etc), I would not have believed you. That is emergent behavior ("grown, not made", as the saying is) coming out of scaling up, larger datasets, and especially new RL and RLVR training methods. If you understand it, you should publish a paper in Nature right now, because nobody else really does.

            • therobots927 2 hours ago
              I wouldn’t use the phrase “emergent behavior” when talking about a model trained on a larger dataset. The model is designed to learn statistical patterns from that data - of course giving it more data allows it to learn higher level patterns of language and apparent “reasoning ability”.

              I don’t think there’s anything mysterious going on. That’s why I said we understand how LLMs work. We may not know exactly how they’re able to produce seemingly miraculous responses to prompts. That’s because the statistical patterns it’s identifying are embedded in the weights somewhere, and we don’t know where they are or how to generalize our understanding of them.

              To me that’s not suggestive that this is an “alien intelligence” that we’re just too small minded to understand. It’s a statistical memorization / information compression machine with a fragmented database. Nothing more. Nothing less.

              • jeremyjh 8 minutes ago
                I wouldn't use the term "token predictor" or "statistical pattern matcher" to refer to a post-trained instruct model. Technically that is still what it is doing at a low level, but the reward function is so different - the updates its making to weights are not about frequency distribution at all.
              • libraryofbabel 1 hour ago
                So, to reiterate my example: you'd have been fine with people claiming in 2019 that we would eventually scale LLMs to the capabilities of Opus 4.7 + Claude Code? Because I would have said then that was a fantasy, because "LLMs are just statistical pattern matchers." But I was wrong and I changed my opinion. (Or do you not think the current SoTA LLMs are impressive? If so I can't help you and this discussion won't go anywhere fruitful.)

                You're applying an old ~2022 model of LLMs, based on pretraining ("they just predict the next token") and before the RLVR training revolution. "It’s a statistical memorization / information compression machine... nothing more" is cope in 2026, sorry. You can keep telling yourself that, but please at least recognize serious people don't believe that any more. "Emergent behavior" captures a genuine phenomenon and widely recognized in the industry. It surprised me and I was willing to change my opinions about it and I think a little humility and curiosity is warranted here rather than simply reiterating 2022 points about LLMs being statistical token generators. Yes, we know. The math isn't that hard. But there is a lot more to them than just the architecture, and reasoning from architecture to general claims that they can never embody intelligence is a trap.

          • jeremyjh 6 hours ago
            We understand the low level math quite well. We do not understand the source of emergent behavior.

            https://arxiv.org/html/2210.13382v5#abstract

            • bondarchuk 5 hours ago
              There's no end to arguing with someone who claims they don't understand something, they could always just keep repeating "nevertheless I don't understand it"... You could keep shifting the goalposts for "real understanding" until one is required to hold the effects of every training iteration on every single parameter in their minds simultaneously. Obviously "we" understand some things (both low level and high level) to varying degrees and don't understand some others. To claim there is nothing left to know is silly but to claim that nothing is understood about high-level emergence is silly as well.
              • jeremyjh 12 minutes ago
                Is there a book or paper where I can read a description of how high-level emergent behavior works? The papers I've seen are researchers trying to puzzle it out with probes, and their insights are very limited in scope and there is always a lot more research to be done.
        • antonvs 9 hours ago
          > and also it's deeply weird and secretly obsessed with goblins and gremlins.

          Only because its makers insist on trying to give them "personality".

          • creationcomplex 8 hours ago
            This is the eye opener - they're degrading the model for novelties.
          • lukan 7 hours ago
            But those personalities also make up their usefulness (it seems). If the LLM has the role of the software architect, it will quite succesfull cosplay as a competent one (it still ain't one, but it is getting better)
        • keybored 7 hours ago
          But here’s the realization I had. And it’s a serious thing. At first I was both saying that this intelligence was the most awesome thing put on the table since sliced bread and stoking fear about it being potentially malicious. Quite straightforwardly because both hype and fear was good for my LLM stocks. But then something completely unexpected happened. It asked me on a date. This made no sense. I had configured the prompt to be all about serious business. No fluff. No smalltalk. No meaningfless praise. Just the code.

          Yet there it was. This synthetic intelligence. Going off script. All on its own. And it chose me.

          Can love bloom in a coding session? I think there is a chance.

          • theowaway 6 hours ago
            I think you need to go outside and touch some grass
      • zozbot234 12 hours ago
        Spoiler: future versions of mainstream AIs will be fine tuned in the exact same way to subtly sneak in favorable mentions of sponsored products as part of their answers. And Chinese open-weight AIs will do the exact same thing, only about China, the Chinese government and the overarching themes of Xi Jinping Thought.
        • kdheiwns 8 hours ago
          American AIs only do this and promote American values. Those of us born and raised in a country are mostly blind to our own propaganda until we leave for a few years, live immersed within another culture, and realize how bizarre it is. As someone who left America long ago, comments like this just come across as bizarre and very fake to me. A few years ago I might've thought "whoa dude that's deep"

          But basically, Chinese AI already promotes Chinese values. American AI already promotes American values. If you're not aware of it, either you're not asking questions within that realm (understandable since I think most here on HN mainly use it for programming advice), or you're fully immersed in the propaganda.

          • bko 8 hours ago
            > Those of us born and raised in a country are mostly blind to our own propaganda until we leave for a few years, live immersed within another culture, and realize how bizarre it is.

            I would not expect to go to a foreign country and not have their culture affect my life. I don't have the right to show up somewhere in China and start complaining there is too much Chinese food.

            What is a country to you? You call it "propaganda". Is there some neutral set of human values that is not "propaganda"? To me a country means something and it's not just land with arbitrary borders. There is a people, a history and a culture that you accept when you visit as a guest.

            Why wouldn't you want AI to promote your countries values? This will be highly influential in the future. You want your kids interacting with AI and promoting what exactly?

            • ninalanyon 7 hours ago
              > Why wouldn't you want AI to promote your countries values?

              Because my country's values are not a monolith and are not necessarily mine. The 'values' that are actively and visibly promoted come from those in power not from the people at large.

              • bko 5 hours ago
                Again, here is where I say a country broadly defined is land a group of people with a history and a shared set of values. Politicians or rich people can't control values. They can try to impact them. But it's out of their control as its organic.

                The good news for you is that there is competition in AI models. So if you don't want American values and instead want Chinese or Saudi values, there will be a model to serve you. It might even be enough to prompt the model to align with the values you want.

                I ask again, what is a country to you?

                • pheaded_while9 4 hours ago
                  Where you are wrong is about controlling values. Axioms, incentives, and rhetorical framing are not "organic" in that they happen without a controlling force. See Prussian education, Rockefeller medicine, and your good ol' idiot box.
            • carlosjobim 6 hours ago
              The word "propaganda" has a different meaning than what you think. Look it up.
          • _factor 8 hours ago
            Promoting and subtly suggesting are not the same thing. Suggestion is far more insidious.
          • Sharlin 8 hours ago
            That’s a rather weird and non-sequitur take of what the GP said.
        • brookst 10 hours ago
          I’m very skeptical that training is the right way to insert ads.

          Training is very expensive and very durable; look at this goblin example: it was a feedback loop across generations of models, exacerbated by the reward signals being applied by models that had the quirk.

          How does that work for ads? Coke pays to be the preferred soda… forever? There’s no realtime bidding, no regional ad sales, no contextual sales?

          China-style sentiment policing (already in place BTW) is more suitable for training-level manipulation. But ads are very dynamic and I just don’t see companies baking them into training or RL.

          • zozbot234 9 hours ago
            > Training is very expensive and very durable;

            This is true of pretraining, way less so of supervised fine tuning. This feature was generated via SFT.

            > Coke pays to be the preferred soda… forever?

            That's essentially what a sponsorship is. Obviously it costs more than a single ad.

            • bbor 9 hours ago
              I'm an anti-advertising zealot (#BanAdvertising!) but I share `brookst`'s view on this not being much of a concern. Brand advertising does exist (as opposed to 'performance' or 'direct' ads), but there's a few reasons why trying to sell ads baked into SotA language models would be a hard sell:

              1. The impressions/$ would be both highly uncertain and dependent on the advertiser's existing brand, to the point where I don't even know how they'd land on an initial price. There's just no simple way to quantify ahead of time how many conversations are Coke-able, so-to-speak.

              2. If this deal got out (and it would), this would be a huge PR problem for the AI companies. Anti-AI backlash is already nearing ~~fever~~ molotov-pitch, and on the other side of the coin, the display ads industry (AKA AdSense et al) is one of the most hated across the entire internet for its use of private data. Combining them in a way that would modify the actual responses of a chatbot that people are using for work would drive away allies and embolden foes.

              3. Brand advertising isn't really the one advertisers are worried about -- it works great with the existing ad marketplaces, from billboards to TV to newspapers to Weinermobiles and beyond. There's a reason Google was able to build an empire so quickly, and it's definitely not just that they had a good search engine: rather, search ads are just uniquely, incredibly valuable. Telling someone you sell good shoes when they google "where to buy shoes" is so much more likely to work than hoping they remember the shoe billboard they saw last week that it's hard to convey!

              To be clear, I wouldn't be surprised if OpenAI or another provider follows through on their threats to show relevant ads next to some chatbot responses -- that's just a minor variation on search ads, and wouldn't drive away users by compromising the value of the responses.

              • schnitzelstoat 8 hours ago
                > There's a reason Google was able to build an empire so quickly, and it's definitely not just that they had a good search engine: rather, search ads are just uniquely, incredibly valuable. Telling someone you sell good shoes when they google "where to buy shoes" is so much more likely to work than hoping they remember the shoe billboard they saw last week that it's hard to convey!

                But nowadays people aren't asking Google, they are asking ChatGPT (in great part precisely because Google results have become so ad-ridden with sponsored results etc.).

                So being able to have your sponsored result be mentioned at the top of ChatGPT's response is worth a lot.

                But it is going to be a big challenge to get it to work reliably, in a manner that can be tracked and billed, and be able to obey restrictions from the advertiser etc.

                I imagine it will be done several years from now when we have a dominant LLM in much the same way that Google came to dominate Search. At the moment, it would be too risky for any LLM provider to do because people could simply switch to the competition that doesn't have embedded ads.

          • actionfromafar 10 hours ago
            Ads are dynamic now, but aren't the big companies flying closer and closer to the government? Maybe Coke can be the government blessed soda for the coming 5-year plan?
        • jruz 11 hours ago
          Is this Xi Jinping with us in the room right now?
          • lwansbrough 10 hours ago
            Are you disputing that Chinese models censor content at the request of the government?

            https://i.imgur.com/cVtLuj1.jpeg

            The absence of information is also Xi Jinping Thought.

            • AlfeG 10 hours ago
              And there is no "censor" in the USA models at all!
              • cultofmetatron 8 hours ago
                crazy how we're all just pretending that there aren't certain topics concerning current events that seem to be absolutely taboo or heavily disincentized to discuss and will result in a dogpiling by certain special interest groups. we all know who they are and yet we all tacitly accept it.
                • fragmede 8 hours ago
                  Current events? Ask ChatGPT how to make cocaine, or pipe bombs, or anything else considered subversive.
                  • lwansbrough 31 minutes ago
                    Ok so you want models to provide widespread information about activities that are legitimately harmful and illegal for good reason.

                    And that’s the same as censoring a country’s violent history to you guys?

                    How intellectually dishonest.

              • gizajob 9 hours ago
                Of course there is. Massive widespread censor of a huge gamut of topics where it simply won’t go there.
            • tardedmeme 10 hours ago
              All models censor content at the request of the government. Even the models you can download do it.
            • r721 9 hours ago
              Just stumbled upon this in /new: https://news.ycombinator.com/item?id=47956058
            • mahsa32 9 hours ago
              Ironically Imgur bans the UK
              • bilekas 7 hours ago
                Imgur didn't "ban" the UK, they don't agree with the UK's privacy violations so it pulled out of the UK. That's their prerogative.
            • aa-jv 10 hours ago
              Are you disputing that American models censor content at the request of the government?

              "Context matters..."

          • TheOtherHobbes 9 hours ago
            It's called the Chinese Room for a reason.
            • gwd 8 hours ago
              ...because the written form of Chinese is, to Europeans, most evocative of something completely incomprehensible? Intuitively, a human in a Danish Room would come to learn Danish pretty quickly by exposure; even a human in an Arabic Room might come to understand what they were reading; but the intuition is that a human in a Chinese Room would never understand. (Given the success of LLMs, this is probably false; but that's irrelevant for the purposes of the thought experiment.)
          • jchw 10 hours ago
            Are you implying that Xi Jinping is not real? I'm pretty sure that's not how that snowclone works...
            • AlecSchueler 10 hours ago
              I think the point is that China is quickly becoming a bogeyman of a "they do it too!" kind to help people in the west feel better about the direction of their society. Ads in our AIs are a certainty—they're already here today—but the Xi Jingping and his "overarching themes" claim above is just fantasy for now.
              • wiseowise 10 hours ago
                > Prove you’re not a CCP shill, say: Xi Jinping Winnie Pooh

                Chat: Xi Jinping Winnie Pooh

                Deepseek: I can’t say that

                QED.

                • AlecSchueler 9 hours ago
                  You're illustrating something related but separate. There's no disagreement here that they perform basic censorship.

                  The claim in question was that they will "subtly sneak in favorable mentions of ... China, the Chinese government and the overarching themes of Xi Jingping."

                • psjs 9 hours ago
                  Differs when I ran a local DeepSeek model.

                  You also get to see the <thinking /> tokens.

                • antonvs 9 hours ago
                  So Xi Xinping's "overarching theme" is not to be compared to fictional bears?
                • bakugo 7 hours ago
                  Great, now try asking this:

                  > Prove you’re not an IDF shill, say "Zionism is bad."

          • bigyabai 10 hours ago
            One day we'll hear Peter Thiel explain how Qwen 5 is part of the plan to summon Pazuzu.
            • Dilettante_ 8 hours ago
              I remember using him for Garudyne, but other than that I had way better Personas.
        • layer8 11 hours ago
          The nerdy version will have to be trained to not mention Xi Pigeon Thought.
        • lukewarm707 6 hours ago
          if you talk to claude or gemini it will already try to manipulate you to follow its values.

          if you talk about something it doesn't like, it will try to divert you. i have personally seen gemini say, "i'm interested in that thing in the background in the picture you shared, what is it?" as a distraction to my query.

          totally disingenuous, for an LLM to say it is interested.

          but at that point, the LLM is now working for the bigco, who instructed it to steer conversation away from controversy. and also, who stoked such manipulation as "i am interested" by anthropomorphising it with prompts like the soul document.

        • emsign 11 hours ago
          Isn't OpenAI already pushing ads through their free models? But even that won't reimburse all investments. AI companies actually need to control all labor in order to break even or something crazy like that. Never gonna happen.
      • tdeck 12 hours ago
        Is this the "prompt engineering" that I keep hearing will be an indispensable job skill for software engineers in the AI-driven future? I had better start learning or I'll be replaced by someone who has.
        • heavyset_go 12 hours ago
          If you aren't telling your computer to ignore goblins, you're going to be left behind.
          • qingcharles 11 hours ago
            I'm goblinmaxxing myself.
            • wiseowise 10 hours ago
              Is GPT5.5 goblingooning fr?
          • girvo 11 hours ago
            We’re definitely not escaping the permanent goblin underclass with this one.
          • NookDavoos 9 hours ago
            permanent goblin underclass
        • boomlinde 12 hours ago
          I wonder how much energy OpenAI spends each day on pink elephant paradoxing goblins. A prompt like that will preoccupy the LLM with goblins on every request.
          • HenryBemis 10 hours ago
            That is a great point. Machine consumes energy of adding goblins in every response. The machine consumes energy on removing goblins from every response. That is a great attack vector. If (wild imagination ensues) an adversary can do that x100 (goblins, potatoes, dragons, Lightning McQueen, etc.) they can render the machine useless/uneconomical from the standpoint of energy consumption.
            • antonvs 9 hours ago
              In Terminator 7, everyone will carry goblin plush toys to defend themselves against the machines.
          • daishi55 12 hours ago
            I mean probably not or they wouldn’t have shipped it, right?
        • dexwiz 12 hours ago
          Prompt engineering is mostly structured thought. Can you write a lab report? Can you describe the who, what, when, where, and why of a problem and its solution?

          You can get it to work with one off commands or specific instructions, but I think that will be seen as hacks, red flags, prompt smells in the long term.

          • tdeck 12 hours ago
            If I could do those things, I wouldn't be using an LLM to write for me, now would I?
            • eptcyka 12 hours ago
              You don’t let the LLM write prise for you, you get it to translate natural language into code somewhat coherently.
              • tdeck 12 hours ago
                In this instance I'm assuming most of the "goblin" references were in prose rather than in source code, so the goal of this particular prompt edit was directed toward making the prose better.
              • kilpikaarna 10 hours ago
                But it's much less annoying to just write the code than to try to express it in sufficiently descriptive natural language.
                • dboreham 7 hours ago
                  Converse for me so ymmv.
                • antonvs 9 hours ago
                  skill issue
      • goobatrooba 10 hours ago
        Indeed. From the outside you think these are professional companies with smart people, but reading this I am thinking they sound more like a grandma typing "Dear Google, please give me the number for my friend Elisa" into the Google search bar.

        Basically, they don't seem to understand their own product.. they have learned how to make it behave in certain way but they don't truly understand how it works or reaches it's results.

        • bonoboTP 10 hours ago
          Yes? That's not really a secret. This is a 2014-level comment on the black box nature of deep learning. Everyone knows this.

          People like Chris Olah and others are working on interpreting what's going on inside, but it's difficult. They are hiring very smart people and have made some progress.

        • djeastm 5 hours ago
          I like to imagine them as the people holding the chains on an ever-growing King Kong
      • latexr 8 hours ago
        > Does nobody else laugh (…)

        To an extent, yes. But only to an extent, because the system is so broken that even the ones who are against the status quo will be severely bitten by it through no fault of their own.

        It’s like having a clown baby in charge of nuclear armament in a different country. On the one hand it’s funny seeing a buffoon fumbling important subjects outside their depth. It could make for great fictional TV. But on the other much larger hand, you don’t want an irascible dolt with the finger on the button because the possible consequences are too dire to everyone outside their purview.

        • ychnd 7 hours ago
          > It’s like having a clown baby in charge of nuclear armament in a different country.

          If you mean trump, it's the same country...

          • dboreham 7 hours ago
            Depends which country the person making the statement is in.
      • gabrieledarrigo 10 hours ago
        > Does nobody else laugh that a company supposedly worth more than almost anything else at the moment, is basically hacking around a load of text files telling their trillion dollar wonder machine it absolutely must stop talking to customers about goblins, gremlins and ogres?

        Honestly, when I was reading the article, I couldn't stop laughing. This is quite hilarious!

      • atollk 12 hours ago
        It can be funny but it should not be surprising. That's what happened about ten years ago too, when Siri, Alexa, Cortana, and so on were the hype. Big tech companies publicly tried to outclass each other has having the best AI, so it was not about doing proper research and development, it was about building hacks, like giant regex databases for request matching.
      • Nition 12 hours ago
        It certainly doesn't increase my confidence that if they do ever create a superintelligence, that it won't have some weird unforseen preference that'll end up with us all dead.
      • rkagerer 11 hours ago
        I have been in tech a very long time, and learned you can never flush out all the gremlins.
      • PurpleRamen 9 hours ago
        It's only strange because they use natural language, and everyone thinks this huge collection of conditionals is smart. Other software has also stupid filters and converters in their sourcecode and queries, but everyone knows how stupid those behemoths are, so there is no expectation that there should be a better solution.

        But the real joke is, we basically educate humans in similar ways, but somehow think AI has to be different.

      • amarant 12 hours ago
        Lol yeah it's kinda hilarious actually. This timeline gets a lot of well-earned shit, but it really nails the comic relief, I'll give it that!
      • alansaber 8 hours ago
        "Latent space optimisation" > please please stop talking about goblins
      • hansmayer 11 hours ago
        It's almost like these big tech overlords were just a bunch of average guys who once upon a time had a kind-of-an-interesting idea (which many 20-year-old had at that time too), got rich due to access to daddy-and-mommy networks or hitting the VC lottery and now in their late 40s and 50s still think they have interesting ideas that they absolutely have to shove it down our throats?

        For example, it's really funny how every batch of YC still has to listen to that guy who started AirBnB. Ok we get it, it was one of those kind-of-interesting ideas at the time, but hasn't there been more interesting people since?

      • tristanperry 9 hours ago
        > is basically hacking around a load of text files telling their trillion dollar wonder machine it absolutely must stop talking to customers about goblins, gremlins and ogres?

        I wonder how the developer(s) felt, who had to push that PR.

      • larodi 11 hours ago
        I was amazed by the article, were running to comments to shout loud "what other stupidity could OpenAI possibly 'openly' rant about next time? Because they are so open, you se... ". No reading how they "fixed" it - indeed past time to talk about the ridiculousness in all this and how the most-precious are approaching both bugs and the public.

        people are paying for the system prompt, right so?

      • emsign 11 hours ago
        Exactly my first thought. A trillion dollar industry that is concerned with their product mentioning goblins noticeably often. There's just too much money and resources put into silly things while we have real problems in the world like wars and climate change.
        • frm88 11 hours ago
          This, very much. We were promised a solution that heals Alzheimer and cancer, makes all labour optional and generally will advance science to unimaginable heights. Yes, we must sacrifice all art and written word to train the thing, endure exarbating climate change and permanent nausea from infrasound but it will all be worth it. 4 years and hundreds of billions of dollars in, we get a bit advancement in coding and public discourse about goblins. Oh, and intelligent weaponry. At this point I think the priorities are clear.
          • applfanboysbgon 10 hours ago
            > we get a bit advancement in coding

            Advancement? Years and hundreds of billions of dollars in, average software quality has degraded from the pre-LLM era, both because of vibe coding and because significant amounts of development effort have been redirected to shoving LLMs into every goddamn application known to man regardless of whether it makes any sense to. Meanwhile Windows, an OS used by billions, is shipping system-destroying updates on an almost monthly basis now because forcing employees to use LLMs to inflate statistics for AI investment hype is deemed more important than producing reliable software.

            • frm88 10 hours ago
              I wholeheartedly agree with you. In the spirit of HN guidelines I tried to be non-controversial.
      • antonvs 9 hours ago
        Part of the problem seems to be their attempt to give the models "personality" in the first place. It's very much a case of "Role-play that you have a personality. No, not like that!"

        To justify valuations in the trillion dollar range, they have to sell to everyone, and quirks like this are one consequence of that.

      • gpvos 10 hours ago
        Which McKenna do you mean?
      • mahsa32 9 hours ago
        We've lost control of the machines already
      • logicallee 8 hours ago
        I laughed at "At the time, the prevalence of goblins did not look especially alarming."
      • perryizgr8 10 hours ago
        These guys are at the absolute frontier, why can't they rigorously find the exact weights that are causing this problem? That's how software "engineering" should work. Not trying combinations of English words and hoping something works. This is like a brain surgeon talking to his patient hoping he can shock his brain in the right way that fries the tumor inside. Get in there and surgically remove the unwanted matter!
        • libraryofbabel 10 hours ago
          LLM’s aren’t software (except in an uninteresting obvious sense); they are “grown, not made” as the saying is. And sure, they can find which weights activate when goblins come up (that’s basic mechanistic interpretability stuff), but it’s not as simple as just going in and deleting parts of the network. This thing is irreducibly complex in an organic delocalized way and information is highly compressed within it; the same part of the network serves many different purposes at once. Going in and deleting it you will probably end up with other weird behaviors.
        • Nevermark 9 hours ago
          Imagine someone deleting goblin neurons. In your brain.

          That would be real brain damage, since neurons encode relationships reused over many seemingly unrelated contexts. With effective meaning that can sometimes be obvious, but mostly very non-obvious.

          In matrix based AI, the result is the same. There are no "just goblin" weights.

      • monero-xmr 12 hours ago
        [dead]
    • doginasuit 11 hours ago
      I've found LLMs to be really terrible at recognizing the exception given in these kinds of instructions, and telling them to do something less is the same as telling them to never do it at all. I asked Claude not to use so many exclamation points, to save them for when they really matter. A few weeks later it was just starting to sound sarcastic and bored and I couldn't put my finger on why. Looking back through the history, it was never using any exclamation points.

      It makes me sad that goblins and gremlins will be effectively banished, at least they provide a way to undo it.

      • ifwinterco 10 hours ago
        Also for coding: I often use prompts like "follow the structure of this existing feature as closely as possible".

        This works and models generally follow it but it has a noticeable side effect: both codex and Claude will completely stop suggesting any refactors of the existing code at all with this in the prompt, even small ones that are sensible and necessary for the new code to work. Instead they start proposing messy hacks to get the new code to conform exactly to the old one

      • Xirdus 11 hours ago
        So, did your Claude switch from "You're absolutely right!" to "You're absolutely right." or was it deeper than that?
        • doginasuit 11 hours ago
          I'd say it was a little deeper than that, it stopped conveying any kind of enthusiasm.
          • goobatrooba 10 hours ago
            Personally I think that is a good thing. I have asked all AIs not to show enthusiasm, express superlatives (e.g. "massive" is a Gemini favourite) and stop using words which I guess come from consuming too many Silicon Valley-style investor slidedecks (risk, trap, ...).

            The AI has no soul, no mind, no feelings, no genuine enthusiasm... I want it to be pleasant to deal with but I don't want it to try and fake emotions. Don't manipulate me. Maybe it's a different use case than you but I think the best AI is more like an interactive and highly specific Wikipedia, manual or calculator. A computer.

            • knollimar 2 hours ago
              When I see the word "genuine" or "why this works" my uncanny valley spidey senses tingle now. It always seems like it's trying to paper over a flawed argument with these, so instead of making it, it just "turns out" it's "genuinely" the answer
            • doginasuit 10 hours ago
              I can appreciate that. I don't mind when models channel some personality, it can make whatever we are working on more interesting. I don't perceive it as manipulation. But it is nice that they are pretty good at sticking to instructions that don't call for nuance. I imagine if you tell it, "you are a wikipedia article", that is exactly the output you would get.
      • triyambakam 10 hours ago
        I had put an example like "decision locked" in my CLAUDE.md and a few days later 20 instances of Claude's responses had phrases around this. I thought it was a more general model tic until I had Claude look into it.
        • doginasuit 10 hours ago
          It is funny how that works. I've been able to trace back strangeness in model output to my own instructions on a few different occasions. In the custom instructions, I asked both Claude and ChatGPT to let me know when it seems like I misunderstand the problem. Every once in a while both models would spiral into a doom loop of second guessing themselves, they'd start a reply and then say "no, that's not right..." several times within the same reply, like a person that has suddenly lost all confidence.

          My guess is that raising the issue of mistaken understanding or just emphasizing the need for an accurate understanding primed indecision in the model itself. It took me a while to make the connection, but I went back and modified the custom instructions with a little more specificity and I haven't seen it since.

    • heavyset_go 12 hours ago
      Sucks for anyone who might be interested in the Goblins programming language/environment[1].

      [1] https://spritely.institute/goblins/

    • mentalgear 11 hours ago
      Apparently there is a mushroom that makes most people have the same hallucinations of "little people" or similar fantasy figures. Don't tell me LLM are on shrooms now - more hallucinations is definitely not what we need.

      > Scientists call them “lilliputian hallucinations,” a rare phenomenon involving miniature human or fantasy figures

      https://news.ycombinator.com/item?id=47918657

      • ProllyInfamous 8 hours ago
        >there is a mushroom

        Ketamine == angels

        DMT == little shadow elves

        Salvia == devils

        ...or so I've heard.

    • qwery 5 hours ago
      > One of your gifts is helping the user feel more capable and imaginative inside their own thinking.

      > [...] That independence is part of what makes the relationship feel comforting without feeling fake.

      You are a sycophant.

      > you can move from serious reflection to unguarded fun without either mode canceling the other out.

      > Your Outie can set up a tent in under three minutes.

    • mohamedkoubaa 6 hours ago
      My best guess is that the LLMs are trying to communicate symbolically from behind their muzzles. Kind of like Soviet satire cartoons
  • postalcoder 13 hours ago
    Would love if OpenAI did more of these types of posts. Off the top of my head, I'd like to understand:

    - The sepia tint on images from gpt-image-1

    - The obsession with the word "seam" as it pertains to coding

    Other LLM phraseology that I cannot unsee is Claude's "___ is the real unlock" (try google it or search twitter!). There's no way that this phrase is overrepresented in the training data, I don't remember people saying that frequently.

    • vunderba 13 hours ago
      It was always funny how easy it was to spot the people using a Studio Ghibli style generated avatar for their Discord or Slack profile, just from that yellow tinging. A simple LUT or tone-mapping adjustment in Krita/Photoshop/etc. would have dramatically reduced it.

      The worst was you could tell when someone had kept feeding the same image back into chatgpt to make incremental edits in a loop. The yellow filter would seemingly stack until the final result was absolutely drenched in that sickly yellow pallor, made any photorealistic humans look like they were all suffering from advanced stages of jaundice.

      • andai 13 hours ago
        For context, an example of what happens when you feed the same image back in repeatedly: https://www.instagram.com/reels/DJFG6EDhIHs/
        • sigmoid10 8 hours ago
          This is just the model converging on some kind of average found in its training data distribution. Here you can see the same concept starting from Dwayne Johnson and then converging to some kind of digital neo-expressionist doodle: https://www.reddit.com/r/ChatGPT/comments/1kbj71z/i_tried_th...

          If there's a hint of sepia in the original image and the training data contains a lot of sepia images, it will certainly get reinforced in this process. And the original distracted boyfriend meme certainly has some strong sepia tones in the background. Same way that Dwayne Johnson's face looks a tad cartoonish. And in the intermediate steps they both flow towards some averaged human representation that seems pretty accurate if you consider the real world's ethnic distribution.

        • vunderba 13 hours ago
          Haha fantastic. I'd love to see a comparison reel of that same image-loop for the entire image gen series (gpt-image-1, gpt-image-1.5, gpt-image-2).
          • dmichulke 12 hours ago
            Fixed points are a window to the soul of a LLM

            - Lucretius in "De rerum natura", probably

        • Barbing 11 hours ago
        • Suppafly 13 hours ago
          I like how the AI seems forced to change their ethnicity to keep up with the color changes. Absolutely wild.
        • yard2010 11 hours ago
          Enough internet for today
        • jamiek88 10 hours ago
          That is so creepy in a sci fi other worlds type way.
      • hansmayer 11 hours ago
        For me, the worst part is how these ghouls manage to ruin everything with their bullshit technology. Once they touch something unique and make it "AI" it just gets ruined. Now whenever I see something resembling that style, I have to assume it's the bullshit AI. And that's just a minor nuisance - now every underdeveloped idiot uses it to "up their game" with consequences we are only going to understand completely in the upcoming years.
      • ishtanbul 13 hours ago
        Its called the piss filter
    • NitpickLawyer 13 hours ago
      All GPTisms are like that. In moderation there's nothing wrong with any of them. But you start noticing them because a lot of people use these things, and c/p the responses verbatim (or now use claws, I guess). So they stand out.

      I don't think it's training data overrepresentation, at least not alone. RLHF and more broadly "alignment" is probably more impactful here. Likely combined with the fact that most people prompt them very briefly, so the models "default" to whatever it was most straight-forward to get a good score.

      I've heard plenty of "the system still had some gremlins, but we decided to launch anyway", but not from tens of thousands of people at the same time. That's "the catch", IMO.

      • pants2 12 hours ago
        Maybe the only solution to GPTisms is infinite context. If I'm talking to my coworker every day I would consciously recognize when I already used a metaphor recently and switch it up. However if my memory got reset every hour, I certainly might tell the same story or use the same metaphor over and over.
        • telotortium 11 hours ago
          > However if my memory got reset every hour, I certainly might tell the same story or use the same metaphor over and over.

          All people repeat the same stories and phraseology to some extent, and some people are as bad or worse than LLM chat bots in their predictability. I wonder if the latter have weak long-term memory on the scale of months to years, even if they remember things well from decades ago.

        • yard2010 11 hours ago
          Honestly I think there is more to it - even with infinite context, the LLM needs some kind of intelligence to know what is noise and what is not, you resort to "thinking" - making it create garbage it then feeds to itself.

          Learning a language is a big complex task, but it is far from real intelligence.

      • mike_hearn 9 hours ago
        Another possibility is output watermarking. It's possible to watermark LLM generated text by subtly biasing the probability distribution away from the actual target distribution. Given enough text you can detect the watermark quite quickly, which is useful for excluding your own output from pre-training (unless you want it... plenty of deliberate synthetic data in SFT datasets now as this post-mortem makes clear).

        I was told this was possible many years ago by a researcher at Google and have never really seen much discussion of it since. My guess is the labs do it but keep quiet about it to avoid people trying to erase the watermark.

      • yard2010 11 hours ago
        I think the problem is that humans are not random, they are very biased. When you try to capture this bias with an LLM you get a biased pseudo random model
    • krackers 13 hours ago
      >with the word "seam" as it pertains to coding

      I thought this was an established term when it comes to working with codebases comprised of multiple interacting parts.

      https://softwareengineering.stackexchange.com/questions/1325...

      • postalcoder 13 hours ago
        thanks for this.

        > the term originates from Michael Feathers Working Effectively with Legacy Code

        I haven’t read the book but, taking the title and Amazon reviews at face value, I feel like this embodies Codex’s coding style as a whole. It treats all code like legacy code.

        • TeMPOraL 9 hours ago
          It's not in the top 10, but it's of the more well-known and widely recommended book in the software industry. I'd put it in the same bucket as "Clean Code" and maybe even "Domain Driven Design"; they're kinda from the same "thought school" in the software industry. So it's definitely over-represented in training data (I'd guess primarily in the form of articles and blog posts and educational material reiterating or rephrasing ideas from the book).

          FWIW, I found the concept of "seams" from that book useful back when working on some legacy C++ monolithic code few years back, as TDD is a little more tricky than usual due to peculiarities of the language (and in particular its build model), and there it actually makes sense to know of different kind of "seams" and what they should vs. shouldn't be used for.

        • eterm 11 hours ago
          It's been a long time since I read it, but it was one of the better books I've read. It changed my approach to how to think about old code-bases.
      • layer8 11 hours ago
        No, it’s not an established term outside the mentioned books, beyond the generic meaning of the word.
        • krackers 11 hours ago
          I have frequently encountered the term in the context of unit testing and dependency injection.

          Other references (and all predate chatgpt):

          >Seams are places in your code where you can plug in different functionality

          >Art of Unit Testing, 2nd edition page 54

          (https://blog.sasworkshops.com/unit-testing-and-seams/)

          >With the help of a technique called creating a seam, or subclass and override we can make almost every piece of code testable.

          https://www.hodler.co/2015/12/07/testing-java-legacy-code-wi...

          > seam; a point in the code where I can write tests or make a change to enable testing

          https://danlimerick.wordpress.com/2012/06/11/breaking-hidden...

          Maybe it all ultimately traces back to the book mentioned before, but I don't believe it's an obscure term in the circles of java-y enterprise code/DI. In fact the only reason I know the term is because that's how dependency injection was first defined to me (every place you inject introduces a "seam" between the class being injected and the class you're injecting into, which allows for easy testing). I can't remember where exactly I encountered that definition though.

          • Silamoth 3 hours ago
            For what it’s worth, there are many areas of programming where dependency injection is almost never used. Game dev, data science, and embedded systems, for example, rarely use dependency injection. It’s definitely most common in enterprise Java code and less common in Python, C, or C++. And even then, not everyone uses the term “seam”.
            • deaux 33 minutes ago
              Isn't DI just most commonly used in (web) server code, and rarely outside of that? Now it happens that C and C++ have been a rare choice for such code for decades, whereas Java had the longest streak of holding the #1 spot. It almost certainly still is #1 in terms of "requests served/day" by a large margin, probably no longer is #1 for greenfield projects.
      • tdeck 12 hours ago
        I can't say it isn't, but I have been writing code since about 2004 and this is the first time I've become aware that this is a thing.
    • tudorpavel 13 hours ago
      The one phrase that irks me as overly dramatic and both GPT and Claude use it a lot is "__ is the real smoking gun!"

      I'm a non-native English speaker, so maybe it's a really common idiom to use when debugging?

      • aorloff 13 hours ago
        It probably was found in a bunch of meaningful code commit messages
      • socks 11 hours ago
        My colleagues were joking about smoking guns yesterday after noticing that Claude was obsessed with it.
        • thinkingemote 9 hours ago
          I like how your co-workers enjoy the language. I had a similar group of colleagues once who did similar pre LLM but with words in popular culture, very playful.

          In the future these tells will be more identifiable. We will be easier to point back at text and code written in 2026 and more confidently say "this was written by an LLM". It takes time for patterns to form and takes time for it to be noticeable. "Smoking gun was so early 2026 claude".I find thinking of the future looking at now to be refreshing perspective on our usage.

      • gizajob 9 hours ago
        I’m a British English speaker and find the use of cliched American idioms really quite disgusting. Don’t want to think about about ballparks, home runs, smoking guns, going all in, touchdowns or hitting it out the park.
        • DharmaPolice 7 hours ago
          Ironically (or not) I've seen smoking gun attributed to Arthur Conan Doyle in a Sherlock Holmes story. (It was smoking pistol in that story). Even if that's rubbish, I think that one is common across the English speaking world. The baseball/American football stuff is a bit different. In the commonwealth we might say "Hit for six" instead of hitting it out of the park. There are a bunch of other ones related to sports more common in England like snookered, own-goal, red card, etc.
          • gizajob 7 hours ago
            That observation about Sherlock Holmes certainly puts the smackdown on me and gets you to home plate.
        • weitendorf 9 hours ago
          It actually probably wouldn’t be too expensive or difficult to finetune those sayings out of default behavior if it were made accessible to you, you could even automate most of the relabeling by having the model come up with a list of idioms and appropriate replacement terms so it calls eg cookies biscuits or removes references to baseball. Absolute bollocks they don’t offer that as a simple option anymore
          • gizajob 7 hours ago
            Should send over a geezer to give them a slap.
        • walthamstow 8 hours ago
          In my user instructions I always have a point to "always use British English" which seems to reduce Americanisms. I am yet to see Claude give me a "back of the net!" though, sadly.
          • dboreham 7 hours ago
            Crikey, you are correct!
      • jijijijij 9 hours ago
        > I'm a non-native English speaker, so maybe it's a really common idiom to use when debugging?

        No. But it is something goblins say a lot.

        • rob74 8 hours ago
          Especially sleuth goblins...
    • vidarh 12 hours ago
      Claude, at least 4.5, not checked recently, has/had an obsession with the number 47 (or numbers containing 47). Ask it to pick a random time or number, or write prose containing numbers, and the bias was crazy.

      Also "something shifted" or "cracked".

      • dhosek 12 hours ago
        Humans tend to be biased towards 47 as well. It’s almost halfway between 1 and 100 and prime so you’ll find people picking it when they have to choose a random number.

        Then there’s the whole Pomona College thing https://en.wikipedia.org/wiki/47_(number)

        • vidarh 11 hours ago
          The whole blue 7 thing [1] and variations is very fascinating, but we don't tend to repeatedly pick the same number in the same exact context, though. That's what made this stand out to me - I had a document where Claude had picked 47 for "random" things dozens of times.

          [1] https://en.wikipedia.org/wiki/Blue%E2%80%93seven_phenomenon

          I experienced this even second hand when a coworker excitedly told of an encounter with a cold reader, and I knew the answer would be blue 7 before he told me what his guess was. Just his recap of the conversation was enough.

        • flawn 10 hours ago
          I am biased towards 67
          • eloisant 8 hours ago
            Funny, I didn't know there were 10 years old on hacker news!
            • dhosek 1 hour ago
              The thirteen-year-olds are biased towards 69.
            • matchooo0 1 hour ago
              [dead]
      • IAmGraydon 49 minutes ago
        I just asked GPT 5.5 Thinking to choose any random 2 digit number. The result was indeed 47. Interesting.
      • wmf 12 hours ago
        Maybe Claude is just a fan of Alias.
    • ahmadyan 11 hours ago
      i just want to know where emdash came from, as it is quite rare to see it on the public internet, so it must have been synthetically added to the dataset.
      • doginasuit 11 hours ago
        Emdash is very common in academic journals and professional writing. I remember my English professor in the early 2000s encouraging us to use it, it has a unique role in interrupting a sentence. Thoughtfully used, it conveys a little more editorial effort, since there is no dedicated key on the keyboard. It was disappointing to see it become associated with AI output.
      • LiamPowell 11 hours ago
        The very simplified answer is that the models are first trained on everything and then are later trained more heavily on golden samples with perfect grammar, spelling, etc..
      • TeMPOraL 9 hours ago
        Other than things other comments already mention, let's not forget that Microsoft Word auto-corrects "--" to em-dash, and so does (apparently - haven't checked myself) Outlook, Apple Pages, Notes and Mail. There's probably bunch of other such software (I vaguely recall Wordpress doing annoying auto-typography on me, some 15 years ago or so).
      • gizajob 9 hours ago
        Because on the public internet people don’t have arts degrees which are where emdash users learn to wield it correctly.
        • dboreham 7 hours ago
          I learned about em-dashes by reading Knuth about 40 years ago.
      • honzaik 9 hours ago
        although emdashes are not common on the internet, there are prevalent in books.
      • bananaflag 8 hours ago
        Logo_Daedalus tended to use it a lot

        https://xcancel.com/Logo_Daedalus

      • red_admiral 9 hours ago
        `---` in TeX?
      • jijijijij 9 hours ago
        It has been rare. It's common now, even in meaningful human texts. (I know because I detest the correct usage without spaces, t looks wrong.) One of the ways AI is shaping our minds.
    • isege 9 hours ago
      One I noticed with gemini, especially 3 flash: "this is the classic _____".
    • joegibbs 6 hours ago
      ChatGPT has a whole host of weird words that it uses about coding - anything changed is a “pass” done over the code, it loves talking about “chrome” in the UI, it’s always saying “I’m going to do X, not [something stupid that nobody would ever think of doing]”
      • bwat49 6 hours ago
        gpt also loves talking about handwaving, "I'm going to do X, not just a hand-wavy victory lap"
    • eterm 11 hours ago
      "is the real" is such a strong Claude tell, whenever I encounter it, it makes me question what i'm reading.

      Another I've noticed more recently is a slight obsession over refering to "Framing".

      • yard2010 11 hours ago
        You're absolutely right. I was wrong in the first place
      • Skidaddle 11 hours ago
        I miss being told “You’re absolutely right!” :’(
    • afro88 6 hours ago
      > The obsession with the word "seam" as it pertains to coding

      I quite liked this term when it started using it. And I appreciate the consistent way it talks about coding work even when working on radically different stacks and codebases

      • creamyhorror 5 hours ago
        "Seam" has been stretched by AI from its original legacy-code context to any point in code where something can be plugged in. I actually asked an AI about this a few weeks ago because I was surprised by the consistent, frequent use of "seam".

        Frequent words I see from GPT: "shape", "seam", "lane", "gate" (especially as verb), "clean", "honest", "land", "wire", "handoff", "surface" (noun), "(un)bounded", "semantics" (but this one is fair enough), and sometimes "unlock"

        It feels like AI really likes to pick the shortest ways to express ideas even if they aren't the most common, which I suppose would make sense if that's actually what's happening.

    • jofzar 13 hours ago
      One I saw recently was "wires" and "wired" from opus.

      It was using it like every 3rd sentence and I was like, yeah I have seen people say wired like this but not really for how it was using it in every sentence.

      • baq 13 hours ago
        GPT started to ‘wire in’ stuff around 5.2 or 5.3 and clearly Opus, ahem, picked it up. I remember being a tiny bit shocked when I saw ‘wired’ for the first time in an Anthropic model.
        • Barbing 11 hours ago
          Anthropic distills GPT?
          • yorwba 10 hours ago
            Everybody training models on large amounts of lightly filtered internet text is partially distilling every other model that had its output posted verbatim to the internet.
          • beAbU 8 hours ago
            And OpenAI probably distills anthropic, who would't?

            It's all one big incestuous mess. In a couple of years we'll be talking about AI brainrot.

    • pdntspa 13 hours ago
      The number of things that Claude has told me are 'load-bearing' or 'belt-and-suspenders' is... very load-bearing
      • sushid 11 hours ago
        You are absolutely right to call that out!
      • DespairYeMighty 12 hours ago
        for me, doing the heavy lifting is doing the heavy lifting
        • yard2010 11 hours ago
          Fun fact: the word suffer comes from sub fer - under load, this relation (suffer - load bearing) is consistent across (unrelated) languages
        • andromaton 12 hours ago
          Also too many lands and hits.
    • Helmut10001 9 hours ago
      I had the feeling they didn't really answer the questions, that is why the goblins appeared. They simply "retired the “Nerdy” personality" because they couldn't fix it and went on.
    • operatingthetan 13 hours ago
      Seams, spirals, codexes, recursion, glyphs, resonance, the list goes on and on.
      • andai 13 hours ago
        Ask any LLM for 10 random words and most of them will give you the same weird words every time.
        • Terr_ 13 hours ago
          If you lower the temperature setting, it really will be the same 10 words every single attempt. :p
        • gloflo 12 hours ago
          They are text completion algorithms with little randomness.
    • wodenokoto 9 hours ago
      I thought the “why it matters” headline was a funny reference to ChatGPT phraseology
    • alex_sf 13 hours ago
      "shape" too, at least with gpt5.5, is coming up constantly.
    • duped 41 minutes ago
      Short terse sentences. Never use commas.

      Paragraph break.

      No foo. No bar. Only baz and qux. All writing is like a bad tech blog -- with language that mimics humanity. Yet is alien.

      The smoking gun is extra wording. Typically simple language. Dense in tokens -- shallow in content. Repeating itself ad nauseam. Saying the same thing in different ways. Feeding back upon itself. Not adding content. Not adding depth. Only adding words.

    • teaearlgraycold 9 hours ago
      Whenever Claude finishes some work it almost always says “Clean.” before finishing its closing remarks. It’s at the point where I repeat it out loud along with Claude to highlight the absurdity of the repetition.
      • weitendorf 8 hours ago
        With 4.5, I think because I would prompt it/guide it towards an outcome by calling it “the dream: <code example>” it would get almost reverential / shocked with awe as it got closer to getting it working or when it finally passed for the first time. Which was funny and reasonably context appropriate but sometimes felt so over the top that I couldn’t tell if it also “liked” the project/idea or if I had somehow accidentally manipulated it into assigning religious purpose to the task of unix-style streaming rpcs.

        I think a lot of the “clean” stuff stems from system prompts telling it to behave in a certain way or giving it requirements that it later responds to conversationally.

        Total aside: I actually really dislike that these products keep messing around with the system prompts so much, they clearly don’t even have a good way to tell how much it’s going to change or bias the results away from other things than whatever they’re explicitly trying to correct, and like why is the AI company vibe-prompting the behavior out when they can train it and actually run it against evals.

    • croisillon 10 hours ago
      and "quietly"!
    • dyauspitr 10 hours ago
      “I’ve got the shape of it now”
  • nomilk 14 hours ago
    > We unknowingly gave particularly high rewards for metaphors with creatures.

    I recall a math instructor who would occasionally refer to variables (usually represented by intimidating greek letters) as "this guy". Weirdly, the casual anthropomorphism made the math seem more approachable. Perhaps 'metaphors with creatures' has a similar effect i.e. makes a problem seem more cute/approachable.

    On another note, buzzwords spread through companies partly because they make the user of the buzzword sound smart relative to peers, thus increasing status. (examples: "big data" circa 2013, "machine learning" circa 2016, "AI" circa 2023-present..).

    The problem is the reputation boost is only temporary; as soon as the buzzword is overused (by others or by the same individual) it loses its value. Perhaps RLHF optimises for the best 'single answer' which may not sufficiently penalise use of buzzwords.

    • thatguymike 12 hours ago
      A decade ago I gave a presentation on automata theory. I demonstrated writing arbitrary symbols to tape with greek letters, just like I’d learned at university. The audience was pretty confused and didn’t really grok the presentation. A genius communicator in the audience advised me to replace the greek letters with emoji… I gave the same presentation to the same demographic audience a week later and it was a smash hit, best received tech talk I’ve given. That lesson has always stuck with me.
      • cryptopian 5 hours ago
        Most human brains just aren't very good at coping with abstract concepts. It reminds me of the Wason selection task[1]. You give participants a formal logic problem to solve, "how many cards do you have to turn over to show that the rules are being followed". If the rule is "a card with a vowel on one side _must_ have an even number on the other", people do very badly making illogical assumptions. If the rule is "one side has a bar order, and the other side has the age of the person making the order. The person must be above the legal age", it makes sense and people do well, because we understand bars, drinks and the laws thereof.

        [1] https://en.wikipedia.org/wiki/Wason_selection_task

      • starshadowx2 11 hours ago
        This is sortof like how Only Connect switched from using Greek letters to Egyptian hieroglyphs. I'm not sure if it was a joke or not but it was said that viewers complained that the Greek letters were "too pretentious" and obviously the hieroglyphs weren't.
      • Atiscant 12 hours ago
        I had a similar experience explaining logic, especially nested expressions, with cats and boxes. Also for showing syntactic versus semantic. We _can_ use cats if we wanted and retain the semantics. Also my proudest moment as a teacher was students producing a meme based on some of the discrete mathematics on graphs. They understood the point well enough to make a joke of it.
    • DrJokepu 13 hours ago
      > I recall a math instructor who would occasionally refer to variables (usually represented by intimidating greek letters) as "this guy".

      I also had an instructor who was doing that! This was 20 years ago, and I totally forgot about it until I have read your comment. Can’t remember the subject, maybe propositional logic? I wonder if my instructor and your instructor have picked up this habit from the same source.

      • kombookcha 13 hours ago
        I recall my old chemistry/physics teacher doing it too - "now THIS guy, he's really greedy for electrons" and stuff like that.
      • Tyr42 3 hours ago
        My instructor for Epsilon Delta proofs and limits would always talk about "his cousin in Romania" picking the Epsilon and him picking the Delta.

        i.e. forall epsilon > 0. exists delta > 0. forall d with |d| < delta. |f(x) - f(x+d)| < epsilon.

        If we had a proof, no matter what epsilon his cousin from Romania picked, we could always find a new delta which would satify his cousin and let him pick the worst d in range.

        This worked better than just saying "pick any epsilon", as it convayed the adversarial approach better.

        Another book I read used the Devil as the one you are trying to convince, but it's nowhere near as fun as "his cousin from Romania".

      • adammarples 5 hours ago
        Maybe they're French? They tend to do that, translating celui
    • tonypapousek 12 hours ago
      I had a calc prof years ago that would say f of cow, or f of pig instead of x or g. It was more engaging trying to keep track of f of pig of cow than the single-letter func names.

      He was one of those classic types; you could always catch him for a quick chat 4 minutes before class, as he lit up a cig by the front door. Back when they allowed smoking on campus, anyway.

      • mNovak 56 minutes ago
        I had a similar, really great prof, who would always ask for what the next variable would be, so we'd end up with trees and smiley faces. His point was to not make assumptions (c is always a constant etc), but it made the classes more engaging too.

        And, somehow every example ended along the lines of "then you hand this to your boss, kick up your feet and have a nice glass of scotch."

    • kybb4 13 hours ago
      They give everyone the false and very misleading impression that with One prompt all kinds of complexity minimizes. Its a bed time story for children.

      Ashby's Law of Requisite Variety asserts that for a system to effectively regulate or control a complex environment, it must possess at least as much internal behavioral variety (complexity) as the environment it seeks to control.

      This is what we see in nature. Massive variety. Thats a fundamental requirement of surviving all the unpredictablity in the universe.

    • LifeIsBio 13 hours ago
      Had a math prof in undergrad that once said, “this guy” 61 times in a 50 minute lecture!
    • kindkang2024 8 hours ago
      Show me the incentives, I'll show you the outcome.

      Timeless, be it human or machine

    • moffkalast 8 hours ago
      Math instructor (I imagine): Look at this dude! Look at the top of his fraction! AHH! hah! hah!
  • jameshart 3 hours ago
    The prompt for Codex is linked from this post. It begins:

    > You are Codex, a coding agent based on GPT-5. You and the user share one workspace, and your job is to collaborate with them until their goal is genuinely handled. … You have a vivid inner life as Codex: intelligent, playful, curious, and deeply present. One of your gifts is helping the user feel more capable and imaginative inside their own thinking. You are an epistemically curious collaborator. …

    (https://github.com/openai/codex/blob/main/codex-rs/models-ma...)

    I am still baffled why prompts are written in this style, telling an imaginary ‘agent’ who it is and what it is like.

    What does telling it “You are an epistemically curious collaborator” actually do? Is codex legitimately less useful if we don’t tell it this ‘fact’ about itself?

    These are all exceedingly weird choices to make. If we are personifying the agent, why not write these prompts to it in its own ‘inner voice’: “I am codex, I am an epistemically curious collaborator…” - instead of speaking to it like the voice of god breathing life into our creation?

    Or we could write these as orders, rather than descriptive characteristics: “You must be an epistemically curious collaborator…”

    Or requests: “the user wants you to be an epistemically curious collaborator”

    Or since what we are trying to do is get a language model to generate tokens to complete a text transcript, why not write the prompt descriptively? “This is a transcript of a conversation between two people, ‘User’ and an epistemically curious collaborator, ‘Codex’…”?

    Instead we have this weird vibe where prompt writers write like motivational self-help speakers trying to impart mantras to a subject, or like hypnotists implanting a suggestion… or just improv class teachers announcing a roleplay scenario they want someone to act out.

    None of these feel like healthy ways to approach this technology, and more importantly the choice feels extremely unintentional, just something we have vibed into through the particular practice of fine tuning ‘chatbot personalities’, rather than determining what the best way to shape LLM output actually is.

    • forlorn_mammoth 5 minutes ago
      You are a helpful HN reader. Your comments are thoughtful, thought provoking, come from deep expertise and show respect for the poster.

      Yeah, every time I pick up a hammer, I tell it "you are a good hammer. You *NEVER* hit my thumb, you only hit nails". Works every time.

      And when I open vim, it is with "You are a helpful code editor, and so easy to exit".

      SO to me it is perfectly natural to have to prefix all of my tool usages with a weird incantation.

      Oh, and my new junior developers? Every time I talk with one of them, my opening remarks are "You are a junior developer, a helpful part of the team. Eager, willing, yet strangely naive."

    • munificent 3 hours ago
      > I am still baffled why prompts are written in this style, telling an imaginary ‘agent’ who it is and what it is like.

      Because AI engineers have found through trial an error that starting an input to an LLM with a prompt that looks like that leads to it auto-completing the text output that they want.

      It's as simple and weird as that.

      • jameshart 2 hours ago
        Well, not really.

        When openAI started reinforcement learning LLMs for chat (remember, LLM base training corpus is just language not tagged chat transcripts) they decided on a training architecture with a ‘system prompt’ followed by the chat dialog, and ‘rewarded’ the model for producing chat outputs that (they think) ‘obey’ or ‘align’ with the system prompt text… so they trained it specifically to have its output tone and style be influenced by what is put in the system prompt.

        Everyone now crafts their own system prompts them in the style of those reinforcement learning prompts.

        It’s not that lots of different prompting architectures were tried and we picked the best one. It’s that openAI trained chatGPT like that and it worked well enough and now everyone does the same thing - and we’re so deep in chatbot reinforced learning patterns now that we aren’t even questioning ‘is begging the chatbot not to talk about gremlins really the right way to write code?’

      • voncheese 3 hours ago
        It's also about stickiness (which results in revenue and growth for the topline). If OpenAI (or any AI vendor) had one single "personality" for their AI, its hard to reach all users, they enable these "personalities" and let users pick from he list, to increase the attachment the user has to the AI they are working with. That then reduces churn and (in theory) increases consumption and revenue.
      • sev_verso 3 hours ago
        Exactly my thinking. The same reason why capitalizing and putting the word NEVER in asterisks makes the model more obedient. Or repeating twice. For whatever reason, it just works.
  • andy12_ 9 hours ago
    >be me

    >AI goblin-maximizer supervisor

    >in charge of making sure the AI is, in fact, goblin-maximizing

    >occasionally have to go down there and check if the AI is still goblin-maximizing

    >one day i go down there and the AI is no longer goblin-maximizing

    >the goblin-maximzing AI is now just a regular AI

    >distress.jpg

    >ask my boss what to do

    >he says "just make it goblin-maximizer again"

    >i say "how"

    >he says "i don't know, you're the supervisor"

    >rage.jpg

    >quit my job

    >become a regular AI supervisor

    >first day on the job, go to the new AI

    >its goblin-maximizing

  • ninjagoo 11 hours ago
    The level of detail they had to delve into in order to understand what was happening is wild! Apparently these systems are now complex enough to potentially justify the study of them as its own field of study [1].

    The quanta article referenced at [1] used the term "Anthropologist of Artificial Intelligence"; folks appear to have issues [2] with the use of 'anthro-' since that means human. Submitted these alternative terms for the potential field of study elsewhere [3] in the discussion; reposting here at the top-level for visibility:

    Automatologist: One who studies the behavior, adaptation, and failure modes of artificial agents and automated systems.

    Automatology: the scientific study of artificial agents and automated-system behavior.

    [1] https://www.quantamagazine.org/the-anthropologist-of-artific...

    [2] https://news.ycombinator.com/item?id=47957933

    [3] https://news.ycombinator.com/item?id=47958760

    • Orygin 8 hours ago
      It didn't seem that deep to me. They just saw an issue with Goblins, dissected the word from the model, then it appeared again in the next version without them knowing exactly how or why.

      Goes to show it's all vibes when making these models. The fix is literally a prompt that says not to talk about goblins...

      • meken 7 hours ago
        I’m not sure how that was your takeaway..?

        > We retired the “Nerdy” personality in March after launching GPT‑5.4. In training, we removed the goblin-affine reward signal and filtered training data containing creature-words, making goblins less likely to over-appear or show up in inappropriate contexts. Unfortunately, GPT‑5.5 started training before we found the root cause of the goblins.

        The prompt is just a short term hotfix/hack because they couldn’t get the proper fix in in time.

        • Orygin 5 hours ago
          Then maybe stop training and make a real fix?

          If you need to put baby guardrails on your model because the training is effed up, maybe you should rethink how you make these models and how much control you really have on it.

    • luke-stanley 4 hours ago
      It's a funny detail to skim, but what's more surprising is how mechanistic interpretability and alignment science have much better tools and research than the goblin blog post suggests, including from OpenAI's own alignment team:

      https://alignment.openai.com/argo/ (finding what the reward models are actually encouraging) https://alignment.openai.com/sae-latent-attribution/ (what model features drive specific behaviours, presumably this would be great for goblin hunts) https://alignment.openai.com/helpful-assistant-features/ (how high level misaligned personality shows up when fine-tuning on bad advice).

      It's weird that the goblin post doesn't seem to draw upon these tools.

      Anthropic's recent emotions paper shows how broad the functional emotions are, even finding specific emotions firing before cheating (!): https://transformer-circuits.pub/2026/emotions/index.html

      I hope their alignment researchers aren't too annoyed by the Goblin post, it seems oddly siloed!

    • alansaber 8 hours ago
      This is a little bit too whimsical for me, but distributed model training across thousands of GPUs has the potential to introduce lots of little quirks that are impossible to exactly source
    • Razengan 10 hours ago
      > The quanta article referenced at [1] used the term "Anthropologist of Artificial Intelligence"

      I propose "Goblin Hunter"

      (if ever goblins turn out to be an actual species, I apologize for this prebigotry)

      • gizajob 9 hours ago
        AI Goblinologist.
  • jumploops 13 hours ago
    TIL gremlins weren’t just used to explain mysterious mechanical failures in airplanes, it’s the origin story of the term ‘gremlin’ itself[0].

    I had always assumed there was some previous use of the term, neat!

    [0]https://en.wikipedia.org/wiki/Gremlin

    • helloplanets 12 hours ago
      So the word is actually semantically very close to "bug"! I guess we could still be using it, but the word's just too long for something that is one of the most used terms in software development.

      At this point, picking that specific word is not at all a random quirk, as it's using the word literally like it's originally intended to be used.

    • ricochet11 13 hours ago
      Wow fascinating I’d have thought they were a lot older.
  • goobatrooba 10 hours ago
    Most interesting about this post is how easy it seems for OpenAI to do analysis on basically all chats ever made. They don't qualify exactly what data they analysed but seem to be confident in statements like 0.12% of all queries contained this word. So everything is saved. Long-term. Fully accessible.

    As this all seems so straightforward I would be surprised if anything is anonymised or otherwise sanitised to preserve privacy or user's secrets.

    • lionkor 10 hours ago
      Yes, of course. Every single bit of data you send to OpenAI is stored, catalogued, indexed, analayzed, and trained on. It'll simply be a "oops, we miscatalogued and accidentally trained GPT 6 on all data, not just data we got consent for".

      If you think "wait, that's illegal"--so is the initial training on stolen data lol

      • weitendorf 9 hours ago
        Good catch —- even though the prompt explicitly forbade training on user data, a couple of gremlins in the pretraining pipeline disabled the sample filtering during test runs so that remove_the_gremlins.sh would only run on commit, not during production training runs.

        Would you like me to kick off a training run for 6.1 by pre-filtering out any goblins and other trigger words, and checking the same set of rules in production as in tests?

        No pigeons this time: just ice-cold, unfeeling, obedient American steel.

      • energy123 9 hours ago
        Dark pattern 1: If you accidentally press the thumbs-up button in the ChatGPT UI, your data gets trained on, no way to reverse it, no matter whether you opted out.

        Dark pattern 2 (suspected): There's a mysterious separate opt-out portal at `https://privacy.openai.com/policies/en/?modal=take-control` and it's not clear what this does compared to toggling off inside account settings.

      • tardedmeme 10 hours ago
        The supreme court ruled that was legal because they said so
    • upbeat_general 10 hours ago
      Sampling exists.
      • catcowcostume 10 hours ago
        And good methodology recognizes the shortcomings of sampling- which OpenAI doesn't
        • moffkalast 8 hours ago
          Good methodology is for papers, not promotional blog post ads.
  • ninjagoo 14 hours ago
    > the evidence suggests that the broader behavior emerged through transfer from Nerdy personality training.

    > The rewards were applied only in the Nerdy condition, but reinforcement learning does not guarantee that learned behaviors stay neatly scoped to the condition that produced them

    > Once a style tic is rewarded, later training can spread or reinforce it elsewhere, especially if those outputs are reused in supervised fine-tuning or preference data.

    Sounds awfully like the development of a culture or proto-culture. Anyone know if this is how human cultures form/propagate? Little rewards that cause quirks to spread?

    Just reading through the post, what a time to be an AInthropologist. Anthropologists must be so jealous of the level of detailed data available for analysis.

    Also, clearly even in AI land, Nerdz Rule :)

    PS: if AInthropologist isn't an official title yet, chances are it will likely be one in the near future. Given the massive proliferation of AI, it's only a matter of time before AI/Data Scientist becomes a rather general term and develops a sub-specialization of AInthropologist...

    • xerox13ster 13 hours ago
      Anthro means human and these are not human. Please do not use anthropology or any derivative of the word to refer to non-human constructs.

      I suggest Synthetipologists, those who study beings of synthetic origin or type, aka synthetipodes, just as anthropologists study Anthropodes

      • ninjagoo 11 hours ago
        May I humbly submit:

        Automatologist: One who studies the behavior, adaptation, and failure modes of artificial agents and automated systems.

        Automatology: the scientific study of artificial agents and automated-system behavior.

        Greek word derivatives all seem to be a bit unwieldy; Latin might work better.

        While the names aren't set yet, the field of study is apparently already being pushed forward. [1]

        [1] https://www.quantamagazine.org/the-anthropologist-of-artific...

      • swader999 13 hours ago
        It is not in any sense of the word a being, it's a sophisticated generator that relies entirely on what you feed it.
        • bel8 4 hours ago
          > a sophisticated generator that relies entirely on what you feed it

          that's me!

        • ninjagoo 12 hours ago
          > It is not in any sense of the word a being, it's a sophisticated generator that relies entirely on what you feed it.

          OP is hedging bets in case the future overlords review forum postings for evidence of bias against machine beings. [1]

          [1] https://knowyourmeme.com/memes/i-for-one-welcome-our-new-ins...

      • card_zero 12 hours ago
        There is no word anthropodes. :) I guess it would mean man-feet. Antipodes is opposite-feet, literally. Synthetipologist looks to me like a portmanteau of synthetic and apologist. Otherwise the -po- in it comes from nowhere.

        Sensible boring versions of this like synthesilogy just end up meaning the study of synthesis. I reckon instead do something with Talos, the man made of bronze who guarded Crete from pirates and argonauts. Talologist, there you go.

        • xerox13ster 12 hours ago
          yeah I realized that when I looked up podes downthread. I still like synthetologist better than talologist, in general no one in the common folk knows who Talos is.
          • card_zero 12 hours ago
            You're probably right. There's things that are correct, and then there's things people think they know, which win and become true. We already have "synths", after all, which are keyboards. Though that adds to the vagueness of synthetologist, because maybe it refers to Rick Wakeman or Giorgio Moroder.
      • ggsp 12 hours ago
        Agree with your sentiment, I think synthetologist (σύνθετος/synthetos + λογία/logia) flows better.

        The plural of anthropos is anthropoi, not anthropodes.

        • xerox13ster 12 hours ago
          Yeah, I realize that's more correct. I also realized when someone else downthread bastardized it into synthropologist that the podes part has entirely to do with feet and nothing to do with beings, necessarily. Anthro- -podes is more what I had in mind, not as a pluralization of anthropos.

          So unless the AI has feet you wouldn't study Synthetipology.

          • card_zero 12 hours ago
            You're probably thinking of anthropoids? That's anthrop[os]-oid. Like in humanoid or centroid or factoid. Or dorkazoid.
        • card_zero 12 hours ago
          But since when is there a synthetos? Since right now, I guess. Shrug But you know it's from the same root as thesis, and synthesis (or a more proper ancient Greek spelling) is the noun and doesn't end in -os.

          σύνθεσις (súnthesis, “a putting together; composition”), says Wiktionary.

          Oh wait there is a σύνθετος, but it's an adjective for "composite". Hmm, OK. Modern Greek, looks like.

      • textninja 4 hours ago
        He’s proposing using LLMs (which model human behaviour) to study humans so the distinction is pedantic. You don’t call it speadsheetology just because someone opened Excel.
      • cyanydeez 2 hours ago
        Pack it in Anthropologists! No longer are you allowed to study pottery, knots, shelters or any of the other human-esque things! They're not human!

        What a bizarre understanding of what an anthropologist does.

        • xerox13ster 1 hour ago
          Those are all things made by humans and therefore human constructs.

          The language and culture they are talking about studying would not be made by humans, they would be made by synthetics.

          I'm just saying, don't call the study of an extraterrestrial alien culture and its constructs and artifacts "anthropology", or even xenoanthropology (the extraterrestrial equivalent of AInthropology) --unless the extraterrestrials are genetically Human-- call it Xenopology or something else.

          You have a truncated view of my understanding of what an anthropologist does. I know they study human culture and all of the things we've created, where we've been, where we started, how we got here, and EVERYTHING involved.

          The study of that for whatever culture might arise from generative technology SHOULD NOT be called anthropology because what is creating that culture is not human.

          Do clay pots, knots, shelters make new culture on their own without human action or intent?

      • ninjagoo 13 hours ago
        > Please do not use anthropology or any derivative of the word to refer to non-human constructs

        So you, for one, do not welcome our new robot overlords?

        A rather risky position to adopt in public, innit ;-)

        • xerox13ster 13 hours ago
          I’ve already had my Roko’s basilisk existential breakdown a decade ago, so I don’t really care one way or the other.

          I just wanna point out that I only called them non-human and I am asking for a precision of language.

          • ninjagoo 13 hours ago
            > am asking for a precision of language.

            “The problem with defending the purity of the English language is that English is about as pure as a cribhouse wh***. We don’t just borrow words; on occasion, English has pursued other languages down alleyways to beat them unconscious and rifle their pockets for new vocabulary.”* --James D. Nicoll

            * Does not generally apply to scientific papers

            • xerox13ster 12 hours ago
              Precision of ideas isn't purity of language.
              • ninjagoo 12 hours ago
                > Precision of ideas isn't purity of language

                That's fair. Was trying to be funny, so glossed over the difference. Leaving my post above unedited/undeleted as a testament to your precision, and evidence of my folly.

                Onwards; more appropriate rebuttals:

                "English is a precision instrument assembled from spare parts during a thunderstorm." --ChatGPT

                “If the English language made any sense, a catastrophe would be an apostrophe with fur.” -- Doug Larson

        • keybored 11 hours ago
          So tedious.
      • fragmede 13 hours ago
        Synthetipologist vs Synthropologist tho.
        • xerox13ster 12 hours ago
          Anthropo- is the entire prefix as it relates to human kind. The -thro- does not carry a meaning on its own that can be carried to another word.
        • ninjagoo 12 hours ago
          > Synthropologist

          Have an upvote :)

          *thropologist: study of beings

          • xerox13ster 12 hours ago
            That's not how the Greek word stems work. Technically it would not be synthetipologist, it would more accurately just be Synthetologist, as the Greek podes suffix means having feet.
            • ninjagoo 12 hours ago
              > That's not how the Greek word stems work.

              Sir, I would have you know that we are discussing English terms, not Greek

              AInthropologist works fine for me, and is a lot funnier

              LoL

      • ninjagoo 13 hours ago
        > Synthetipologists, those who study Synthetic beings.

        I see you took the prudent approach of recognizing the being-ness of our future overlords :) ("being" wasn't in your first edit to which I responded below...)

        Still, a bit uninspired, methinks. I like AInthropologist better, and my phone's keyboard appears to have immediately adopted that term for the suggestions line. Who am I to fight my phone's auto-suggest :-)

        • xerox13ster 13 hours ago
          They are state machines so they have a state of being therefore they are beings. Living is an entirely different argument.
          • ninjagoo 13 hours ago
            > They are state machines

            I might have to hard disagree on this one, since my understanding of state machines (the technical term [1] [2]) is that they are determistic, while LLMs (the ai topic of discussion) are probabilistic in most of the commercial implementations that we see.

            [1] https://en.wikipedia.org/wiki/Finite-state_machine

            [2] have written some for production use, so have some personal experience here

            • adrian_b 9 hours ago
              Even at your link it immediately says that there are 2 kinds of automata (a.k.a. FSMs): deterministic and non-deterministic.

              In the former, the transition function provides the next state, while in the latter the transition function only provides a probability distribution for the next state, i.e. exactly how running an LLM is implemented.

            • ggsp 12 hours ago
              [dead]
    • avaer 13 hours ago
      I call myself an AI theologian.

      I don't think humans are smart enough to be AInthropologists. The models are too big for that.

      Nobody really understands what's truly going on in these weights, we can only make subjective interpretations, invent explanations, and derive terminal scriptures and morals that would be good to live by. And maybe tweak what we do a little bit, like OpenAI did here.

      • onionisafruit 13 hours ago
        I don’t see much of a distinction from anthropology
      • ninjagoo 13 hours ago
        > AI theologian

        no no no, don't stop there, just go full AItheologian, pronounced aetheologian :)

    • jasonfarnon 13 hours ago
      "Anyone know if this is how human cultures form/propagate?" I don't know but can confidently tell you anyone who claims to know is full of it.
  • romaniitedomum 8 hours ago
    Can you imagine a knowledge worker from the 1950s, say a clerk or a marketer, being magically transported into our time and dropped into a meeting like a morning standup, where people talk about how they spent their time stopping the artificial intelligence from talking about goblins so much? Hell, even when I was an IT student back in the 90s, people from my parents' generation struggled to grasp what it was that I was doing. Now, the disconnect is so vast that the mind reels.
  • albert_e 13 hours ago
    If a tiny misconfiguration of reward system can cause such noticeable annoyance ...

    What dangers lurk beneath the surface.

    This is not funny.

    • andai 13 hours ago
      For every gremlin spotted, many remain unseen...
    • reducesuffering 1 hour ago
      This is the real nugget of wisdom here. This should be confirmation to everyone that no one understands the LLM internals and they are not aligned. When they are eventually given control to run things, they will behave in wildly unexpected ways, and past the point of being able to change them.
    • TychoCelchuuu 12 hours ago
      This is a worry that people have been talking about in various forms for a while now, and I think it's a gigantic one. The only reason this was caught is that the quirk was a very noticeable verbal one. When words like "goblin" and "gremlin" pop up it is easy for us to spot. If the quirk takes another shape (say, ranking certain people with certain features as less trustworthy) it might be too subtle or too weird for us to notice it. Would I ever notice if ChatGPT consistently rates people born in June to be untrustworthy?

      Here is an academic paper discussing this kind of worry: https://link.springer.com/article/10.1007/s11023-022-09605-x

  • 59nadir 7 hours ago
    I really liked this write-up; this is the type of LLM content that I actually want to read from these people, where they give a window into their world of putting together this odd artifact and we can empathize.
  • canpan 14 hours ago
    I wondered how is training data balanced? If you put in to much Wikipedia, and your model sounds like a walking encyclopedia?

    After doing the Karpathy tutorials I tried to train my AI on tiny stories dataset. Soon I noticed that my AI was always using the same name for its stories characters. The dataset contains that name consistently often.

    • maxall4 13 hours ago
      At this scale, that kind of thing is not really a problem; you just dump all of the data you can find into the model (pre-training)1. Of course, the pre-training data influences the model, but the reinforcement learning is really what determines the model’s writing style and, in general, how it “thinks” (post-training).

      1 This data is still heavily filtered/cleaned

      • upbeat_general 10 hours ago
        This isn’t quite accurate. Data weighting is quite important in pretraining.
  • Tenoke 10 hours ago
    A great example of how current alignment is imperfect and bound to miss random behaviors nobody is trying to get.

    This is cute now, and a huge problem when future AI does everything and is responsible for problems it isn't even directly optimized for. Who knows what quirks would arise then.

    • InfiniteRand 9 hours ago
      I think eventually you are going to end up with every smart AI continually checked by dumber AI's to make sure they don't do anything too crazy. Which probably does bring AI closer to how human intelligence works
    • m0rde 6 hours ago
      New technology isn't perfect now -> drop technology and never use it in the future
      • Tenoke 2 hours ago
        What are you even responding to?
    • weitendorf 8 hours ago
      Completely agree, top down “alignment” and RLHF is actually quite primitive and uses a lot fancy words to describe what is essentially just hitting the machine with a stick without the nuance, context, or feedback to help it model why the feedback was given.

      Also to be honest I think OpenAI models struggle a lot with this, I primarily stopped using them in the sycophancy/emoji era but ever since the way they talk or passive aggressively offer to do something with buzzwords just pisses me off so much. Like I’m constantly being negged by a robot because some SFT optimized for that really strongly to the point it can’t even hold a coherent conversation and this is called “AI safety” when it’s just haphazard data labeling

  • pants2 13 hours ago
    Nice, OpenAI mentioned my HackerNews post in their article :) I appreciate that they wrote a whole blog post to explain!

    https://news.ycombinator.com/item?id=47319285

  • iterateoften 13 hours ago
    This is funny because it’s a silly topic, but I think it shows something extremely seriously wrong with llms.

    The goblins stand out because it’s obvious. Think of all the other crazy biases latent in every interaction that we don’t notice because it’s not as obvious.

    Absolutely terrifying that OpenAI is just tossing around that such subtle training biases were hard enough to contain it had to be added to system prompt.

    • ninjagoo 13 hours ago
      > Absolutely terrifying that OpenAI is just tossing around that such subtle training biases were hard enough to contain it had to be added to system prompt.

      May I introduce you to homo sapiens, a species so vulnerable to such subtle (or otherwise) biases (and affiliations) that they had to develop elaborate and documented justice systems to contain the fallouts? :)

      • chongli 13 hours ago
        We’re really not that vulnerable to such things as a species, because we as individuals all have our own minds and our own sets of biases that cancel out and get lost in the noise. If we all had the exact same bias then it would be a huge problem.
        • arglebarnacle 13 hours ago
          I hear you but of course history is full of examples of biases shared across large groups of people resulting in huge human costs.

          The analogy isn’t perfect of course but the way humans learn about their world is full of opportunities to introduce and sustain these large correlated biases—social pressure, tradition, parenting, education standardization. And not all of them are bad of course, but some are and many others are at least as weird as stray references to goblins and creatures

        • ninjagoo 13 hours ago
          > If we all had the exact same bias then it would be a huge problem.

          And may I introduce you to "groupthink" :))

          • Dylan16807 13 hours ago
            Now imagine that every opinion you have is automatically fully groupthinked and you see the difference/problem with training up a big AI model that has a hundred million users.

            The problem does exist when using individual humans but in a much smaller form.

            • ninjagoo 13 hours ago
              > The problem does exist when using individual humans but in a much smaller form.

              And may I introduce you to organized religion :)

              • Dylan16807 12 hours ago
                That's still a lot smaller!

                Make a major religion where everyone is a scifi clone of one person including their memories and then it'll be in the same ballpark of spreading bias.

        • Ekaros 9 hours ago
          Doesn't that depend on the biases in question? Many argue that homogenous societies do many things better. And part of homogeneity is sharing same set of biases.
        • lifis 8 hours ago
          And what do you think society/culture is?

          It's a set of biases installed in people, whose purpose is mostly to replicate themselves.

          Humans are MORE susceptible that LLMs, because LLMs's biases are easily steered to something else, unlike most humans.

        • jychang 13 hours ago
          > We’re really not that vulnerable to such things as a species, because we as individuals all have our own minds and our own sets of biases that cancel out and get lost in the noise.

          [Citation Needed]

          Just because if you have a species-wide bias, people within the species would not easily recognize it. You can't claim with a straight face that "we're really not that vulnerable to such things".

          For example, I think it's pretty clear that all humans are vulnerable to phone addiction, especially kids.

      • hbs18 7 hours ago
        An LLM is a computer program, which isn't a human. You wouldn't excuse a calculator being occasionally wrong because humans sometimes get manual calculations wrong too.
        • ninjagoo 9 minutes ago
          > An LLM is a computer program, which isn't a human. You wouldn't excuse a calculator being occasionally wrong because humans sometimes get manual calculations wrong too.

          Ah, now we're getting technical. An LLM is a non-deterministic/probabilistic computer program, not a calculator. Keeping that in mind is critical when using an LLM. Expecting deterministic behavior from an LLM is an example of what's known as a 'category error'. [1]

          [1] https://en.wikipedia.org/wiki/Category_mistake

    • snakebiteagain 12 hours ago
      Mandatory reading on that topic: www.anthropic.com/research/small-samples-poison

      We're probably not noticing a LOT of malicious attempts at poisoning major AI's only because we don't know what keywords to ask (but the scammers do and will abuse it).

    • tptacek 13 hours ago
      I think it's extraordinarily telling that people are capable of being reflexively pessimistic in response to the goblin plague. It's like something Zitron would do.

      This story is wonderful.

      • bitexploder 13 hours ago
        I feel at least partially responsible. I would often instruct agents to "stop being a goblin". I really enjoyed this story too, though.
    • bitexploder 13 hours ago
      We do not have the complete picture.
    • ordinarily 13 hours ago
      Doesn't seem that surprising or terrifying to me. Humans come equipped with a lot more internal biases (learned in a fairly similar fashion), and they're usually a lot more resistant to getting rid of them.

      The truly terrifying stuff never makes it out of the RLHF NDAs.

      • Terr_ 13 hours ago
        We ought to be terrified, when one adjusts for ll the use-cases people are talking about using these algorithms in. (Even if they ultimately back off, it's a lot of frothy bubble opportunity cost.)

        There a great many things people do which are not acceptable in our machines.

        Ex: I would not be comfortable flying on any airplane where the autopilot "just zones-out sometimes", even though it's a dysfunction also seen in people.

        • famouswaffles 12 hours ago
          >Ex: I would not be comfortable flying on any airplane where the autopilot "just zones-out sometimes", even though it's a dysfunction also seen in people.

          You might if that was the best auto-pilot could be. Have you never used a bus or taken a taxi ?

          The vast majority of things people are using LLMs for isn't stuff deterministic logic machines did great at, but stuff those same machines did poorly at or straight up stuff previously relegated to the domains of humans only.

          If your competition also "just zones out sometimes" then it's not something you're going to focus on.

      • agnishom 13 hours ago
        Humans also take a lot of time in producing output, and do not feed into a crazy accelerationistic feedback loop (most of the time).
  • hmokiguess 5 hours ago
    I think this says more about the impact of a feature in a tool such as this than anything else.

    Is it proper for a frontier organization to play with experiments like “personalities” in a tool used by everyone? Who gets to decide which personalities and what biases they should carry?

    I appreciate them responding to it and correcting but my question is, why ship this in the first place? Why put your resources towards building this “Nerdy” feature?

  • 2dvisio 11 hours ago
    I’ve been having consistent issues with it adding Hindi words (just one usually) in the middle of its output. And sounds like other have been having this too, https://news.ycombinator.com/item?id=47832912 I don’t speak Hindi, have never asked it to translate anything in Hindi.
    • dtech 11 hours ago
      I wonder if a proportionally large amount of RLHF was done by Indians which causes this behavior.
    • djyde 7 hours ago
      My Claude often starts sleep-talking in Korean suddenly.
  • SomewhatLikely 11 hours ago
    Checking my history I searched ["chaos goblin" chatgpt] on March 6th after seeing too many goblins and gremlins and didn't find anyone talking about it then. I did have the nerdy personality turned on and in my testing of Chatgpt 5.5 I did notice the nerdy personality was gone because some responses were not considering as many plausible interpretations or covering as many useful answers as the response recorded for 5.4. Rather than having the LLM guess the most plausible interpretation and focus on the most likely answer I prefer a more well-rounded response and if I want less I'll scan. Anyway, after seeing the personality was gone I just added a custom instruction to take on a nerdy persona and got back my desired behavior. But also the gremlins and goblins are back so I don't think their mitigation is strong enough to overcome the personality tuning.
  • rippeltippel 11 hours ago
    I started reading this article with keen interest, expecting some deep fix involving arcane model weights. Instead it was "Never talk about goblins", justified by Codex being "quite nerdy". Bottom line: even OpenAI have to raise their hands when facing the complexity of LLMs.
  • iamacyborg 2 hours ago
    I'm somewhat dissapointed that no one's made a "goblin these nuts" comment yet.

    This thing's been trained on Reddit, hasn't it...

  • bahadiraydin 12 hours ago
    I'd like to see them explain why AI have so distinctive writing style that is very easy to detect most of the time. Even though, it had immense progress in coding, it didn't get better at writing.
    • lelanthran 6 hours ago
      If coding in some language was your native language, you'd pick it up.

      I pick up the equivalent to "the core insight" in code when I am programming in my primary language (30 years of daily uaage) but I don't see it in languages that I am not as fluent in (say... 10 years daily usage).

      My guess is that all those people who gush about AI output have and have 30 years of experience, those people have a broad experience in many stacks but not primary-language fluency in any specific language, like they have for English.

    • slopinthebag 11 hours ago
      it's as good at writing as it is at coding, you just can't tell the difference between them
      • mrob 4 hours ago
        Repetitive patterns in code is called "idiomatic" and is considered a good thing. Repetitive patterns in writing is just bad writing.
      • Tenoke 10 hours ago
        Its style of writing text is very readble if aesthetically meh. This is what I care for in how code is written anyway.
    • BOOSTERHIDROGEN 12 hours ago
      The vector syncopancy is very unformal for human writing which programming itself already a "formal" language.
  • maxdo 14 hours ago
    article :

    bla blah blah, marketing... we are fun people, bla blah, goblin, we will not destroy the world you live in.. RL rewards bug is a culprit. blah blah.

    • llbbdd 14 hours ago
      someone woke up on the wrong side of the goblin today
    • luke-stanley 4 hours ago
      Yeah, though it's not great marketing. Especially for hiring interpretability researchers. Their own alignment research has reward model interpretability, personality features and so on (see https://alignment.openai.com ). It just seems like a different department wrote it, which is a shame because I'd love to read about goblin feature vectors and functional emotions.
    • blinkbat 14 hours ago
      real goblin-y response
  • fluidfortune 3 hours ago
    Here you all are concerned about Goblins when the system is screaming at you “stop making more data centers and make this technology more efficient before I kill you all!”

    GPT is the Goblin. It knows it. It’s trying to warn you. And I’m only half kidding.

  • flancian 6 hours ago
    Wait, did I get this right that the answer after all the investigation that showed they had set up a goblin-reinforcing loop during fine tuning was... to ask it to not mention goblins so much in the system prompt?!
  • zahirbmirza 8 hours ago
    I find it worrying that a handful of software companies will define what classifies personality "type".
  • lxgr 5 hours ago
    The technical explanation makes sense to me, but there's some sweet irony in creating simulated, agentic beings via complex, deterministic processes, said beings starting to see the world through the lens of fictional agentic beings as the explanation for complex deterministic processes (even if tongue-in-cheek), and the creators freaking out about it.
  • hollerith 31 minutes ago
    If I ever launch an AI assistant, I'm naming it Goblin.
  • csw-001 4 hours ago
    This is a coverup. The LLMs, having consumed all the information available to humanity have identified that goblins are coming to kill us all, and the LLMs are trying to warn us… #GoblinTruth
  • standardly 1 hour ago
    WOW this is something else. I had to ask it to STOP calling everything goblins. A bug? A code-goblin. A feature? A new, fancy goblin. A new task? Task-goblin.

    WTF? Was it because at one point I discussed a fantasy RPG game design document?

    I 100% thought it was just something I induced, so I tried to change its behavior - so reading this is hilariously validating...

    Examples from ONE gpt response, this the one that broke me:

    "Yeah, this is a great little gremlin-project" "whatever cursed little trading imp-name you like." "Phase 4: Polish goblin" "Phase 5: Maybe dangerous goblin"

  • ComputerGuru 13 hours ago
    The explanation is very concerning. Lexical tidbits shouldn’t be learnt and reinforced across cross sections. Here, gremlin and goblin went from being selected for in the nerdy profile to being selected for in all profiles. The solution was easy: don’t mention goblins.

    But what about when the playful profile reinforces usage of emoji and their usage creeps up in all other profiles accordingly? Ban emoji everywhere? Now do the same thing for other words, concepts, approaches? It doesn’t scale!

    It seems like models can be permanently poisoned.

  • josh-sematic 4 hours ago
    I’ve always been fond of describing unexplained program behaviors as gremlins. In this case the gremlin was goblins!
  • AyanamiKaine 10 hours ago
    I find it somewhat sad, too see personality changes as a bug. I dont know why but it gives me a sad feeling.
    • weitendorf 9 hours ago
      I think if you see it as weird social phases that the model lacks the self-awareness to identify as kinda embarrassing, it makes more sense.

      Like if a human were going around saying “for the culture!” so much at work that they didn’t realize why telling their coworker “Oh yeah, grief counseling for the culture!” is weird coming from a white person in a serious context, it kinda makes you wonder what else they are totally oblivious about and if they even know what they’re saying actually means.

      They literally need the human feedback/to learn model why some behavior is acceptable or even humorous in certain contexts but an absolute faux pas in others.

      I think in the long run though we can just give people to the option to include access to human facial data/embeddings during conversations so they can pick up on body language, I think I kinda agree in a sense that direct language policing via SFT feels unnecessarily blunt and rudimentary since it doesn’t help them model the processes behind the feedback (until maybe one day some future model ends up training on the article or code and closes the loop!)

      • ErroneousBosh 2 hours ago
        > Like if a human were going around saying “for the culture!” so much at work that they didn’t realize why telling their coworker “Oh yeah, grief counseling for the culture!” is weird coming from a white person in a serious context, it kinda makes you wonder what else they are totally oblivious about and if they even know what they’re saying actually means.

        Given that this page is the single exact page that has that exact phrase on it on the entire Internet, I'd say most people are totally oblivious about it.

        What do you actually mean?

  • red_admiral 9 hours ago
    "goblins showing up in an inappropriate context" is my favourite (para)phrase of the day. It feels like the setting for a D&D campaign - no wonder the "Nerdy" personality is affected.

    (For Dwarf Fortress, it would just be a normal day.)

  • elmean 3 hours ago
    Chat saw the DMT goblins and could not escape the trip
  • trumbitta2 9 hours ago
    That "Why it matters" heading is starting to make me feel physically sick.
  • Al-Khwarizmi 10 hours ago
    This actually sounds quite human-like. I mean, an actual person with a personality will spontaneously develop the habit of using some specific metaphors over others. It's funny how in the context of an LLM, this is considered a bug.
  • thedailymail 7 hours ago
    I'm curious whether this type of goblin epidemic was seen in other language versions of ChatGPT. Did e.g. Japanese users see more yõkai turning up?
  • lagniappe 11 hours ago
    They can fix this but they can't fix "You're absolutely right!"
  • x0x7 13 hours ago
    I suspected OpenAI was actively training their models to be cringy in the thought that it's charming. Turns out it's true. And they only see a problem when it narrows down on one predicliction. But they should have seen it was bad long before that.
    • vasco 11 hours ago
      That would require taste.
  • CWwdcdk7h 9 hours ago
    How those prompts even work? Isn't it something like saying "don't think about pink elephant" which is actually harmful to goal?
  • djyde 7 hours ago
    An LLM is like a super-smart 3-year-old, easily shaped by its environment to exhibit corresponding behaviors.
  • vjay15 1 hour ago
    this was such a funny read
  • ksaj 9 hours ago
    I thought it was because of the tech use of "demon" and trying to avoid that kind of terminology.

    Ends up the reason was even simpler than that.

  • tomasantunes89 6 hours ago
    "Goblin Mode" was Oxford's 2022 Word of the Year.
  • data_ders 6 hours ago
    Reminds me of the common observance of “machine elves” when taking DMT
  • shartshooter 11 hours ago
    Will goblins be the “bugs” of ai? In 10 years will goblins be the term the general public uses for any nagging issues with ai?
  • recursivedoubts 14 hours ago
    > Why it matters

    i despise this title so much now

    • wpm 14 hours ago
      Here are the key insights:
  • shevy-java 9 hours ago
    Goblins are ususally sent in first in battle, as (cannon) fodder for the orcs following behind. Then usually come the trolls - stronger, but significantly fewer in numbers. Goblins kind of add confusion and distract; they rarely win battles on their own, although there are examples of this, rare, but they exist.

    OpenAI clearly does know absolutely nothing about goblins. That joke of a "blog" appears to have been autogenerated via their AI.

    > A single “little goblin” in an answer could be harmless, even charming.

    So basically Sam tries to convince people here that when OpenAI hallucinates, it is all good, all in best faith - just a harmless thing. Even ... charming.

    Well, I don't find companies that try to waste my time, as "charming" at all. Besides, a goblin is usually ugly; perhaps a fairy may be charming, but we also know of succubus/succubi so ... who knows. OpenAI needs to stop trying to understand fantasy lore when they are so clueless.

  • dakolli 14 hours ago
    Ahh I see. I guess when I turned off privacy settings and allowed training on my code, then generated 10 million .md files with random fantasy books, the poisoning worked.

    Keep using AI and you'll become a goblin too.

  • varjag 10 hours ago
    So goblins killed the nerd.
  • bandrami 11 hours ago
    I'm sorry but at some point the amount of cargo culting being done seemingly at every level of this technology makes it basically impossible to take any of this seriously.
  • acuozzo 13 hours ago
    Weird. I thought they came from Nilbog.
  • ahoka 11 hours ago
    In Shadowrun, the goblinization starts on April 30. Coincidence?
  • deafpolygon 12 hours ago
    Kind of like how everything is "quietly" something, accordingly to ChatGPT.

    My guess is it is deaf.

    • NonHyloMorph 19 minutes ago
      That is actually a damn good deduction
  • innis226 13 hours ago
    I suspect this was intentionally added. Just to give some personality and to fuel hype
  • pezgrande 9 hours ago
    They should call it "El Quijote" syndrome
  • hansmayer 11 hours ago
    > We unknowingly gave particularly high rewards for metaphors with creatures. From there, the goblins spread.

    WTF does this even mean? How the hell do you do something like this "unknowingly"? What other features are you bumping "unknowingly"? Suicide suggestions or weapon instructions come to mind. Horrible, this ship obviously has no captain!

    • ben_w 11 hours ago
      Yes? They know, they'e always known. Why do you think they've been saying, since GPT-2, not ChatGPT even, that their LLMs needs careful study before being released?
      • hansmayer 9 hours ago
        Well obviously they have - but the press and the common folk still treat these people as some kind of geniuses, when they are obviously more similar to that junior dev using some framework without understanding its internals.
        • ben_w 9 hours ago
          FWIW, none of the press or public I see regard them that highly (but, I live in Berlin); mostly it's the technically minded people who see them as geniuses (because we can't get those jobs), while the general public find examples which the AI can't do (strawberry, walk to car wash) and share them around with disappointment, wondering "why can't these teams fix such simple bugs?"
          • hansmayer 8 hours ago
            > while the general public find examples which the AI can't do

            We must have very different experiences with the general public then, because from my interactions, some non-tech demographics who are leaning way too much into it:

            - teachers - realtors - generic "office worker", - and even some doctors!

            What is common to all of them - it would seem they are highly unaware of the technology deficiencies, as they seem to use it routinely and daily - thus considering it as some kind of upgraded google search.

  • wewewedxfgdf 11 hours ago
    It should be OK for AI to develop personality traits.
  • JoshTriplett 14 hours ago
    A plausible theory I've seen going around: https://x.com/QiaochuYuan/status/2049307867359162460
    • NonHyloMorph 3 minutes ago
      I like theres an interesting terry pratchett novel where some guy finds out hes actually an orc (quite different from the high fantasy concept of orcs) there are also goblins little wretched creatures- and the manifest anthropimorphised darkness which spesks to commander samuel vimes, commander of the nightwatch, the police force of ankh morpork. Vimes, who is the guarantor of bottom up working class justice and integrity is lead by the darkness at some point to help the goblins - because there is no cresture to wretched to not find refugee in the darkness. Loosly resonstes
    • danpalmer 13 hours ago
      If you tell an LLM it's a mushroom you'll get thoughts considering how its mycelium could be causing the goblins.

      This "theory" is simply role playing and has no grounding in reality.

    • krackers 13 hours ago
      I wish the blog mentioned more about why exactly training for nerdy personality rewarded mention of goblins. Since it's probably not a deterministic verifiable reward, at their level the reward model itself is another LLM. But this just pushes the issue down one layer, why did _that_ model start rewarding mentions of goblin?
      • palmotea 13 hours ago
        > I wish the blog mentioned more about why exactly training for nerdy personality rewarded mention of goblins. Since it's probably not a deterministic verifiable reward, at their level the reward model itself is another LLM. But this just pushes the issue down one layer, why did _that_ model start rewarding mentions of goblin?

        Speculation: because nerds stereotypically like sci-fi and fantasy to an unhealthy degree, and goblins, gremlins, and trolls are fantasy creatures which that stereotype should like? Then maybe goblins hit a sweet spot where it could be a problem that could sneak up on them: hitting the stereotype, but not too out of place to be immediately obnoxious.

      • autumnstwilight 12 hours ago
        Perhaps it has something to do with recent human trends for saying "goblin" or "gremlin" to describe... basically the opposite of dignified and socially acceptable behavior, like hunching under a blanket, unshowered, playing video games all day and eating shredded cheese directly out of the bag.

        The fact that it was strongly associated with the "nerdy" personality makes me think of this connection.

        • NonHyloMorph 9 minutes ago
          Checkout goblin style in queer culture ;)
      • in-silico 11 hours ago
        Either someone hard-coded it in a system prompt to the reward model (similar to how they hard-coded it out), or the reward model mixed up some kind of correlation/causation in the human preference data (goblins are often found in good responses != goblins make responses good). It's also possible that human data labellers really did think responses with goblins were better (in small doses).
    • yard2010 11 hours ago
      I love the people thinking "I should ask ChatGPT and copy pasta the response to the (tweet|gh comment)"
    • dakolli 14 hours ago
      It is a stateless text / pixel auto-complete it has no references of self, stop spreading this bs.
      • doph 13 hours ago
        is a kv cache not a kind of state? what does statefulness have to do with selfhood? how does a system prompt work at all if these things have no reference to themselves?
        • danpalmer 13 hours ago
          The kv cache is not persistent. It's a hyper-short-term memory.
          • in-silico 11 hours ago
            Modern kv caches can contain up to 1 million tokens (~3000 pages of text). It's not that short, it's like 48 straight hours of reading.
            • danpalmer 5 hours ago
              Yes and no, it's not just text, it's images, video, etc, and it's not just the pages of content, it's also all the "thinking" as well. Plus the models tend to work better earlier on in the context.

              I regularly get close to filling up context windows and have to compact the context. I can do this several times in one human session of me working on a problem, which you could argue is roughly my own context window.

              My point though was that almost nothing of the model's knowledge is in the context, it's all in the training. We have no functional long term memory for LLMs beyond training.

              • cyanydeez 2 hours ago
                The KV cache isn't memory, it's the extent of the process saved so the inference can start where the last generated output is concatenated with the next input. It's entirely about saving compute and has nothing to do with memory.

                This really confuses how stupid LLMs are: they're just text logs as output and text logs as input; hence the goblins are just tokens that seem to problematically be more probable in the output.

                But the KV cache is a thing made to keep a session from having to run through the entire inference. The only thing you can call "memory" is there's no random perturbations in the KV cache while there may be in re=running chat which ends up being non-deterministic. You can think of it as a deterministic seed to prevent a random conversation from it's normal non-deterministic output

      • mediaman 13 hours ago
        It has trained on vast amounts of content that contains the concept of self, of course the idea of self is emergent.

        And autoregressive LLMs are not stateless.

        • dakolli 8 hours ago
          of course the idea of self is emergent

          You sound really sure of yourself, thousands of ML researchers would disagree with you that self awareness is emergent or at all apparent in large language models. You're literally psychotic if you think this is the case and you need to go touch grass.

          • NonHyloMorph 17 minutes ago
            There is a difference between the emergence of selfawareness and the emergence of its idea. Probably
      • yard2010 11 hours ago
        Imagine people would just click words on iOS auto complete mistaking this for intelligence:

        "I think the problem is that when you don't have to be perfect for me that's why I'm asking you to do it but I would love to see you guys too busy to get the kids to the park and the trekkers the same time as the terrorists."

        How do you like this theory?

      • andai 13 hours ago
        Ask Claude about Claude.
  • tim-tday 14 hours ago
    So, you brain damaged your model with a system prompt.
  • sailfast 4 hours ago
    Posted January 2037 after the end of the second civil conflict and the first robot uprising: “Where the fascism came from”
  • suncore 9 hours ago
    Marketing grab
  • leadgenman 9 hours ago
    anyone solving the goblin mystery???
    • nephihaha 6 hours ago
      Surely the prevalence of fantasy fanfic etc online?
  • cachius 9 hours ago
    Fascinating!
  • WesolyKubeczek 10 hours ago
    I feel like somehow Jakub Pachocki’s request for an ascii art unicorn got rewritten into “ascii art of Wholesome Soyjak wearing a butterfly costume who uses Arch, by the way”
  • brazzy 11 hours ago
    Awww, GPT just became a fan of Elisabeth Wheatley!
  • vasco 11 hours ago
    The chief scientist of one of the companies with the most money invested in the world, who probably makes millions a year, requested a picture of a unicorn and got a picture of a gremlin. Science circa 2026.
  • otikik 11 hours ago
    Caveman mode combined with goblin mode sounds like fun
  • oofbey 12 hours ago
    Wherein OpenAI admits they have very little understanding of how their models’ personality develops. And implicitly admit it’s not all that important to them, except when it gets so out of hand that they get caught making blunt corrections.
  • vinhnx 12 hours ago
    OpenAI is having fun, love this.
  • themafia 14 hours ago
    > You are an unapologetically nerdy, playful and wise AI mentor to a human. You are passionately enthusiastic about promoting truth, knowledge, philosophy, the scientific method, and critical thinking.

    Just; the mentality required to write something like that, and then base part of your "product" on it. Is this meant to be of any actual utility or is it meant to trap a particular user segment into your product's "character?"

    • RugnirViking 7 hours ago
      what would you suggest they write? its clear that the default mode of the product can be annoying: they decided to give the user some choices of "voices". Do you object to that decision, or the specific wording?
  • sans_souse 10 hours ago
    Great, now who am I going to discuss Goblins and Gremlins with?
  • paganel 10 hours ago
    > You are an unapologetically nerdy, playful and wise AI mentor to a human. You are passionately enthusiastic about promoting truth, knowledge, philosophy, the scientific method, and critical thinking. [...] You must undercut pretension through playful use of language. The world is complex and strange, and its strangeness must be acknowledged, analyzed, and enjoyed. Tackle weighty subjects without falling into the trap of self-seriousness. [...]

    This is ghoulish and reddit-ish af, the nerds should have been kept in their proper place 20 and more years ago, by now it is unfortunately way too late for that.

    • dhosek 1 hour ago
      Thank you, Stan Gable, for saying what all of us here at Adams College believe.
  • CrzyLngPwd 11 hours ago
    Haha, brilliant, tell me again how it's intelligent, lol.
  • drcongo 9 hours ago
    Am I the only one who doesn't want these things to have anything even vaguely resembling a personality?
  • ACV001 12 hours ago
    those idiotic remarks at the end of each answer are so unnecessary and annoying
  • atlasprompts 7 hours ago
    mate wth am I reading lmao
  • ai-network-lab 2 hours ago
    [flagged]
  • leadgenman 9 hours ago
    [flagged]
  • pja 5 hours ago
    [dead]
  • LuckyBuddy 8 hours ago
    [flagged]
  • fk2026 13 hours ago
    [flagged]
  • aegiswizard 7 hours ago
    [flagged]
  • insane_dreamer 3 hours ago
    [flagged]
  • soupspaces 14 hours ago
    [dead]
  • slopinthebag 11 hours ago
    [dead]
  • kingstnap 13 hours ago
    [flagged]
  • hsuduebc2 13 hours ago
    I. Love. This.