Pre-2022 Books

(notes.lorenzogravina.com)

101 points | by trms 1 hour ago

22 comments

zerobees 52 minutes ago
I've been consciously doing that for reference books for the past three years because Amazon is absolutely littered with AI-generated non-fiction. I have my own ideological reasons too, but the main problem is that most of that AI-generated reference stuff is just of incredibly poor quality. It's meant to saturate the platform as cheaply as possible, so no one actually does any fact-checking, editing, layout, and so on. They're not even using frontier models for that.
For example, there are multiple evidently AI-generated titles that come up on the front page if you search for "Rust programming", "cybersecurity book", etc. I guess I can't rule out that "Winston Knowles" is a real person, but I'm not gonna bet money on that: https://www.amazon.com/Cybersecurity-Career-Manual-Interview...
Oh, and here's one of the top-ranked reference books right now: https://www.amazon.com/100-000-Whys-Kids-Encyclopedia/dp/B0H... - click "read sample". Almost every illustration is wrong in some obvious way - misplaced labels, nonsensical anatomical details, etc.
RomanPushkin 1 hour ago
It's one of the reasons I don't want to update my free book about Ruby: https://leanpub.com/rubyisforfun - written by a human, and it will stay the same forever, I think. The moment you touch it to update it, the date changes from 2022-05-26 to 2026 and all the value is gone.
dspillett 20 minutes ago
Not just books. When searching for information online, for anything where things haven't changed significantly in the last few years, I definitely favour a post on SO/SE/HN/reddit/etc dated before 2023 over those that are later. And where there are no good-looking references before that, the earlier the better.
Of course there are no doubt people out there realising that a fair few of us do this, and are starting to edit posts to pre-date them as a sort of SEO trick…
[-]
- golem14 18 minutes ago
  In this case, archiv.* might be your friend, since they have time stamped copies.
adamddev1 40 minutes ago
And there might be no way to prove you really wrote something Post-2022. I wrote a long article, all by hand. I never used any LLMs, even for searching. I checked it with a couple of AI detection tools and they confidently said that 60% of the article was written by AI.
[-]
- algoth1 34 minutes ago
  The thing is, llm token frequency was derived from human writings like yours, and rlhf for good writing practices, like the emdash. So getting ’detected’ on good writing it’s unfortunately to be expected. My broken ESL english is much safer for now
  [-]
  - timacles 9 minutes ago
    Our only hope is to start communicating like DevOps Borat
    [-]
    - mannycalavera42 2 minutes ago
      hats off
    - BobbyTables2 1 minute ago
      [flagged]
  - John7878781 32 minutes ago
    Interestingly, you're actually more likely to be flagged as AI if English is your second language.
    [-]
    - heffer 25 minutes ago
      That tracks with reality, as the majority of people don't have English as their first language. Depending on the data sources used for training, that could well be reflected in AI detection tools.
- mohamedkoubaa 29 minutes ago
  I track every change in the fiction I write in git. It's not hard proof, but it at least shows how the prose evolved over time and is something like a proof of work.
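  To illustrate why a commit history is tamper-evident but still not proof of authorship, here is a minimal sketch of a hash-chained revision log. It is only loosely modeled on how git links commits; the function names `add_revision` and `verify`, the record layout, and the timestamps are hypothetical and are not git's actual object format.

```python
import hashlib
import json


def _digest(prev_hash, timestamp, text):
    # Hash the previous hash together with this revision's timestamp and text,
    # so each entry depends on everything that came before it.
    payload = json.dumps(
        {"prev": prev_hash, "time": timestamp, "text": text}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def add_revision(history, text, timestamp):
    """Append a draft revision to the history list and return its hash."""
    prev_hash = history[-1]["hash"] if history else ""
    entry = {
        "prev": prev_hash,
        "time": timestamp,
        "text": text,
        "hash": _digest(prev_hash, timestamp, text),
    }
    history.append(entry)
    return entry["hash"]


def verify(history):
    """Return True if no stored revision has been altered or reordered."""
    prev_hash = ""
    for rev in history:
        if rev["prev"] != prev_hash:
            return False
        if _digest(prev_hash, rev["time"], rev["text"]) != rev["hash"]:
            return False
        prev_hash = rev["hash"]
    return True


if __name__ == "__main__":
    drafts = []
    add_revision(drafts, "It was a dark night.", "2024-01-01T10:00:00")
    add_revision(drafts, "It was a dark and stormy night.", "2024-01-02T09:30:00")
    print(verify(drafts))
```

  Editing an earlier draft after the fact breaks every later hash, so the log shows how the text evolved, though it cannot show who or what produced each revision.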
  [-]
  - arkaic 4 minutes ago
    It's extremely easy to cast doubt on even that soft proof, too. One could have individually ChatGPT'd every addition before committing it, for proofreading. The infection of AI just gets into every nook and cranny of the process because it's so easy to reach for it.
    [-]
    - mohamedkoubaa 1 minute ago
      Right, it isn't proof and I won't claim it is, but it's more diligence than I think is normal. I hope that counts for something.
- raincole 15 minutes ago
  Every time I attempted to convince people to not use Pangram on HN I got downvoted.
- altmanaltman 27 minutes ago
  Most "AI detection" tools are BS and report AI writing for all writing. The current issue with AI writing is that it has a very generic, easy-to-spot style if you spend even a bit of time working with it. If everyone in the world spoke in the same manner, used the same punctuation, spoke in exactly the same catchphrases, the world would lose its richness, and that is the problem with AI writing and communications in general - it has 0 personality, and humans by nature engage with strong and unique personalities. Over time, people will realize the futility of using AI in creative writing, or AI will get really, really good at not just sounding human but being human.
  Yet, from what I can see, AI writing is mostly used by people who don't know a thing about writing, and because they have bad taste, they do not see what's wrong with AI writing and put it out there.
  At the end, you write for a purpose: for marketing copy, etc., you would require a different type of writing talent than something like writing a fiction book. But AI doesn't understand this nuance; it has only a default type of communication, which is highly optimized for being a chatbot. It is possible to write a very good text using AI if you have taste and you know what you're doing, but most people don't.
  Similarly, a lot of vibe-coded apps are garbage, but because the people creating them lack software domain knowledge and don't even know what they don't know, they think it's good and put it out.
  We have a massive problem here that's not just limited to writing - the promise of AI for the mainstream market is that you can replace domain-specific knowledge and have world-class execution in any vertical with just AI, but that's very overhyped imo, and it doesn't stop people who don't have domain experience from trying out stuff with AI and not realizing that what they made is a steaming pile of shit in reality.
YesBox 28 minutes ago
You are not alone. I like to read Harry Potter fan fiction [1] and I have started checking the publication date when I'm searching for something new to read. I started doing this passively and realized it after the fact.
Have you ever met someone who could say and do all the right things but never made you feel anything, or your gut was sensing an ulterior motive? It's a magic trick we are all bewitched by at some point in our lives. I suppose I filter by published year because I don't want to think about whether I am being tricked or not.
[1] There are some very talented writers[A] out there who (I assume) cannot do the world building part.
[A] Recent Favorite: https://archiveofourown.org/works/1134255/chapters/2292768
Avicebron 58 minutes ago
Props to the author for not mentioning low-background steel.
[-]
- rzzzt 54 minutes ago
  Sounds like a pink elephant exercise. Low-background steel has now crept back into our collective consciousness.
bashmelek 10 minutes ago
I’m in my mid-30s, and have never written a book, but I still sometimes think of it. I know it isn’t too late. I still want to create my own applications, but I once used the Google AI result in a utility function. Is it all tainted? I still want people around me to try in earnest.
raincole 18 minutes ago
I feel that too. But the reasonable part of me knows that it's just that one generation can't "get" the entertainment of the next generation. It has always been like that.
There are mobile game ads on TV here. My father asked me what the players actually get from paying the game companies money. He still doesn't get it after I tried to explain how it works twice.
drchaim 48 minutes ago
This will happen with social accounts, news articles... I set the date pre-2023, but we all have some date in mind. I don’t like it, but it is what it is.
andy99 47 minutes ago
I’m pretty conservative with books and usually only read things based on recommendations anyway. I would rarely read a book published in the last few years just because news of it hasn’t travelled to me yet. I think worrying about AI generated books would really only matter if you’re at the bleeding edge of reading and looking for brand new stuff.
cryo32 32 minutes ago
It’s not just 2022 and earlier books. There’s a supply chain problem as well. I’ve seen two older books so far from Amazon which were AI generated copy text with a genuine looking cover on it. Amazon just took the return and probably restocked it for the next victim.
I tend to buy books from second-hand book shops and eBay now, usually older or well-used copies, which is a good sign of their authenticity.
fallat 17 minutes ago
There have always been shit books - let that sink in.
There will be _more_ shit books now, but that's the only difference.
There will probably be a constant rate of "good" books.
wenbin 17 minutes ago
If contents are generated instantly via LLM and packaged as books, videos, podcasts, pull requests, etc., then they don’t deserve human attention.
tyre 30 minutes ago
Hot take: I think it’ll be pretty much the same as it was. If anything it will get better.
You will still have gatekeepers and taste makers. Publishing houses will screen fiction for well-written and interesting work. Word-of-mouth, personal recommendations, and endorsements from people you respect will continue to outweigh algorithms, if you care.
For cheap reads, how much of a difference is there between James Patterson’s 734th beach read thriller and what an LLM with a 50m token context window can produce? Does it matter that it’s not written by six ghostwriters? Probably not to the median Hudson News buyer.
For non-fiction, it’s easier to gather research and related materials. If you were cherry-picking facts to make a narrative, yeah, that’s easier, but it’s not like we haven’t gotten really good at that anyway. Again, there will be cooling off periods for scholarship to be debated and coälesce.
What will get better is people asking questions and getting well-researched pieces on a specific niche or confluence of topics. AI is just-good-enough-to-be-dangerous now. It will get better. We’ll learn to harness it (literally) to iteratively fact check and cite sources. We will build repositories with heavily sourced facts for it to build upon. It will be pulling together “truths” that can be traced, then incrementally adding inference across those, which can then be verified and are a new fact.
I read a lot. I love, love, love new and original authorship. I deeply value writing as a craft. There will be a lot of garbage. More than there is now, at an incredible rate.
And we’ll figure it out.
[-]
- tyre 26 minutes ago
  My worry is less about scholarship than the next generation of readers and authors. It is too easy to be lazy right now. Too easy to skip the difficult work of struggling with ideas. Yapping with Claude probably (?) doesn’t have the same rate of retention and reinforced learning _in humans_ as digging through source material and writing by hand.
  Growing critical thought, in my experience, has always been the much harder problem. Not sure we’re in for a good time on that front.
- mikgp 21 minutes ago
  The James Patterson point is spot on and, to expand on your point, the internet arguably took the tastemakers / gatekeepers down a peg; AI could be what brings them back.
bonoboTP 37 minutes ago
In good hands, it can be a great tool, but you usually don't notice that. The issue is that AI allows for a superficial appearance of quality and it takes time to discover that the content is void of deeper insight.
zeroonetwothree 59 minutes ago
I don’t find this at all. Not that many fiction books use a lot of AI prose it seems. Maybe nonfiction is worse?
[-]
- seliopou 58 minutes ago
  What facts are you hanging that hat on?
  [-]
  - vlyan 41 minutes ago
    [dead]
- bbg2401 49 minutes ago
  I'm certainly observing AI smells from a high proportion of the books I read from O'Reilly and Packt since 2023. Authors don't attempt to hide it and some publish the work as if we didn't have a back catalog to distinguish the genuine article from a lazy prompt-driven manuscript.
  I'm not seeing the same from the translated fiction works I've picked up in the same time period, thankfully.
torben-friis 51 minutes ago
I haven't seen even a minimal sign that any of the fiction books I read lately was LLM-assisted. Writers seem like a particularly anti-AI crowd too.
Has anyone? Now I'm curious if it's just my particular bubble.
[-]
- bonoboTP 44 minutes ago
  What makes you think you'd recognize it? Do you work a lot with LLMs for fiction writing?
  [-]
  - torben-friis 28 minutes ago
    More than you can imagine. OKRs, brag documents, quarter reviews....
    Kidding aside, I would be surprised if something larger than using it as a thesaurus/corrector is slipping by. Literature is genuinely hard.
casey2 34 minutes ago
C-c C-v existed well before 2022. Most of history until the Renaissance consisted of the bulk of scholars copying out of "the book", whatever the book happened to be (Euclid, The Bible, Aristotle’s Logica Vetus, Cicero's Orations and De Officiis, 四書五經, 史記, 文選).
The liberal concept that the everyman should have their own original thoughts that others should consider is historically a very new concept. And we start getting things that look a lot like C-c C-v quickly after the Renaissance.
See, humans have the tendency to romanticize the past, and if this is allowed to compound they elevate really quite dismal people to the realm of literal godhood in some cases. If you asked someone a thousand years ago what they thought life was like thousands of years in the past and what it would be like thousands of years in the future, most would have said the past was better in all regards, including health, strength, morals, even technology, while the future would be viewed as a continual circling of the drain. Put yourself in their shoes: you go look at a Roman Colosseum, you can't build that, nobody you know can build that. If you asked Vitruvius during the construction of the Aqueducts, he would tell you that he's maintaining the knowledge of his ancestors, who could have built such structures if they needed them or had the manpower, and that the technical problems are just a trifle. If you pushed him, he might invoke Providentia and say that if the gods stopped blessing us we'd fall even faster.
This kind of discovered-then-lost narrative fits the human psyche better than the unintuitive idea that truth is a constructed social conversation, one that can be semi-formal and rigorous (the scientific method) or lax (common sense) depending on the setting.
ares623 44 minutes ago
Same with open source projects (or other software projects in general).
Pre-2022, when someone posts a Show HN, even if it's not something you would normally be interested in, there's a baseline understanding that _someone_ cared enough to spend time and effort to build it. So in a hypothetical future scenario if you do find yourself looking for that particular tool, there was value in you seeing that Show HN so you can revisit it.
Now, I just ignore all Show HN posts.
viccis 45 minutes ago
See also: https://en.wikipedia.org/wiki/Low-background_steel
api 1 hour ago
I honestly find this a little deranged. I’ve read some AI generated prose before and it’s… boring. It tends to be the mathematical average of all stories, with plots that are heavy on cliche and tropes played straight. If I read a book like that it’s probably just going to be a bad book that I don’t finish. Humans write lots of boring bad books too.
Eventually artists will figure out how to use AI to make real art that is actually good, just like photographers did with photography, and that will be its own new thing. I don’t see much of that yet but with photography it took a while.
[-]
- bonoboTP 41 minutes ago
  Controversial, but I think photography is still nowhere close in artistic value to paintings. Yes, I've seen the award winning ones etc. Not impressed. It's fine I guess, but not more. Same with laptop music vs instrument music even before AI.
- adamddev1 38 minutes ago
  We can't say "just like photographers did with photography" or "just like programmers did with higher-level languages." These developments are not analogous to LLMs. The jump into probabilistic text-guessing machines is a fundamentally different thing.
- actionfromafar 1 hour ago
  Sure, but AI is about more than the arts. It's about high fructose corn syrup slop everywhere.
  [-]
  - api 1 hour ago
    That predates AI and has more to do with the incentives baked into media, especially social media that’s all about “time on app” and “time on site” and therefore infinite scroll brain rot.
    [-]
    - actionfromafar 53 minutes ago
      It does predate AI. AI makes it much faster though and can close the rot-loop.
      [-]
      - api 49 minutes ago
        Get off social media. It’s trash. It was trash before AI and it’s trash now.
        [-]
        - bonoboTP 40 minutes ago
          HN is social media. And yes, let's get off it too. It's true.
  - phendrenad2 58 minutes ago
    Maybe we'll return to curation and talent scouts. Like the pre-internet days of music: Everyone had a demo tape, but nobody wanted to listen to hundreds of tapes of slop.
    [-]
    - ghaff 37 minutes ago
      Well, with writing it's more editors and lunches. That's how I got a book contract.