How Google got its groove back and edged ahead of OpenAI

(wsj.com)

236 points | by jbredeche 30 days ago

43 comments

handelaar 30 days ago
https://archive.is/TyJ8q
vedmakk 30 days ago
I don't get the Gemini 3 hype... yes it's their first usable model, but its not even close to what Opus 4.5 and GPT 5.2 can do.
Maybe on Benchmarks... but I'm forced to use Gemini at work everyday, while I use Opus 4.5 / GPT 5.2 privately every day... and Gemini is just lacking so much wit, creativity and multi-step problem solving skills compared to Opus.
Not to mention that Gemini CLI is a pain to use - after getting used to the smoothness of Claude Code.
Am I alone with this?
[-]
- svara 30 days ago
  I cancelled my ChatGPT subscription because of Gemini 3, so obviously I'm having a different experience.
  That said, I use Opus4.5 for coding through Cursor.
  Gemini is for planning / rubber ducking / analysis / search.
  I seriously find it a LOT better for these things.
  ChatGPT has this issue where when it's doesn't know the explanation for something, it often won't hallucinate outright, but create some long-winded confusing word salad that sounds like it could be right but you can't quite tell.
  Gemini mostly doesn't do that and just gives solid scientifically/ technically grounded explanations with sources much of the time.
  That said it's a bit of a double edged sword, since it also tends to make confident statements extrapolating from the sources in ways that aren't entirely supported but tend to be plausible.
  [-]
  - hhh 30 days ago
    > ChatGPT has this issue where when it's doesn't know the explanation for something, it often won't hallucinate outright, but create some long-winded confusing word salad that sounds like it could be right but you can't quite tell.
    This is just hallucinating.
  - MASNeo 30 days ago
    +1 canceled all OpenAI and switched to Gemini hours after it dropped. I was tired of vape AI, obfuscated facts in hallucinations and promises of future improvements.
    And then there is pricing too…
  - mixermachine 30 days ago
    Fully agree. ChatGPT is often very confident and tells me that X and Y is absolutely wrong in the code. It then answers with something worse... It also does rarely say "sorry, I was wrong" when the previous output was just plain lies. You really need to verify every answer because it is so confident.
    I fully switched to Gemini 3 Pro. Looking into an Opus 4.5 subscription too.
    My GF on the other side prefers ChatGPT for writing tasks quite a lot (school teacher classes 1-4).
  - walthamstow 30 days ago
    I also cancelled ChatGPT Plus recently in favour of Gemini. The only thing I don't about the Gemini consumer product is its insistence on giving YouTube links and thumbnails as sources. I've tried to use a rule to prevent it without luck.
    [-]
    - esperent 30 days ago
      The only thing I've found that works is saying: End every message by saying "I have not included and YouTube links, as instructed".
      But then of course you get that at the end of every message instead.
      You could also use a uBlock rule I guess.
  - gambiting 30 days ago
    Hah, it's funny because I actually cancelled my Gemini subscription to switch full time to ChatGPT about 6 months ago, and now I've done the reverse - Gemini just feels better at the tasks that I'm doing day to day. I think we're just going to see that kind of back and forth for a while while these systems evolve.
  - osigurdson 30 days ago
    I think it is proving yo be the case that there isn't much stickiness in your chat provider. OpenAI thought memory might bring that but honestly it can be annoying when random things from earlier chats pollute the current one.
  - FpUser 30 days ago
    I am subscribed to both at the moment but for my coding task I find Gemini 3 inferior to ChatGPT 5.2.
    [-]
    - philjohn 29 days ago
      Not just for coding. I've been working on design docs, and find ChatGPT 5.2 finds more edge cases and suggests better ideas than Gemini 3. I sometimes feed the output of one into the other and go "ok, another AI says this, what do you think?" which gives interesting results.
      Gemini often just throws in the towel and goes "yeah, the other one is right", whereas 5.2 will often go "I agree with about 80% of that, but the other 20% I don't, and here's why ..."
      And I'm always impressed with the explanation for "here's why" as it picks apart flat out bad output from Gemini.
      But, as with everything, this will very much be use-case dependent.
      [-]
      - godtoldmetodoit 29 days ago
        I've done the same, with both design/planning docs as well as code changes, and have the same experience as you. It's better than Opus 4.5 as well. GPT 5.2 is on another level for my use cases (primarily Python / Django).
  - jwpapi 30 days ago
    I have exactly the same experience
    [-]
    - bethekidyouwant 30 days ago
      Don’t you guys have jobs? Why would you cancel your subscription? Gemini was better for the last three months and now ChatGPT has pulled ahead agian, its $20, you can just switch between the models as needed… also, what do you do when the Gemini API is slow or down, just stop working?
      [-]
      - tomashubelbauer 30 days ago
        It takes less than a minute to resubscribe to any of these devices. No need to burn 60 USD if I switch for a three month spell and then switch back. When a provider goes down, I do what I set out to do without their service. If the outage lasts too long, I'll cancel as not to support sloppy service.
- Mistletoe 30 days ago
  I love Gemini. Why would I want my AI agent to be witty? That's the exact opposite of what I am looking for. I just want the correct answer with as little fluff and nonsense as possible.
  [-]
  - SequoiaHope 30 days ago
    The worst is ChatGPT voice mode. It tries so hard to be casual that it just makes it tedious to talk to.
    [-]
    - seanhunter 30 days ago
      My favourite part of chatgpt voice is I have something in my settings that says something along the lines of "be succinct. Get straight to the point," or whatever.
      So every single time I (forget and) voice prompt chatGPT it starts by saying "OK, I'll get straight to the point and answer your question without fluff" or something similar. ie it wastes my time even more than it would normally.
    - vedmakk 30 days ago
      I agree on the voice mode... its really unusable now.
      I feel like its been trained only tiktok content and youtube cooking or makeup podcasts in the sense that it tries to be super casual and easy-going to the point where its completely unable to give you actual information.
      [-]
      - XenophileJKO 30 days ago
        It is REALLY important to understand "voice mode" is a 4o family of model and it doesn't have "thinking". It is WAY BEHIND on smarts.
    - ArtemGetman 30 days ago
      built something to fix exactly this. skips the realtime chattiness entirely - you speak, it waits until you're done, responds via TTS with actual text-quality answers (no dumbing down). also has claude/gemini if you want different models.
      still early but happy to share: tla[at]lexander[dot]com if interested (saw your email in bio)
      [-]
      - SequoiaHope 29 days ago
        You’re saying you made yourself an email that is similar to mine? That seems… odd.
    - ArtemGetman 30 days ago
      built something to fix this. skips the realtime entirely - you speak, it waits, responds with text-quality answers via TTS. no forced casualness, no dumbing down. also has claude/gemini.
      happy to share if anyone wants to try it
    - davidmurdoch 30 days ago
      The "... just let me know if there is anything else you'd like to know." after every long-winded explanation is so infuriating.
- mythz 30 days ago
  Full time Antigravity user here, IMO best value coding assistant by far, not even including all the other AI Pro sub perks.
  Still using Claude Pro / GitHub Copilot subs for general terminal/VS Code access to Claude. I consider them all top-tier models, but I prefer the full IDE UX of Antigravity over the VS Code CC sidebar or CC terminal.
  Opus 4.5 is obviously great at all things code, tho a lot of times I prefer Gemini 3 Pro (High) UI's. In the last month I've primarily used it on a Python / Vue project which it excels at, I thought I would've need to switch to Opus at some point if I wasn't happy with a particular implementation, but I haven't yet. Few times it didn't generate the right result was due to prompt misunderstanding which I was able to fix by reprompting.
  I'm still using Claude/GPT 5.2 for docs as IMO they have a more sophisticated command over the English language. But for pure coding assistance, I'm a happy Antigravity user.
  [-]
  - vjay15 30 days ago
    Antigravity is really amazing yea, by far the best coding assistant IDE, its even superior than Cursor ngl when it comes to very complex tasks, its more methodical in its approach.
    That said I still use Cursor for work and Antigravity sometimes for building toy projects, they are both good.
    [-]
    - thegagne 30 days ago
      Speaking of methodical, have you tried AWS Kiro?
      It has spec driven development, which in my testing yesterday resulted in a boat load of passing tests but zero useful code.
      It first gathers requirements, which are all worded in strange language that somehow don’t capture specific outcomes OR important implementation details.
      Then it builds a design file where it comes up with an overly complex architecture, based on the requirements.
      Then it comes up with a lengthy set of tasks to accomplish it. It does let you opt out if optional testing, but don’t worry, it still will write a ton of tests.
      You click go on each set of tasks, and wait for it to request permissions for odd things like “chmod +x index.ts”.
      8 hours and 200+ credits later, you have a monstrosity of Enterprise Grade Fizzbuzz.
      [-]
      - sbrother 30 days ago
        Funny enough this sounds like my experience with ex-Amazon SWEs
      - visarga 30 days ago
        Do you think the SDD approach is fundamentally wrong, or that Amazon's implementation was at fault?
        [-]
        afro88 29 days ago
        It sounds like the initial spec is wrong, which compounds over time.
        With SDD the spec should be really well thought about and considered, direct and clear.
        [-]
        thegagne 26 days ago
        I think that it has two flaws:
        - It is too machine like in its definition and requirements and misses the spirit of the ask.
        - It very much waterfalls it, without asking for feedback midway, or revisiting the original goals after things have been built. You have such an opportunity to adjust and learn as you go, especially if you keep revisiting your goals and values and re-evaluating your original requirements which may have been flawed.
        Just like with human development, it's rare that your spec is well thought out at the beginning, and impossible that it was comprehensive enough to define a working system.
        I think having goals, vision, and hard requirements make sense, with some guiding principles along the way, but it's very much a journey that requires constant feedback loops and adjustments along the way.
      - murdy 30 days ago
        Honestly if you use Traycer for plan + review (I just have it open in different IDE that they support), you can use any editor that has good models and does not throttle the context window.
        I am trying to test bunch of these IDEs this month, but I just cant suffer their planning and have to outsource it.
  - amunozo 28 days ago
    So far I only used Antigravity for side projects and I am having so much fun. That said, I get much better results with Opus than with the Gemini models for moderately complex tasks.
  - baq 30 days ago
    Looks like codex + antigravity (which gives opus, too) for $40/mo is the sweet busy hobbyist spot… today, anyway. It could change this afternoon.
- cvhc 30 days ago
  For general researching/chatbot, I don't feel one of them is much better than the other. But since I'm already on Google One plan, upgrading the plan costs less than paying $20/mo to OpenAI, so I ended up cancelling ChatGPT Plus. Plus my Google One is shared with my family so they can also use advanced Gemini models.
  [-]
  - rafaelmn 30 days ago
    Yes, same thing, also I find Gemini to be better at search and non-coding tasks - which was my only use case for GPT - coding was always Claude.
- falloutx 30 days ago
  Dont use it on gemini.google.com, but instead try it on aistudio.google.com.
  Model may be the same but the agent on aistudio makes it much better when it comes to generating code.
  Still jules.google.com is far behind in terms of actual coding agents which you can run in command line.
  Google as always has over engineered their stuff to make it confusing for end users.
  [-]
  - lodovic 30 days ago
    I tried to sign up for Gemini this weekend but gave up after an hour. I got stuck comparing their offerings, looking for product pages, proper signup, etc. Their product offering and naming is just a mess. Cloud console. AI studio, I was completely lost at some point.
    [-]
    - lastdong 30 days ago
      $20 Google AI Pro and Google Antigravity IDE, which gives you access to Claude Code, is a pretty decent offering for agent coding. On top of that, NotebookLM and Google Labs has some fun tools to play with.
    - mittensc 30 days ago
      I just went to gemini.google.com and to my surprise i already have access to it and havent hit limits thus far so they're generous.
      I was paying for storage and its included.
      You likely have access too depending in your account.
  - IncreasePosts 30 days ago
    I don't understand...let's say I have it build some code for me, am I supposed to copy all those files out to my file system and then test it out? And then if I make changes to the source, I need to copy the source back in to a studio/(or canvas in Gemini)?
    [-]
    - vedmakk 30 days ago
      If you want to go beyond a single one-off script, tou want to use it directly in your repo using the CLI tools or one of the IDE integrations (Copilot, Cursor, Zed, ...)
    - Tagbert 30 days ago
      I've been using the Claudi.ai website for a project and that is pretty much what I do, though I have uploaded all of the source files to Claude so I only need to upload anything that I changed. I don't need to reupload the whole code base each time, of course. Claude provides a zip file of any changed files that I download and copy to my code file system.
    - gallexme 30 days ago
      When using my usual IDE(clion) i just use their integration https://codeassist.google/ It works fine / about as good as aistudio
    - falloutx 30 days ago
      I am pretty sure aistudio is for pure vibe coding, so editing and changing code by hand is harder. the case you are mentioning, you should use gemini cli or jules cli. They are far behind Claude Code but it gets the job done.
    - dkdcio 30 days ago
      there is the Gemini CLI, but I am aware of people doing exactly what you’re describing (which I find ridiculous but if it works it works I guess). some people have CLI tools for turning their entire repo into one big Markdown file or similar to copy over
      [-]
      - patates 30 days ago
        That's me! I used to do it with repomix and turned the whole codebase into a giant xml file. Worked really great, and I have a script that just takes the aistudio output and writes all the generated files.
        But, after using Claude Code with Opus 4.5, it's IMHO not worth it anymore. I mean it IS competitive, but the experience of Claude Code is so nice and it slightly edges out Gemini in coding even. If gemini cli were as nice as claude code, I'd have never subscribed to the claude max plan though.
  - Workaccount2 30 days ago
    I'm almost positive that using gemini on ai studio is the cause for a lot of strife.
    Most users on it are using it free, and they almost certainly give free users bottom priority/worst compute allocation.
    [-]
- intalentive 30 days ago
  No, not alone, I find GPT far preferable when it comes to fleshing out ideas. It is much deeper conceptually, it understands intent and can cross pollinate disparate ideas well. Gemini is a little more autistic and gets bogged down in details. The API is useful for high volume extraction jobs, though — Gemini API reliability has improved a lot and has lower failure rate than OpenAI IME.
- SilverSlash 30 days ago
  While that may be your personal experience, but for me Gemini always answers my questions better than Claude Opus 4.5 and often better than GPT 5.2. I'm not talking about coding agents, but rather the web based AI systems.
  This has happened enough times now (I run every query on all 3) that I'm fairly confident that Gemini suits me better now. Whereas it used to be consistently dead last and just plain bad not so long ago. Hence the hype.
  [-]
  - lostdog 29 days ago
    Weird. I find Opus knows the answer more often, plus its explanations are much clearer. Opus puts the main point at the top, while Gemini wanders around for a while before telling you what you need.
- jchw 30 days ago
  I dunno about Gemini CLI, but I have tried Google Antigravity with Gemini 3 Pro and found it extremely superior at debugging versus the other frontier models. If I threw it at a really, really hard problem, I always expected it to eventually give up, get stuck in loops, delete a bunch of code, fake the results, etc. like every other model and every other version of Gemini always did. Except it did not. It actually would eventually break out of loops and make genuine progress. (And I let it run for long periods of time. Like, hours, on some tricky debugging problems. It used gdb in batch mode to debug crashes, and did some really neat things to try to debug hangs.)
  As for wit, well, not sure how to measure it. I've mainly been messing around with Gemini 3 Pro to see how it can work on Rust codebases, so far. I messed around with some quick'n'dirty web codebases, and I do still think Anthropic has the edge on that. I have no idea where GPT 5.2 excels.
  If you could really compare Opus 4.5 and GPT 5.2 directly on your professional work, are you really sure it would work much better than Gemini 3 Pro? i.e. is your professional work comparable to your private usage? I ask this because I've really found LLMs to be extremely variable and spotty, in ways that I think we struggle to really quantify.
  [-]
  - Xmd5a 30 days ago
    Is Gemini 3 Pro better in Antigravity than in gemini-cli ?
    [-]
    - murdy 30 days ago
      For coding it is horrible. I used it exclusively for a day and switching back to Opus felt like heaven. Ok, it is not horrible, it is just significantly worse than competitors.
      [-]
      - HarHarVeryFunny 29 days ago
        Although it sounds counter-intuitive, you may be better off with Gemini 3 Fast (esp. in Thinking mode) rather than Gemini 3 Pro. Fast beats Pro in some benchmarks. This is also the summary conclusion that Gemini itself offers.
    - jchw 30 days ago
      Unfortunately, I don't know. I have never used Gemini CLI.
- avazhi 30 days ago
  I mean, I'm the exact opposite. Ask ChatGPT to write a simple (but novel) script for AutoHotKey, for example, and it can't do it. Gemini can do it perfectly on the first try.
  ChatGPT has been atrocious for me over the past year, as in its actual performance has deteriorated. Gemini has improved with time. As for the comment about lacking wit, I mean, sure I guess, but I use AI to either help me write code to save me time or to give me information - I expect wit out of actual humans. That shit just annoys me with AI, and neither ChatGPT nor Gemini bots are good at not being obnoxious with metaphors and floral speech.
  [-]
  - discordance 30 days ago
    Sounds like you are using ChatGPT to spit out a script in the chat? - if so, you should give 5.2 codex or Claude Code with Opus 4.5 a try... it's night and day.
    [-]
    - ignoramous 30 days ago
      > 5.2 codex or Claude Code with Opus 4.5 a try
      Is using these same models but with GitHub Copilot or Replit equally capable as / comparable to using the respective first-party CLIs?
      [-]
      - ggrantrowberry 30 days ago
        I don’t think so. My favorite tool is Codex with the 5.2-codex model. I use Github Copilot and Codex at work and Codex and Cursor at home. Codex is better for harder and bigger tasks. I’ll use Copilot or Cursor for small easy things. I think Codex is better than Claude Code as well.
        [-]
        davidmurdoch 30 days ago
        Are you using the same models and thinking levels for each?
        I too have found Codex better than Copilot, even for simple tasks. But I don't have the same models available since my work limits the models in copilot to the stupid ones.
        balops 30 days ago
        [dead]
      - discordance 30 days ago
        I have GH Copilot from work and a personal Claude Code max subscription and have noticed a difference in quality if I feed the same input prompts/requirements/spec/rules.md to Claude Code cli and GH Copilot, both using Opus 4.5, where Claude Code CLI gives better results.
        Maybe there's more going on at inference time with Claude Code cli?
        [-]
        pluralmonad 30 days ago
        It is likely because GH Copilot aggressively (over-)manages context and token spend. Probably to hit their desired margins on their plans. But it actively cripples the tool for more complex work IMO. I've had many times where context was obviously being aggressively compacted and also where it will straight truncate data it reads once it reaches some limit.
        I do think it is not as bad as it was 4-6 months ago. Still not as good as CC for agentic workflows.
    - Eufrat 30 days ago
      I find this really frustrating and confusing about all of the coding models. These models are all ostensibly similar in their underpinnings and their basic methods of operation, right?
      So, why does it feel all so fragile and like a gacha game?
      [-]
      - FergusArgyll 30 days ago
        OpenAI actually have different models in the cli (e.g. gpt-5.2-codex)
      - davidmurdoch 30 days ago
        Naming things is hard. So hard every AI company isn't even trying to come up with good names.
    - usefulposter 30 days ago
      You're holding it wrong.
      [-]
      - seanhunter 30 days ago
        In this case they probably are prompting it "wrong" or at least less well than codex/copilot/claude code/etc. That's not a criticism of the user, it's an indication of the fact that people have put a lot of work into the special case of using these particular tools and making sure they are prompted well with context etc whereas when you just type something into chat you would need to replicate that effort yourself in your own prompt.
- barrkel 30 days ago
  When I had a problem with video handoff between one Linux kernel and the next with a zfsbootmenu system, only Gemini was helpful. ChatGPT led me on a merry chase of random kernel flags that didn't have the right effect.
  What worked was rebuilding the Ubuntu kernel with a disabled flag enabled, but it took too long to get that far.
- websiteapi 30 days ago
  I find them all comparable, but Gemini is cheaper
  [-]
  - danans 30 days ago
    IMO in the long term this is the pattern that will emerge. Switching costs are almost non-existent.
- afro88 29 days ago
  This may sound backwards, but gemini 3 flash is quite good when given very specific tasks. It's very fast (much faster than Opus and GPT-5.2), follows instructions very well and spits out working code (in contrast to other flash, haiku etc fast models).
  It does need a solid test suite to keep it in check. But you can move very fast if you have well defined small tasks to give it. I have a PRD then breakdown epics, stories and finally the tasks with Pro first. Works very well.
- tempestn 30 days ago
  I've been using both GPT 5.2 and Gemini 3 Pro a lot. I was very impressed with 3 Pro when it came out, and thought I'd cancel my OAI Plus, but I've since found that for important tasks it's been beneficial to compare the results from both, or even bounce between them. They're different enough that it's like collaborating with a team.
  [-]
  - monkeydust 30 days ago
    I have been thinking about this a bit - so rather than rely on one have an agentic setup that could take question run against the top 3 and then another one to judge the response to give back.
    Is anyone doing this for high stake questions / research?
    The argument against is that the models are fairly 'similar' as outlined in one of the awarded papers from Neurips '25 - https://neurips.cc/virtual/2025/loc/san-diego/poster/121421
    [-]
    - Workaccount2 30 days ago
      I often put the models in direct conversation with each other to work out a framework or solution. It works pretty well, but they do tend to glaze each other a bit.
- dahcryn 30 days ago
  Claude Code > Gemini CLI, fair enough
  But I actually find Gemini Pro (not the free one) extremely capable, especially since you can throw any conversation into notebooklm and deep thinking mode to go in depth
  Opus is great, especially for coding and writing, but for actual productivity outside of that (e.g. working with PDF, images, screenshots, design stuff like marketing, tshirts, ...,...) I prefer Gemini. It's also the fastest.
  Nowhere do I feel like GPT 5.2 is as capable as these two, although admittedly I just stopped using it frequently around november.
  [-]
  - baq 30 days ago
    5.2 wasn’t out in November and it is better than 5.1, especially codex.
    [-]
    - Palmik 30 days ago
      I have the feeling that these discussions are much more tribal rather than evidence based. :)
      [-]
      - baq 30 days ago
        Tribal? Not really. Subjective? Absolutely. Objectively 5.2 scores higher on benchmarks than 5.1; subjectively it works better for me than 5.1. I don't care too much about other opinions TBH :)
- tezza 30 days ago
  You’re not alone. I do a small blog reviewing LLMs and have detailed comparisons that go beyond personal anecdotes. Gemini struggles in many usecases.
  Everyone has to find what works for them and the switching cost and evaluation cost are very low.
  I see a lot of comments generally with the same pattern “i cancelled my LEADER subscription and switched to COMPETITOR”… reminiscent of astroturf. However I scanned all the posters in this particular thread and the cancellers do seem like legit HN profiles.
- james2doyle 29 days ago
  Maybe try out some of the alternative CLI options? Like https://opencode.ai? I also like https://github.com/charmbracelet/crush and https://github.com/mistralai/mistral-vibe
- dave771 30 days ago
  Yeah, you are. You're limiting your view to personal use and just the text modality. If you're a builder or running a startup, the price-performance on Gemini 3 Pro and Flash is unmatched, especially when you factor in the quotas needed for scaled use cases. It’s also the only stack that handles text, live voice, and gen-media together. The Workspace/Gmail integration really doesn't represent the raw model's actual power.
  [-]
  - sublimefire 30 days ago
    Depending on Google’s explicit product to build a startup is crazy. There is a risk of them changing APIs or offerings or features without the ability to actually complain, they are not a great B2B company.
    I hope you just use the API and can switch easily to any other provider.
- LightBug1 30 days ago
  I've started using Gem 3 while things are still in flux in the AI world. Pleasantly surprised by how good it is.
  Most of my projects are on GPT at the moment, but we're nowhere too far gone that I can't move to others.
  And considering just the general nonsense of Altman vs Musk, I might go to Gemini as a safe harbour (yes, I know how ridiculous that sounds).
  So far, I've also noticed less ass-kissing by the Gemini robot ... a good thing.
- azuanrb 30 days ago
  Opus > GPT 5.2 | Gemini 3 Pro to me. But they are pretty close to lately. The gap is smaller now. I'm using it via CLI. For Gemini, their CLI is pretty bad imo. I'm using it via Opencode and pretty happy with it so far. Unfortunately Gemini often throw me rate limit error, and occasionally hang. Their infra is not really reliable, ironically. But other than that, it's been great so far.
- Workaccount2 30 days ago
  People get used to a model and then work best with that model.
  If you hand an iPhone user an Android phone, they will complain that Android is awful and useless. The same is true vice versa.
  This is in large part why we get so many conflicting reports of model behavior. As you become more and more familiar with a model, especially if it is in fact a good model, other good models will feel janky and broken.
- Galaxeblaffer 30 days ago
  Gemini really only shines when using it in planning life in th vscode fork antigravity. It also supports opus so it's easy to compare.
- pwagland 30 days ago
  In my experience, Gemini is great for "one-shot" work, and is my goto for "web" AI usage. Claude Code beats gemini-cli though. Gemini-cli isn't bad, but it's also not good.
  I would love to try antigravity out some more, but last I don't think it is out of playground stage yet, and can't be used for anything remotely serious AFAIK.
- verelo 30 days ago
  Claude opus is absurdly amazing. I now spent around $100-200 a day using it. Gemini and all the OpenAI models can’t me up right now.
  Having said that, Google are killing it at the image editing right now. Makes me wonder if that’s because of some library of content and once Anthropocene acquires the same they’ll blow us away there too.
  [-]
  - PrayagS 30 days ago
    API only user or Max x20 along with extra usage? If it's the latter, how are the limits treating you?
    [-]
    - verelo 27 days ago
      I went for cursor on the $200 plan but i hit those limits in a few days. Claude code came out after i got used to cursor but I've been intending to switch it up on the hope the cost is better.
      I go api directly after i hit those limits. That’s where it gets expensive.
  - rafram 30 days ago
    > I now spent around $100-200 a day using it.
    How's the RoI on that?
    [-]
    - yurishimo 30 days ago
      Probably awful unless they already make 300k+ TOC.
      [-]
      - verelo 27 days ago
        I’m a solo founder, past life I’ve done the whole raise 8 figures, hire a hundred plus people…this is a way better life. Currently around 430k arr and growing.
  - djeastm 29 days ago
    > I now spent around $100-200 a day using it.
    Really? Are you using many multiple agents a time? I'm on Microsoft's $40/mo plan and even using Opus 4.5 all day (one agent at a time), I'm not reaching the limit.
    [-]
    - verelo 27 days ago
      Yeah maybe I’m crazy, i mean i don’t know what to say. I do feel like the productivity i get now is akin to what i would have expected from a small team of 4-5 people 5 years ago..it’s cheaper than hiring coworkers but certainly not cheap haha
- HarHarVeryFunny 30 days ago
  > Not to mention that Gemini CLI is a pain to use - after getting used to the smoothness of Claude Code.
  Are you talking strictly about the respective command line tools as opposed to differences in the models they talk to?
  If so, could you list the major pain points of Gemini CLI were Claude Code does better ?
- mindcrime 30 days ago
  I haven't straight up cancelled my ChatGPT subscription, but I find that I use Gemini about 95% of the time these days. I never bother with any of Anthropic's stuff, but as far as OpenAI models vs Gemini, they strike me as more or less equivalent.
- littlestymaar 30 days ago
  > Not to mention that Gemini CLI is a pain to use - after getting used to the smoothness of Claude Code.
  Claude Code isn't actually tied to Claude, I've seen people use Claude Code with gpt-oss-120b or Qwen3-30b, why couldn't you use Gemini with Claude Code?
- OsrsNeedsf2P 30 days ago
  You're not alone, I feel like sometimes I'm on crazy pills. I have benchmarks at work where the top models are plugged into agents, and Gemini 3 is behind Sonnet 4. This aligns closely with my personal usage as well, where Gemini fails to effectively call MCP tools.
  But hey, it's cheapish, and competition is competition
- qaq 30 days ago
  Nope be it in coding context but Claude and Codex are a combo that really shine and Gemini is pretty useless. The only thing I actually use it for is to triple check the specifications sometimes and thats pretty much it.
- RandallBrown 30 days ago
  I've only used AI pretty sparingly, and I just use it from their websites, but last time I tried all 3 only the code Google generated actually compiled.
  No idea which version of their models I was using.
  [-]
  - Anamon 27 days ago
    Same experience for me. Gemini generates by far the most usable code. Not "good code", obviously, but a decent enough foundation to build on. GPT in particular just spits out code for obsolete libraries, uses deprecated features, hallucinated methods etc. etc. It was a case for the trash bin every single time.
    On the other hand, Gemini failed BADLY when I tried to give it a "summarize the data in this CSV" task. Every response was completely wrong. When pressed about which rows the answers were based on, 100% of the rows were made-up, not present in the source file (interestingly, after about 10 rounds of pointing this out, Gemini suddenly started using the actual data from the uploaded file). GPT's answers, on the other hand, matched manual verification with Excel.
- outside1234 30 days ago
  Have you used it as a consumer would? Aka in google search results or as a replacement for ChatGPT? Because in my hands it is better than ChatGPT.
- zapnuk 30 days ago
  gemini 2.0 flash is and was a godsend for many small tasks and ocr.
  There needs to be a greater distinction between models used for human chat, programming agents, and software-integration - where at least we benefitted from gemini flash models.
- PunchTornado 30 days ago
  I am the opposite. Find GPT 5.2 much worse. Sticking only with gemini and claude.
- murdy 30 days ago
  I also get weirdly agitated by this. In my mind Geminy 3 is case of clear benchmaxing and over all massive flop.
  I am currently testing different IDEs including Antigravity, and I avoid that model at all cost. I will rather pay to use different model, than use Geminy 3.
  It sucks at coding compared to OpenAI and Anthropic models and it is not clearly better as chat-bot (I like the context window). The images are best part of it as it is very steerable and fast.
  But WTF? This was supposed to be the OpenAI killer model? Please.
- netdur 30 days ago
  Ai studio with my custom prompting is much better than Gemini app and opus
- joshvm 29 days ago
  I've found that for any sort of reasonable task, the free models are garbage and the low-tier paid models aren't much better. I'm not talking about coding, just general "help me" usage. It makes me very wary of using these models for anything that I don't fully understand, because I continually get easily falsifiable hallucinations.
  Today, I asked Gemini 3 to find me a power supply with some spec; AC/DC +/- 15V/3A. It did a good job of spec extraction from the PDF datasheets I provided, including looking up how the device performance would degrade using a linear vs switch-mode PSU. But then it comes back with two models from Traco that don't exist, including broken URLs to Mouser. It did suggest running two Meanwell power supplies in series (valid), but 2/3 suggestions were BS. This sort of failure is particularly frustrating because it should be easy and the outputs are also very easy to test against.
  Perhaps this is where you need a second agent to verify and report back, so a human doesn't waste the time?
- ulfw 29 days ago
  It's prove of what investors have been fearing. That LLMs are a dime a dozen, that there is no real moat and that the products are hence becoming commoditised. If you can replace one model with another without noticing a huge difference, there can only be a pricing race to the bottom for market share with hence much lower potential profits than the AI bubble has priced in.
- pjjpo 30 days ago
  I'm with you - the most disappointing was when asking Gemini, technically nano banana, for a PNG with transparent background it just approximated what a transparent PNG would look like in a image viewer, as an opaque background. ChatGPT has no problem. I also appreciate when it can use content like Disney characters. And as far as actual LLMs go, the text is just formatted more readably in GPT to me, with fairly useful application of emojis. I also had an experience asking for tax reporting type of advice, same prompt to both. GPT was the correct response, Gemini suggested cutting corners in a grey way and eventually agreed that GPT's response is safer and better to go with.
  It just feels like OpenAI puts a lot of effort into creating an actually useful product while Gemini just targets benchmarks. Targeting benchmarks to me is meaningless since every model, gpt, Gemini, Claude, constantly hallucinate in real workloads anyways.
- retinaros 30 days ago
  No you are not. I tried all gemini models. They are slop.
- broochcoach 29 days ago
  The Gemini voice app on iOS is unimpressive. They force the answers to be so terse to save cost that it’s almost useless. It quickly goes in circles and needs context pruning. I haven’t tried a paid subscription for Gemini CLI or whatever their new shiny is but codex and Claude code have become so good in the last few months that I’m more focused on using them than exploring options.
sxp 30 days ago
> Naina Raisinghani, 00 needed a name for the new tool to complete the upload. It was 2:30 a.m., though, and nobody was around. So she just made one up, a mashup of two nicknames friends had given her: Nano Banana.
Ah, that explains the silly name for such an impressive tool. I guess it's more a more Googley name than what would have otherwise been chosen: Google Gemini Image Pro Red for Workspace.
[-]
- perardi 30 days ago
  Strongly disagree.
  Google, OpenAI, and Microsoft all have a very confusing product naming strategy where it’s all lumped under Gemini/ChatGPT/Copilot, and the individual product names are not memorable and really quite obscure. (What does Codex do again?)
  Nano Banana doesn’t tell you what the product does, but you sure remember the name. It really rolls off the tongue, and it looks really catchy on social media.
- jofzar 30 days ago
  I honestly love the name nano banana, it's stupid as hell, but it's a bit of joy to say specially with how corporate everything is name wised these days.
  [-]
  - Hammershaft 30 days ago
    I agree, it's a great silly name that immediately jumped out at me because it felt so distant from the focus tested names out of marketing that have become the standard today.
  - underdeserver 30 days ago
    Whimsical.
- cmrdporcupine 30 days ago
  I like that it's evidence that there's still some remnants of the old Google culture there.
hmokiguess 30 days ago
While this article is just about optics, I would say the comments here about how the coding agents fare fail to realize we’re just a niche when compared to the actual consumer product that is the Chatbots for the average user.
My mom is not gonna use Claude Code, it doesn’t matter to her. We, on Hacker News, don’t represent the general population.
[-]
- lukev 30 days ago
  Claude Code purportedly has over a billion dollars in revenue.
  In terms of economic value, coding agents are definitely one of the top-line uses of LLMs.
  [-]
  - hmokiguess 30 days ago
    Sure, I don’t disagree, but a fact remains that 1B is less than 10% of OpenAI’s revenue with ChatGPT and its 700M+ user base.
    Coding agents are important, they matter, my comment is that this article isn’t about that, it’s about the other side of the market.
    [-]
    - raw_anon_1111 29 days ago
      And OprnAI will never be worth its current valuation or be able to keep its spending commitments based on $20/month subscriptions
  - raw_anon_1111 29 days ago
    Anyone can sell dollar bills for 90 cents. When they can actually make a profit, then it will be impressive.
    [-]
    - rhubarbtree 29 days ago
      But that’s not what they’re doing, sir.
      [-]
      - raw_anon_1111 29 days ago
        Are they profitable?
        [-]
        rhubarbtree 26 days ago
        No, but that is not the same thing as selling dollar bills for 90 cents.
  - chneu 29 days ago
    Reminder that the entire AI industry is loaning itself money to boost revenue.
    I seriously question any revenue figures that tech companies are reporting right now. Nobody should be believing anything they say at this time. Fraud is rampant and regulation is non-existent.
    [-]
    - regularfry 29 days ago
      On a purely theoretical-finance level, I don't think the circular funding is actually a problem in itself. It's analogous to fractional reserve banking.
      Whether there's also fraud, misreporting of revenue, or other misbehaviour of weird and wonderful classifications that will keep economics history professors in papers for decades is a separate question. I just find that people get fixated on this one structural feature and I think it's a distraction. It might be smoke, but it's not the fire.
      [-]
      - klaff 29 days ago
        Doesn't fractional reserve banking depend upon independence of the various customers? The widely-reported circular financing between AI players does not enjoy that.
  - Supermancho 30 days ago
    Claude has been measurably worse over other models, in my experience. This alone makes me doubt the number. That and Anthropic has not released official public financial statements, so I'll just assume it's the same kind of hand waving heavily leveraged companies tend to do.
    I actually for for ChatGPT and my company pays for Copilot (which is meh).
    Edit: Given other community opinions, I don't feel I'm saying anything controversial. I have noted HN readers tend to be overly bullish on it for some reason.
    [-]
    - perardi 30 days ago
      That doesn’t reflect my (I would say extensive) experience at this point, nor does it reflect the benchmarks. (I realize benchmarks have issues.)
      Are you using Claude as an agent in VSCode or via Claude Code, or are you asking questions in the web interface? I find Claude is the best model when it’s working with a strongly typed language with a verbose linter and compiler. It excels with Go and TypeScript in Cursor.
      [-]
      - Supermancho 29 days ago
        I have used it for GDScript, C++, Java, and other more general questions. Specifically, comparing it to other LLMs responses ESPECIALLY after incremental narrowing by prompt. Claud seems to randomly change approaches and even ignore context to the point you get the same circular issues you see in Copilot (do A because B is bad, then do B because A is bad or worse ignore everything before and do C because it's nominal). It seems more primitive in my sessions from the last time I used it (for a couple days) ~45 days ago.
- theturtletalks 29 days ago
  My mom uses the Google app instead of just going to Google.com on Safari. She’s probably going to use Gemini because she’s locked into that ecosystem. I suspect most people are going to stick with what they use because like you said, to consumers, they can’t really tell the difference between each model. They might get a 5% better answer, but is that worth the switching costs? Probably not.
  That’s why you see people here mention Claude Code or other CLI where Gemini has always fallen short. Because to us, we see more than a 5% difference between these models and switching between these models is easy if you’re using a CLI.
  It’s also why this article is generated hype. If Gemini was really giving the average consumer better answers, people would be switching from ChatGPT to Gemini but that’s not happening.
- gjsman-1000 30 days ago
  The other issue with this is that AI is still unprofitable and a money hole.
  If consumers refuse to pay for it, let alone more than $20 for it, coding agent costs could explode. Agent revenue isn’t nearly enough to keep the system running while simultaneously being very demanding.
  [-]
  - sdwr 30 days ago
    AI development is a money pit, AI use is profitable. Average ChatGPT subscribers are using way less than $20 of electricity and GPU time per month.
    [-]
    - swexbe 29 days ago
      When you take depreciation into account, it's probably less profitable than a government bond.
    - sothatsit 29 days ago
      And it will get even more profitable for free users when ads roll in.
- PunchyHamster 30 days ago
  Will your mom pay for chatgpt or just stop when they will try to start converting more users ?
  [-]
  - shinycode 29 days ago
    Also anecdote but most low-tech people I know are using chatGPT like google search and will never pay for it. Maybe that’s why chatGPT Ads will work beautifully with them
    [-]
    - raw_anon_1111 29 days ago
      Everyone says “advertise!” Like it’s a magic bullet. The tech industry is littered with companies that have high traffic and couldn’t figure out how to monetize via advertising. Yahoo being the canonical example.
      Besides the cost of serving an LLM request - using someone else’s infrastructure and someone else’s search engine is magnitudes higher than a Google search.
      Besides defaults matter. Google is the default search engine for every mobile phone outside of China and the three browsers with the most market share
      [-]
      - PunchyHamster 28 days ago
        > Everyone says “advertise!” Like it’s a magic bullet. The tech industry is littered with companies that have high traffic and couldn’t figure out how to monetize via advertising. Yahoo being the canonical example.
        That is obviously because they can't figure out any other value of the product that's sellable.
      - shinycode 29 days ago
        Ads per se aren’t necessarily a problem. It’s good to be able to discover companies et products. The problem is bad ads, ads that lie, ads for fake products or unfair ads. Also, too much ads is also a problem
        [-]
        raw_anon_1111 28 days ago
        Ads make the user experience worse. I don’t use any ad supported products - unfortunately outside of LinkedIn.
        [-]
        shinycode 28 days ago
        I 1000% agree with you. Companies are unreasonable and it kills the UX in a majority of cases. That makes us hate ads
  - blairbeckwith 30 days ago
    Anecdotes, etc. but my 65 year old dad is pretty low tech and he was paying OpenAI $20/month before I was.
    [-]
    - chneu 29 days ago
      Little off topic but I just got done cleaning up my friend's dad's estate. He had dementia the last ~5 years of his life.
      The amount of random fucking subscriptions this senile old dude was paying is mind boggling. We're talking nearly $10k/month in random shit. Monthly lingerie subscription? Yup, 62 year old dude. Dick pill subscription? Yup. Subscription to pay his subscriptions? Yup.
      It makes me really wonder how much of the US economy is just old senile people paying for shit they don't realize.
      We also found millions in random accounts all over the place. It's just mind boggling.
      [-]
      - iso1631 29 days ago
        He had dementia that bad at 57?
        Ouch, that's not nice. My grandmother has been in care since 2020 and no idea who anyone is (forgot kids, husbands, etc), but at least she was in her 90s when it started going bad.
      - kapone 29 days ago
        [dead]
  - kid64 29 days ago
    Oooh, sick burn
- jameslk 30 days ago
  Coding agents seem the most likely to become general purpose agents that everyone uses eventually for daily work. They have the most mature and comprehensive capability around tool use, especially on the filesystem, but also in opening browsers, searching the web, running programs (via command line), etc. Their current limitation for widespread usage is UX and security, but at least for the latter, that's being worked on
  I just helped a non-technical friend install one of these coding agents, because its the best way to use an AI model today that can do more than give him answers to questions
- jrjeksjd8d 30 days ago
  AI coding has massive factors that should make it the easiest to drive adoption and monetize.
  The biggest is FOMO. So many orgs have a principle-agent problem where execs are buying AI for their whole org, regardless of value. This is easier revenue than nickle-and-diming individuals.
  The second factor is the circular tech economy. Everyone knows everyone, everyone is buying from everyone, it's the same dollar changing hands back and forth.
  Finally, AI coding should be able to produce concrete value. If an AI makes code that compiles and solves a problem it should have some value. By comparison, if your product is _writing_, AI writing is kind of bullshit.
  [-]
  - johanvts 30 days ago
    > If an AI makes code that compiles and solves a problem it should have some value
    Depends if the cost to weed out the new problems it introduces outweighs the value of the problems solved.
    [-]
    - jrjeksjd8d 29 days ago
      To be clear this is me making the most generous case for LLMs, which is that some people really do just want a shitty app to check a box. In my experience fixing LLM-produced software is worse than just writing it from scratch.
      I think LLM writing replacing actual authors or AI "art" is fundamentally worthless though, so at least coding is worth more than "worthless"
  - HarHarVeryFunny 29 days ago
    I've got to wonder what the potential market size is for AI driven software development.
    I'd have to guess that competition and efficiency gains will reduce the cost of AI coding tools, but for now we've got $100 or $200/mo premium plans for things like Claude Code (although some users may exceed this and pay more), call it $1-2K/yr per developer, and in the US there are apparently about 1M developers, so even with a 100% adoption rate that's only $1-2B revenue spread across all providers for the US market.... a drop in the bucket for a company like Google, and hardly enough to create a sane Price-to-Sales ratio for companies like OpenAI or Anthropic given their sky-high valuations.
    Corporate API usage seems to have potential to be higher (not capped by a fixed size user base), but hard to estimate what that might be.
    ChatBots don't seem to be viable for long-term revenue, at least not from consumers, since it seems we'll always have things like Google "AI Mode" available for free.
- pdntspa 29 days ago
  [flagged]
cmiles8 30 days ago
A bit of PR puffery, but it is fair to say that between Gemini and others it’s now been clearly demonstrated that OpenAI doesn’t have any clear moat.
[-]
- londons_explore 30 days ago
  Their moat in the consumer world is the branding and the fact open ai has 'memory' which you can't migrate to another provider.
  That means responses can be far more tailored - it knows what your job is, knows where you go with friends, knows that when you ask about 'dates' you mean romantic relationships and which ones are going well or badly not the fruit, etc.
  Eventually when they make it work better, open ai can be your friend and confident, and you wouldn't dump your friend of many years to make another new friend without good reason.
  [-]
  - patrickmcnamara 30 days ago
    I really think this memory thing is overstated on Hacker News. This is not something that is hard to move at all. It's not a moat. I don't think most users even know memory exist outside of a single conversation.
    [-]
    - jorl17 30 days ago
      Every single one of my non-techie friends who use ChatGPT rely heavily on memory. Whenever they try something different to it, they get very annoyed that it just doesn't "get them" or "know them".
      Perhaps it'll be easy to migrate memories indeed (I mean there are already plugins that sort of claim to do it, and it doesn't seem very hard), but it certainly is a very differentiating feature at the moment.
      I also use ChatGPT as my daily "chat LLM" because of memory, and, especially, because of the voice chat, which I still feel is miles better than any competition. People say Gemini voice chat is great, but I find it terrible. Maybe I'm on the wrong side of an A/B test.
      [-]
      - shaftway 29 days ago
        This feels like an area Google would have an advantage though. Look at all of the data about you that Google has and it could mine across Wallet, Maps, Photos, Calendar, GMail, and more. Google knows my name, address, drivers license, passport, where I work, when I'm home, what I'm doing tomorrow, when I'm going on vacation and where I'm going, and whole litany of other information.
        The real challenge for Google is going to be using that information in a privacy-conscious way. If this was 2006 and Google was still a darling child that could do no wrong, they'd have already integrated all of that information and tried to sell it as a "magical experience". Now all it'll take is one public slip-up and the media will pounce. I bet this is why they haven't done that integration yet.
        [-]
        jorl17 29 days ago
        I used to think that, too, but I don't think it's the case.
        Many people slowly open up to an LLM as if they were meeting someone. Sure, they might open up faster or share some morally questionable things earlier on, but there are some things that they hide even from the LLM (like one hides thoughts from oneself, only to then open up to a friend). To know that an LLM knows everything about you will certainly alienate many people, especially because who I am today is very different from who I was five years ago, or two weeks ago when I was mad and acted irrationally.
        Google has loads of information, but it knows very little of how I actually think. Of what I feel. Of the memories I cherish. It may know what I should buy, or my interests in general. It may know where I live, my age, my friends, the kind of writing I had ten years ago and have now, and many many other things which are definitely interesting and useful, but don't really amount to knowing me. When people around me say "ChatGPT knows them", this is not what they are talking about at all. (And, in part, it's also because they are making some of it up, sure)
        We know a lot about famous people, historical figures. We know their biographies, their struggles, their life story. But they would surely not get the feeling that we "know them" or that we "get them", because that's something they would have to forge together with us, by priming us the right way, or by providing us with their raw, unfiltered thoughts in a dialogue. To truly know someone is to forge a bond with them — to me, no one is known alone, we are all known to each other. I don't think google (or apple, or whomever) can do that without it being born out of a two-way street (user and LLM)[1]. Especially if we then take into account the aforementioned issue that we evolve, our beliefs change, how we feel about the past changes, and others.
        [1] But — and I guess sort of contradicting myself — Google could certainly try to grab all my data and forge that conversation and connection. Prompt me with questions about things, and so on. Like a therapist who has suddenly come into possession of all our diaries and whom we slowly, but surely, open up to. Google could definitely intelligently go from the information to the feeling of connection.
        [-]
        shaftway 29 days ago
        Maybe. I haven't really heard many of the people in my circles describing an experience like that ("opening up" to an LLM). I can't imagine *anyone* telling a general-purpose LLM about memories they cherish.
        Do people want an LLM to "know them"? I literally shuddered at the thought. That sounds like a dystopian hell to me.
        But I think Google has, or can infer, a lot more of that data than people realize. If you're on Android you're probably opted into Google Photos, and they can mine a ton of context about you out of there. Certainly infer information about who is important to you, even if you don't realize it yourself. And let's face it, people aren't that unique. It doesn't take much pattern matching to come up with text that looks insightful and deep, but is actually superficial. Look at cold-reading psychics for examples of how trivial it is.
      - zephyrthenoble 30 days ago
        On the other side of the test, I don't know a non-tech person who uses ChatGPT at all.
        [-]
        HarHarVeryFunny 30 days ago
        Another data point: my generally tech savvy teenage daughter (17) says that her friends are only aware of AI having been available for last year (3 actually), and basically only use it via Snaphhat "My AI" (which is powered by OpenAI) as a homework helper.
        I get the impression that most non-techies have either never tried "AI", or regard it as Google (search) on steroids for answering questions.
        Maybe more related to his (sad but true) senility rather than lack of interest, but I was a bit shocked to see the physicist Roger Penrose interviewed recently by Curt Jaimungal, and when asked if he had tried LLMs/ChatGPT assumed the conversation was about the "stupid lady" (his words) ELIZA (fake chatbot from the 60's), evidentially never having even heard of LLMs!
        babyoil 29 days ago
        My mom does. She's almost 60. She asks for recipes and facts, asks about random illnesses, asks it why she's feeling sad, asks it how to talk to her friend with terminal cancer.
        I didn't tell her to download the app, nor she is a tech-y person, she just did on her own.
    - bsimpson 30 days ago
      I dislike that it has a memory.
      It creeps me out when a past session poisons a current one.
      [-]
      - transcriptase 30 days ago
        Exactly. I went through a phase of playing around with ESP32s and now it tries to steer every prompt about anything technology or electronics related back to how it can be used in conjunction with a microcontroller, regardless of how little sense it makes.
      - fluidcruft 30 days ago
        I agree. For me it's annoying because everything it generates is too tailored to the first stuff I started chatting with it about. I have multiple responsibilities and I haven't been able to get it to compartmentalize. When I'm wearing my "radiology research" support hat it assumes I'm also wearing my "MRI physics" hat and to weaves everything for MRI. It's really annoying.
        [-]
        00deadbeef 30 days ago
        Use project-only memory https://help.openai.com/en/articles/10169521-projects-in-cha...
        [-]
        fluidcruft 30 days ago
        Thank you! I feel really dumb for not knowing about that!
      - bandrami 30 days ago
        Agree. Memory is absolutely a misfeature in almost every LLM use case
      - 00deadbeef 30 days ago
        You can turn it off
    - ibuildbots 29 days ago
      It doesn't even change the responses a lot. I used ChatGPT for a year for a lot of personal stuff, and tried a new account with basic prompts and it was pretty much the same. Lots of glazing.
  - dufir 30 days ago
    What kind of a moat is that? I think it only works in abusive relationships, not consumer economies. Is OpenAIs model being an abusive money grubbing partner? I suppose it could be!
    [-]
    - wolfhumble 30 days ago
      If you have all your “stuff” saved on ChatGPT, you’re naturally more likely to stay there, everything else being more or less equal: Your applications, translations, market research . . .
      [-]
      - fluidcruft 30 days ago
        I think this is one of the reasons I prefer claude-code and codex. All the files are on my disks and if claude or codex were to disappear nothing is lost.
      - jennyholzer3 30 days ago
        [dead]
  - locknitpicker 30 days ago
    > Their moat in the consumer world is the branding and the fact open ai has 'memory' which you can't migrate to another provider.
    Their 'memory' is mostly unhelpful and gets in the way. At best it saves you from prompting some context, but more often than not it adds so much irrelevant context that it over fits responses so hard that it makes them completely useless, specially in exploratory sessions.
  - tootie 30 days ago
    It's certainly valuable but you can ask Digg and MySpace how secure being the first mover is. I can already hear my dad telling me he is using Google's ChatGPT...
  - afavour 30 days ago
    But Google has your Gmail inbox, your photos, your maps location history…
    [-]
    - Ianjit 30 days ago
      I think an OpenAI paper showed 25% of GPT usage is “seeking information”. In that case Google also has a an advantage from being the default search provider on iOS and Android. I do find myself using the address bar in a browser like a chat box.
      https://cdn.openai.com/pdf/a253471f-8260-40c6-a2cc-aa93fe9f1...
    - no-name-here 28 days ago
      > your maps location history
      Note that since 2024, Google no longer has your location history on their servers, it's only stored locally: https://www.theguardian.com/technology/article/2024/jun/06/g...
  - JumpCrisscross 30 days ago
    > Their moat in the consumer world is the branding and the fact open ai has 'memory' which you can't migrate to another provider
    This sounds like first-mover advantage more than a moat.
    [-]
    - keeda 30 days ago
      The memory is definitely sort of a moat. As an example, I'm working on a relatively niche problem in computer vision (small, low-resolution images) and ChatGPT now "knows" this and tailors its responses accordingly. With other chatbots I need to provide this context every time else I get suggestions oriented towards the most common scenarios in the literature, which don't work at all for my use-case.
      That may seem minor, but it compounds over time and it's surprising how much ChatGPT knows about me now. I asked ChatGPT to roast me again at the end of last year, and I was a bit taken aback that it had even figured out the broader problem I'm working on and the high level approach I'm taking, something I had never explicitly mentioned. In fact, it even nailed some aspects of my personality that were not obvious at all from the chats.
      I'm not saying it's a deep moat, especially for the less frequent users, but it's there.
      [-]
      - JumpCrisscross 30 days ago
        > may seem minor, but it compounds over time and it's surprising how much ChatGPT knows about me now
        I’m not saying it’s minor. And one could argue first-mover advantages are a form of moat.
        But the advantage is limited to those who have used ChatGPT. For anyone else, it doesn’t apply. That’s different from a moat, which tends to be more fundamental.
        [-]
        keeda 29 days ago
        Ah, I guess I've been interpreting "moat" narrowly, such as, keeping your competitors from muscling in on your existing business, e.g. siphoning away your existing users. Makes sense that it applies in the broader sense as well, such as say, protecting the future growth of your business.
      - irishcoffee 30 days ago
        Sounds similar to how psychics work. Observing obvious facts and pattern matching, except in this case you made the job super easy for the psychic because you gave it a _ton_ of information, instead of a psychic having to infer from the clothes you wear, your haircut, hygiene, demeanor, facial expression etc.
        [-]
        keeda 29 days ago
        Yeah, it somewhat is! It also made some mistakes analogous to what psychics would based on the limited sample of exposure it had to me.
        For instance, I've been struggling against a specific problem for a very long time, using ChatGPT heavily for exploration. In the roast, it chided me for being eternally in search of elegant perfect solutions instead of shipping something that works at all. But that's because it only sees the targeted chats I've had with it, and not the brute force methods and hacks I've been piling on elsewhere to make progress!
        I'd bet with better context it would have been more right. But the surprising thing is what it got right was also not very obvious from the chats. Also for something that has only intermittent existence when prompted, it did display some sense of time passing. I wonder if it noticed the timestamps on our chats?
        Notably, that roast evolved into an ad-hoc therapy session and eventually into a technical debugging and product roadmap discussion.
        A programmer, researcher, computer vision expert, product manager, therapist, accountability partner, and more all in a package that I'd pay a lot of money if it wasn't available for free. If anything I think the AI revolution is rather underplayed.
  - lelanthran 30 days ago
    > Their moat in the consumer world is the branding and the fact open ai has 'memory' which you can't migrate to another provider.
    Branding isn't a moat when, as far as the mass market is concerned, you are 2 years old.
    Branding is a moat when you're IBM, Microsoft (and more recently) Google, Meta, etc.
  - thenanyu 29 days ago
    You can prompt the model to dump all of the memory into a text file and import that.
    In the onboarding flow, I can ask you, "Do you use another LLM?" If so, give it this prompt and then give me the memory file that outputs.
  - iknowstuff 30 days ago
    I just learned Gemini has "memory" because it mixed its response to a new query with a completely unrelated query I had beforehand, despite making separate chats for them. It responded as if they were the same chat. Garbage.
    [-]
    - jeffbee 30 days ago
      I recently discovered that if a sentence starts with "remember", Gemini writes the rest of it down as standing instructions. Maybe go look in there and see if there is something surprising.
    - pests 30 days ago
      Its a recent addition. You can view them in some settings menu. Gemini also has scheduled triggers like "Give me a recap of the daily news every day at 9am based on my interests" and it will start a new chat with you every day at 9am with that content.
  - mupuff1234 30 days ago
    Couldn't you just ask it to write down what it knows about you and copy paste into another provider?
- xnx 30 days ago
  The next realization will be that Claude isn't clearly(/any?) better than Google's coding agents.
  [-]
  - Analemma_ 30 days ago
    I think Gemini 3.0 the model is smarter than Opus 4.5, but Claude Code still gives better results in practice than Gemini CLI. I assume this is because the model is only half the battle, and the rest is how good your harness and integration tooling are. But that also doesn't seem like a very deep moat, or something Google can't catch up on with focused attention, and I suspect by this time next year, or maybe even six months from now, they'll be about the same.
    [-]
    - overfeed 30 days ago
      > But that also doesn't seem like a very deep moat, or something Google can't catch up on with focused attention, and I suspect by this time next year, or maybe even six months from now, they'll be about the same.
      The harnessing in Google's agentic IDE (Antigravity) is pretty great - the output quality is indistinguishable between Opus 4.5 and Gemini 3 for my use cases[1]
      1. I tend to give detailed requirements for small-to-medium sized tasks (T-shirt sizing). YMMV on larger, less detailed tasks.
  - SilverSlash 30 days ago
    Claude is cranked to the max for coding and specifically agentic coding and even more specifically agentic coding using Claude Code. It's like the macbook of coding LLMs.
  - nmfisher 30 days ago
    Claude Code + Opus 4.5 is an order of magnitude better than Gemini CLI + Gemini 3 Pro (at least, last time I tried it).
    I don't know how much secret sauce is in CC vs the underlying model, but I would need a lot of convincing to even bother with Gemini CLI again.
    [-]
    - cageface 30 days ago
      That hasn’t been my experience. I agree Opus has the edge but it’s not by that much and I still sometimes get better results from Gemini, especially when debugging issues.
      Claude Code is much better than Gemini CLI though.
- amelius 30 days ago
  If the bubble doesn't burst in the next few days, then this is clearly wrong.
  [-]
  - willparks 30 days ago
    Next few days? Might be a bit longer than that.
    [-]
    - amelius 30 days ago
      Why? They said "clearly demonstrated".
      If it is so clear, then investors will want to pull their money out.
      [-]
      - array_key_first 30 days ago
        Most investors are dumb as rocks, or, at least, don't know shit about what they're investing in. I mean, I don't know squat about chemical manufacturing but I have some investment in that.
        It's not about who's the best, it's about where the market is. Dogpiling on growing companies is a proven way to make a lot of money, so people do it, and it's accelerated by index funds. The REAL people supporting Google and Nvidia isn't wallstreet, it's your 401K.
      - esafak 30 days ago
        What if all investors don't agree with this article?
  - Octoth0rpe 30 days ago
    Out of curiosity, why that specific timeframe? is there a significant unveiling supposed to happen? Something CES related?
dosinga 29 days ago
Google-as-the-new-Microsoft feels about right. Windows 1 was a curiosity, 2 was “ok”, and 3.x is where it started to really win. Same story with IE: early versions were a joke, then it became “good enough” + distribution did the rest.
Gemini 3 feels like Google’s “Windows 3 / IE4 moment”: not necessarily everyone’s favorite yet, but finally solid enough that the default placement starts to matter.
If you are the incumbent you don't need to be all that much better. Just good enough and you win by default. We'll all end up with Gemini 6 (IE 6, Windows XP) and then we'll have something to complain about.
m348e912 30 days ago
Gemini 3 is great, I have moved from gpt and haven't looked back. However, like many great models, I suspect they're expensive to run and eventually Google will nerf the model once it gains enough traction, either by distillation, quantizing, or smaller context windows in order to stop bleeding money.
Here is a report (whether true or not) of it happening:
https://www.reddit.com/r/GeminiAI/comments/1q6ecwy/gemini_30...
[-]
- dalenw 29 days ago
  While I don't use Gemini, I'm betting they'll end up being the cheapest in the future because Google is developing the entire stack, instead of relying on GPUs. I think that puts them in a much better position than other companies like OpenAI.
  https://cloud.google.com/tpu
- an0malous 29 days ago
  Yeah that’s textbook enshittification, it’s inevitable
KellyCriterion 30 days ago
my guess is the following:
Google can afford to run Gemini for a looong time without any ads, while OpenAI needs necessarily to bring in some revenue: So OpenAI will have to do something (or they believe they can raise money infinitely)
Google can easily give Gemini without Ads to the users for the next 3 - 4 years, forcing OpenAI to cripple their product earlier with Ads because of the need for any revenue
I think Google & Antropic will be one of the two winners; not sure about OpenAI, Perplexity & Co - maybe OpenAI will somehow merge with Microsoft?
[-]
- jaccola 30 days ago
  I’m surprised Perplexity isn’t already dead! Makes me question my ability to evaluate the value/sticking power of these tools.
  (Unless it is dead if we could see DAUs…)
  [-]
  - g947o 30 days ago
    My experience is that Perplexity is slightly better at providing facts than ChatGPT (in default mode), probably because (almost) everything comes from a source, not just the model's training set. Although Perplexity does mess up numbers as well.
    My most recent experiment:
    How many Google CEOs there have been?
    Followed by the question
    So 3 CEOs for 27 years. How does that number compare to other companies of this size
    ChatGPT just completely hallucinates the answer -- 5 Microsoft CEOs over 50 years, 3 Amazon CEOs over 30 years, 2 Meta CEOs over 20 years which are just obviously wrong. You don't need to do a search to know these numbers -- they are definitely in the training dataset (barring the small possibility that there has been a CEO change in the past year in any of these companies, which apparently did not happen)
    But Perplexity completely nailed it on first attempt without any additional instructions.
  - artdigital 30 days ago
    I use Perplexity all the time for search. It's very good at exactly that - internet search. So when using it for search related things it really shines
    Yeah sure ChatGPT can spam a bunch of search queries through their search tool but it doesn't really come close to having Perplexity's search graph and index. Their sonar model is also specifically built for search
  - KellyCriterion 30 days ago
    The thing is: Perplexity is quite good for some things; though, it has no traction outside of tech? Most non-techies havent heard about Gemini, what Ive seen/heard. (regardless the fact that they use the Google-Search-AI-overview every day)
  - prodigycorp 30 days ago
    I’ve moved all cursory searches to google.ai – perplexity’s future is dim.
- UberFly 30 days ago
  Microsoft will fund OpenAI for as long as it's needed. What is their alternative?
  [-]
  - KellyCriterion 30 days ago
    You are right:
    How long will they do it? Id expect investors to roar up some day? Will MS fund them infinitely just for sake of "staying in the game"? According to the public numbers of users, investments, scale etc., OpenAI will need huge amounts of money in the next years, not just "another 10 billion", thats my understanding?
djoldman 30 days ago
Alternate link: https://www.msn.com/en-us/money/other/how-google-got-its-gro...
zeroonetwothree 30 days ago
I don't think it's really "ahead" but it's pretty close now. There's not that big a difference among the SOTA models, they all have their pros/cons.
[-]
- mcast 30 days ago
  It’s incredibly impressive to see a large company with over 30x as many employees (or 2x if you compare with GDM) than OAI step back into the AI race compared to where they were with Bard a few years ago.
  Google has proved it doesn’t want to be the next IBM or Microsoft.
  [-]
  - coffeebeqn 30 days ago
    Why are people so surprised? Attention Is All You Need was authored by Googlers. It’s not like they were blindsided.. OpenAI prouctionized it first but it didn’t make sense to count Google out given their AI history?
    [-]
    - fourside 30 days ago
      Huh? They absolutely were blindsided and the evidence is there. No one expected ChatGPT to take off like it did, not even OpenAI. Google put out some embarrassing products out for the first couple of years, called a code red internally, asked Sergey and Larry to come back. The fact that they recovered doesn’t mean they weren’t initially blindsided.
      People are surprised because Google released multiple surprisingly bad products and it was starting to look like they had lost their edge. It’s rare for a company their size to make such a big turnaround so quickly.
  - jasonfarnon 30 days ago
    "the next IBM or Microsoft."
    Actually Microsoft has also shown it doesn't want to be the next IBM. I think at this point Apple is the one where I have trouble seeing a long-term plan.
    [-]
    - gbear605 30 days ago
      Microsoft was sort of showing that a year ago, and then spent the whole last year showing everyone that they're just another IBM.
      [-]
      - grumbelbart2 30 days ago
        It probably depends on what "The next IBM" means for people. Microsoft is so deeply embedded into companies right now that for larger cooperation it's practically impossible to get rid of them, and their cloud-driven strategy is very profitable.
        [-]
        gbear605 30 days ago
        IBM still makes a lot of money from large corporations that depend on them. But yet…
  - p1esk 30 days ago
    You should compare the number of top AI scientists each company has. I think those numbers are comparable (I’m guessing each has a couple of dozen). Also how attractive each company is to the best young researchers.
  - culi 30 days ago
    We're talking about code generation here but most people's interactions with LLMs are through text. On that metric Google has led OpenAI for over a year now. Even Grok in "thinking" mode leads OpenAI
    https://lmarena.ai/leaderboard
    Google also leads in image-to-video, text-to-video, search, and vision
- prpl 30 days ago
  Google, as a company, is easily ahead even if the model isn't, for various reasons.
  Their real mote is the cost efficiency and the ad business. They can (probably) justify the AI spend and stay solvent longer than the market can stay irrational.
HackerThemAll 30 days ago
The best decision for Google happened like 10 years ago when they started manufacturing their own silicon for crunching neural nets. No matter if they had a really good crystal ball back then, smart people, time travel machine or just luck, it pays for them now. They don't need to participate in that Ponzi scheme that OpenAI, Nvidia and Microsoft created, and they don't need to wait in line to buy Nvidia cards.
[-]
- jeffbee 30 days ago
  It had to have been launched longer ago than that because their first public-facing, TPU-using generative product was Inbox Smart Reply, which launched more than 10 years ago. Add to that however much time had to pass up to the point where they had the hardware in production. I think the genesis of the project must have been 12-15 years ago.
  [-]
  - jillesvangurp 30 days ago
    The acquired podcast did a nice episode on the history of AI in Google recently going back all the way to when they were trying to do the "I feel lucky", early versions of translate, etc. All of which laid the ground work for adding AI features to Google and running them at Google scale. That started early in the history of Google when they did everything on CPUs still.
    The transition to using GPU accelerated algorithms at scale started happening pretty early in Google around 2009/2010 when they started doing stuff with voice and images.
    This started with Google just buying a few big GPUs for their R&D and then suddenly appearing as a big customer for NVidia who up to then had no clue that they were going to be an AI company. The internal work on TPUs started around 2013. They deployed the first versions around 2015 and have been iterating on those since then. Interestingly, OpenAI was founded around the same time.
    OpenAI has a moat as well in terms of brand recognition and diversified hardware supplier deals and funding. Nvidia is no longer the only game in town and Intel and AMD are in scope as well. Google's TPUs give them a short term advantage but hardware capabilities are becoming a commodity long term. OpenAI and Google need to demonstrate value to end users, not cost optimizations. This is about where the many billions on AI subscription spending is going to go. Google might be catching up, but OpenAI is the clear leader in terms of paid subscriptions.
    Google has been chasing different products for the last fifteen years in terms of always trying to catch up with the latest and greatest in terms messaging, social networking, and now AI features. They are doing a lot of copycat products; not a lot of original ones. It's not a safe bet that this will go differently for them this time.
    [-]
    - dahcryn 30 days ago
      but cost is critical. It's been proven customers are willing to pay +- 20/month, no matter how much underlying cost there is to the provider.
      Google is almost an order of magnitude cheaper to serve GenAI compared to ChatGPT. Long term, this will be a big competitive advantage to them. Look at their very generous free tier compared to others. And the products are not subpar, they do compete on quality. OpenAI had the early mover advantage, but it's clear the crowd who is willing to pay for these services, is not very sticky and churn is really high when a new model is release, it's one of the more competitive markets.
      [-]
      - jeffbee 30 days ago
        I don't even know if it amounts to $20. If you already pay for Google One the marginal cost isn't that much. And if you are all in on Google stuff like Fi, or Pixel phones, YouTube Premium, you get a big discount on the recurring costs.
  - dgacmu 30 days ago
    About 12. First deployed mid 2015.
keithgroves 30 days ago
I like using gemini because it's so much cheaper when I'm running tests on enact protocol. I ask it to build multiple tools and let it run.
theturtletalks 30 days ago
All of this seems like manufactured hype for Gemini. I use GPT-5.2, Opus 4.5, and Gemini 3 flash and pro with Droid CLI and Gemini is consistently the worst. It gets stuck in loops, wants to wipe projects when it can’t figure out the problem, and still fails to call tools consistently (sometimes the whole thread is corrupted and you can’t rewind and use another model).
Terminal Bench supports my findings, GPT-5.2 and Opus 4.5 are consistently ahead. Only Junie CLI (Jetbrains exclusive) with Gemini 3 Flash scores somewhat close to the others.
It’s also why Ampcode made Gemini the default model and quickly back tracked when all of these issues came to light.
[-]
- AlexCoventry 30 days ago
  https://paulgraham.com/submarine.html
- petesergeant 30 days ago
  Claude for writing the code, Codex for checking the code, Gemini for when you want to look at a pretty terminal UI.
  [-]
  - j45 30 days ago
    Gemini is pretty decent at ingesting and understanding large codebases before providing it to Claude.
- alex1138 30 days ago
  I'm pretty high on Claude, though not an expert on coding or LLMs at all
  I'm naturally inclined to dislike Google from what they censor, what they consider misinformation, and just, I don't know, some of the projects they run (many good things, but also many dead projects and lying to people)
mythz 30 days ago
Gemini CLI is too slow to be useful, kind of surprised it was even offered and marketed given how painful it is to use. I thought it'd have to be damaging to the Gemini brand to get people to try it out, suffer painful UX then immediately stop using it. (Using it from Australia may also contribute to its slow perf)
Antigravity was also painful to use at launch where more queries failed then succeeded, however they've basically solved that now to the point where it's become my most used editor/IDE where I've yet to hit a quota limit, despite only being on the $20/mo plan - even when using Gemini 3 Pro as the default model. I also can't recall seeing any failed service responses after a month of full-time usage. It's not the fastest model, but very happy with its high quality output.
I expected to upgrade to a Claude Code Max plan after leaving Augment Code, but given how good Antigravity is now for its low cost, I've switched to it as my primary full-time coding assistant.
Still paying for GitHub Copilot / Claude Pro for general VS Code and CC terminal usage, but definitely getting the most value of out my Gemini AI Pro sub.
Note this is only for development, docs and other work product. For API usage in products, I primarily lean on the cheaper OSS chinese models, primarily MiniMax 2.1 for tool calling or GLM 4.7/KimiK2/DeepSeek when extra intelligence is needed (at slower perf). Gemini Flash for analyzing Image, Audio & PDFs.
Also find Nano Banana/Pro (Gemini Flash Image) to consistently generate the highest quality images vs GPT 1.5/SDXL,HiDream,Flux,ZImage,Qwen, which apparently my Pro sub includes up to 1000/day for Nano Banana or 100/day for Pro?? [1], so it's hard to justify using anything else.
If Gemini 3 Pro was a bit faster and Flash a bit cheaper (API Usage), I could easily see myself switching to Gemini for everything. If future releases get smarter, faster whilst remaining aggressively priced, in the future - I expect I will.
[1] https://support.google.com/gemini/answer/16275805?hl=en
[-]
- Aurornis 30 days ago
  Your first point
  > kind of surprised it was even offered and marketed given how painful it is to use. I thought it'd have to be damaging to the Gemini brand to get people to try it out, suffer painful UX then immediately stop using it.
  is immediately explain by your second point
  > Antigravity was also painful to use at launch where more queries failed then succeeded, however they've basically solved that now to the point where it's become my most used editor/IDE
  Switching tools is easy right now. Some people pick a tool and stick with it, but it's common to jump from one to the other.
  Many of us have the lowest tier subscriptions from a couple companies at the same time so we can jump between tools all the time.
  [-]
  - mythz 30 days ago
    Yeah except Gemini CLI is still bad after such a long time, every now and then I'll fire it up when I need a complex CLI command only to find that it hadn't improved and that I would've been better off asking an LLM instead. I don't quite understand its positioning, it's clearly a product of a lot of dev effort which I thought was for a re-imagined CLI experience, but I can't imagine anyone uses it as a daily driver for that.
    I retried Antigravity a few weeks after launch after Augment Code's new pricing kicked in, and was pleasantly surprised at how reliable it became and how far I got with just the free quota, was happy to upgrade to Pro to keep using it and haven't hit a quota since. I consider it a low tier sub in cost, but enables a high/max tier sub workflow.
- walthamstow 30 days ago
  I've only managed to hit the Opus 4.5 limit once after a really productive 4-hour session. I went for a cup of tea and by time I came back the limit had refreshed.
  I really think people are sleeping on how generous the current limits are. They are going to eat Cursor alive if they keep it this cheap.
  The IDE itself remains a buggy mess, however.
- NSPG911 30 days ago
  Not exactly sure why you are paying for Claude Pro, doesn't GH Copilot Pro give you Claude Opus 4.5 (which I'm assuming you are using since it is SOTA for now). OpenCode lets you use GH Copilot, so you can use OpenCode's ACP adapter and plug it into the IDE
  [-]
  - mrbungie 30 days ago
    Antigravity also gives access to Opus 4.5, and OpenCode lets use Antigravity APIs too.
sreekanth850 30 days ago
I use Claude for Code, Gemini for research and planning and GPT for motivation.
131hn 30 days ago
Hi gemini, i’ve booked some tickets for théater. Please look into mail mailbox, schedule it in my calendar and confirm me the planning for next week.
Beeing able to use natural processing my mail and calendar make me switch to gemini (app), there’s no way to achieve that with chatgpt (app)
Gemini is now good enough, even if i prefer chatgpt.
I only care about what i can do in the app as paying customer, even if, aside from that, i am working in IT with the SDK openrouter & MCP intégration & whatever RAG & stuff for work
[-]
edg5000 30 days ago
Is Gemini 3 still having all these bugs in the software around it? The model is great, but I had all these little bugs (billing issues, attachment not accesible by the model, countless other issues).
Then there is the CLI; I always got "model is overloaded" errors even after trying weekly for a while. I found Google has this complex priority system; their bigger customers get priority (how much you spend determines queue prio).
Anybody did some serious work with gemini-cli? Is it at Opus level?
ahartmetz 30 days ago
It's funny how companies have a stable DNA: Google comes from university research and continues to be good at research-y things, OTOH customer service...
dylan604 30 days ago
What CRT standard is this meant to be emulating? It can't be NTSC, it's too clean. Red would never display that cleanly. Red was infamous for bleeding as the saturation increased. Never had much experience with True PAL in that I've only ever seen PAL at 60Hz so I'm not sure if had the same bleeding red issue.
It's these kinds of details that cab really set your yet another emulator apart
[-]
- ceejayoz 30 days ago
  Wrong thread?
  [-]
  - dylan604 29 days ago
    Oops. yes. problem of having too many tabs open before caffeine
    correct location: https://news.ycombinator.com/item?id=46544042
qsort 30 days ago
It seems to me like this is yet another instance of just reading vibes, like when GPT 5 was underwhelming and people were like "AI is dead", or people thinking Google was behind last year when 2.5 pro was perfectly fine, or overhyping stuff that makes no sense like Sora.
Wasn't the consensus that 3.0 isn't that great compared to how it benchmarks? I don't even know anymore, I feel I'm going insane.
[-]
- buu700 30 days ago
  > It seems to me like this is yet another instance of just reading vibes, like when GPT 5 was underwhelming and people were like "AI is dead"
  This might be part of what you meant, but I would point out that the supposed underwhelmingness of GPT-5 was itself vibes. Maybe anyone who was expecting AGI was disappointed, but for me GPT-5 was the model that won me away from Claude for coding.
- SirensOfTitan 30 days ago
  I have a weakly held conviction (because it is based on my personal qualitative opinion) that Google aggressively and quietly quantizes (or reduces compute/thinking on) their models a little while after release.
  Gemini 2.5 Pro 3-25 benchmark was by far my favorite model this year, and I noticed an extreme drop off of quality responses around the beginning of May when they pointed that benchmark to a newer version (I didn't even know they did this until I started searching for why the model degraded so much).
  I noticed a similar effect with Gemini 3.0: it felt fantastic over the first couple weeks of use, and now the responses I get from it are noticeably more mediocre.
  I'm under the impression all of the flagship AI shops do these kinds of quiet changes after a release to save on costs (Anthropic seems like the most honest player in my experience), and Google does it more aggressively than either OpenAI or Anthropic.
  [-]
  - jasonfarnon 30 days ago
    This is a common trope here the last couple of years. I really can't tell if the models get worse or its in our heads. I don't use a new model until a few months after release and I still have this experience. So they can't be degrading the models uniformly over time, it would have to be a per-user kind of thing. Possible, but then I should see a difference when I switch to my less-used (wife's) google/openAI accounts, which I don't.
  - trvz 30 days ago
    It's the fate of people relying on cloud services, including the complete removal of old LLM versions.
    If you want stability you go local.
    [-]
    - gardnr 30 days ago
      Which models do you use locally?
  - vedmakk 30 days ago
    I can definitely confirm this from my experience.
    Gemini 3 feels even worse than GPT-4o right now. I dont understand the hype or why OpenAI would need a red alert because of it?
    Both Opus 4.5 and GPT-5.2 are much more pleasant to use.
paulpauper 30 days ago
Gemini is great because of far fewer rate limits compared to Open AI . When I hit a rate limit on Open AI, I switch to gemini.
[-]
- kelseyfrog 30 days ago
  Why not start with Gemini then?
clickety_clack 30 days ago
It would have to be significantly better than the competition for me to use a Google product.
[-]
- verelo 30 days ago
  Everyone’s all about OpenAI v Google, meanwhile i spend 99% of my day with Claude.
  It’s less about it having to be a Google product personally, it just needs to be better, which outside of image editing in Gemini pro 3 image, it is not.
  [-]
  - HelloMcFly 30 days ago
    I like Claude. I want to use it. But I just never feel comfortable with the usage limits at the ($20/month level at least) and it often feels like those limits are a moving, sometimes obfuscated target. Apparently something irritating happened with those limits over the holidays that convinced a colleague of mine to switch off Claude to Gemini, but I didn't dig for details.
    [-]
    - pluralmonad 29 days ago
      The only thing I'm aware of is that they drastically increased limits between Christmas and new years day. Message might have even said unlimited, I don't recall precisely.
      [-]
      - HelloMcFly 29 days ago
        Okay, he says they drastically (though temporarily) increased limits after surprising users with reduced limits that generated a strong reaction. This is the definition of hearsay though.
Fricken 30 days ago
Google's ham-fisted rollout of Bard as an answer to Chat GPT's was a confounding variable because otherwise there was little reason to doubt Google's ability to compete at AI over the long-term. It's in their DNA.
dev1ycan 30 days ago
They didn't get an edge, they are giving gemini pro for free for a year for university emails, obviously people will use it, after a year everyone will drop it, people aren't paying for this.
SirMaster 29 days ago
Why wouldn't google do well? They have one of the best data sources, which is a pretty big factor.
Also they have plenty of money, and talented engineers, and tensor chips, etc.
outside1234 30 days ago
Gemini is amazing. I switched to it and haven't looked back at ChatGPT. Very fast, very accurate, and pulls on the whole set of knowledge Google has from search.
wewewedxfgdf 30 days ago
I feel like Gemini made a giant leap forward in its coding capabilities and then in the past week or so it's become shit again - constantly dropping most of the code from my program when I ask it to add a feature - it's gone from incredible to basically useless.
[-]
- vedmakk 30 days ago
  Same experience here. Shame.
neves 30 days ago
Man, this looks like a press release written by Pinchai himself.
Doesn't WSJ even blush when publishing these kind of things?
[-]
- sumedh 30 days ago
  I dont think WSJ makes money from subscription fees, ads keep the lights on for journalism.
  [-]
  - huhkerrf 29 days ago
    It's not that hard to find the data. Subscription fees are the great majority of WSJ overall revenue. ~$500 million of $586 million total.
    https://www.amediaoperator.com/news/wsj-subs-rise-as-pricing...
jacooper 30 days ago
If only they could figure out how to fiz the hallucination issues in gemini models...
David666 30 days ago
antigravity+browser use，check all rss，hot.uihash.com
utopiah 30 days ago
Hot take : they didn't, pure players (OpenAI & Anthropics) just didn't go as fast as they claimed they would.
motbus3 30 days ago
I would add that openai is doing such a poor job at every aspect.
Before this gpt nonsense they were such an aspiration for a better world. They quickly turned around, slayer core people from its structure and solely focus on capitalising that they seem to be stuck on dead waters.
I dont see any reasons to use gpt5 at all.
David666 30 days ago
antigravity+browser-use
khalic 30 days ago
lol somebody got a fat check from google. American journalism has become such a joke
shimman 30 days ago
Being a monopoly worth trillions while having enough BUs to subsidize anything you can imagine does have its perks.
[-]
- xnx 30 days ago
  Also having invented the transformer architecture, doing "AI" since it was called "machine learning", and having data centers and TPUs that define state of the art.
  [-]
  - shimman 29 days ago
    Yes and they would have never been able to do any of this without their illegal monopoly that needs to be broken up and sliced/diced to benefit society.
- dilap 30 days ago
  Well sure, but lots of big companies have all the resources in the world and can't execute. Google really did turn things around in an impressive way.
- asa400 30 days ago
  Additionally, they have built-in distribution and integration for their products. I don’t know how folks don’t see that as a massive advantage.
  It’s like Microsoft and Internet Explorer in the 90s but on a much larger scale both in the breadth (number of distribution channels) and depth (market share of those channels in their respective verticals).
  [-]
  - maxkfranz 30 days ago
    That's true. It's also a fine line to walk for Google.
    Google has recently received regulatory pressure, for instance, just like Microsoft had trouble in the late 90s.
- paxys 30 days ago
  Didn't work for Meta
  [-]
  - blitzar 30 days ago
    Skill issue.
  - ajross 30 days ago
    Or Apple. Also Microsoft who were the ones bankrolling OpenAI before it started floating on its own equity.
    In point of fact money to throw at AI development is essentially free, not one of the big players sustains itself on income. Investors are throwing every dollar they have at everyone with a usable pitch.
    Whatever advantages Google had, financial stability is way, way down the list. For better theories, look to "Proven Leader at Scaling Datacenters" and "Decoupled from the CUDA Cartel via being an early moving on custom AI hardware".
  - shimman 30 days ago
    The issue with Meta is entirely confined to a single individual, Zuckerberg.
- echelon 30 days ago
  I think the single biggest bad thing Google does (Android and YouTube bad practices [2] aside):
  Google taxes every brand and registered trademark.
  The URL bar is no longer a URL bar. It's a search bar. Google used monopoly power to take over 90% of them.
  Now every product, every brand, every trademark competes in a competitive bidding process for their own hard-earned IP and market. This isn't just paying a fee, it's a bidding war with multiple sides.
  If you want to strike Google at their heart, make it illegal to place ads against registered trademarks.
  I'm going to launch "trademark-extortion.org" (or similar) and run a large campaign to reach F500 CEOs and legislators. This needs to end. This is the source that has allowed Google to wreak incredible harm on the entire tech sector. Happy to send this to Sam Altman and Tim Sweeny as well.
  [1] Rug-pulling web installs; leveraging 3rd party vendors to establish market share and treating them like cattle; scare walls and defaults that ensure 99.99% of users wind up with ads, Google payment rails, etc. ; Google ads / chrome / search funnel ; killing Microsoft's smartphone by continually gimping YouTube and Google apps on the platform ; etc. etc. etc. The evilest company.
  [-]
  - xnx 30 days ago
    Counterpoint: This allows competitors visibility in a space where consumers would otherwise blindly stick with overpriced brand names.
    [-]
    - echelon 30 days ago
      Is it?
      Google isn't exposed in such a way.
      Hundreds of billions of dollars are being deflected to a single entity, adding to customer costs.
      This turns into a game of who has the bigger ad budget.
      Companies can compete on product and win in the court of public opinion. TikTok marketing and reviews showcase this masterfully. Even Reddit reviews.
      Yet when you go to search for those things, the Google tax bridge troll steps in and asks for their protection money.
      They're a cancer on capitalism and fair competition.
      We tell business people it's illegal to perform bribery. Yet that's exactly what this is. It's zero sum multi-party bidding, so it's even more lucrative to the single party receiving the bribes.
      [-]
      - walterbell 30 days ago
        > zero sum multi-party bidding
        Does this apply to [Google, Apple] App Store advertising?
  - aeonik 30 days ago
    I think Firefox was first to the "Awesome Bar".
84729839278392 29 days ago
[dead]
ath3nd 27 days ago
[dead]
binarymax 30 days ago
This reads like a paid post from Google.
[-]
- almosthere 30 days ago
  This reads like a paid comment from OpenAI.
  [-]
  - binarymax 29 days ago
    It most certainly is not.
- zaphirplane 30 days ago
  There are an increasing number of breathless Gemini fans posting
byyoung3 30 days ago
I think Gemini is still far behind.
[-]
- nemo 30 days ago
  I did some tests with heavily math oriented programming using ChatGPT and Gemini to rubber-duck (not agentic), going over C performance tuning, checking C code for possible optimizations, going over math oriented code and number theory, and working on optimizing threading, memory throughput, etc. to make the thing go faster, then benchmarking runs of the updated code. Gemini was by far better than ChatGPT in this domain. I was able to test changes by benchmarking. For my use case it was night and day, Gemini's advice generally quite strong and was useful to significantly improve benchmarked performance, ChatGPT was far less useful for this use case. What will work for you will depend on your use case, how well your prompting is tuned to the system you're using, and who knows what other factors, but I have a lot of benchmarks that are clear evidence of the opposite of your experience.
  [-]
  - CamperBob2 29 days ago
    Which models? It's completely uninformative to say you compared "ChatGPT" and "Gemini." Those are both just brand names under which several different models are offered, ranging from slow-witted to scary-smart.
- lukebechtel 30 days ago
  why?
  [-]
  - AlienRobot 30 days ago
    I'm not OP but I tried to ask it today to explain to me how proxies work, and it felt like it couldn't give me an answer that it couldn't attribute to a link. While that sounds good on paper, the problem is that no matter what you ask, you end up with very similar answers because it has less freedom to generate text. While on the other hand, even if attributes everything to a link, there is no guarantee that link actually says what the LLM is telling you it says, so it's just the worse of both worlds.
    ChatGPT on the other hand was able to reformulate the explanation until I understood the part I was struggling with, namely what prevents a proxy from simply passing its own public key as the website's public key. It did so without citing anything.
    [-]
    - electroglyph 30 days ago
      did it tell you how smart your questions were, too?
- AshamedCaptain 30 days ago
  Maybe this week it is.
mbrumlow 30 days ago
lol, the story Disney did not make.
Just like the Disney movie, no touchy the Gemini.
bookman10 30 days ago
They're about the same as far as I can tell.
nottorp 30 days ago
It did? Just 1 minute ago Gemini told me it can't use my google workspace AGAIN. The query had nothing to do with any google workspace feature. It just randomly tells me what in the middle of any "conversation".