Indexing 100M vectors in 20 minutes on PostgreSQL with 12GB RAM

(blog.vectorchord.ai)

64 points | by gaocegege 5 days ago

4 comments

  • esafak 53 minutes ago
    How does it compare with ParadeDB and LanceDB?
  • nwellinghoff 4 days ago
    Too bad AWS does not support any of these other vector extensions in managed RDS.
  • duckbot3000 4 days ago
    Kinda makes you wonder why you need cloud for anything besides remote encrypted backups if you can run all that on 12GB
    • riku_iki 3 days ago
      What about the failover story if the server dies? PG failover setup is complicated, and cloud infra handles this for you.
      • logifail 9 minutes ago
        (Genuine question) What's your current plan for when your cloud provider goes offline? Do you have a failover story, or is it a case of "wait for them to come back online"?
      • positron26 1 hour ago
        Do we mean managed or PG on K8s like CNPG? In all cases, I use the infra to simplify things like having disk redundancy and failover nodes, not because 12GB is interesting.
        • riku_iki 1 hour ago
          Primarily managed PG, since you still need setup/maintenance/monitoring for your own K8s solution.
    • setr 4 days ago
      Because getting any hardware out of infra-team on premise is utterly miserable, across the board.
      • lelanthran 1 hour ago
        That's not the only alternative.

        Rent your VPS and add in extra volumes for like $10 per 100GB.

  • ayende 1 hour ago
    That suffers from a serious issue.

    You must have the data upfront; you cannot build this in an incremental fashion.

    There is also no mention of how this would handle updates, and from the description, even if updates are possible, this will degrade over time, requiring a new indexing batch.
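
A minimal sketch of the degradation concern raised above, assuming an IVF-style index (partitions built around fixed centroids, as is common for this kind of PostgreSQL vector index). It is plain Python, not taken from the article or the commenters, and all names and numbers are illustrative: vectors inserted after the build can only join the nearest existing partition, so as the data distribution drifts the partitions become unbalanced and a new indexing batch is eventually needed.

    import random

    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    def build_ivf(vectors, n_lists):
        # Toy stand-in for k-means: pick random vectors as the fixed centroids.
        centroids = random.sample(vectors, n_lists)
        lists = [[] for _ in centroids]
        for v in vectors:
            nearest = min(range(n_lists), key=lambda i: dist(v, centroids[i]))
            lists[nearest].append(v)
        return centroids, lists

    def insert(centroids, lists, v):
        # Later inserts can only join an existing partition; centroids never move.
        nearest = min(range(len(centroids)), key=lambda i: dist(v, centroids[i]))
        lists[nearest].append(v)

    random.seed(0)
    base = [[random.gauss(0, 1) for _ in range(8)] for _ in range(1000)]
    centroids, lists = build_ivf(base, n_lists=16)

    # Inserts drawn from a shifted distribution pile into a few partitions,
    # which is the sort of imbalance that forces a periodic re-indexing batch.
    for _ in range(1000):
        insert(centroids, lists, [random.gauss(3, 1) for _ in range(8)])

    sizes = sorted(len(l) for l in lists)
    print("smallest / largest partition after drift:", sizes[0], sizes[-1])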