You can point DuckDB at almost any data source and boom, you get an SQL table that you can search, sum, or join to any other data. Or you can attach existing databases from completely independent DB systems, and query and join them as one, without first having to import anything.
It feels exhilarating (if you're into that sort of thing!)
My honeymoon with DuckDB wore off pretty quickly when I needed to compile it, myself, into a single-file concordance. I understand it's open source, so I'm free to be ignored. But it's positioning itself as a drop-in replacement for SQLite; a large part of SQLite's appeal is its ergonomics (its single-fileness), letting me deliver a rational object to my users.
EDIT: "drop-in replacement like SQLite", not "for SQLite".
> it's positioning itself as a drop-in replacement for SQLite
While SQLite is often used for comparison (“SQLite for OLAP”), I’ve never seen DuckDB market itself as a “drop-in” replacement. Where did you see that?
SQLite and DuckDB serve pretty different niches; DuckDB is less embeddable, but on the OLAP side it’s by far the best today. I wouldn’t ever see them as competing for the same app, though.
It works fine for this small set of emails, although the search isn't great, and there was more preprocessing than I would have liked. (I would prefer to be able to point a single binary at a PST or mbox file and have it magically serve it like this, even if it means I need a VPS to serve it.)
Here's one: a client of mine has a bunch of SnapLogic pipelines that are configured to send errors via email, and there is no other persistent logging system. This results in tens of thousands of emails that are insanely hard to search and parse for any useful auditing.
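As a rough illustration of how an archive like that could be made searchable, here is a minimal sketch that uses only the Python standard library (`mailbox` and `sqlite3`) to load an mbox file into a SQL table. The table layout, the `index_mbox` helper, and the sample messages are assumptions for the example, not anything from the actual SnapLogic setup. It only handles plain, non-multipart messages.

```python
import mailbox
import os
import sqlite3
import tempfile
from email.message import EmailMessage


def index_mbox(mbox_path, db):
    """Copy sender, subject, date, and plain-text body of each message into SQLite."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS emails "
        "(sender TEXT, subject TEXT, sent TEXT, body TEXT)"
    )
    box = mailbox.mbox(mbox_path)
    for msg in box:
        # Multipart messages are skipped here to keep the sketch short.
        payload = b"" if msg.is_multipart() else msg.get_payload(decode=True)
        body = (payload or b"").decode("utf-8", errors="replace")
        db.execute(
            "INSERT INTO emails VALUES (?, ?, ?, ?)",
            (msg["From"], msg["Subject"], msg["Date"], body),
        )
    box.close()
    db.commit()


# Build a tiny mbox with two made-up error emails so the example is runnable.
mbox_path = os.path.join(tempfile.mkdtemp(), "errors.mbox")
sample = mailbox.mbox(mbox_path)
for subject, text in [
    ("Pipeline error: orders_sync", "Connection timeout after 30s"),
    ("Pipeline error: billing_export", "Schema mismatch in column amount"),
]:
    m = EmailMessage()
    m["From"] = "pipelines@example.com"
    m["Subject"] = subject
    m.set_content(text)
    sample.add(m)
sample.flush()
sample.close()

db = sqlite3.connect(":memory:")
index_mbox(mbox_path, db)

# Once the emails are rows, auditing becomes an ordinary SQL query.
hits = [
    row[0]
    for row in db.execute(
        "SELECT subject FROM emails WHERE body LIKE ? ORDER BY subject",
        ("%timeout%",),
    )
]
print(hits)
```

A real archive would need multipart and attachment handling, and probably a full-text index rather than `LIKE`, but the shape of the approach stays the same.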
ClickHouse seems less heavily marketed, but it looks quite similar.
I'm aware of jmail.world, but they haven't (yet?) published the source code.
I had Claude hack something together recently: https://healdsburg-youcubed-emails.vercel.app/
These emails aren't published by default, but email archives are often included in responses to public records requests.
Ideally, anyone who receives one of these archives would be able to easily inspect it themselves and also make it available to others.