8 comments

  • t0mk 1 day ago
    I like how you packed only the necessities - tiles for maps, routing, and a geocoding index in SQLite. I checked the Monaco deployment and missed lookup by street number, as someone else also pointed out.

    Why not create a "builder" repo, where people could generate their own local datasets by a bounding box?

    • packet_mover 1 day ago
      Yeah, house-number lookup is not there yet. The demo geocoder does place/street-level search + reverse, but house numbers need a richer address index - it's on the roadmap.

      Re: a bbox "builder" repo - it's an interesting idea. I could see it going two ways: (a) you want to run a bbox builder yourself, or (b) you want a simple way to specify a bbox so the dataset pack can be produced for you.

      I started with the "ship a known-good pack" approach because the build pipeline is the messy part, and I want deployed boxes to stay simple/reproducible.

      For your use case, which did you mean - run the build locally, or "draw/paste a bbox and get back a ready-to-run pack"? And would bbox be OK, or do you prefer admin boundaries (country/state/city)?

      • t0mk 15 hours ago
        I would prefer to run the build locally/myself, and to specify either a bounding box with lon and lat, or a country. The output would be data files and/or Docker images with the data. But as the sibling comment said, if you plan to sell it, I understand it makes less sense to offer the building logic.

        I get why you created the Monaco pack - it's a nice demo, fast to download and run. I would rather choose a big city where potential users live (London, NYC, the Bay Area?), but maybe there are technicalities that make that more complex.

        Drawing a box on a map in the browser and generating the tiles, routing, and geocoding DB sounds quite heavy for backend compute. There was a project with a website that could generate a tileset (a PMTiles file?) from a box drawn on a map, but I can't find it now. There was a limit to the box size, and above some threshold you had to have premium, contact sales, or something. I thought it was Protomaps, but there's no tool like that there now.

        Anyway really nice idea, I will follow your progress for sure!

        • packet_mover 8 hours ago
          Thanks, appreciate the context.

          Bounding box input is a really interesting idea. For my product, the best use would probably be to automate the request/fulfillment flow if this ever picks up. Letting someone draw/paste a bbox makes the request unambiguous and could be automated end-to-end.

          Btw, the pack build process isn’t that compute-heavy. It’s not instant, but as a rough data point: on my laptop, a small country like Austria/Slovakia is on the order of ~1 hour to build the three artifacts (tiles + routing tiles + SQLite geocoder).

    • mike_d 1 day ago
      > Why not create a "builder" repo, where people could generate their own local datasets by a bounding box?

      If you read the bottom of the page, they plan to sell it. Not sure how that works when all the software/data is open.

      • packet_mover 17 hours ago
        Yes - I do plan to sell Corviont, and it is worth explaining the licensing bit.

        It is built on open-source components and OSM-derived data, so the obligations are mostly (a) keeping third-party software licensing/attribution intact, and (b) OSM/ODbL compliance - clear attribution, and if you distribute derived OSM databases, meeting the ODbL share-alike requirements for those database artifacts.

        I am handling this via attribution in the UI/docs and a licensing reference bundled with the dataset that points to a public licensing/attribution repo containing the third-party license texts and details: http://github.com/corviont/licensing

        What I am selling is the packaging + ops side: ready-to-run region packs and (eventually) a signed updater for fleets.

  • onaclov2000 1 day ago
    Definitely interesting, I don't see an obvious comment on hardware requirements, do you know what those are?

    I've played around with OSRM, Nominatim, etc., but had to do some trickery to get them running on a Raspberry Pi.

    (For anyone interested in running these kinds of things on a Pi, I talk about it generally here; I need to post an update with more info at some point. http://blog.onaclovtech.com/2025/02/general-purpose-to-speci...)

    • packet_mover 1 day ago
      I do not have published benchmarks yet (I want to collect a few real region packs first and then write this up properly).

      One goal of Corviont is to avoid the on-device pain you hit with OSRM/Nominatim: the region pack is built once elsewhere, and the edge box mostly just serves prebuilt artifacts (PMTiles tiles, Valhalla routing tiles, and a SQLite geocoder index based on Nominatim).

      In practice, requirements scale with region size and traffic. For larger regions the main constraint is usually SSD storage plus enough RAM headroom for routing/cache. I also picked Valhalla partly because it generally has a smaller RAM footprint than OSRM at serving time (OSRM is extremely fast but tends to be more memory-hungry).

  • tomaskafka 1 day ago
    I love it, thanks!

    If I may have a feature request, I’d like to have only some of the features turned on - in my case it would be just the reverse geocoder (so I could skip the map and routing data download and storage).

    Right now I have my own reverse geocoder for https://weathergraph.app which downloads OSM dumps and builds an in-memory KD-tree for lookups. Surprisingly, the whole world fits in 3-4 GB of RAM, and the service starts in 90 seconds on a cheap VPS, no database needed - but of course, having a battle-tested solution that just works (and that someone else maintains) would help.
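
    The core of it is roughly this (illustrative sketch, not my actual code; it assumes (name, lat, lon) tuples already extracted from the dumps, and uses scipy's cKDTree on unit-sphere coordinates so nearest-neighbour lookups behave across the antimeridian):

      import numpy as np
      from scipy.spatial import cKDTree

      def to_xyz(lat, lon):
          # unit-sphere Cartesian coords so Euclidean nearest neighbour ~ great-circle nearest
          lat, lon = np.radians(lat), np.radians(lon)
          return np.column_stack((np.cos(lat) * np.cos(lon),
                                  np.cos(lat) * np.sin(lon),
                                  np.sin(lat)))

      def build_index(places):  # places: list of (name, lat, lon)
          names = [p[0] for p in places]
          lats = np.array([p[1] for p in places])
          lons = np.array([p[2] for p in places])
          return names, cKDTree(to_xyz(lats, lons))

      def reverse(names, tree, lat, lon):
          _, i = tree.query(to_xyz(np.array([lat]), np.array([lon]))[0])
          return names[i]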

    • packet_mover 1 day ago
      Thank you! And yes, Corviont is intentionally a stack of separable services behind one gateway - so "geocoder-only" is exactly the kind of config I want to support (skip PMTiles + Valhalla and ship only the SQLite index + reverse API).

      Re: Weathergraph - thanks for the details. Since you already run whole-world reverse geocoding on a single server, that's a bit different from Corviont's current regional/fleet packaging (where you ship only the area you need to each edge/on-prem deployment). A "world geocoder-only" pack could still make sense - but it's a different distribution/update story than my default.

      For your use case, do you want reverse results at the city/region/country level - or do you also need street/house number detail? That choice mostly determines how heavy a world geocoder-only pack needs to be.

      • tomaskafka 18 hours ago
        Thanks for reply! It’s great to hear it’s configurable.

        I didn’t realize how many use cases there are - you guessed right that for my use, the whole world but only up to city level is quite enough.

        On my todo list is adding landmarks, so that I could e.g. show a nearby mountain when outdoors and far away from a city - but then deciding which POI types to include gets much trickier.

        • packet_mover 16 hours ago
          Yeah, I also ran into that when ranking geocoding results - you want to show more "important"/notable places first. In your case, one pragmatic approach could be to keep only features with wikidata or wikipedia tags (and maybe a small whitelist like natural=peak, tourism=viewpoint/attraction).
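
          Roughly, the kind of filter I have in mind (illustrative sketch, not production code; it assumes features already parsed from an OSM extract into dicts with a "tags" map):

            # keep only "notable" POIs: anything tagged with wikidata/wikipedia,
            # plus a small whitelist of always-interesting types
            WHITELIST = {("natural", "peak"), ("tourism", "viewpoint"), ("tourism", "attraction")}

            def is_notable(tags):
                if "wikidata" in tags or "wikipedia" in tags:
                    return True
                return any((key, tags.get(key)) in WHITELIST for key in ("natural", "tourism"))

            def pick_landmarks(features):
                # features: dicts with a "tags" dict, parsed from an OSM extract
                return [f for f in features if is_notable(f["tags"])]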

          This is also why I lean on Nominatim-derived data for Corviont - it already condenses a lot of this into a single importance/ranking model. I have not tried a whole-world build though, so I am not sure how well that behaves at planet scale.

  • bikelang 1 day ago
    This is super cool. I’ve been kicking around an idea for ages regarding tile-based routing that I think would be excellent for offline routing. You could leverage the quadtree aspect of tiling to encapsulate faster, direct routes (i.e. highways), and as you go to deeper zoom levels you’d unlock smaller roads - even down to pathways. This keeps your in-memory graph small while traversing large distances (which would just be highways anyway), and once you’ve eliminated most of the distance, the remaining graph traversal on local roads would be small.
    • michaelt 1 day ago
      You might enjoy reading the papers "Highway Hierarchies Hasten Exact Shortest Path Queries" [1] and "Exact Routing in Large Road Networks using Contraction Hierarchies" [2] if you're interested in hierarchical approaches to shortest path routing.

      The algorithms do divide the map up into chunks that are themselves divided up and so on, but not on the strict geographical basis a quadtree uses. You might not want to divide Manhattan in two for routing purposes, even if the 74th longitude line runs straight through it.

      [1] https://turing.iem.thm.de/routeplanning/hwy/esaHwyHierarchie... [2] https://publikationen.bibliothek.kit.edu/1000028701/14297392...

    • packet_mover 1 day ago
      That’s a really interesting framing - you’re basically describing hierarchical routing / "zoom-level" graphs: do the long leg on a coarse network (highways), then refine locally as you get closer to origin/destination.

      FWIW, Valhalla already does a version of this: it partitions the routing graph into hierarchical tiles and runs with multiple hierarchy levels (highway / arterial / local) specifically to keep search + in-memory working set smaller on long routes: https://valhalla.github.io/valhalla/tiles/
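
      To make the hierarchy concrete, here is a rough sketch of the tiling (per that tiles doc the levels are fixed lat/lon grids of 4°, 1°, and 0.25° for highway/arterial/local - this is illustrative, not Valhalla's actual code):

        # Valhalla-style hierarchy levels: 0 = highway, 1 = arterial, 2 = local
        LEVEL_SIZE_DEG = {0: 4.0, 1: 1.0, 2: 0.25}

        def tile_id(lat, lon, level):
            size = LEVEL_SIZE_DEG[level]
            row = int((lat + 90) // size)
            col = int((lon + 180) // size)
            return row * int(360 / size) + col

        # a long route mostly touches coarse level-0/1 tiles along the way,
        # and only needs fine level-2 tiles near the origin and destination
        print([tile_id(43.7384, 7.4246, lvl) for lvl in (0, 1, 2)])  # Monaco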

      The "quadtree tile unlock" mental model is a nice way to think about it though - if you have a favorite paper / implementation that leans harder into the tiling aspect, I’d love a pointer. I’m currently focused on packaging + offline data consistency, but routing performance on constrained edge boxes is definitely a core constraint I care about.

    • krapht 1 day ago
      What you suggested has been done before - you might find a review of the literature fun if this sort of thing interests you, even if academic papers are pretty dry reading normally.
      • tomaskafka 1 day ago
        Also, game engines need to do pathfinding very fast.
  • Natfan 5 hours ago
    mild critique: this website looks _incredibly_ vibe coded.
    • packet_mover 4 hours ago
      It is built in a website builder (https://unicornplatform.com/). I kept it intentionally simple while I validate demand - the real artifact is the demo + repo/docs; the site mostly just points people there.

      If you have 1-2 concrete suggestions on what makes it feel that way (copy/layout/typography), I am happy to improve it.

  • dabreegster 1 day ago
    Quite cool to see this space being explored! https://github.com/headwaymaps/headway is another related project.
    • packet_mover 1 day ago
      Nice - thanks for the pointer. Headway is definitely a related "self-hosted maps stack" project.

      One place Corviont is trying to differentiate is the update story for edge/fleet deployments: the goal is a signed, resumable regional dataset updater (verify manifest -> atomic swap -> reload/rollback) so boxes in the field can stay fresh without manual rebuilds or "re-download the world" updates. Headway (at least from a quick skim) looks more like "bring your own data / regenerate when needed," which is totally fine for servers, but fleets usually need something more automated.
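
      For a sense of the swap step I mean, a simplified sketch (paths and names are placeholders; the real updater adds signature verification, resumable downloads, and a health check/reload before the old pack is discarded):

        import hashlib, os, shutil, tempfile

        def sha256(path):
            h = hashlib.sha256()
            with open(path, "rb") as f:
                for chunk in iter(lambda: f.read(1 << 20), b""):
                    h.update(chunk)
            return h.hexdigest()

        def apply_update(pack_path, expected_sha, live_link="/data/region"):
            # live_link is assumed to be a symlink pointing at the active pack directory
            if sha256(pack_path) != expected_sha:         # 1. verify against the (signed) manifest
                raise ValueError("checksum mismatch, refusing to install")
            staging = tempfile.mkdtemp(dir=os.path.dirname(live_link))
            shutil.unpack_archive(pack_path, staging)     # 2. unpack next to, never into, live data
            previous = os.path.realpath(live_link)
            tmp = live_link + ".new"
            os.symlink(staging, tmp)
            os.replace(tmp, live_link)                    # 3. atomic swap (POSIX rename)
            return previous                               # old pack dir is kept for rollback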

      If you've seen Headway (or similar) handle incremental/regional updates well, I'd love to learn from it - updater design is the big missing piece I'm validating demand for.

      • ikawe 1 day ago
        (Headway maintainer here)

        Indeed there is currently no incremental update in Headway, and deployments are largely an exercise left to the reader.

        For maps.earth (a Headway planet deployment), I typically rebuild the world, and then do a blue/green deployment.

        I guess the one exception is for transit routing. We have individual transit zones small enough to fit into memory, which can be deployed incrementally. There’s nothing really built in about it - just another level of indirection via our “travelmux” service which redirects your routing queries to a different backend depending on mode and region.

        • packet_mover 1 day ago
          Thanks for chiming in - super helpful context.

          I am trying to learn from real deployments as I design Corviont's updater for edge boxes (bandwidth caps, maintenance windows, unreliable WAN, atomic swap + rollback).

          When you say transit "zones" are small enough to deploy incrementally - what is the actual artifact per zone (roughly what format), and what sizes do you typically see?

          And when a transit zone dataset changes, how do you roll that out safely - do you restart/reload the backend that serves that zone, or do you bring up a new backend/version and then flip travelmux to point at it?

          • ikawe 1 day ago
            Transit routing is provided by OpenTripPlanner, so the deployment artifact is their OTP serialized graph format.

            So it’s not really incremental with respect to the existing transit zone deployment. I just mean I can redeploy a single transit zone with the latest GTFS without having to touch the other transit zones, tileserver, geocoder, etc.

            Deployment/rollback is handled by k8s config.

  • leros 1 day ago
    Does this not require a massive database of tiles?

    I ask because I've been looking to self-host some sort of map tile server, and they seem to have databases in the hundreds of GB.

    • KomoD 7 hours ago
      The size of the tiles will depend on how much detail you want.

      MapTiler has a bunch of datasets, anywhere from 385 MB to 527 GB, but the OSM dataset is only 70 GB (MBTiles format).

    • packet_mover 1 day ago
      Good question. Corviont is region-focused (you package one region), not "host the whole planet". Hundreds of GB is usually the full planet at high zoom / lots of layers.

      Also it’s not one giant tile DB - there are 3 datasets:

        - map tiles (PMTiles)
        - routing tiles (Valhalla tiles)
        - geocoder index (SQLite)
      
      For Monaco all three are tiny - you can see the exact files here: https://github.com/corviont/monaco-demo/tree/main/data

      For small countries like Austria/Slovakia, each is typically hundreds of MB.

  • willi59549879 1 day ago
    Searching by street name and number offline would be nice. In Google Maps, search only works when online.
    • packet_mover 1 day ago
      Good point - house-number search isn't there yet in Corviont.

      Right now the offline geocoder in the demo does place/street-level search + reverse, but street + house number ("Main St 12") isn't supported yet. It's explicitly on the near-term roadmap: richer geocoding output with house numbers and (optionally) street/area geometry instead of just centerpoints.

      • dennis16384 1 day ago
        Why not package photon?
        • packet_mover 1 day ago
          Photon is solid - but it comes with a very different operational profile than what I am aiming for.

          Photon is built on Elasticsearch (Java) - so it tends to mean a heavier index + higher RAM/CPU expectations and more moving parts. That's fine on a beefy server, but it is a rough fit for the "drop-in appliance on small edge/on-prem boxes (amd64/arm64) + simple ops" goal.

          Corviont's geocoder is intentionally "boring": a single SQLite file + an HTTP service, built from Nominatim-derived data. Fast startup, low RAM, easy to ship per-region, and it stays consistent with the rest of the stack.
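
          "Boring" in practice means reverse lookup is just a query against that one file. A rough sketch (the schema here is invented for illustration, not the actual Corviont schema, and real ranking uses the Nominatim importance data):

            import sqlite3

            def reverse_geocode(db_path, lat, lon, limit=1):
                # hypothetical schema: places(name, lat, lon) with an index on (lat, lon)
                con = sqlite3.connect(db_path)
                d = 0.05  # ~5 km bbox prefilter; squared-degree distance is only approximate
                rows = con.execute(
                    """SELECT name,
                              (lat - ?) * (lat - ?) + (lon - ?) * (lon - ?) AS dist2
                         FROM places
                        WHERE lat BETWEEN ? - ? AND ? + ?
                          AND lon BETWEEN ? - ? AND ? + ?
                        ORDER BY dist2 LIMIT ?""",
                    (lat, lat, lon, lon, lat, d, lat, d, lon, d, lon, d, limit),
                ).fetchall()
                con.close()
                return rows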

          That said - if there is demand for a "server-grade geocoder option" for people already comfortable running Elastic, I am not opposed to offering it as an alternative profile. The default is just optimized for constrained edge hardware and minimal moving parts.

          • dennis16384 20 hours ago
            Have you measured actual memory and disk requirements of a photon OpenSearch index vs your sqlite database?
            • packet_mover 17 hours ago
              No. When I looked at Photon and saw that it involves running Java plus an OpenSearch/Elasticsearch backend on the device, I assumed it would be heavier in terms of memory and moving parts than my setup (single SQLite file + small HTTP API).

              Have you (or anyone here) actually run Photon on edge-class hardware? If you have real-world numbers, I'd be interested in seeing them. When I add house-number search, Photon might be an easier route than enhancing my current approach.

              • karussell 8 hours ago
                > involves running Java plus an OpenSearch/Elasticsearch backend on the device

                Photon is just a single process and OpenSearch runs inside it (but you can run Photon and OpenSearch separately). Saying "Java" means more memory is in general wrong, as the underlying technology, Lucene, is heavily memory-optimized.

              • dennis16384 9 hours ago
                Well, trying it out is better than a thousand words.

                Let's start with an index of the whole of Spain - 2.4 GB download, 4 GB on disk: https://gist.github.com/dkourilov/e243270684b5973f1fac005c78...

                I'd say it's pretty usable for running an EU-sized country or several US states on any commodity PC. For embedded devices, it really depends on the device. On a Raspberry Pi it should be fine for batch geocoding; realtime (typeahead) will definitely lag.

                • packet_mover 8 hours ago
                  Thanks both - appreciate the clarification and the Spain datapoint.

                  Those numbers look pretty reasonable. I’ll keep Photon in mind and, as I get time to benchmark different approaches on a few representative regions/hardware, I’ll use the results to decide what the best way forward is - and I’ll publish the numbers when I do.