Docker Compose was production ready in 2015 and it still is today. I've lost track of how many projects I've deployed with it and never really ran into a single issue where Docker Compose was at fault. It's super solid.
I love Docker Compose. It is simple to use, easy to organize and manage, and very robust. Also, our company does not need to "scale" production aggressively. Our production load is very predictable, so Docker Compose fits like a glove.
We have been using it for more than five years now. Before that, we had a legacy deployment model, and I do not remember a single major issue related to Docker Compose.
We use it for both staging and production environments. The same Docker image validated in staging is deployed to production. Never fails!
It's simple to use only for toy use cases; that's why nobody uses it. The article everyone in this thread seems to like only goes as far as "I pushed to git so it must be ok", which is laughable, and I'm not even DevOps.
What happens if it errors during deployment, or after it? You wanna write custom (bash? :D) hooks for that? What about upgrading your 'very vertically scalable' box? What if it doesn't come up after the upgrade? Your downtime is suddenly hours, oops.
The k8s denial is strong and now rivals frontend frameworks denial. Never fails to amuse.
Fair points, and yes, failed deploys need to be handled explicitly.
In our case, the answer is not "hope and bash". We deploy versioned images, use health checks, monitor the result, and keep rollback simple: redeploy the previous known-good image/config. Host upgrades are also treated as maintenance events, with backups and a recovery path, not as something Compose magically solves.
But I think there is an opposite mistake too: assuming every production system should be operated like a high-scale tech company.
Many production workloads are boring, predictable, and business-critical. They do not need aggressive autoscaling, multi-node orchestration, or constant traffic-spike handling. They need reliable deploys, backups, monitoring, health checks, and a clear rollback path.
That is where Compose can be a good fit: simple operational model, understood failure modes, low moving parts.
Kubernetes becomes much more compelling when you actually need automated failover, rolling deploys, autoscaling, multi-node scheduling, and stronger deployment primitives.
Not needing Kubernetes is not necessarily denial, it is just choosing the complexity budget that matches the problem.
Definitely not a one-size-fits-all choice, but Kubernetes can be so easy, and there are so many benefits that get you from one small app to a medium-sized business, that it seems like a no-brainer for someone starting out. Spinning up k3s is pretty minimal overhead, but right away you can handle storage and backups very easily, automatic certs for all your apps with cert-manager is pretty much a one-and-done, traffic management for external and internal tools is easy, and even logins for websites is just an annotation in a yaml file. You can spin up and try out any software you want without spending time configuring it or setting up additional servers, and when you do need more hardware, it's one command on a virtual server, and just about as easy with physical hardware.
2-3 miniPCs, cloudflare, tailscale, and k3s can save (possibly tens of) thousands on SaaS products, and would probably scale you to a company of dozens AND host your product.
I think people are using different meanings of “production environment.”
I agree with gear54us and upvoted their comment, but I also understand what the author of the root comment is saying.
I have also delivered systems using Docker Compose that are actually running in production. The point I want to make is that people may define “production” differently depending on the number of active users, operational requirements, and risk level.
To me, this debate feels similar to the broader monolith vs. microservices debate.
Not necessarily. When you get to those numbers you're seeing dozens of teams with their own silos and deployment methods. So they might be responsible for the core business that's running 30 nodes and serving 100MM users a day, or they might be working on some internal portal or a WordPress site.
The post-receive hook gives you real-time feedback as it runs in the terminal where you did the git push. If something broke during the deployment, you'd see it in the output. If it's running in CI, you'd see a CI failure and get notified through whatever mechanisms you already have in place for deployment pipeline failures.
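For anyone who hasn't set one up: a minimal sketch of such a hook (paths, repo layout, and branch name are illustrative, not from the article):

```bash
#!/usr/bin/env bash
# hooks/post-receive in the bare repo on the server
set -euo pipefail
WORK_TREE=/srv/myapp        # hypothetical checkout the app runs from
GIT_DIR=/srv/myapp.git      # hypothetical bare repo this hook lives in

while read -r oldrev newrev ref; do
  if [ "$ref" = "refs/heads/main" ]; then
    git --work-tree="$WORK_TREE" --git-dir="$GIT_DIR" checkout -f main
    cd "$WORK_TREE"
    # everything below streams back to the terminal that ran `git push`
    docker compose pull
    docker compose up -d --build --wait
  fi
done
```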
Zero downtime server upgrades are easy. You could make a new server, ensure it's working in private and then adjust DNS or your floating IP address to point to the new server when you're happy. I've done this pattern hundreds of times over the years for doing system upgrades without interruption and safely. The only requirement is your servers are stateless but that's a good pattern in general.
Define production. Docker compose is fine for running a small internal service in production for dozens of users (i.e. not for developing said service, but for using it). I would assume it isn't fine to run a hyperscaler (but I wouldn't know). Those are extremes, and there are going to be a ton of situations in between.
I can't personally speak to what the limit of docker compose is, as I have only worked on the lower end of this: self hosting for personal use and for small internal services serving maybe 20 users.
From my personal experience, if the deployment strategy is thought through, Docker running through Compose can handle a few hundred thousand users per day without an issue, and could probably handle more with proper hardware upgrades.
Most people/apps only need the toy cases... If you're writing an internalized tool for a company that will only have a handful of users, then doing much more than compose for deployments is a violation of KISS and YAGNI.
Are you really going to try to get 4+ 9's of uptime for a small, one-off app? Do you really need to use a cloud distributed data store that only slows things down for no real gains in practice? Do you really think the cloud services are never down, and you're willing to spend a f*ck-ton of money to create a distributed app when historically an Access DB or VB6 app would have done the job?
I've moved applications deployed via compose pretty easily... compose down -t 30 then literally sftp the application to a backup location then to the new server which only needs the Docker Engine community stack installed.. then compose up -d ... tada! In terms of deployment, you can use github action runners if you want, or anything else... you can even do it by hand pretty easily.
How do I backup docker volumes?
I never found a native flow for backing up docker compose projects.
While not built in, k8s at least has Velero and Kasten. However, they are only possible because of snapshots https://kubernetes.io/docs/concepts/storage/volume-snapshots... and Kasten has a plugin-like architecture (because of k8s) that supports application-specific backups.
However I never found something like that for compose. And that is troublesome in bigger projects like sentry
The "easiest" way is to use bind mounts to a local directory (or multiple directories) instead of volumes. Then you can just use normal backup tooling.
Docker volumes (and bind mounts), however, have the minor problem of being hard to get a consistent copy of without stopping the service. You can work around this by, e.g., having ZFS or btrfs as the underlying FS and making a snapshot there. Otherwise, your software (like PostgreSQL) might also have its own online backup tooling.
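For named volumes specifically, one common workaround is a throwaway container that tars the volume out (a sketch; the volume name is made up, and you'd still want to stop or quiesce the service first for consistency):

```bash
# copy the contents of a named volume into a dated tarball on the host
docker run --rm \
  -v myproject_pgdata:/data:ro \
  -v "$(pwd)/backups:/backup" \
  alpine tar czf "/backup/pgdata-$(date +%F).tar.gz" -C /data .
```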
Why use a docker volume? Seems like a good way to accidentally lose your files. Just volume-mount from the external filesystem and back it up the same way you would any linux application's files - maybe stop it, maybe not, depends on how it uses files.
Extremely debatable. They still have never fully implemented health checks and auto healing. I have had compose itself behave in unexpected ways, weird things like not realizing the tag of an image it is running is actually in use, and letting prune commands yank it out from under the system. Other things I can't remember. I'd rather use something like Nomad or for simpler systems maybe plain systemd. But realistically kubernetes is a superior orchestrator in just about every way, and installing k3s is simple and k3s is actually production ready. I don't like kubernetes all that much as cluster tech, but as a container orchestrator it has a lot of nice features.
> Extremely debatable. They still have never fully implemented health checks and auto healing.
Agree.
Plus there's the monitoring of the host that is always overlooked in articles. I've ended up chucking Monit on there to monitor disk usage et al, and also used it for monitoring compose too and restarting containers.
And then there's Healthchecks.io, and external uptime monitoring... the list goes on. Properly monitoring systems, even single server systems, is not simple.
I'm totally onboard with k3s/k8s being better in a lot of cases.
But docker compose can actually be very sufficient for what many projects actually need.
Granted, I am a guy pushing for compose-based local dev setups and such, but going further, you often just cannot beat the simplicity of doing update QA or other CI/CD workloads in compose-based projects. Over the past years I have had dozens of projects where we replaced flaky, slow, maintenance-heavy pipelines with just `docker compose up --build --wait`. How come you say health checks are still broken?
I do use compose for some things, smaller one off type setups, and I’ve done the compose up --build CI/CD approach before. I’m generally not a fan of building on the production node outside of very small deployments. It can work, I just think it tends to blur the line between build and runtime more than I’m comfortable with.
Some of my concerns with compose aren’t purely technical. It makes it easier to lean on local state like volumes, bind mounts, and large .env files. Similar mechanisms exist in kubernetes, but the additional setup tends to force a bit more thought about whether they’re actually needed or just a shortcut.
On the health check side, they exist, but Compose doesn't fully act on them; that's the part that is missing. There's no built-in remediation or orchestration behavior tied to health status, which is why things like
https://github.com/willfarrell/docker-autoheal exist. It’s something that was never fully carried through in Docker itself.
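To make the gap concrete, here is roughly what a Compose health check looks like (service name, image, and endpoint are illustrative). The check runs and the status shows up in `docker compose ps`, but a container that flips to unhealthy is not restarted by Compose itself; `restart: unless-stopped` only reacts to the process exiting, which is exactly the hole autoheal fills:

```yaml
services:
  web:
    image: myapp:1.4.2
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/healthz"]
      interval: 30s
      timeout: 5s
      retries: 3
      start_period: 10s
```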
Really liked reading your blog. Bookmarked for the future. One question: for databases, do you recommend using containers as well? In development, I love the ease of running databases in docker compose, but I always worry about production in terms of resilience. Thoughts?
For databases, I usually host them on a separate server. This could either be through Docker Compose or a managed DB server. If a managed DB is affordable enough I'd reach for it.
It's because I like keeping my servers stateless when possible. It makes it easier to upgrade them in a zero downtime way later.
If your web server has your DB too, then you can't do zero downtime system upgrades. For example I would never upgrade Debian 12 to 13 on a live server. Instead, I'd make a new server with 13, get it all ready to go and tested and then when I'm ready flip over DNS or a floating IP address to the new server. This pattern works because both the old and new server can be writing to a database on a different server.
With all that said, if you were ok with 1 server, then yeah I'd for sure run it in Docker Compose.
At this stage the volume/persistence configuration for all of the major DBs is arguably extremely well understood and has been for years. The only real risk in running the DB as a container for most people is not configuring volumes for persistence correctly.
For most DBs it's one or two paths in the container, and virtually all DBs vendors have a reference Docker Compose example somewhere showing volume config. I can't remember the last time I ever "natively" installed a DB personally!
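As an example of how small that surface is, a typical vendor-style Postgres service is roughly this (a sketch, not an official file):

```yaml
services:
  db:
    image: postgres:16
    restart: unless-stopped
    environment:
      POSTGRES_PASSWORD: change-me   # placeholder; use secrets or an env file in practice
    volumes:
      - pgdata:/var/lib/postgresql/data   # the one path that must be persisted
volumes:
  pgdata:
```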
Do you prefer self-hosting the DB in a container OR using a managed service like RDS? I guess both can work depending on your level of comfort, and even though I am a big self-host guy, DB hosting is something that makes me nervous and I end up just leaving it to RDS etc.
The answer to this for me anyway depends entirely on the size of the solution, what the rest of the stack looks like, how many users, what is my support contract like etc etc, do I have to collaborate with other engineers or is it just me? Similarly, if you already have a bunch of ops guys managing some RDS stuff, it might make sense to just take advantage. RDS also comes with a ton of features a simple compose stack won't, especially around redundancy and disaster recovery.
I don't think there's a good one size fits all answer to whether hosting in Compose or RDS is right for you or a given project.
Not OP, but I think it depends heavily on your use case and where you are deploying... I've used containerized DBs as well as leaned into hosted DBs in a given cloud environment. I've tended to favor PostgreSQL for container dev simply because it is well supported in pretty much every first and second tier cloud provider out there.
It really comes down to YMMV... Sometimes for a singular app surface, it's easier to just use a compose file that includes the database. mailu/mailcow is a good example... you don't necessarily want to comingle email on the same server as other services.
That said, if you need to share a single DB or set of DBs across an application with several instances/deployments, then it makes much more sense to have a central deployment. I almost never do my own host-level install, instead relying on cloud hosting and management. The only real exception is MS-SQL on internal servers... MS-SQL in Docker is barely acceptable for dev, and missing a few key features you may actually want/need.
I probably used the wrong word. I meant more about managing volumes properly so we don't have data loss, plus backups, replication, etc. I assume going managed is easier if you can pay for it (e.g. RDS).
Thank you. I had been procrastinating on learning how to work with containers and finally got a handle on Docker Compose to play with self-hosting a coding agent and was worried that I'd once again procrastinated so long that I'd picked something up long after it was already dead.
Should you have a turkey sandwich for lunch in 2026? I don't know buddy just do whatever. There are ten thousand other sandwiches you could eat surely, but does turkey sound good for you?
What if you can't by yourself objectively evaluate if turkey sandwich sounds good?
It's not a matter of giving a universal answer to whether docker compose in production is fine, but how to evaluate it. Which features or safeguards necessary for a healthy production environment you forfeit when choosing plain docker compose? What's the tradeoff?
So we agree people shouldn't write off these posts with "does turkey sandwich sounds good to you" like it's some deep insight to default to the trivial answer?
The user plausibly isn't asking others to tell him what he "likes" -- a subjective preference -- but rather the user could be asking for objective research and information.
The analogy you are replying to is pointing out that it works just fine for production, so if you should use it or not is simply a matter of taste.
More to the point, there is no objectively right answer of what stack you should use. There are plenty of objectively wrong answers, but compose isn't one of them.
Even the wrong answers, it comes down to who's dealing with the mess? Who's paying for the site's uptime? Who isn't getting paid while the site isn't up?
But then if you're not going to answer a question on technology, and you won't motivate any of the choices, including the one to not give an answer, what's the value in participating to the conversation?
Your entire original comment looks like just an opportunity to be snarky. It's a longer version of "whatever", which you can literally throw around as an answer to anything.
In case you were curious, the subheading of the article already answers the question posed by the title:
> Yes, plain Docker Compose can still run production workloads in 2026—if you close the operational gaps it leaves: cleanup, healing, image pinning, socket security, and updates.
They are not using Docker at runtime for their services. Every company uses Docker for builds unless they have a particular cost or ethos reason to avoid it and are purely using Linux/podman/buildah/et al.
Thinking about it a little further, though, I believe Rancher Desktop has come a long way and may be eating market share.
There are more secure alternatives. Are you sure those you listed actually use it on the servers? I would guess that at least Spotify and Netflix uses some other container runtime than Docker on their production servers.
For a long time Docker was helpful and opened exposed ports on the firewall. So you wanted to access your redis ports locally and exposed it on the container? Now everything in there is accessible on the open internet.
I believe they've fixed it but I haven't used Docker in years so I wouldn't know.
Just the other day, someone was asking me if I knew of any options for replicating externaldns for Docker Compose. They didn't want "all the complexity" of running k8s, but wanted the features. This person was absolutely on the way to "building a Kubernetes".
That's mostly my take as well. I'm a big proponent of having separate teams for ops/deployment/sre from app development when you make the jump to k8s though. There's also a few bridge or in-between options for most cloud services as well.
To me, if there's generally fewer than 10 actual active users at any given time and/or you can easily tolerate 30-60m of down time now and then... I'd lean into the simpler option of docker-compose. While I generally think of compose as a dev tool first, it's definitely useful sometimes.
I think many of these issues are also solved by Podman and systemd depending on what kind of "production" you're building for. If you're building a linux-y appliance and you need to run a few containers I think Podman is a much better and more ergonomic way of doing so. I think perhaps that's less true for running a web service (where the linux environment is just a means to that end).
GP is talking about podman with generated systemd unit files (a.k.a. podman quadlet[0]), not the docker-compose-compatible podman-compose ...and I'd agree, systemd can manage services on a system just fine, and even better than any compose workload ever could.
journald will help with logs, and the pull policy[1] helps with mutable tags. What help do you need with "orphan containers"?
I'm not OP, but the whole podman compose topic gets quite confusing, as initially Podman didn't seem to know what they were trying to do. I've given some more context around it in previous comments.
You shouldn't be using podman compose. It's flimsy and doesn't work very well (at least it was last time I used it prior to Podman v3), and I'm pretty sure it doesn't have Red Hat's direct support.
Instead, activate Podman's Docker API compatibility socket, and simply set your `DOCKER_HOST` env var to that socket, and from there you can use your general docker client commands such as `docker`, `docker compose` and anything else that uses the Docker API. There are very few things that don't work with this, and the few things that don't are advanced setups.
For what it's worth, podman also has a thin wrapper (`podman compose`) which executes `docker-compose` or the old `podman-compose`. The docs should explain which it picks.
Note:
- `podman-compose` is an early attempt at remaking `docker-compose` v1, but for Podman. It parses the compose config, converts it to podman commands, and executes them.
- Later Podman wrote a Docker compatible socket instead, which can work with most docker clis that accept a `DOCKER_HOST` argument, including `docker` and `docker-compose` (both v1 and v2)
- `podman compose` is a thin wrapper that automatically selects `docker-compose` or `podman-compose` depending on which is installed.
Generally all you need is podman, docker-compose (the v2 binary), and that's it. From there you can use `podman` and/or `podman compose`.
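In practice that setup is a couple of commands (rootless user session assumed; the socket path comes from the runtime dir):

```bash
# enable Podman's Docker-compatible API socket for the current user
systemctl --user enable --now podman.socket

# point the Docker CLI and docker compose at it
export DOCKER_HOST=unix://$XDG_RUNTIME_DIR/podman/podman.sock

docker ps
docker compose up -d
```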
One of the nastiest aspects of migrating from docker to podman really is "what to do about docker compose?" coz there are three wildly divergent ways to answer that all of which really suck under certain specific circumstances.
I'm no fan of docker, and podman by itself is a step up, but the orchestration headaches are enough to ruin that.
I don't understand what you're asking here. The answer to that is probably nothing. That is unless you want:
- systemd to manage your containers
- You want to use K8s primitives (which are mostly compatible)
I'm unsure what the 3rd method is you're talking about. The nice thing about Podman's compose API is you don't have to change anything (mostly). You can point all your docker tooling to Podman's socket, and it'll (mostly) magically work.
* use systemd, Red Hat's favorite kitchen sink for handling everything from setting up sound services to mounting your home dir to logging, so why not this too, I guess.
* docker compose, where I have to run a whole separate podman service to lie to docker compose about not actually being docker.
* podman compose, which would be the obvious solution if it didn't just plain suck.
This is what stopped me from picking up Podman more, all our devs use Docker and have been writing compose files for years now. When the response at the time was "you're using Podman wrong, Quadlets are the hot stuff now" it just felt like too big a risk and commitment to jump to at the time. Have things settled more? Getting away from Docker is a bigger priority nowadays for us.
Docker (Compose) has some quirks compared to Podman (Compose), e.g. when using gvisor or a lot of internal networks. Depending on what you do, your mileage will vary, though.
Agreed. I found compose overlay files merged list values differently between the Docker and Podman versions, which was a PITA in teams running Docker & Linux dev machines.
FWIW, most of these issues were recently fixed in podman-compose. I can now use the current git version of podman-compose interchangeably with docker-compose.
And one nice thing about podman-compose is that it's ONE PYTHON FILE. You can just copy it into your source tree.
Podman in general is much higher quality than Docker. For the last 5 years, Docker was throwing all kinds of stuff against the wall to see what sticks, neglecting foundational issues.
Moreover, podman-compose integrates really easily with systemd. You can create a service by just running "podman-compose systemd" and following the prompt! No quadlet nonsense required.
Then you learn podman can't even list containers for all users properly, and it kind of starts smelling like the whole IPv4 vs IPv6 debacle: a bunch of vocal proponents wanting you to subject yourself to endless torture for no discernible reason.
I desperately WANT to like podman quadlets and keep trying to find a use case for them. But I always got the impression that the developers who implemented quadlets never actually had to manage multiple containers in a real production environment.
Having your whole application with its containers, volumes, and networks all defined together in one easy-to-read YAML file is a way better experience. Deployment is two steps: 1. `git clone foo` 2. `docker compose up -d`. You can see the state of the application containers with `docker compose ps`. You can run multiple compose applications on the same host and manage them separately by putting them in different directories.
With quadlets, you delegate everything to systemd. You have to break the configuration up into a bunch of tiny unit files and then separately copy them to /etc or a dedicated user's dotfiles. An application with a handful of containers and multiple networks/volumes/etc can spiral into a dozen unit files. Good luck SSH'ing into an unfamiliar system and understanding at a glance what it's doing. It is far more annoying to predictably deploy and tightly couples your application configuration to the host system configuration. (Even moreso if you created dedicated users for each application, which I understand is the recommended solution.)
If I'm just holding it wrong and there exists some better tooling to manage podman in prod that I don't know about, I'm happy to hear about it.
> Having your whole application with its containers, volumes, and networks all defined together in one easy-to-read YAML file is a way better experience. Deployment is two steps: 1. `git clone foo` 2. `docker compose up -d`. You can see the state of the application containers with `docker compose ps`. You can run multiple compose applications on the same host and manage them separately by putting them in different directories.
I always felt it the other way around: docker compose files are weird blobs of YAML whose location I have to hunt down, or parse their under-spec'd labels to find. I can't make them depend on any non-container services[0], they break my firewall rules[1], and I have to use a whole mess of bespoke tooling just to do normal start/stop/restart operations with them instead of using the same commands I use for literally any other service.
> With quadlets, you delegate everything to systemd. You have to break the configuration up into a bunch of tiny unit files and then separately copy them to /etc or a dedicated user's dotfiles.
The nice thing about quadlets is exactly that, they integrate with systemd and by extension the rest of the system. I don't have to think about `webapp.container` as a "Docker container" I can think of it as just `webapp.service`, like any other piece of software I would install and run. All the related files are in one of the well-speced file locations that follow the same hierarchy as anything else on the system (user -> etc -> /usr), optionally grouped in folders[2].
> Good luck SSH'ing into an unfamiliar system and understanding at a glance what it's doing.
Just use the same tools you'd use on any other systemd system: `systemctl list-units`, `systemctl status`, etc. Versus having to hunt down compose files either manually or by parsing the under-specified labels on the containers.
> (Even moreso if you created dedicated users for each application, which I understand is the recommended solution.)
TBH I've rarely seen this advice. Most people I know just run it as root (which is what I do) or as a `podman` user. But even in this situation it should be pretty easy to figure out what's running, as you know it's all running as one user and is hard-namespaced to only rely on resources available in that account.
> If I'm just holding it wrong and there exists some better tooling to manage podman in prod that I don't know about, I'm happy to hear about it.
Quadlets are just files that created systemd services, so basically any configuration management or deployment tool will manage them fine. Ansible has a dedicated Quadlet role that works pretty well, or just git clones+`systemctl start`. This would probably be the recommended way if you're not using k8s/etc.
Alternatively, you can just `git clone` into `/etc/containers/systemd/` and `systemctl start container`, like with docker compose. If you're running multiple containers, either refer to them with `Wants=`/etc in the Quadlet files, create a `.target` file that references them all, or put them all in a `.pod` and start the pod. I think this is the part where most people stumble, though: when you're used to treating containerized software as a separate kind of "thing", it's a little weird to go back to treating it like normal services.
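For reference, a single-container quadlet is small. A minimal illustrative `.container` file (image, ports, and volume are made up), dropped into `/etc/containers/systemd/` and started with `systemctl daemon-reload && systemctl start webapp`, looks roughly like:

```ini
# /etc/containers/systemd/webapp.container
[Unit]
Description=Example web app

[Container]
Image=docker.io/library/nginx:1.27
PublishPort=8080:80
Volume=webapp-data:/usr/share/nginx/html

[Service]
Restart=always

[Install]
WantedBy=multi-user.target
```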
I've been writing something to help with deploying quadlets GitOps-style[3] that will hopefully fill the "more than one server but less than kubernetes" deployment gap.
[0] Unless I wrap the compose steps in a systemd unit, at which point now I have two problems.
[1] Caveat, this has probably gotten better overall but I still run into compose-related firewall issues about once or twice a year
[2] The newer versions of Podman also support `.quadlets` files, that merge all the quadlets into one file.
Honestly with the hell I'm going through just trying to get Podman to run properly on macos, I can't imagine trusting the Podman people with a production deployment. I was not particularly impressed with the Docker tooling, but Podman has been even worse. This is a not-remotely exhaustive list of things I've run into in the last 24 hours:
* Podman fails to build a 16GB container image (after 30 minutes of downloading dependencies) despite having 90GB free out of a 200GB podman virtual machine
* Podman machine will, for reasons I don't understand, create a filesystem in a block device with wildly different sizes, and it seems like it's just random
* Pushing podman images to a container image registry via the Podman Desktop UI gives no indication that it's doing anything or even recognized the "push image" click, a success or error notification _might_ appear several or tens of minutes later or possibly not at all
* Starting a podman machine might work, but it fails ~75% of the time with not-particularly-exotic options (a bunch of ram and disk) and very cryptic error messages, frequently telling me to file bug tickets (I have)
* Podman Desktop won't let me create a podman machine with more than 44GB of disk, but the podman machine CLI won't let me create a machine with fewer than 100GB (IIRC--it's some number larger than 44, in any case)
Apart from the container image being absurdly large (Python developers love massive packages, I guess), I'm not doing anything exotic.
Is there a nice guide for podman that includes quadlets (or says not to use them)? I find lots of guides stray into things that only work on Red Hat, and on my Linuxes of choice, Raspbian and Ubuntu, things aren't straightforward.
Thanks - you're right. I just have never known whether to quadlet or not to quadlet, as the centre of gravity seems to be moving that way, but that might be a redhat-first feature.
The quadlet feature is just part of podman, which consists of no more than a generator for systemd. Generators are just an executable "hook" that spits out unit files according to systemd-defined paths. Any systemd system with podman will be equivalent to any other for the respective versions of each. There is nothing distro-specific about it, especially since podman can be a single multicall binary like busybox to provide it to a system that has systemd and nothing else.
Can't comment on Raspbian, but Ubuntu LTS (has/had) a seriously outdated podman version. This is the kind of nuisance the Debian derivatives have been running into for more than 20 years: they are extremely conservative, and if that is all you need, then that is great, but if not, you'll have to either run the latest Ubuntu (not LTS) or upgrade to something like Fedora.
Not sure if you consider 5.7.0 (6 months old) "seriously outdated", or are talking about Ubuntu 24.04 (the previous LTS). I recently looked and decided 5.8.2 (3 weeks old), didn't have anything compelling to make me want to try to shoehorn it in.
Ubuntu 24.04. The new LTS had dropped only two weeks ago. LTS users had a very outdated podman (4.9, two years old) and couldn't use quadlet types like build units (introduced in v5.2.0, Aug 2024).
Podman's quadlets have a deep integration with systemd. I guess that if you have that kind of risk appetite you would be better off running Arch on auto-update.
> they are extremely conservative, and if that is all you need, then that is great
You don’t need to live at the edge of new features. Do you upgrade your fridge and your oven every two months? It’s nice when you can have something running and not worry that the next update will break your software and/or your workflow.
Sure, but these are development dependencies we are talking about. Running old versions of these dependencies blocks your projects. And it isn't limited to self-developed software; quite often you run into the same problem with off-the-shelf software.
To each their own, but this is the reason I advise newcomers to stay away from Debian-based distros. I don't intend to start a distro flamewar; it works perfectly for 'boring, old, and feature-complete' software like Dovecot.
To add: containers would alleviate a good part of these concerns, but the stupid thing here is that precisely that is broken for up-to-date podman workflows.
Your test system should reflect your prod system. Why run Debian if you intend to deploy on the latest Ubuntu? Unless you want to use VMs. For other stuff that does not alter the system that much, you can find more recent versions in the backports.
It has integration with systemd, but moreover, I think the promise of Debian derivatives is one of "we are boring and old, but also boringly stable". Throwing in backports undermines that promise. I think one is better off with a distro that moves faster.
How do you guys who run Docker in production deal with managing the nftables firewall on hosts running containers? By design the docker daemon creates and manages a set of firewall rules to forward traffic between containers and ingress traffic into containers, and masquerades outgoing container traffic. That is all well until the admin needs to alter the host's firewall to allow or deny other traffic unrelated to Docker: restarting nftables, or even just applying new rules (`flush ruleset` in /etc/nftables.conf), usually purges all the Docker-created rules and effectively breaks everything until the docker daemon is restarted and the rules are re-created.
I have partially solved this by using nftables filter chains with different names (admin_input/admin_output) and an input hook with negative priority, so that traffic I choose to block is evaluated before the Docker rules are applied. That feels a bit like a hack, but so far it is the only way I have found. It is good practice in this day and age to run local firewalls on all hosts with a default-deny policy, so that only explicitly allowed traffic can pass; that can severely limit the blast radius during a compromise.
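For the curious, the workaround looks roughly like this (a sketch; the address is illustrative, and Docker's own chains at priority 0 are left alone):

```
# /etc/nftables.conf fragment: an admin-owned base chain hooked at a negative priority,
# so it is evaluated before the Docker-managed chains
table inet admin {
  chain admin_input {
    type filter hook input priority -10; policy accept;
    ct state established,related accept
    iifname "lo" accept
    tcp dport 22 ip saddr 203.0.113.10 accept   # SSH only from the admin IP
    tcp dport 22 drop
  }
}
```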
My containers run in dedicated "docker host" VMs. And I never expose ports on 0.0.0.0, just the private internal IP. Most (all) of my docker hosts do not have a public IP anyway. I use wireguard to access them myself. If they need to be public I reverse proxy with caddy from my web server (or use Authentik's embedded proxy). These servers have access to the same private LAN which could be hardened without having the issues you brought up.
By the way, most docker-based setups do not actually need the userland proxy Docker runs automatically. Disable it in /etc/docker/daemon.json.
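That is a one-liner in `/etc/docker/daemon.json` (restart the daemon afterwards); a minimal sketch:

```json
{
  "userland-proxy": false
}
```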
I have all of mine on the same (or accessible) internal LAN so they can all talk to each other. You can get the connection going with Wireguard if they are in different places in terms of networking.
No, it just needs to have a route to the internal IP of the docker host, and you expose your ports on that IP. Let me know if you need more details. You could also put the reverse proxy (Caddy in my case) on the docker host.
I reverse proxy everything through a Caddy instance running on the same machine so I avoid the firewall dance entirely by just prefixing all my port assignments in the compose file with the loopback IP (eg. 127.0.0.1:3000:3000). Nftables denies all but 80 and 443 and I don't have to worry about restarts/flushes breaking things.
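In compose terms that is just prefixing the published ports (service and port are illustrative), with Caddy on the host proxying 80/443 to the loopback port:

```yaml
services:
  app:
    image: myapp:latest
    ports:
      - "127.0.0.1:3000:3000"   # reachable only from the host / the reverse proxy
```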
This is how I self host all my home services (Home Assistant, PFSense, Frigate etc), I do not for the life of me understand why so many folks doing self-hosted services for themselves put them on the public internet.
Caddy will even do fully automated valid TLS certificates for private IP ranges via DNS ACME challenge for free etc with renewals handled, so all my internal self-hosted sites have properly terminated TLS too, accessible by connected VPN clients.
It's funny that for many of us in our day job, we stand up private services behind a VPN all the time so only work clients can access it, but when self hosting don't bother with a simple wireguard/tailscale config etc.
This is surely the easiest and I would guess the safest way, and has the added benefit that your proxy (nginx in my case) can handle SSL for you, making certificate deployment a breeze.
Adding to other answers: many cloud providers, including more reasonably priced ones like Hetzner, offer firewall-as-a-service where you can configure the firewall there instead of on the OS itself.
I put a firewall ahead of the Docker host so that they aren't running on the same system. Docker can do what it wants to on the host without stepping on my firewall rules.
Well, as an example, we usually set incoming rules to allow SSH only from administrator IP addresses and TCP 10050 only from the Zabbix monitoring server, leave the few ICMP types required, and the rest is dropped and logged.
For the forward chain, we set docker network ranges to route between themselves and only the services actually used in containers. We allow container outgoing connections to our DNS servers, centralized HTTP proxy server, and monitoring; containers are not allowed to route to anything else.
And output is similar: only our DNS servers, NTP, the HTTP proxy, the centralized rsyslog where everything goes, the Zabbix monitoring server, and a few ICMP types are allowed; nothing else gets out, and blocked traffic is logged.
With the advent of these supply chain attacks we read about often here, it's just a matter of time before some container is compromised, and this seems like the only viable way to at least somewhat limit the impact when such an event occurs.
To expand, you can use privileged containers, host network, capabilities, etc if the software really needs it. In that case, Docker basically becomes an init system/service manager but you get a singular daemon managing everything
I like running docker compose for my simple needs because it consolidates pretty much all the config in one declarative file, and docker manages 'everything'.
By now I know how to handle the handful of caveats listed in this article. Beyond what's listed there, I'd also give a mention to the way port publishing works (the fact that it ignores firewalls), as that's something that still trips people up if they don't know about it.
> docker compose pull && docker compose up -d is a fine command if you are SSH’d into the host. At customer scale—dozens of self-managed environments behind firewalls, each with its own change-control process—that manual process doesn’t scale.
No idea what this 'customer scale' operation is, but it seems like a pretty clear cut candidate for not using docker compose. I also don't think watchtower should be listed there, it's been archived and was never recommended for production usage anyways.
> I'd also give a mention to the way port publishing works (the fact that it ignores firewalls), as that's something that still trips people up if they don't know about it.
Isn't that a Docker thing rather than Docker Compose though? There is a ton more caveats to add if we don't already assume the reader is familiar with the hard edges of Docker, seems the article only focuses on Docker Compose specifically, probably because it'd be very long otherwise :)
Docker Swarm sits between Compose and k8s and can be used on a single node if your needs are modest. I find Docker Swarm more reliable and easier to automate with a CI/CD pipeline than Compose, and it also provides health checks and other useful directives allowing you to minimize downtime, rollback when a deploy fails, and so on.
I really want something that is Docker Compose but for Kubernetes. I mean a simple way of declaring resources, just like Docker Compose, but running the environment in Kubernetes so that I can test the behavior when there are multiple copies of the software running together. I do rely on Kubernetes heavily for distributed and networked software deployment, so it would be even better if we could emulate things like latency or bursty packet loss, so we can do controlled chaos testing for reliability. I tried Skaffold, Tilt, Devspaces and Devpod/Coder v2, and none of them are really simple like Compose.
This is a problem we, as a company, have thought about a lot, but we always concluded that Kubernetes is already the simplest abstraction of a distributed system that is feasible for the diverse needs that the biggest companies out there have.
We previously built a package manager for Kubernetes to abstract it in the simplest way possible `glasskube install app` but we failed because every abstraction needs to follow a "convention over configuration" pattern at some point. Also, we weren't able to monetize a package manager.
With Distr (https://github.com/distr-sh/distr), we have actually been able to help companies not only package but distribute and either manage or give their customers a way to self-manage applications. Our customers are able to land on-premises contracts at enterprises way faster than before, which is also a clear ROI for paying for Distr.
So, I don't think that you can get the flexibility of a distributed application orchestrator with a simple declarative YAML file if your target environments are diverse.
Same, I've tried three or four times to make it work, including one attempt that just translated compose.yaml into k8s yaml, and every time I came away thinking, "just use k8s". K8s yaml looks complex, and can start to feel very boilerplate, but attempts to hide the complexity often just lead to something not-flexible-enough because it encodes convention over configuration, and inevitably some project runs into limitations and pretty soon you've just built an abstraction layer that leaks or is equally complex/verbose and now you have to learn something new.
Just use k8s and follow similar patterns is the conclusion I've arrived at personally.
Helm mostly does that. Not a huge fan of a text templating engine generating yaml, but once you get your chart set up with a few variable inputs, you can continue using it for a bunch of other stuff with minimal new config.
The inputs (values) are yaml so you can make it look exactly like a Docker Compose file if you want (wouldn't be surprised if there's some charts floating around that do that)
I've recently been dipping my toes into k8s / kustomize / helm, and I had a situation where I wanted a base deployment yml template to reuse across various deployments. I had a look at Helm and was frankly shocked how bad the templating was with Go templates; it was close to unreadable and felt very brittle!
Yeah, that's fair. I don't think it's as bad if you make your own charts and can more liberally hardcode things. Community charts tend to have an insane amount of "knobs" so you can basically change everything being templated.
I don't know if I'd necessarily call it brittle, though. You can use `helm template` and various linters to validate the generated yaml is correct (and use something like pre-commit to autorun)
I did that too, and ended up just skipping helm and using envsubst to interpolate the values I need at runtime from env vars. Nearly everyone preferred that approach. YMMV of course.
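A sketch of that flow, assuming a hypothetical `deployment.tmpl.yaml` with `${IMAGE_TAG}`-style placeholders:

```bash
export IMAGE_TAG=1.4.2 REPLICAS=2
# restrict substitution to the listed variables so other $ signs in the manifest survive
envsubst '${IMAGE_TAG} ${REPLICAS}' < deployment.tmpl.yaml | kubectl apply -f -
```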
I think that's a good approach if it works for your use case. Sometimes you might want something slightly more sophisticated like basic logic (loops/conditionals). In those cases, you can still use helm but you have an extremely simple template and avoid many of the "can't read this template" helm pitfalls.
How so? I find it much the same as other templating engines like Jinja, though I'm definitely not a fan of the syntax. But that hardly matters anymore with LLMs.
I think it's a combination of text templating + yaml (which is whitespace delimited). When templating html with Jinja, it's not a big deal if the indentation isn't right. In helm, best case you get a syntax error, worst case you end up overwriting keys and producing something syntactically but not logically correct.
Compose is great, but a couple things always created friction for me when using it for non-local setups:
* Lack of a user-friendly way of managing a Docker Compose installation on a remote host. SSH-forwarding the docker socket is an option, but needs wrappers and discipline.
* Growing beyond one host (and not switching to something like Kubernetes) would normally mean migrating to Swarm, which is its own can of worms.
* Boilerplate needed to expose your services with TLS
Uncloud [1] fixed all those issues for me and is (mostly) Compose-compatible.
For remote installation, use the `docker context` command. You create a context with a named SSH host and then it connects via SSH to that host (as configured in your local ssh_config) and uses its docker daemon. Everything works flawlessly apart from local bind mounts (for obvious reasons).
If you remember `docker machine`, this is basically the modern version of that.
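Roughly (host and user are illustrative):

```bash
# create a context that tunnels the Docker API over SSH
docker context create prod --docker "host=ssh://deploy@prod.example.com"

# run compose against the remote daemon; the compose file stays local
docker --context prod compose up -d
docker --context prod compose ps
```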
What I found pretty great with docker is isolating individual docker systemd instances in rootless linux namespaces (i.e. users). I wrote about this here [1].
This lets you easily create multiple services on one VM that are quite isolated from each other. This system of doing things has worked reliably for me for quite some time, even for the 'bigger' services (gitlab, nextcloud, mailcow-dockerized etc.).
What a great blog post! I have wanted to do rootless docker with subuids, but putting it all together like you have is not easy. Thank you for writing it down!
I am using systemd + Go binary deploys. Running 10+ years in production. Meanwhile, Docker-based setups fail every now and then. And Kubernetes? Well, forget about it.
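For context, the whole deploy surface in that model is roughly one unit file plus copying the binary; a minimal sketch (paths, user, and names are illustrative):

```ini
# /etc/systemd/system/myapp.service
[Unit]
Description=myapp
After=network-online.target
Wants=network-online.target

[Service]
User=myapp
ExecStart=/usr/local/bin/myapp --config /etc/myapp/config.toml
Restart=on-failure
RestartSec=2

[Install]
WantedBy=multi-user.target
```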
I mean no disrespect. This is more of a rant at how things are today. It is telling that over-complicated solutions have become so common that, for the current generation of devs, Kubernetes is the obvious way of doing stuff and a simple systemd service is the obscure one. I am sure there are good reasons for this, but it still feels like a loss when simplicity is no longer obvious.
That's like comparing you being able to make a salad from tomatoes and an industrial tomato sauce making facility. Both take tomatoes and end up with food, but the scale is completely different.
Yes, you can deploy a Go binary easily with systemd. Could you reliably do this across a fleet of machines? Including managing its configuration, persistent storage, database, network setup, etc.? Maybe, just need Ansible or equivalent config management. What if it were multiple Go binaries? And what if some of them needed to scale up some days because they hit more traffic than the others?
And on and on. Yes, not everyone needs Kubernetes, Nomad or other advanced orchestrators. But comparing them to running a Go binary with systemd is an unfair comparison.
I am using docker-compose everywhere. I really enjoy using it. I have a single thing that is annoying for normal production deployments, and that is that it isn't super easy to do a rolling deployment. I just need two replicas for zero-downtime deployment, and I don't really want Docker Swarm. I think it is the networking that breaks at that point, and you have to have a more involved setup; at that point I'd just use Kubernetes, as I know how that works.
Could i survive with 10 seconds of downtime, probably, but I'd really like if I could avoid it.
Reading the article over, it really feels like Docker should be targeting Swarm as (instead of being its own platform) a set of incremental enhancements to Docker Compose. "I need healthcheck-restarts" "I need off-host logging", etc.
They've basically lost the war against Kubernetes but they could easily claim a lot of ground when it's just one more tweak you're adding to your docker-compose file as it scales.
I prefer Portainer to manage my docker compose stacks. It is simple and can do it all without using the CLI.
Added benefit if you have multiple hosts and want to manage them from one place. And you can extend the whole setup with git for version control.
I do this via Dokploy on a hosted Linode VPS and absolutely love it. Super easy to set up and maintain for tons of little side projects that don't require tons of resources.
Seems like an ad for whatever "Distr" is though; I haven't run into any of these issues with Dokploy and everything's been running fine for months.
To be honest, I never really understood the benefit of Docker (Compose) secrets, which are different from Swarm secrets. IMHO they're just plain host-mounted files, only hidden from inspect commands?
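For anyone unfamiliar, this is the feature being discussed (a sketch; names are illustrative): the file ends up mounted read-only at `/run/secrets/<name>` inside the container, which is indeed not far from a plain bind mount.

```yaml
services:
  app:
    image: myapp:latest
    secrets:
      - db_password       # available in the container as /run/secrets/db_password
secrets:
  db_password:
    file: ./secrets/db_password.txt
```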
That's just different customer personas for marketing reasons, just like Vercel has "Build and deploy on the AI Cloud" as their main tag line on the landing page. It doesn't mean they are an "AI company".
I quite like the “shape” term; every type of (sigh…) stakeholder … understands it, and I don’t need to swap in terms like “interface”, api, contract, architecture, structure, etc - unless I want to talk specifically about that thing. Everyone can fit a triangle and parallelogram in their mind, which is just dandy when I’m just trying to communicate difference.
> Every docker compose pull keeps the previous image on disk. Every container with the default json-file log driver writes unbounded JSON to /var/lib/docker/containers/<id>/<id>-json.log. On a busy host this is one of the most common reasons for an outage: the disk fills and Docker stops being able to write anything
I ran docker compose in development a lot. Just an easy way to turn on / off 5 different services at once for a project. Over time this was filling up my machine's storage (like 1 TB). Every few months I needed to run docker system prune and see 600GB free up.
A very simple fix for that is to use the systemd log driver to send all the container logs to journald. Then you can set a size or time limit on journald.
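Per service, that is a few lines in the compose file (a sketch; either hand logs to journald, or keep json-file but cap it):

```yaml
services:
  app:
    image: myapp:latest
    logging:
      driver: journald
      # or, to stay on the default driver but bound its size:
      # driver: json-file
      # options:
      #   max-size: "10m"
      #   max-file: "3"
```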
This section misses the one thing I was interested in: how do you avoid downtime in a deployment?
I like to write web applications with Perl and Mojolicious, and a deployment is just "hypnotoad app", and then hypnotoad gracefully starts up new worker processes to handle new requests and lets the other ones exit once they've finished handling their in-flight requests.
When I switched to Docker I found that there was no good way to handle this.
Record the existing container id, rescale the service to 2 instances (hence bringing a second container up), wait for the second one to be healthy, (optional) stop directing traffic to the old container, wait a few seconds, stop the old container, rescale the service back to 1 instance.
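A rough sketch of that dance (service name `web` is illustrative; it assumes the service does not publish a fixed host port, i.e. a reverse proxy discovers the containers):

```bash
old_id=$(docker compose ps -q web)

# bring up a second replica and wait for it to pass its health check
docker compose up -d --scale web=2 --no-recreate --wait

# (optionally drain traffic from the old container here)
docker stop "$old_id" && docker rm "$old_id"

# settle back to one declared replica
docker compose up -d --scale web=1 --no-recreate
```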
Kubernetes sounds like overkill, but I've been running microk8s for few standalone servers. This feels a pretty good match when working with agents. Codex can manage the cluster also over ssh, schedule new pods, check statuses, logs etc.
I think k8s is a great choice today specially when you can plug it into Gitlab and have a control plane for your clusters in the same place where your code lives.
Sure, it's stable enough, just keep in mind you won't get any autoscaling (or manual for that matter). Swarm is still supported by a third party, but that party has been loudly signaling that they intend to kill it off this year or next. Kubernetes isn't too big a leap, but damn are all those yaml manifests annoying to maintain. I usually just copy and tweak them from another project.
Somewhat adjacent in how I look at using Docker at all in prod, here's what I always wonder:
Is using Docker/Compose "just" as the layer for installing & managing runtime environment and services correct? Especially for languages like PHP?
I.e., am I holding it wrong if I run my "build" processes (npm, composer, etc.) on the server at deploy time, the same as I would without containers? In that sense Docker Compose becomes more like Ansible for me - the tool I use to build the environment, not the entire app.
For the purpose of my question, let's assume I'm building normal CRUD services that can go a little tall or a little wide on servers without caring about hyper scale.
> if I run my "build" processes (npm, composer, etc) on the server at deploy time
It's perfectly fine, as long as you accept the risks and downsides. Your IP can get ratelimited for Docker Hub. The build process can exhaust resources on the host. Your server probably needs access to internal dev dependencies repository, thus, needs credentials it would not need otherwise. Many small things like that. The advantage is simplicity, and it's often worth the risk.
> The build process can exhaust resources on the host
Maybe, but I've yet to have a host where that's the case for usual CRUD fare.
> The advantage is simplicity, and it's often worth the risk.
That's basically what I'm evaluating for here.
For bog standard LAMP or similar stack applications, I've not understood the advantage of going through the build-image-then-pull-on-host rigmarole. There's more layers involved there than something like provisioning with Ansible and just having a deploy script to run the usual suspects.
But I have seen that done fairly often, hence was wondering what the point was.
I would say it's bad practice because you end up having to copy all the build dependencies (source code) to the host and you're potentially putting a bunch of extra load on the host during the build process.
Also adds moving parts to your deploy which increases risk/introduces more failure modes.
Couple things that come to mind
- disk space exhaustion during build
- I/o exhaustion esp with package managers that have lots of small files (npm)
However, on the small/hobby end I don't think it's a huge concern.
> you end up having to copy all the build dependencies (source code) to the host
> disk, i/o exhaustion
This is why I mentioned specifically for ecosystems like PHP, which are interpreted. I'm specifically asking for that use case.
I'm not building binaries, my "build" steps are actually deployment steps (npm build, composer install, etc) that I'd be running in exactly the same way on the host. The image I'm deploying by definition also contains my source code because I'm not deploying anything compiled.
No, those are build steps. If you weren't using Docker, you would either run all those and shove in a zip/tarball or package into a deb/rpm, etc
>The image I'm deploying by definition also contains my source code
It doesn't contain .git or need credentials to your git/SCM
>I'm not seeing the benefit of the whole "build image, pull on server" pipeline when I can just ditch the registry and added layers by doing those steps on the server as I would normally in other kinds of scenarios
You don't need a registry--you can Docker save/load to push images directly to the server. Images buy you a versioned artifact with all the code-level dependencies baked in. Some maintainer yanks their package from npm? Who cares--you have a copy in your Docker image. Your new app version doesn't work? Edit 1 line to point back to the old image tag and rollback.
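Registry-less shipping is one pipe (image name and host are illustrative):

```bash
# stream the locally built image straight to the server's Docker daemon over SSH
docker save myapp:1.4.2 | ssh deploy@prod.example.com docker load
# then on the host, point the compose file at myapp:1.4.2 and `docker compose up -d`
```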
>> The build process can exhaust resources on the host
>Maybe, but I've yet to have a host where that's the case for usual CRUD fare.
When the build process completes, it tears down the overlayfs, which causes everything to sync, which leads to a big I/O spike. Depending on the server and the number of files, it might have no impact. However, I've seen build servers become completely unresponsive for 5+ minutes due to the I/O load when this happens. One place I worked, we had to switch our build servers to NVMe--the Docker container teardown caused spikes over 100k IOPS. Can't remember the exact details--it was either a React web front end or a React Native mobile app.
>There's more layers involved there than something like provisioning with Ansible and just having a deploy script to run the usual suspects.
Have a look at multi-stage container builds. Your images should not need a build step at startup; the result should be baked into the image. Otherwise you become reliant on fetching packages during the build, etc.
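A sketch of what that looks like for an interpreted stack (a hypothetical PHP app with an npm-built front end; it assumes a `build` npm script and a `public/build` output directory): dependencies and assets are resolved at build time, and the runtime image starts without fetching anything.

```dockerfile
FROM composer:2 AS vendor
WORKDIR /app
COPY composer.json composer.lock ./
RUN composer install --no-dev --no-scripts --prefer-dist --no-interaction

FROM node:20 AS assets
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
COPY . .
RUN npm run build

FROM php:8.3-fpm
WORKDIR /var/www/html
COPY . .
COPY --from=vendor /app/vendor ./vendor
COPY --from=assets /app/public/build ./public/build
```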
I guess what I'm asking for is what the point is of a "baked" image for interpreted language ecosystems. Already using multi stage builds.
"Builds" are the same as deploys, so when working with server(s) instead of larger scale deployments, I'm not seeing the benefit of the whole "build image, pull on server" pipeline when I can just ditch the registry and added layers by doing those steps on the server as I would normally in other kinds of scenarios.
But I have seen this in action, which is why I'm wondering if I'm missing something.
The clearer benefit to me seems to be in this scenario to use it as a fast environment provisioning tool.
One big thing, I think, is whether you want some sort of non-trivial network configuration, such as multiple external IPs via ipvlan. That's technically possible off Docker, but not in a responsible way, as anything on the ipvlan will be accessible to the public internet. Overall the implementations for this are very janky and occasionally enter tilted states that are close to impossible to recover from short of a restart of the Docker daemon.
My experience with docker-compose is a bit outdated, but my impression some years ago was that it was too sensitive and fragile. I encountered bugs or incompatibilities that broke the docker-compose setup often enough to be forced to pin the specific docker and docker-compose versions.
And the error handling was terrible. Most of these problems resulted in a Python stack trace in some docker-compose internals instead of a readable error message. Googling the stack trace usually led to a description of the actual problem, but that's really not something that inspires confidence.
If you love docker compose then you would love k3s. A single server with k3s is basically docker compose + the possibility to use helm to install all kinds of open source projects, such as monitoring, and it just works.
Sure why not, it just never fits into my model of how I design infrastructure.
Docker compose assumes all your services can reach each other over docker, which I find horribly insecure.
I separate all my services by user account at least, maybe even by VM, and I run them all in rootless podman containers. So it just doesn't fit my style, but I'm sure it works fine.
I'm very happy using docker swarm on a single host with traefik as reverse proxy using the setup described here: https://dockerswarm.rocks/
Super easy deployment of additional apps, defined completely in one file (incl setup on host, backups, reverse proxy config, etc).
Never found a reason to migrate away. Swarm was already considered dead when I started using it in 2022[1], but the investment was so low and benefits so big, that it was the right choice for me. I think a lot of people are replicating swarm features with compose, losing a lot of time. But hey, to each their own choice!
Docker context for remote access - over Internet or vpn, whatever.
Swarm-cronjob for scheduled things.
Labels for things that need to run in particular places.
So easy.
Personally, k8s is fine, but it's an abstraction for building a service architecture, not the thing an end user (developer) should ever use. If you are in a big company and you are using helm or k8s yaml files to roll things out, your infra or platform teams have missed something: building the platform!
Personally I have moved to k3s, though only after learning in a bit too much depth how k8s operates while writing custom controllers at the day job.
Docker/containers are great, especially for local development. But I feel the docker compose model quickly becomes a lot of messy, brittle squeezing for little gain when multiple containers need to integrate.
Better, then, to just take the plunge for the "real deal" and set up a non-HA k8s/k3s cluster with the interactions between the workloads clearly specified.
In other words, I care more about having the interactions declaratively spelled out than about the "scale to the moon" HA, auto-scaling, replicas, or whatever people get sold on.
And LLMs make this even easier. If you love reviewing yaml manifests....
I am doing just this: running docker compose on a server. When there are too many microservices, we will move them to managed Kubernetes on a cloud platform, or to Nomad if any cloud platform offers it.
Some time ago I wrote about my experiences using it in production: https://nickjanetakis.com/blog/why-i-like-using-docker-compo.... Not just for my own projects but for $500 million companies and more.
https://github.com/daitangio/misterio
It works very well!
This is why nobody uses it. Cloud stuff has to be as baroque as possible.
I have also delivered systems using Docker Compose that are actually running in production. The point I want to make is that people may define “production” differently depending on the number of active users, operational requirements, and risk level.
To me, this debate feels similar to the broader monolith vs. microservices debate.
Seems reasonable to assume these are serious production environments, no?!
Zero downtime server upgrades are easy. You could make a new server, ensure it's working in private and then adjust DNS or your floating IP address to point to the new server when you're happy. I've done this pattern hundreds of times over the years for doing system upgrades without interruption and safely. The only requirement is your servers are stateless but that's a good pattern in general.
I can't personally speak to what the limit of docker compose is, as I have only worked on the lower end of this: self hosting for personal use and for small internal services serving maybe 20 users.
Are you really going to try to get 4+ 9's of uptime for a small, one-off app? Do you really need to use a cloud distributed data store that only slows things down for no real gains in practice? Do you really think the cloud services are never down, and you're willing to spend a f*ck-ton of money to create a distributed app when historically an Access DB or VB6 app would have done the job?
I've moved applications deployed via compose pretty easily... compose down -t 30, then literally sftp the application to a backup location, then to the new server, which only needs the Docker Engine community stack installed... then compose up -d... tada! In terms of deployment, you can use github action runners if you want, or anything else... you can even do it by hand pretty easily.
While not built in, k8s has at least Velero and Kasten. However, they are only possible because of volume snapshots (https://kubernetes.io/docs/concepts/storage/volume-snapshots...), and Kasten has a plugin-like architecture (because of k8s) that supports application-specific backups. I never found something like that for Compose, and that is troublesome in bigger projects like Sentry.
Docker volumes (and bind mounts), however, have the minor problem of being hard to get a consistent copy of without stopping the service. You can work around this by, e.g., having ZFS or btrfs as the underlying FS and making a snapshot there. Otherwise, your software (like PostgreSQL) might also have other online backup tooling.
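As a rough sketch of that snapshot workaround (the dataset name, mountpoint, and backup path are assumptions for the example):

    # assumes /var/lib/docker/volumes is its own ZFS dataset, e.g. tank/docker-volumes
    snap="nightly-$(date +%F)"
    zfs snapshot tank/docker-volumes@"$snap"
    # snapshots are exposed read-only under .zfs/snapshot, so they can be copied while containers keep running
    rsync -a "/var/lib/docker/volumes/.zfs/snapshot/$snap/" /backup/docker-volumes/
    zfs destroy tank/docker-volumes@"$snap"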
Agree.
Plus there's the monitoring of the host that is always overlooked in articles. I've ended up chucking Monit on there to monitor disk usage et al., and also used it to monitor Compose and restart containers.
And then there's Healthchecks.io, and external uptime monitoring... the list goes on. Properly monitoring systems, even single server systems, is not simple.
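For illustration, the kind of Monit config meant here is only a few lines (the threshold and the check-compose.sh helper script are made up for the example):

    check filesystem rootfs with path /
        if space usage > 85% then alert

    # a small script that exits non-zero if `docker compose ps` shows anything unhealthy
    check program compose-stack with path "/usr/local/bin/check-compose.sh"
        if status != 0 then alert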
But docker compose can actually be perfectly sufficient for what many projects need.
Granted, I am a guy pushing for compose-based local dev setups and such, but going further, you often just cannot beat the simplicity of running update QA or other CI/CD workloads in compose-based projects. In the past years I have had dozens of projects where we replaced flaky, slow, maintenance-heavy pipelines with just `docker compose up --build --wait`. How come you say health checks are still broken?
Some of my concerns with compose aren’t purely technical. It makes it easier to lean on local state like volumes, bind mounts, and large .env files. Similar mechanisms exist in kubernetes, but the additional setup tends to force a bit more thought about whether they’re actually needed or just a shortcut.
On the health check side, they exist, but compose doesn’t fully act on them, that's the part that is missing. There’s no built in remediation or orchestration behavior tied to health status, which is why things like https://github.com/willfarrell/docker-autoheal exist. It’s something that was never fully carried through in Docker itself.
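A minimal sketch of that pairing, assuming a hypothetical service with a /health endpoint and curl available in the image:

    services:
      web:
        image: registry.example.com/myapp:1.4.2
        healthcheck:
          test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
          interval: 30s
          timeout: 5s
          retries: 3
        labels:
          - autoheal=true

      autoheal:
        image: willfarrell/autoheal
        restart: always
        environment:
          - AUTOHEAL_CONTAINER_LABEL=autoheal
        volumes:
          - /var/run/docker.sock:/var/run/docker.sock

Compose records the health status; autoheal is the piece that actually restarts containers that go unhealthy.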
It's because I like keeping my servers stateless when possible. It makes it easier to upgrade them in a zero downtime way later.
If your web server has your DB too, then you can't do zero downtime system upgrades. For example I would never upgrade Debian 12 to 13 on a live server. Instead, I'd make a new server with 13, get it all ready to go and tested and then when I'm ready flip over DNS or a floating IP address to the new server. This pattern works because both the old and new server can be writing to a database on a different server.
With all that said, if you were ok with 1 server, then yeah I'd for sure run it in Docker Compose.
For most DBs it's one or two paths in the container, and virtually all DB vendors have a reference Docker Compose example somewhere showing volume config. I can't remember the last time I ever "natively" installed a DB personally!
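E.g. for Postgres it really is one path (the volume name and password are obviously placeholders):

    services:
      db:
        image: postgres:16
        environment:
          POSTGRES_PASSWORD: change-me
        volumes:
          - pgdata:/var/lib/postgresql/data

    volumes:
      pgdata: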
I don't think there's a good one size fits all answer to whether hosting in Compose or RDS is right for you or a given project.
It really comes down to YMMV... Sometimes for a singular app surface, it's easier to just use a compose file that includes the database. mailu/mailcow is a good example... you don't necessarily want to comingle email on the same server as other services.
That said, if you need to share a single DB or set of DBs across an application with several instances/deployments, then it makes much more sense to have a central deployment. I almost never do my own host-level install, instead relying on cloud hosting and management. The only real exception is MS-SQL on internal servers... MS-SQL in Docker is barely acceptable for dev, and is missing a few key features you may actually want/need.
What if you can't by yourself objectively evaluate whether a turkey sandwich sounds good?
It's not a matter of giving a universal answer to whether docker compose in production is fine, but how to evaluate it. Which features or safeguards necessary for a healthy production environment you forfeit when choosing plain docker compose? What's the tradeoff?
Because if that's the case, I also don't care for baroque music that much.
Yes: okay try it
No: okay you don't have to try it
Unsure: okay you can read about it some more and decide if it sounds good to you
Comments like this are apathetic and reduce the challenges of good software engineering to hopes and random chance.
More to the point, there is no objectively right answer of what stack you should use. There are plenty of objectively wrong answers, but compose isn't one of them.
There’s a reason articles like this exist. Things change.
*shudder*
Your entire original comment looks like just an opportunity to be snarky. It's a longer version of "whatever", which you can literally throw around as an answer to anything.
In case you were curious, the subheading of the article already answers the question posed by the title:
> Yes, plain Docker Compose can still run production workloads in 2026—if you close the operational gaps it leaves: cleanup, healing, image pinning, socket security, and updates.
Docker also commonly refers to Docker _images_ or Docker-esque container setups
Thinking about it a little further, though, I believe Rancher Desktop has come a long way and may be eating market share.
For a long time Docker was "helpful" and opened exposed ports in the firewall. So you wanted to access your Redis port locally and exposed it on the container? Now everything in there is accessible on the open internet.
I believe they've fixed it but I haven't used Docker in years so I wouldn't know.
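Either way, the usual safeguard is to bind published ports to loopback explicitly (Redis here is just an example):

    services:
      redis:
        image: redis:7
        ports:
          - "127.0.0.1:6379:6379"   # reachable from the host only; other containers use the compose network instead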
K8s for small-time use is overkill for sure, but make sure you don't fall into this trap. https://www.macchaffee.com/blog/2024/you-have-built-a-kubern...
To me, if there's generally fewer than 10 actual active users at any given time and/or you can easily tolerate 30-60m of down time now and then... I'd lean into the simpler option of docker-compose. While I generally think of compose as a dev tool first, it's definitely useful sometimes.
journald will help with logs, and the pull policy[1] helps with mutable tags. What help do you need with "orphan containers"?
[0]: https://docs.podman.io/en/latest/markdown/podman-quadlet.1.h...
[1]: https://docs.podman.io/en/latest/markdown/podman-image.unit....
You shouldn't be using podman compose. It's flimsy and doesn't work very well (at least it was last time I used it prior to Podman v3), and I'm pretty sure it doesn't have Red Hat's direct support.
Instead, activate Podman's Docker API compatibility socket, and simply set your `DOCKER_HOST` env var to that socket, and from there you can use your general docker client commands such as `docker`, `docker compose` and anything else that uses the Docker API. There are very few things that don't work with this, and the few things that don't are advanced setups.
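Roughly, for a rootless setup (the socket path can differ depending on how Podman is installed):

    systemctl --user enable --now podman.socket
    export DOCKER_HOST=unix://$XDG_RUNTIME_DIR/podman/podman.sock
    docker compose up -d    # the Docker CLI now talks to Podman's Docker-compatible API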
For what it's worth, podman also has a thin wrapper (podman compose) which executes `docker-compose` or the old `podman-compose`. The docs should explain which it picks.
Note:
- `podman-compose` is an early attempt at remaking `docker-compose` v1 but for Podman. It parses the compose config, converts it to podman commands, and executes them.
- Later, Podman implemented a Docker-compatible socket instead, which works with most docker CLIs that accept a `DOCKER_HOST` argument, including `docker` and `docker-compose` (both v1 and v2)
- `podman compose` is a thin wrapper that automatically selects `docker-compose` or `podman-compose` depending on which is installed.
Generally all you need is podman, docker-compose (the v2 binary), and that's it. From there you can use `podman` and/or `podman compose`.
I'm no fan of Docker, and Podman by itself is a step up, but orchestration headaches are enough to ruin that.
I don't understand what you're asking here. The answer to that is probably nothing. That is unless you want:
- systemd to manage your containers
- to use K8s primitives (which are mostly compatible)
I'm unsure what the 3rd method is you're talking about. The nice thing about Podman's compose API is you don't have to change anything (mostly). You can point all your docker tooling to Podman's socket, and it'll (mostly) magically work.
* use systemd, red hat's favorite kitchen sink for handling everything from setting up sound services to mounting your home dir to logging so why not this too i guess.
* docker compose where i have to run a whole separate podman service to lie to docker compose about not actually being docker.
* podman compose which would be the obvious solution if it didn't just plain suck.
i also want to stay the hell away from quadlets or any other software which tries to make me use systemd more.
And one nice thing about podman-compose is that it's ONE PYTHON FILE. You can just copy it into your source tree.
Moreover, podman-compose integrates really easily with systemd. You can create a service by just running "podman-compose systemd" and following the prompt! No quadlet nonsense required.
There are workarounds to make ipv4 work, but they complicate the system and make it more fragile.
Having your whole application with its containers, volumes, and networks all defined together in one easy-to-read YAML file is a way better experience. Deployment is two steps: 1. `git clone foo` 2. `docker compose up -d`. You can see the state of the application containers with `docker compose ps`. You can run multiple compose applications on the same host and manage them separately by putting them in different directories.
With quadlets, you delegate everything to systemd. You have to break the configuration up into a bunch of tiny unit files and then separately copy them to /etc or a dedicated user's dotfiles. An application with a handful of containers and multiple networks/volumes/etc can spiral into a dozen unit files. Good luck SSH'ing into an unfamiliar system and understanding at a glance what it's doing. It is far more annoying to predictably deploy and tightly couples your application configuration to the host system configuration. (Even moreso if you created dedicated users for each application, which I understand is the recommended solution.)
If I'm just holding it wrong and there exists some better tooling to manage podman in prod that I don't know about, I'm happy to hear about it.
I always felt it the other way around: docker compose files are weird blobs of YAML whose location I have to hunt down, or whose under-specced labels I have to parse to find them. I can't make them depend on any non-container services[0], they break my firewall rules[1], and I have to use a whole mess of bespoke tooling just to do normal start/stop/restart operations with them instead of using the same commands I use for literally any other service.
> With quadlets, you delegate everything to systemd. You have to break the configuration up into a bunch of tiny unit files and then separately copy them to /etc or a dedicated user's dotfiles.
The nice thing about quadlets is exactly that: they integrate with systemd and, by extension, the rest of the system. I don't have to think about `webapp.container` as a "Docker container"; I can think of it as just `webapp.service`, like any other piece of software I would install and run. All the related files are in one of the well-specced file locations that follow the same hierarchy as anything else on the system (user -> etc -> /usr), optionally grouped in folders[2].
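For the sake of illustration, a hypothetical webapp.container dropped into /etc/containers/systemd/ might look like this, and shows up as plain webapp.service afterwards:

    [Unit]
    Description=Web app container

    [Container]
    Image=docker.io/library/nginx:1.27
    PublishPort=127.0.0.1:8080:80
    Volume=webapp-data:/usr/share/nginx/html

    [Install]
    WantedBy=multi-user.target

After a `systemctl daemon-reload`, it is started, stopped, and inspected with the same systemctl/journalctl commands as everything else.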
> Good luck SSH'ing into an unfamiliar system and understanding at a glance what it's doing.
Just use the same tools you'd use on any other systemd system: `systemctl list-units`, `systemctl status`, etc. Versus having to hunt down compose files either manually or by parsing the under-specified labels on the containers.
> (Even moreso if you created dedicated users for each application, which I understand is the recommended solution.)
TBH I've rarely seen this advice. Most people I know just run it as root (which is what I do) or as a `podman` user. But even in this situation it should be pretty easy to figure out what's running, as you know it's all running as one user and is hard-namespaced to only rely on resources available in that account.
> If I'm just holding it wrong and there exists some better tooling to manage podman in prod that I don't know about, I'm happy to hear about it.
Quadlets are just files that created systemd services, so basically any configuration management or deployment tool will manage them fine. Ansible has a dedicated Quadlet role that works pretty well, or just git clones+`systemctl start`. This would probably be the recommended way if you're not using k8s/etc.
Alternatively, you can just `git clone /etc/containers/systemd/`, `systemctl start container` like with docker compose. If you're running multiple containers, either refer to them with `Wants=`/etc in the Quadlet files, create a `.target` file that references them all, or put them all in a `.pod` and start the pod. I think this is the part where most people stumble though: when you're used to treating containerized software as a separate kind of "thing" it's a little weird to go back to treating it like normal services.
I've been writing something to help with deploying quadlets GitOps-style[3] that will hopefully fill the "more than one server but less than kubernetes" deployment gap.
[0] Unless I wrap the compose steps in a systemd unit, at which point now I have two problems.
[1] Caveat, this has probably gotten better overall but I still run into compose-related firewall issues about once or twice a year
[2] The newer versions of Podman also support `.quadlets` files, that merge all the quadlets into one file.
[3] https://github.com/stryan/materia . There's also https://github.com/orches-team/orches and https://github.com/ubiquitous-factory/quadit
* Podman fails to build a 16GB container image (after 30 minutes of downloading dependencies) despite having 90GB free out of a 200GB podman virtual machine
* Podman machine will, for reasons I don't understand, create a filesystem in a block device with wildly different sizes, and it seems like it's just random
* Pushing podman images to a container image registry via the Podman Desktop UI gives no indication that it's doing anything or even recognized the "push image" click, a success or error notification _might_ appear several or tens of minutes later or possibly not at all
* Starting a podman machine might work, but it fails ~75% of the time with not-particularly-exotic options (a bunch of ram and disk) and very cryptic error messages, frequently telling me to file bug tickets (I have)
* Podman Desktop won't let me create a podman machine with more than 44GB of disk, but the podman machine CLI won't let me create a machine with fewer than 100GB (IIRC--it's some number larger than 44, in any case)
Apart from the container image being absurdly large (Python developers love massive packages, I guess), I'm not doing anything exotic.
https://docs.podman.io/en/latest/markdown/podman-systemd.uni...
You don’t need to live at the edge of new features. Do you upgrade your fridge and your oven every two months? It’s nice when you can have something running and not worry that the next update will break your software and/or your workflow.
To each their own, but this is the reason I advise newcomers to stay away from Debian-based distros. I don't intend a distro flamewar; it works perfectly for "boring, old, and feature-complete software" like Dovecot.
To add: containers would alleviate a good part of these concerns, but the stupid thing here is that precisely that is broken for up-to-date podman workflows.
By the way, most Docker-based setups do not actually need the userland proxy that Docker runs automatically. Disable it in /etc/docker/daemon.json:
{
  "userland-proxy": false
}
Also you don't even need the loopback address if the traffic is between one container and another, just a bridge network is fine.
Caddy will even do fully automated, valid TLS certificates for hosts on private IP ranges via the DNS ACME challenge, for free, with renewals handled, so all my internal self-hosted sites have properly terminated TLS too, accessible by connected VPN clients.
It's funny that for many of us in our day job, we stand up private services behind a VPN all the time so only work clients can access it, but when self hosting don't bother with a simple wireguard/tailscale config etc.
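A minimal Caddyfile sketch of that DNS-challenge setup (it assumes a Caddy build that includes a DNS provider module such as caddy-dns/cloudflare; the hostname and upstream are placeholders):

    internal.example.com {
        tls {
            dns cloudflare {env.CLOUDFLARE_API_TOKEN}
        }
        reverse_proxy app:8080
    }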
The only modification is that I pin containers to an IPv4 address so I can limit the forward rule to that address.
For the forward chain, we allow docker network ranges to route between themselves and only to the services actually used by containers. Container outgoing connections are allowed to our DNS servers, the centralized HTTP proxy server, and monitoring - nothing else containers are allowed to route to.
And output is similar: only our DNS servers, NTP, the HTTP proxy, the centralized rsyslog where everything goes, the Zabbix monitoring server, and a few ICMP types - nothing else gets out and is logged.
With the advent of the supply chain attacks we read about here so often, it's just a matter of time before some container is compromised, and this seems like the only viable way to at least somewhat limit the impact when such an event occurs.
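A rough iptables sketch of that policy (the Docker subnet and the DNS/proxy addresses are invented for the example):

    # DOCKER-USER is evaluated for forwarded container traffic before Docker's own rules.
    # Each -I prepends, so the final order is: proxy ACCEPT, DNS ACCEPT, DROP, then Docker's default RETURN.
    iptables -I DOCKER-USER -s 172.18.0.0/16 -j DROP
    iptables -I DOCKER-USER -s 172.18.0.0/16 -d 10.0.0.53 -p udp --dport 53 -j ACCEPT
    iptables -I DOCKER-USER -s 172.18.0.0/16 -d 10.0.0.10 -p tcp --dport 3128 -j ACCEPT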
> docker compose pull && docker compose up -d is a fine command if you are SSH’d into the host. At customer scale—dozens of self-managed environments behind firewalls, each with its own change-control process—that manual process doesn’t scale.
No idea what this 'customer scale' operation is, but it seems like a pretty clear cut candidate for not using docker compose. I also don't think watchtower should be listed there, it's been archived and was never recommended for production usage anyways.
We just use ansible for this part.
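As a sketch of what that can look like (the host group, paths, and stack location are made up):

    # deploy.yml -- push the compose file, then pull and restart the stack one host at a time
    - hosts: app_servers
      serial: 1
      tasks:
        - name: Copy compose file
          ansible.builtin.copy:
            src: files/docker-compose.yml
            dest: /opt/myapp/docker-compose.yml

        - name: Pull images and restart the stack
          ansible.builtin.shell: docker compose pull && docker compose up -d
          args:
            chdir: /opt/myapp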
Isn't that a Docker thing rather than Docker Compose though? There is a ton more caveats to add if we don't already assume the reader is familiar with the hard edges of Docker, seems the article only focuses on Docker Compose specifically, probably because it'd be very long otherwise :)
We previously built a package manager for Kubernetes to abstract it in the simplest way possible `glasskube install app` but we failed because every abstraction needs to follow a "convention over configuration" pattern at some point. Also, we weren't able to monetize a package manager.
With Distr (https://github.com/distr-sh/distr), we have actually been able to help companies not only package but distribute and either manage or give their customers a way to self-manage applications. Our customers are able to land on-premises contracts at enterprises way faster than before, which is also a clear ROI for paying for Distr.
So, I don't think that you can get the flexibility of a distributed application orchestrator with a simple declarative YAML file if your target environments are diverse.
Just use k8s and follow similar patterns is the conclusion I've arrived at personally.
The inputs (values) are yaml so you can make it look exactly like a Docker Compose file if you want (wouldn't be surprised if there's some charts floating around that do that)
I don't know if I'd necessarily call it brittle, though. You can use `helm template` and various linters to validate the generated yaml is correct (and use something like pre-commit to autorun)
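For example, something along these lines catches most rendering mistakes before anything reaches a cluster (chart path and release name are placeholders; kubeconform validates the rendered manifests from stdin):

    helm template myapp ./chart -f values.yaml | kubeconform -strict -summary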
* Lack of a user-friendly way of managing a Docker Compose installation on a remote host. SSH-forwarding the docker socket is an option, but needs wrappers and discipline.
* Growing beyond one host (and not switching to something like Kubernetes) would normally mean migrating to Swarm, which is its own can of worms.
* Boilerplate needed to expose your services with TLS
Uncloud [1] fixed all those issues for me and is (mostly) Compose-compatible.
[1] https://github.com/psviderski/uncloud/
If you remember `docker machine`, this is basically the modern version of that.
I've been using portainer for years, it's decent.
[1]: https://du.nkel.dev/blog/2023-12-12_mastodon-docker-rootless...
https://docs.podman.io/en/latest/markdown/podman-systemd.uni...
Service file lives in the mono repo where all 6 services live.
Makes it trivial to make changes and redeploy.
Yes, you can deploy a Go binary easily with systemd. Could you reliably do this across a fleet of machines? Including managing its configuration, persistent storage, database, network setup, etc.? Maybe, just need Ansible or equivalent config management. What if it were multiple Go binaries? And what if some of them needed to scale up some days because they hit more traffic than the others?
And on and on. Yes, not everyone needs Kubernetes, Nomad or other advanced orchestrators. But comparing them to running a Go binary with systemd is an unfair comparison.
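For reference, the baseline being compared against is roughly this much (binary path, user, and config flag are hypothetical):

    # /etc/systemd/system/myapp.service
    [Unit]
    Description=My Go service
    After=network-online.target
    Wants=network-online.target

    [Service]
    User=myapp
    ExecStart=/usr/local/bin/myapp --config /etc/myapp/config.toml
    Restart=on-failure

    [Install]
    WantedBy=multi-user.target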
Could I survive with 10 seconds of downtime? Probably, but I'd really like it if I could avoid it.
https://uncloud.run/docs/guides/deployments/rolling-deployme...
They've basically lost the war against Kubernetes but they could easily claim a lot of ground when it's just one more tweak you're adding to your docker-compose file as it scales.
Seems like an ad for whatever "Distr" is though; I haven't run into any of these issues with Dokploy and everything's been running fine for months.
> This is the shape Distr lands on
“Lands on”? I like that less.
I ran docker compose in development a lot. Just an easy way to turn on / off 5 different services at once for a project. Over time this was filling up my machine's storage (like 1 TB). Every few months I needed to run docker system prune and see 600GB free up.
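For anyone hitting the same thing, the cleanup is a couple of commands (note that --volumes also removes named volumes not attached to any container, so check what's in them first):

    docker system prune -af --volumes   # stopped containers, unused images, networks, and volumes
    docker builder prune -af            # build cache, often the biggest space hog on dev machines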
https://docs.docker.com/engine/logging/drivers/journald/
I believe Podman can do something similar.
This section misses the one thing I was interested in: how do you avoid downtime in a deployment?
I like to write web applications with Perl and Mojolicious, and a deployment is just "hypnotoad app", and then hypnotoad gracefully starts up new worker processes to handle new requests and lets the other ones exit once they've finished handling their in-flight requests.
When I switched to Docker I found that there was no good way to handle this.
edit: thanks to next comment for referencing one
If you want to test something that is between compose and k8s, check ring: https://github.com/kemeter/ring
Haven't used it in a while but this thing is also interesting--it supports a bunch of different ways to spin up k8s https://github.com/tilt-dev/ctlptl
Is using Docker/Compose "just" as the layer for installing & managing runtime environment and services correct? Especially for languages like PHP?
I.e. am I holding it wrong if I run my "build" processes (npm, composer, etc) on the server at deploy time, the same as I would without containers? In that sense Docker Compose becomes more like Ansible for me - the tool I use to build the environment, not the entire app.
For the purpose of my question, let's assume I'm building normal CRUD services that can go a little tall or a little wide on servers without caring about hyper scale.
It's perfectly fine, as long as you accept the risks and downsides. Your IP can get ratelimited for Docker Hub. The build process can exhaust resources on the host. Your server probably needs access to internal dev dependencies repository, thus, needs credentials it would not need otherwise. Many small things like that. The advantage is simplicity, and it's often worth the risk.
How? What I'm describing is using Docker less.
> The build process can exhaust resources on the host
Maybe, but I've yet to have a host where that's the case for usual CRUD fare.
> The advantage is simplicity, and it's often worth the risk.
That's basically what I'm evaluating for here.
For bog standard LAMP or similar stack applications, I've not understood the advantage of going through the build-image-then-pull-on-host rigmarole. There's more layers involved there than something like provisioning with Ansible and just having a deploy script to run the usual suspects.
But I have seen that done fairly often, hence was wondering what the point was.
Also adds moving parts to your deploy which increases risk/introduces more failure modes.
A couple of things that come to mind:
- disk space exhaustion during build
- I/O exhaustion, especially with package managers that have lots of small files (npm)
However, on the small/hobby end I don't think it's a huge concern.
> disk, i/o exhaustion
This is why I mentioned specifically for ecosystems like PHP, which are interpreted. I'm specifically asking for that use case.
I'm not building binaries, my "build" steps are actually deployment steps (npm build, composer install, etc) that I'd be running in exactly the same way on the host. The image I'm deploying by definition also contains my source code because I'm not deploying anything compiled.
That's what I answered for.
>I'm not building binaries
If you were, I would have added CPU to the list.
>my "build" steps are actually deployment steps (npm build, composer install, etc)
No, those are build steps. If you weren't using Docker, you would either run all those and shove in a zip/tarball or package into a deb/rpm, etc
>The image I'm deploying by definition also contains my source code
It doesn't contain .git or need credentials to your git/SCM
>I'm not seeing the benefit of the whole "build image, pull on server" pipeline when I can just ditch the registry and added layers by doing those steps on the server as I would normally in other kinds of scenarios
You don't need a registry--you can Docker save/load to push images directly to the server. Images buy you a versioned artifact with all the code-level dependencies baked in. Some maintainer yanks their package from npm? Who cares--you have a copy in your Docker image. Your new app version doesn't work? Edit 1 line to point back to the old image tag and rollback.
>> The build process can exhaust resources on the host
>Maybe, but I've yet to have a host where that's the case for usual CRUD fare.
When the build process completes, it tears down the overlayfs, which causes everything to sync, which leads to a big I/O spike. Depending on the server and amount of files, it might have no impact. However, I've seen build servers become completely unresponsive for 5+ minutes due to the I/O load when this happens. One place I worked, we had to switch our build servers to NVMe--the Docker container teardown caused spikes over 100k IOPS. Can't remember the exact details--it was either a React web front end or a React Native mobile app.
>There's more layers involved there than something like provisioning with Ansible and just having a deploy script to run the usual suspects.
`docker save myimage:tag | gzip | ssh user@server 'gunzip | docker load'`
Not saying creating distributable artifacts is the de-facto answer, but I'd strongly consider whether it's really that much more complicated.
"Builds" are the same as deploys, so when working with server(s) instead of larger scale deployments, I'm not seeing the benefit of the whole "build image, pull on server" pipeline when I can just ditch the registry and added layers by doing those steps on the server as I would normally in other kinds of scenarios.
But I have seen this in action, which is why I'm wondering if I'm missing something.
The clearer benefit to me seems to be in this scenario to use it as a fast environment provisioning tool.
Very few separate ecosystem transfers are quite that frictionless.
And the error handling was terrible. Most of these problems resulted in a Python stack trace in some docker-compose internals instead of a readable error message. Googling the stack trace usually led to a description of the actual problem, but that's really not something that inspires confidence.
Docker compose assumes all your services can reach each other over docker, which I find horribly insecure.
I separate all my services by user account at least, maybe even by VM, and I run them all in rootless podman containers. So it just doesn't fit my style, but I'm sure it works fine.
Granted, it's B2B SaaS with not many users, maybe 100 concurrent.
80% of workloads don't need the complexity of Kubernetes and run fine with compose.
Super easy deployment of additional apps, defined completely in one file (incl setup on host, backups, reverse proxy config, etc).
Never found a reason to migrate away. Swarm was already considered dead when I started using it in 2022[1], but the investment was so low and benefits so big, that it was the right choice for me. I think a lot of people are replicating swarm features with compose, losing a lot of time. But hey, to each their own choice!
1: https://www.yvesdennels.com/posts/docker-swarm-in-2022/
Using traefik or caddy as proxy.
Docker context for remote access - over Internet or vpn, whatever.
Swarm-cronjob for scheduled things.
Labels for things that need to run in particular places.
So easy.
Personally, k8s is fine, but it's an abstraction for building a service architecture, not the thing an end user (developer) should ever use. If you are in a big company and you are using helm or k8s yaml files to roll things out, your infra or platform teams have missed something: building the platform!
https://developer.hashicorp.com/nomad
Disclaimer: I used to work for HashiCorp
even their follow-up - Docker Compose vs Kubernetes.
Docker compose for me has been great - no complexity.
Docker/containers are great, especially for local development. But I feel the docker compose model quickly becomes a lot of messy, brittle squeezing for little gain when multiple containers need to integrate.
Better, then, to just take the plunge for the "real deal" and set up a non-HA k8s/k3s cluster with the interactions between the workloads clearly specified.
In other words, I care more about having the interactions declaratively spelled out than about the "scale to the moon" HA, auto-scaling, replicas, or whatever people get sold on.
And LLMs make this even easier. If you love reviewing yaml manifests....
It's nice to get an easy question every once in a while.
I.e., you need a sysadmin. Oops, you fired them all 10 years ago when agile devopsing became the best thing after the pumpkin latte.