It lets you patch/upgrade an isolated environment without touching the running bits, reboot into that environment, and if things aren't working well boot back into the last known-good one.
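Roughly, that workflow on a ZFS-on-root FreeBSD box looks like the sketch below with bectl(8); the BE name "next" and the mountpoint are made up:

    bectl create next                          # clone the current root into a new boot environment (instant, CoW)
    bectl mount next /tmp/next                 # mount it so it can be patched without touching the live system
    freebsd-update -b /tmp/next fetch install  # upgrade the mounted BE only
    bectl umount next
    bectl activate next                        # boot into it on the next reboot
    # if the new environment misbehaves, point the loader back at the old one and reboot:
    bectl activate default && shutdown -r now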
Sounds a lot like the A/B update method used widely in Android and to a lesser extent for embedded GNU/Linux OTA updates, although that approach uses two distinct boot partitions. Since ZFS is involved here, I assume that boot environments take advantage of its copy-on-write mechanism to avoid duplicating the entire boot dataset.
NixOS and Guix use a concept called 'system generations' to do the same without support from the filesystem. LibOSTree can do the same; there it's called 'atomic rollback'.
Talking about NixOS, does anybody know of a similar concept in the BSD world (preferably FreeBSD)?
The original idea of boot environments in Solaris came from Live Upgrade, which worked at least as far back as Solaris 8. Live Upgrade was not part of Solaris; rather, it was an add-on that came from the services or enterprise-support parts of Sun.
Solaris 11 made boot environments a mandatory part of the OS, which was an obvious choice with the transition from UFS to ZFS for the root fs. This came into Solaris development a bit before Solaris 11, so it was present in OpenSolaris and lives on in many forms of illumos.
It seems weird that in 2025/2026 we are still discussing the baseline of getting storage working.
Feels we’re spending too much time discussing the trees and not enough time getting the forest going:
* we need reliable local storage
* integrated backup
* apps installation / management
* remote access and account management
* app isolation, reliable updates
I’m primarily a ZFS-on-FreeBSD kind of guy, but I've repeatedly needed to do ZFS-on-Linux recently, and after a couple of times I wrote it up for others. There are a lot of these guides; the difference is that this one tries to be idiomatic, using “native” tooling (e.g. systemd, love it or hate it) to do the job as “correctly” on Linux as possible:
https://neosmart.net/blog/zfs-on-linux-quickstart-cheat-shee...
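For a flavour of what "idiomatic" means here, the systemd side is mostly just enabling the units OpenZFS ships; a sketch, with the pool name "tank" as an example:

    # let systemd handle pool import and mounting at boot, plus the ZFS event daemon
    systemctl enable --now zfs-import-cache.service zfs-mount.service zfs-zed.service zfs.target
    zpool import tank                  # one-off import; the cachefile remembers the pool for next boot
    zfs set mountpoint=/srv/tank tank  # let ZFS manage the mountpoint rather than /etc/fstab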
Is zfs really worth the hassle, for someone who does not have time to play "home sysadmin" more than once or twice a year?
I've just rebuilt my little home server (mostly for samba, plus a little bit of docker for kids to play with). It has a hardware raid1 enclosure, with 2TB formatted as ext4, and the really important stuff is sent to the cloud every night. Should I honestly bother learning zfs...? I see it popping up more and more but I just can't see the benefits for occasional use.
I've lost work and personal data to bit rot in NAS filesystems before. Archived VM images wouldn't boot anymore after months in storage. Multiple vacation photos became colorful static part way through on disk due to a bit flip in the middle of the JPEG stream. I've had zero issues since switching to ZFS (even without ECC.)
Another huge benefit of ZFS is the copy-on-write (CoW) snapshots, which saved me many times as an IT administrator. It was effortless to restore files when users accidentally deleted them, and recovering from a cryptolocker-type attack is also instant. Without CoW, snapshots are possible, but they're expensive and slow. I saw a 20-user office try to use snapshots on their 30TB Windows Server NAS, hoping to avoid having to revert to tape backups to recover the occasional accidentally deleted file. While hourly snapshots would have been ideal, the NAS had room for only two snapshots, and would crawl to a halt while it created them. But ZFS's performance won't suffer if you snapshot every minute.
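In practice that looks something like this (dataset, snapshot, and file names are invented):

    zfs snapshot tank/office@2024-05-01-0900    # instant, no matter how big the dataset is
    # a user deletes a file at 09:30; copy it straight back out of the read-only snapshot:
    cp /tank/office/.zfs/snapshot/2024-05-01-0900/report.xlsx /tank/office/
    # cryptolocker scenario: discard everything written after the snapshot instead:
    zfs rollback -r tank/office@2024-05-01-0900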
When it's time to back up, ZFS's send/recv capability means you only ever move the differences, and they're pre-computed, so you don't have to re-index an entire volume to determine that you only need to move 124KB, making small transfers lightning fast. Once the backup completes, you have verified that the snapshot on both sides is bit-for-bit identical. While this is the essential property of a backup, most filesystems cannot guarantee it.
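A minimal sketch of that replication loop (hostnames, pool, and dataset names are placeholders):

    zfs snapshot tank/office@mon
    zfs send tank/office@mon | ssh backuphost zfs recv backuppool/office          # first, a full copy
    zfs snapshot tank/office@tue
    zfs send -i @mon tank/office@tue | ssh backuphost zfs recv backuppool/office  # afterwards, only the delta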
ZFS has become a hard requirement for any storage system I build/buy.
> Is zfs really worth the hassle, for someone who does not have time to play "home sysadmin" more than once or twice a year?
I'd argue that it's better for minimizing sysadmin work than the alternatives. Running a scrub, replacing a disk, taking a snapshot, restoring a snapshot, sending a snapshot somewhere (read: trivial incremental backups), etc. are all one command, and it's easy to work with.
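For example (pool, dataset, and device names here are just placeholders):

    zpool scrub tank                     # verify every block against its checksum in the background
    zpool status tank                    # scrub progress, plus any errors found/repaired
    zpool replace tank da2 da5           # swap out a failing disk; resilvering starts automatically
    zfs snapshot -r tank@before-upgrade  # recursive snapshot of every dataset in the pool
    zfs rollback tank/home@before-upgrade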
> I've just rebuilt my little home server (mostly for samba, plus a little bit of docker for kids to play with). It has a hardware raid1 enclosure, with 2TB formatted as ext4, and the really important stuff is sent to the cloud every night. Should I honestly bother learning zfs...? I see it popping up more and more but I just can't see the benefits for occasional use.
The reason I personally would prefer it in that situation is that I don't really trust the layers under the filesystem to protect data from corruption or even to notice when it's corrupted. If you're sufficiently confident that your hardware RAID1 will always store data correctly and never mess it up, then it's close enough. (I wouldn't trust it, but that's me.) At that point, the only benefit I see to ZFS would be snapshots; an incremental `zfs send` is more efficient than however else you're syncing to the cloud.
IMHO, there's not much hassle anymore, unless you seek it out. The FreeBSD installer will install to zfs just as well as ufs. This article seems to not take the least hassle path.
Backups using zfs snapshots are pretty nice; you can pretty easily do incremental updates. zfs scrub is great to have. FreeBSD UFS also has snapshots, but doesn't have a mechanism to check data integrity: fsck checks for well formed metadata only. I don't think ext4 has snapshots or data integrity checking, but I haven't looked at it much.
There are articles and people claiming you need ECC to run zfs or that you need an unreasonable amount of memory. ECC is nice to have, but running ZFS without ECC isn't worse than running any other filesystem without ECC; and you only really need a large amount of ram if you run with deduplication enabled, but very few use cases benefit from deduplication, so the better advice is to ensure you don't enable dedup. I wouldn't necessarily run zfs on something with actually small memory like a router, but then those usually have a specialized flash filesystem and limited writes anyway.
> you only really need a large amount of ram if you run with deduplication enabled, but very few use cases benefit from deduplication, so the better advice is to ensure you don't enable dedup
a lot of people parrot this, but you can always just check for yourself. the in-memory size of the dedupe tables scales with total writes to datasets with deduplication enabled, so for lots of usecases it makes sense to enable it for smaller datasets where you know it'll be of use. i use it to deduplicate fediverse media storage for several instances (and have for years) and it doesn't come at a noticeable ram cost.
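you can check before committing, too; a sketch, with pool/dataset names as examples:

    zdb -S tank                  # simulate dedup on existing data: prints the expected ratio and table size
    zfs set dedup=on tank/media  # enable it only on the dataset where it actually pays off
    zpool status -D tank         # once in use, shows on-disk and in-core dedup table (DDT) sizes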
The difference is that zfs does a lot of work and makes a lot of promises to prove the data is good at every step of the way while it's being handled, which other filesystems do not do.
So: "I copied the data and didn't really look at it much." and it ended up being corrupt,
is different from: "I promise I proved this is solid with math and logic." and it ended up being corrupt, complete with valid checksum that "proves" it's not corrupt.
A zfs scrub will actually destroy good data thanks to untrustworthy ram.
https://tadeubento.com/2024/aarons-zfs-guide-appendix-why-yo... "So roughly, from what Google was seeing in their datacenters, 5 bit errors in 8 GB of RAM per hour in 8% of their installed RAM."
It's not true to say that "Well all filesystem code has to rely on ram so it's all the same."
ZFS says "once I've committed to disk, if the data changes, I'll let you know".
This works, regardless of if you have ram errors or not.
I will say that the reported error rate of 5 bit errors per 8 GB per hour in 8% of installed RAM seems incredibly high compared to my experience running a fleet of about one to three thousand machines with 64-768 GB of ECC RAM. Based on that rate, assuming a thousand machines with 64 GB of RAM each, we should have been seeing about 3000 bit errors per hour; but ECC reports were rare. Most machines went through their 3-5 year life without reporting any correctable errors. Of the small handful of machines that had errors, most went from no errors to a concerning amount in a short time and were shut down to have their RAM replaced; a few threw uncorrectable errors, and most of those threw a second uncorrectable shortly thereafter and had their RAM replaced; there were one or two that would produce about one correctable error per day, and we let those run. There were one, maybe two, that were having so many correctable errors that the machine check exceptions caused operational problems that didn't make sense until the hourly ECC report came up with a huge number.
The real tricky one without ECC is that one bit error a day case... that's likely to corrupt data silently, without any other symptoms. If you have a lot of bit errors, chances are the computer will operate poorly; you'll probably end up with some corrupt data, but you'll also have a lot of crashing and hopefully run a memtest and figure it out.
If you are interested in keeping backups, including the ability to go back in time to recover accidentally deleted/changed files, then ZFS with its reliable snapshot facility is fantastic. Other file systems offer some version of this, e.g. btrfs, but they don't have the same reliability as ZFS.
Snapshots on ZFS are extremely cheap, since it works on the block level, so snapshots every hour or even 15 minutes are now doable if you so wish. Combine with weekly or monthly snapshots that can be replicated off-site, and you have a pretty robust storage system.
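Even something as crude as a cron entry gets you started (a sketch; the pool name and naming scheme are arbitrary):

    # hourly recursive snapshots of the whole pool, named by timestamp
    0 * * * * /sbin/zfs snapshot -r tank@auto-$(date +\%Y\%m\%d-\%H00)

Purpose-built tools like sanoid, zrepl, or zfs-auto-snapshot will also handle retention and pruning for you.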
This is all home sysadmin stuff to be sure, but even if you just use it as a plain filesystem, the checksum integrity guarantees are worth the price of admission IMO.
FWIW, software RAID like ZFS mirrors or mdadm is often superior to hardware RAID, especially for home use. If your RAID controller goes blooey, which does happen, then unless you have the exact same controller to replace it, you run the risk of not being able to mount your drives. Even very basic computers are fast enough to saturate the drives in software these days.
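e.g. a two-disk ZFS mirror (device and pool names are illustrative):

    zpool create tank mirror /dev/ada1 /dev/ada2   # software RAID1, no controller involved
    zpool status tank                              # health of both sides of the mirror
    # the pool can be moved to any machine that speaks ZFS, no matching controller required:
    zpool export tank
    zpool import tank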
Learning effort aside, there’s also the ZFS hardware-requirements issue. I bought a four-bay NAS a couple of years ago and looked into TrueNAS. I (somewhat) remember coming across details such as ZFS benefitting from larger amounts of ECC RAM and a higher number of drives than what I had. This post covers details about the different types of caches and resource requirements:
https://www.45drives.com/community/articles/zfs-caching/
I found ZFS to be very simple to understand; everything is controlled by just two commands. Datasets are a huge win over partitions, which seem like such a weird relic of the past once you have tried datasets. I'm fairly confident you can grasp ZFS in an hour or two, and you can even make a zfs pool from files to mess around with.
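For instance, a throwaway pool made from sparse files (all names arbitrary):

    truncate -s 1G /tmp/d1 /tmp/d2            # two 1 GB sparse files standing in for disks
    zpool create play mirror /tmp/d1 /tmp/d2
    zfs create play/stuff                     # datasets where you'd otherwise reach for partitions
    zfs set compression=lz4 play/stuff        # properties are per-dataset and inherited by children
    zpool destroy play                        # tear it all down when you're done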
I never found a good non-tutorial introduction into ZFS concepts. Do you know any? By non-tutorial, I mean something that doesn’t focus on teaching you the command-line tooling. Like you can explain to someone how Git works conceptually in detail, without having to mention any Git commands and having them exercise some Git workflow hands-on.
The biggest advantage of ZFS, from operational experience, is that when you have problems, ZFS tells you why. Checksum errors? Something is wrong with the hard drive or the SATA/SAS cables. Is the disk slow? zfs events will tell you that it spent more than 5 seconds reading sector x from disk /dev/sdf. The zfs cli commands are super-intuitive and make complete sense. Compared to e.g. virsh, which is just weird for managing vm's.
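Concretely, the places those answers show up (pool name assumed):

    zpool status -v tank   # per-device READ/WRITE/CKSUM counters, plus any files with unrecoverable errors
    zpool events -v        # the event log: checksum errors, slow/delayed I/O, device removals
    zpool iostat -v tank 5 # per-vdev throughput (add -l for latency) refreshed every 5 seconds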
It's definitely worth the hassle. But if everything works fine for you now, don't bother. ZFS is not going away and you can learn it later.
zfs is the furthest thing from hassle, really trivial to use and manage. you'll sit down to do some kind of unhinged change to your infrastructure and it will end up taking 3 command line commands that complete instantly and then you will think, "huh, that was easy" and go back to the rest of your life
I've run my own home NAS on ZFS for 4 years or so. I ended up with a majority-NVMe setup, with long-term archival storage on HDDs. This made my NAS much more useful and ergonomic, since I can search and move files at 10Gbps speeds. I really hated the pause when opening up directories, etc. Details here:
https://benhouston3d.com/blog/home-network-lessons
This is getting lots of upvotes and rightfully so. I think people would love more posts about FreeBSD: especially about ZFS and bhyve (the FreeBSD hypervisor).
It's a bit sad that this Lenovo ThinkCentre ain't using ECC. I use ZFS and know it's good, but I'd prefer to run it on a machine supporting ECC.
I've never tried FreeBSD, but I'm reading more and more about it, and it looks like although FreeBSD has always had its regular users, there are now quite a few people curious about trying it out, for a variety of reasons. The possibility of having ZFS by default and a hypervisor without systemd is a big one for me (I run Proxmox, so I'm halfway there, but bhyve looks like it'd allow me to be completely systemd-free).
I'm running systemd-free VMs and systemd-free containers (long live non-systemd PID 1s), so bhyve looks like it could be the final piece of the puzzle to be free of Microsoft/Poettering's systemd.
You express a desire for more FreeBSD posts and then immediately wade into all the typical flame-warring that surrounds most BSD/ZFS posts (systemd, ECC RAM), and it's been that way for over a decade at this point.
Is your desktop or laptop using ECC? For data that you are actively modifying, the time it spends in non-ECC RAM on the server is trivial compared to the time it spends on your desktop or laptop.
I'll take ZFS without ECC over hardware RAID with ECC any day.
* https://klarasystems.com/articles/managing-boot-environments...
* https://wiki.freebsd.org/BootEnvironments
* https://man.freebsd.org/cgi/man.cgi?query=bectl
* https://dan.langille.org/category/open-source/freebsd/bectl/
* https://vermaden.wordpress.com/2022/03/14/zfs-boot-environme...
> It lets you patch/upgrade an isolated environment without touching the running bits, reboot into that environment, and if things aren't working well boot back into the last known-good one.
It happens by default with freebsd-update (I hope the new pkg replacement still does it too)
> Talking about NixOS, does anybody know of a similar concept in the BSD world (preferably FreeBSD)?
Well, there's https://github.com/nixos-bsd/nixbsd :)
- https://is.gd/BECTL
- https://vermaden.wordpress.com/2025/11/25/zfs-boot-environme...
* https://man.freebsd.org/cgi/man.cgi?query=bectl#end
> beadm(1M) originally appeared in Solaris.
* https://man.freebsd.org/cgi/man.cgi?query=beadm#end
Solaris Live Upgrade BEs worked with (mirrored) UFS root:
* https://docs.oracle.com/cd/E18752_01/html/821-1910/chapter-5...
* https://www.filibeto.org/sun/lib/solaris8-docs/_solaris8_2_0...
It allowed/s for migration from UFS to ZFS root:
* https://docs.oracle.com/cd/E23823_01/html/E23801/ggavn.html
> i use it to deduplicate fediverse media storage for several instances (and have for years) and it doesn't come at a noticeable ram cost.
Nice usecase. What kind of overhead and what kind of benefits do you see?
So: "I copied the data and didn't really look at it much." and it ended up being corrupt,
is different from: "I promise I proved this is solid with math and logic." and it ended up being corrupt, complete with valid checksum that "proves" it's not corrupt.
A zfs scrub will actually destroy good data thanks to untrustworthy ram.
https://tadeubento.com/2024/aarons-zfs-guide-appendix-why-yo... "So roughly, from what Google was seeing in their datacenters, 5 bit errors in 8 GB of RAM per hour in 8% of their installed RAM."
It's not true to say that "Well all filesystem code has to rely on ram so it's all the same."
This works, regardless of if you have ram errors or not.
https://openzfs.github.io/openzfs-docs/man/master/7/zpoolcon...
https://openzfs.github.io/openzfs-docs/man/master/8/zpool.8....
https://openzfs.github.io/openzfs-docs/man/master/8/zfs.8.ht...
> Is zfs really worth the hassle, for someone who does not have time to play "home sysadmin" more than once or twice a year?
Yes. Also: what hassle? It's in many ways simpler than alternatives.