ZFS is very sensitive and takes a long time to rebuild. That's the only reason I choose RAID-10: I'd rather have most of my data than none of it.
It's because ZFS is much more than RAID. RAID just exposes a logical disk to the OS. ZFS exposes a filesystem and keeps more metadata, e.g. checksums, so it knows when and how to repair data using the redundancy across the disks.
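For example, a scrub walks every block and repairs anything whose checksum doesn't match, as long as a good copy exists on another disk (the pool name below is a placeholder):

    zpool scrub tank        # 'tank' is a placeholder pool name
    zpool status -v tank    # shows scan progress and any CKSUM errors that were repaired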
Although I completely agree with your observations, there are use cases - sitting in a weird spot between very large and very small data - where ZFS is extremely useful.
I'm regularly working with datasets in the single- to two-digit-TB range. They need to sit on disk because the university doesn't have 10GE.
If I ran a regular RAID and encountered a disk or data error, I would have to re-create the data from the tape archive or, worse, lose it; and getting a replacement disk would surely take weeks.
So instead I've been running raidz pools for ~10 years, and they have never skipped a beat.
I agree that this is not very typical, but it suits my needs perfectly.
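For anyone curious, setting up a pool like that is basically a one-liner (pool and disk names below are placeholders, not a recommendation):

    zpool create tank raidz2 /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf   # names are placeholders
    zpool status tank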
My main gripe with ZFS is that I can't dynamically size up an array once it's created. A raidz2 can't go from 6 disks to 7 or 8 disks without completely destroying and recreating the array.
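(The closest thing to growing it in place is adding an entire second raidz vdev alongside the first, which is not the same as going from 6 to 7 disks. A sketch, with made-up names:)

    # adds a whole new 6-disk raidz2 vdev to the pool; pool/device names are placeholders
    zpool add tank raidz2 /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl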
ZFS is 100% better for RAID and resiliency. If you care about your data, ZFS is the answer. Even if you only have a single disk with no redundancy, at the very least you'll know when your data is corrupted, so you can avoid polluting your backups with that corrupted data.
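Even on a single-disk pool, a scrub will at least tell you exactly which files are damaged (pool name is a placeholder):

    zpool scrub tank        # 'tank' is a placeholder pool name
    zpool status -v tank    # -v lists the individual files with permanent errors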
It's possibly better for performance, depending on use case. If you're dealing with easily compressible data, you can enable compression, which can speed up your reads/writes since less data has to hit the disks. You shouldn't take a performance hit on incompressible data either, since ZFS stores blocks uncompressed when compression doesn't pay off.
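Something like this, assuming a dataset called tank/data (the names are made up):

    zfs set compression=lz4 tank/data            # lz4 gives up quickly on incompressible blocks
    zfs get compression,compressratio tank/data  # check what you're actually getting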
At first, I assumed that RAIDZ was the obvious way to go. But I switched to a concatenation of mirrored pairs; I grow the array by adding another pair to it.
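Roughly like this (device names are placeholders) - start with one mirrored pair and keep bolting on more pairs as the pool fills up:

    zpool create tank mirror /dev/sda /dev/sdb   # names are placeholders
    zpool add tank mirror /dev/sdc /dev/sdd      # grows the pool by another mirrored pair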
I've actually had six disks fail in six months, without data loss. Was scary. But wow.
With ZFS, so many problems got solved for me all at once: no more having to spend money on hardware RAID cards, or having them end up being the single point of failure... and so having to spend even more on spare RAID cards. zfs-send really streamlines the backup process as well - so it actually happens. It basically made it possible to build an enterprise-class SAN that I wouldn't have been able to afford otherwise. So not technically a 10x, more like a 10(x+1).
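A minimal sketch of that backup flow (pool, dataset, snapshot and host names are all made up):

    zfs snapshot tank/data@monday
    zfs send tank/data@monday | ssh backuphost zfs recv backup/data               # first full copy
    zfs send -i @monday tank/data@tuesday | ssh backuphost zfs recv backup/data   # later sends only ship changed blocks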
Mine is RAIDZ too, and it has become unusably slow over the last few years. Now I need to find a way to recreate the pool, but it's too much data to store on another disk... Not in love with ZFS so far.
Big kudos to ZFS (on Linux), it's just amazing how sane and stable it is. It has saved my ass multiple times over the past years. I've also just finished upgrading a RAIDZ1 vdev by replacing one disk at a time with bigger ones. Resilvering took 15h for each disk, and there was some trouble with an (already replaced) disk failing during that. Panic mode set in, but ZFS provided the means to fix it quite easily - all good. Best decision ever to pick ZFS.
ZFS is great. But one thing scares me away from using it. With ZFS, you can't simply add a new disk to an existing raidz set. You have to either build a new pool to send your data to, or do the swap dance sketched below:
1. Buy the same number of (but larger) hard disks.
2. Replace them one by one.
3. After the last disk is done resilvering, the pool size grows.
4. Sell your old hard disks.
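In commands, that dance looks roughly like this (pool and device names are placeholders); with autoexpand=on the extra space shows up once the last disk finishes resilvering:

    zpool set autoexpand=on tank            # 'tank' and the devices below are placeholders
    zpool replace tank /dev/sda /dev/sdg    # repeat for each disk...
    zpool status tank                       # ...waiting for the resilver to finish each time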
Slow performance has always been my impression with RAIDZ. There's an obvious performance hit if you're coming from conventional RAID setups (like mdadm). I've even seen unbelievably slow speeds, like 1 MiB/s, in the middle of copying a large git repo (which has a lot of small files; repeated git pulls also create a lot of fragmentation). But my experience was based on early ZFSOnLinux; perhaps a real BSD will perform better. Anyway, I knew ZFS's first goal is data safety, not performance, which is why I could tolerate the performance...
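If anyone wants to check whether fragmentation is the culprit on their own pool, the fill level and free-space fragmentation are visible with (pool name is a placeholder):

    zpool list -o name,size,alloc,free,frag,cap tank   # 'tank' is a placeholder pool name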
but that's not what we were talking about! no-one was saying "zfs sucks as much as raid but at least it rebuilds faster afterwards". the implication was that zfs avoided the problem in the article (when, it seems, both zfs and raid need to be scrubbed, and both avoid the problem when that is done).
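for what it's worth, a scrub is a single periodic command in either world (pool/array names are placeholders):

    zpool scrub tank                              # zfs
    echo check > /sys/block/md0/md/sync_action    # linux md raid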
I, too, have had this experience, but how is this possible? So much other experience has taught me that hardware RAID is better. The main downside I've had with ZFS (over hardware RAID) is performance, but it's been incredibly reliable.