
ZFS is very sensitive and takes a long time to rebuild. I only choose RAID-10 for this reason, because I'd rather have most of my data than none of it.



It's because ZFS is much more than RAID. RAID just exposes a block device to the OS; ZFS exposes a filesystem and keeps more metadata, e.g. a checksum for every block, so it knows when to repair data using the redundancy across the disks.
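For illustration, here's that self-healing loop in miniature (pool and device names are made up):

  # two-way mirror: the checksum tells ZFS which copy is the good one
  zpool create tank mirror /dev/sdb /dev/sdc
  # read every block and rewrite anything that fails its checksum
  zpool scrub tank
  zpool status tank   # reports repaired bytes and per-device error counts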

Although I completely agree with your observations, there are use cases - sitting in a weird spot between very large and very small data - where ZFS is extremely useful.

I'm regularly working with datasets in the single to two digit TB range. They need to sit on disk because the university doesn't have 10GE.

If I ran a regular RAID and encountered a disk or data error, I would have to re-create the data from the tape archive or, worse, lose it; and getting a replacement disk would surely take weeks.

So instead, I've been running raidz pools for ~10 years and have never missed a beat.
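For reference, setting one of those up is a one-liner (device names here are hypothetical):

  # double-parity raidz2: survives any two simultaneous disk failures
  zpool create data raidz2 /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf
  zpool scrub data   # run this periodically so latent errors get found and fixed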

I agree that this is not very typical, but it suits my needs perfectly.


I use RAID-Z, a feature of ZFS. It works great.

ZFS works at the filesystem level, not the disk level, so a rebuild only touches blocks that are actually in use. Recovering from errors (i.e. rebuilds) is MUCH faster.
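Concretely, a rebuild is just a disk replacement, and the resilver walks only allocated blocks instead of the whole raw disk (names are illustrative):

  zpool replace tank /dev/sdc /dev/sdf   # resilver kicks off automatically
  zpool status tank                      # shows "resilver in progress" and % done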

This is why I always opt for ZFS.

I'm thinking about using ZFS on my data drives and HFS on the backup drives, in case of trouble. Perhaps this is flawed thinking.

My main gripe with ZFS is that I can't dynamically size up an array once it's created. A raidz2 can't go from 6 disks to 7 or 8 disks without completely destroying and recreating the array.
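The workaround I know of is striping a whole second raidz2 vdev into the pool, which costs another full set of disks at once (device names are made up):

  # grows the pool, but as a new 6-disk vdev, not a wider raidz2
  zpool add tank raidz2 /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl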

ZFS is 100% better for RAID and resiliency. If you care about your data, ZFS is the answer. Even if you only have a single disk with no redundancy, at the very least you'll know when your data is corrupted, so you can avoid polluting your backups with that corrupted data.
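Even on a single-disk pool, a scrub will surface silent corruption - a minimal sketch, assuming a pool named tank:

  zpool scrub tank
  zpool status -v tank   # -v lists any files with unrecoverable checksum errors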

It's possibly better for performance too, depending on the use case. If you're dealing with easily compressible data, you can enable compression, which can speed up your reads and writes since less data moves to and from disk. You shouldn't take a performance hit on incompressible data either, since ZFS detects incompressible blocks and stores them uncompressed.
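Compression is a per-dataset property - something like this, with a hypothetical dataset name:

  # lz4 bails out quickly on incompressible blocks and stores them as-is
  zfs set compression=lz4 tank/data
  zfs get compressratio tank/data   # achieved ratio, e.g. 1.00x if nothing compresses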


I've been using the same ZFS array since 2009...

At first, I assumed that RAIDZ was the obvious way to go. But I switched to a concatenation of mirrored pairs; I grow the array by adding another pair to it.
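Each expansion is a single command - roughly this, with made-up device names:

  # stripe another two-way mirror into the existing pool
  zpool add tank mirror /dev/sdi /dev/sdj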

I've actually had six disks fail in six months, without data loss. Was scary. But wow.


ZFS solved so many problems for me all at once: no more spending money on hardware RAID cards, or having them end up being the single point of failure... and so spending even more on spare RAID cards. zfs send really streamlines the backup process as well - so it actually happens. It basically made it possible to build an enterprise-class SAN that I wouldn't otherwise have been able to afford. So not technically a 10x, more like a 10(x+1).
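The backup flow is basically this (host, pool, and dataset names are made up):

  zfs snapshot tank/vms@2024-01-01
  # first run: a full send creates the dataset on the backup box
  zfs send tank/vms@2024-01-01 | ssh backup01 zfs receive backup/vms
  # afterwards: incremental sends ship only the blocks that changed
  zfs snapshot tank/vms@2024-01-08
  zfs send -i @2024-01-01 tank/vms@2024-01-08 | ssh backup01 zfs receive backup/vms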

What's the advantage of using ZFS RAIDZ over mdadm? I thought that mdadm was more flexible in growing your RAID array.

Mine is RAIDZ too, and it became unusably slow a few years back. Now I need to find a way to recreate the pool, but it's too much data to stage on another disk... Not in love with ZFS so far.

Big kudos to ZFS (on Linux); it's just amazing how sane and stable it is. It has saved my ass multiple times over the past years. I've also just finished upgrading a RAIDZ1 vdev by replacing one disk at a time with bigger ones. Resilvering took 15h per disk, and there was some trouble with an (already replaced) disk failing during that. Panic mode set in, but ZFS provided the means to fix it quite easily - all good. Best decision ever to pick ZFS.

ZFS is great. But one thing scares me away from using it. With ZFS, you can't simply add a new disk to a raidz set. You have to either build a new pool to send your data to, or:

1. Buy the same number of (but larger) hard disks.
2. Replace them one by one.
3. After the last disk is done, the pool's size will grow.
4. Sell your old hard disks.

Much more painful than other RAID solutions.
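For the record, the replace-one-by-one dance goes something like this (hypothetical device names; every replace means a full resilver):

  zpool set autoexpand=on tank           # let the pool grow once all disks are bigger
  zpool replace tank /dev/sda /dev/sde   # wait for the resilver to finish...
  zpool replace tank /dev/sdb /dev/sdf   # ...then do the next disk, and so on
  zpool status tank                      # extra capacity shows up after the last one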


Slow performance has always been my impression of RAIDZ. There's an obvious performance hit if you're coming from conventional RAID setups (like mdadm). I've even seen unbelievably slow speeds, like 1 MiB/s, in the middle of copying a large git repo (which has a lot of small files; repeated git pulls also create a lot of fragmentation). But my experience was based on early ZFS on Linux; perhaps a real BSD would perform better. Anyway, I knew ZFS's first goal is data safety, not performance, and that's why I could tolerate it...
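For what it's worth, a few dataset properties are commonly suggested for small-file workloads - no guarantees, and the dataset name here is hypothetical:

  zfs set recordsize=16K tank/src   # closer to typical file size than the 128K default
  zfs set atime=off tank/src        # skip a metadata write on every read
  zfs set xattr=sa tank/src         # ZFS on Linux: store xattrs in the dnode, fewer IOPS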

I use 10x8TB disks in RAIDZ2 in my home server: Time Machine backups for 6 people, docker volumes, and an excessively huge media collection.

The TimeMachine datasets are backed up offsite.

Losing this pool would be a PITA, but not critical.

My primary goal with ZFS is some data redundancy at a good cost, plus quick remote backup for a fraction of the pool. Not performance.

At one point, 2 disks died within 2 days. While there was some panic involved, the data on the server could be reproduced with some time.

There isn't one best solution that fits all needs. If there were, ZFS wouldn't offer all the options it does.


But that's not what we were talking about! No one was saying "ZFS sucks as much as RAID but at least it rebuilds faster afterwards". The implication was that ZFS avoided the problem in the article (when, it seems, both ZFS and RAID need to be scrubbed, and both avoid the problem when that is done).

I, too, have had this experience, but how is this possible? So much other experience has taught me that hardware RAID is better. The main downside I've had with ZFS (versus hardware RAID) is performance, but it's been incredibly reliable.

I have a fair amount of experience with ZFS, and have the same question.
