bcachefs: Principles of Operation [pdf]

eurg | karma 841 | avg karma 6.42 · 2022-05-18 00:58:26

See also: https://news.ycombinator.com/item?id=31419121 "Bringing bcachefs to the mainline"

dang | karma 18142 | avg karma 0.25 · 2022-05-18 03:41:25

Comments moved thither. Thanks!

arunprakash01 | karma -7 | avg karma -0.7 · 2022-05-18 07:23:04

Wow, that is quite informative. I like this article very much. The content was good. If any of the engineering students are looking for renewable energy projects, I found this site and they are providing the best service to the engineering students regarding the projects: https://takeoffprojects.com/renewable-energy-projects

acd | karma 2809 | avg karma 2.02 · 2022-05-18 01:43:43

Nice, lots of good features! It seems to support caching a hard disk on an SSD, plus filesystem snapshots.

Looks like it can replace/complement ZFS and Btrfs.

viraptor | karma 41139 | avg karma 2.79 · 2022-05-18 03:18:24

Probably won't replace zfs, but if the recovery / failure handling story is better than btrfs, maybe we'll see that one replaced. They're going to be fairly similar in features and concepts.

supern00b | karma 2 | avg karma 2.0 · 2022-05-18 05:02:13

Why do you think that it won't replace zfs? It has all the features that zfs has.

viraptor | karma 41139 | avg karma 2.79 · 2022-05-18 05:15:03

If it just has all the features zfs has and none of the years of stability that zfs has... why would any current user (who is by definition ok with the licence) take the effort to switch? What's the extra benefit that zfs currently doesn't provide?

wokkel | karma 174 | avg karma 1.93 · 2022-05-18 06:38:29

Memory usage: zfs is still not neatly integrated and requires a big chunk of memory to operate efficiently.

MailNerd | karma 104 | avg karma 2.54 · 2022-05-18 07:28:28

Big chunk of memory - that's not the case except if you are using deduplication. I have it running fine on a 4GB machine (10TB mirrored volume) without any issues.

viraptor | karma 41139 | avg karma 2.79 · 2022-05-18 07:47:31

What I believe gp was referring to is that zfs uses a different caching system than the rest of the kernel. Specifically, the ARC and the page cache do almost the same job, but they are separate and may fight for resources. This has been discussed in a few places, but here's an example with a summary of the behaviour: https://www.reddit.com/r/zfs/comments/o8xqzb/zfs_on_linux_ca...

cesarb | karma 14181 | avg karma 3.67 · 2022-05-18 09:03:35

Just being in the mainstream kernel, instead of an out-of-tree module, already gives a couple of benefits. The code will always be kept up-to-date with the internal kernel API, so you won't have a situation where the kernel is updated but the out-of-tree module fails to compile due to an API change. And if you're running a signed kernel (for SecureBoot), you don't need to have a complex setup to add an extra key and sign the module whenever it's rebuilt. Also, as others have mentioned, being in-tree means it's better integrated with the rest of the kernel; this includes making changes to core kernel code when they could help the module.

2pEXgD0fZ5cF | karma 4234 | avg karma 8.8 · 2022-05-18 09:09:17

> What's the extra benefit that zfs currently doesn't provide?

Not being at the mercy of a company like Oracle is a huge plus in many ways. A huge plus for future development and risk-free adoption.

I use ZFS on a server of mine, but I am one of those paranoid people who would switch just to get away from a project that could be hamstrung at any moment if Oracle has one of its episodes again.

ZFS (hopefully) never finding its way into the mainline kernel is kind of a meta-disadvantage.

cyphar | karma 11934 | avg karma 2.62 · 2022-05-18 10:04:08

OpenZFS is the primary place where ZFS development happens and it's not managed by Oracle.

2pEXgD0fZ5cF | karma 4234 | avg karma 8.8 · 2022-05-18 10:21:44

Yes, I am aware of that. My fear is Oracle going on a legal rampage, not direct mismanagement.

mekster | karma 1189 | avg karma 0.8 · 2022-05-18 13:49:31

What are they going to do about it?

They can't just relicense OpenZFS. The only attack vector is suing Canonical for distributing zfs with Ubuntu, which Canonical's lawyers think is fine. Even then, you're still free to use it by adding it to the system yourself, as users of every other distribution do.

Running a new filesystem is far more dangerous than your theoretical legal concerns.

viraptor | karma 41139 | avg karma 2.79 · 2022-05-18 15:07:26

They can go after end users if they manage to find a business angle where they can start charging for the system. "But that's silly, Oracle wouldn't be bothered by small businesses, right?" https://www.reddit.com/r/sysadmin/comments/d1ttzp/oracle_is_...

mekster | karma 1189 | avg karma 0.8 · 2022-05-18 20:51:21

How can they charge for a product that's not their own?

cyphar | karma 11934 | avg karma 2.62 · 2022-05-18 20:54:27

OpenZFS is licensed under a free software license. There's no mechanism by which they could demand anything. At best they could try to sue Canonical for CDDL violations, though the legal arguments online suggest that Oracle wouldn't have a leg to stand on here because the license incompatibility comes from the GPL side (meaning that Linux copyright holders might be able to sue Canonical but even that is a stretch since it's hard to argue OpenZFS is a derivative work of Linux).

The thread you linked is someone who was using software with a proprietary license against its terms and is being asked to pay for its usage -- obviously I think Oracle is being scummy but it's not a comparable situation at all. This would be like saying that you won't use VS Code because Microsoft once demanded that someone who was using a cracked copy of Windows pay them -- it's a complete non-sequitur.

Would you refuse to use ZFS on FreeBSD as well?

kzrdude | karma 11414 | avg karma 2.35 · 2022-05-19 10:41:37

Because right now it "has no features" because it's not really widely available. No installation, no features. The proof of the pudding is in the actual shipping and installing and using it. :)

nwmcsween | karma 1122 | avg karma 1.12 · 2022-05-18 10:24:21

ZFS would probably take ages to merge unless there is a simple mapping of spl -> native Linux interfaces.

folli | karma 3647 | avg karma 4.5 · 2022-05-18 01:53:12

Can someone explain to me (a bloody layman) what the advantages/differences are over a traditional file system like ext3?

justinlloyd | karma 2369 | avg karma 3.51 · 2022-05-18 02:03:42

CoW (copy-on-write), deduplication of data blocks, replicas (RAID equivalent), file system snapshots and multiple, tiered caching layers, plus several other features.

On a sidenote, I integrated bcachefs as a vmx driver into VMWare ESXi for a CI/CD build server a few years back. The build system ran in VMs and in containers, but the non-essential target directories sat on bcachefs volumes with the caching layer directed first at RAM, then at SSD, then finally at the HDD. Managing all the Unity3D and Unreal caches was amazingly fast across dozens of different SKUs of the same project.

From my project write-up on my LinkedIn:

    Bcache-like caching layer for VMWare ESXi

    Reduce latency and read/write delay even if using SSD as your storage.
    Written in C and inserting itself as a storage tier into VMWare ESXi to handle read & write storage requests this caching system accelerates all accesses to the underlying backing store.
    Can work in both write-back and write-through caching modes. (Native C kernel device driver)

nwellinghoff | karma 294 | avg karma 1.96 · 2022-05-18 06:09:58

Cool stuff, do you have a more detailed write up of this? I am interested in learning more!

pitaj | karma 4013 | avg karma 2.53 · 2022-05-18 12:03:31

Does bcachefs support something like raidz2?

attentive | karma 277 | avg karma 1.4 · 2022-05-18 02:06:01

Checksums, compression, encryption, snapshots, etc.

It's a long list. It's better to compare it with btrfs.

formerly_proven | karma 13110 | avg karma 3.44 · 2022-05-18 02:20:35

There's a ton of features, but really the most important one is that CoW filesystems tend to extend checksum protection to the actual data you're storing on them. Journaling filesystems generally only protect themselves and not the data.
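To illustrate what "checksum protection for the data" means, here is a minimal sketch, not how bcachefs, btrfs, or ZFS actually lay things out. The `ChecksummedStore` class and its use of CRC32 are assumptions made for the example: each block is stored with a checksum that is verified on every read, so silent corruption turns into an error instead of bad data.

```python
import zlib


class ChecksummedStore:
    """Toy block store (hypothetical): keeps a CRC32 next to every data block."""

    def __init__(self):
        self.blocks = {}     # block number -> bytes
        self.checksums = {}  # block number -> CRC32 of the bytes as written

    def write(self, block_no, data):
        self.blocks[block_no] = data
        self.checksums[block_no] = zlib.crc32(data)

    def read(self, block_no):
        data = self.blocks[block_no]
        # Verify on every read; a mismatch means the data changed underneath us.
        if zlib.crc32(data) != self.checksums[block_no]:
            raise IOError(f"checksum mismatch in block {block_no}")
        return data


store = ChecksummedStore()
store.write(0, b"important data")
first_read = store.read(0)

# Simulate bit rot by changing the stored bytes behind the store's back.
store.blocks[0] = b"important dat\x00"
try:
    store.read(0)
    corruption_detected = False
except IOError:
    corruption_detected = True
```

A filesystem that only journals metadata has no equivalent of the check in `read`, so the corrupted bytes would be returned as if they were valid.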

FooBarWidget | karma 7524 | avg karma 2.79 · 2022-05-18 03:31:51

Are filesystem-level checksums better than dm-integrity, which works at the block level?

viraptor | karma 41139 | avg karma 2.79 · 2022-05-18 04:35:37

The only small advantage I can think of is that the filesystem knows which parts of the drive are being used, so it will not try to recover stale/unused blocks. Same goes for writing - dm-integrity needs to initialise the whole drive and update every single write, but the fs can issue some big TRIMs instead. (Or does integrity know how to trim these days? (Apparently it depends on the usage mode https://www.kernel.org/doc/html/latest/admin-guide/device-ma...))

greenicon | karma 37 | avg karma 1.16 · 2022-05-18 08:31:57

With dm-integrity you either have a small hole of a few milliseconds (bitmap mode) or write all data twice (journal mode). When integrating everything into a cow file system you can sidestep the issue, as you basically journal the data through cow anyway.
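A toy model of the "write all data twice" point, under loudly stated assumptions: `JournaledDevice` and `CowStore` are hypothetical classes, only data bytes are counted, and the small metadata write for the pointer update is ignored. It is a sketch of the idea, not of dm-integrity or any real filesystem.

```python
class JournaledDevice:
    """Journal mode (toy): data goes to the journal first, then to its final place."""

    def __init__(self):
        self.journal = []
        self.blocks = {}
        self.bytes_written = 0

    def write(self, block_no, data):
        self.journal.append((block_no, data))  # first copy: the journal
        self.bytes_written += len(data)
        self.blocks[block_no] = data           # second copy: in place
        self.bytes_written += len(data)


class CowStore:
    """Copy-on-write (toy): data goes to a fresh location, then a pointer is switched."""

    def __init__(self):
        self.storage = []    # append-only list of written blocks
        self.pointers = {}   # logical block number -> index into storage
        self.bytes_written = 0

    def write(self, block_no, data):
        self.storage.append(data)              # only copy of the data
        self.bytes_written += len(data)
        self.pointers[block_no] = len(self.storage) - 1  # metadata update, not counted

    def read(self, block_no):
        return self.storage[self.pointers[block_no]]


payload = b"x" * 4096
journaled = JournaledDevice()
cow = CowStore()
journaled.write(0, payload)
cow.write(0, b"old" * 10)
cow.write(0, payload)  # overwrite: the old block is left untouched
```

In this model the old version is still sitting in `cow.storage[0]` after the overwrite, which is also why snapshots come cheaply with a CoW design.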

Shish2k | karma 4193 | avg karma 3.46 · 2022-05-18 05:04:49

The thing that I’m most looking forward to: the ability to have multiple drives of different sizes, and expand the array with new oddly-sized drives (great for a homelab NAS built of spare parts; btrfs can do this, but zfs can't), combined with being able to set some drives as caches (eg having the filesystem automatically store frequently-read data on an SSD and rarely-read data on HDD; zfs can do this, btrfs can't)

mekster | karma 1189 | avg karma 0.8 · 2022-05-18 09:34:49

I don't know about bcachefs' capability on this, but with zfs you can take an instant snapshot of a MySQL/PostgreSQL data directory and call it a backup. That beats fighting the tough fight that is database backup with their dump utilities, which take a good amount of space and time and don't make incremental backups easy.

https://lackofimagination.org/2022/04/our-experience-with-po...

https://www.percona.com/blog/2017/12/07/hands-look-zfs-with-...
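A rough sketch of what such a snapshot-based backup could look like. The dataset name `tank/pgdata` and the helper functions are hypothetical; `zfs snapshot` and `zfs send -i` are the standard OpenZFS subcommands. The helpers only build the argument lists and do not run anything. Note that a snapshot of a running database is crash-consistent, so this relies on the database's own recovery on restore.

```python
from datetime import datetime, timezone


def snapshot_cmd(dataset, when):
    """Build the command for an atomic snapshot named after a timestamp."""
    name = f"{dataset}@backup-{when:%Y%m%dT%H%M%SZ}"
    return ["zfs", "snapshot", name]


def incremental_send_cmd(previous, current):
    """Build the command that streams only the changes between two snapshots."""
    return ["zfs", "send", "-i", previous, current]


when = datetime(2022, 5, 18, 9, 34, 49, tzinfo=timezone.utc)
snap = snapshot_cmd("tank/pgdata", when)
send = incremental_send_cmd("tank/pgdata@backup-old", snap[2])
print(" ".join(snap))
print(" ".join(send))
```

The output of `zfs send` would normally be piped to a file or to `zfs receive` on another machine; that part is left out here.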

pavon | karma 3906 | avg karma 4.66 · 2022-05-18 10:22:03

I really want a filesystem with snapshots as the foundation of a better backup system. Bcachefs has other useful features like CoW and disk pooling (in fact snapshots weren't even on the table until recently), but in my mind snapshots are a must-have feature these days, like journaling was 20 years ago.

Unfortunately, this has been just around the corner on Linux for over a decade now, and the two filesystems that promised to deliver it are unlikely to reach mainstream support on Linux. ZFS has licensing issues, and uses too much memory for a desktop system. BTRFS tried to do too many things, and has had too many reliability issues for me to trust it.

mekster | karma 1189 | avg karma 0.8 · 2022-05-18 13:43:41

The licensing issue only affects distribution. You're free to use it without breaching the license. Not sure why people won't just use it.

How is zfs using too much memory? zfs can run on a 2GB server (with some swap). Any laptop would have enough memory to run it. You might want to change "zfs_arc_max" as zfs tries to use half the memory available on a system if you don't set it and don't use deduplication.
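For reference, a small sketch of how one might derive a "zfs_arc_max" setting. The helper name `arc_max_option` and the 1 GiB cap are assumptions for illustration; the parameter takes a value in bytes, and on Linux it is commonly set through a modprobe options line such as the one printed here (typically placed in a file like /etc/modprobe.d/zfs.conf).

```python
def arc_max_option(limit_gib):
    """Return a modprobe options line capping the ARC at limit_gib GiB."""
    limit_bytes = int(limit_gib * 1024**3)  # zfs_arc_max is given in bytes
    return f"options zfs zfs_arc_max={limit_bytes}"


line = arc_max_option(1)
print(line)  # options zfs zfs_arc_max=1073741824
```

The right cap depends on the workload; the 1 GiB figure is only an example for a small machine.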
