Compressed file-systems increase performance (blogs.sun.com)
38 points by ryandvm on 2010-02-02 | 36 comments




Not a big surprise. We've hit a point in the I/O bandwidth-to-spare-CPU-cycles ratio where this now saves us time getting bits into RAM.

IIRC, the latest Firefox (3.6) is now compressing files on-disk using HFS' built-in compression to cut ~20% off launch time.


Actually we hit this point at least a decade ago; I remember getting improved performance by using a compressing filesystem for Windows 3.1. CPUs have been outrunning disks for a while, only the number of orders of magnitude has really changed.

Part of this is that even very cheap algorithms can consistently get "some" compression; as the article observes, bzip is still a net loss for many cases, but gzip-style compression has been better-than-free for a while now, as long as your CPU isn't loaded.
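If you want to sanity-check that claim on your own hardware, a rough Python sketch like the one below is enough: time a plain write of a file against a gzip -1 write of the same data. (The sample filename and chunk size are just placeholders; this is a back-of-the-envelope measurement, not a proper benchmark.)

    import gzip, os, shutil, time

    SRC = "sample.bin"   # placeholder: any reasonably large, compressible file
    CHUNK = 1 << 20      # copy in 1 MiB chunks

    def timed_write(dst, opener):
        start = time.time()
        with open(SRC, "rb") as src, opener(dst, "wb") as out:
            shutil.copyfileobj(src, out, CHUNK)
        # reopen and fsync so the data actually reaches the disk before the clock stops
        fd = os.open(dst, os.O_RDWR)
        os.fsync(fd)
        os.close(fd)
        return time.time() - start

    raw = timed_write("out.raw", open)
    gz = timed_write("out.gz", lambda path, mode: gzip.open(path, mode, compresslevel=1))

    print("raw write: %6.2fs  %12d bytes" % (raw, os.path.getsize("out.raw")))
    print("gzip -1  : %6.2fs  %12d bytes" % (gz, os.path.getsize("out.gz")))

Level 1 is deliberate: higher gzip levels trade a lot of CPU for marginal ratio gains, which is exactly the wrong trade when the point is to beat the disk.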


This absolutely backs up my own findings-

I've been working to decrease our mailserver backup times. Our Zimbra mailstore was taking over 8 hours to back up; I started compressing the files before writing them to disk, and it made a huge difference. I was able to bring the total time down to 3 hours.

I attributed this to two factors. The first was, as the article mentions, that the machine has so much CPU relative to its disk that the smaller data transfer helped more than the CPU overhead hurt. I found this to be true for gzip, but bzip2 slowed things down. A parallel bzip2 implementation I found online, which spreads the compression cost over multiple cores, reduced the time further, however. This leads me to believe that something closer to the OS, which would adjust the compression level dynamically with CPU load, could be a wonderful product.

The second factor was that the compressed files were in tarballs, so the backup didn't create nearly as many files. Zimbra stores one file per message (a maildir-like format), which means hundreds of thousands of files would otherwise need to be created and allocated on disk. In the archives, however, I am able to store just a few large files, avoiding the overhead of creating each message individually.
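For what it's worth, the core of what I ended up doing is roughly the following (a Python sketch, not our actual script, and the mailstore and archive paths are placeholders): walk the per-message files and stream them all into one gzipped tarball.

    import os, tarfile

    MAILSTORE = "/opt/zimbra/store"        # placeholder path to the per-message files
    ARCHIVE = "/backup/mailstore.tar.gz"   # one big compressed tarball instead of
                                           # hundreds of thousands of tiny files

    with tarfile.open(ARCHIVE, "w:gz", compresslevel=6) as tar:
        for root, dirs, files in os.walk(MAILSTORE):
            for name in files:
                path = os.path.join(root, name)
                # store paths relative to the mailstore so restores are relocatable
                tar.add(path, arcname=os.path.relpath(path, MAILSTORE))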


This has interesting implications when considering SSDs versus traditional hard disks. SSD reads are faster, writes are slower, and space is at a premium.

Indeed, I'm looking into this at the moment.

Some SSDs have compression on board to save space, improve speed, and help endurance.

Interesting, it seems they're using "LZJB" compression. http://en.wikipedia.org/wiki/LZJB

I've been on the lookout for a fast compression algorithm that gives reasonable ratios, while having a liberal enough license. LZJB is licensed under the CDDL, which is GPL-incompatible, but it looks simple enough to be re-implemented.


Have you heard about QuickLZ? It's the best thing I've seen in the LZO class. http://www.quicklz.com/

Edit: Here we go, a list of compression tools ranked by speed: http://www.maximumcompression.com/data/summary_mf4.php


Thanks - QuickLZ does look good. Unfortunately, we're planning to release certain versions of the software we're developing under the GPL and others under a proprietary license, and the $1200 for QuickLZ is a bit steep for an as-yet-unfunded startup. (This is the only time so far that I've wished I were a lone founder ;-) - the license would only be $100 then.)

BriefLZ on that comparison page caught my eye, however. It uses the same license as zlib while being more than twice as fast. We'll have to test that; it would be a good starting point. (The code is also short and clear enough to be understood and reviewed; we can't afford buffer-overflow bugs in it.) We can upgrade to QuickLZ or LZO later when we can afford them. Somehow, BriefLZ seems to have been missing from all the other compression comparison pages I've looked at.
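For the testing, something along these lines is what I have in mind: a small Python harness that round-trips a representative sample of our data through any compress/decompress pair and reports ratio and throughput. Only the zlib baseline below is stock; the BriefLZ/QuickLZ entries would be whatever bindings we end up writing, and "sample.dat" is a placeholder for real data.

    import time, zlib

    SAMPLE = open("sample.dat", "rb").read()   # placeholder: a representative chunk of our data

    def bench(name, compress, decompress, rounds=20):
        t0 = time.time()
        for _ in range(rounds):
            packed = compress(SAMPLE)
        t1 = time.time()
        for _ in range(rounds):
            out = decompress(packed)
        t2 = time.time()
        assert out == SAMPLE, "round-trip failed for %s" % name
        mb = len(SAMPLE) * rounds / 1e6
        print("%-10s ratio %.2f  comp %6.1f MB/s  decomp %6.1f MB/s"
              % (name, len(SAMPLE) / len(packed), mb / (t1 - t0), mb / (t2 - t1)))

    # zlib at a couple of levels as the baseline; plug BriefLZ/QuickLZ bindings in here
    bench("zlib -1", lambda d: zlib.compress(d, 1), zlib.decompress)
    bench("zlib -6", lambda d: zlib.compress(d, 6), zlib.decompress)

The important part is running it on our own data rather than a generic corpus, since ratio and speed both swing a lot with the input.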


Completely off-topic, but this makes me even sadder that I still can't have ZFS on my Mac and probably never will...

Indeed. Fortunately, btrfs is shaping up to be a fine substitute (http://www.codestrom.com/wandering/2009/03/zfs-vs-btrfs-comp...). Including compression.

And will btrfs ever see the light of day as a kernel-mode FS on OS X?

You have to realize that most Apple applications make extensive use of bit-twiddling in the HFS+ filesystem. Resource Forks are used extensively in the OS X software, particularly the iSomething programs. These would all have to be extensively modified to work w/ a different FS.


Doh - I missed that we were talking about the near miss between ZFS and the Mac. I was just lamenting that it's not on Linux.

Most Apple applications do use bundles extensively, but that has nothing to do with HFS+.

Apple has religiously avoided "use of bit-twiddling in the HFS+ filesystem" for years -- everything new since 10.4 uses standard extended attributes. The only remaining uses I can think of are Finder color labels (a special bit-field in HFS), and that a bunch of the standard fonts still use resource forks (they only have a license to redistribute).


Can you please provide a source for your claims about Apple religiously avoiding bit fiddling?

Because Snow Leopard is all about using some really crafty and serious bit fiddling + Resource Forks to improve performance and decrease size.

If you haven't read John Siracusa's excellent report on Snow Leopard, have a look at this page of his review: http://arstechnica.com/apple/reviews/2009/08/mac-os-x-10-6.a...

A lot of what he says is about compression, but pay special attention to what John says about the entire contents of files being stored in extended attributes in order to boost performance and lower file size... and Mail archives being stored in resource forks... and much, much more.

And I never mentioned bundles anywhere - they're a higher-level construct independent of the FS.


I don't suppose you've tried the ZFS plugin for MacFUSE, and just maintaining a separate partition? I haven't touched it, I have no idea if it's any good, but it might solve your woes (unless you're trying to ZFS your OSX boot partition...).

MacFUSE won't really do the trick; I'm a strict believer in kernel-mode filesystems.

But, yes, what I truly want is to ZFS my OS X boot partition. Actually I don't care if it's ZFS - anything BUT HFS please!

Why? Corruption. HFS+ with less than 50% free space is almost guaranteed to corrupt on unsafe shutdown. And it doesn't fsck at boot, leaving you to realize it's been corrupted only when files and folders refuse to copy or spotlight acts up.


Really? I regularly have less than 20% free, and I've had a fair number of unsafe shutdowns (I'm an edge case; I do stuff that encourages this). I've yet to have a single corrupted file out of 2 million+.

I have yet to see any HFS+ corruption. I'm constantly at 70% usage on my partition due to dual booting and virtual machines.

The majority of my shutdowns have been unsafe because I suspend my macbook and I don't shut it down until it crashes. I've been doing this for the 2 years I've had my machine. No problems yet.


Is it good or bad to use ZFS for hosting a database?

This article alludes to a claim that ZFS has fragmentation issues: http://antydba.blogspot.com/2010/02/mongodb-backup-with-zfs....

Because ZFS is a copy-on-write filesystem, Eliot Horowitz says he wouldn't use it for a database: http://groups.google.com/group/mongodb-user/msg/731111475835...

However, a controverting opinion is offered by Jason J. W. Williams, who claims that ZFS works well with large MySQL installs: http://groups.google.com/group/mongodb-user/msg/e94712bc287a...

[edit: Updated links]


SenSage (a log aggregation tool with many security-related features) has used compressed per-column raw data at the lowest level of its distributed storage engine since 2001. Writes are fast, and when reading raw data at the lowest level to fulfill queries in its multi-level, multi-node, map-reduce-style query processor, it wins big-time over traditional RDBMSs because (a) it reads only the columns relevant to the query, rather than pulling fields out of rows/pages, and (b) it's faster to pull and decompress data than to pull the uncompressed data. Storage is typically 10x+ smaller than the original logs, even with redundant copies. RDBMSs inflate the data size instead.
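A toy sketch (Python, nothing to do with SenSage's actual format) shows why per-column compression wins: values within a single column are far more self-similar than whole rows, so the same compressor gets a noticeably better ratio column-wise, and a query only has to decompress the columns it actually touches. The fake log fields below are made up for illustration.

    import random, zlib

    # fake log records: (timestamp, severity, source host, message)
    hosts = ["web%02d" % i for i in range(10)]
    levels = ["INFO", "WARN", "ERROR"]
    rows = [(1265000000 + i, random.choice(levels), random.choice(hosts),
             "request took %d ms" % random.randint(1, 500))
            for i in range(100000)]

    # serialize the same records row-wise and column-wise, then compress both
    row_blob = "\n".join("|".join(map(str, r)) for r in rows).encode()
    col_blobs = ["\n".join(str(r[c]) for r in rows).encode() for c in range(4)]

    row_packed = len(zlib.compress(row_blob, 6))
    col_packed = sum(len(zlib.compress(b, 6)) for b in col_blobs)

    print("row-oriented:    %d -> %d bytes" % (len(row_blob), row_packed))
    print("column-oriented: %d -> %d bytes" % (sum(map(len, col_blobs)), col_packed))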

But FWIW, drives are not that slow. On my i7 machine, a really fast machine, the disks (one SSD, one RAID-1 array with 2 normal disks) can read data much faster than the blocks can be decrypted. My read speed is limited to about 60MB/s compared to the 130MB/s that the disks can actually read. (Yeah, kcryptd is single-threaded. Ouch.)

It's even worse on my Atom laptop. The SSD can do about 150MB/s (as I accidentally bought a really nice one), but the tiny CPU can only manage to decrypt about 20MB a second. And on a laptop, crypto is not exactly paranoia; it's the only sane thing to do.
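If anyone wants to ballpark what their CPU can do on raw AES, independent of the whole dm-crypt/kcryptd path, a quick sketch like this gives an upper bound (Python, assuming the third-party cryptography package is installed; the key, IV, and 128 MiB buffer are made up, and the kernel path will be slower than this userspace number).

    import os, time
    from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

    key = os.urandom(32)                    # throwaway 256-bit key
    iv = os.urandom(16)
    data = os.urandom(128 * 1024 * 1024)    # 128 MiB of junk to push through AES

    enc = Cipher(algorithms.AES(key), modes.CBC(iv)).encryptor()
    ciphertext = enc.update(data) + enc.finalize()

    dec = Cipher(algorithms.AES(key), modes.CBC(iv)).decryptor()
    start = time.time()
    plaintext = dec.update(ciphertext) + dec.finalize()
    elapsed = time.time() - start

    print("AES-256-CBC decrypt: %.0f MB/s" % (len(data) / 1e6 / elapsed))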


I fail to see how this is relevant. The article is about disk compression, not disk encryption.

I keep wanting to build a QNAP box with ZFS somehow. Anyone ever do this?

This makes intuitive sense. A great many operations on computers today are not CPU-bound. In fact, the CPU (plus GPU) sits mostly unutilized. Processors are fast and memory is cheap; both improve following Moore's law.

But hard drive speed does not follow Moore's law. If hard drives are the most significant bottleneck, it makes sense to reduce the amount of seeking and reading the hard drive is doing, shifting the work to the usually-idle multicore CPU+GPU.


So if compression not only saves space but also increases performance, it seems like it's past due for disk manufacturers to actually include compression in disk controllers.

This is potentially true, but it's not a slam dunk because CPUs may be much more powerful than whatever you can build into a disk controller cost-effectively. This may not be true (a high-performance ASIC implementation may be tiny; this is not unheard of in the world of bit-banging algorithms), but it is worth considering. I'll have to look into what compression algorithm they used, and whether you can get away with something smaller/less power-hungry than a Core 2 on a disk controller.

How and where you do the compression matters. ZFS compression appears to be much more efficient than ext2 or NTFS compression due to the copy-on-write nature of ZFS. It would be difficult for a disk controller to implement the same optimizations.

This is something I've been thinking about since the iPad announcement. One of the reasons I desire access to my filesystem is that I like to keep some stuff in archives, either for compression or encryption or easy transfer or avoiding long readdir(2) operations, or any combination thereof. It'd be interesting if this whole issue can be avoided, and whole-filesystem compression is a good step toward that.

There is a google spreadsheet benchmark ( http://spreadsheets.google.com/ccc?key=0AlX1n8WSRNJWdGNQME5B... ) released by one of the developers of ZFS-Fuse.

Makes for interesting reading, since it confirms this benchmark, but on Linux, and shows results comparable to the Solaris ZFS benchmarks.

And SSD + Linux BLOWS everything else away.


This has been known for well over a decade: "volumes that use NTFS compression deliver performance increases as high as 50 percent over their uncompressed counterparts, depending on the type of data stored on the volumes."

http://technet.microsoft.com/en-us/library/cc767961.aspx


*ahem* From the article you linked:

This performance seemed too good to be true until I monitored CPU utilization during a subsequent run of the same benchmarks on a compressed NTFS volume. The CPU utilization on the test jumped from an average of 10 to 18 percent on the uncompressed NTFS volume to a whopping 30 to 80 percent on the compressed NTFS volume.

And most importantly (immediately after the above, separated for extra emphasis):

In addition, performance significantly decreased when I used NTFS compression on larger volume sizes (4GB or greater) and software-based, fault-tolerant RAID volumes.

Methinks we're over the 4GB mark now. The article is too old to be considered accurate any more.

