Running ZFS Natively On Linux Slower Than Btrfs - Slashdot


Running ZFS Natively On Linux Slower Than Btrfs 235

Posted by kdawson on Monday November 22, 2010 @11:56AM from the early-days dept.

An anonymous reader writes "It's been known that ZFS is coming to Linux in the form of a native kernel module done by the Lawrence Livermore National Laboratory and KQ Infotech. The ZFS module is still in closed testing on KQ Infotech's side (but LLNL's ZFS code is publicly available), and now Phoronix has tried out the ZFS file-system on Linux and carried out some tests. ZFS on Linux via this native module is much faster than using ZFS-FUSE, but the Solaris file-system in most areas is not nearly as fast as EXT4, Btrfs, or XFS."

This discussion has been archived. No new comments can be posted.


Different ZFS distros (Score:4, Informative)

by hoggoth ( 414195 ) writes: on Monday November 22, 2010 @12:15PM (#34306870) Journal

I was confused as to what versions of ZFS were available on which distros so I made a chart that lists the different distros and which version of ZFS they support:
http://petertheobald.blogspot.com/2010/11/101-zfs-capable-operating-systems.html [blogspot.com]
Hope it's helpful...

Re:They Why ZFS? (Score:2, Informative)

by Anonymous Coward writes: on Monday November 22, 2010 @12:29PM (#34307042)

"ZFS is...the only FS for large disks"
XFS
"I snapshot the entire 10 TB array in about 30 minutes (about 2000 file systems)...LVM has snapshots, true, but they are not quick or convenient compared to ZFS."
30 minutes? That's insane. An LVM2 snapshot would take seconds. I fail to see how that's not quick, and how "lvcreate -s" is less convenient.
"In LVM I can only snapshot to unused space in the volume set. With ZFS you can snapshot as long as you have free space."
I can't even make sense of these two sentences. What you're saying is, an LVM snapshot requires free space, and er, a ZFS snapshot requires free space?

Re:They Why ZFS? (Score:5, Informative)

by daha ( 1699052 ) writes: on Monday November 22, 2010 @12:30PM (#34307054)

Which of the ZFS features most impact its performance?
Compression enabled by default can't help (available in btrfs).
Checksum for all blocks probably doesn't help, but definitely helps detect corrupt data/corruption (available in btrfs).
Forcing any file that requires more than a single block to use a tree of block pointers probably doesn't help. The dnode only has one block pointer and the block pointer can only point to a single block (no extents). On the plus side, the block size can vary between 512 bytes and 64 KiB per object, so slack space is kept low. If more than a single block is necessary it creates a tree of block pointers. Each block pointer is 128 bytes in size, so the tree can get deep fairly quickly.
Three copies of almost all file system structures (such as inodes, but called dnodes in ZFS) by default can't help (which are compressed of course).
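
To put rough numbers on the block-pointer tree described above, here is a minimal sketch. It takes the 128-byte block pointer from the comment and additionally assumes 16 KiB indirect blocks (so 128 pointers per indirect block); that size, the function name, and the parameters are illustrative assumptions, not ZFS code.

```python
BLOCK_POINTER_SIZE = 128  # bytes per block pointer, as stated in the comment


def indirect_levels(file_size, data_block_size, indirect_block_size=16 * 1024):
    """Return how many levels of indirect blocks a file of file_size bytes needs.

    indirect_block_size is an assumption made for illustration only.
    """
    fanout = indirect_block_size // BLOCK_POINTER_SIZE
    # Ceiling division: number of data blocks needed to hold the file.
    blocks = max(1, -(-file_size // data_block_size))
    levels = 0
    reachable = 1  # the dnode's single pointer addresses one block directly
    while reachable < blocks:
        # Each extra level multiplies the addressable data blocks by the fanout.
        reachable *= fanout
        levels += 1
    return levels
```

Under these assumptions a file that fits in one block needs no indirection, while a 1 GiB file with 64 KiB data blocks (16384 blocks) needs two levels, since 128 x 128 = 16384.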

Checksums - 1 feature ZFS has that Ext4 doesn't (Score:4, Informative)

by yup2000 ( 182755 ) writes: on Monday November 22, 2010 @12:34PM (#34307088) Homepage

hmmm, well the most obvious feature that ZFS has that Ext4 does not is checksumming.
That feature is one reason why ZFS is better (it will tell you if your disk is going bad, and if you have a RAID setup, it will go get the good data for you). However, this is also one reason why ZFS is slower... it spends time making sure your data is safe and that it always gives you the correct bits from your disk.
That single feature is why I run FreeBSD (looking forward to kFreeBSD/debian!) on my file server in a mirrored RAID configuration. Yes, it is "slower", but I still pull data off that server at over 50MB/sec on my home gigabit LAN! The specs on that server aren't great either... 2GB RAM, and an old 1.6GHz single core Sempron.
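
The "go get the good data" behaviour can be sketched as below. This is a toy model, not ZFS internals: SHA-256 stands in for whatever checksum the filesystem actually uses, and `read_mirrored` and its arguments are hypothetical names.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Stand-in checksum; the real filesystem's algorithm may differ."""
    return hashlib.sha256(data).hexdigest()


def read_mirrored(copies, expected):
    """Return (good_data, repaired_indexes) for a block stored on several mirrors.

    copies   -- mutable list holding one bytes object per mirror side
    expected -- checksum recorded when the block was written
    """
    good = None
    bad = []
    for i, data in enumerate(copies):
        if checksum(data) == expected:
            if good is None:
                good = data
        else:
            bad.append(i)  # this side returned corrupt bits
    if good is None:
        raise IOError("all copies failed checksum verification")
    # Self-heal: overwrite each damaged copy with the verified one.
    for i in bad:
        copies[i] = good
    return good, bad
```

The extra work on every read (hashing and comparing) is the cost the comment refers to; the payoff is that a silently corrupted mirror side is detected and rewritten instead of being returned to the application.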

Re:They Why ZFS? (Score:5, Informative)

by Maquis196 ( 535256 ) writes: on Monday November 22, 2010 @12:42PM (#34307196)

zpool status
That's the command you are looking for. zfs-fuse lists disks by id, which means if you go into /dev/disk/by-id/ and do an ls -al you'll see which devices they are linked to.
It is done this way to make it easier in Linux. In BSD/Solaris the disks are listed by GPT name (well, they were for me), so this keeps it sane.
Hope it helps.
Maq
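
The symlink lookup described above can also be scripted. This is a small sketch that assumes the usual Linux layout where entries under /dev/disk/by-id are symlinks to the real device nodes; the function name is made up for illustration.

```python
import os


def map_ids_to_devices(by_id_dir="/dev/disk/by-id"):
    """Map each by-id name to the device node its symlink resolves to."""
    mapping = {}
    for name in sorted(os.listdir(by_id_dir)):
        path = os.path.join(by_id_dir, name)
        if os.path.islink(path):  # skip anything that is not a symlink
            mapping[name] = os.path.realpath(path)
    return mapping
```

Calling `map_ids_to_devices()` on a Linux box gives the same information as the ls -al listing, in a form that is easy to match against the names shown by zpool status.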

Re:They Why ZFS? (Score:1, Informative)

by Anonymous Coward writes: on Monday November 22, 2010 @12:52PM (#34307336)

If you evenly split your 1TB disk into 2 LVM volume sets and volume1 is full, you can't make a snapshot of it. volume2 is sitting there empty, but the snapshot can't use it.

Re:Using a first beta slower than stable? Wha?!?!? (Score:3, Informative)

by chrb ( 1083577 ) writes: on Monday November 22, 2010 @12:56PM (#34307406)
"Who would have thought that a first-release Beta kernel module would not run as fast or be as reliable as the stable implementation for other operating systems, or the stables on Linux?"
The full release is supposed to be coming out in the first week of January. Given the short time frame, it would seem like this is probably closer to the final release than the words "first beta" imply.
Surprises:
- Native ZFS beat XFS on several of the benchmarks - XFS is usually a good performer in these kinds of tests
- Native ZFS does very well on the Threaded IO Test, where it ties for first place.
- Btrfs is really bad on the SQLite test, taking 5 times longer than XFS on both 2.6.32 and 2.6.37 (bug?)
- XFS IOzone write performance increased by 45% going from 2.6.32 to 2.6.37 (!) XFS increased on FS-Mark by 37%. I thought XFS would be pretty much at the point where there would be no such great improvements.
- "Real" Solaris+ZFS gets absolutely slaughtered on the Threaded IO Test and the PostMark Test, with ext4 pushing almost 10x more transactions per second.
- Tests were done on an SSD; apparently there was no difference in relative performance of the filesystems on SSD versus HD
Notes:
- "Real" Solaris+ZFS results are not shown for most tests
- Would be nice to know how many replicates they did of each test
- This is an interesting set of results that will get people talking/arguing :-) Thanks to Phoronix for starting the discussion.
Re:how about versus ZFS on Solaris or FreeBSD? (Score:1, Informative)

by Anonymous Coward writes: on Monday November 22, 2010 @01:10PM (#34307576)

If you read TFA (or perhaps even the Slashdot submission text) you should know that both FUSE and native ports for Linux are being discussed.

Re:I'm using btrfs on my home partition. (Score:2, Informative)

by larkost ( 79011 ) writes: on Monday November 22, 2010 @01:14PM (#34307634)

"As for ZFS, it's not the tech that's keeping it from Linux but the restrictive licensing."
Just to be clear: between CDDL (ZFS) and GPL (BTRFS), GPL is clearly the more restrictive license. BTRFS can probably never be shipped with any major OS other than Linux (at least not in kernel mode), while ZFS has already shipped with a few.
The license restriction is one of Linux's making, not ZFS's. There are arguments for that restriction, but calling the problem one of CDDL being restrictive is a completely distorted view.

Re:They Why ZFS? (Score:1, Informative)

by Anonymous Coward writes: on Monday November 22, 2010 @01:20PM (#34307694)

"L2ARC is a HUGE performance improvement for many workloads; it essentially allows you to use faster disks to cache the most frequently used data. If they had combined the SSD and the 7200 RPM SATA drive and benchmarked a real world workload the ZFS implementation would have probably stomped the others because it would have used the SSD for the 'hot' data. The best you can do with btrfs is to place the metadata on the SSD."
L2ARC is just another cache. The ultimate IO limits of the filesystem are still set by limitations of the final backing store.
So if you're moving lots and lots of data, the L2ARC is pretty useless.
Set yourself up a ZFS file system, then start benchmarking it. If you're running on Solaris, run something like "iostat -sndxz 1" so you can see actual IO to your physical LUNs every second. Under heavy write load, you'll see ZFS go for extended periods without writing anything, then it'll hang your box badly as it flushes to disk. That's bad for two reasons - the relatively long periods of time ZFS isn't writing are IO opportunities lost, and the hanging of the box is horrible.
ZFS's IO pattern gives away available bandwidth, and then ZFS hammers your system to its knees.
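
Both sides of the L2ARC argument fall out of the standard weighted-average model for a cache. The sketch below is generic cache arithmetic, not a measurement of ZFS; the function name and any latencies passed to it are hypothetical.

```python
def effective_latency_ms(hit_ratio, cache_ms, disk_ms):
    """Average access latency when a fraction hit_ratio of reads hit the cache."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    # Hits are served at cache speed, misses fall through to the backing disk.
    return hit_ratio * cache_ms + (1.0 - hit_ratio) * disk_ms
```

With made-up figures of 0.1 ms for the SSD cache and 10 ms for the disk, a 90% hit ratio averages about 1.09 ms per read, which is the "hot data" case. A workload that streams lots of data once has a hit ratio near zero and stays at the full 10 ms, which is the "L2ARC is pretty useless" case.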

Re:They Why ZFS? (Score:5, Informative)

by cbhacking ( 979169 ) writes: on Monday November 22, 2010 @02:57PM (#34308900) Homepage Journal

Um... WTF? Compression is a performance *improvement* and a massive one, at that. The trivial cost in CPU time is offset by the massive reduction in IO time, which is more expensive by far. This has been true since 2000 or even earlier. Modern multi-core CPUs just take the CPU penalty from negligible to nonexistent. Unless your CPU cores are all running at 100%, and possibly even if they are, compression will improve performance.
Note that this is true on a wide variety of filesystems; it's nothing special to these particular ones. Hell, NTFS has had built-in compression for a decade or more. You can improve performance on a Windows system by right-clicking the C: drive and selecting Properties -> Compress this drive. You can do it from the command line using
compact.exe /C /S:C:\ /A
This will compress all files in or under the root of the C drive, including hidden or system files (requires admin, of course) and marks all the directories so that any files written to them will also get compressed.
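
The trade-off claimed above (less I/O in exchange for some CPU) can be written as a back-of-envelope model. This is a sketch under loud assumptions: I/O and decompression are not overlapped, and every throughput figure is one you supply, not a benchmark result. The function name is hypothetical.

```python
def read_time_s(size_mb, disk_mb_s, ratio=1.0, decompress_mb_s=None):
    """Estimate seconds to read size_mb of logical data.

    ratio           -- compression ratio (2.0 means the data is half size on disk)
    decompress_mb_s -- decompression throughput; None means uncompressed data
    """
    io_time = (size_mb / ratio) / disk_mb_s  # fewer bytes come off the disk
    cpu_time = 0.0 if decompress_mb_s is None else size_mb / decompress_mb_s
    return io_time + cpu_time
```

For example, 1000 MB from a 100 MB/s disk takes 10 s uncompressed; at a 2:1 ratio with 500 MB/s decompression the model gives 5 s + 2 s = 7 s. Plug in a 1000 MB/s device instead and the same compressed read comes out slower (2.5 s versus 1 s), so the benefit depends on storage being the bottleneck.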

Re:They Why ZFS? (Score:1, Informative)

by Anonymous Coward writes: on Monday November 22, 2010 @03:35PM (#34309328)

Don't do this for any files that regularly get random writes (like, say, database files). Compression uses bigger blocks (64K I think) so a write to a single block becomes a decompression of several blocks, an update and a recompression of the blocks. Which will kill performance.
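
The read-modify-write penalty is easy to demonstrate with a toy store that compresses data in fixed 64 KiB units, the figure guessed at above. zlib is only a stand-in for the filesystem's compressor, and the class and its counter are hypothetical illustration, not how NTFS or ZFS is implemented.

```python
import zlib

CHUNK = 64 * 1024  # compression unit size, taken from the comment's guess


class CompressedStore:
    """Toy file store that compresses data in fixed-size chunks."""

    def __init__(self, data: bytes):
        self.chunks = [zlib.compress(data[i:i + CHUNK])
                       for i in range(0, len(data), CHUNK)]
        self.bytes_recompressed = 0  # work done on behalf of small writes

    def write(self, offset: int, payload: bytes):
        """Overwrite existing bytes in place (the file cannot grow here)."""
        pos, remaining = offset, payload
        while remaining:
            idx, start = divmod(pos, CHUNK)
            # The whole chunk must be decompressed to change any part of it.
            raw = bytearray(zlib.decompress(self.chunks[idx]))
            n = min(len(remaining), len(raw) - start)
            if n <= 0:
                raise ValueError("write extends past end of data")
            raw[start:start + n] = remaining[:n]
            # ...and then recompressed in full.
            self.chunks[idx] = zlib.compress(bytes(raw))
            self.bytes_recompressed += len(raw)
            pos += n
            remaining = remaining[n:]

    def read_all(self) -> bytes:
        return b"".join(zlib.decompress(c) for c in self.chunks)
```

A 4 KiB update that lands inside one chunk forces 64 KiB to be decompressed and recompressed, a 16x amplification, and a write that straddles a chunk boundary doubles that. Database files see this on nearly every write.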

Re:They Why ZFS? (Score:3, Informative)

by makomk ( 752139 ) writes: on Monday November 22, 2010 @03:41PM (#34309422) Journal

"Wrong answer. XFS is extremely prone to data corruption if the system goes down uncleanly for any reason. We may strive for nine nines, but stuff still happens. A power failure on a large XFS volume is almost guaranteed to lead to truncated files and general lost data. Not so on ZFS."
On ZFS, if the system goes down uncleanly you should avoid data corruption so long as every part of the chain from ZFS to your hard drive's platters behaves as ZFS expects and writes data in the order it wants. If it doesn't, you can easily end up with filesystem corruption that can't be repaired without dumping the entire contents of the ZFS pool to external storage, erasing it, and recreating the filesystem from scratch. If you're even more unlucky, the corruption will tickle one of the bugs in ZFS and even trying to mount the FS will cause a kernel panic, though this was more of a problem in older versions.

Re:They Why ZFS? (Score:4, Informative)

by Dhalka226 ( 559740 ) writes: on Monday November 22, 2010 @03:59PM (#34309646)

Half of which's results will be one discussion forum or another where people who are not smug asses thoughtfully took a moment to answer a person's question.
You had time to post this self-important drivel, surely you have time to answer the question as well -- but you elected for the drivel. And you think that somehow says something about the people asking the question rather than about you?

Re:They Why ZFS? (Score:3, Informative)

by segedunum ( 883035 ) writes: on Monday November 22, 2010 @04:50PM (#34310222)

"Wrong answer. XFS is extremely prone to data corruption if the system goes down uncleanly for any reason. We may strive for nine nines, but stuff still happens."
What? That's true of any filesystem, and especially ZFS as practical experience shows. The only way to reliably keep any filesystem going is to keep it on a UPS and talking about 'nine nines' in that context is just laughable.

I keep hearing this shit over and over, mostly on idiot infested Linux distribution and Solaris fanboy forums, and it's just getting unbearable to see.
"It's very simple. LVM snapshots require free volume set space. If your volume group is 10 TB, then you must leave unallocated space on it for the snapshots to consume."
You make it sound like you need an extra 10 terabytes to back up a 10 terabyte volume with LVM. You don't. It takes a snapshot and the free space you need is for further changes to the volume. ZFS is the same, except it's more intelligent about how it can use any free space over multiple volumes for snapshots, and with things like deduplication it will get much better, but you still need free space to perform them. You make it sound like ZFS snapshots are completely free, as I see many ZFS proponents saying, and it's crap. The OP is also right about the time that ZFS snapshots can take. It's far too long.
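
The point that a snapshot costs nothing up front but consumes free space as the volume changes can be shown with a toy copy-on-write model. This is a sketch of the general technique, not LVM or ZFS code; the class and method names are hypothetical.

```python
import itertools


class CowVolume:
    """Toy copy-on-write volume: snapshots share blocks until data changes."""

    def __init__(self, contents):
        self._ids = itertools.count()
        self.store = {}      # block id -> data actually occupying space
        self.live = {}       # logical block index -> block id
        self.snapshots = []  # each snapshot is a frozen copy of the block map
        for i, data in enumerate(contents):
            self.write(i, data)

    def write(self, index, data):
        # Never overwrite in place: allocate a new block and repoint the map.
        block_id = next(self._ids)
        self.store[block_id] = data
        self.live[index] = block_id
        self._free_unreferenced()

    def snapshot(self):
        # Copies only the map of references, not the data blocks.
        self.snapshots.append(dict(self.live))

    def _free_unreferenced(self):
        referenced = set(self.live.values())
        for snap in self.snapshots:
            referenced.update(snap.values())
        for block_id in list(self.store):
            if block_id not in referenced:
                del self.store[block_id]

    def blocks_in_use(self):
        return len(self.store)
```

A 10-block volume still uses 10 blocks immediately after `snapshot()`. Overwriting one block afterwards raises usage to 11, because the snapshot keeps the old block alive. That is why free space is needed in proportion to the changes made after the snapshot, not to the size of the volume.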

This is a road Btrfs will have to travel because it also has to be *the* general purpose Linux filesystem and will have to solve problems and be in places where ZFS is not.

Re:They Why ZFS? (Score:3, Informative)

by CAIMLAS ( 41445 ) writes: on Monday November 22, 2010 @07:42PM (#34312042)

"What features does ZFS have that ext4 doesn't? It's a simple question, but you had to act like an ass. Good job."
Jeez, where to start? They're night and day. EXT4 has more in common with FAT32 or UFS than it does ZFS.
It's got a handful of core features, all of which are significant on their own:
* copy-on-write, so you know your data gets committed
* integral RAID-like functionality, integrated with the filesystem. This reduces overhead and eliminates the need for archaic RAID controllers (almost) entirely (complete with their shitty firmware and quirks, etc.) - just the controller, please.
* Due to the above two, eliminates the RAID5 write hole
* instant (like, a second or two) snapshotting of very large amounts of data.
* You can transparently 'piggyback' any filesystem on top of ZFS to provide said filesystem with ZFS's protection
* Integral iSCSI provider. Nice to have with the above feature!
Shortcomings might include:
* No fdisk. IMO it's a bit of a serious limitation, but "it's not needed". Still, it can't help you recover from something like...
* The potential loss of your zpool definition file. Unlike (say) mdraid on Linux, there are no block backups within the filesystem (as far as I know), so the pool definition can conceivably be lost (if you have a backup file somewhere, it's easy enough to recover, but still...)
As for the original post's "not terribly fast" diss? Sorry, not buying it. They really needed to compare the performance against (say) other ZFS-based systems to show its utility - there are a lot of people 'forced' to use Solaris and/or FreeBSD because it's got ZFS. Another significant thing to consider will be its maturity/stability and feature-completeness (e.g. FreeBSD is a good way behind Solaris/OS/Illumos in these departments).
Finally, this is still pretty beta code. The only 'significant' not-as-good performance failure is the Postmark benchmark, which may or may not be conclusive (I don't know what it does). If you compare it to this [phoronix.com] postmark benchmark for PCBSD, it doesn't look that bad (particularly when you consider the above linked article figures are 500 points or so higher across the board than the 'new' benchmarks) - and the new implementation appears better than XFS, which is still quite a decent filesystem.
Oh, yeah - consider it's still 'beta'. Notably, considerably more 'beta' than Butter. Consider me excited. I'm not going to jump until I get fairly certain news that it's at least as stable as the FreeBSD implementation (while requiring less 'tuning' - bah!); I can do without features if it's stable. CoW and the basic RAID-like implementation on their own are enough to jump ship for.

