Data Storage IT Linux

Btrfs Is Getting There, But Not Quite Ready For Production

Posted by timothy
from the delicious-on-popcrnfs dept.
An anonymous reader writes "Btrfs is the next-gen filesystem for Linux, likely to replace ext3 and ext4 in coming years. Btrfs offers many compelling new features and development proceeds apace, but many users still aren't sure whether it's 'ready enough' to entrust their data to. Anchor, a webhosting company, reports on trying it out, with mixed feelings. Their opinion: worth a look-in for most systems, but too risky for frontline production servers. The writeup includes a few nasty caveats that will bite you on serious deployments."

Comments Filter:
  • Re:ZFS (Score:4, Interesting)

    by Bill_the_Engineer (772575) on Friday April 26, 2013 @10:30AM (#43555883)
    Incompatible license prevents ZFS inclusion with the kernel. This is why Btrfs exists and explains Oracle's involvement with both.
  • by larry bagina (561269) on Friday April 26, 2013 @10:38AM (#43555983) Journal
    Oracle now owns ZFS. They could relicense it if they wanted to. BTRFS was started before the Sun acquisition but it seems strange* to develop BTRFS as a GPL file system with ZFS-like features while ZFS is mature and reliable today.

    * Yes, they're a large corporation and right hand doesn't know what left hand does... but isn't this more like the index finger not knowing what the middle finger is doing?

  • Re:ZFS (Score:4, Interesting)

    by Chris Mattern (191822) on Friday April 26, 2013 @10:44AM (#43556091)

    Mixing licenses does not somehow make things "not production ready".

    No, using a file system that doesn't ship with the kernel makes things "not production ready." Licensing is the reason why it doesn't ship with the kernel, but it's not shipping with the kernel that keeps it out of critical production use.

  • by Anonymous Coward on Friday April 26, 2013 @11:08AM (#43556425)

    The problem with "XFS" eating data wasn't with XFS - it was with the Linux devmapper ignoring filesystem barrier requests. [nabble.com]

    Gotta love this code:

    Martin Steigerwald wrote:
    > Hello!
    >
    > Are write barriers over device mapper supported or not?

    Nope.

    see dm_request():

    /*
     * There is no use in forwarding any barrier request since we can't
     * guarantee it is (or can be) handled by the targets correctly.
     */
    if (unlikely(bio_barrier(bio))) {
            bio_endio(bio, -EOPNOTSUPP);
            return 0;
    }

    Who's the clown who thought THAT was acceptable? WHAT. THE. FUCK?!?!?!

    And it wasn't just devmapper that had such a childish attitude towards file system barriers [lwn.net]:

    Andrew Morton's response tells a lot about why this default is set the way it is:

    Last time this came up lots of workloads slowed down by 30% so I dropped the patches in horror. I just don't think we can quietly go and slow everyone's machines down by this much...

    There are no happy solutions here, and I'm inclined to let this dog remain asleep and continue to leave it up to distributors to decide what their default should be.

    So barriers are disabled by default because they have a serious impact on performance. And, beyond that, the fact is that people get away with running their filesystems without using barriers. Reports of ext3 filesystem corruption are few and far between.

    It turns out that the "getting away with it" factor is not just luck. Ted Ts'o explains what's going on: the journal on ext3/ext4 filesystems is normally contiguous on the physical media. The filesystem code tries to create it that way, and, since the journal is normally created at the same time as the filesystem itself, contiguous space is easy to come by. Keeping the journal together will be good for performance, but it also helps to prevent reordering. In normal usage, the commit record will land on the block just after the rest of the journal data, so there is no reason for the drive to reorder things. The commit record will naturally be written just after all of the other journal log data has made it to the media.

    I love that italicized part. "OMG! Data integrity causes a performance hit! Screw data integrity! We won't be able to brag that we're faster than Solaris!"
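    The ordering requirement Ts'o describes can be sketched from user space. This is a hypothetical illustration, not ext3/ext4 code: fdatasync() stands in for the barrier that keeps the commit record from reaching the platter before the log blocks it covers.

    ```c
    /* Hypothetical sketch: a journal commit record is only safe if it
     * lands on stable storage AFTER the log blocks it describes.
     * fdatasync() acts as the ordering point; without it, the drive may
     * reorder the writes, and a crash in between leaves a "valid"
     * commit record pointing at garbage. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        const char log_blocks[] = "journal log data...";
        const char commit_rec[] = "COMMIT";
        int fd = open("journal.img", O_CREAT | O_WRONLY | O_TRUNC, 0644);
        if (fd < 0) { perror("open"); return 1; }

        /* 1. Write the journal body. */
        if (write(fd, log_blocks, sizeof log_blocks - 1) < 0) return 1;

        /* 2. Barrier: force the body to stable storage before the
         *    commit record becomes visible. */
        if (fdatasync(fd) != 0) { perror("fdatasync"); return 1; }

        /* 3. Only now write the commit record. */
        if (write(fd, commit_rec, sizeof commit_rec - 1) < 0) return 1;
        fdatasync(fd);

        close(fd);
        puts("journal written in order");
        return 0;
    }
    ```

    With a contiguous journal, as Ts'o notes, the drive usually preserves this order anyway; the barrier is what makes it a guarantee rather than luck.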

    See also http://www.redhat.com/archives/rhl-devel-list/2008-June/msg00560.html [redhat.com]

    There's a lot more out there if you care to look.

    Toss in other things like the way Linux handles NFSv2 group membership (More than 16? Let's just silently drop some!) and lots of fanbois wonder why I view Linux as little better than Windows. Hell, Microsoft may fuck things up six ways from Sunday, but they're not CHILDISH when it comes to things like data integrity.

  • by Anonymous Coward on Friday April 26, 2013 @11:31AM (#43556819)

    FYI, ext4 can be larger than 16 TB, but you need a newer version of e2fsprogs than is included in a typical enterprise distribution. It's not the kernel filesystem driver that has the limitation, but the user-level utility for formatting a new filesystem.
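    A rough sketch of what that looks like in practice (the sparse loopback file is illustrative; check your distribution's e2fsprogs changelog for the exact version that added the feature):

    ```shell
    # Sketch (illustrative): the 16 TB cap lives in mke2fs, not the kernel.
    # Newer e2fsprogs can create larger ext4 filesystems via the 64bit feature.
    truncate -s 17T big.img          # sparse file, so no real 17 TB needed
    mkfs.ext4 -F -O 64bit big.img    # older mke2fs versions refuse sizes > 16 TB
    ```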

  • Re:Happy with XFS (Score:5, Interesting)

    by Kz (4332) on Friday April 26, 2013 @11:37AM (#43556909) Homepage

    You're happy with XFS because your machine has never lost power or crashed. If either of those things happened with the older versions of XFS, it was nearly a 100% guarantee you would lose data. Now I'm told it's more reliable.

    It _is_ quite reliable, even in the face of hardware failure.

    Several years ago, I hit the 8TB limit of ext3 and had to migrate to a bigger filesystem. ext4 wasn't ready back then (and even today it's not easy to use on big volumes). I'd already had bad experiences with ReiserFS (which was standard on SuSE), and the "you'll lose data" warnings in the XFS docs made me nervous. It was obviously designed for very high-end hardware, which I couldn't afford.

    So I did extensive torture testing: hundreds of pull-the-plug situations, on the host, storage box and SAN switch, with tens of processes writing thousands of files in million-file directories. It was a bloodbath.

    When the dust settled, ext3 was the best by far, never losing more than 10 small files in the worst case, and over 70% of the cases recovered cleanly. XFS was slightly worse: never more than 16 lost files and roughly 50% clean recoveries. ReiserFS was really bad, always losing 50-70 files or more and sometimes killing the volume. JFS never lost the volume, but its lost-file count never went below 130, and sometimes reached several hundred.

    Needless to say, I switched to XFS, and haven't lost a single byte yet. And yes, there have been a few hardware failures that triggered scary rebuild tasks, but they all completed cleanly.
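    For the curious, the writer side of such a pull-the-plug test can be sketched roughly like this. This is a hypothetical harness, not the one used above; the process and file counts are scaled way down for illustration.

    ```c
    /* Hypothetical sketch of a "pull the plug" torture-test writer:
     * several processes each create many small files and fsync each
     * one, so that after a power loss you can count which files
     * survived intact. All names and counts here are illustrative. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/wait.h>
    #include <unistd.h>

    #define WRITERS   4     /* "tens of processes" in the real test */
    #define FILES_PER 50    /* "thousands of files" in the real test */

    static void writer(int id)
    {
        char path[64], payload[128];
        for (int i = 0; i < FILES_PER; i++) {
            snprintf(path, sizeof path, "w%d-f%d.dat", id, i);
            int fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0644);
            if (fd < 0) exit(1);
            int n = snprintf(payload, sizeof payload,
                             "writer %d file %d\n", id, i);
            if (write(fd, payload, n) != n) exit(1);
            fsync(fd);      /* only fsynced files count as durable */
            close(fd);
        }
        exit(0);
    }

    int main(void)
    {
        for (int w = 0; w < WRITERS; w++)
            if (fork() == 0)
                writer(w);
        for (int w = 0; w < WRITERS; w++)
            wait(NULL);
        printf("wrote %d files\n", WRITERS * FILES_PER);
        return 0;
    }
    ```

    After cutting power mid-run, the recovery phase is just mounting the volume, running the filesystem's checker, and diffing the surviving files against what the writers claimed to have fsynced.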

  • by UnknownSoldier (67820) on Friday April 26, 2013 @12:47PM (#43558059)

    Please mod parent informative.

    One of the infuriating things about btrfs is that you cannot see how much disk space is being used by each subvolume. How the hell can you have a filesystem and not know how much space is in use or free??

    The design of ZFS is much more holistic. That is, when we take a step back and look at both the micro and the macro, we see that we are really trying to solve three problems:

    * Volume Management
    * File System
    * Data Integrity

    ZFS solves all of these by leveraging knowledge from ALL the layers as one cohesive whole.
    https://blogs.oracle.com/bonwick/en_US/entry/rampant_layering_violation [oracle.com]

    Why RAID is fundamentally broken
    https://blogs.oracle.com/bonwick/entry/raid_z [oracle.com]

    Another interesting doc
    http://www.scribd.com/doc/43973847/5/ZFS-Design-Principles [scribd.com]

  • by Luke_22 (1296823) on Friday April 26, 2013 @12:47PM (#43558065)

    I tried btrfs as my main laptop filesystem:

    nice features, speed OK, but I happened to unplug the power supply by mistake, with no battery. Bad crash... I tried btrfsck and the other debug tools, even from the "dangerdon'teveruse" git branch; they just segfaulted. In the end my filesystem was unrecoverable. I used btrfs-restore, only to find out that 90% of my files had been truncated to 0... even files I hadn't used for months...

    Now, maybe it was the compress=lzo option, or maybe I played a little too much with the repair tools (possible), but until btrfs can sustain power drops without problems, and the repair tools at least stop segfaulting, I won't use it for my main filesystem...

    btrfs is supposed to save a consistent state every 30 seconds, so I don't understand how I messed things up that badly... Maybe the superblock was gone and btrfsck --repair borked everything, I don't know... Luckily for me: backups :)

  • Re:Happy with XFS (Score:4, Interesting)

    by bored (40072) on Friday April 26, 2013 @03:20PM (#43560289)

    Without sounding like too much of a jerk, I have hundreds of commits in the linux-2.6 fs/* tree. This is what I do for a living.

    Well, then you're part of the problem. Your idea that you have to choose between correct and fast is, sadly, mostly wrong. It's possible to be correct without completely destroying performance. I have a few commits in the kernel as well, mostly fixing completely broken behavior (my day job in the past was working on an enterprise Unix), so I understand filesystems too. Lately, my job has been to replace all that garbage, from the SCSI midlayer up, so that a small industry-specific "application" can make guarantees about the data being written to disk while still sustaining many GB/sec of I/O. The result actually makes the whole stack look really bad.

    So I'm sure you're aware that on Linux, if you use proper POSIX semantics (fsync() and friends), the performance is abysmal compared to the alternatives. This is mostly because of the "broken" fencing behavior in the block layer (which has recently gotten better but is still far from perfect). Our changes depend on 8-10-year-old SCSI features to make guarantees that aren't available everywhere, but they penalize devices which don't support modern tagging, ordering and fencing semantics, rather than the ones that do.

    Generally on Linux, application developers are stuck either eating an orders-of-magnitude performance loss or playing games to second-guess the filesystem. Neither is a good compromise, and it's sort of shameful.

    Maybe it's time to admit Linux needs a filesystem that doesn't force people to choose between abysmal performance and no guarantees about integrity.
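    For reference, the "proper POSIX semantics (fsync() and friends)" mentioned above amount to something like the following sketch: durably creating a file means syncing both the file and the directory that names it, and each sync is a round trip to stable storage, which is exactly where the performance cost comes from.

    ```c
    /* Sketch of durable file creation with POSIX semantics: fsync the
     * file for its data and inode, then fsync the parent directory so
     * the directory entry itself survives a crash. Function and file
     * names here are illustrative. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int durable_write(const char *dir, const char *name,
                      const void *buf, size_t len)
    {
        char path[256];
        snprintf(path, sizeof path, "%s/%s", dir, name);

        int fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0644);
        if (fd < 0) return -1;
        if (write(fd, buf, len) != (ssize_t)len) { close(fd); return -1; }
        if (fsync(fd) != 0) { close(fd); return -1; }  /* data + inode */
        close(fd);

        int dfd = open(dir, O_RDONLY | O_DIRECTORY);
        if (dfd < 0) return -1;
        int rc = fsync(dfd);                           /* directory entry */
        close(dfd);
        return rc;
    }

    int main(void)
    {
        const char msg[] = "hello\n";
        if (durable_write(".", "durable.txt", msg, sizeof msg - 1) == 0)
            puts("durable.txt committed");
        return 0;
    }
    ```

    Two device flushes per small file is why fsync-heavy workloads crater when the block layer implements each one as a full cache flush instead of an ordered barrier.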
