Denial-of-Service Attack Found In Btrfs File-System



Posted by timothy on Friday December 14, 2012 @09:24PM from the at-that-range-a-hammer-works-too dept.

An anonymous reader writes "It's been found that the Btrfs file-system is vulnerable to a Hash-DOS attack, a denial-of-service attack caused by hash collisions within the file-system. Two DOS attack vectors were uncovered by Pascal Junod that he described as causing astonishing and unexpected success. It's hoped that the security vulnerability will be fixed for the next Linux kernel release." The article points out that these exploits require local access.
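To illustrate why hash collisions turn into a denial of service, here is a minimal sketch of a chained hash table fed with colliding keys. It is not Btrfs code: `weak_hash`, `ChainedTable` and the bucket count are hypothetical, chosen so that collisions are trivial to produce. Btrfs itself hashes directory entry names with CRC32C, where collisions take more effort to find but the effect on lookups is the same in spirit.

```python
from itertools import permutations


def weak_hash(name: str) -> int:
    """Deliberately weak, hypothetical hash: character order is ignored."""
    return sum(name.encode()) & 0xFFFFFFFF


class ChainedTable:
    """Toy hash table with chaining that counts key comparisons."""

    def __init__(self, buckets: int = 1024):
        self.buckets = [[] for _ in range(buckets)]
        self.comparisons = 0

    def insert(self, name: str) -> None:
        chain = self.buckets[weak_hash(name) % len(self.buckets)]
        for existing in chain:  # walk the chain looking for a duplicate
            self.comparisons += 1
            if existing == name:
                return
        chain.append(name)


def comparisons_for(names) -> int:
    """Insert all names into a fresh table and return the comparison count."""
    table = ChainedTable()
    for name in names:
        table.insert(name)
    return table.comparisons


# 720 names that all land in the same bucket (anagrams of one string).
colliding = ["".join(p) for p in permutations("abcdef")]
# 720 names whose hashes fall into 720 distinct buckets.
spread = ["a" * k for k in range(1, 721)]

print(comparisons_for(colliding), comparisons_for(spread))
```

With the same number of names, the colliding set forces a walk over an ever longer chain on each insert, so the work grows quadratically, while the spread set needs no comparisons at all.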



Who ported btrfs to DOS? (Score:5, Funny)

by Nimey ( 114278 ) writes: on Friday December 14, 2012 @09:27PM (#42297559) Homepage Journal

and should we give him a medal or lynch him?

- Re:Who ported btrfs to DOS? (Score:5, Funny)
  
  by macraig ( 621737 ) writes: <mark DOT a DOT craig AT gmail DOT com> on Friday December 14, 2012 @09:46PM (#42297721)
  
  Do I have to choose? Can I hang a medal on him, and then hang him? I'll make the medal 20 pounds to speed up the lynching.
  
- - Re: (Score:3, Informative)
    
    by maxwell demon ( 590494 ) writes:
    
    DOS = Disk Operating System
    DoS = Denial of Service
    - Re: (Score:3, Funny)
      
      by byornski ( 1022169 ) writes:
      
      DOS = Density of States [wikipedia.org]
    - - Re: (Score:2)
        
        by Pf0tzenpfritz ( 1402005 ) writes:
        
        DOS++; if (!DOS==TRES) { return ("unit test failed"); }
Can we get a real Linux filesystem, please? (Score:5, Interesting)

by Anonymous Coward writes: on Friday December 14, 2012 @09:35PM (#42297625)

btrfs is a step in the right direction, but even now, Linux does not have production-level deduplication (which even Windows has, for crying out loud), encryption, snapshots, or something even close to supplanting LVM2.
I just got out of a meeting at my job because we are replacing some old large servers... and because Linux has no stable filesystem with enterprise features, looks like things are either going to Windows, or perhaps Solaris x86 (which is expensive.)
This doesn't mean to suck Sun's teat for ZFS access... but at least try to come close to what even NTFS or even ReFS offers...

- Re:Can we get a real Linux filesystem, please? (Score:5, Informative)
  
  by Anonymous Coward writes: on Friday December 14, 2012 @10:02PM (#42297855)
  
  ZFS on FreeBSD or FreeNAS is great. Easily saturates gigE with a simple mirror of recent 7200rpm disks. It scales up from there, and FreeBSD is pretty rock solid.
  
  - - Re:Can we get a real Linux filesystem, please? (Score:4, Informative)
      
      by LordLimecat ( 1103839 ) writes: on Saturday December 15, 2012 @01:26AM (#42299005)
      
      FAT32 is going to be faster than a LOT of filesystems precisely because it lacks features like dedup, any notion of real ACLs, and, oh, I don't know, data integrity. That's why if you want a really fast RAMDisk, you don't use NTFS or ReFS, you use FAT16 or FAT32.
      
      - Re: (Score:2)
        
        by bzipitidoo ( 647217 ) writes:
        
        Last time I ran a benchmark, FAT was by far the slowest file system. Ext2, 3 and 4, Reiser 3 and 4, btrfs, xfs, jfs, and even ntfs were all much faster. Each varied on different kinds of loads, but the differences between them were insignificant next to the difference in speed between all of them and FAT. Simplicity often doesn't translate to speed. FAT does many things in brain-dead ways. Let's rewrite the entire file for every tiny change, and do it right away, no caching! Insert a little something i
        
        Re: (Score:2)
        
        by LordLimecat ( 1103839 ) writes:
        
        If ext3 is showing as faster than FAT in your benchmarks, your benchmarks are horribly flawed. A non-journaling filesystem with real metadata is going to be oodles faster than any journaling filesystem.
        Heck, I wouldn't be surprised if NTFS were competitive with ext3; ext3 isn't exactly known as a speed demon.
        
        Re: (Score:2)
        
        by donaldm ( 919619 ) writes:
        
        Actually ext4 is way faster than ext3 and has been out a few years now. I make all my file-systems ext4 including my backup disks and have never had any issues. The only thing I have FAT on is some flash drives that I sometimes use to transfer files to MS Windows machines and it is rare for me to go the other way since I normally don't have anything I want on MS Windows machines.
        
        I did try BtrFS about a year ago but for home use I found it not worth the effort (actually it is really easy) and I am very fam
- Re:Can we get a real Linux filesystem, please? (Score:4, Interesting)
  
  by Anonymous Coward writes: on Friday December 14, 2012 @10:07PM (#42297893)
  
  btrfs is a step in the right direction, but even now, Linux does not have production-level deduplication (which even Windows has, for crying out loud), encryption, snapshots, or something even close to supplanting LVM2.
  I just got out of a meeting at my job because we are replacing some old large servers... and because Linux has no stable filesystem with enterprise features, looks like things are either going to Windows, or perhaps Solaris x86 (which is expensive.)
  This doesn't mean to suck Sun's teat for ZFS access... but at least try to come close to what even NTFS or even ReFS offers...
  Hear hear! Backup admin here, just want to add before the unwashed masses of armchair Linux admins show up, one example of an enterprise filesystem feature is the NTFS change journal. It makes the file system scan as part of an incremental backup run in constant time.
  It's sad on other systems with large numbers of files to schedule subdirectories for different times of day to deal with scanning overhead.
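The difference between a full scan and a change-journal read can be sketched with a toy model. This is not the NTFS journal API: `JournaledStore` and its methods are hypothetical, and the "cost" is simply the number of entries inspected. Strictly speaking the journal read is proportional to the number of changes rather than constant, but it no longer depends on the total file count.

```python
class JournaledStore:
    """Toy file store that records every modified path in a change journal."""

    def __init__(self):
        self.files = {}    # path -> (content, logical modification time)
        self.journal = []  # append-only list of modified paths
        self.clock = 0

    def write(self, path: str, content: str) -> None:
        self.clock += 1
        self.files[path] = (content, self.clock)
        self.journal.append(path)

    def changed_by_scan(self, since: int):
        """Full scan: inspects every file, so cost grows with total file count."""
        inspected = len(self.files)
        changed = {p for p, (_, mtime) in self.files.items() if mtime > since}
        return changed, inspected

    def changed_by_journal(self, position: int):
        """Journal read: inspects only records appended after `position`."""
        records = self.journal[position:]
        return set(records), len(records)


store = JournaledStore()
for i in range(10_000):
    store.write(f"/data/file{i}", "v1")

# Remember where the last backup finished.
last_backup_clock, last_backup_pos = store.clock, len(store.journal)

store.write("/data/file42", "v2")
store.write("/data/file7", "v2")

scan_changed, scan_cost = store.changed_by_scan(last_backup_clock)
journal_changed, journal_cost = store.changed_by_journal(last_backup_pos)
print(scan_cost, journal_cost)
```

Both approaches find the same two changed files; the scan has to look at all 10,000 entries to do so, the journal read only at the two new records.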
  
  - Re:Can we get a real Linux filesystem, please? (Score:5, Informative)
    
    by Tough Love ( 215404 ) writes: on Friday December 14, 2012 @10:22PM (#42297977)
    
    NTFS doesn't have snapshots. Instead it relies on volume shadow copies, with known severe performance artifacts caused by needing to move snapshotted data out of the way when new writes come in. Btrfs, like ZFS and NetApp's WAFL, uses a far more efficient copy-on-write strategy that avoids the write penalty. The takeaway: I would not go so far as to claim Microsoft has an enterprise-worthy solution either. If you want something with industrial-strength dedup, snapshots and fault tolerance, you won't be getting it from Microsoft.
    
    - Re:Can we get a real Linux filesystem, please? (Score:4, Insightful)
      
      by jamesh ( 87723 ) writes: on Saturday December 15, 2012 @12:08AM (#42298631)
      
      NTFS doesn't have snapshots. Instead it relies on volume shadow copies, with known severe performance artifacts caused by needing to move snapshotted data out of the way when new writes come in. Btrfs, like ZFS and NetApp's WAFL, uses a far more efficient copy-on-write strategy that avoids the write penalty. The takeaway: I would not go so far as to claim Microsoft has an enterprise-worthy solution either. If you want something with industrial-strength dedup, snapshots and fault tolerance, you won't be getting it from Microsoft.
      What nonsense. VSS is the snapshot solution for NTFS, and of course it uses copy-on-write. Microsoft VSS backup architecture is years ahead of Linux... LVM is kind of cool but if you have a single database spread across multiple LV's then you can't snapshot them all as an atomic operation so it becomes useless. MS VSS does this, and always has.
      I'm normally a Linux fanboi but when you sprout rubbish like this I have no hesitation in correcting you.
      
      - Re: (Score:3, Informative)
        
        by Anonymous Coward writes:
        
        Tried to find some more information on this. First discovery: VSS stands for "Volume Shadow copy Service", not "Visual SourceSafe", as was my first association. :)
        AFAICT he's saying pretty much what Microsoft is saying [microsoft.com]:
        When a change to the original volume occurs, but before it is written to disk, the block about to be modified is read and then written to a "differences area", which preserves a copy of the data block before it is overwritten with the change. Using the blocks in the differences area and unchanged blocks in the original volume, a shadow copy can be logically constructed that represents the shadow copy at the point in time in which it was created.
        The disadvantage is that in order to fully restore the data, the original data must still be available. Without the original data, the shadow copy is incomplete and cannot be used. Another disadvantage is that the performance of copy-on-write implementations can affect the performance of the original volume.
        Do you have a newer reference?
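The quoted Microsoft description can be modelled in a few lines. This is only a sketch of the mechanism as the quote describes it, not actual VSS code; `ShadowCopyVolume` and its methods are hypothetical names, and only a single snapshot is supported.

```python
class ShadowCopyVolume:
    """Toy block volume with one snapshot backed by a differences area."""

    def __init__(self, blocks):
        self.blocks = list(blocks)  # live volume, always updated in place
        self.diff_area = None       # block index -> preserved old content

    def snapshot(self) -> None:
        self.diff_area = {}         # starts empty: nothing is copied yet

    def write(self, index: int, data: str) -> None:
        if self.diff_area is not None and index not in self.diff_area:
            # Preserve the old block before its first overwrite.
            self.diff_area[index] = self.blocks[index]
        self.blocks[index] = data   # then overwrite the original location

    def read_snapshot(self):
        """Combine preserved blocks with unchanged blocks of the live volume."""
        if self.diff_area is None:
            raise ValueError("no snapshot has been taken")
        return [self.diff_area.get(i, b) for i, b in enumerate(self.blocks)]


vol = ShadowCopyVolume(["a", "b", "c", "d"])
vol.snapshot()
vol.write(1, "B")
vol.write(1, "B2")  # second write to the same block needs no further copy
vol.write(3, "D")
print(vol.read_snapshot(), vol.blocks, vol.diff_area)
```

The snapshot view is rebuilt from the two preserved blocks plus the untouched blocks of the live volume, which is why the quoted text says the shadow copy is incomplete without the original data.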
        
        Re: (Score:3)
        
        by Tough Love ( 215404 ) writes:
        
        If you keep all the old files in their original sectors and write changes in new places, your files get fragmented to hell.
        Microsoft's "shadow copy" doesn't work at the file level, it works at the block level, so it doesn't know anything about files. Btrfs and its ilk try to leave some empty space distributed across the volume, so copy-on-write can leave the copies in fairly reasonable places. After the copy is committed, the original space can be freed, so the next update won't mess things up too badly either. Snapshots mess this up because the original space doesn't get freed. But then, snapshots are always messed up, there i
      - Re:Can we get a real Linux filesystem, please? (Score:5, Informative)
        
        by Tough Love ( 215404 ) writes: on Saturday December 15, 2012 @03:17AM (#42299481)
        
        VSS is the snapshot solution for NTFS, and of course it uses copy-on-write
        Well. Maybe you better sit down in a comfortable chair and think about this a bit. From Microsoft's site: When a change to the original volume occurs, but before it is written to disk, the block about to be modified is read and then written to a “differences area”, which preserves a copy of the data block before it is overwritten with the change. [microsoft.com]
        Think about what this means. It is not a "copy-on-write", it is a "copy-before-write". Gross abuse of terminology if anybody tries to call it a "copy-on-write", which has the very specific meaning [wikipedia.org] of "don't modify the destination data". Instead, copy it, then modify the copy. OK, are we clear? VSS does not do copy-on-write, it does copy-before-write.
        Now let's think about the implications of that. First, the write needs to be blocked until the copy-before-write completes, otherwise the copied data is not sure to be on stable storage. The copy-before-write needs to read the data from its original position, write it to some save area, then update some metadata to remember which data was saved where. How many disk seeks is that, if it's a spinning disk? If the save area is on the same spinning disk? If it's flash, how much write amplification is that? When all of that is finally done, the original write can be unblocked and allowed to proceed. In total, how much slower is that than a simple, linear write? If you said "on the order of an order of magnitude" you would be in the ballpark. In fact, it can get way worse than that if you are unlucky. In the best imaginable case, your write performance is going to take a hit by a factor of three. Usually, much much worse.
        OK, did we get this straight? As a final exercise, see if you can figure out who was talking nonsense.
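The operation counts argued above can be tallied in a rough sketch. It counts I/O operations only, ignoring seeks, caching and batching of metadata, so it illustrates the argument rather than measuring anything. `IoCounter` and both functions are hypothetical names; "redirect-on-write" is used here for the variant that leaves the original block untouched.

```python
from dataclasses import dataclass


@dataclass
class IoCounter:
    reads: int = 0
    writes: int = 0
    metadata_updates: int = 0

    @property
    def data_io(self) -> int:
        """Block reads plus block writes, without metadata updates."""
        return self.reads + self.writes


def copy_before_write(blocks_to_overwrite: int) -> IoCounter:
    """First overwrite of each snapshotted block: read old, save old, record, write new."""
    io = IoCounter()
    for _ in range(blocks_to_overwrite):
        io.reads += 1             # read the original block
        io.writes += 1            # write it to the save area
        io.metadata_updates += 1  # remember which block was saved where
        io.writes += 1            # only now perform the original write in place
    return io


def redirect_on_write(blocks_to_overwrite: int) -> IoCounter:
    """Write the new data elsewhere and repoint; the old block is left untouched."""
    io = IoCounter()
    for _ in range(blocks_to_overwrite):
        io.writes += 1            # write new data to a free block
        io.metadata_updates += 1  # point the file at the new block
    return io


cbw = copy_before_write(100)
row = redirect_on_write(100)
print(cbw.data_io, row.data_io)
```

Under these assumptions the copy-before-write path issues three block I/Os for every one issued by the redirecting path, which matches the "factor of three" best case in the comment; seek costs on spinning disks come on top of that.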
        
        
        Re:Can we get a real Linux filesystem, please? (Score:5, Insightful)
        
        by jamesh ( 87723 ) writes: on Saturday December 15, 2012 @04:04AM (#42299661)
        
        VSS is the snapshot solution for NTFS, and of course it uses copy-on-write
        Well. Maybe you better sit down in a comfortable chair and think about this a bit. From Microsoft's site: When a change to the original volume occurs, but before it is written to disk, the block about to be modified is read and then written to a “differences area”, which preserves a copy of the data block before it is overwritten with the change. [microsoft.com]
        Think about what this means. It is not a "copy-on-write", it is a "copy-before-write". Gross abuse of terminology if anybody tries to call it a "copy-on-write", which has the very specific meaning [wikipedia.org] of "don't modify the destination data". Instead, copy it, then modify the copy. OK, are we clear? VSS does not do copy-on-write, it does copy-before-write.
        Now let's think about the implications of that. First, the write needs to be blocked until the copy-before-write completes, otherwise the copied data is not sure to be on stable storage. The copy-before-write needs to read the data from its original position, write it to some save area, then update some metadata to remember which data was saved where. How many disk seeks is that, if it's a spinning disk? If the save area is on the same spinning disk? If it's flash, how much write amplification is that? When all of that is finally done, the original write can be unblocked and allowed to proceed. In total, how much slower is that than a simple, linear write? If you said "on the order of an order of magnitude" you would be in the ballpark. In fact, it can get way worse than that if you are unlucky. In the best imaginable case, your write performance is going to take a hit by a factor of three. Usually, much much worse.
        OK, did we get this straight? As a final exercise, see if you can figure out who was talking nonsense.
        I concede that the terminology used by the MS article is misused. I don't think you're thinking the performance issues through though. You start with a file nicely laid out linearly on disk, and you take a snapshot so you can make a backup. Now you make a modification to the middle of the file and what happens? Suddenly the middle of the file is elsewhere on disk, and in the case of LVM this is invisible to the filesystem so no amount of defragging is going to fix it. This situation persists long after you have taken your backup and thrown the snapshot away. Of course this doesn't matter for flash but we're not all there yet. If BTRFS does snapshots using copy-on-write (correct definition) then this will be a problem too, although if BTRFS is smart enough it should be able to repair the situation once the snapshot is discarded.
        VSS's way leaves the original data in-order on the storage medium. The difference area is likely on a completely different disk anyway so the copy-on-write (MS definition) could not be performed any other way.
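The fragmentation point can be made concrete by counting extents, that is, contiguous physical runs a reader has to visit to read the file in logical order. The block numbers below are made up for illustration and `count_extents` is a hypothetical helper.

```python
def count_extents(physical_blocks):
    """Number of contiguous physical runs needed to read the file in logical order."""
    extents = 1
    for prev, cur in zip(physical_blocks, physical_blocks[1:]):
        if cur != prev + 1:
            extents += 1
    return extents


# A 10-block file laid out linearly at physical blocks 100..109.
layout = list(range(100, 110))

# Redirecting update while a snapshot pins the old blocks:
# logical blocks 4 and 5 are rewritten to free space at 500 and 501.
redirected = layout.copy()
redirected[4], redirected[5] = 500, 501

# Copy-before-write update: the old contents go to a difference area and the
# new data overwrites blocks 104 and 105 in place, so the live layout is unchanged.
in_place = layout.copy()

print(count_extents(layout), count_extents(redirected), count_extents(in_place))
```

A single modification in the middle splits the redirected file into three runs, while the in-place layout stays at one. On a spinning disk each extra run is an extra seek on every later sequential read.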
        
        
        Re: (Score:3, Informative)
        
        by Tough Love ( 215404 ) writes:
        
        Modifications in the middle of files are extremely rare. It's true, running a database on top of a snapshotted spinning disk is probably going to suck. For normal users, keeping regular files mostly linear, and files in the same directory nearby each other is what matters, and yes, Btrfs does a credible job of that.
        I know why shadow copy works the way it does. 1) It's simple, therefore likely to work. 2) It's an easy answer to the "how do you control fragmentation" question. But the write performance issue
        
        Re: (Score:2)
        
        by Tough Love ( 215404 ) writes:
        
        Dear Microsoft spinmods: you don't change the fact that your volume snapshots suck by modding down my post.
        
        Re: (Score:2)
        
        by jamesh ( 87723 ) writes:
        
        Dear Microsoft spinmods: you don't change the fact that your volume snapshots suck by modding down my post.
        Troll is a little harsh... I disagree with you but I know you're not trolling and the discussion is still an Interesting one.
        
        Re: (Score:2)
        
        by Tough Love ( 215404 ) writes:
        
        Sure, Microsoft abuses the CoW terminology and Wikipedia documents that. More politely than necessary, IMHO.
        Copy-on-write leaves the original data unchanged. Copy on write makes a private copy, leaving the original unchanged. Microsoft has a different definition, but then Microsoft has a lot of different definitions. Let's you and me be precise about it, and avoid the terminology that Microsoft has wantonly polluted in its ignorance. Copy-before-write or redirect-on-write.
        
        Re: (Score:2)
        
        by Tough Love ( 215404 ) writes:
        
        You say: "Copy-on-write leaves the original data unchanged" and VSS leaves the original data unchanged.
        the implementation details doesn't change the logical concept. COW says: "The fundamental idea is that if multiple callers ask for resources which are initially indistinguishable, they can all be given pointers to the same resource" but it looks like you don't get it.
        I did not say that VSS leaves the original data unchanged, I said the opposite. And this is not an "implementation detail", it's a fundamental property of the operation. And could you please read the next sentence after the one you quoted from Wikipedia, it invalidates your argument. And could you please stop chewing on my toes and learn something about computer science.
      - Re: (Score:2)
        
        by Tough Love ( 215404 ) writes:
        
        LVM is kind of cool but if you have a single database spread across multiple LV's then you can't snapshot them all as an atomic operation so it becomes useless.
        You're also wrong about that. You can concatenate multiple logical volumes as a single logical volume and snapshot that atomically.
        
        Re: (Score:2)
        
        by jamesh ( 87723 ) writes:
        
        LVM is kind of cool but if you have a single database spread across multiple LV's then you can't snapshot them all as an atomic operation so it becomes useless.
        You're also wrong about that. You can concatenate multiple logical volumes as a single logical volume and snapshot that atomically.
        OK this is news to me. When I last asked about that it couldn't be done but that was a few years ago. Google doesn't tell me how I can concatenate (say) my database lv and my logs lv (separate vg's because separate spindles), snapshot them, then un-concatenate them... a link would be appreciated.
        
        Re: (Score:2)
        
        by Tough Love ( 215404 ) writes:
        
        lvm lets you concatenate any block devices into a virtual block device
        
        Re: (Score:2)
        
        by Tough Love ( 215404 ) writes:
        
        Which of course you can do that, but then you can't have the database LV and the log LV on different physical disks any more, which is what was asked. Can you post an example how you would concatenate two existing LVs, with existing file systems on them, mounted and being modified at the time, into a "new virtual block device" without even un-mounting them, and then make a consistent snapshot of them?
        You're delusional, "without even unmounting them" appeared nowhere in the discussion above, nor did the concept of making separate filesystems work together atomically. Your assertion about "different physical disks" doesn't make any sense at all. Of course you can combine different physical disks into a single logical volume. You would then create a single filesystem on the logical volume. Look here [debian-adm...ration.org] for examples.
        
        Re: (Score:2)
        
        by MikeBabcock ( 65886 ) writes:
        
        Totally aside from your main point, what does the spindle count have to do with your VG naming?
        pvcreate /dev/sda1
        pvcreate /dev/sdb1
        pvcreate /dev/sdc1
        vgcreate LotsOfDrives /dev/sda1 /dev/sdb1 /dev/sdc1
        Now if you want spindle-specific LVs:
        lvcreate -n dbdata -l 100%PVS LotsOfDrives /dev/sdb1
        lvcreate -n logdata -l 100%PVS LotsOfDrives /dev/sdc1
        
        Re: (Score:3)
        
        by jamesh ( 87723 ) writes:
        
        I'm still not getting how you can simultaneously snapshot dbdata (optimised for read and write) and logdata (optimised for write) as an atomic operation. "Tough Love (215404)" said "concatenate them together" but I don't get what that means in this context.
        Last time I checked you would still have to snapshot one, then the other, and the resulting snapshots are almost certainly not going to give you a consistent backup because there would have been writes between the first and the second snapshots.
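The consistency problem with taking two snapshots one after the other can be shown with a toy model. It does not use LVM at all: `Volumes`, `commit` and `consistent` are hypothetical, and the invariant assumed is the usual write-ahead one, namely that every transaction visible in the data volume must also be present in the log volume.

```python
class Volumes:
    """Two toy volumes: a write-ahead log and the database it protects."""

    def __init__(self):
        self.log = []  # transaction ids recorded on the log volume
        self.db = []   # transaction ids applied to the data volume

    def commit(self, txid: int) -> None:
        self.log.append(txid)  # log first ...
        self.db.append(txid)   # ... then apply to the data volume


def consistent(log_snapshot, db_snapshot) -> bool:
    """Every transaction visible in the data must also be in the log."""
    return set(db_snapshot) <= set(log_snapshot)


vols = Volumes()
vols.commit(1)
vols.commit(2)

# Sequential snapshots: a commit slips in between the two.
log_snap = list(vols.log)
vols.commit(3)
db_snap = list(vols.db)

# Atomic snapshot: both volumes are captured at the same instant.
atomic_log, atomic_db = list(vols.log), list(vols.db)

print(consistent(log_snap, db_snap), consistent(atomic_log, atomic_db))
```

The sequential pair contains transaction 3 in the data snapshot but not in the log snapshot, so a restore from it would not be a state the database ever passed through. Capturing both volumes at one instant avoids that.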
      - Re: (Score:2)
        
        by jamesh ( 87723 ) writes:
        
        Why would you spread a database over multiple Logical Volumes. That just sounds like a poorly engineered LVM setup. Am I wrong?
        The idea is to spread it over separate underlying disks or RAID sets. MSSQL and Exchange transaction logs are pretty much write only. The databases themselves are read/write, obviously, but still might be read-mostly or write-mostly. By putting them on separate arrays you can optimize the caching, RAID type, and RAID stripe size in each array for its intended purpose. Even spreading different database tables over different arrays can help too depending on the usage patterns.
        Oracle have the similar recommen
    - Re: (Score:2)
      
      by belrick ( 31159 ) writes:
      
      Btrfs, like ZFS and Netapp's WAFL, use a far more efficient copy-on-write strategy that avoids the write penalty.
      WAFL doesn't do copy-on-write. Copy-on-write means a write to a block in a file requires the original block to be read, written elsewhere for the snapshot, then the new block written in the original location. That's exactly what WAFL doesn't do. WAFL writes all changed blocks for multiple files in big RAID stripes, updating pointers to current copies and leaving snapshot pointers pointing to old copies of the updated files. Very efficient for writes, but changes almost all reads, random or sequential (w
      - Re: (Score:2)
        
        by Tough Love ( 215404 ) writes:
        
        Btrfs, like ZFS and Netapp's WAFL, use a far more efficient copy-on-write strategy that avoids the write penalty.
        WAFL doesn't do copy-on-write. Copy-on-write means a write to a block in a file requires the original block to be read, written elsewhere for the snapshot, then the new block written in the original location. That's exactly what WAFL doesn't do. WAFL writes all changed blocks for multiple files in big RAID stripes, updating pointers to current copies and leaving snapshot pointers pointing to old copies of the updated files. Very efficient for writes, but changes almost all reads, random or sequential (within a file) into random reads (within the filesystem) because file blocks get scattered according to write order, not location of the block within the file. That's why they want lots of spindles in an aggregate and they love RAM cache and flash cache.
        But since you say that copy-on-write avoids the write penalty I think you know what it does but simply don't know that it isn't copy-on-write.
        We both know what we're talking about, we just disagree on terminology. Properly, a "copy-on-write" doesn't modify the original destination. [wikipedia.org] Nobody should ever use the term "copy-on-write" to describe the algorithm that is properly "copy-before-write". The strategy that leaves the original destination untouched and updates pointers to point at the modified copy is correctly called "copy-on-write", but because the terminology has been so commonly abused by the likes of Microsoft and their followers, it is be
    - - Re: (Score:2)
        
        by Tough Love ( 215404 ) writes:
        
        Sounds like Btrfs envy. Question is, can they get to work reliably?
        Here is an informative post [blogspot.com] that details why Microsoft's Refs sucks and you don't need to care about it. Even if it works reliably, which is not at all assured (see many reports on the net of issues) this filesystem is pathetically feature poor. What's the point.
  - Enterprise architect here (Score:2)
    
    by iggymanz ( 596061 ) writes:
    
    Deduplication typically isn't done by the operating system in production systems, it is a feature of enterprise grade storage, backup and archival systems.
    Snapshots and encryption can be done in GNU/Linux, or done outside the OS.
    What enterprise grade storage/backup/archival systems are you using, the obvious solution will already be evident from that answer in most cases.
- Re: (Score:2, Interesting)
  
  by Anonymous Coward writes:
  
  Wouldn't it be cheaper and just as effective to use FreeBSD or FreeNAS for your data? If you're considering either Windows or Solaris then obviously you don't need a specific operating system. I would think FreeBSD (or even ZFS on Linux) would suit your purposes better (and with less expense) than Windows or Solaris.
- Re:Can we get a real Linux filesystem, please? (Score:5, Informative)
  
  by maz2331 ( 1104901 ) writes: on Friday December 14, 2012 @10:51PM (#42298139)
  
  ZFS on Linux does exist as a kernel module that is pretty stable and works well. http://zfsonlinux.org/ [zfsonlinux.org] -- it was put out by Lawrence Livermore National Lab, but can't be included with the kernel distros due to GPL / CDDL license compatibility issues.
  
- Re: (Score:2, Informative)
  
  by Anonymous Coward writes:
  
  Linux has production level encryption, snapshots, and LVM2. What are you talking about?
  Unless you have very specific uses, deduplication should be done at your storage array really. It's not a high priority to implement in the filesystem. (No, your anecdote does not make it a high priority).
- Re: (Score:2)
  
  by WWJohnBrowningDo ( 2792397 ) writes:
  
  Did you guys look at FreeBSD?
- No (Score:3, Interesting)
  
  by ArchieBunker ( 132337 ) writes:
  
  Instead of picking a filesystem and moving forward people will moan and cry and eventually split into a few different groups with beta level implementations. Sound on Linux is a great example. Two completely different sound drivers that both work half assed. What's the word with XFS these days?
  - Re: (Score:2)
    
    by drinkypoo ( 153816 ) writes:
    
    What's the word with XFS these days?
    I don't know, but my last word is that I dropped it due to data corruption and now I'm using ext4 while I'm waiting for btrfs.
    I was hoping to be using bcache by now too, but alas, no. I have an 80GB SSD and a 320GB HDD, which I will bump up to 2x1TB stripe and backup to 2TB external... just as soon as I can install with bcache without having to do it all manually.
    - Re: (Score:2)
      
      by Wolfrider ( 856 ) writes:
      
      --Have you tried JFS? I'm a heavy Vmware user and it works really well, with minimal CPU usage.
  - Re: (Score:2)
    
    by diegocg ( 1680514 ) writes:
    
    What's the word with XFS these days?
    http://www.youtube.com/watch?v=FegjLbCnoBw [youtube.com]
- Re: (Score:2)
  
  by guruevi ( 827432 ) writes:
  
  Solaris and its derivatives can be had for free. You don't HAVE to buy it, and its derivatives like OpenIndiana are very stable.
  - Re: (Score:2)
    
    by iggymanz ( 596061 ) writes:
    
    OpenSolaris is long dead. OpenIndiana has never put out a stable release and never met their 2011 Q1 stable release target. They put out a development release once in a while, but that is NOT production grade nor maintained at a level suitable for production use.
- Re: (Score:2)
  
  by MikeBabcock ( 65886 ) writes:
  
  Funny, even my home box uses LVM over dm-crypt over RAID on Linux just fine. And that's with Ext4 file systems.
  LVM lets me create a snapshot for consistent backups any time I want.
- Re: (Score:2)
  
  by LWATCDR ( 28044 ) writes:
  
  I would say that you should look at BSD then. If you are willing to go open source anyway FreeBSD offers ZFS. Too bad that more hardware and software companies do not support BSD as well as Linux.
- Re: (Score:2)
  
  by T-Ranger ( 10520 ) writes:
  
  If you ... your employer ... are prepared to spend money, then why not spend money? I mean, and this is a serious question, why not go with something like a EMC VNX or VNXe? Byte for byte of real physical storage SANs are pretty expensive, I grant, but the features can oft make up for that.
- Re: (Score:3)
  
  by dna_(c)(tm)(r) ( 618003 ) writes:
  
  [...]I just got out of a meeting at my job [...]and because Linux has no stable filesystem with enterprise features [...]
  Sure, AC has some real complex stuff to handle on an enterprise level. That's why all the big boys like Google, Facebook and Twitter are using Windows to host their data...
  You're either a silly moron, a self deluding enterprisy [a-z]+architect or a very capable troll.
  - Re: (Score:2)
    
    by donaldm ( 919619 ) writes:
    
    I will second a very capable troll. :)
- Re: (Score:3)
  
  by Zero__Kelvin ( 151819 ) writes:
  
  "I just got out of a meeting at my job because we are replacing some old large servers... and because Linux has no stable filesystem with enterprise features, looks like things are either going to Windows, or perhaps Solaris x86 (which is expensive.)"
  Somebody notify the millions of Enterprise servers that are Linux based, and serving up a major portion of the internet's content every day! Talk about throwing the baby out with the bathwater. Basically, you don't want to take a chance that established files
- Re: (Score:2)
  
  by Lennie ( 16154 ) writes:
  
  It depends on your needs.
  Take for example the top500, if I'm not mistaken more than 50% of that uses Lustre as the filesystem. Which is obviously Linux based.
  I think both Ceph ("inspired" by Lustre) and btrfs are interesting and I'm sure they'll be more than production ready next year.
  Hopefully with bcache in the mainline kernel too.
- Re: (Score:2)
  
  by Rich0 ( 548339 ) writes:
  
  btrfs is a step in the right direction, but even now, Linux does not have production-level deduplication (which even Windows has, for crying out loud), encryption, snapshots, or something even close to supplanting LVM2.
  Well, that might be why they're working on btrfs, then. :) I'm not sure about encryption, but everything else on your list is something likely to be in the feature list at some point. It obviously isn't stable yet, but that is a matter of time, and if somebody wanted to make a push to get something stable they'd get there a lot faster with btrfs than reinventing something else.
  btrfs already supports reflink copies (think of a copy that behaves like a hard link on initial copy, but each file tracks its own
- - Re: (Score:3)
    
    by smash ( 1351 ) writes:
    
    Data integrity for one?
    - - Re: (Score:2)
        
        by smash ( 1351 ) writes:
        
        1. No storage array does it properly. 2. You can BUILD a ZFS storage array with de-dup, compression, self-healing, etc. for cheaper than you can buy a Netapp or EMC. A filesystem approach is the only way to ensure end-to-end data integrity, correcting transmission errors between the host and the storage, etc.
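What "self-healing" with filesystem-level checksums means can be sketched roughly as follows. This is a toy model, not ZFS's on-disk design: `ChecksummedMirror` is a hypothetical class, SHA-256 stands in for whatever checksum a real filesystem is configured to use, and the checksums are assumed to be stored safely apart from the data.

```python
import hashlib


def checksum(block: bytes) -> str:
    return hashlib.sha256(block).hexdigest()


class ChecksummedMirror:
    """Toy two-way mirror that keeps a checksum per block and repairs on read."""

    def __init__(self, blocks):
        self.copies = [list(blocks), list(blocks)]
        self.sums = [checksum(b) for b in blocks]

    def read(self, index: int) -> bytes:
        for copy in self.copies:
            if checksum(copy[index]) == self.sums[index]:
                good = copy[index]
                # Overwrite every mirror copy of this block with the verified data.
                for other in self.copies:
                    other[index] = good
                return good
        raise IOError(f"block {index}: all copies fail verification")


mirror = ChecksummedMirror([b"alpha", b"beta"])
mirror.copies[0][1] = b"bexa"  # simulate silent corruption on one disk
data = mirror.read(1)
print(data, mirror.copies[0][1])
```

The read notices that the first copy no longer matches its checksum, returns the good copy from the other side of the mirror, and rewrites the bad one. A plain mirror without checksums could not tell which of the two copies was the correct one.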
        
        Re: (Score:2)
        
        by kasperd ( 592156 ) writes:
        
        A filesystem approach is the only way to ensure end-to-end data integrity
        Integrity checks in the file system certainly provide much better guarantees than integrity checks on the storage level. And anybody designing file systems today should build integrity checks into their file systems. But the higher a layer you move the integrity checks to, the closer you get to real end-to-end integrity. File system integrity checks don't protect data while it is sitting in memory.
        
        If you copy a file from one file s
- - Re:Can we get a real Linux filesystem, please? (Score:4, Informative)
    
    by grumbel ( 592662 ) writes: <grumbel+slashdot@gmail.com> on Friday December 14, 2012 @10:41PM (#42298087) Homepage
    
    I have seen the userlevel ZFS crash multiple times, it's also slow as hell. It's still worth it if you are short on storage and want to reduce the size of your backup, but I wouldn't exactly call it ready for production.
    
    - Re:Can we get a real Linux filesystem, please? (Score:4, Informative)
      
      by dbIII ( 701233 ) writes: on Friday December 14, 2012 @11:00PM (#42298219)
      
      Kernel level probably is ready, but not on 32bit (big hassles there but probably not a big deal to most) and on 64 bit there are some memory usage problems and performance seems to suck when there's a dozen or so hosts keeping connections to files on ZFS open via NFS at the same time. There's still a way to go before ZFS on linux gets to where it is on FreeBSD but it's still early days, and for many usage patterns it looks like it is ready for production.
      
      - Re: (Score:2)
        
        by drinkypoo ( 153816 ) writes:
        
        There's still a way to go before ZFS on linux gets to where it is on FreeBSD but it's still early days, and for many usage patterns it looks like it is ready for production.
        Can I get it as just a module, or do I need to build a custom kernel package? I can do that, but I prefer not to.
        
        Re: (Score:2)
        
        by dbIII ( 701233 ) writes:
        
        By default it builds as a module from source and I don't think anybody is packaging it yet. It seems to use close to 4GB (which seems well over twice what ZFS on FreeBSD appears to be using) so I wouldn't recommend it on anything with less memory than that from what I've seen of it.
        
        Re: (Score:2)
        
        by drinkypoo ( 153816 ) writes:
        
        It seems to use close to 4GB (which seems well over twice what ZFS on FreeBSD appears to be using) so I wouldn't recommend it on anything with less memory than that from what I've seen of it.
        That's an awful lot for a filesystem. What does it use on slowlaris?
        
        Re: (Score:2)
        
        by dbIII ( 701233 ) writes:
        
        I'm not sure, my solaris boxes don't have a lot of storage so I haven't touched ZFS on solaris. I've got it running on two FreeBSD machines, one with a total of 2GB memory, and total memory usage rarely goes above 512MB (it went past that when I was moving a 350GB file) so it looks like just a sign that the linux version is still in its early days. I'm moving the 4GB linux machine over to FreeBSD this week since all the memory slots are used.
        I haven't used it a lot, but so far FreeBSD with the ports coll
        
        Re: (Score:2)
        
        by drinkypoo ( 153816 ) writes:
        
        I haven't used it a lot, but so far FreeBSD with the ports collection looks to me a lot like what Gentoo linux was intended to become.
        Maybe I'll look at it again for my next filer. Last time it seemed to be annoying for the sake of being annoying, which was also my impression of the FreeBSD users I knew personally, but that doesn't mean they're all like that. I've used netbsd and OpenBSD and even 4.3BSD-lite on ROMP but not FreeBSD.
- - Re: (Score:3)
    
    by Agent ME ( 1411269 ) writes:
    
    If snapshots are handled by the filesystem, then it could be possible to snapshot a specific directory or file rather than a whole partition for example. Snapshots in the filesystem also prevent stuff like changes to space that was free when the snapshot was taken from being unnecessarily remembered.
  - Re: (Score:3)
    
    by petermgreen ( 876956 ) writes:
    
    Also, ZFS is an insane thing written by people who don't seem to understand that keeping a good separation of concerns can lead to a rather slick set of general tools that can be used on almost any fs.
    Separating stuff into layers has benefits but it also has costs. Sometimes merging layers can make things practical that aren't practical with them separate. Afaict this is what drove the creation of zfs and btrfs.
    Let's first look at RAID. Traditional RAID provides protection against reads that fail but not against reads that silently return wrong data. Experience has shown that hard drives cannot be trusted not to silently return wrong data. Worse still, RAID resyncs after power failure may silently overwrit
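
To make the RAID point concrete, here is a toy sketch (not code from ZFS, btrfs, or any RAID implementation) of a two-way mirror that keeps a checksum next to each block. The names `write_block`, `plain_read` and `checked_read` are made up for the illustration, and CRC32 is only a convenient stand-in for whatever checksum a real filesystem would use. A plain mirror read hands back the first copy even when it has been silently corrupted; a checksummed read can notice and fall back to the good copy.

```python
import zlib


def write_block(data):
    # Hypothetical mirrored block: two copies plus a checksum kept as metadata.
    return {"copies": [bytearray(data), bytearray(data)],
            "crc": zlib.crc32(data)}


def plain_read(block):
    # Traditional mirror: trust whichever copy answers first.
    return bytes(block["copies"][0])


def checked_read(block):
    # Checksumming read: return the first copy whose checksum matches.
    for copy in block["copies"]:
        if zlib.crc32(bytes(copy)) == block["crc"]:
            return bytes(copy)
    raise IOError("all copies failed the checksum")


original = b"payroll records, do not lose"
block = write_block(original)
block["copies"][0][3] ^= 0x01   # simulate a silent bit flip on the first disk

plain_result = plain_read(block)      # wrong data, and no error is raised
checked_result = checked_read(block)  # falls back to the intact second copy
```

The drive never reported an error in this scenario, which is why the check has to live above the disks.
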
- - Re:Dedupe doesn't belong in a filesystem (Score:4, Informative)
    
    by LWATCDR ( 28044 ) writes: on Saturday December 15, 2012 @10:20AM (#42300989) Homepage Journal
    
    You then turn it off.... And go take your meds.
    I do not think you know what DeDup means. You as a user still see two copies of the file. If you make changes to one copy of the file it will only change that copy of the file. It is not like a link. In other words it is totally transparent to the end user but saves drive space. So if you work in a large organization and someone sends out an email to all 4000 people that email will only take up the space of one email. Even if everyone saves it on the IMAP server.
    In other words you do not know what you are talking about, you probably do not need these functions because you probably do not run a server or servers for a large organization, you seem to have some anger issues, and maybe just a little nuts.
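
A toy sketch of the behaviour described above, under loud assumptions: `DedupStore` is a made-up class, it deduplicates whole files rather than blocks, and it never frees unreferenced data. Real filesystems work at block or extent granularity, so treat this only as an illustration of "one stored copy, independent files".

```python
import hashlib


class DedupStore:
    """Toy content-addressed store: identical contents are kept once."""

    def __init__(self):
        self.blocks = {}   # digest -> stored bytes
        self.files = {}    # file name -> digest of its contents

    def write(self, name, data):
        digest = hashlib.sha256(data).hexdigest()
        self.blocks.setdefault(digest, data)   # store only if not seen before
        self.files[name] = digest              # the file just points at the data

    def read(self, name):
        return self.blocks[self.files[name]]

    def stored_bytes(self):
        return sum(len(b) for b in self.blocks.values())


store = DedupStore()
memo = b"Vacation policy update\n" * 100

# 4000 users each save the same email.
for user in range(4000):
    store.write(f"user{user}/inbox/memo", memo)
before = store.stored_bytes()   # space used is that of a single copy

# One user edits their copy; nobody else's copy changes.
store.write("user7/inbox/memo", memo + b"my notes\n")
```

After the edit, `user7` sees the new text while `user8` still reads the original memo, which is the difference from a link.
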
    
- - Re: (Score:3)
    
    by LWATCDR ( 28044 ) writes:
    
    Wow, just how many clueless people are on Slashdot posting as ACs?
    "No good deduplication software? Don't put duplicate data on the system in the first place!"
    Okay Sparky you have 5000 users on a server and they all save that email about vacation time or the pictures from the office party. Redundant data. This is a large system with lots of users, it is not for your leet Linux box you have in your mom's basement. Your plays on Microsoft's name are also childish and overdone. Now there is a valid argument tha
  - - Re: (Score:2)
      
      by smallfries ( 601545 ) writes:
      
      When you have a filesystem that understands hard links, deduplication is still required to find files that have the same content and link them together. You are possibly thinking of a filesystem that hashes contents to decide on storage locations.
    - Re: (Score:2)
      
      by TCM ( 130219 ) writes:
      
      What a load of BS. What if two files happen to have the same content, but shouldn't really be tied to each other?
      Two hardlinked files are forever stuck together until you unlink them manually, down to their file access times and everything. If I write to one, the other changes.
      Deduplication doesn't have this semantic tie. Two files happen to have the same content? Fine, save space. But write to one file and the other stays as it was. Plus you _still_ have hardlinks if you want to create a semantic connectio
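
The hard-link half of that comparison is easy to check for yourself. This is a small sketch using only the standard `os.link` call in a temporary directory (it assumes a filesystem that supports hard links; the file names are arbitrary): after linking, a write through one name is visible through the other, because both names refer to the same inode.

```python
import os
import tempfile


def hardlink_demo():
    with tempfile.TemporaryDirectory() as d:
        a = os.path.join(d, "a.txt")
        b = os.path.join(d, "b.txt")

        with open(a, "w") as f:
            f.write("original\n")

        os.link(a, b)                      # b is a second name for the same inode

        with open(b, "a") as f:            # write through the second name...
            f.write("appended via b\n")

        with open(a) as f:                 # ...and read it back through the first
            content_a = f.read()

        same_inode = os.stat(a).st_ino == os.stat(b).st_ino
        return content_a, same_inode


content_a, same_inode = hardlink_demo()
```

Deduplicated files would not behave this way: the write through `b.txt` would leave `a.txt` untouched.
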
    - Re: (Score:2)
      
      by jamesh ( 87723 ) writes:
      
      When you have a filesystem that understands hard links, deduplication is redundant.
      I would argue that maybe it doesn't belong in the filesystem in the first place. If you have a bunch of VMs all with (say) Debian Wheezy then deduplication in the backend storage would do much more than simple FS deduplication. Some FS knowledge in the storage would be useful (eg files with the same name in each FS are probably a good place to start to look for duplicates) but even that is just an optimisation and isn't required.
- - Re: (Score:2)
    
    by TCM ( 130219 ) writes:
    
    So the also non-existent data integrity is the reason they don't have deduplication? Why don't you just say "Yes, we don't have a real filesystem" instead of these laughable arguments?
Requires local access (Score:5, Funny)

by Anonymous Coward writes: on Friday December 14, 2012 @09:48PM (#42297733)

no more dangerous than a fork bomb or filling up /tmp or trying to compile open office.

- Re:Requires local access (Score:5, Informative)
  
  by cryptizard ( 2629853 ) writes: on Friday December 14, 2012 @10:38PM (#42298081)
  
  Sort of, but at least you can recover from those attacks by restarting or booting from an external source to clean up your filesystem. The second attack here leaves you with undeletable files because the file system code responsible for deleting cannot handle the multiple hash collisions. There is no way to recover from that until a patch is pushed out that fixes the problem.
  
  - Re: (Score:2)
    
    by blade8086 ( 183911 ) writes:
    
    Which, without the over sensationalized BS that is this story, will probably be in about a week tops.
    And since BTRFS is not in any 'enterprise' Linux distributions, that means it will pretty much be available immediately, since everyone running it in critical production environments will probably be running pretty bleeding-edge linuxen
  - Re: (Score:2)
    
    by drinkypoo ( 153816 ) writes:
    
    The second attack here leaves you with undeletable files because the file system code responsible for deleting cannot handle the multiple hash collisions. There is no way to recover from that until a patch is pushed out that fixes the problem.
    There's no filesystem debugger for btrfs?
    Seems to me like fsck ought to be able to solve this problem, too. Two files with the same hash? Delete the one with the newer timestamp.
    - Re: (Score:3)
      
      by cryptizard ( 2629853 ) writes:
      
      Two files with the same hash is not a problem, it is allowed. This will happen just by chance many times on your filesystem because the hash is relatively short (64 bits). The problem is when you engineer many files to have the same hash and your data structure (hash table) degrades to an array. There is also some other problem in the code here that makes it so that the hash table can't store or for some reason can't process more than a certain number of collisions.
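
A rough sketch of the "degrades to an array" failure mode, not btrfs code. Everything here is an assumption made for illustration: `ChainedTable` is a generic chained hash table, `weak_hash` is a deliberately bad byte-sum hash so that collisions are trivial to engineer (every anagram collides), and `zlib.crc32` is only a comparison hash that happens to spread these particular names. An unkeyed CRC can itself be collided on purpose, so this says nothing about which hash is safe.

```python
import itertools
import zlib


class ChainedTable:
    """Hash table with separate chaining that counts key comparisons."""

    def __init__(self, hash_fn, nbuckets=1024):
        self.hash_fn = hash_fn
        self.buckets = [[] for _ in range(nbuckets)]
        self.comparisons = 0

    def insert(self, key):
        bucket = self.buckets[self.hash_fn(key) % len(self.buckets)]
        for existing in bucket:          # linear scan of the collision chain
            self.comparisons += 1
            if existing == key:
                return
        bucket.append(key)


def weak_hash(name):
    # Deliberately weak stand-in: sum of bytes, so all anagrams collide.
    return sum(name.encode())


def crc_hash(name):
    return zlib.crc32(name.encode())


def comparisons_for(hash_fn, names):
    table = ChainedTable(hash_fn)
    for name in names:
        table.insert(name)
    return table.comparisons


# 1000 distinct file names engineered to share one weak_hash value.
names = ["".join(p)
         for p in itertools.islice(itertools.permutations("abcdefg"), 1000)]

bad = comparisons_for(weak_hash, names)   # every insert walks one long chain
good = comparisons_for(crc_hash, names)   # names spread over many buckets
print(bad, good)
```

With all names in one bucket, inserting n names costs n(n-1)/2 comparisons, so the work grows quadratically with the number of attacker-chosen files.
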
- - Re: (Score:2)
    
    by ArsenneLupin ( 766289 ) writes:
    
    Well, it requires the ability to create named files. That could happen through a Wiki upload page, by extraction of an archive to a temporary folder for processing, etc.
    ... or worse, web caches which preserve original file names...
Nice! (Score:4, Interesting)

by gweihir ( 88907 ) writes: on Friday December 14, 2012 @10:21PM (#42297971)

"Algorithmic Complexity Attacks" like this one have long been known, but rarely been documented publicly. One good example to point out why hash-randomization is a good idea!

- - Re: (Score:3, Funny)
    
    by Anonymous Coward writes:
    
    Words, they mean nothing! Take 'rarely' for example, who gives a shit, I'll read it as 'never' same thing.
Nice this was found before BTRFS goes stable (Score:5, Insightful)

by Anonymous Coward writes: on Friday December 14, 2012 @10:31PM (#42298029)

Hopefully more people start fuzzing btrfs so it is that much better when it is declared stable.

- Re: (Score:2)
  
  by Rich0 ( 548339 ) writes:
  
  Lots of people have been doing testing on btrfs. Filesystems aren't so much declared as stable as they become used as stable. Unless the fix changes the on-disk format in some non-backwards-compatible way, it doesn't really matter when the fix gets deployed. Most likely the fixes will be in git in a week or two.
  Oh, and anybody who really wants to run btrfs should probably be running the git version anyway. They're doing so many bugfixes per month that this is one of those rare times where the mainline k
- - Re: (Score:2)
    
    by Rich0 ( 548339 ) writes:
    
    Just one of those issues with running an open-source OS published by the vendor of a proprietary OS. OpenSUSE and Fedora tend to be treated like guinea pigs.
    And this isn't necessarily a bad thing. If you're a RHEL shop then you probably want to have some Fedora test systems to get a sense for how your applications will operate in future versions.
    If you want something free and stable, you run something like Debian or CentOS, or whatever.
Who cares? (Score:2)

by UltraZelda64 ( 2309504 ) writes:

Unstable software that is still under heavy development is actually unstable. Who would've guessed?
I think that based on this ingenious discovery, we should all switch over to it by next week.
Good god man (Score:2)

by tomp ( 4013 ) writes:

"Denial-of-Service Attack Found In Btrfs File-System" didn't happen. A vulnerability was found. That's a big deal, no reason to obscure it.
Attack? (Score:3)

by Decameron81 ( 628548 ) writes: on Saturday December 15, 2012 @12:18AM (#42298681)

An attack was found in the filesystem? What's that supposed to mean?

- Re: (Score:2)
  
  by dr2chase ( 653338 ) writes:
  
  Carefully chosen file names (a lot of them) can DOS file system performance. Whether this could be escalated to a network vulnerability, hard to say -- if an attacker over the net can figure out a way to induce particular file names on the server, that would be worse.
  It's a little sad that people are still forgetting about this failure mode of hash tables and hash functions; either there's got to be a randomizing secret swizzled in, or a better (more nearly cryptographically strong) hash function, or both.
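
One way to read "a randomizing secret swizzled in" is a keyed hash, sketched below. This is an assumption about what such a fix could look like, not what btrfs does or did: `make_keyed_hash` is a made-up helper, and it uses the keyed mode of `hashlib.blake2b` truncated to 64 bits. An on-disk filesystem would also have to store the secret with the filesystem, since the hash decides where entries live.

```python
import hashlib
import os


def make_keyed_hash(secret):
    """Return a 64-bit name hash that depends on a secret key."""
    def name_hash(name):
        digest = hashlib.blake2b(name.encode(), digest_size=8, key=secret).digest()
        return int.from_bytes(digest, "big")
    return name_hash


# Each filesystem (or each boot, for an in-memory table) picks its own secret.
hash_a = make_keyed_hash(os.urandom(16))
hash_b = make_keyed_hash(os.urandom(16))

# The same name lands in unrelated places under different secrets, so an
# attacker cannot precompute a set of colliding names offline.
print(hash_a("index.html"), hash_b("index.html"))
```

The hash stays deterministic for a given secret, so lookups still work; only the attacker's ability to predict collisions goes away.
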
  - - Re: (Score:2)
      
      by dr2chase ( 653338 ) writes:
      
      True, but good random numbers (good hashes) have interesting and powerful statistical properties.
      - Re: (Score:2)
        
        by dr2chase ( 653338 ) writes:
        
        Read about universal hash functions (the writeup on wikipedia is not that bad). They're not a hack.
        You don't necessarily use a small space, either -- a 64-bit hash is not normally regarded as a small space, though it is often smaller than the bit size of what is hashed into it.
        Two problems with trees are that you need to define a comparison (you can often concoct one, but they're not always given to you) and though memory is cheap, *probes* into memory are not. If a hash function can get you there in 1 st
- Attack vector (Score:2)
  
  by aNonnyMouseCowered ( 2693969 ) writes:
  
  Indeed, the title makes you think that BTRFS was trojaned or, worse, is malware.
- Re: (Score:2)
  
  by Noughmad ( 1044096 ) writes:
  
  An attack was found in the filesystem? What's that supposed to mean?
  I'm not sure, but it sure sounds like Mr. Reiser had something to do with it.
Can we get a real editor? (Score:2, Insightful)

by Anonymous Coward writes:

Editors please! I normally expect even a submitter to know the difference between an attack and a vulnerability. However the editor damn well better know the difference. When I read that an ATTACK had been found in btrfs I went to read about how some malicious code had been placed into the code for btrfs. Maybe this code modifies data, erases stuff, sends data to China, or just renames files. But no, this was a simple vulnerability. They didn't find an attack in btrfs, they found the potential for an attack
- Re: (Score:3)
  
  by Nimey ( 114278 ) writes:
  
  ed(1) is the standard text editor.
- Re: (Score:2)
  
  by cpghost ( 719344 ) writes:
  
  I thought about trying ZFS at one point, but decided that using Solaris as a non-server OS is pointless. Does anyone still use Solaris?
  Have you thought about using ZFS on FreeBSD? Running FreeBSD/amd64 here on a desktop machine with ZFS file systems without any problems.
- - Re: (Score:2)
    
    by maxwell demon ( 590494 ) writes:
    
    Or just use a RB tree instead of a linear list for hash collisions, then you get only O(log n) instead of O(n) worst case search performance.
    To quote Wikipedia:
    Instead of a list, one can use any other data structure that supports the required operations. For example, by using a self-balancing tree, the theoretical worst-case time of common hash table operations (insertion, deletion, lookup) can be brought down to O(log n) rather than O(n). However, this approach is only worth the trouble and extra memory co
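
A sketch of the idea of an ordered bucket, with one substitution stated plainly: Python's standard library has no red-black tree, so `SortedBucket` (a made-up class) keeps a sorted list and uses binary search instead. That gives the O(log n) lookup being discussed, but insertion into a list is still O(n) because elements have to shift; a real self-balancing tree would make insertion O(log n) too.

```python
import bisect


class SortedBucket:
    """Collision bucket kept in sorted order; lookups use binary search."""

    def __init__(self):
        self.keys = []
        self.comparisons = 0

    def add(self, key):
        i = bisect.bisect_left(self.keys, key)
        if i == len(self.keys) or self.keys[i] != key:
            self.keys.insert(i, key)     # O(n) shift; a tree would avoid this

    def contains(self, key):
        lo, hi = 0, len(self.keys)
        while lo < hi:                   # hand-written so we can count steps
            mid = (lo + hi) // 2
            self.comparisons += 1
            if self.keys[mid] < key:
                lo = mid + 1
            else:
                hi = mid
        return lo < len(self.keys) and self.keys[lo] == key


bucket = SortedBucket()
for i in range(1024):                    # pretend all 1024 names collided
    bucket.add(f"name{i:04d}")

found = bucket.contains("name0777")
steps = bucket.comparisons               # about log2(1024), not about 1024
```

Note that this needs an ordering on the keys, which file names happen to have.
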
  - - Re: (Score:2)
      
      by MikeBabcock ( 65886 ) writes:
      
      There are much more efficient hashes than MD5 that would work as well for fewer clock cycles. http://cr.yp.to/hash127.html [cr.yp.to] comes to mind.
    - - Re: (Score:2)
        
        by Tough Love ( 215404 ) writes:
        
        a 64-bit output isn't really collision resistant anyway
        Plenty good enough for a hashed directory key, which doesn't need to be cryptographically secure, just to have good distribution and random results affected as much as possible by all input bits. The size of the output is not the dominant factor, the quality of the input mixing is.
- - - Re: (Score:2)
      
      by Wolfrider ( 856 ) writes:
      
      --Yah, the movie version with George Clooney had a pretty hot chick in it, too... :D
- Re: (Score:2)
  
  by FranTaylor ( 164577 ) writes:
  
  Are you saying that google's file systems are corrupt?
- Re: (Score:2)
  
  by iggymanz ( 596061 ) writes:
  
  actually most client/server file systems can be DOS'd by too many requests.....local access generally implies the ability to clog things up
- - Re: (Score:3)
    
    by Zero__Kelvin ( 151819 ) writes:
    
    It is stupid to make this racial, but since you did, when was the last time a black guy opened up on a group of innocent school children?
    - - Epic Fail (The joke's on you) (Score:3)
        
        by Zero__Kelvin ( 151819 ) writes:
        
        A good joke requires significant planning.
