What's the Damage? Measuring fsck Under XFS and Ext4 On Big Storage 196

Posted by timothy
from the disks-groaning-with-shame dept.
An anonymous reader writes "Enterprise Storage Forum's long-awaited Linux file system Fsck testing is finally complete. Find out just how bad the Linux file system scaling problem really is."
This discussion has been archived. No new comments can be posted.

  • by drewstah (110889) on Friday February 03, 2012 @12:54PM (#38917591) Homepage

    When I had some EBS problems a couple of years ago, I figured I would run xfs_check. It seemed to do absolutely nothing, even when there were disks known to be bad in the md array. xfs is nice and fast, but I haven't seen xfs_check or xfs_repair do either of the things I'd assume they'd do -- check and repair. I found it easier to delete the volumes and start from scratch, because any compromised xfs filesystem seems to be totally unfixable. Is fsck for xfs new?

  • by h4rr4r (612664) on Friday February 03, 2012 @01:15PM (#38918073)

    Because sometimes it does work. Relying on any such software is stupid.

    While the fsck/CHKDSK runs, you restore onto another machine. That way, if the check finishes first, you can use the original until you switch over to the restored machine. It can also save your ass if you are not smart enough or fortunate enough to have good backups.

  • Re:Why bother? (Score:4, Interesting)

    by _LORAX_ (4790) on Friday February 03, 2012 @01:47PM (#38918647) Homepage

    Our BTRFS evaluation resulted in rejecting it for some very serious problems (what they claim are snapshots are actually clones, panics in low-memory situations, no fsck, horrible support tools, developers who are hostile to criticism, pre-release software, ...). ZFS was nice, but limited to non-distributed systems and still had a non-trivial amount of volume and backend management headaches. Personally I use ZFS for my personal servers at home (incremental snapshots are the bomb), but our production systems needed more.

  • by hackstraw (262471) on Friday February 03, 2012 @01:50PM (#38918699)

    The largest filesystem I admin is just shy of 1/2 petabyte. And it's one in number. Backing up everything on that filesystem is simply not feasible. To put it in perspective, one stream @ 200 MiB/s would take almost 28 days to back up the whole thing. I would imagine a restore would take about as long. Telling hundreds of users their files are unavailable for reading or writing for 30 days is not really an option, so I run fsck.

    Backups are simply not a practical option past 20+ terabytes of storage, and are completely infeasible if the storage is volatile in nature. AFAIK everyone has gone to redundancy over backups at scale.
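    [For what it's worth, the back-of-the-envelope arithmetic behind the "almost 28 days" figure holds up, assuming 0.5 PB means 500 decimal terabytes and a single sustained 200 MiB/s stream, as stated above:]

    ```shell
    #!/usr/bin/env bash
    # Single-stream transfer time for ~0.5 PB at 200 MiB/s.
    # Figures taken from the comment above; illustrative arithmetic only.
    bytes=$(( 500 * 10**12 ))      # 500 TB, decimal
    rate=$(( 200 * 1024 * 1024 ))  # 200 MiB/s in bytes per second
    secs=$(( bytes / rate ))
    days=$(( secs / 86400 ))
    echo "$days full days"         # 27 full days, i.e. almost 28
    ```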

  • by Anonymous Coward on Friday February 03, 2012 @02:38PM (#38919397)

    So you have 1/2 petabyte of storage but 200 MiB/s of throughput -- are you kidding me? Is your storage controller broken, or really cheap, or both?

    Also, xfsdump (which is used to backup xfs) can do multi-threaded backups.

    Now to comment on the test -- it is completely insane. As mentioned by you and others, if you are running fsck while your whole application is down, the broken thing is not the system but the thing inside the skull. You will obviously need a very fast backup/restore and/or an HA solution; the two are not (and need not be) mutually exclusive.

  • Re:Why bother? (Score:4, Interesting)

    by Guspaz (556486) on Friday February 03, 2012 @02:41PM (#38919467)

    ZFS now runs pretty well on Linux too, as a kernel module, thanks to zfsonlinux. If you're running a Debian-based distro, installing it is trivial (one command to add the PPA, one command to install the package).
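    [For reference, the two commands on an Ubuntu-flavored Debian derivative looked roughly like this circa 2012; the PPA and package names are from the zfsonlinux project's instructions of that era and may since have changed, so treat this as a sketch:]

    ```shell
    # Circa-2012 zfsonlinux install on Ubuntu; PPA/package names may be outdated.
    sudo add-apt-repository ppa:zfs-native/stable
    sudo apt-get update
    sudo apt-get install ubuntu-zfs   # builds the kernel module via DKMS
    ```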

  • by tlhIngan (30335) <[ten.frow] [ta] [todhsals]> on Friday February 03, 2012 @03:23PM (#38920029)

    The largest filesystem I admin is just shy of 1/2 petabyte. And it's one in number. Backing up everything on that filesystem is simply not feasible. To put it in perspective, one stream @ 200 MiB/s would take almost 28 days to back up the whole thing. I would imagine a restore would take about as long. Telling hundreds of users their files are unavailable for reading or writing for 30 days is not really an option, so I run fsck.

    Which means You're Doing It Wrong(tm).

    Two words: volume snapshot.

    What it does is give you a view of the filesystem as it exists at the time the snapshot is taken. The frozen image is mounted at another mountpoint (read-only), while the snapshotted volume is still accessible (read-write). Changes to the volume since the snapshot was taken won't be in the snapshot (obviously).

    Your backup points to that snapshot which won't change and that's copied to tape. Once you're done backing up 30 days later, you delete the snapshot.

    Since your backup takes so long, you'd then immediately make another snapshot and begin the backup again.

    If it's a database, the database backup tools work on a database snapshot - it will be correct and consistent as of when the snapshot was taken while the database remains available for reading and writing outside of the snapshot.

    Having to take a system down to back it up is a dead concept on modern OSes as they all tend to have snapshot capability.
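    [On Linux with LVM2 the workflow above looks roughly like this. The volume group, snapshot size, and mount/backup paths are made-up names for illustration; the commands themselves are standard LVM2 and GNU tar. The snapshot's copy-on-write area must be sized to absorb all writes that land during the backup window:]

    ```shell
    # Sketch of a snapshot-based backup with LVM2. vg0/data, /mnt/snap and
    # /backup are hypothetical names chosen for this example.
    lvcreate --snapshot --size 50G --name data-snap /dev/vg0/data
    mkdir -p /mnt/snap
    mount -o ro /dev/vg0/data-snap /mnt/snap
    tar -cf /backup/data.tar -C /mnt/snap .   # or stream to tape
    umount /mnt/snap
    lvremove -f /dev/vg0/data-snap            # discard the snapshot when done
    ```

    [For XFS specifically, mounting the snapshot may additionally need `-o ro,nouuid`, since the snapshot shares the origin filesystem's UUID.]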

  • Re:linux is fail (Score:4, Interesting)

    by jd (1658) <<moc.oohay> <ta> <kapimi>> on Friday February 03, 2012 @04:32PM (#38920875) Homepage Journal

    Works best if you use the "Doom as Sys Admin" hack [sourceforge.net].

  • Re:linux is fail (Score:5, Interesting)

    by jd (1658) <<moc.oohay> <ta> <kapimi>> on Friday February 03, 2012 @04:58PM (#38921161) Homepage Journal

    A lot of stuff is also faster on Linux, particularly on the x86. Solaris x86 is dog slow. AIX ("aches") is an appropriate name for a mainframe OS that never really got the hang of this new-fangled "interactive user" stuff. It's a good mainframe OS, that is what it is designed for, tuned for and intended for, but traditional mainframe batch transactional work isn't the sort of payload that is typically run these days. The high-end users want hard real-time (i.e.: they know to the microsecond - or nanosecond, in some cases - exactly when each process will start and stop) for data collection, data analysis and simulation. The data centers want massive multithreading for gigantic servers with minimal overhead and service guarantees per thread. The typical user wants extremely low latency interactive. None of these are pre-scripted batch jobs.

    Now, if you wanted to develop a data warehouse for, say, technical writings, journalism, etc, where you're compiling a collection of things that can be typeset overnight, that may be doable as a batch job. However, anyone planning on publishing a journal that needs 72 terabytes of storage had best consider the marketplace a little more closely first. A publishing company, say Nature, might conceivably have use for AIX for batch work. I could see the number of submissions, referee responses and article selections per journal being such that a mainframe would be a perfectly valid way to do things. Even then, it might still be sufficiently small that a live transactional database would be more cost-effective.

    Traditionally, batch processing has been a niche market for electrical and gas companies, etc, where the number of customers is staggering. Even then, it has largely been replaced with live transactional systems because customers want things adjusted NOW and not overnight or at the end of the week.

    Mass mailers still use batch processing, but printing is the bottleneck, and there is no point in having an expensive OS process everything in a fraction of a second on an expensive mainframe when it takes N actual real-world seconds before a printer becomes available to take the next block of data. You need run no faster than the slowest component, because the end product won't be delivered any faster. You would have to have a gigantic number of printers before the OS became a significant factor, and most shops just don't have that kind of printing power.
