Linux Software

Ext3 Filesystem Explained

sheckard writes: "The next installment of the wonderful Advanced filesystem implementor's guide, part 7, details the ext3 filesystem in all of its glory. This is another great voyage into the world of journaling filesystems, and ext3 has been rock-solid in my experience."
This discussion has been archived. No new comments can be posted.

  • by Fillup ( 121335 )
    Journaling file systems / ext3 -- is this the best? Or should we be looking in another direction entirely?
    • Actually, the best filesystem on Linux right now for most uses is probably XFS. It's a little slow on deletes, and not as fast as Reiser for extremely small files, but from the stuff I've done with both (compiling, tar/untar, moving around directories, general workstation stuff) XFS is just as fast as Reiser for normal sized files, and much faster for large files. JFS is the dark horse here, though. I've seen some benchmarks showing it to have as good large file performance as XFS, but much better metadata (creating, deleting, growing, etc) performance. But there's not much info on it yet, and it's not entirely rock solid.
  • by Deal-a-Neil ( 166508 ) on Saturday November 17, 2001 @03:29AM (#2577861) Homepage Journal
    ext3 catches my fancy because there's no ext2 --> ext3 conversion -- you just have to unmount, make a journal file, and remount. reiserfs migration is a challenge for the huge partitions.
  • ext3 (Score:4, Insightful)

    by FeeDBaCK ( 42286 ) on Saturday November 17, 2001 @03:31AM (#2577862) Homepage
    One thing I would have to agree on in the usage of ext3 is the fact that the machine can be booted with a kernel that does not understand ext3 (only ext2) and the filesystem can still be read. This is a major strong-point in my book.
  • by nusuth ( 520833 )
    On my new machine I installed Linux as my primary OS, expecting to soon get tired of it (again) and reconfigure a dual boot system with Windows as my primary OS. While installing Linux, I didn't think much (since I would soon be destroying the partition anyway) and installed the system on ReiserFS. To my surprise that didn't happen, and the unreliability of ReiserFS started to bother me more and more. And with this article I'm convinced that ext3 is what I want. Now, how do I convert from ReiserFS to ext3? I have plenty of empty space on a soon to be destroyed NTFS partition and a CD writer, so backing up existing data is no problem, but simply copying back files will not do the trick, right?
    • If your root partition is formatted as ReiserFS, you're pretty much limited. Try to make a partition big enough on your free space, and make an ext[2-3] there. Then copy everything that is on the root partition to the new ext* one (use "cp -pR" to preserve permissions). Try to reboot the system, passing 'root=/dev/hd??' to the kernel, where ?? is the new ext partition. If everything boots fine, you're on your way. If not, you won't lose anything on your old ReiserFS root; just reboot as usual.
      • I now wish I had waited for your reply, instead I did what AC said, tar everything, mkfs and untar. For some unclear reason, lilo refused to understand that I was trying to install it on /dev/hda5 (/) instead of /dev/hda1 (backup.) But your suggestion does not seem to be working around this problem either, how could I move boot stuff to ext3 and destroy reiserfs after transition?
  • I know RH7.2 has Ext3 support. Right from the setup too, very sweet.
  • Is EXT3 in Linus' tree yet? Other thing I'm wondering is if it's worth moving to the Alan Cox tree to get it?
  • Ext3 is quite nice, I've moved to it myself
    because the root partition until recently
    had to be ext2. I still can't help feeling
    that ext3 is slower though compared to Reiser.

    Anyone else got comments on performance?
    • Re:Performance (Score:2, Informative)

      by Anonymous Coward
      I've had an ext3 root partition for over a year - it needed a reboot to change root to ext3 in those days, though. All other partitions were done with a remount.

      Now, any ext2 fs can be turned into ext3 by a single tune2fs command, with no remount and no reboot.

      I'm sure you could come up with several benchmarks which show reiserfs to be faster than ext2/ext3/xfs/whatever, but for most desktops and servers, filesystem performance is not a factor to be overly concerned about - unless you choose something silly like umsdos or NFS over PPP. News servers and high load fileservers are a different matter of course.
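For anyone wanting to script the "single tune2fs command" mentioned above, here is a minimal sketch. It only builds the command line and does not run it. The `-j` flag is the tune2fs option that adds an ext3 journal; the device name `/dev/hdXN` and both function names are placeholders made up for illustration. Mounting the result as ext3 afterwards (for example via /etc/fstab) is a separate step not shown here.

```python
import shlex


def ext3_conversion_command(device):
    """Build (but do not run) the tune2fs call that adds a journal to an ext2 fs.

    `device` is whatever block device holds the ext2 filesystem; the caller
    is responsible for picking the right one.
    """
    return ["tune2fs", "-j", device]


def describe(device):
    """Return the command as a shell-style string, for review before running."""
    return shlex.join(ext3_conversion_command(device))
```

Printing `describe("/dev/hdXN")` before handing the list to something like `subprocess.run` gives a chance to double-check the device first.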
    • I'm sitting at a humble (K6-2 450MHz, 128MB RAM, 8GB HD) box that ran for 230 days on RH7, and I never noticed speed issues with it...until installing RH 7.2 with ext3.

      The box is definitely slower doing disk access now. It's a little disappointing, but I'm not really concerned: I'd rather have the protection of the journaling and I'm willing to pay with a little speed tradeoff. Eventually I'll get a bigger/newer/faster HD for this thing, and that will probably help as well.

      My perceptions may be skewed; my main box for the last year has been a PIII-750MHz with a 30GB drive, and that bad boy sings, so it may be that I just don't remember precisely what performance is "supposed" to be like on this one.

      • Re:Performance (Score:3, Informative)

        by rodgerd ( 402 )
        IIRC RH7.2 installs ext3 with both data and metadata logging enabled by default, so your performance change is most likely that you're doing two writes for every one you did before.
        • Actually, RH 7.2 uses the "data=ordered" mode, not the "data=journal" mode. The data is not stored in the journal, but it is written before changes to the metadata are written (according to the article, that is). This should guarantee data consistency, and is faster than full data and metadata journaling, but still gives a minor performance hit.

          FWIW, I have tried both reiserfs and ext3fs on the same system, and haven't noticed a significant speed difference. Both seemed to work well for me.
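To make the difference between the two modes discussed above concrete, here is a toy model of the writes one update generates. It is a rough sketch, not how the kernel actually schedules I/O: the function name, the tuple format, and the single "commit" record are all simplifications assumed for illustration.

```python
def writes_for_update(mode, data_blocks, metadata_blocks):
    """Return the ordered list of (destination, kind) writes for one update.

    Toy model only: "journal" mode logs data and metadata, then copies both
    to their final place; "ordered" mode writes data in place first and
    logs only the metadata.
    """
    if mode == "journal":
        return (
            [("journal", "data")] * data_blocks
            + [("journal", "metadata")] * metadata_blocks
            + [("journal", "commit")]
            + [("disk", "data")] * data_blocks
            + [("disk", "metadata")] * metadata_blocks
        )
    if mode == "ordered":
        return (
            [("disk", "data")] * data_blocks          # data hits the disk first
            + [("journal", "metadata")] * metadata_blocks
            + [("journal", "commit")]
            + [("disk", "metadata")] * metadata_blocks
        )
    raise ValueError("unknown mode: %r" % (mode,))


def data_write_count(writes):
    """Count how many of the writes carry file data."""
    return sum(1 for _, kind in writes if kind == "data")
```

In this model a 4-block file update writes its data twice under "journal" and once under "ordered", which is the "two writes for every one" effect described in the parent comment.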
  • Good Teckie Post (Score:2, Interesting)

    by Newt-dog ( 528340 )
    Well another great post that went over my head! :-)

    Although I did enjoy the paragraph on filesystem journaling -- After pulling one of my [gasp] Win2000 servers offline the other night to do a defrag, I could appreciate the fact that a developer could tweak Ext3 to do some neat things. (ahh, for linux, at least) Like when I save and resave files on a test server, the journaling approach could be made more efficient by only saving the changed data! (not the whole freakin fragmented file)

    Now the question could be -- Is there someone who will step up to the plate and produce several custom filesystems. The article points out that there is no "best" file system, but given the options, I'm sure the teckie endusers could tweak settings to meet their needs, be it server or desktop.

    Newt-dog

  • ext3 vs xfs (Score:1, Interesting)

    by Anonymous Coward
    I've tried both, but when I played with XFS it took forever to rm -rf a 2 GB directory. Was it me, or is XFS extremely slow with removing lots of files? If so, is this because it takes forever to update the journal?
    • Re:ext3 vs xfs (Score:2, Informative)

      by be-fan ( 61476 )
      XFS historically has very bad delete performance. I don't think it's the fault of the journal, since other things involving the journal (growing or creating files) aren't slow (though ReiserFS does seem to have the best journaling code). I don't know what the official take on this is, but here's my theory. Most filesystems use a bitmap to keep track of free blocks. XFS, on the other hand, uses a pair of B+ trees to manage extents of free areas. This allows it to find better (more contiguous) blocks more quickly when an allocation has to be done. A bitmap, on the other hand, has to do a scan through the bits and can't afford to spend a lot of time looking in different places for the "best" place to allocate. However, when deleting a file, the bitmap approach already has all the addresses of the blocks, so it's just a matter of clearing some bits. XFS, on the other hand, has to go ahead and reinsert the blocks back into the B+ tree, which takes many more disk accesses and much more time. Normally, this is an okay tradeoff, since you usually grow files more often than you delete (i.e. you grow it many times while writing it out to 2GB, but delete the thing in one go). On systems like a Squid server, on the other hand, you create and delete files like mad, so Reiser is often faster in that case.
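The tradeoff in that theory can be sketched with two toy in-memory allocators. This is only an illustration of the idea: the class names are invented, and a sorted Python list stands in for the on-disk B+ trees, so it says nothing about real XFS code or real disk costs.

```python
import bisect


class BitmapAllocator:
    """Free space tracked as one flag per block."""

    def __init__(self, nblocks):
        self.free = [True] * nblocks

    def allocate(self, count):
        """First-fit linear scan for `count` contiguous free blocks."""
        run = 0
        for i, is_free in enumerate(self.free):
            run = run + 1 if is_free else 0
            if run == count:
                start = i - count + 1
                for b in range(start, i + 1):
                    self.free[b] = False
                return start
        return None

    def release(self, start, count):
        """Freeing is cheap: the block addresses are known, just flip flags."""
        for b in range(start, start + count):
            self.free[b] = True


class ExtentAllocator:
    """Free space tracked as sorted (start, length) extents."""

    def __init__(self, nblocks):
        self.extents = [(0, nblocks)]

    def allocate(self, count):
        """Best fit: take the smallest free extent that is large enough."""
        candidates = [e for e in self.extents if e[1] >= count]
        if not candidates:
            return None
        start, length = min(candidates, key=lambda e: e[1])
        self.extents.remove((start, length))
        if length > count:
            bisect.insort(self.extents, (start + count, length - count))
        return start

    def release(self, start, count):
        """Freeing means re-inserting the extent and merging with neighbours."""
        bisect.insort(self.extents, (start, count))
        merged = []
        for s, length in self.extents:
            if merged and merged[-1][0] + merged[-1][1] == s:
                merged[-1] = (merged[-1][0], merged[-1][1] + length)
            else:
                merged.append((s, length))
        self.extents = merged
```

The bitmap's `release` touches only the freed blocks, while the extent version has to restructure its index on every free, which is the extra work the comment blames for slow deletes.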
      • Sounds like they might benefit from the optimization of not putting newly freed blocks back into the main B+ trees until it's got some dead time...

        If they build another B tree (only trivially balanced) as they delete files they could return control to the system quickly, and then they could pull the free blocks out of the temp tree and spend the time to properly balance the main trees as it builds them.

        In the event that it needs those blocks *now* it could stop and take the time to merge them into the tree immediately.

        The benefit is that it would only have bad performance on very full drives, where it is writing immediately following a delete, into the freed space. As opposed to how it sounds now where it has bad performance on all deletes.

        Deletes are a common enough action that I think you'd want to optimize for them.
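The deferred-free idea proposed above might look roughly like this. It is a sketch under simplifying assumptions: an unsorted Python list plays the "temp tree", a sorted list plays the main free-space index, and `DeferredFreeList` and its method names are hypothetical.

```python
class DeferredFreeList:
    """Free-extent index that postpones the expensive merge work on delete."""

    def __init__(self, nblocks):
        self.main = [(0, nblocks)]  # sorted, coalesced (start, length) extents
        self.pending = []           # recently freed extents, not yet merged

    def release(self, start, count):
        """Delete path: just remember the extent and return immediately."""
        self.pending.append((start, count))

    def merge_pending(self):
        """Idle-time work: fold pending extents into the main index."""
        merged = []
        for s, length in sorted(self.main + self.pending):
            if merged and merged[-1][0] + merged[-1][1] == s:
                merged[-1] = (merged[-1][0], merged[-1][1] + length)
            else:
                merged.append((s, length))
        self.main = merged
        self.pending = []

    def _take(self, count):
        """First fit from the main index only."""
        for i, (s, length) in enumerate(self.main):
            if length >= count:
                if length == count:
                    del self.main[i]
                else:
                    self.main[i] = (s + count, length - count)
                return s
        return None

    def allocate(self, count):
        """Allocate; merge the pending frees only if the space is needed now."""
        start = self._take(count)
        if start is None and self.pending:
            self.merge_pending()
            start = self._take(count)
        return start
```

As the comment predicts, the slow path is only hit when an allocation cannot be satisfied without the freshly freed space, which mostly means a nearly full drive.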
  • by Anonymous Coward
    I hope that joe public will eventually realize journalling filesystems don't guarantee data integrity in the event of an unclean system shutdown.
    • Nothing can ensure data integrity in case of a mid-write shutdown. That's logically obvious.

      Journaling ensures filesystem integrity, which is very important. Mounting an unclean ext3 fs will take seconds - no need to check the filesystem for mid-write evidence, etc. - the journal says exactly what mid-write problems there are, and whether to delete them or keep them as files.

      If your system crashes in the middle of your work, and your hard drive wasn't physically damaged (it can happen. Use RAID if you're so paranoid), everything but your open files will be normal. Your open files might be 'un-journaled' (new official term? no) back to before you wrote them.
      • Journaling ensures filesystem integrity, which is very important. Mounting an unclean ext3 fs will take seconds - no need to check the filesystem for mid-write evidence, etc.

        Let's say the journaling file system has 5% overhead (it probably has more). That means you lose more than 1h per day on a busy server--it's spread out, but it's still lost. You'd have to do a lot of rebooting in order to make up for that in terms of "saved" fsck time.
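The arithmetic behind that claim can be checked with a few lines. This is a back-of-the-envelope sketch: the function names are made up, it assumes the server is busy 24 hours a day, and it treats throughput overhead as if it were lost wall-clock time, which is exactly the assumption the replies below dispute. The 15-minute fsck figure is borrowed from another comment in this thread, not measured.

```python
def daily_overhead_hours(overhead_fraction, busy_hours=24.0):
    """Hours per day 'lost' if every busy hour is slowed by the given fraction."""
    return overhead_fraction * busy_hours


def break_even_fscks_per_day(overhead_fraction, fsck_minutes,
                             replay_minutes=0.0, busy_hours=24.0):
    """Unclean reboots per day needed before journaling 'pays for itself'."""
    saved_hours = (fsck_minutes - replay_minutes) / 60.0
    if saved_hours <= 0:
        raise ValueError("journal replay must be faster than fsck")
    return daily_overhead_hours(overhead_fraction, busy_hours) / saved_hours
```

With a 5% overhead, `daily_overhead_hours(0.05)` comes to about 1.2 hours, and at 15 minutes saved per crash it would take roughly 4.8 crashes a day to break even on time alone.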

        • by cowbutt ( 21077 ) on Saturday November 17, 2001 @06:56AM (#2578029) Journal
          Let's say the journaling file system has 5% overhead (it probably has more). That means you lose more than 1h per day on a busy server--it's spread out, but it's still lost. You'd have to do a lot of rebooting in order to make up for that in terms of "saved" fsck time.

          Actually, Andrew Morton reckons [lwn.net] ext3 is actually quicker than ext2 in spite of the journalling. Go figure. :)


          • Speed and overhead are different things: You can have an incredibly fast webserver, but it may take 99% CPU to pull it off. That's bad. :)

            Same goes for filesystems. A great filesystem is going to have stunningly low overhead, and be blisteringly fast [ plus be 8-hours-sleep reliable, but you can only choose two ;) ].

            I mean... You want your machine to do things outside of managing their filesystems, don't you? ;)
          • That may well be, but it doesn't really affect the argument. Journaling imposes an additional set of constraints on when and where data needs to be written to disk, and an optimal file system designed under those constraints is going to have more overhead than an optimal file system designed without those constraints. Generally, with journaling, either you write the same data multiple times sooner or later (which, I believe, ext3 does), or you put data in places where it may take longer to get to when reading it.

            The only time where journaling doesn't have any significant overhead is if you put the journal on another device that can operate in parallel.

        • That's an overhead you can safely quash by other means (faster disk subsystem, efficient RAID, etc).

          A fsck is unpredictable wasted time you can't get around unless you've used a journalling filesystem - it may take hours, it may not work at all.

          I'll play it safe, thanks.
        • That's absolute bull. What's the difference between 10ms at 10% cpu load and 10ms at 20% cpu load?
        • by edhall ( 10025 ) <slashdot@weirdnoise.com> on Saturday November 17, 2001 @07:13AM (#2578040) Homepage

          A few points:

          1. You can't equate down-time to a slightly slower response time. Having a reboot time of tens of seconds vs. tens of minutes for (e.g.) a large source repository or a critical web server is well worth a minor performance hit. Reboot time is dead time for all who need access to the server.
          2. If your file server is running so close to capacity that a 5% decrease in maximum filesystem throughput represents a 5% slowdown in actual throughput, your server is dangerously overloaded already.
          3. In general, journaling affects write performance, not read performance. If your server performs mostly reads, the overall overhead of journaling may amount to much less than your 5% figure. Most (though not all) applications for file servers are read-intensive with incidental writes apart from the initial "load" of the server.
          4. Fast fsck's aren't the main reason for journaled filesystems. Rather, it's the improvement in filesystem integrity that is the main attraction -- an improvement that incidentally allows for fast fsck's.
          -Ed
          • Those are all the usual arguments. However, if you want reliability and avoid downtime, you must have redundant servers or replication; journaling will not protect against most of the problems that cause downtime. Once you have redundant servers, you can easily tolerate a little more time for fsck.

            What it comes down to is that journaling is a convenience feature. Relying on it for "filesystem integrity" or "reduced downtime" or "reliability" is foolish. You pay for fast reboots in slower performance and more complex file system code.

            • by sigwinch ( 115375 ) on Saturday November 17, 2001 @09:30PM (#2580052) Homepage
              However, if you want reliability and avoid downtime, you must have redundant servers or replication; journaling will not protect against most of the problems that cause downtime.
              Here in the real world we cannot afford triple redundant drives, motherboards, RAM, CPUs, power supplies, keyboards, mice, monitors, NICs, routers, and network cables for every single computer on every desktop in the entire organization. Sure, we could do it, but the cost would be ludicrous for a very small payback.

              Most computers simply don't need guaranteed zero downtime. What they need is bounded downtime. It's OK if they crash every once in a while, as long as they reboot cleanly within a few minutes. The biggest contributor to boot time after a crash is the file system check. Since a journalling file system can recover the file system within a few minutes, it is a huge win.

              Relying on it for "filesystem integrity" or "reduced downtime" or "reliability" is foolish.
              Here in the real world, even the big real-time transaction processing systems occasionally have common-mode failures that wipe out all the redundant subsystems at the same time. Lightning strikes, idiots frob the emergency power switch, etc. Thus, the big real-time systems need journalling even more desperately than the small systems.
              You pay for fast reboots in slower performance and more complex file system code.
              Sheer ignorance. Replication of filesystems and databases has at least as much of a performance hit as journalling, and the complexity is likely to be vastly higher.
            • Sure, redundancy is nice. That's entirely orthogonal to the issue here, however. If I have a farm of redundant servers, and one of them cooks its power supply, with a journaled file system I can be reasonably sure that the system will come back up without some unfortunate sysadmin pulling an all-nighter to reload after the spare PS is installed. Redundancy might make it less likely that a failure will inconvenience my customers/users, but it also makes it more worthwhile for me to reduce my per-server admin costs -- and journaling does that.

              -Ed
        • Actually, the new journaling filesystems (ReiserFS, XFS, and JFS) are all *faster* than ext2. Also, journaling itself can cost very little these days because modern JFSs use large buffers and coalesce writes. For example, BFS achieves metadata performance nearly as high as ext2 on a heavily loaded system. So if all you're doing all day is creating/deleting/growing/shrinking files, the filesystem is only slightly slower. When you factor in all the performance improvements, it ends up being faster.
  • Partition resizing? (Score:2, Interesting)

    by Bun ( 34387 )
    I've converted over to ext3fs, and am curious about one thing: resizing the ext3fs partitions. I know Partition Magic can resize ext2fs partitions with no difficulty, and Linux won't miss a beat. If the file systems are cleanly unmounted, as during a shutdown, and the ext3fs partitions are resized using Partition Magic, will there be problems? Is there anything in the journal that would make the kernel panic and puke on the newly changed partitions? I have no plans to do this; I'm just curious what would happen if I did.
    • by Anonymous Coward
      I can think of at least one method that will work. Read on.

      According to the article you can freely convert back and forth between ext2 and ext3. So it follows that you could convert an ext3 back to an ext2, then resize the ext2 file system. Afterwards convert back to ext3. Not so tough.

      Maybe it is possible to do it directly too, but the method I suggest will work in any case.

    • by Sapien__ ( 156881 ) on Saturday November 17, 2001 @05:45AM (#2577990)
      This thread [redhat.com] might be useful.

      To summarize: yes, it's possible to resize ext3 partitions, so long as your resizer doesn't mind. Don't use Partition Magic to do it. It doesn't like it. Badly.

    • Why don't you use GNU Parted? It supports ext3. Sorry, I can't help with Partition Magic.
    • Do be aware that your journal may be too small if you, let's say, double or triple the size of your partition.
  • Power loss (Score:2, Informative)

    by nick255 ( 139962 )
    Just had my first power-loss since switching to ext3 last night. Normally would take 10-15 minutes for my computer to restart after checking /home, etc. But today came up in just a couple of minutes with no corruption (or none I have noticed, or has been reported). So ext3 gets my thumbs-up!
    • Yep, I was reinstalling our main fileserver here at home the other week, upgrading to Red Hat 7.2. Unfortunately the space was a bit cramped and I didn't bother to put the cover back on the computer, so the power button ended up resting against my chair. Of course, this resulted in several instant power outages as I got up to get coffee, etc. I think I managed to instakill it 6 times total. Not a single problem noted, just fast log replays and up and running again :). Thumbs up for ext3 from me too.
  • by ppetru ( 24677 ) on Saturday November 17, 2001 @05:12AM (#2577968) Homepage

    The very existence of ext3, and its complete forward and backward compatibility with ext2, shows that ext2 was extremely well designed by its authors. Kudos to Remy Card, Ted Tso, and the rest of the ext2 team!

    Also, based on the same extensibility of ext2, Daniel Phillips is working on a directory indexing patch which speeds up ext2 by a huge factor when working with lots of files in a directory. You can get the preliminary patches here [linux.org] and see a graph of a simple file creation benchmark here [linux.org]. Amazing!

    • But why would someone now want to use ext2 given that (apparently, from the postings above) ext3 does a much better job (and so do other JFSes)? Why spend time on ext2 development? Or is that going to affect ext3 too, so that ext3 becomes faster than it is now?

      BTW, ext3 is doing a wonderful job on my desktop. One more thumb up ... ;-)
  • Do you still need the every 20th mount fsck???
  • ... is one that allows for 'query directories'.

    Explanation: I am of two minds about everything, so I can never decide how to organize my files, i.e. by category (like executables, libraries, html, music) or by product (like gnome, kde, MyPerferredApp). So I want to do both.

    I want a filesystem in which you can define directories by query of file attributes : e.g. :
    mkdir ~/gnome_bin --query -type=executable -package=gnome
    And then the system keeps my directory updated, and I can handle it with standard filesystem tools.

    I know that it isn't easy: that is why I'm asking for it as a Christmas gift.
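As a rough illustration of the wish above, a "query directory" could be modelled as a saved filter that is re-evaluated every time it is listed. Everything here is hypothetical: the `QueryDirectory` class, the attribute names `type` and `package`, and the in-memory catalog are stand-ins, since no filesystem in this thread stores such attributes.

```python
class QueryDirectory:
    """A virtual directory defined by attribute criteria instead of contents."""

    def __init__(self, name, **criteria):
        self.name = name
        self.criteria = criteria

    def matches(self, attrs):
        """True if a file's attributes satisfy every criterion."""
        return all(attrs.get(key) == value
                   for key, value in self.criteria.items())

    def listing(self, catalog):
        """Evaluate the query now, so new matching files show up automatically."""
        return sorted(path for path, attrs in catalog.items()
                      if self.matches(attrs))


# Example catalog mapping paths to attributes (made-up data).
catalog = {
    "/usr/bin/gnome-terminal": {"type": "executable", "package": "gnome"},
    "/usr/lib/libgnome.so": {"type": "library", "package": "gnome"},
    "/usr/bin/konsole": {"type": "executable", "package": "kde"},
}
gnome_bin = QueryDirectory("~/gnome_bin", type="executable", package="gnome")
```

Because the listing is computed on demand, adding another gnome executable to the catalog makes it appear in `gnome_bin.listing(catalog)` without any extra bookkeeping. The hard part the poster alludes to, doing this efficiently inside a real filesystem, is not addressed by this sketch.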

    • In other words you want a relational database for a filesystem. I think some of IBM's mainframes have features like that.
    • Did you mean this [slashdot.org]?

      Slashdot: News for nerds. Again, and again, and again...
  • extreme, perhaps? extendable? extraneous? extatic?
  • I just moved over to this ReiserFS a couple of months ago. I like it and all, but is ext3 better or faster? Faster is always better.
  • I am setting up a lab of 30 machines for internet surfing at my school. They are p200's with 32megs of Ram. I decided to go with XFS basically because I know SGI has been using it for a long time, and therefore, most of the bugs are probably worked out of it. For the Lab, I am using Redhat 7.1 XFS with IceWM as the window manager. The system boots, runs an autologin script I made, and goes into IceWM with Netscape.
    I was using Blackbox, but I decided not to, because it didn't look "Windowish" enough, and I didn't want people confused by it. IceWM looks great, runs fast, and has a little Penguin for the start button. It took me about a month to get all the net cards in the 30 computers (along with other stuff) and now all I have to do is haul them over to the middle school, and ghost them with the image I have on CD.
    I am very happy, because I have been working the bugs out of this project since August, and am almost done. Next Wednesday I hope to have all the machines done with. Then I get to find out how easily kids can trash Linux. But, I didn't secure it that much, because I feel that if they want to mess it up, all they would have to do is boot with a floppy and nuke the partitions. And it only takes a few minutes to re-ghost them. The 486 lab they have now has been surviving for 5 years now with no reinstall, so I think I'm safe.
    Does anyone have any comments that would help me out?
    • by Anonymous Coward
      But, I didn't secure it that much, because I feel that if they want to mess it up, all they would have to do is boot with a floppy and nuke the partitions.

      If possible, go into the BIOS settings, set the computers not to boot from floppy, and password-protect the BIOS settings. You should also add a password to lilo.conf and use the "restricted" keyword so people can't type "linux init=/bin/sh" and get a root shell (if you're using GRUB, similar options are available).

      After doing this, a student would probably need to open the computer and disconnect the CMOS battery to boot from a floppy. Or they could find a local root exploit. Try to minimize the number of setuid-root programs, apply security patches when they're available, and use a recent kernel (older ones had security problems).

      You might not expect your students to do stuff like this, but sometimes people try to hack their school's computers out of boredom. In my high school, a few people tried to run Netware exploits to gain administrator status (they didn't work). I've also heard of people installing keyboard sniffers on school computers to get passwords. Re-ghosting a machine will do nothing if someone knows your password.

    • Well, for best bets, you should reorder the boot order so that it boots from the hard drive first. Also, you should password protect lilo so that they can't type 'linux single' or its variants at the lilo prompt, and then you should chmod /etc/lilo.conf 600 so that no one else can read it. Beyond that, your best bet is to keep the packages up to date security-wise.
      -Aaron
  • has anyone done any benchmarks comparing ext2 and ext3 to reiserfs? i know the article mentions differences in performance, but i wanna see graphs and pictures :)
    • Namesys have compared ReiserFS, ext2, ext3 and XFS. They don't have graphs or pictures, but they do have easy to read numbers.

      Remember that Namesys is Hans Reiser's company, so they like ReiserFS, but I don't think they cheat with the benchmarks.

      http://www.namesys.com/benchmarks/benchmark-results.html
      • Remember that Namesys is Hans Reiser's company, so they like ReiserFS, but I don't think they cheat with the benchmarks.

        Cheat, probably not, but accurate to common usage of a filesystem?

        Be very careful interpreting those benchmarks, because the ones they consistently list first are the ones with a bunch of files that are 100 bytes in length, which is essentially the only area where Reiserfs really pulls ahead. Reiserfs is essentially tied with ext2 for all reasonably sized files that you would expect to find on a system. (Unless you're dealing with intense processing of millions of 100 byte files) When comparing ReiserFS to XFS and JFS, ReiserFS pulls way ahead for extremely small files, but the other filesystems perform notably better for reasonably sized files (10k) when synchronized.

        For practical uses, neither filesystem seems to really pull ahead, so it's worth considering other features when deciding which to use.
  • by Anonymous Coward

    Yet another Red Hat revolutionary product that the rest of the distributions promptly ignore. And with good cause.

    This talk of ext3 being faster than Reiser or XFS is crap. It's not faster, and on IDE hardware the journaling capabilities are offset by the way the IDE drives work. Ext3 is the weaker of the bunch on IDE hardware, to the point that you might as well not even use it. It seems the point of ext3 is to eliminate the need of fsck and not the benefits that can be had with journaling (as in XFS's xfsdump and xfsrestore).

    If you want a good journaling filesystem, use Reiser or XFS on FAST drive hardware. If you're not up to making the investment in SCSI or ATA 100 drives and insist upon running XFS or Reiser on your 5200 rpm 10 gig IDE drive, of *course* it'll be slow.
    • If you bothered to read the article you would know that there is one (significant) way which makes ext3 better than both Reiser and XFS. It can journal data.

      This means that you will not open up your XF86Config file and discover that the power failure has now resulted in a file full of zeroes.

      To me /that/ is a good reason to use ext3. I still use XFS on my file server though. Because that's what it (XFS) is great at. (Big files, lots of files and lots of accessing.)
  • Is it just me, or does ext3 sound like FAT16 > FAT32 and VFS, in that it's for all the little nancy boys who are too chicken/lazy to upgrade to a much better filesystem (and OS, while they're at it)?

    Not that the work done by the ext2/ext3 people isn't excellent, it's just that time is coming for extX to move on (be incompatible), or move aside.
  • I've been running ext3 fine for a few weeks now on my home box and my linux workstation at work. On Monday I decided to update our cvs server to kernel 2.2.20 (from .19) and ext3 and the next morning it was down big time. Reading logs, I could see that something had gone wrong during the big backup cronjob after 6am. It creates a 150-meg tmp tarball of our cvs repository for replication and it had only managed to do the first 4 megabytes. I also had a few "hda: lost interrupt" entries in the syslog, right during the time the backup process had halted. The disk was sloppy and not responding much, so it might be some h/w failure as well. I booted, the ext3 replayed the journal and everything seemed fine until I found some weird files with mysterious access bits set in some directories. I couldn't delete or move them. Also some files had disappeared and some others were corrupted, AFAIK. I took the system down to runlevel one, remounted partitions read-only and ran fsck.ext2 on them. It reported hundreds if not thousands of blocks belonging to more than one inode.

    This may just be some weird hardware failure but it just sounds too coincidental. The box has been rock stable for at least a year in its current h/w setup. I've been testing 2.2.20's fine on many machines before, both with ext3 and ext2. Now that I restored the old system from backups it's running on 2.2.19+ext2 again quite happily.

    I'd like to know if anyone else has had problems that may be related to ext3? I'm still running it on my personal boxen but it seems that our servers won't be seeing this new filesystem at least until it appears on Debian Potato, included in the standard 2.2.x kernel release. If ever.

  • I run a large lab and use IBM's LCCM Package to install OS images to 60 client systems that I use for benchmarks.
    LCCM does not support installing Linux like it does Windows OS's.
    I attempted to use the latest Norton Ghost, and it will only allow ext2 filesystems to be created.

    Anyone out there used IBM's LCCM to install ext3 filesystems? Or have a good process for making an image of an already installed system for mass installs?
  • snapshots (Score:3, Interesting)

    by Anonymous Coward on Saturday November 17, 2001 @01:23PM (#2578784)
    Not to "troll" for my fav OS or whatever, but I've been playing with snapshots in FreeBSD-CURRENT for the last few days, and I must say that this is quite possibly the coolest filesystem technology I have ever seen.

    In short, a snapshot is approximately equal to an image of a filesystem. To create a snapshot, you run a mount command like "-u -o snapshot /var/snapshots/snap1 /var". Because of the way snapshots work, the snapshot must reside in the same filesystem that it contains.

    Now, once the snapshot is created, it can be treated like another filesystem. You can run fsck on it, dump it, or even mount it. The only difference is that within the snapshot, previous snapshots will appear as null files.

    Basically, when you create a snapshot, you tell the filesystem that you want its contents at the current time preserved, and the snapshot file is where it does this. Now, whenever said filesystem is modified, the modification is basically applied in reverse to extant snapshots. So, when a snapshot is first taken, it doesn't contain much information, but when you rm a file living in the directory, the file is saved into the snapshot. When you modify a file, deltas to reverse the change are saved to the snapshot.

    This is extremely powerful used in the hands of a good sysadmin. Imagine your server that is backed up to tape every week. When someone comes asking for a file they clobbered or deleted by accident, you say "how old was the file?" - you know if they say "8 days", you have to go restore from tape, and if they say "2 days", you have to tell them that they are out of luck. Now imagine if a cron job was set up to take a snapshot once a day, and clear out old ones once a week. If they say "8 days", you still have to go fetch the tapes, but if they say "2 days", all you need is some mdconfig, mount, cp, and umount action to restore the file. How cool is that?

    Snapshots essentially give your filesystems the "undo" capabilities that your editor has.
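The "modification applied in reverse" behaviour described above is a form of copy-on-write, and a toy version fits in a few lines. This is only a sketch of the idea as the comment describes it: `SnapshotFS`, its dict of blocks, and its method names are invented for illustration and are not the FreeBSD implementation.

```python
class SnapshotFS:
    """Toy block store where snapshots start empty and fill in on change."""

    def __init__(self):
        self.blocks = {}     # live filesystem: block number -> data
        self.snapshots = {}  # snapshot name -> {block number -> old data}

    def snapshot(self, name):
        """Taking a snapshot copies nothing up front."""
        self.snapshots[name] = {}

    def _preserve(self, block_no):
        """Before a block changes, save its old value into each snapshot
        that has not already saved it (None means it did not exist)."""
        for saved in self.snapshots.values():
            if block_no not in saved:
                saved[block_no] = self.blocks.get(block_no)

    def write(self, block_no, data):
        self._preserve(block_no)
        self.blocks[block_no] = data

    def delete(self, block_no):
        self._preserve(block_no)
        self.blocks.pop(block_no, None)

    def read_snapshot(self, name, block_no):
        """Read a block as it was when the snapshot was taken."""
        saved = self.snapshots[name]
        if block_no in saved:
            return saved[block_no]
        return self.blocks.get(block_no)  # unchanged since the snapshot
```

A snapshot only grows as the live data diverges from it, which matches the observation that a fresh snapshot "doesn't contain much information" and is why a daily snapshot from cron can be cheap on a quiet filesystem.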
    • In short, a snapshot is approximately equal to an image of a filesystem. To create a snapshot, you run a mount command like "-u -o snapshot /var/snapshots/snap1 /var". Because of the way snapshots work, the snapshot must reside in the same filesystem that it contains.

      Linux has snapshots too, but only if your filesystem is in a logical volume managed by LVM. An LVM tool calls the kernel LVM driver, which tells the filesystem in question to "quiesce itself" (i.e. make itself consistent on disk) if possible (only ext3 and reiserfs support this operation at the moment), then it creates a new snapshot logical volume, which is of course COW (as are, I assume, FreeBSD snapshots), and finally the filesystem is given permission to continue operations.

      Unlike the FreeBSD snapshot facility (as you describe it, anyway) the new logical volume is read-only - you can't fsck it etc.

      This is extremely powerful used in the hands of a good sysadmin.

      Indeed, I've been thinking of using Linux snapshots more or less the way you describe. (Our shop is small enough that it's hardly worth doing daily backups, but if it's easy enough....)

  • ext3 not solid to me (Score:1, Interesting)

    by Anonymous Coward
    kernel 2.4 is not solid enough for me
    to even start testing. if it doesn't
    run on 2.2 then it's not solid enough
    for my needs. i don't want to have
    to upgrade a kernel every 2-3 months
    to fix a critical bug. reiserfs is
    decent i use it on a few machines, but
    ext2 is still the dominant filesystem
    on my ~40 linux servers. most have
    2 hours of battery backup and never
    crash. so journalling isn't much of
    an issue. the last power outage
    that lasted more than 30mins that i've
    experienced was back in 96 or 97 when
    a tree branch broke a line and caused
    most of the west coast to go to brownout/blackout state for 3-4 hours.
  • I've just recompiled my kernel to 2.4.14. So when will they add ext3 to the regular file systems rather than waiting for the patch to come out? I mean the first time I tried to apply the patch, bad things happened.

    I don't like seeing kernel panic messages.
