Linux File System Shootout
IpSo_ writes "Finally an extensive, human readable Linux file system benchmark has been unleashed upon us. Originally posted on the Linux Kernel mailing list, using two of the most popular benchmarking tools available, it compares all the major file systems, including their different mount options. The results are surprising."
human readable ? (Score:4, Funny)
Re:human readable ? (Score:3, Insightful)
HTH
Re:human readable ? (Score:2)
Re:human readable ? (Score:5, Insightful)
The human readable result is you need to know what you want. There is no silver bullet.
It looks like xfs wipes the floor for all but temporary (loads of create/delete) file usage. Jfs looks like the best all-rounder. Reiser looks like something that can be tuned to the specific usage, but eats CPU for breakfast, lunch and dinner and EXT3 "surprise, surprise" sucks rocks. The other "surprise, surprise" is that EXT2 is still very good for many uses.
Frankly, I do not see anything new and fascinating in the results, but they are good to throw at people who keep asking me "why not EXT3" and "Why XFS or EXT2". Here is why!
Re:human readable ? (Score:2, Informative)
Really? You must be looking at a different set of benchmarks to me, because as I see it, ext3 is running a close race with XFS to take second place behind JFS. Remember, ext3's journalled mode is journalling data as well, and hence it isn't fair to compare it to other filesystems directly as it's doing much more work (equally, ext2 comes out on top for a number of things because it's doing far less work). Others like reiserfs, XFS and JFS are journalling metadata only
Re:human readable ? (Score:4, Insightful)
So much for the "human readable" aspect of these benchmarks. Everyone seems to be walking away with a different idea of what the results are supposed to be showing.
Since there was no legend explaining what the colors meant, I couldn't figure out anything from looking at them. Is the high number good? As in did the most work? Or is the high number bad? As in took the longest amount of time to do something?
Re:human readable ? (Score:4, Informative)
Depends on the column. For K/sec, higher is better, so red cell shows lowest, and green shows highest. For %CPU, lower is better, so green shows lowest and red shows highest. It's not that complicated really if you take a few minutes to look at it. What you get from the data depends on what you were looking for in the first place.
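The colouring rule described above can be sketched in a few lines of Python. This is only an illustration: the column names come from the comment, but the filesystem labels and all the numbers are made up and are not taken from the benchmark.

```python
# Direction of "good" per column, following the rule described above.
HIGHER_IS_BETTER = {"K/sec": True, "%CPU": False}

def best_and_worst(column, results):
    """Return (best_fs, worst_fs) for one column of {filesystem: value}."""
    higher = HIGHER_IS_BETTER[column]
    best = (max if higher else min)(results, key=results.get)
    worst = (min if higher else max)(results, key=results.get)
    return best, worst

# Hypothetical values for illustration only (not benchmark data).
throughput = {"fs_a": 21000, "fs_b": 18500, "fs_c": 24000}
cpu = {"fs_a": 35, "fs_b": 12, "fs_c": 60}

print(best_and_worst("K/sec", throughput))  # green cell first, red cell second
print(best_and_worst("%CPU", cpu))
```

Note that in this made-up data the filesystem with the best throughput also has the worst CPU figure, which is exactly why a row can look "red" and still finish first.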
Re:human readable ? (Score:4, Insightful)
especially if the file system can be mounted read-only; you could do this in partitions like
I wonder if the kernel version makes a difference: are XFS and JFS better on 2.6 compared to, say, the 2.4.19 that I'm running now? Or how about with files that are much smaller, like on the typical web server?
Also, partitioning schemes can make a big difference in overall performance. In the old days I placed the swap partition in the center of the disk (most accessed in the center, where the heads are most likely to be), the next most likely like
Also, does the disk make a difference? For example, are any performance differences consistent between SCSI and IDE drives?
These are things that need to be looked at before making a decision as to which is best, but it definitely appears that we need to do some looking now.
Re:Other's don't do journaling? (Score:2, Informative)
Re:human readable ? (Score:4, Informative)
Mount static filesystems read-only, and make them EXT2 for performance. Use a journalling FS for dynamic filesystems. Reap the benefits of both.
Come on not everybody understands this... (Score:2)
Are you saying WTF? Well, congratulations, not everybody wants to know that level of detail in a particular subject.
The results are a bunch of numbers that are totally incomprehensible to some people because they do not understand that level of detail.
Some folks would rather know why a particular file system is better than anothe
Cheaters! (Score:5, Funny)
Re:Cheaters! (Score:3, Interesting)
Re:Cheaters! (Score:2)
Re:Cheaters! (Score:3, Informative)
FAT performs well as long as you just do sequential access to large files. But don't access too many files at the same time, because there are only eight entries in the fat_cache. If you run over this limit or do random access, FAT is going to be the world's slowest filesystem. And that is so bad it has really caused me trouble sometimes.
The reality is minix is faster, cleaner, simpler, and more flexible than FAT. Just take a look on the source minix
/.ed already? (Score:2, Informative)
Huh? (Score:3, Interesting)
Has anyone had good or bad experiences using JFS?
Hmm... I think I'll set up my test box with JFS...
Re:Huh? (Score:5, Informative)
From what I've seen poking around Usenet, JFS seems to have the "too little, too late" problem. I've never seen it pwn a benchmark like it did today though.
I'm a little confused: I have been told XFS is the best designed, highest performing file system, and I would hate to think SGI is getting into a lot of this crap with SCO for a relatively slow journaling file system...
Re:Huh? (Score:5, Informative)
IIRC, XFS is more about guaranteed performance under various stressful conditions than about getting the absolute peak speed in calm conditions.
Re:Huh? (Score:2, Informative)
I don't know how JFS falls in the "too little, too late" category. Both file systems have been available for a long time on Linux; however, very few Linux distributions embrace them during installs, so they have gone unknown to a great many of the non-storage geeks out there. Mandrake, much to their credit, has for a long t
Reliability? (Score:2, Informative)
I wonder if anyone has some experience with the reliability of the current version?
Re:Huh? (Score:2, Interesting)
Throughput benchmarks only... (Score:5, Insightful)
Notably missing are more day-to-day useful operations such as the creation and deletion of lots of files, parallel action on many open files, lots of files in a directory, etc.
When I want to select a filesystem, I do not want to know how fast it can read a 3GB file sequentially. I want to know how well it performs on a fileserver, mailserver etc.
Re:Throughput benchmarks only... (Score:5, Informative)
Re:Throughput benchmarks only... (Score:3, Insightful)
Horses for courses, my friend. If you are running a database or doing video editing then reading large files is exactly what you do want. This is exactly what SGI's customers do, and that is why XFS is the default IRIX filesystem.
file size a problem? (Score:5, Insightful)
Re:Throughput benchmarks only... (Score:4, Insightful)
You want to quickly do readdir operations, quickly open many of the files and read some data from them, etc.
When you run a fileserver and don't want to explain to your users that it is not a good idea to have 100.000 files in a single directory as a storage format for filled-in entry forms, you want that situation to be handled well by the filesystem.
When a nightly backup operation needs to read the tree that includes that directory and write it out to a directory on a different disk (and remove the copy of a few days before), it should be able to do that without spending ridiculous amounts of time or excessively wearing out the disk drives.
Those are not hypothetical situations, those are situations that I encounter in real life.
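A rough way to measure the workload described above (many small files in one directory: create, readdir, read, delete) is sketched here in Python. The file names, payload size and file count are arbitrary assumptions. It runs in whatever filesystem holds the temporary directory, and the page cache will flatter the numbers unless the data set is much larger than RAM, so treat it as a starting point rather than a real benchmark.

```python
import os
import tempfile
import time

def time_small_files(count, payload=b"x" * 64):
    """Create, list, read and delete `count` small files in one directory.

    Returns elapsed wall-clock seconds per phase plus the number of files seen.
    """
    timings = {}
    with tempfile.TemporaryDirectory() as root:
        start = time.perf_counter()
        for i in range(count):
            with open(os.path.join(root, f"form_{i:06d}.dat"), "wb") as f:
                f.write(payload)
        timings["create"] = time.perf_counter() - start

        start = time.perf_counter()
        with os.scandir(root) as entries:
            names = [entry.name for entry in entries]
        timings["readdir"] = time.perf_counter() - start

        start = time.perf_counter()
        for name in names:
            with open(os.path.join(root, name), "rb") as f:
                f.read()
        timings["read"] = time.perf_counter() - start

        start = time.perf_counter()
        for name in names:
            os.remove(os.path.join(root, name))
        timings["delete"] = time.perf_counter() - start
    timings["files"] = len(names)
    return timings

if __name__ == "__main__":
    for phase, value in time_small_files(2000).items():
        print(phase, value)
```

To compare filesystems, one would point the temporary directory at a mount of each type (for example via the `TMPDIR` environment variable) and repeat the run several times.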
i am a human (Score:2, Funny)
Re:i am a human (Score:3, Funny)
Is that "layman's" or "lamer's"?
Re:i am a human (Score:2)
Short summary (Score:5, Informative)
best: jfs
worst: ext3_journal
bonnie++ benchmark
best: ext2
worst: reiser4/reiser4_extents, ext3_ordered/ext3_journal
Re:Short summary (Score:5, Insightful)
worst: reiser4/reiser4_extents
You might think that just based on the amount of red in Reiser4's row, but if you look all the way over to the right, you'll notice something interesting: Reiser4 often completes the benchmark in significantly less time than the other filesystems. Reiser seems to be catching a lot of flak for the CPU usage (certainly it gets a lot of red boxes in this benchmark because of it). Personally, though, I've got CPU to spare. Disk seek times aren't changing drastically anytime soon, unlike CPU speeds. If I can trade some CPU cycles for less wasted disk seek time, I think that's a great trade.
Re:Short summary (Score:4, Interesting)
I totally agree. CPU cycles are a lot less important on my box than disk seek times. Then again, I'm guessing that the people this will be most relevant for are those running servers. Mine are all running reiserfs and ext2.
Re:Short summary (Score:2)
Re:Short summary (Score:3, Insightful)
Re:Short summary (Score:4, Informative)
I'm not saying that trading CPU for filesystem speed is a bad idea; it isn't. What I'm saying is that it's not a simple "more is better" function, and that the cutoff for when it no longer makes sense does depend a lot on the application you intend it for. Again, to take an extreme, you would not want to have a system where the filesystem eats so much CPU the rest of the system essentially blocks, starved for CPU time, when the disk is used.
To take an even more extreme way of doing the tradeoff: you could compress and uncompress all data on the fly. That way you would increase transfer speed (and increase it quite a bit in the case of text files and similar) as well as decrease disk usage. It is not often done, though, because the tradeoff is not worth it in general.
For us, and our app, Reiser is on the wrong side of that cutoff point (and Reiser4 is not even on the horizon yet).
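The compression tradeoff mentioned above is easy to get a feel for with Python's standard `zlib` module. This is a sketch, not a filesystem test: the repeated sentence below is an assumption standing in for "text files and similar" and compresses far better than real text would, while the random bytes stand in for data that is already compressed.

```python
import os
import zlib

def compression_ratio(data):
    """Return compressed size divided by original size (lower is better)."""
    return len(zlib.compress(data)) / len(data)

# Highly repetitive text-like data versus incompressible random data.
text_like = b"the quick brown fox jumps over the lazy dog\n" * 2000
random_like = os.urandom(len(text_like))

print(compression_ratio(text_like))    # far below 1: fewer bytes hit the disk
print(compression_ratio(random_like))  # about 1: CPU spent for no gain
```

The second case is where the tradeoff goes wrong: the CPU time is spent either way, but the amount of data transferred does not shrink.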
this benchmark was performed using a 200MHz CPU (Score:5, Informative)
I don't really target 200MHz CPUs in my performance tuning....;-)
Hans
Re:this benchmark was performed using a 200MHz CPU (Score:5, Informative)
I took an old PII-350 w/ 128MB RAM and benchmarked ext2, ext3, jfs, reiserfs, and xfs on an old 5GB IDE drive. ext2 was the winner by a margin (raw throughput).
Now I'm beating up various hardware and software RAID configs on a dual Athlon MP 2200+ system w/ 2GB RAM and dual 3ware 8-port 7500 controllers w/ 180GB WD drives. JFS rises above the rest in terms of throughput (I didn't test XFS on this new machine), and, of course, reiserfs simply spanks everything in terms of file creation/deletions. The thing I noticed was that JFS had much lower CPU utilization for file creations/deletions and was twice as fast at it as the ext2/3 filesystems (it still got spanked by reiserfs, though).
If anyone's interested, the "best" overall was reiser w/ the mount options noatime,notail,nodiratimeall. Also, if anyone cares, on this machine, the Linux software RAID code delivered no less than twice the performance numbers of the 3ware hardware RAID. Running RH9 with all RH updates applied.
Re:Short summary (Score:3, Interesting)
Now, if I had a few large database files, then I might think of changing to something else - but I don't have that situation on my workstation, and probably never will. To echo what Spy Hunter said, I am willing to trade some CPU cycles for more efficient data retrieval - m
Re:Short summary (Score:2, Informative)
I'm a bit confused here: aren't newer file systems (like ext3 etc.) supposed to be better than the older ones!?
Re:Short summary (Score:2, Informative)
From: Comment 7167683 [slashdot.org]
We allocate a "jnode" per unformatted node in the filesystem. The traversing of these jnodes consumes more CPU than performing the memcpy from user space to kernel space when doing large wri
Re:Short summary (Score:2)
Definitely no. The "newer ones" mainly add journaling, and that means, at least meta-data has to be written twice on the physical medium, once into the journal, and once to the "real" place inside the FS' block-allocation and tree structure.
E.g. reiserfs tries to s
Re:Short summary (Score:2, Insightful)
these are narrow tests, not comprehensive tests (Score:5, Insightful)
It would be nice if exactly what they did was explained. You know, things like how you can get both the lowest total elapsed time and the worst overall score on one of the runs (because of CPU usage?), what task was measured by each of the numbers printed, what the different settings on the different runs mean...
Sigh, time to go read the source code for them.
Re:these are narrow tests, not comprehensive tests (Score:3, Insightful)
Just looking at the table of results, it seems clear that the bonnie test has a penalty for cpu uti
Re:these are narrow tests, not comprehensive tests (Score:3, Informative)
In this case, I think that this 200MHz CPU benchmark is not really worth optimizing for, but generally more views of a design are interesting.
One of the things reiser4 needs to do is not have a structure per unformatted node for large files, and you can see the need for that if you look at our CPU consumption when writing a large file using dd. We'll probably
Size on disk too? (Score:2, Interesting)
It would be nice if non-Linux filesystems (FATxx, NTFS etc) were also benchmarked.
histogram, please! (Score:2, Informative)
Re:histogram, please! (Score:2)
Gripping! (Score:2)
What I find surprising... (Score:2, Interesting)
And, as many others have already pointed out, it would be nice to have a comparison of these file systems with the *BSDs' FFS...
Any comment on this would be greatly appreciated!
Results question (Score:4, Insightful)
If I am wrong, please either respond to correct me or email me.
scythefwd@yahoo.com
Re:Results question (Score:5, Informative)
I know we're used to seeing "benchmarks" used as corporate propaganda, but let's not forget what they're supposed to be used for
Re:Results question (Score:2)
--Note that reiserfs v3 is the only version available right now for the stable Linux kernel; I don't even know why they bothered testing v4, because it's only in 2.6 - and alpha code, at that.
Re:Results question (Score:2)
The conclusion I drew from this was that if you don't have many spare cycles, e.g. because you are running high-crunch applications or running with a small CPU, ext2 has its advantages.
But if you are primarily using a system to serve files rather than cycles, or you expect to fsck often, by all means go with jfs.
DeFacto Standard (Score:5, Insightful)
I'm not trying to be an asshole or a troll; just hear me out.
I love Reiser. I also love Gentoo and adore Debian. Another guy, Joe, and I are the main "linux geeks" in our computer group (cugy.net). When it came time to decide what to support at our group, we had to choose RedHat.
If I'm in a message board or IRC channel, I need to know some things about the guy I'm helping. We recommend RedHat because that is the biggest US company behind Linux (IBM and SUN notwithstanding). If I am teaching people about Linux, then it is to both our advantages to teach/learn about what we will see "in the field". Therefore, we only support RedHat.
What does this have to do with anything? Well, RedHat 9 and Severn do not allow the creation of Reiser by default. I could probably boot from a Gentoo disk and format a partition to Reiser, then install RedHat to it. But, by default, only ext* is allowed.
I love to do things that improve performance. I love testing new things on my laptop or on an offline box in our test lab. But unless RedHat offers it, it will remain in the shadows of the linux world, which is, in turn, in the shadows of the user enclave. Hell, every important box on my network is either RedHat or Win2k.
More on topic, Joe got a lot of recognition when the "internet got a lot faster". Did he upgrade the firewall? Did he install another OC-3? Maybe he reconfigured services on the proxy?
Nope, he installed a hard drive, formatted it to Reiser, and moved the proxy cache to the reiser disk. I couldn't believe it. Just changing the filesystem caused an increase that was noticeable across our network. At no cost!
Good work, Joe.
Re:DeFacto Standard (Score:2)
Re:DeFacto Standard (Score:2)
Re:DeFacto Standard (Score:2)
Also, for something that's not important to keep across reboots (like a proxy cache) you should probably stick with Ext2, mount it async, and maybe re-create it upon reboots. That's what I do with
"linux reiserfs" (Score:5, Informative)
I've been using this method for ~2 years now.
Re:DeFacto Standard (Score:2, Insightful)
Squid is a cache of (parts of) the internet. It can be rebuilt pretty easily. For example, the next time a user goes to a page. It might cost you a fraction of a second the first time, but after that you're sweet. Journalling transitory data just adds unnecessary overhead.
If it's quite a large cache with a number of binary objects, why don't you just mirror it up front?
A mail spooler or news spooler that has to keep somewhat static data I'd h
Re:DeFacto Standard (Score:2)
* Tuned to small files
* faster
Seems useful for a cache to me.
Re:DeFacto Standard (Score:3, Informative)
For the same reason you would want to have email, or a file server, on a journaled system, recovery speed.
I have some clients with servers (that run squid) and when they take a power hit long enough to drain their UPS, the last thing I want to have to do is deal with a call saying "how come the server did not come back up..." Meanwhile the fsck is still running and they are hitting the power switch trying to "reboot" the problem away b
Re:DeFacto Standard (Score:3, Informative)
If you're looking to restart quickly after a power failure you can always set a partition to ignore file system checks at startup, "0 0" options in
You have never waited over an hour to fsck 3 harddisks while over 100 people have no "internet".
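Assuming the truncated quote above refers to the last two fields of an /etc/fstab entry (the dump flag and the fsck pass number), the idea can be sketched as follows. A pass number of 0 means the filesystem is not checked at boot. The device names and mount points below are hypothetical examples, not anyone's real configuration.

```python
def parse_fstab_line(line):
    """Split one fstab entry into its whitespace-separated fields.

    The fifth (dump) and sixth (fsck pass) fields default to 0 when omitted.
    """
    fields = line.split()
    device, mount_point, fs_type, options = fields[:4]
    dump = int(fields[4]) if len(fields) > 4 else 0
    passno = int(fields[5]) if len(fields) > 5 else 0
    return {
        "device": device,
        "mount_point": mount_point,
        "type": fs_type,
        "options": options.split(","),
        "dump": dump,
        "passno": passno,
    }

def checked_at_boot(entry):
    """A pass number of 0 tells fsck to skip this filesystem at boot."""
    return entry["passno"] != 0

# Hypothetical entries: a throwaway cache partition and a root filesystem.
cache = parse_fstab_line("/dev/hdb1  /var/spool/squid  ext2  noatime  0 0")
root = parse_fstab_line("/dev/hda1  /  ext3  defaults  1 1")

print(checked_at_boot(cache))
print(checked_at_boot(root))
```

Skipping the check only makes sense for data that can be thrown away and rebuilt, which is the point both posters are arguing about.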
Re:DeFacto Standard (Score:4, Informative)
He installed a hard drive. He didn't just format to reiser. The hard drive costs money.
If the proxy cache was formerly on a disk that was also doing other things, it would have sped up no matter what filesystem he used.
You will have to give us more information if you want your claims to have any merit.
Summary (Score:5, Informative)
Re:Summary (Score:3, Informative)
Note on ext3 (Score:2, Interesting)
Here's my $699 now. (Score:3, Funny)
Where's the deviation? (Score:5, Insightful)
For example, consider that hard drives do their own error correction. Depending on the location of marginal blocks on the media, different file systems can score dramatically differently for no other reason than that the drive's re-mapping or error correction logic is kicking in at a bad point. Alignment of data can also be a factor in performance, and depending on the formatting procedure it may be completely random with respect to the file system sitting on top of it.
For these reasons and a host of others, it is not reliable to do filesystem performance comparison on a single machine.
Bottom line is that there is a good chance that these data are not fair representations of the relative merits of each filesystem.
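What the poster is asking for can be sketched with Python's `statistics` module: repeat each run, report mean and standard deviation, and only call a difference real when it is large compared to the spread. The throughput numbers are invented for illustration, and the "two standard deviations" rule is a crude assumption rather than a proper significance test.

```python
import statistics

def summarize(runs):
    """Return (mean, sample standard deviation) of repeated measurements."""
    return statistics.mean(runs), statistics.stdev(runs)

def clearly_different(a, b, k=2.0):
    """Crude check: are the means further apart than k times the summed spread?"""
    mean_a, sd_a = summarize(a)
    mean_b, sd_b = summarize(b)
    return abs(mean_a - mean_b) > k * (sd_a + sd_b)

# Hypothetical throughput figures (MB/s) from five repeated runs each.
fs_a = [100, 102, 98, 101, 99]
fs_b = [103, 97, 105, 95, 100]
fs_c = [120, 121, 119, 120, 120]

print(summarize(fs_a))
print(clearly_different(fs_a, fs_b))  # same mean, noisy: no real difference
print(clearly_different(fs_a, fs_c))  # gap far exceeds the spread
```

A single run per filesystem, as in the benchmark under discussion, gives no way to tell which of these two situations applies.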
Odd (Score:2, Insightful)
Ext3 (Score:4, Informative)
Rus
Re:Ext3 (Score:3, Insightful)
I'm also very interested in how easily a corrupted file system can be repaired (or can auto-heal) after a hard crash.
Simple undelete would also be on my wish list, just in case.
Anyone here who's got those features ready?
Re:Ext3 (Score:3, Informative)
I had ext3 on my wife's laptop for a while, and it failed twice. By "fail", I mean that, due to Linux crashes, the filesystem had errors that had to be recovered by hand. By "fail", I mean actual, significant data loss.
When I got my new laptop (from QliLinuxPC), they formatted the HD into one big partition (well, one for /boot and one for /), and formatted those as ext3. I didn't switch to ReiserFS, because QliLinuxPC said they'd had good luck with ext3. In the past
Is this realistic? (Score:2, Interesting)
Re:Is this realistic? (Score:2)
Re:Is this realistic? (Score:2)
So you're saying that you don't want to see tests based on typical hardware? I can't think of anything better to test.
TWW
Re:Is this realistic? (Score:2, Insightful)
Can't wait for Novell Storage System on Linux (Score:5, Interesting)
I could go on... About the only thing it is missing is encryption. Of course it remains to be seen whether the port to Linux will be successful, and whether Novell has the sense to make it open source.
Re:Can't wait for Novell Storage System on Linux (Score:3, Informative)
Novell could give *nix systems Windows-like (don't bash if you don't know) fine granularity over user access at the enterprise level along with true enterprise scaling. Again, if you have never worked in a cross enterprise environment, don't start bashing
Re:Can't wait for Novell Storage System on Linux (Score:3, Interesting)
This is kinda weird (Score:2, Interesting)
Irrelevant numbers (Score:2, Informative)
Myself, I'd be much more interested in seeing numbers made on a setup more like my own.
Static benchmarks are never good for deciding "which is best".
Re:Irrelevant numbers (Score:2)
These [oregonstate.edu] benchmarks, by the guys who host most of Gentoo, were done on a dual 2.8GHz P4 Xeon machine with 2GB of RAM (Dell PowerEdge 2650) and (AFAICS) reach broadly similar conclusions - 'best' depends on your usage, but JFS is pretty good, reiser uses a lot of CPU etc.
there is more to a filesystem than speed. (Score:3, Interesting)
Re:there is more to a filesystem than speed. (Score:2, Informative)
Sort of on topic... (Score:3, Interesting)
I have two different physical drives in this machine now and I dual boot between them. Linux (for just about everything I do) and then WinXP (for things that absolutely require Windows.)
The new drive I'm getting will be hooked up to my machine externally via Firewire. (I don't need help with the external setup. I already have another drive hooked up this way and it works just fine.)
Now my question is - what is the best file system to use for compatibility between Windows XP and Linux. I require full read/write access to this drive whether I'm in Linux or WinXP. I know NTFS is out. (Even with the 2.6 kernel, write support from Linux to an NTFS partition is limited [can't create new files or directories] and Linux NTFS writing is not considered completely safe.)
I'm guessing VFAT is my only option but I thought I would ask around first.
I do have another machine lying around but I don't want to set it up as an NFS/Samba server for a few different reasons. #1. I don't want to leave the machine on 24/7. #2. I don't want to tie up that machine. I like experimenting with new things so if I turned that machine into a full time server, I wouldn't have a test bed machine any more. #3. I don't like NFS.
I have also thought about one of those Network Attached Storage systems. Maybe someday, but at this point in time that idea is out too.
Does anyone have any experience with this? What solutions have you come up with?
Re:Sort of on topic... (Score:3, Informative)
Back when I still dual-booted, I had this layout:
5 gig NTFS WindowsXP partition
5 gig XFS Slackware partition
1 gig swap partition
45 gig reiserFS shared storage partition
This also made me feel a lot safer in using the systems: Neither ever mounted the other system's Root directo
Re:Sort of on topic... (Score:4, Informative)
There are some free and some commercial products which can offer full read/write + journalling access for ext3 partitions from Windows. I'd definitely recommend you pick ext3 over fat32.
Some examples..
Free: Explore2fs [swin.edu.au] allows you to read ext2 and ext3. Limited write support is available.
Commercial: Ext2FS Anywhere [ext2fs-anywhere.com]. Don't let the name put you off, as it has full read/write support for ext2 and ext3, and I think reiserFS is supported now too.
NTFS read/write in Linux (Score:4, Insightful)
Because NTFS specs are locked in a dark closet near Seattle, never to be seen by the evil Linux developers. These developers, fearing for their lives, will never have the guts to deem their NTFS write support stable - there will always be a slight chance you'll corrupt your entire disk table - and no one wants their so-called "stable" driver to corrupt people's data.
In NT4, NTFS was at version 1.2, aka NTFS 4.0 (to align with NT 4.0). In Windows 2000, it was version 3.0, aka NTFS 5.0. And in XP, it's version 3.1, also known as NTFS 5.1. The point is, NTFS is a moving target, so it's unlikely we'll see effective NTFS abilities in Linux anytime soon.
Worth Noting (Score:4, Informative)
As for complaints about Reiser's performance -- last I heard, it was more optimized for many small files -- precisely the domain that this thing didn't test.
Some suggestions for future tests (Score:3, Informative)
The problem that we had with JFS during testing is that we had kernel panics with very large files. Thus we chose XFS - which has done an excellent job. I'm sure glad that the XFS file system has been merged into the 2.6 kernel, no more patching the 2.4's!
For more benchmarks on other file systems using postmark, check out this [shub-internet.org]
Re:The only one that matters (Score:2, Informative)
Re:The only one that matters (Score:2, Insightful)
Anyway, the same accounts for DB-servers, etc. etc.
Please think before you flame
Re:The only one that matters (Score:2)
Re:The only one that matters (Score:2)
I haven't seen a more incorrect statement for quite a while.
Re:The only one that matters (Score:2)
Re:Didn't jfs do well (Score:2, Interesting)
Re:Didn't jfs do well (Score:2, Funny)
Perh
Re:Didn't jfs do well (Score:3, Interesting)
It is the core issue of their suit against IBM and everything that has followed out of it.
KFG