XFS merged in Linux 2.5
joib writes "According to this notice, the XFS journaling file system has been merged into Linus' BitKeeper tree, to show up in 2.5.36." Ya just know someone out there wants to have every journaling file system on one drive just 'cuz.
New file system (Score:4, Funny)
Re:New file system-attribute. (Score:3, Funny)
Comparison? (Score:3, Interesting)
Re:Comparison? (Score:3, Informative)
Google is always your friend [google.com].
-B
Re:Comparison? (Score:5, Informative)
Re:Comparison? (Score:2)
Basically I still see ext3 as much more full-featured than ReiserFS (it supports file attributes, which can be useful in many places on the system), but ReiserFS is faster (esp. for small files), so if you have databases, or are using your filesystem as a hierarchical database, maybe Reiser is for you.
Now how does XFS compare to these two?
Re:Comparison? (Score:2, Informative)
My understanding (Score:3, Informative)
ext3:
* can be told to journal everything, including data (not just metadata) -- most theoretical reliability.
* is backwards compatible with ext2
xfs:
* tweaked for streaming large files to/from disk -- probably best at sequential reads/writes.
reiserfs:
* best performance with many, many files in a single directory.
* Can save space on very small files via tail packing (on by default; turned off with the notail mount option)
jfs:
* really don't know.
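For reference, creating each of these looks something like the following (assuming each project's userspace tools are installed; the device names are only examples):

# ext3: mke2fs with the -j (add a journal) switch
mke2fs -j /dev/hda5
# XFS: mkfs.xfs from the xfsprogs package
mkfs.xfs /dev/hda6
# ReiserFS: mkreiserfs from the reiserfsprogs package
mkreiserfs /dev/hda7
# JFS: mkfs.jfs from IBM's jfsutils package
mkfs.jfs /dev/hda8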
Re:My understanding (Score:4, Interesting)
xfs:
* tweaked for streaming large files to/from disk -- probably best at sequential reads/writes.
Hm...would that imply that XFS would be, say, a really good candidate FS for building video streaming devices? Seems like it might fit well.
Re:My understanding (Score:2)
Well, yes. Which is one of the things SGI designed it for in the first place. Have you only just realized?
Re:My understanding (Score:2)
Re:My understanding (Score:2, Informative)
Re:My understanding (Score:3, Informative)
Even NTFS has inodes; they simply call them "MFT records."
More on inodes (was Re:My understanding) (Score:3, Informative)
You're mixing filesystem features up. To clear things up a bit,
Regards,
A critique of journalling filesystems (Score:2)
I had an operating systems professor that did some filesystem design work (and DBMS design work, which at a low level and especially back in the day, was pretty similar). He was pretty negative on the mass demand for journalling filesystems.
See, what people really want is filesystems that don't get corrupt. It's also kind of nice if the recovery procedure at mount is pretty fast. So they want a filesystem that is always consistent -- it's never in a state where if the power is lost, the computer will try to mount the thing and say "hmm, this isn't a proper filesystem."
So if you want to add a file, you can't just add an entry to the table of files, then create the file metadata, then complete the filesystem operation, because you could lose power and end up with only the entry but no file metadata... so you have a pointer to garbage on the disk.
You need some sort of atomic updating. You want to be able to say "at this point the change I'm making to the FS is not active, at this point it is, and at no point in between is the FS invalid."
Journalling is one method of achieving atomic updates -- always write in the forwards direction on the hard drive, building a journal of all actions as you go, and just use the latest journal entry when you're reading. Journalling tends to have pretty sexy write performance, because it always writes forward and doesn't have to seek at all. It also usually has fairly lousy read performance, since you have to be sure that you're using the most up-to-date journal entries.
To avoid some of the slower read performance, most "journalling" filesystems on Linux only journal metadata -- the lists of files in directories, permissions, times modified and so on. The data is what you're really worried about accessing quickly, and if the data in a file gets corrupted when you lose power, you only lose that file -- not the whole filesystem.
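ext3 actually lets you pick this tradeoff per mount. Roughly (the device and mount point here are just examples):

# journal everything, including file data: slowest, most paranoid
mount -t ext3 -o data=journal /dev/hda2 /mnt/data
# journal metadata only, but flush data blocks to disk before
# committing the metadata that points at them (the default)
mount -t ext3 -o data=ordered /dev/hda2 /mnt/data
# journal metadata only, with no data/metadata ordering: fastest
mount -t ext3 -o data=writeback /dev/hda2 /mnt/data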
It's possible to use other techniques -- I believe that BSD's FFS uses a non-journalling approach to ensuring a consistent filesystem at all points in time. Despite claims both ways, I don't believe that FFS is radically faster or slower than any of Linux's journalling filesystems.
And what's my personal preference? Well, I use ext3, because I already had an ext2 filesystem, and it's awfully easy to upgrade. ext3 used to have pretty bad performance, but now it's generally on par with ReiserFS (which was ahead for a bit), except for Reiser's strongest points (like a single directory with, say, five or six thousand files in it). That being said, I suspect that most people just use ext3 in its "metadata journalling mode", which means that it doesn't have many advantages over reiserfs.
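(For anyone curious, the upgrade really is just a matter of bolting a journal onto the existing ext2 filesystem; the device name is an example:)

# add a journal to an ext2 filesystem, turning it into ext3
tune2fs -j /dev/hda1
# then mount it as ext3 (or update /etc/fstab accordingly)
mount -t ext3 /dev/hda1 /home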
Ext3 builds heavily on ext2, which is a pretty mature filesystem. I've had one roommate that screwed up his reiserfs filesystem a while back. I believe the bug that caused that was fixed, but it made me a bit leery of reiser at the time.
The other misgiving I have about reiser is that I'm uncomfortable with the direction that the developers are going -- very heavyweight filesystem drivers, with plugins and all sorts of stuff. I'm not sure that I want my filesystem drivers to be so complex.
On the other hand, if you have lots of very small files (not empty, just a hundred bytes or so), Reiser does a great job of keeping them from eating up more disk space than they should (normally, a file eats at least one block -- 4K or so -- unless you've changed the block size of your FS).
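If you're staying on ext2/ext3, the closest you can get is shrinking the block size at mkfs time; on ReiserFS, tail packing is the default and is switched off, not on, via a mount option (devices are examples):

# ext2/ext3: use 1K blocks instead of 4K so small files waste less
mke2fs -j -b 1024 /dev/hda3
# ReiserFS: turn tail packing off for speed with
mount -t reiserfs -o notail /dev/hda7 /mnt/small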
XFS, as far as I can tell, wasn't really designed so much to be a general-purpose filesystem as a streaming video filesystem.
And, as I've said earlier, I don't know a thing about JFS.
Other interesting tidbits:
* ext2 is still a pretty well-designed, fast filesystem.
* All of the mentioned Linux filesystems beat the snot out of MS's FAT-16 and FAT-32 in performance and *particularly* fragmentation. The popular act of defragmenting your hard drive on Windows stems solely from the fact that FAT was not well designed for anything but the very smallest of filesystems, like a floppy.
* I've heard stories that NTFS (MS's newer filesystem) is still worse off from a fragmentation point of view than Linux's FSes. That's second hand, so it could be wrong.
* I know for a fact that real-world performance on NTFS (at least in the NT 4 era) is significantly slower than on ext2. I have a strong suspicion that a fair bit of that stems from the ACL security system MS uses in their filesystems. In terms of performance, ACLs are not a good choice.
Re:Comparison? (Score:5, Informative)
My personal experience (Score:2)
Every month or so, I had to sit through the following:
"Warning: drive has been mounted more than 30 times, check forced" on the ext3 partition
I thought the idea for journaling was to AVOID fsck's on boot?
Re:My personal experience (Score:3, Informative)
(you can turn fscks off, change the number of mounts or make it time-dependent, etc.)
Re:My personal experience (Score:3, Informative)
This is a safety feature. Filesystem corruption can be caused by hardware funnies as well as software bugs. Your memory could be flaky, your hard drive could be on its way out, your IDE cable could be too long, your SCSI chain could be improperly terminated, your motherboard might be iffy, your CPU could be running too hot. There might be software bugs in the generic kernel, the block / scsi drivers, the ext3 code, or even some random driver that has nothing to do with filesystems or memory management.
Because of this, ext2 and ext3 have tunable parameters for how often to force an fsck, overriding the fact that the fs is supposed to be in a known clean state. Apparently reiserfs does not have this safety feature - or does it? (I don't know.)
If this annoys you, turn it off. 'man tune2fs', or specifically, 'tune2fs -c 0 -i 0 /dev/hda1' (device is an example) to disable both the mount-count and time-interval checks.
HTH..
Re:My personal experience (Score:3, Funny)
Cool (Score:2, Interesting)
Re:Cool (Score:3, Informative)
Shawn.
Not just journaling (Score:5, Interesting)
Is this correct? Will the VFS also be extended so that you can make use of extended attributes in XFS?
Re:Not just journaling (Score:5, Interesting)
Re:Not just journaling (Score:3, Informative)
Re:Not just journaling (Score:5, Interesting)
Cooler, if I read the tea leaves right. I believe some time ago there was a thread on lkml about whether it'd be possible to have files also be directories (and vice versa). The reasoning behind this was simple: we want flexible filing system attributes, but not at the expense of API bloat. You want ACLs? That'll be another API then. Extended Attributes? Another API. What, you want hierarchical extended attributes too? Well, you've just created another version of the filing system API, haven't you?
The theory goes (and Hans Reiser, top guy, explains it much better than I can) that by altering one of the rules of the filing system, we can get lots more power and expressiveness without having to invent lots of new APIs. Let's say you want to find out the owner of file foo. You can just read /home/user/foo/owner. You can edit ACLs by doing similar operations. Now you can have something more powerful than extended attributes, but you can also manipulate that data using the standard command line tools too! Coupled with a more powerful version of locate, you can have very interesting searching and indexing facilities.
This has implications beyond just string attributes. Now throw in plugins, so for instance the FS layer interprets JPEGs and adds extra attributes. Now you can read the colour depth of an image by doing "cat photo.jpg/colour_depth" or whatever. You can get the raw, uncompressed version of the file by doing "cat photo.jpg/raw > photo.raw". Noticed something yet? You no longer need a new API for reading JPEG data, because you are reusing the filing system API.
But the FS is not a powerful enough concept, I hear you cry! Have no fear, for with new storage mechanisms comes new syntax too, to allow for BeFS-style live queries. If you want more info, you should really read up on this stuff at Reiser's site [namesys.com].
That's why ReiserFS is so good at small files as well as large files. Have you ever wondered why that is? It's not just a quirk of its design, it was very deliberate. One day, Hans wants to see us store as much information as possible in a souped up version of the filing system, so reducing interfaces and increasing interconnectedness. Or something. It sounds cool anyway :) That's one thing that RFS has that the other *FSs don't - the ReiserFS team has vision.
Re:Not just journaling (Score:2, Interesting)
Re:Not just journaling (Score:3, Interesting)
How do I use these named streams for a directory? To re-use your example, can I:
$ cat $HOME/owner
and get my username? Or will it be looking for a file named "owner" in $HOME?
Re:Not just journaling (Score:2)
$ cat $HOME/..owner
Alternatively, standard UNIX attributes may be placed in a subsubdir, so
$ cat $HOME/..metadata/owner
Nobody's entirely sure yet.
It's been done: Plan9 is the name (Score:2)
reiser is just implementing what others have done a long time ago:
http://plan9.bell-labs.com/sys/doc/9.html
Re:It's been done: Plan9 is the name (Score:2)
I think what Reiser is talking about could be truly novel -- I'm sure someone has thought of it before, but I don't know that anyone's made it happen in a real OS. (Though I wouldn't be surprised to see it in an experimental OS)
XFS FAQ (Score:5, Informative)
Still, it's noteworthy that Linus has finally accepted it into his tree...
Excellent (Score:2)
This is great; more filesystem support is always good in my opinion. Now if we could just get some stable NTFS read/write support I would be set.
Re:Excellent (Score:2, Informative)
It's on the way. Read-only NTFS (rather poor in 2.4) has been rewritten and is much improved in 2.5, and a certain subset of read-write (writing new contents to an existing file) is reported to be stable. I haven't tried it. Full read-write may or may not make 2.6.0 but you can be sure it is in active development.
Re:Excellent (Score:2)
That will only happen if Microsoft gets a court mandate to open their specifications. MS has far too much economic benefit in deliberately breaking compatibility to not do so. They've changed the ACL portion of the FS in such a way as to break the Linux NTFS driver in every single NT-line kernel release since Linux came out with an NTFS driver.
Silly question (Score:5, Interesting)
When I install Linux, and it comes to anything to do with filesystems, I just go with whatever default it gives me.
I suspect I'm not exactly alone.
So ... what compelling reason is there for me to use any other filesystem? Being more stable or better with data loss is nice, but considering I've only ever had this problem once, it doesn't mean that I'll leap up and down going "oo oo! got to have blahFS!" any time soon.
To give you an example, the compelling reason to go from FAT16 to FAT32 was that you could have larger partitions. FAT32 to NTFS was because of permissions and security.
But whatever we have now (can't remember, I barely look) to XFS? What *compelling*, absolutely-must-have reason do I have to change from whatever my installer suggests putting on for me?
Or should I just stick with what the installer suggests from now until eternity?
Re:Silly question (Score:2, Interesting)
However, if you have a server that has to have high performance and has data that you *really* care about then one of ReiserFS, XFS, EXT3, etc... becomes a *really* good idea.
Re:Silly question (Score:5, Informative)
XFS is an extent based filesystem which means that you don't end up wasting tons of space having to allocate a 4K block for every small file. And you don't need to jump through tons of indirect blocks to get large files.
XFS allocates inodes on the fly, so it grows with whatever data you put on there. Once again, not wasting space up front. And it sticks the inode near the file itself so the head does not have to move far on the hard drive.
XFS supports extended attributes which can be used for all kinds of extensions later on.
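On Linux the userspace side of this is the attr package that ships alongside the XFS tools, if I remember right. Something like (names invented):

# attach an arbitrary named attribute to a file
attr -s reviewer -V "jsmith" report.txt
# read it back
attr -g reviewer report.txt
# list all extended attributes on the file
attr -l report.txt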
XFS has been around since 1994 and is the most mature of the journalling filesystems.
And there are many other reasons that I cannot think of right now.
Re:Silly question (Score:2, Informative)
It's also a 64-bit filesystem, so you could have extremely large files and filesystems, although my understanding is that the Linux VFS system can't handle the large sizes right now (1Tb max filesystem for instance). XFS is the standard filesystem for SGI's IRIX which doesn't have the restrictions.
Re:Silly question (Score:5, Insightful)
Actually I think ACLs are the reason why everybody is running as Administrator in Windows. They are just too damn complicated.
The Unix-permissions are simple. You can understand the concept of user-group-all in a few minutes and there are only 2 commands to remember (chmod, chown).
Also, Unix permissions have so far covered everything I needed, and in the rare case you really need something special, there is also sudo.
ACLs are only useful for a tiny minority, IMO. I certainly don't need them.
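The whole model really does fit in a couple of lines (names made up):

# give the file to user jill and group finance
chown jill:finance report.txt
# owner read/write, group read-only, everyone else nothing
chmod 640 report.txt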
Re:Silly question (Score:2)
The reason is that they're accustomed to the DOS-based Windows series.
For some people, the concept of a superuser and a normal user seems to be too complicated.
>The Unix-permissions are simple.
Great... Now how does a small group of students get read/write rights on a set of files/directories? (See the sketch after this post.)
>there is also sudo.
It's just that you end up switching into superuser mode for every little thing that falls a bit outside the box.
Since you're complaining about people running Windows as Administrator, you're certainly aware of the lack of style in this.
Not to mention that it is out of the question for any larger system (practically every system that exists outside one's home).
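For what it's worth, the traditional answer to the students question needs root and a dedicated group (all names here are invented):

# as root: create a group and put the students in it
groupadd cs101
gpasswd -a alice cs101
gpasswd -a bob cs101
# hand the project directory to the group
chgrp -R cs101 /home/proj
chmod -R g+rw /home/proj
# setgid bit, so new files land in the group automatically
chmod g+s /home/proj

Which rather proves the poster's point: without root (or ACLs), the users can't arrange this themselves.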
Re:Silly question (Score:2)
One thing they are useful for is if you are replacing NT file servers with Samba servers. If you don't use ACLs (either XFS or the EA/ACL patch [bestbits.at] with ext2/3) then your Windows users who connect to your Samba shares don't get all the fine-grained permission control to which they are accustomed; Samba fakes it. Combine this with winbind and you end up with almost a perfect drop-in replacement for your NT file servers, and you don't have to manage those users separately. Sah-weet.
Re:Silly question (Score:5, Interesting)
standard UNIX permissions and allow you to do the two common cases:
1) Group finance has access, plus user Jill.
2) Group finance has access, but not user Fred.
But then again I wrote the Samba POSIX ACL code, so I'm biased.
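With the POSIX ACL userspace tools (setfacl/getfacl; this assumes your kernel and filesystem have ACL support, e.g. XFS or the bestbits ext2/3 patches), the two cases look something like this (file name invented):

# case 1: group finance has access, plus user jill
setfacl -m g:finance:rw-,u:jill:rw- budget.dat
# case 2: group finance has access, but explicitly not user fred
# (a named-user entry is matched before any group entry)
setfacl -m g:finance:rw-,u:fred:--- budget.dat
# inspect the result
getfacl budget.dat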
Windows ACLs are a complete *nightmare* in comparison. I still don't understand why Sun added an incompatible variant of Windows ACLs to NFSv4 (ie. it's close, but not the same as the real Windows ACLs). The problem is they based the spec. on the Microsoft documentation of how the ACLs work. Big mistake....
Regards,
Jeremy Allison,
Samba Team.
Re:Silly question (Score:3, Informative)
Or maybe you have a proxy: you don't care if your cached data is suddenly lost (it will soon be refilled), it's not important data, and you want performance without too much safety (reiserfs)?
In fact each filesystem has inherent limits on inodes, filenames, permissions, etc... so you go with any that has a minimum for each thing you need. Journalling you don't really need unless you want to be able to step backwards or repair your filesystem in more interesting ways...
Re:Silly question (Score:3, Informative)
2) Journalled file systems mean fast re-boots on power outages
3) Speed. This depends on your usage. A huge mail spool machine may use ReiserFS on the mail spool. For most people it is a wash.
4) Ext3 can be remounted as ext2 (see the example below), and really good file system checking tools exist for ext2/3.
Mostly, though, you CAN just stick with whatever the default suggests.
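That fourth point in practice: because the on-disk layout is the same, an ext3 volume whose journal is clean (i.e. it was unmounted properly, or the journal has already been replayed) can simply be mounted as ext2:

# example device; handy from a rescue disk whose kernel lacks ext3
mount -t ext2 /dev/hda2 /mnt/rescue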
Faster reboots (Score:2)
They mean faster reboots period, because they never need to be checked on boot - so you don't get that annoying "Ahem, you've rebooted too many times, I'm going to check your hard drive while your client, who's looking over your shoulder, wonders why you reassured him you'd only have his production server down for half a minute to install the new kernel, and I'm spending 5 minutes scanning his drives."
Of course you can turn off those checks on ext2 too, but that would be stupid.
Re:Faster reboots (Score:2)
Journalling does protect against software-caused inconsistencies. It does not protect against hardware probs. Periodically, it is a VERY good idea to unmount and fsck while checking for bad blocks.
Re:Faster reboots (Score:2)
This is what scheduled downtime is for. I understand that it's a so-called "helpful measure" to automate the process, but at times, it's downright annoying. If the admin isn't bright enough to even schedule maintenance periods, then he ought to be told to clean out his desk.
Again, I completely agree with you, but I think that any system that runs periodic maintenance for the admin is really just making things a little *too* convenient.
Re:Silly question (Score:2)
If you have to ask... (Score:2)
Linux is used in an incredible variety of environments, from embedded systems without disks to seriously large servers and parallel supercomputers. As you might imagine, the default filesystem isn't always ideal. But, if you're just running an ordinary single-user workstation, and aren't experiencing any noticeable performance problems related to your disk access, then there's no reason to worry about your filesystem.
So "stick with what the installer suggests from now until..." you run into a reason to do otherwise, makes sense.
Re:Silly question (Score:2)
There may be no compelling reason for you to change from the default (which, presumably, was chosen as the default because it satisfies most people). But for someone looking to optimize for a particular application, it's one more variable they can tweak (different filesystems each having their own strengths and weaknesses).
For example, someone doing desktop video editing (really big contiguous files, high sustained data rate needed, etc) might want a different filesystem than someone running a highly active database server (lots of small table changes scattered across the filesystem).
Re:Silly question (Score:2)
There's no reason to switch from (ext3?) to XFS. But it's quite possible that the next time you install, if you're formatting a new drive, it will suggest XFS. Of course, converting an existing disk is enough of a pain that you probably don't want to do it.
Tinker Power! (Score:2)
Anyway, the big success story for Linux is servers -- and journalling file systems make a lot of sense for servers, because they're more bulletproof. I once worked in a place with a lot of Solaris servers using a non-journaling FS. Now we had fancy UPSs so the servers could go down gracefully. But they were no help when an overloaded power main caught fire (middle of summer), sending out a gigantic surge that took out the UPSs before the power went away. It was days before all the file system repair and restore was complete.
About a year later, I was working at a place with a lot of IRIX servers. Had a power failure there too. No surge this time -- but no UPSs either. So how long before the servers were back up? About ten minutes after the power came back. XFS, like other journalling file systems, doesn't get all inconsistent when it's interrupted.
You missed out the flamewar on the mailing list! (Score:2, Informative)
Mailing list archive [iu.edu]
Just search in the page for XFS and you'll find the thread.
Questions... (Score:3, Interesting)
When is Linux 2.6 likely to be released? I know that there is no fixed date, but what are the criteria?
My second question... Does it really matter when the 'official' release comes out, when distribution makers "roll-their-own" anyway?
Sorry if these sound like dumb questions to some of you, but I'd be interested to find out.
Re:Questions... (Score:3, Funny)
Re:Questions... (Score:2)
I suspect that a number of distributions will include 2.6 pretty quickly this time, because it'll be handled by someone who is good at stability. Also, the distribution makers are actually pretty close to the 'official' process, and they're really in the best position to judge stability on a wide variety of systems. By the time 2.6 is declared stable, most of the distribution makers will be comfortable with it, both in the official version and with their patches.
Re:Questions... (Score:2)
There is a list around of the desired features for 2.6 that was put together at the Linux Kernel Summit. A very hasty web search turns up this list [lwn.net], which doesn't seem to mention things already merged like the block-IO stuff.
Re:Questions... (Score:2)
"When it's ready."
Seriously, they'll release when the new features and changes they've made are stable and tested enough... and the release of a v2.6 is important, as it means it will be more widely used, more bugs found, etc. Most distribution makers wouldn't ship a newly 'stabilised' kernel, e.g. 2.6.0, but would wait until it had matured a little...
Re:Questions... (Score:3, Informative)
+1, Funny. I think you mean after the code freeze, which usually happens a month later, well, two, three, ok, six months later. You also forgot to mention that Linus usually has multiple freezes, and the one on 31 Oct is only the first. With each successive freeze he puts on a more threatening tone, crying woe unto them who would dare tempt him to thaw the kernel again. Eventually the first code freeze happens, then maybe one or two more of those....
Even odds we get a 2.6.0 by June.
Yes! (Score:3, Informative)
Despite being a little more resource-intensive than ext3, XFS has to be one of the better file systems available. I've used it (obviously) on SGIs and it's been outstanding, and I opted to use it before ext3, JFS and ReiserFS (although I believe ReiserFS is just as nifty).
Having it accepted into the kernel makes upgrades a world easier, and hopefully I'll be able to move away from SGI's modified Red Hat installation. Although, I doubt Red Hat will support it out of the box.
The other issue that needs fixing with XFS is the lack of an emergency boot disk. XFS-enabled kernels are huge, and that creates a slight problem when booting from floppy.
Rescue CD (Score:2)
CD burners are quite widespread, a quick rescue image could be quite small.
And yes I know not everyone has a burner, I don't either.
Boot partition (Score:2, Insightful)
I think the trick to this is to have a /boot partition and a root partition, and make them both ext2. Then you can boot from a floppy and load the larger kernel image from the /boot partition. That was the reason given for having those partitions in the Linux Standard Base documents, anyway.
But I'm an engineer, not an IT person, so I could be mistaken as I've never attempted to do it myself.
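A sketch of what that layout might look like in /etc/fstab (partitions invented; note the 0 in the last column, since XFS does its own recovery instead of relying on a boot-time fsck):

/dev/hda1   /boot   ext2   defaults   1 2
/dev/hda2   /       ext2   defaults   1 1
/dev/hda3   /home   xfs    defaults   0 0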
here's an interesting read (Score:4, Informative)
That's what menuconfig's for... (Score:2, Funny)
Ya. And people want to have every ethernet card in one box just 'cuz, so there are a bunch of different drivers for ethernet interfaces.
My experience with XFS (Score:5, Interesting)
- It's extremely reliable. Filesystems never got corrupted, even after a lot of ugly reboots.
- Recoveries after a crash are really fast - almost immediate, better than ext3 and reiserfs.
- Every needed tool is available to resize filesystems, check filesystems, analyze filesystems and backup/restore filesystems.
- _BUT_ there's something strange. Basically during disk I/O, the whole system is unresponsive. While I'm compiling something, KDE becomes slow, playing videos is not smooth at all, etc. Just as if it didn't scale at all for concurrent disk access. So I finally switched back to ReiserFS just because of this. Maybe the 2.5.x series of kernel behaves differently.
Re:My experience with XFS (Score:3, Informative)
Just wondering, are you using the custom kernel from Gentoo? If so, have you compiled your kernel with either/both of the low latency patch and/or the preemptible kernel patch? What are your experiences with either of those two options when running XFS? I'd expect the use of either of those two to improve a system's responsiveness to user interaction when doing a lot of disk I/O, but if those don't help when using XFS, I wonder what kind of black magic is going on inside that code.
Re:My experience with XFS (Score:2)
Your observations were predictable. XFS was originally designed for real-time (high speed) data streaming, namely capturing and processing video (which requires A LOT of disk space). That bias in design does not lend itself to concurrent disk access performance. Interestingly, your move back to reiserfs plays well to reiserfs's strengths. I use XFS, and can't say I've experienced your problems, but I haven't tried compiling and watching video at the same time.
Having said that, I can't say whether your experiences are specifically due to XFS's design, or to other factors, such as XFS's implementation under Linux, or your tasks requiring a lot of RAM or CPU (which applies to compiling, playing videos, and XFS). Your problems with XFS could be resolved with a faster CPU, a second CPU, or a lot more RAM.
Re:My experience with XFS (Score:5, Informative)
Hmmm.. I'd assume that ext3 wouldn't be as good.. a fix on a fix usually sucks. And then I've heard about Reiser's file truncation problems. I use Reiser and have had no big problems.
---"- _BUT_ there's something strange. Basically during disk I/O, the whole system is unresponsive. While I'm compiling something, KDE becomes slow, playing videos is not smooth at all, etc. Just as if it didn't scale at all for concurrent disk access. So I finally switched back to ReiserFS just because of this. Maybe the 2.5.x series of kernel behaves differently.
I've had the same problems on 2.2.x when I hadn't tweaked my HDs to DMA66 / 32-bit I/O. Try doing (substitute your drive for /dev/hda):

hdparm /dev/hda      # show current settings
hdparm -tT /dev/hda  # benchmark

If you don't like those settings, drop into single user mode, with / read-only, and do this command:

hdparm -X66 -d1 -u1 -m16 -c3 /dev/hda

Now manually do an fsck on that partition. If you have errors, it's a bad mode. But if it works, then redo the -tT option (it's a benchmark).
Be aware that 2.4 does most of this for you, but sometimes sets things too conservatively (so your performance sucks). Then again, you could have an unsupported IDE device.
All the best..
Re:My experience with XFS (Score:2)
Soft Updates in Linux? (Score:2, Interesting)
using journals and using soft updates and have decided that soft updates is the cleaner approach. Can anyone explain to me why the Linux community is so enthralled with the concept of journaling file systems while the BSD community has quietly but unanimously embraced soft updates?
But where is e2compr (Score:3, Insightful)
Being a Linux developer for embedded production boxes, and given the current increasing interest in embedded Linux, along with the fact that embedded boxes typically run _WITHOUT_ hard disks (mostly just flash chips of some sort, due to their better lifetime), I cannot help wondering why the kernel mailing list shows little or no interest in ext2 (or ext3) compression.
JFFS and JFFS2 don't come into question in most cases as they tear through the fs layers and cannot be used with IDE flash chips for example.
Alcatel even released it two weeks ago for 2.4.17... loads of people, like me, must have ported it to 2.4.19 by now. But getting ext2 compression into 2.5.XX? Forget it... but why?
This is a little like the lack of interest in underclocking, even though, once you've overclocked your main computer to the max, you'll start looking for a more silent option - if not for the desktop computer, then for the closet firewall. Even if you don't have the interest now, you will, once you shack up with a gal.
Re:But where is e2compr (Score:2)
You may be asking a bit much of the typical Slashdot reader...
An interesting thing about XFS... (Score:3, Informative)
I find that very cool, for some reason. I guess one practical application is if you have a box that is the only one of that type (either big-endian or little-endian) that dies and you need to recover the data.
Re:An interesting thing about XFS... (Score:2)
BeFS (Score:2)
Re:BeFS (Score:2)
-adnans
Why is kernel-image so big? (Score:3, Interesting)
Several places it is mentioned, though, that the kernel image of XFS is very large, so much that you can't really fit it onto a floppy (although people over-format their floppies to get 1.8 MB or so onto them, and then the kernel might just barely fit.)
I can't understand why any filesystem should be so big -- it seems that the code to run the filesystem is almost as big as the rest of Linux put together. How can this be? Is it really all code? What could that code possibly be doing?
I studied XFS fairly extensively after I had to repair a disk that had one of its 23 heads fail. From the remaining 22/23rds of the disk I managed to recover almost every file and directory, by writing my own XFS filesystem interpretation code. The on-disk organization of the filesystem is fairly simple and straightforward; I can't imagine where the hundreds of K of code is going.
I won't be shocked if the answer does lie in that kjournald daemon -- that XFS is bigger than ext3 because ext3 puts most of the bloat into a user-mode daemon instead of the kernel.
thad
Bummer! (Score:2)
I survived powercuts and brownouts just fine when everything was on ReiserFS...
Related question (Score:3, Interesting)
Re:Related question (Score:3, Informative)
It's more complex under Linux. Here's the Linux-specific answer to this question from the FAQ:
Transactions? (Score:2, Interesting)
This way it would allow cool stuff like guaranteed data consistency or rollback.
Imagine:
$ begin_trans
$ rm -rf /
$ rollback_trans
Re:Wow (Score:2)
Re:Wow (Score:2, Informative)
Re:Wow (Score:2, Informative)
Re:Wow (Score:2)
Re:That's all, just XFS (Score:2)
(Actually, I'm assuming that was a joke. I do like Linux's HFS support, since I run a mixed platform household.)
2.6 kernel goodies (Score:4, Interesting)
* ALSA support. ALSA is a pain to keep patching into your kernel with every new release. ALSA is a Good Thing, if a pain in the butt to configure. My guess is that there will be decent front ends on top of the thing when distros start shipping 2.6.
* Batch priority/boosted effect of nice levels. I've always felt that "nicing" something didn't have enough effect -- nicing something by one level is almost unnoticeable. 2.6 strengthens the effect. It also introduces batch priority, where a process gets *no* CPU time if there is *any* non-batch process in the runnable queue. Very sexy.
* Low, low latency. Just as 2.4 emphasized good multiproc support, 2.6 is emphasizing low latency. Preemptive kernel, lots of disabled-interrupt time being reduced (especially the godawful framebuffer console), etc, etc. This is top-notch for both I/O performance and multimedia. Linux kernel 2.6 is supposed to beat any current release of Windows in audio latency when released.
The only thing that I really wish Linux had is a prioritized disk scheduler. Linux can prioritize network traffic. It can prioritize processes. It just can't do the same with disk I/O. This is a shame, since I want my MP3 player not to skip when reading MP3s/paging, followed by X getting the next highest priority when paging (so that the UI doesn't freeze up for long when paging something back in), and Linux just doesn't yet have the functionality. Currently, you can have a nice +20 process that's busy untarring a large tarball... and all your paged-out processes will be blocked, waiting for this stupid tarball to finish.
Re:2.6 kernel goodies (Score:3, Informative)
the skipping in your mp3 player has nothing to do with disk i/o. it has to do with scheduling latency. that is, unless your mp3 player has been poorly designed, which many of them have been.
also, 2.5/2.6 is still missing the better patches for low latency (from andrew morton), and so its performance is still not as good as it could be.
2.6 doesn't beat windows at audio latency when using WDM drivers for windows. it (along with 2.2 and 2.4) beat windows with MME drivers. the WDM audio driver model is very fast, and windows has always done a better job of handling scheduling latency than linux (other than with andrew's patches). in 2.4 there are still places in a mainstream kernel that will stall the entire box for up to 1/10 second.
Re:2.6 kernel goodies (Score:3, Interesting)
Not true. I've done quite a bit of poking around this issue. I have plenty of spare CPU time, and I'm not using a sound server or similar. The problem comes when reading an MP3 from disk (and no, this is not a "DMA/unmasked interrupts not on" issue) and other *heavy* sequential disk I/O is being done by another piece of software (because of the amount of data, tar xzvf is frequently the culprit). Linux heavily weights disk scheduling towards overall performance, not fairness. Besides, this isn't MP3-specific -- other software does it too. Try cat
I remember seeing benchmarks of various Windows audio latencies and Linux latencies, and at least the low-latency people had Linux at least a couple of ms below Windows. I wasn't aware that only some of these patches were going in, though, so that could be the difference between what we're talking about.
Re:2.6 kernel goodies (Score:3, Interesting)
The skipping is caused by scheduling latency, as Paul suggests. I have written an MP3 player for Linux (see URL) and it only really skips when the audio output thread is not scheduled in time to satisfy the soundcard's needs. I.e., the Linux scheduler needs to make sure that whenever the audio thread wants to fill the soundcard buffers, it gets the highest priority to do so. For example, if you are using a soundcard buffer that is split into 2 fragments of 1024 bytes each, the audio thread needs to be scheduled every 6ms (3ms for 512-byte fragments, at 44kHz stereo, 16-bit output). Even when your soundcard buffer is 50 or 100ms deep you can very easily cause skipping if your audio thread is not scheduled for 100ms or longer. And this is pretty normal on a vanilla kernel for non-realtime scheduled processes. Think about it, your "cat >
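To spell out the arithmetic (44kHz stereo at 16 bits):

44100 frames/s x 2 channels x 2 bytes = 176400 bytes/s
1024 bytes / 176400 bytes/s = ~5.8ms per fragment
512 bytes / 176400 bytes/s = ~2.9ms per fragment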
In short, the soundcard will be starved of ready to play PCM data long before the decoder will be starved of MP3 encoded data (from disk). In the end it doesn't really matter because your music still skips, but it is important to identify exactly why it's skipping.
-adnans
Re:2.6 kernel goodies (Score:2, Informative)
From the ALSA [alsa-project.org] site:
"2002-02-13 ALSA has been integrated to the official Linux 2.5 tree! The initial merge is in patch-2.5.5-pre1."
Yippiee! Great sound, here we come!
Re:2.6 kernel goodies (Score:2)
Re:2.6 kernel goodies (Score:2)
I don't run stuff like SETI@Home, but lots of people do. Processing blocks probably shouldn't have much priority when you're doing stuff on the desktop.
Re:Already supported by major distros (Score:2)
No, Gentoo used to support this in the kernel... until 2.4.19-r7. It's been pulled out as of this update. When I asked about this on alt.os.linux.gentoo, I was pointed to this thread on the gentoo-user mailing list [gentoo.org]. In a nutshell, concerns about data loss when a system power-fails, and bad interactions with preempt (which is also included in the Gentoo kernel), are the primary reasons. Luckily, the emerge --update did not toast my old kernel dist folder so I still have it... but you may want to wait.
This pertains to the current stable (1.2?) release.
Perhaps this move in the dev kernel will prompt someone in the Gentoo dist-building realm to re-add this to the kernel.
(i'd post a url to my questions in the ng but i x-no-archived the thread.. silly me)
(and this post is non-pro non-con xfs.. though I've worked with SGI systems, own an Indigo2 and think the fs is pretty solid).
Red Hat DOES NOT have XFS... (Score:4, Informative)
>custom Red Hat installer for XFS.
There is some XFS-aware code in the Red Hat Linux installer, but there is no kernel support or userspace tools available, so what you propose simply can't work.
However, SuSE, Mandrake, Gentoo, Slackware, and Debian (to some extent) do have XFS support.
Re:Ummm... duh? (Score:2)
You are arguing for a method. From my experience, the absolute best, without-question method of determining file type is "magic bytes". This may seem strange, but magic bytes have the amazing capability of being preserved by virtually every interface anybody uses to copy files. That makes them infinitely better than any attribute. And don't give me any crap about magic bytes colliding or not identifying text files, I challenge you to find *any* file format that a real user would "double click to open" that cannot be identified by magic bytes.
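This is exactly what the standard file(1) utility does, by the way (the exact output depends on your magic database):

$ file photo.jpg
photo.jpg: JPEG image data, JFIF standard 1.01
$ mv photo.jpg mystery.dat
$ file mystery.dat
mystery.dat: JPEG image data, JFIF standard 1.01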
The problem with magic bytes is filesystems that return the filename vastly faster than they return the data from the file. This makes it inefficient to look it up. Though probably no big deal for double-clicking, people seem to be addicted to the "icon" in the file viewer, though I have never seen anybody rely on this (try rearranging the small icons randomly on Windows and see who notices) and that requires opening every file. However "thumbnail views" seem to be catching on and these require opening the file anyway. I believe ReiserFS is addressing this and making it fast to open files. There is also a problem if the file viewer does not have read access to the files, though it probably could not read the attributes either in that case.
If you refuse to use magic bytes you must use attributes. Unfortunately you are seriously constrained by requiring an attribute that will be copied by most file-copying mechanisms and will be written by existing programs. It also helps if it is obvious to the user how to fix the attribute. Unfortunately I think the only workable solution is what Windows did, which was to use the file extension. My only experience with attributes is Mac OS 9 and OS X, and I have always been annoyed with them, as a .jpg file will often launch an unexpected program (such as a Classic app).
If a file system supports attributes, I strongly recommend that they be used as a "cache". This should be the obvious way to use them for "thumbnail views", and there is no reason not to do the same scheme for "types": if the type attribute is missing, you then spend the time to use magic bytes to determine the type. File copying programs can strip all the attributes and the effect will be invisible to the user, except for a slight slowdown the first time the file is used.
I also want to say that the Mac HFS system that preserves files is equivalent to the older Unix "hard" links that existed in the very first versions in 1970. I think these are rather an embarrassing part of Unix history and can actually be proven to be a mistake, as they led to all kinds of horrible file system maintenance problems. They tried to fix it by disallowing hard links to directories, which coincidentally eliminated about 95% of the usefulness of them. "Soft" links, like ln -s makes, are much more useful and predictable, and the user and programs can control them in obvious ways.
duh! just like any other journalled file system (Score:2)
Yes, just like NTFS, ReiserFS, JFS, and ext3 (by default). That's kind of the whole point of journaled file systems: in the event of a crash, power failure, etc. you are guaranteed to get a consistent file system, though some data may be lost. Basically, you may lose a few files but you never lose the whole file system. ext3 is the only file system I know of that gives you an _option_ to journal data as well, but that makes it really slow.
Re:What ever happened to Tux2 (Score:2)
patent held by Network Appliance that seems to cover Tux2 and so has discontinued work on it. Never mind he has prior art, they have more money and lawyers. Welcome to software development in the USA in the 21st Century....
Jeremy Allison,
Samba Team.
Re:Journalling filesytems... (Score:5, Informative)
Here's the basic theory. Think about what happens when you make a change on a filesystem - say you add a file to a directory. The system has to: allocate and fill in an inode for the new file, add an entry to the directory pointing at that inode, and update the free space maps to account for the blocks the new file uses.
If you don't do these things in the correct order, there will be times when the on-disk structure is not consistent. For example, you may have modified the directory to include an entry for the new file, but the entry points at an inode which hasn't been filled in yet. Or the inode may be filled in, but the free space pool hasn't been updated to correspond with the data block allocations in the inode. Throw in other modifications like deleting files or making them larger or smaller, and it gets pretty complicated. If the machine happens to crash at such a time - or the power goes out and you don't have a UPS - the disk will be in an inconsistent state. This has two major consequences: the system has to run a long, slow consistency check (fsck) over the entire disk at the next boot, and if the damage is bad enough, you can lose files - or the whole filesystem.
Journalling prevents both problems (barring bugs in your OS or hardware, of course) by writing transactions to your filesystem. Instead of making changes directly to your directories, inodes, free block maps, etc, the filesystem batches up such changes by spooling them to a separate area on disk, the journal. Then, when it has written enough such changes to account for an entire, self-consistent transaction, it puts a marker in the journal indicating "transaction complete" and starts copying these changes to their usual locations on disk. Meanwhile, the next transaction can be spooled onto the end of the journal area, and it will get its own "transaction complete" marker when it is done. A journal can hold a lot of transactions - only limited by the journal size, which is usually configurable. When a transaction has been fully copied out of the journal to its final locations, it is re-labeled "journal free space" in the journal.
How does this help? Imagine that the machine goes down while a transaction is still incomplete in the journal. Next time you boot, the OS "replays" the journal: it looks for all the completed transactions and commits each part of a transaction to its correct permanent location. It ignores journal free space, and any incomplete transactions - essentially rewinding the filesystem state to the end of the last completed transaction. There is never any danger of "partially updated" filesystem state, since each transaction starts and ends with a known-consistent state.
(Ah, but what happens if the OS goes down again while replaying a journal? No big deal: next time it boots, it just replays the same journal again, which produces the same result as it would have done the first time.)
Some simplifications, obviously, but that's the basic idea. Did it help?
The different levels of journalling have to do with whether all filesystem data is journalled or only some of it. You usually only journal metadata, which is the filesystem structure: directories, inodes, free block maps, etc. That's because copying all your file contents twice (first into the journal, then into its permanent location in the filesystem) is quite slow. The main purpose of a journal is not to guarantee pristine file contents in the event of partially written files, but to ensure a consistent view of the filesystem as a whole - so you can avoid that long fsck and avoid ever ending up with a partially or fully scrambled filesystem (modulo hardware failure, of course).
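(Incidentally, the journal size mentioned above is set when the journal is created; with the ext2/ext3 tools, if memory serves, it looks something like this, device invented:)

# create an ext3 filesystem with a 32MB journal
mke2fs -j -J size=32 /dev/hdb1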
HTH..