The Linux Filesystem Challenge 654
Joe Barr writes "Mark Stone has thrown down the gauntlet for Linux filesystem developers in his thoughtful essay on Linux.com. The basic premise is that Linux must find a next-generation filesystem to keep pace with Microsoft and Apple, both of whom are promising new filesystems in a year or two. Never mind that Microsoft has been promising its "innovative" native database/filesystem (copying an idea from IBM's hugely successful OS/400) for more than ten years now. Anybody remember Cairo?"
Don't try to keep up with Microsoft and Apple (Score:5, Insightful)
easy answer (Score:5, Insightful)
We live in a network-based universe. Local filesystems are already good, whether it's just continued development in Reiser or whatever else.
NFS4, though: it's like AFS, only without the sucky stuff. AIX is now including NFS4 in its AIX 5.3 release, even! With the Big Dog on board, we should realize there's wisdom in that direction ;)
Re:Don't try to keep up with Microsoft and Apple (Score:4, Insightful)
In this case, they're one and the same.
Why not use... (Score:2, Insightful)
Another solution in search of problem (Score:2, Insightful)
Nobody has come up with a compelling reason or feature to make me want to change filesystems.
Filesystems are tools (Score:5, Insightful)
So to develop a one handy "swiss army knife" of filesystems may not be the best route. For the most part one knows what a system will be doing and can build in the most appropriate filesystem for the job.
Re:bah! (Score:2, Insightful)
Re:Don't try to keep up with Microsoft and Apple (Score:1, Insightful)
Wow, hey - check it out everyone! Somebody who "gets it" instead of just uses FOSS stuff because they want to pretend they're cool.
THIS guy's attitude is what the FOSS community MUST begin to cultivate and it MUST find a way to push the din from all the screaming Microsoft haters down to an inaudible level (the cluetrain just dropped off a package: nobody cares if Microsoft has been promising something without delivering for 10 years. If they beat Linux to it, that's all that matters). The FOSS community disgusts me, and it's lack of focus that makes that so. The parent poster understands that the point of any software development should be to fill a need that's still empty, or to improve upon a tool that's already filling a need.
When more people get on board with pushing Linux to just be a good system, more people will use it. Nobody is going to switch to Linux just because YOU hate Microsoft. They WILL switch to Linux, however, when it offers them a good reason to do so.
Re:Don't try to keep up with Microsoft and Apple (Score:4, Insightful)
MS has basically announced/demonstrated most of the new features that are in Longhorn. Effectively that has given the Linux community two years to come up with competing features. Adding database features to a filesystem makes sense: BeOS has demonstrated that you can do some nifty stuff with it, and both Apple and MS have announced plans to do this.
The Linux community, however, is divided. You can install reiserfs and maybe develop some tools that use some of its more advanced features, but that doesn't fundamentally change anything if OpenOffice, KDE, Gnome and other programs don't coevolve to use the new features.
The same goes for stuff like Avalon. While everybody is still talking about how such technology might be used in OSS projects like Mozilla and Gnome, MS is well on its way to implementing something that may actually work.
Filesystems with rich metadata were already a good idea ten years ago. The OSS community has talked about them where others have implemented them. Two years of more talking would be fairly consistent. IMHO the OSS community is underperforming in picking up new technology and good ideas.
Re:Filesystems are tools (Score:5, Insightful)
OpenVMS (Score:1, Insightful)
Re:Don't try to keep up with Microsoft and Apple (Score:2, Insightful)
I.e. by just following MS in many ways you are already following what people want and need.
Re:Another solution in search of problem (Score:5, Insightful)
More than three times a week, and that's criminal.
I mean, throwing things about in your home or My Documents directory is fairly standard. How often do you put your (picture) files in a \qw3r3et354t\bchnjc8g45\3j4n45g9u98d directory?
While everyone seems to see WinFS (and associated services) as some sort of search panacea, your ability to retrieve those files is linked to 1.) the file's metadata and 2.) your ability to recall a search term that appears in the metadata. If you search for "bird" and the metadata specifies "hawk", short of a dictionary search, you still cannot find it. It doesn't matter if the uber search capabilities can span the entire hard drive in 5 secs, and run through multi-dimensional data. You still need a search term, and that search term (in whole or in part) must appear somewhere in the file, be it the filename or metadata.
Essentially, WinFS makes data appear more ordered (assuming you take the time to fill out the fields). Otherwise, it's useless.
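To make the "bird" versus "hawk" point concrete, here is a minimal Python sketch. The file names, tags, and the tiny taxonomy are all made up for illustration; nothing here reflects how WinFS actually stores or searches metadata. A plain term match only finds what was typed into the fields, and some extra dictionary-like mapping is needed before "bird" can find a file tagged "hawk".

```python
# Hypothetical metadata records; the file names and tags are made up for illustration.
FILES = {
    "img_0412.jpg": {"subject": "hawk", "place": "ridge trail"},
    "img_0413.jpg": {"subject": "sparrow", "place": "back yard"},
    "notes.txt": {"subject": "bird feeder plans"},
}

# A tiny hand-written taxonomy, standing in for the "dictionary search" above.
BROADER = {"hawk": "bird", "sparrow": "bird"}


def search(term, expand=False):
    """Return the names of files whose name or metadata contains `term`."""
    hits = []
    for name, meta in FILES.items():
        words = [name] + list(meta.values())
        if expand:
            # Also match on the broader category of each metadata value, if known.
            words += [BROADER[v] for v in meta.values() if v in BROADER]
        if any(term in w for w in words):
            hits.append(name)
    return sorted(hits)
```

Calling search("bird") only returns the file whose metadata literally contains the word; search("bird", expand=True) also returns the two photos, but only because someone supplied the taxonomy.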
Re:Don't try to keep up with Microsoft and Apple (Score:4, Insightful)
How's about this for a better idea: instead of trying to keep up with Microsoft, try to keep up with sound software engineering principles in designing our file systems?
There may even come a time when the required action to implement this idea is to do nothing.
KFG
What about granular permissions as in NTFS? (Score:2, Insightful)
Hey, let's admit that Microsoft did a good thing with NTFS. Before I get roasted, let me say I've been working with FAT, NTFS, EXT2/3, Reiser, and others over the last 12 years, and I've had a chance to get a view of reliability, ease of recovery, etc., with several of these in production environments. I think the NTFS permissions model is one that the Linux world would welcome over the old and, I think, inadequate U-G-O scheme we continue to tolerate.
not so fast ... (Score:3, Insightful)
Filesystems are so crucial to OS stability, that I'd say it's worth formally-verifying them to a certain extent (i.e. prove that the algorithms/code work, instead of just observing that they work in normal conditions).
P.S. The whole thing - filesystem as a DB - is complete crap. You can't do a bunch of fs operations in a single transaction and have ACID semantics on the transaction as a whole. Sure - searching is great. But database means much more than just a searching interface.
Re:Encrypted filesystems? (Score:3, Insightful)
Everyone seems to like this method. But do you also encrypt your swap partition? If not, then whenever the system swaps, unencrypted data gets stored somewhere on the swap partition.
Here's something that might terrify you: run grep on your swap partition and give it a few characters from your password. You needn't list the entire password. Scary, eh? (This won't work for everyone, but it might for you.)
Remember, if you're using loopback with crypto, you're just pissing in the wind unless you encrypt your swap, too.
Palm (Score:3, Insightful)
Re:Another solution in search of problem (Score:3, Insightful)
Nobody has come up with a compelling reason or feature to make me want to change filesystems.
I disagree. Why can I go to google and search the entire web for something and get an answer in less than 1 sec, and I can't do that on my computer or lan?
Re:Another solution in search of problem (Score:3, Insightful)
Filenames are metadata and are just as much under user control as database metadata, no more, no less.
KFG
Trust the Kernel team (Score:3, Insightful)
Always the stable version. I didn't have problems until I tried reiserfs; I switched back to ext2 (ext3, actually) and didn't have problems again.
The source of my problem appears to be reiserfs, directly or indirectly. I don't know which and, like most users, I don't really care either.
Doctor it hurts when I do this.
Then stop doing that.
Good enough for me.
The assumption is wrong (Score:3, Insightful)
This whole article is based on nonsense. Microsoft has a long way to go before it catches up with Linux in the filesystem area. There is no realistic prospect of Microsoft keeping pace with Linux filesystems in the foreseeable future.
(Before dismissing me as a Linux fanboy, note that the above applies only to filesystems. When it comes to understanding of GUI issues, I'd make a similar statement but with Linux and Microsoft swapped. But that would be off-topic.)
I'm not the market they are looking for (Score:3, Insightful)
Frankly, I don't care if the casual consumer uses Linux or not (though a larger market share would have some benefits). The people who develop for Linux generally want and need the same things as myself, and I'm happy.
That being said, faster file searching is definitely a useful tool. But if the registry in Windows is any indication of what the file system is going to look like in order for anything to get found, I don't want it.
On a random side note, App Rocket [candylabs.com] is a nifty program launcher for windows that finds files very fast.
Re:Another solution in search of problem (Score:3, Insightful)
Side Note: Cool possibilities like this show how proprietary formats can really suck and lock users out from searching/indexing their own content. Even offerings from MS in this category will suffer from the inability to search/index proprietary formats. I doubt MS will get specs for all possible file formats out in the wild, thus leaving much of an end user's content hidden from searching/indexing.
Re:Next generation? (Score:3, Insightful)
Is the Linux/Unix community so "steeped in tradition" (also known as stubbornness, obstinacy, intolerance, and narrow-mindedness) that it willfully clings to an outdated, inferior way of doing things?
Re:Indestructable is the killer app (Score:1, Insightful)
Not gonna play with alpha code (Score:4, Insightful)
Sorry, I'm not about to trust archived video to alpha code, or even beta code. If there's no release-worthy option on Linux, we have to stick to NTFS on Windows.
EMBED VERSIONING! (Score:3, Insightful)
Embed versioning into the filesystem. I believe Reiser has talked about this. Imagine being able to right-click on a file, folder or even partition and choose "roll back" or "restore" from the context menu. It then presents you with a list of snap-shot points you can restore to, starting with "last change".
Who backs up their hard drives any more? Have you thought of the problems and time involved in backing up 40, 80 or even 200 Gb of data? I'd MUCH MUCH rather have this feature than some enhanced search.
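As a rough illustration of the "roll back" idea, here is a toy in-memory Python sketch. The class and method names are hypothetical, and this is not how Reiser or any real filesystem implements versioning; it only shows the behaviour being asked for: every write keeps the old contents, and a restore picks from a list of snapshots starting with the last change.

```python
class VersionedFile:
    """Toy in-memory model of a file that keeps every saved version."""

    def __init__(self, data=b""):
        self._versions = [data]  # index 0 is the oldest snapshot

    def write(self, data):
        # Each write appends a new snapshot instead of overwriting the old one.
        self._versions.append(data)

    def read(self):
        return self._versions[-1]

    def snapshots(self):
        # Listed newest first, so entry 0 is the current contents.
        return list(reversed(self._versions))

    def roll_back(self, steps=1):
        # Restoring is itself recorded as a new version, so it can be undone.
        if not 0 < steps < len(self._versions):
            raise ValueError("no such snapshot")
        self.write(self._versions[-1 - steps])
        return self.read()
```

A real implementation would have to store changes compactly on disk and decide when old snapshots may be discarded, which is where the hard part lies.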
I'm working on this problem today (Score:3, Insightful)
- Integration with a Kerberos SSO strategy
- Fast performance
- Cross-platform compatibility with Windows
- Robust Access Control mechanisms, RBAC would be nice but DACL is probably reality.
In my opinion, these are the primary goals that companies are looking for. Not a "journaling" file system, or built-in encryption. Sure those are nice, but let's get the basics first. Unfortunately, CIFS is still in quite a state of beta (even on the 2.6 Kernel) and there don't seem to be any real other alternatives.
Re:not so fast ... (Score:5, Insightful)
Laughable - at best. Likely just worthy of a groan (Score:3, Insightful)
Granted, the proposed featureset of WinFS is vastly 'superior' to that of the 3 main Linux contenders, but it could be argued that WinFS is neither a filesystem itself, nor is it on par with any of the Linux filesystems in terms of performance or stability (if NTFS5 is any indication).
I seem to recall reading about several projects that implement WinFS-like features. I don't recall what they were, and I don't think they were kernel-space projects, but I recall thinking, "this looks nice".
Besides, let's be honest here. What practical functionality does WinFS provide that is above and beyond the combination of 'locate', and 'file' used in conjunction? WinFS seems to me to be merely a crude hack so as to make up for the fundamental shortcomings with MS's OS design.
Re:Filesystems are tools (Score:4, Insightful)
I absolutely agree. And I actually think the current interface to filesystems is good. I don't want any major changes, because major changes would most likely lead to all new kinds of metadata that no applications know how to deal with. And whenever your files get handled by a program without this knowledge, you lose metadata, which in turn means new applications that make use of the metadata get screwed. So most of this innovation will just give us lots of compatibility problems. If anybody really wants to innovate and produce something good, then they should write a clever implementation of the existing interface that works well for different cases: both small and large files, deep trees, many files per directory, few files per directory. AFAIK reiserfs and XFS are doing quite well.
(for example, FAT32 has found its way into grandmother's desktop and production web servers).
FAT is a horrible example, because it didn't become this widely used because of quality. Minix' FS is simpler than FAT, it has more features, and it is a lot faster for random access to large files. FAT-16 had problems with small files, because on large partitions you were forced to use large clusters, which means lots of disk space wasted (I have seen 30% waste in real life cases). FAT-32 did improve on the problem with small files, because now you could have much larger partitions with 4KB clusters. But since FAT-32 still uses linked lists to find data sectors (like previous FAT versions), FAT-32 is worse at handling large files than any previous filesystem. For example, seeking to the end of a 625MB file in 4KB clusters requires following 160000 pointers. Most other filesystems use a tree structure, which means you can typically index the entire file with at most 3 or 4 levels of indirection, so you need to follow 4 or 5 pointers. Would you try to cache the FAT table to speed up the access? Good luck: you would need 4 bytes of FAT table per 4KB cluster on the disk, so for a 160GB disk you would need to use 160MB of your RAM just to cache the FAT. And this doesn't get rid of the CPU time required to traverse the linked list.
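The arithmetic above can be checked with a short Python sketch. The function names are made up for illustration, and the tree fanout of 1024 is an assumption (4-byte block pointers in a 4KB index block), not a property of any particular filesystem. seek_fat walks a dictionary that stands in for the on-disk allocation table.

```python
CLUSTER = 4 * 1024  # 4KB clusters, as in the comment above
MB = 1024 * 1024
GB = 1024 * MB


def fat_chain_length(file_size):
    """Clusters in the file's chain: roughly the pointers followed to reach its end."""
    return (file_size + CLUSTER - 1) // CLUSTER


def fat_table_bytes(disk_size, entry_bytes=4):
    """Size of a table with one 4-byte entry per cluster on the disk."""
    return (disk_size // CLUSTER) * entry_bytes


def tree_levels(file_size, fanout=1024):
    """Levels of indirection needed to index every block of the file.

    fanout is an assumption: 4-byte block pointers in a 4KB index block.
    """
    blocks = (file_size + CLUSTER - 1) // CLUSTER
    levels = 1
    while fanout ** levels < blocks:
        levels += 1
    return levels


def seek_fat(fat, first_cluster, offset):
    """Walk a FAT-style linked list to find the cluster holding `offset`."""
    cluster = first_cluster
    for _ in range(offset // CLUSTER):
        cluster = fat[cluster]  # one pointer dereference per cluster skipped
    return cluster
```

With these definitions, fat_chain_length(625 * MB) is 160000 and fat_table_bytes(160 * GB) is exactly 160 * MB, matching the figures in the comment, while tree_levels(625 * MB) comes out at 2 under the assumed fanout.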
Re:Next generation? (Score:3, Insightful)
The leader of the pack... (Score:2, Insightful)
First, Unix file systems have for decades proved to fit the bill just fine. Searchable metadata and other "features" are actually application-level stuff. 98% of the data on an average Desktop Unix system (and 99.5% of the data on a server) does not need that, because it rarely changes and is of no special interest for the user at all. And if it does, application-level data is better integrated, faster, and more flexible.
What is happening here (and in many other recent discussions) is dragging the Free Software community into an arms race it can not win. You can't make Linux/Gnome (or FreeBSD/KDE or whatever your favorite Open Source system may be) into a system that is just like Longhorn or Tiger but gratis and free. Never. What Free Software really needs, now more than ever, is to be picky about its users, its uses and its features. Better offer 10% of all users a system that is better than offer 90% of them a system that is a poor emulation of the OS they get for free with their PC anyway.
This is a point that the Free Software community has to (re)learn, better today than tomorrow.
Re:EMBED VERSIONING! (Score:3, Insightful)
- this feature was not considered important by users and thus the systems offering it were not surviving
or:
- the feature was considered important or nice to have, but decision which system to buy was not made based on important or desirable features.
I think it could be the latter. However, that means that introducing useful features will not sell your system... what a wonderful world.
Not really (Score:3, Insightful)
Hardly. There are a lot of OSS projects that are leading the way with new technologies and in implementing good ideas.
But in quite a few areas it's not at all uncommon to see slow support for new tech. The community divides about how to implement the new ideas, which slows things down, but that division fosters competition and provides a base for testing out different ways of getting the new tech out the door.
Sometimes doing it well is better than doing it first.
Re:Another solution in search of problem (Score:4, Insightful)
A better label to use would be "complex". To respond to your argument that the only obstacles to db-fs are ignorance and blind conservatism, complex software is undesirable. It increases costs in terms of man hours to maintain it, it increases QA overhead, and it increases support calls from users who came to depend on a feature which was included for completeness, but was never audited for correctness or robustness. People don't code complex software unless they are paid to do it (and usually when a manager is making the technical decisions). This is the reason most open source/free software tools seem to follow the Unix philosophy; simple tools which do one task and do it well, but are yet flexible enough to build into more complex systems. A monolithic database filesystem does not appeal to the sort of psyche which produces open source code for that reason: Complexity doesn't make a programmer's job fun. In order to produce large amounts of code at a low cost as in the open source/free software world, the people behind the engineering of the software need to be having fun, and a complex database filesystem is a rather good example of something which is _not_ fun to produce and therefore unappealing to the hacker sort.
Re:not so fast ... (Score:4, Insightful)
They aren't a problem at all. Every email system can identify file formats it doesn't know how to deal with. Most can get external plugins. The file + attributes can be seen as just a type of file (like say
My wish: Case insensitivity (Score:2, Insightful)
-Lars
Re:Next generation? (Score:2, Insightful)
Of course interfaces need to change sometimes. But first you need to ask how much you're going to break. If the kernel hackers break existing interfaces too much, they risk alienating the users/distros and forking the kernel.
Is this a check on innovation? Absolutely. But I'll point out that Microsoft has even larger checks on innovation - they promise far more backward compatibility. And it has to be binary compatibility, which is harder than source compatibility.
Re:New FS (Score:1, Insightful)
How about a metadata standard? (Score:1, Insightful)
Is it just me, or are a lot of supposedly smart people missing the very obvious problem that you're not going to be able to exchange files between these different systems and keep the metadata? Before anybody ships a metadata-based system, I'd like to see some kind of standard defined for metadata interchange. Frankly, I think a standard is long overdue (please let filename extensions die!), and it's absolutely essential that a standard exist before different formats become too firmly entrenched.
You would think that people would have learned their lesson, but this is starting to look like the 80s all over again.
Re:The assumption is wrong (Score:3, Insightful)
On the other hand, it already does almost everything ReiserFS 4 _promises_ to do, and with NTFS it actually works, tried in the real world, and can be trusted.
Small files aggregation : NTFS stores small files in the MFT directly
Plugin : NTFS reparse points
Encryption : there since ages
etc..
Linux supports a lot of filesystems, but very few come close to NTFS when it comes to capabilities, scalability,...
Re:Laughable - at best. Likely just worthy of a gr (Score:3, Insightful)
I'm wondering if you even know what WinFS is; comparing it to file and locate is laughable at best.
Try finding all mp3s by Bryan Adams or Whitney Houston on a 200GB disk filled with 250'000 files with file and locate; you'll get the answer 10 minutes later.
With WinFS, it will take you a whopping 2 seconds maximum.
That is without talking about the user-defined attributes, etc... that make WinFS more powerful than anything of that kind on Linux.
Re:Next generation? (Score:1, Insightful)
Re:Filesystems are tools (Score:3, Insightful)
Re:Don't try to keep up with Microsoft and Apple (Score:2, Insightful)
Very true - locate, find and grep do their job quite well, so why should I dedicate 1/3rd of my HDD to index and database space?
Re:Don't try to keep up with Microsoft and Apple (Score:2, Insightful)
Not all information is text-based either. How do you store family photos in an easy to find manner? What directory/filename structure will allow you to quickly find the photo of daughterA at locationB taken sometime in yearC that doesn't contain ex-partnerD?
"Virtual folders" are the _best_ feature of the Evolution mail reader and beat the pants off a one-time search for something. This is simply because no matter how you organise your data, sometimes it just makes sense for it to be stored in two places at once. For example, the directory which holds photos of daughterA and the directory which holds photos taken at locationB. It would seem to me that storing this data multiple times (or making lots of soft links) creates far more overhead than storing some extra metadata which is available to every program you run.
Also, relying on metadata created and stored by an application leaves you beholden to that application.
Whether or not people will actually make any use of an FS's metadata capabilities is a separate issue. I don't want to spend all my time re-arranging directories and shuffling my data around to make it easier to find. I bought a computer to do such menial tasks...
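The photo example can be sketched in a few lines of Python. Everything here is hypothetical: the file names, the tags, and the year 2003 standing in for yearC are made up, and this says nothing about how Evolution or any metadata filesystem implements virtual folders. The point is only that a "folder" becomes a saved query over metadata, so one photo can show up in several folders without being copied or linked.

```python
# Hypothetical photo metadata; every name and tag here is made up for illustration.
PHOTOS = {
    "p001.jpg": {"people": {"daughterA", "ex-partnerD"}, "place": "locationB", "year": 2003},
    "p002.jpg": {"people": {"daughterA"}, "place": "locationB", "year": 2003},
    "p003.jpg": {"people": {"daughterA"}, "place": "locationE", "year": 2003},
    "p004.jpg": {"people": {"daughterA"}, "place": "locationB", "year": 2001},
}


def virtual_folder(predicate):
    """A 'folder' is just a saved query; files are never copied or linked."""
    return lambda: sorted(name for name, meta in PHOTOS.items() if predicate(meta))


# Two overlapping folders over the same files.
daughter_a = virtual_folder(lambda m: "daughterA" in m["people"])
at_location_b = virtual_folder(lambda m: m["place"] == "locationB")

# daughterA at locationB in the assumed year, without ex-partnerD.
wanted = virtual_folder(
    lambda m: "daughterA" in m["people"]
    and m["place"] == "locationB"
    and m["year"] == 2003
    and "ex-partnerD" not in m["people"]
)
```

Here wanted() returns only p002.jpg, while p001.jpg still appears in both daughter_a() and at_location_b(). The catch raised elsewhere in the thread still applies: the query is only as good as the metadata someone filled in.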
Re:Not gonna play with alpha code (Score:2, Insightful)
Nowadays I am running purely Linux, and I wish I could say the same. Fsck ring a bell? And no, the newer breeds aren't flawless yet. But it is good enough, so I'm using it.
Just silly to pick on one of the things MS has done that actually works - it may not be perfect, but it is far from bad. Sadly, it also seems far from being writable in a stable manner too.
Now, if you would like to pick on FAT32, I'm game. =)
Modern drives (Score:2, Insightful)
Jebus, do you want to run a production system on a drive with known bad blocks? Whoever hired you must be a complete moron.