The Linux Filesystem Challenge 654
Joe Barr writes "Mark Stone has thrown down the gauntlet for Linux filesystem developers in his thoughtful essay on Linux.com. The basic premise is that Linux must find a next-generation filesystem to keep pace with Microsoft and Apple, both of whom are promising new filesystems in a year or two. Never mind that Microsoft has been promising its "innovative" native database/filesystem (copying an idea from IBM's hugely successful OS/400) for more than ten years now. Anybody remember Cairo?"
New FS (Score:5, Interesting)
What are the winds of change saying? R..E..I..S..E..R...4... [namesys.com]
Re:New FS (Score:5, Informative)
Notice the plugin feature. This will create endless possibilities for what you can do with the file system. Want to tie a DB/SQL search function into it? Write a plugin. Want special security? Write a plugin. Tons of possibilities with ReiserFS4, and it is _very_ fast. This is hands down better than the MS "a filesystem as a DB" approach. ReiserFS4 will be like Firebird: lean-n-mean-n-fast. Want more features? Grab _your_ favorite plugins!
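The plugin chain can be sketched in user-space pseudocode. To be clear, this is only a toy model of the idea, not Reiser4's actual (kernel-side, C) plugin API, and every name in it is invented:

```python
import zlib

# Hypothetical user-space model of per-file plugins, loosely inspired by
# the Reiser4 plugin idea described above. All names are invented.

class FilePlugin:
    def on_write(self, path, data):
        return data  # default: pass data through unchanged

class CompressPlugin(FilePlugin):
    def on_write(self, path, data):
        return zlib.compress(data)  # a "special feature" as a plugin

class FileSystem:
    def __init__(self):
        self.store = {}
        self.plugins = []  # ordered chain of plugins

    def register(self, plugin):
        self.plugins.append(plugin)

    def write(self, path, data):
        for p in self.plugins:  # each plugin transforms the data in turn
            data = p.on_write(path, data)
        self.store[path] = data

fs = FileSystem()
fs.register(CompressPlugin())
fs.write("/tmp/a", b"hello" * 100)
print(len(fs.store["/tmp/a"]) < 500)  # True: stored smaller than the raw 500 bytes
```

The point of the structure is that security, search, or compression hook into the same chain without the core filesystem knowing about them.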
not so fast ... (Score:3, Insightful)
Filesystems are so crucial to OS stability, that I'd say it's worth formally-verifying them to a certain extent (i.e. prove that the algorithms/code work, instead of just observing that they work in normal conditions).
P.S. The whole thing - filesystem as a DB - is complete crap. You can't do a bunch of fs operations in a single transaction.
Re:not so fast ... (Score:3, Interesting)
The real killer is stored procedures. It'll be a cold day in hell before those are allowed into a kernel.
And how do you email files with attributes or other metadata? They're not part of the regular file data, so all the usual email client
Re:not so fast ... (Score:5, Funny)
Re:not so fast ... (Score:3, Interesting)
I do think this is really funny, though. The more functionality people want to cram into the FS, the more they're going to look back at that famous Usenet thread, and reconsider... ;-)
Re:not so fast ... (Score:3, Funny)
MORTICIAN: Here -- he says he's not dead!
CUSTOMER: Yes, he is.
DEAD PERSON: I'm not!
MORTICIAN: He isn't.
CUSTOMER: Well, he will be soon, he's very ill.
DEAD PERSON: I'm getting better!
CUSTOMER: No, you're not -- you'll be stone dead in a moment.
MORTICIAN: Oh, I can't take him like that -- it's against regulations.
DEAD PERSON: I don't want to go in the cart!
CUSTOMER: Oh, don't be such a baby.
MORTICIAN: I can't take him...
DEAD P
Re:not so fast ... (Score:4, Insightful)
They aren't a problem at all. Every email system can identify file formats it doesn't know how to deal with. Most can get external plugins. The file + attributes can be seen as just a type of file (like say
Re:not so fast ... (Score:5, Insightful)
Re:not so fast ... (Score:5, Informative)
Sorry, but you are wrong here. Reiser4 is atomic, and you can pack as many operations into one transaction as you like; you just have to use the reiser4 system call. This is because there is no standard system call for atomic filesystem transactions. Modern filesystems are databases built to store files and query them through filenames; reiser4 is the first filesystem where search paths can be handled through plugins, so you can index everything you want.
Re:New FS (Score:5, Interesting)
We output a lot of digitally-created video files that are huge (think HDTV resolution). Most of these files are output uncompressed because either (a) the file format doesn't support compression or (b) the multimedia program doesn't support compression. Either way, a few minutes of HDTV-quality uncompressed video will absolutely destroy a few hundred gigabytes of space in no time.
We have to hold on to some of this video for quite some time, but we only need to get at it infrequently. It's too big to fit on DVD-R's, tape is too slow, ZIPping it up hinders easy access later, and removable hard drives are expensive. File system compression, on the other hand, does wonders. We routinely get 60%-80% compression on archived video files, and it's allowed us to stretch our disk capacity a long, long way because of it.
We've considered archiving our video in some kind of compressed streaming format like AVI, Quicktime, or MPEG-2, but none of these offer lossless codecs that are appropriate for us, and we're unwilling to accept using a lossy compressor.
So, I ask the question again: when, if ever, is anyone going to implement file compression on a Linux file system? Or does it already exist but is buried somewhere in some arcane HOWTO or website?
Re:New FS (Score:3, Informative)
http://sourceforge.net/projects/e2compr/ (Ext2 Compression)
http://squashfs.sourceforge.net/ (Squashed - Read Only, don't know what that means)
S
Not gonna play with alpha code (Score:4, Insightful)
Sorry, I'm not about to trust archived video to alpha code, or even beta code. If there's no release-worthy option on Linux, we have to stick to NTFS on Windows.
Re:New FS (Reiser4 has a compression plugin coming (Score:5, Informative)
Hans
(You can email edward@namesys.com for details).
Re:New FS (Reiser4 has a compression plugin coming (Score:4, Informative)
If the filesystem does the compression, the apps (or you) can't see it happen. That's the POINT. Your suggestion, above, is ridiculous. If you had a tar.gz file, you could extract it to the FS, but it would actually be equally compressed (cause it's a gzip-compressed FS), and then you could play with the files to your heart's content, without worrying about the compression, cause it's transparent. You wouldn't need or want some kinda plugin or something...
Unless the FS wasn't compressed, and you wanted a transparent way to access tar.gz files. That idea would make sense.
Re:New FS (Score:3, Informative)
If you consider "streaming" to mean something like RealMedia or other web-based streaming codecs, you are correct. However, working in the DVD/Digital Video/Multimedia fields, we do refer to MPEG-2, AVI, and so forth as "streaming" formats because they are composed of one or more "streams" of content. Basically, the difference between what we have now (tens of thousan
Re:New FS (Score:5, Informative)
Re:New FS (Score:4, Informative)
Re:New FS (Score:3, Funny)
Re:New FS (Score:3, Interesting)
With Reiser3, doing `emerge -up --deep world` on my Gentoo box would usually take about 10 seconds after the progress spinner had started.
Now with Reiser4, it takes about 2 seconds after the progress spinner starts.
The speed really is absolutely amazing.
And from what I've read of Reiser4, it has all the database niceties for managing files and contents of files that WinFS is promising. Of course, Reiser4 currently exists and is working on my home gaming machine and 4 machines here at wor
Hans Reiser's vision of the future (Score:5, Informative)
Re:Hans Reiser's vision of the future (Score:5, Interesting)
This includes Beagler/Dashboard
http://www.nat.org/dashboard
http://www.gnome.
And of course, the ambitious Gnome Storage project, being pushed by Seth Nickell. He recently wrote a paper comparing all the technologies, found here:
http://www.gnome.org/~seth/blog/document-indexi
Re:Hans Reiser's vision of the future (Score:5, Informative)
HFS+ is the current OS X file system, and that of Tiger (the next revision of OS X) as well. Spotlight uses HFS+'s built-in metadata support to enhance its search capabilities. What Tiger additionally offers application developers is an API to add metadata to documents, something that was limited until now.
Re:Hans Reiser's vision of the future (Score:3, Informative)
What Tiger offers is a way for application developers to DECLARE metadata in their document formats... most formats have metadata of some kind already (in an mp3, id3 tags; in an image, resolution, etc.; in a source file, dependencies and exported symbols); what Tiger lets application developers do is tell Spotlight how to find the information that's already there.
Now, this may lead
Re:Hans Reiser's vision of the future (Score:4, Informative)
= 9J =
Don't try to keep up with Microsoft and Apple (Score:5, Insightful)
Re:Don't try to keep up with Microsoft and Apple (Score:4, Insightful)
In this case, they're one and the same.
Re:Don't try to keep up with Microsoft and Apple (Score:3, Interesting)
Well, the key to a database filesystem will be seamless data entry and simple, powerful access to search and reporting features.
I'm not sure what you mean by "seamless data entry." Maybe I missed something in the article. Are you suggesting people will be willing to provide meaningful metadata for a file when they aren't even willing to provide a meaningful file name? And "powerful access to search and reporting features"? As opposed to wimpy access to search and reporting features? It sounds a bit
Re:Don't try to keep up with Microsoft and Apple (Score:4, Insightful)
MS has basically announced/demonstrated most of the new features that are in Longhorn. Effectively that has given the Linux community two years to come up with competing features. Adding database features to a filesystem makes sense; BeOS demonstrated that you can do some nifty stuff with it, and both Apple and MS have announced plans to do the same.
The Linux community, however, is divided. You can install reiserfs, maybe develop some tools that use some of its more advanced features, but that doesn't fundamentally change anything if OpenOffice, KDE, Gnome, and other programs don't coevolve to use the new features.
The same goes for stuff like Avalon. While everybody is still talking about how such technology might be used in OSS projects like Mozilla or Gnome, MS is well on their way to implementing something that may actually work.
Filesystems with rich metadata were already a good idea ten years ago. The OSS community has talked about them where others have implemented them. Two years of more talking would be fairly consistent. IMHO the OSS community is underperforming in picking up new technology and good ideas.
Not really (Score:3, Insightful)
Hardly. There are a lot of OSS projects that are leading the way with new technologies and in implementing good ideas.
But in quite a few areas it's not at all uncommon to see slow support for new tech. The community divides about how to implement the new ideas, which slows things down, but that division fosters competition and provides a base for testing out different ways of getting the new tech out the door.
Sometimes
Re:Don't try to keep up with Microsoft and Apple (Score:4, Insightful)
How's about this for a better idea, instead of trying to keep with Microsoft try to keep up with sound software engineering principles in designing our file systems?
There may even come a time when the required action to implement this idea is to do nothing.
KFG
Re:Don't try to keep up with Microsoft and Apple (Score:4, Interesting)
As a linux user, I don't sit back and think "this filesystem sucks". For the most part, I'm happy with ext3.
When I do try to make a wishlist, the only things I really want are KDE's IO Slaves integrated into the system at a lower level so that all programs can use them, and a more secure version of NFS. That's it. Perhaps some sort of revision control on certain files, but RCS works fine for me.
I don't want data forks -- they create more problems (with transferring files) than they solve.
For a similar reason, I don't want my filesystem to be a DB. I'm happy with files. Damn happy. I don't see what problems a database solves.
Just my $.02.
I'm not the market they are looking for (Score:3, Insightful)
Recall that Mark Stone... (Score:4, Informative)
easy answer (Score:5, Insightful)
We live in a network-based universe. Local filesystems are already good - whether it's just continued development in Reiser, or whatever else.
NFSv4, though - it's like AFS, only without the sucky stuff. AIX is now including NFSv4 in its 5.3 release, even! With the Big Dog on board, we should realize there's wisdom in that direction ;)
Re:easy answer (Score:4, Informative)
This is very much like saying "the future of filesystems is apache2, local filesystems are already good, now we have to concentrate on apache2".
ReiserFS is pretty damn good (Score:5, Informative)
Re:ReiserFS corruption (Score:3, Interesting)
Trust the Kernel team (Score:3, Insightful)
Always the stable version; I didn't have problems until I tried reiserfs, switched back to ext2 (ext3, actually), and I didn't have problems again.
The source of my problem appears to be reiserfs, directly or indirectly - I don't know, and like most users I don't really care either.
Doctor it hurts when I do this.
Then stop doing that.
Good enough for me.
I want a transparent filesystem/VM (Score:5, Interesting)
I want a disk equivalent of top - something that'll tell me what processes are kicking the shit out of the disks, and by how much.
If Linux could do that - it's more a VM thing than a filesystem - I'd stick with ext3 for years to come.
Who needs a filesystem in a database when you have a database that lives on your filesystem (updatedb). Get that updating in realtime, with more things (like permissions, access times etc.) and a lot of the work is done.
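The updatedb-with-more-metadata idea is easy to prototype. A toy sketch (not updatedb itself -- the table layout and names here are made up for illustration):

```python
import os, sqlite3, stat, tempfile

def build_index(root, db=":memory:"):
    """Walk `root` and index path, size, mode, and mtime -- an updatedb
    with extra metadata, as suggested above. Toy code, not updatedb."""
    con = sqlite3.connect(db)
    con.execute("CREATE TABLE files (path TEXT, size INT, mode INT, mtime REAL)")
    for dirpath, _, names in os.walk(root):
        for name in names:
            p = os.path.join(dirpath, name)
            st = os.stat(p)
            con.execute("INSERT INTO files VALUES (?,?,?,?)",
                        (p, st.st_size, stat.S_IMODE(st.st_mode), st.st_mtime))
    con.commit()
    return con

# demo on a scratch directory
root = tempfile.mkdtemp()
with open(os.path.join(root, "big.bin"), "wb") as f:
    f.write(b"x" * 4096)
with open(os.path.join(root, "small.txt"), "w") as f:
    f.write("hi")
con = build_index(root)
rows = con.execute("SELECT path FROM files WHERE size > 1000").fetchall()
print(len(rows))  # 1 -- only big.bin matches the "database" query
```

Queries then hit the index instead of the disk, which is the whole trick the comment is describing.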
john
Re:I want a transparent filesystem/VM (Score:3, Informative)
PR & tech journalists to the contrary, that is all that is involved in Spotlight & WinFS. Spotlight runs on HFS+. WinFS runs on NTFS. Both are databases stored as files on existing filesystems. The only difference between those databases & updatedb is that they may be using bet
dtrace (Score:5, Informative)
It is, however, expert-driven, unlike top, which is simple to use. Still, I think that dtrace shows the future of performance monitoring apps.
Note that dtrace lives partially in the kernel - it's not portable to Linux.
Filesystems are tools (Score:5, Insightful)
So to develop a one handy "swiss army knife" of filesystems may not be the best route. For the most part one knows what a system will be doing and can build in the most appropriate filesystem for the job.
Re:Filesystems are tools (Score:5, Insightful)
Re:Filesystems are tools (Score:4, Funny)
Wow, your grandmother has production webservers! Cool.
Re:Filesystems are tools (Score:4, Insightful)
I absolutely agree. And I actually think the current interface to filesystems is good. I don't want any major changes, because major changes would most likely lead to all new kinds of metadata that no applications know how to deal with. And whenever your files get handled by a program without this knowledge, you lose metadata, which in turn means new applications that make use of the metadata get screwed. So most of this innovation will just give us lots of compatibility problems. If anybody really wants to innovate and produce something good, they should write a clever implementation of the existing interface that works well for different cases: both small and large files, deep trees, many files per directory, few files per directory. AFAIK reiserfs and XFS are doing quite well.
(for example, FAT32 has found its way into grandmother's desktop and production web servers).
FAT is a horrible example, because it didn't become this widely used because of quality. Minix's FS is simpler than FAT, has more features, and is a lot faster for random access to large files. FAT-16 had problems with small files, because on large partitions you were forced to use large clusters, which means lots of disk space wasted (I have seen 30% waste in real-life cases). FAT-32 did improve on the problem with small files, because now you could have much larger partitions with 4KB clusters. But since FAT-32 still uses linked lists to find data sectors (like previous FAT versions), FAT-32 is worse at handling large files than any previous filesystem. For example, seeking to the end of a 625MB file in 4KB clusters requires following 160000 pointers. Most other filesystems use a tree structure, which means you can typically index the entire file with at most 3 or 4 levels of indirection, so you need to follow only 4 or 5 pointers. Would you try to cache the FAT table to speed up the access? Good luck: you would need 4 bytes of FAT table per 4KB cluster on the disk, so for a 160GB disk you would need about 160MB of your RAM just to cache the FAT. And this doesn't get rid of the CPU time required to traverse the linked list.
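The pointer counts above check out. A quick back-of-the-envelope check (the branching factor of 1024 pointers per indirect block is an assumption for illustration, not a property of any particular filesystem):

```python
import math

CLUSTER = 4 * 1024               # 4 KB clusters
FILE_SIZE = 625 * 1024 * 1024    # a 625 MB file

# FAT: one linked-list hop per cluster to reach the end of the file
clusters = FILE_SIZE // CLUSTER
print(clusters)  # 160000

# Tree-structured FS: with 1024 pointers per indirect block (assumed),
# the number of indirection levels grows only logarithmically
levels = math.ceil(math.log(clusters, 1024))
print(levels)  # 2 -- a handful of pointer follows instead of 160000

# FAT cache cost: 4 bytes of FAT entry per 4 KB cluster on a 160 GB disk
fat_bytes = (160 * 10**9 // CLUSTER) * 4
print(fat_bytes // 10**6)  # 156 -- i.e. roughly 160 MB of RAM for the FAT
```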
Gnome Storage (Score:5, Interesting)
This way we can test the waters without messing with the kernel. When the concept is tried, we can decide if we make PostgreSQL a required part of a GNU/Linux system, or a Hurd translator, or whatever.
But... (Score:3, Interesting)
I have about 18GB of files in my main home dir, and I can search it in seconds with slocate, and if I need a content search, with glimpse.
I know that this kind of database FS provides a lot of cool opportunities in terms of meta-data, but how useful is it for non-techies, who usually don't name their files coherently, let alone correct ID3 tags or other meta-data?
Compatible (Score:5, Funny)
Linux.com (Score:3, Funny)
File versioning (Score:4, Interesting)
There is an expectation that the application should do it; that means extra code in each application, and they all do it slightly differently.
OK: we'd need an O_NOVERSION flag on open(2) if the app *really* doesn't want this -- e.g. a database.
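No such flag exists in Linux today; here is a user-space sketch of the proposed semantics (VMS-style version suffixes, with an opt-out parameter playing the role of O_NOVERSION -- all names invented):

```python
import os, tempfile

def versioned_write(path, data, noversion=False):
    """Write `path`, first renaming any existing copy to path;1, path;2, ...
    VMS-style. `noversion` plays the role of the proposed O_NOVERSION flag.
    Purely illustrative -- no real open(2) flag does this."""
    if not noversion and os.path.exists(path):
        n = 1
        while os.path.exists(f"{path};{n}"):
            n += 1
        os.rename(path, f"{path};{n}")
    with open(path, "w") as f:
        f.write(data)

d = tempfile.mkdtemp()
p = os.path.join(d, "report.txt")
versioned_write(p, "draft 1")
versioned_write(p, "draft 2")                   # old copy becomes report.txt;1
versioned_write(p, "db page", noversion=True)   # a database would opt out
print(sorted(os.listdir(d)))  # ['report.txt', 'report.txt;1']
```

Done in the filesystem instead, every application would get this for free, which is exactly the parent's complaint about per-app implementations.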
Re:File versioning (Score:4, Interesting)
Interestingly enough, Microsoft has implemented just that very feature in Windows Server 2003. They call it "Shadow Copy Volume" and it's accessed through a "Previous Versions Client" add-on to any file's properties. If you overwrite or delete a file on a Shadow Copy Volume-enabled network share, you can just right-click on the file, select "Properties," and go to the "Previous Versions" tab to see all the prior versions of that file. You can recover any one of them you like and save it anywhere you like. Further, the server only saves the deltas between changes, so it's very space efficient.
This is one feature I'd *love* to see implemented on Linux. I don't think this is in Samba yet, is it?
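The delta-only storage idea can be sketched in a few lines. This toy reverse-delta store only illustrates the concept -- it is not how Shadow Copy is actually implemented:

```python
import difflib

class ShadowStore:
    """Toy reverse-delta version store: the latest version is kept whole,
    and each older version is kept only as a diff against its successor.
    A sketch of the "store only the deltas" idea, nothing more."""
    def __init__(self):
        self.latest = None   # list of lines of the newest version
        self.deltas = []     # deltas[i] regenerates version i from version i+1

    def commit(self, lines):
        if self.latest is not None:
            # ndiff output lets difflib.restore() recover either side
            self.deltas.append(list(difflib.ndiff(self.latest, lines)))
        self.latest = lines

    def version(self, i):
        """Recover version i (0-based); index len(self.deltas) is the latest."""
        lines = self.latest
        for delta in reversed(self.deltas[i:]):
            lines = list(difflib.restore(delta, 1))  # side 1 = the older text
        return lines

s = ShadowStore()
s.commit(["hello", "world"])
s.commit(["hello", "there", "world"])
print(s.version(0))  # ['hello', 'world']
print(s.version(1))  # ['hello', 'there', 'world']
```

Only the changed lines cost extra space per version, which is the space-efficiency claim made above.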
Keep it all modular, please (Score:5, Interesting)
XFS comes close, and ReiserFS 4 is nice, too. The most important thing is keeping the base filesystem simple and FAST. You think NTFS is fast? Try deleting a complete Cygwin install (>30K files). It takes AGES, even from the command prompt. I've deleted 15K files (that's 15 THOUSAND files) on Reiser 3 on the same machine; it took a few seconds.
DO NOT make a database driven filesystem. Some day we will have a true, document based desktop paradigm (OpenDoc anyone?) but probably not for several years, until then we need SPEED.
Speed and Versioning (Score:3, Interesting)
Next generation? (Score:5, Interesting)
Solid, universal support for ACLs, and while we're at it, let's fix the whole user/group namespace mess Unix has. Let's use an SID-style ID like Windows does.
For example: my small network at home, centrally authenticated through ldap.
Now, Windows knows the difference between the user "jim" on local machine A, "jim" on machine B, and "jim" the domain user. They'd be shown as MACHINEA/jim, DOMAIN/jim, etc. The various SIDs take the domain (or workstation) SID and append the UID. So if his number is 100, his SID is "long-domain-sid" + uid. So when you pass around SID tokens, you know exactly which jim you're talking about.
Now in Linux, we just have numbers for users and groups. If user 100 on machine A is "jim", user 100 could be "sally" on machine B. Moving that stuff to LDAP becomes messy; now I have to reconcile the numbering schemes of all the machines I want to migrate. Ick. And you get all kinds of screwy stuff sharing folders: if you ls a share on one machine, it'll show wholly different ownerships. It's the source of about a billion and one NFS security holes.
And of course, since a file can only have one permission set - owner, group, other - it sure does make for some sucky shit. The lazy among us would just run as root all the time to avoid the whole damn mess.
I know there's a circle jerk of workarounds, patches and gotchas to avoid this, but it should never be a problem in the first place. The basic unix security model is out-of-date, and is the source of many systemic problems.
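The collision described above is easy to see in a toy model (the SID-style format here is invented for illustration, not the real Windows encoding):

```python
# Bare numeric UIDs collide across machines; domain-qualified IDs don't.
machine_a = {100: "jim"}
machine_b = {100: "sally"}

# An NFS-style export only ships the number, so ownership gets scrambled:
uid_on_wire = 100
print(machine_a[uid_on_wire], "vs", machine_b[uid_on_wire])  # jim vs sally

# SID-style: (authority, relative id) -- format invented for illustration.
# Same number, but the qualified identities stay distinct:
sid_jim   = ("S-1-MACHINEA", 100)
sid_sally = ("S-1-MACHINEB", 100)
print(sid_jim == sid_sally)  # False: no ambiguity about which user is meant
```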
Re:Next generation? (Score:3, Informative)
UNIX has traditionally been about big systems with multiple users. Networks have been a standard feature for decades. In this sort of environment, you'd naturally use some network-oriented naming service, be it NIS or LDAP.
Windows has grown from a PC background where everything is traditionally local. In a networked environment there is little need for the MACHINEA/user when there is a DOMAIN/user (some
Re:Next generation? (Score:5, Interesting)
Or rather, it is the source of the NFS security hole. But it's okay. NFS4 (or 3, even) with Kerberos totally solves this problem, much more elegantly.
Everyone's all excited by ACLs, but I'm sceptical of their real-world value. The "keep it simple" principle of security can't be emphasized enough. With ACLs, you have to really examine the access rights of a given object to figure out what's going on. With the standard Unix user/group system -- with simple directory-based inheritance -- it's completely transparent.
And, most importantly, I've yet to see one thing worth doing with ACLs which couldn't be set up with user/group permissions instead -- and more simply.
Re:Next generation? (Score:5, Interesting)
In the NT 4.0 days, one of the better ways to handle permissions was the 'AGLP' standard. User A)ccounts go in G)lobal groups, G)lobal groups go in L)ocal groups, and local groups get P)ermissions.
This allows a nice level of indirection. I implemented this standard by specifying that Global groups described groups of people, and that Local groups specified access privileges. I built Local groups on each server describing the kind of access privileges they offered. Generally, I would make four groups for each of my intended shares: Share NA (no access), Share RO, Share RW, and Share Admin. I would assign the appropriate ACLs in the filesystem, and then put Global groups from the domain into the proper Local groups. The Accounting group, for instance, might get RW on the Accounting share. Management might get RO, and the head of Accounting and the network admins would go into the Share Admin group.
What this meant was that, once I set up a server, I *never again* had to touch filesystem permissions. Not ever. All I had to do was manipulate group membership with User Manager... with the caveat, of course, that affected users had to log off and on again for permissions to take effect. But this is also true with Unix, in many cases. (when group membership changes).
Note that Windows 2K and XP have more advanced ways to handle this, so don't use this design in a Win2K+ network.... this is the beginnings of the right idea, but 2K added some new group concepts. Under Active Directory, this idea isn't quite right. (I'd be more specific but I have forgotten the details... I don't work much with Windows anymore.)
ACLs are key to this setup, because I can arbitrarily specify permissions and assign those permissions to arbitrary groups.
By comparison, User, Group, and Other are exceedingly coarse permissions, and it is very easy to make a mistake. What if someone from Human Resources needs access to a specific Accounting share, but nothing else? Under Unix, I can't just put them in the Accounting group, because that will give them access to everything under that Group permission. I'd probably have to make a new group, and put everyone from Accounting and the one person from HR into that, and then put the special shared files into a specific directory, and make sure the directory is set group suid. That is a lot of steps. Everything is always done in a hurry in IT, and lots of steps are a great way to make mistakes. Messing up just one can result in security compromise.
In my group-based ACL system, I'd still have to make a custom group, perhaps "HR People With Access to Accounting Share". But I'd only have to touch one user account, the HR person's, and wouldn't have to disrupt Accounting's normal workflow at all, or touch any filesystem permissions.
Instead of a whole series of steps, any one of which can be done wrong, I have only three: Create new Global group, put HR person in new Global group, put Global group in the correct Local group. All done. Hard to screw this up too badly.
Now, I'll be the first to admit that a badly-implemented ACL setup is a complete nightmare. But a clean, well-thought-out ACL system, in a complex environment, is virtually always superior to standard Unix permissions.
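The AGLP chain above can be modeled in a few lines. All group and share names here are invented for illustration, and real NT ACLs are of course richer than a single permission string:

```python
# Toy model of the AGLP pattern: Accounts -> Global groups -> Local
# groups -> Permissions. Admins only ever touch group membership.

global_groups = {"Accounting": {"alice", "bob"}, "HR": {"carol"}}
# Local groups on the file server are what actually appear in the ACL:
local_groups = {"AcctShare_RW": {"Accounting"}, "AcctShare_RO": {"Management"}}
acls = {"/shares/accounting": {"AcctShare_RW": "rw", "AcctShare_RO": "ro"}}

def access(user, path):
    """Resolve access by walking account -> global -> local -> ACL."""
    for lg, perm in acls[path].items():
        for gg in local_groups[lg]:
            if user in global_groups.get(gg, set()):
                return perm
    return None

print(access("alice", "/shares/accounting"))  # rw -- via Accounting
print(access("carol", "/shares/accounting"))  # None -- HR has no path in

# Granting the one HR person access is the three-step fix described above:
global_groups["HR_AcctAccess"] = {"carol"}          # 1. new global group
# 2. put the HR person in it (done at creation)     # 3. add to local group:
local_groups["AcctShare_RW"].add("HR_AcctAccess")
print(access("carol", "/shares/accounting"))  # rw -- no ACL was touched
```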
Re:Next generation? (Score:3, Insightful)
Is the Linux/Unix community so "steeped in tradition" (also known as stubbornness, obstinacy, intolerance, and narrow-mindedness) that it willfully clings to an outdated, inferior way of doing things?
Re:Next generation? (Score:4, Interesting)
No, the entire OSS community is not an echo chamber, but the kernel development community is. I've seen flamefests caused by some poor soul suggesting very minor changes to, for example, the semantics of pipes. Unix isn't just written in stone, it's laminated and stored in an evacuated nuclear-blastproof case 500 meters underground.
Is the Linux/Unix community so "steeped in tradition" (also known as stubbornness, obstinacy, intolerance, and narrow-mindedness) that it willfully clings to an outdated, inferior way of doing things?
Again, it is not the community as a whole which is stuck, but the kernel people. I was simply pointing out the truth, not trying to say it's a good thing. Although I think it is wise to be suspicious of radical new ideas until they have proven themselves, I think that many times ideas are rejected for purely dogmatic reasons, and that really restricts innovation.
Re:Next generation? (Score:3, Insightful)
Why????? (Score:3, Informative)
Any good filesystem (XFS, JFS, ext3) now has a nice feature called Extended Attributes, which is intended for STORING such data (like previews etc.). And using a user-space server, it's much easier to add plug-ins for various file formats, "search" plugins, etc.
Do we all have to make the same mistakes? (Score:4, Interesting)
Whether or not it is useful, one thing is clear: this sort of thing is not "innovation". Databases as file systems have been around for decades, as has the question of file system metadata. The UNIX choices in this area are by design, not by an accident of history, and the motivations behind those choices are as valid today as they were several decades ago.
Linux is a ways yet from having a fully attributed, database-driven, journaling filesystem. The direction of future development looks promising, though. Linux will certainly compete as the search wars come to the desktop. Linux's value to the enterprise depends on it.
There are two things one needs to keep apart: what functionality do we want to support and how do we want to support it. Search clearly is important. Metadata clearly is important. However, whether a "fully attributed, database-driven, journaling filesystem" is the best way of implementing those features is an open question. There are many possible alternative designs.
And, in fact, it seems right now as if Microsoft is not building what the author thinks they are building, but is choosing an implementation strategy that combines a more traditional file system with user-level databases.
been there, had that (Score:3, Informative)
The seamless filesystem-in-a-database was created in the Multi-Valued DB structure [multivaluedatabases.com] in the mid-60's and released as the Pick OS [wikipedia.org]. It is still sold by Raining Data [rainingdata.com] and runs on Windows, Unix, and Linux.
OS400 is still different (Score:3, Interesting)
Why no MS DBFS? (Score:3, Interesting)
Palm (Score:3, Insightful)
The assumption is wrong (Score:3, Insightful)
This whole article is based on nonsense. Microsoft has a long way to go before it catches up with Linux in the filesystem area. There is no realistic prospect of Microsoft keeping pace with Linux filesystems in the foreseeable future.
(Before dismissing me a Linux fanboy, note that the above applies only to filesystems. When it comes to understanding of GUI issues, I'd make a similar statement but with Linux and Microsoft swapped. But that would be off-topic.)
Re:The assumption is wrong (Score:3, Insightful)
On the other hand, it already does almost everything ReiserFS 4 _promises_ to do, and with NTFS it actually works, has been tried in the real world, and can be trusted.
Small files aggregation : NTFS stores small files in the MFT directly
Plugin : NTFS reparse points
Encryption : there for ages
etc..
Linux supports a lot of filesystems, but very few come close to NTFS when it comes to capabilities, scalability,...
EMBED VERSIONING! (Score:3, Insightful)
Embed versioning into the filesystem. I believe Reiser has talked about this. Imagine being able to right-click on a file, folder or even partition and choose "roll back" or "restore" from the context menu. It then presents you with a list of snap-shot points you can restore to, starting with "last change".
Who backs up their hard drives any more? Have you thought of the problems and time involved in backing up 40, 80 or even 200 Gb of data? I'd MUCH MUCH rather have this feature than some enhanced search.
Re:EMBED VERSIONING! (Score:3, Insightful)
- this feature was not considered important by users and thus the systems offering it were not surviving
or:
- the feature was considered important or nice to have, but decision which system to buy was not made based on important or desirable features.
I think it could be the latter. However, that means that introducing useful features will not sell your system... what a wonderful world.
I'm working on this problem today (Score:3, Insightful)
- Integration with a Kerberos SSO strategy
- Fast performance
- Cross-platform compatibility with Windows
- Robust Access Control mechanisms, RBAC would be nice but DACL is probably reality.
In my opinion, these are the primary goals companies are looking for. Not a "journaling" file system or built-in encryption. Sure, those are nice, but let's get the basics first. Unfortunately, CIFS is still in quite a state of beta (even on the 2.6 kernel), and there don't seem to be any real alternatives.
Apple does NOT have a new FS coming out. (Score:4, Informative)
Their solution is to build a service that can interact with individual files, including their native metadata (ID3 tags, PDF metadata, MS Office metadata, email headers, etc.) through metadata importers, and to store the metadata indexes in a separate database. This is relatively similar to how iTunes does its thing. The services will have lots of APIs open to apps to incorporate the functionality locally.
The obvious clue that HFS+ isn't going away is that Apple is finally pushing full HFS+ support back up to the command line utils like cp to support resource forks and whatnot in 10.4, so hopefully we can stop needing OS X specific tools like ditto.
They've been adding improvements steadily over the years, such as journaling and most recently case sensitivity. The more obvious question to me is why doesn't the Linux community just jump all over HFS+ and build off of Apple's work since they seem more than willing to give the HFS+ support back anyway?
Re:Apple does NOT have a new FS coming out. (Score:3, Informative)
Because the features they're adding to HFS+ are already available in other filesystems? There's nothing in HFS+ that would make linux users want to use it, and some compelling reasons why they would not. (Performance, size limits, lack of an online resizer, etc.)
Laughable - at best. Likely just worthy of a groan (Score:3, Insightful)
Granted, the proposed featureset of WinFS is vastly 'superior' to that of the 3 main Linux contenders, but it could be argued that WinFS is neither a filesystem itself, nor on par with any of the Linux filesystems in terms of performance or stability (if NTFS5 is any indication).
I seem to recall reading about several projects that implement WinFS-like features. I don't recall what they were, and I don't think they were kernel-space projects, but I recall thinking, "this looks nice".
Besides, let's be honest here. What practical functionality does WinFS provide that is above and beyond the combination of 'locate', and 'file' used in conjunction? WinFS seems to me to be merely a crude hack so as to make up for the fundamental shortcomings with MS's OS design.
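The "locate plus file" combination the parent describes can be sketched in a few lines. This is a rough stand-in, assuming a simple directory walk rather than locate's prebuilt database or file's libmagic type detection:

```python
import os

def find_files(root, name_fragment, ext=None):
    """Walk a tree and return paths whose names contain a fragment,
    optionally filtered by extension.

    A crude stand-in for `locate` piped through a type filter: the
    real tools use an updatedb index and libmagic, but the matching
    idea is the same.
    """
    hits = []
    for dirpath, _dirs, files in os.walk(root):
        for fname in files:
            if name_fragment.lower() in fname.lower():
                if ext is None or fname.lower().endswith(ext):
                    hits.append(os.path.join(dirpath, fname))
    return sorted(hits)
```

The obvious difference from WinFS-style search is that this matches only on names, not on content or metadata — which is exactly the point of contention in this thread.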
Re:Laughable - at best. Likely just worthy of a gr (Score:3, Insightful)
I'm wondering if you even know what WinFS is; comparing it to file and locate is laughable at best.
Try finding all MP3s by Bryan Adams or Whitney Houston on a 200 GB disk filled with 250,000 files using file and locate; you'll get the answer ten minutes later.
With WinFS, it will take you a whopping two seconds at most.
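The speed difference being argued about here comes from indexing: scanning 250,000 files re-reads every tag each time, while a prebuilt index answers from one lookup. A minimal sketch of the indexed approach using sqlite3 (the paths and artists are invented for illustration):

```python
import sqlite3

def build_index(tracks):
    """Build an in-memory artist index, the way an indexed search
    service would: pay the scan cost once up front, then answer
    queries from the index instead of re-reading every file."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE tracks (path TEXT, artist TEXT)")
    db.execute("CREATE INDEX idx_artist ON tracks(artist)")
    db.executemany("INSERT INTO tracks VALUES (?, ?)", tracks)
    return db

def by_artist(db, artist):
    """Indexed lookup: no filesystem scan at query time."""
    rows = db.execute(
        "SELECT path FROM tracks WHERE artist = ? ORDER BY path",
        (artist,))
    return [row[0] for row in rows]
```

Whether that index belongs inside the filesystem (the WinFS position) or in a user-space service (the iTunes/Spotlight position) is exactly what this thread is disputing.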
That is
EXT3 FS (Score:3, Interesting)
While I personally believe Red Hat is known to push "unstable" releases, I was surprised that from 8.0 through EL3 the ext3 fs was still crashing and Red Hat was still offering it as the default on an install.
Has anyone else had better experiences with ext3? I am curious if anyone has more information on why this FS seems so damn unstable.
For test purposes we run "out of the box" installs, so there should be no kernel tweaking or any other "anomalous" things going on with the installs or the boxes.
Some features I'd like to see in a new filesystem (Score:3, Interesting)
Suppose you had an infinite number of loopback devices, hidden and used internally by the filesystem, and when you started an application you could "mount" what for many intents and purposes looks like a TARBALL — and the application in question, and ONLY that application, got to see all the files in this TARBALL. The files inside a "TARBALL" of this nature would probably not be compressed, but they could be if desired. Well, that is the concept of a Partitioned Data Set.
In the case of a user logging in, when the shell is started a mount could take place against the user's private data set. By doing this on a shared machine, file security can be guaranteed. For export and import the system could mount a "shared" dataset.
This sort of security is far superior to ACLs and anything present filesystems offer, for the very simple reason that normal people, including systems administrators, would not normally see any of the files inside one of these datasets. Consider the advantages of running an Apache server where you KNOW all associated files needed by that release of Apache are in a single dataset. There IS no easy way to lose a file, clobber it, accidentally delete it, and so forth. Next consider that when that copy of Apache starts up, it _could_ simply mount a set of files, each of which contains the whole website for a given domain.
Upgrading to a new copy of Apache would be as simple as copying in a new dataset and mounting it against the web datasets. If a glitch is found, simply restart the old copy.
Backing up a set of files becomes a simple copy operation. Replication can be accommodated as well.
Systems administration on those old IBM mainframes was MUCH easier than on UNIX systems, and this is in large part because of the way the system handled partitioned data sets.
------------
Now, with this we would want to be able to mark certain files as being external sort of like opening up a window, and through this window we could for instance access certain files which might be the executables and supporting scripts.
Of course people will point out we can accomplish some of this with a loopback mount. The problem with a loopback mount is that it populates the directory tree, and this is exactly what I want to avoid. Frankly, there really *IS* no reason for even a sysadmin to be able to see 90% of the files that, say, constitute a web server, or Postfix, or PostgreSQL. We accomplish a lot if the executable which needs access to its supporting files has a "private" loopback, and only this executable, by default, gets to see the mounted dataset.
--------------
The next idea is versioning the way Digital Equipment Corporation did it on the VAX. We simply append a version number, and what delete does is bump that version number rather than remove anything. With disk drive capacities heading into the stratosphere, there is no reason to be conservative.
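The VMS-style scheme can be sketched in a few lines: "delete" just retires the current copy under the next free `;N` suffix, so nothing is ever lost. A toy illustration of the naming convention, not of how VMS actually implements it:

```python
import os

def vms_delete(path):
    """Delete by renaming to the next free ';N' version, VMS style.
    Nothing is ever removed; old versions pile up until the user
    explicitly purges them."""
    n = 1
    while os.path.exists(f"{path};{n}"):
        n += 1
    os.rename(path, f"{path};{n}")
    return f"{path};{n}"
```

After two delete/recreate cycles you'd have `report.txt;1` and `report.txt;2` sitting beside the live file, exactly the behavior the parent is asking for.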
And this leads to the next idea which has been mentioned before... that is replication - across machines.
I can buy for twenty bucks a machine (P1, 200 MHz) that can run a 20 GB hard drive, and in fact I think they can run 80 GB hard drives as well. Rsync is useful, but a full replicating filesystem at the kernel level, or at least a daemon close to the kernel, would mean that a machine could be backed up to another machine, perhaps in another building, automatically and with little effort.
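The core of the rsync-style replication being described — copy only what is new or changed — fits in a few lines. This is a naive one-way mirror sketch (mtime comparison only); the real rsync adds delta transfer, deletion handling, and network support:

```python
import os
import shutil

def mirror(src, dst):
    """One-way sync: copy files that are missing from dst or newer
    in src. Returns the list of destination paths written."""
    copied = []
    for dirpath, _dirs, files in os.walk(src):
        rel = os.path.relpath(dirpath, src)
        target_dir = os.path.join(dst, rel)
        os.makedirs(target_dir, exist_ok=True)
        for fname in files:
            s = os.path.join(dirpath, fname)
            t = os.path.join(target_dir, fname)
            if (not os.path.exists(t)
                    or os.path.getmtime(s) > os.path.getmtime(t)):
                shutil.copy2(s, t)  # copy2 preserves the mtime
                copied.append(t)
    return copied
```

Run nightly from cron against a drive on that $20 machine, this gives the "backed up to another building automatically" behavior, minus the kernel-level elegance the parent wants.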
Well, I'm sure other people have other things they might like to add. This is my wish list.
Re:bah! (Score:5, Funny)
Re:A year or two, or... three? (Score:3, Interesting)
In the early days of Linux (1992/1993ish) a new filesystem seemed to appear each week. Most were pretty unstable, though. My first Linux machine, which started out as v0.11, kept its root partition as minix-fs for a long time for this reason (and also because I didn't feel like recreating my system).
Re:Is it? (Score:3, Informative)
In a nutshell (Score:3, Interesting)
(ie "Replacing node 37827 with node 5279867....replaced")
Once the modification is done, it erases the entry.
After a crash, the system only needs to look in the journal file to know which files 'might' be corrupted, and restore the old version of each...
At least, that's how I understand it...
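The parent's description is roughly right for metadata journaling. A toy write-ahead journal makes the recovery step concrete — record the intent, do the write, erase the entry; everything here (the JSON format, the single-entry journal) is invented for illustration:

```python
import json
import os

def journaled_write(journal, target, data):
    """Record the intent first, then do the write, then clear the
    journal entry. A crash between any two steps leaves enough
    information for replay() to finish the job."""
    with open(journal, "w") as j:
        json.dump({"target": target, "data": data}, j)
        j.flush()
        os.fsync(j.fileno())    # the entry must hit disk first
    with open(target, "w") as f:
        f.write(data)
    os.remove(journal)          # commit: the entry is erased

def replay(journal):
    """After a crash, redo any operation still in the journal."""
    if not os.path.exists(journal):
        return None             # clean shutdown, nothing to do
    with open(journal) as j:
        entry = json.load(j)
    with open(entry["target"], "w") as f:
        f.write(entry["data"])
    os.remove(journal)
    return entry["target"]
```

Real filesystems journal block-level changes in a ring buffer rather than one file per operation, but the recovery logic — scan the journal, redo or discard incomplete transactions — is the same shape.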
Re:Next premise, please (Score:3, Informative)
hfs+ [apple.com] supports a journal (starting with macos 10.2.2 server and 10.3 panther), and ntfs5 [microsoft.com] supports a journal (starting with win2k)
Re:Next premise, please (Score:5, Informative)
What are you talking about? NTFS has had journalling for over a decade. And Unicode. And ACLs. And streams. And reparse points (these are amazingly cool). And compression. And encryption. And
Now, MS doesn't use most of this good stuff, but it's all in there. Even three-letter file extensions on Windows are obsolete, since everything on NTFS can be an OLE server. There's nothing on Linux that comes close to the capabilities of NTFS. About the only major thing NTFS is missing is versioning, which VMS has.
Re:Encrypted filesystems? (Score:3, Informative)
Linux encrypted filesystems not really up to snuff (Score:3, Informative)
First, it's minimally supported by distros. I can't just set up a Fedora system out of the box, check "use encryption," and have it do an NTFS-style decryption of the file encryption key, using the password entered at login for each user, to decrypt that user's files. It requires hacking around PAM and maybe initscripts.
Second, if that *was* done, it would take a different filesystem per user (per key), which is a pain to maintain.
Third, it can't be enabled by
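The NTFS/EFS-style scheme the parent describes — derive a key from the login password, and use it only to unwrap a random per-user file encryption key — can be sketched with the standard library. XOR stands in for real AES key wrapping here, so this is a sketch of the key hierarchy, not usable encryption:

```python
import hashlib

def wrap_key(file_key, password, salt):
    """'Encrypt' a random 32-byte file key under a password-derived
    key-encryption key. XOR is a stand-in for real key wrapping;
    PBKDF2 is doing the honest work of stretching the password."""
    kek = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return bytes(a ^ b for a, b in zip(file_key, kek))

def unwrap_key(wrapped, password, salt):
    """XOR is its own inverse, so unwrapping is the same operation.
    A wrong password yields garbage, not an error -- real systems
    add an integrity check to detect that."""
    return wrap_key(wrapped, password, salt)
```

The point of the indirection is that changing the login password only re-wraps the 32-byte file key; the bulk data never has to be re-encrypted.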
Re:Encrypted filesystems? (Score:3, Insightful)
Everyone seems to like this method. But do you also encrypt your swap partition? If not, then whenever the system swaps, unencrypted data gets stored somewhere on the swap partition.
Here's something that might terrify you: run grep on your swap partition and give it a few characters from your password. You needn't list the entire password. Scary, eh? (This won't work for everyone, but it might for you.)
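The grep trick above amounts to scanning the raw swap device for a byte pattern. A chunked scanner sketch (you would point it at your swap device as root; the overlap handling keeps a match that straddles a chunk boundary from being missed):

```python
def scan_for_fragment(path, fragment, chunk_size=1 << 20):
    """Return the byte offset of the first occurrence of `fragment`
    in a raw file or block device, or -1 if absent. Reads in chunks,
    carrying a small tail forward so matches across chunk boundaries
    are still found."""
    frag = fragment.encode()
    overlap = len(frag) - 1
    offset = 0          # bytes consumed from the file so far
    tail = b""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                return -1
            buf = tail + chunk
            idx = buf.find(frag)
            if idx != -1:
                return offset - len(tail) + idx
            tail = buf[-overlap:] if overlap else b""
            offset += len(chunk)
```

Finding a password fragment this way is exactly why the reply above recommends encrypting swap too.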
Remember, if you're using lo
Re:Why not use... (Score:3, Interesting)
Indestructable is the killer app (Score:3, Interesting)
Re:Indestructable is the killer app (Score:4, Informative)
Offsite backups are your friend. No matter what your filesystem software, or the coolness of your RAID array, or your battery-backed redo logs: if a fire or a burglar takes the disks holding your filesystem, you're hosed.
Personally, instead of RAID, I do a nightly rsync to a "yesterday" drive on a separate server (hence protecting myself from stupid-user failures as well as filesystem/disk failures); an "every time I did something significant" rsync to an encrypted removable drive kept in my car; and a "once in a blue moon" copy to DVDs in a safe.
An added benefit - upgrading an OS, or a computer is trivial, because the live backups are just that - live, and tested every night.
(Back to the filesystem topic: Reiser's whole naming idea is so much cooler than a hierarchy or a relational system that I really hope this is the next big advance for Linux.)
Re:why not improved ramdisk? (Score:5, Informative)
The solution would be to load things "on demand," as you've suggested.
Linux already does this, and it does more.
If you've ever looked at the output of free(1) after your system has been running for an hour or so, it will appear as if almost all your memory is in use. See those last two columns, "buffers" and "cached"? That's your "on-demand ramdisk" at work.
Linux will use memory that applications aren't using to cache filesystem data (including executables and metadata) to speed future accesses. If your applications need more memory than is currently free, the kernel will drop cached data rather than swap out application memory to disk. That way, you get the benefits of having your executables on a ramdisk, with the flexibility of not having to sacrifice running application performance in the process.
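The same numbers free(1) reports come straight from /proc/meminfo; a small parser shows where "buffers" and "cached" live (the field names are as Linux exposes them, the sample usage is commented out because it only works on a Linux box):

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style 'Name:  12345 kB' lines into a
    dict of integer kilobyte values."""
    info = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        fields = rest.split()
        if fields:
            info[name.strip()] = int(fields[0])
    return info

# On a Linux machine:
#   with open("/proc/meminfo") as f:
#       mem = parse_meminfo(f.read())
#   print(mem["Buffers"], mem["Cached"])   # the page cache at work
```

Watching `Cached` grow as you read large files, then shrink when applications demand memory, is the "on-demand ramdisk" behavior described above made visible.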
Re:MOD'd Troll???? (Score:3, Funny)
Re:Another solution in search of problem (Score:3, Insightful)
Nobody has come up with a compelling reason or feature to make me want to change filesystems.
I disagree. Why can I go to Google and search the entire web for something and get an answer in less than a second, yet I can't do that on my own computer or LAN?
Re:Another solution in search of problem (Score:3, Interesting)
A fine question. Which makes me wonder: Is google's next killer app a new filesystem? As the search kings, and rumored Linux users, might they be about to enter the hard drive search / filesystem market? Various pundits speculate that gmail is their first foray into searching beyond the web; surely, at some point, their technology will reach our ha
Re:Another solution in search of problem (Score:5, Insightful)
More than three times a week, and that's criminal.
I mean, throwing things about in your home or My Documents directory is fairly standard. How often do you put your (picture) files in a \qw3r3et354t\bchnjc8g45\3j4n45g9u98d directory?
While everyone seems to see WinFS (and associated services) as some sort of search panacea, your ability to retrieve those files is linked to 1) the files' metadata and 2) your ability to recall a search term that appears in that metadata. If you search for "bird" and the metadata says "hawk," then short of a dictionary search you still cannot find it. It doesn't matter if the uber search capabilities can span the entire hard drive in 5 seconds and run through multi-dimensional data. You still need a search term, and that search term (in whole or in part) must appear somewhere in the file, be it the filename or the metadata.
Essentially, WinFS makes data appear more ordered (assuming you take the time to fill out the fields). Otherwise, it's useless.
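The "bird vs. hawk" point is easy to demonstrate: exact-term metadata search has no notion of synonyms, so a term finds only files literally tagged with it. A tiny illustration (the filenames and tags are invented):

```python
def search_by_tags(files, term):
    """Return files whose metadata tags contain the search term
    verbatim -- no synonym expansion, no ontology, exactly the
    limitation being described."""
    term = term.lower()
    return [path for path, tags in files.items()
            if any(term in tag.lower() for tag in tags)]
```

A photo tagged only "hawk" is invisible to a search for "bird," no matter how fast the index is — which is the whole argument: speed does not substitute for recall of the right term.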
Re:Another solution in search of problem (Score:3, Insightful)
Filename
Re:Another solution in search of problem (Score:3, Insightful)
Re:Another solution in search of problem (Score:4, Insightful)
A better label to use would be "complex". To respond to your argument that the only obstacles to a db-fs are ignorance and blind conservatism: complex software is undesirable. It increases the man-hours needed to maintain it, it increases QA overhead, and it increases support calls from users who came to depend on a feature which was included for completeness but was never audited for correctness or robustness. People don't write complex software unless they are paid to do it (and usually when a manager is making the technical decisions).

This is the reason most open source/free software tools seem to follow the Unix philosophy: simple tools which do one task and do it well, yet are flexible enough to build into more complex systems. A monolithic database filesystem does not appeal to the sort of psyche which produces open source code, for that reason: complexity doesn't make a programmer's job fun. To produce large amounts of code at low cost, as the open source/free software world does, the people doing the engineering need to be having fun, and a complex database filesystem is a rather good example of something which is _not_ fun to produce, and therefore unappealing to the hacker sort.
Re:Apple isn't "changing filesystems"... (Score:3, Interesting)
Copying from three actively used locations, merging, and putting them on a slow external drive. Then I back up from there. I don't want to stop using my drives while I burn 30 DVDs.
so I guess renaming a bunch of files is faster than moving them to another partition
Noooooo, if you read my post, I said that copying, moving, and renaming, with a large amount of ID3 parsing, on HFS+ was faster than JUST copying on ReiserFS.
My dual Athlon never once locked up since I rem
Re:is this sarcasm (Score:3, Informative)