Linux Gains Lossless File System
Anonymous Coward writes "An R&D affiliate of the world's largest telephone company has achieved a stable release of a new Linux file system said to improve reliability over conventional Linux file systems, and offer performance advantages over Solaris's UFS file system. NILFS 1.0 (new implementation of a log-structured file system) is available now from NTT Labs (Nippon Telegraph and Telephone's Cyber Space Laboratories)."
NTFS (Score:2, Interesting)
Isn't this similar to NTFS's journaling file system?
Re:Bloat? (Score:3, Interesting)
Of course, you can delete files and re-use the space. But performance drops off sharply once you start filling in the "holes" left in the log after wrapping around to the end of the allocated area. (It's a similar situation to a database, where you eventually want to compact, vacuum, or condense a table.)
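To make the hole-filling cost concrete, here's a toy model I threw together (my own sketch in Python, nothing to do with NILFS internals): once the write head wraps around, any segment that still holds live data has to be copied out of the way before it can be reused, and those extra copies are exactly the slowdown.

    import random

    SEGMENTS = 8
    log = [None] * SEGMENTS      # None = free or hole, "live" = data still in use
    head = 0                     # next segment the writer will claim
    cleaner_copies = 0           # extra I/O spent relocating live data

    def append():
        global head, cleaner_copies
        if log[head] == "live":  # segment still in use: the cleaner must move it
            cleaner_copies += 1  # (pretend the live data got appended elsewhere)
        log[head] = "live"
        head = (head + 1) % SEGMENTS

    for _ in range(50):
        append()
        victim = random.randrange(SEGMENTS)
        if random.random() < 0.5 and log[victim] == "live":
            log[victim] = None   # a delete leaves a hole behind

    print("extra copies forced on the cleaner:", cleaner_copies)

An empty or dead segment costs nothing to reuse; a live one costs a copy, which is why throughput falls off once the log is mostly full.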
Bundling (Score:3, Interesting)
Data is the new currency my friend (Score:3, Interesting)
Walmart's most prized possession is their billion-billion-billion transaction customer sales database. They use it to find, among other things, that men tend to buy beer and diapers at the same time.
With disks costing $1.00/GB or less these days, many people, including myself, simply DON'T delete data anymore. I keep all my original digital photos (in
So yes, for many people, disk space is just something you keep adding to: you move from a coupe to a sedan when you have kids, to a minivan when the 6th kid arrives, and to a cargo van when #8 comes along.
Re:Shutdown versus power off (Score:5, Interesting)
That's a very bad idea. Normally, journaling file systems only guarantee that the file/directory structure remains intact; they do not necessarily guarantee that the data in the files has hit the disk. Also, your disk probably has a write cache, and whatever is in it is lost the moment you cut the power.
So your file system may be intact, but your practices will probably destroy data.
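For what it's worth, the usual way an application protects itself is to ask for the flush explicitly. A minimal sketch (mine, Python on Linux):

    import os

    # Even on a journaling filesystem, file *data* only reliably reaches the
    # disk if the application asks for it.
    fd = os.open("important.tmp", os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    os.write(fd, b"records that must survive a power cut\n")
    os.fsync(fd)                 # push the data (and metadata) down to the drive
    os.close(fd)

    # Making a rename durable means syncing the *directory* as well.
    os.rename("important.tmp", "important.dat")
    dirfd = os.open(".", os.O_RDONLY)
    os.fsync(dirfd)              # commit the new directory entry
    os.close(dirfd)

And even after all that, a drive with write caching enabled can still hold the blocks for a moment, which is the parent's point about yanking the power.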
HDFS (home-dir FS)? (Score:5, Interesting)
With FUSE [sourceforge.net] it might even be possible for mere mortals like me.
Basically, I very rarely push around more than 100-200KB at a time of "my stuff" unless it's big OGGs or tgz's, etc. Mostly source files, documents, resumes, etc. In that case, I want to be able to go back to any saved revision *at the file-system level*, kind of like an "always-on cvs / svn / (git?)" for certain directories. Then when I accidentally nuke files or make mistakes or whatever, I can drag a slider in a GUI and "roll back" my filesystem to a certain point in time and bring that saved state into the present.
Performance is not an issue (at first), as I'm OK if my files take 3 seconds to save in vim or OpenOffice instead of 0.5 seconds. Space is not an issue because I don't generally revise Large(tm) files (and it would be pretty straightforward to have a MaxLimit size for any particular file). Maintenance would also be pretty straightforward: crontab "@daily dump revisions > 1 month". Include some special logic for "if a file is changing a lot, only snapshot versions every 5-10 minutes" and you could even handle some of the larger stuff like images without too much work.
Having done quite a bit of reading of KernelTraffic [kernel-traffic.org] (Hi Zack) and recently about GIT [wikipedia.org], maybe it's time to dust off some python and C and see what happens...
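A first stab at the pruning half could even be a plain cron job. Rough sketch (made-up layout: each tracked file keeps old versions as name.rev.<unix-timestamp> inside a .revs/ directory next to it):

    import os, time

    KEEP_SECONDS = 30 * 24 * 3600                # "dump revisions > 1 month"
    now = time.time()

    def prune(revdir):
        for name in os.listdir(revdir):
            try:
                stamp = float(name.rsplit(".", 1)[-1])   # trailing timestamp
            except ValueError:
                continue                                  # not a revision file
            if now - stamp > KEEP_SECONDS:
                os.unlink(os.path.join(revdir, name))

    for root, dirs, files in os.walk(os.path.expanduser("~")):
        if ".revs" in dirs:
            prune(os.path.join(root, ".revs"))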
--Robert
Re:Shutdown versus power off (Score:3, Interesting)
Here's a little (simplified) tutorial on what happens when a program writes a file to disk:
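Roughly, and this is just my own sketch of the usual layers (details vary by OS and filesystem):

    import os

    f = open("notes.txt", "w")    # 1. the runtime keeps a user-space buffer;
    f.write("hello\n")            #    the bytes sit there, not in the kernel yet
    f.flush()                     # 2. flush() hands them to the kernel page cache
    os.fsync(f.fileno())          # 3. fsync() asks the kernel to send them to the drive
    f.close()                     # 4. the drive's own write cache may still hold
                                  #    them briefly before they reach the platters

Until step 3, a crash or power cut can lose the data even though the write call already "succeeded".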
Nothing has been gained just yet... (Score:2, Interesting)
The system might hang under heavy load.
VMS isn't entirely closed source... (Score:3, Interesting)
As I recall, RMS is an indexed file management system. I wrote a molluscan taxonomy database system that used it in the 80s... but I usually encapsulate all OS-specific stuff in subroutines, so somebody has probably ported it to a cheaper database by now.
It sounds like you want NTFS (Score:1, Interesting)
Since it was designed for POSIX compliance, it has file change times, case sensitivity, and hard links. About the only issue is that you may have to write your own plug-in for proper soft link support.
As bonus features, you also get a change journal (persistent log of all directory changes -- useful for indexing applications), programmatic sparse files (you use ioctls to tell it where the holes are), Unicode filenames, multiple data streams per file, and a plug-in architecture (reparse points).
Additionally, there are already plug-ins (reparse point filters) for things like copy-on-write files, hierarchical storage (an old file may be archived to CD, but still has a directory entry, and when opened is brought back onto disk), and persistent mount points.
And in upcoming versions there will be more goodies, such as directory quotas, filename restrictions (so you could prevent somebody from creating a filename that ends in '.exe' in your web root), and fully distributed ACID transactions.
Now some may argue that some of these things, such as compression and encryption, don't belong in the filesystem. However, I can't think of any cleaner way to do it. For one thing, if you want both compression and encryption, the compression has to happen before the encryption, because encrypted data looks random and won't compress afterwards. Do you stack compression on top of encryption on top of the filesystem, or do you end up with two different compression implementations?
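You can see why the order matters with a throwaway sketch (os.urandom just stands in for ciphertext, since good encryption output is indistinguishable from random bytes):

    import os, zlib

    plaintext = b"the quick brown fox jumps over the lazy dog\n" * 2000
    ciphertext = os.urandom(len(plaintext))      # stand-in for encrypted output

    print(len(plaintext), "->", len(zlib.compress(plaintext)))    # shrinks dramatically
    print(len(ciphertext), "->", len(zlib.compress(ciphertext)))  # doesn't shrink at all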
Your compression layer would need to intercept not only reads and writes but also calls that get or set the file length; otherwise an unsuspecting program would see the compressed size. And now your compression layer needs some metadata about which files are compressed and what their expanded sizes are. Where do you store that? Do you put it all in one file somewhere, where an unwitting sysadmin might be tempted to delete it to save space, or where a single corrupt sector could screw up all the data and cost you every one of your compressed/encrypted files? Remember, most filesystems log metadata changes, not data changes. Since compression is sitting on top of the filesystem, its metadata looks like regular data to the FS, so it is (along with all of your files) vulnerable in the event of a crash.
What happens when somebody decides to optimize the compression layer by storing some of the compressed data (perhaps a Huffman coding table) in the metadata file? Since the metadata file probably isn't encrypted, an attacker now has some information to help him decrypt your files!
dom
Re:getting rid of unwanted data (Score:3, Interesting)
Assuming that it does actually do the trick, it might be even better than wiping a single file. Since the whole drive would be filled with random data, there wouldn't be any conspicuous wiped blocks sitting in the middle of an otherwise normal looking filesystem.
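If you wanted that effect by hand on an ordinary filesystem, the brute-force version is to fill the free space with random blocks and then delete the fill file (my own sketch, not necessarily what the parent meant):

    import errno, os

    BLOCK = 1024 * 1024
    fd = os.open("fill.tmp", os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        while True:
            os.write(fd, os.urandom(BLOCK))      # keep writing noise...
    except OSError as e:
        if e.errno != errno.ENOSPC:              # ...until the filesystem is full
            raise
    finally:
        os.fsync(fd)                             # make sure the noise hit the disk
        os.close(fd)
        os.unlink("fill.tmp")                    # old free blocks now hold random data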
Version control FS (Score:2, Interesting)
Re:Horrible headline (Score:3, Interesting)