KDE GUI Software Linux

Database File System 296

Posted by Hemos
from the no-matching-records dept.
ozy writes "With all the fuss about searching and Spotlight and WinFS, check out the Database File System, a completely different interface for your files, implemented in KDE. There is actually a request for developers to join a project to implement this under GNOME and leave how we use the desktop today behind."
This discussion has been archived. No new comments can be posted.

  • by manavendra (688020) on Monday September 06, 2004 @11:19AM (#10168870) Homepage Journal
    ...I can kind of see this would make it easy to search and locate documents. What about backups though? How would a user be able to group (manually) related files together, so that the whole bunch can be backed up later, without having to search for all seemingly related (or unrelated) keywords to trace all hitherto-unrelated documents?

    Secondly, with this mass of files being spread over several disks, surely this is in a way forcing the user to "search" for everything. Or isn't it? Will the underlying FS layer still be accessible in the general way that it is?
  • Performance? (Score:2, Interesting)

    by saden1 (581102) on Monday September 06, 2004 @11:19AM (#10168871)
    Isn't this thing with DBs getting a little excessive? You're adding another layer and step to storing data, which will in all likelihood hinder performance. I'm not sure the benefits outweigh the cost.
  • by zero-one (79216) on Monday September 06, 2004 @11:21AM (#10168883) Homepage
    I have always thought that version control (file histories, branching and atomic changes) would be nice to have at the file system level. Instead of storing myessay-firstdraft.doc, myessay-seconddraft.doc, myessay-final.doc, the file system should do the work. Then if I want to make a bunch of changes (perhaps I want to try a new page layout), I should be able to commit them as one atomic change (or throw them all away if I change my mind). Then, when I want to make a set of documents with US spelling, I should be able to branch the whole lot (using no disk space) and make the small changes from UK spelling while still being able to integrate other changes I have made.
  • by mn3m05yn3 (772460) on Monday September 06, 2004 @11:23AM (#10168905)
    Doesn't seem to be true. The article indicates that DBFS will run on top of other filesystems. If you don't use KDE, you don't use DBFS... ext2/3 etc. are still there. The real question is whether you can use KDE without using DBFS...
  • by ACK!! (10229) on Monday September 06, 2004 @11:25AM (#10168921) Journal
    Perhaps I am more organized than most, but I already categorize my files and such in the hierarchical file structure.

    Isn't ReiserFS planning to do this at the kernel level?

    Where does that leave other fs choices and storage, and other ideas like dbfs?

    I see more and more people saying look what neat things you can do with these tools.

    But really why do you personally want something like this?

    Curious to see the response is all.

  • by carnivore302 (708545) on Monday September 06, 2004 @11:26AM (#10168925) Journal
    Will the underlying FS layer still be accessible in the general way that it is?
    Sure, because the dbfs is implemented in userland. All applications will work as before, including ls.

    I wonder if this could be made into a plugin for reiserfs 4.

  • by ultrabot (200914) on Monday September 06, 2004 @11:27AM (#10168939)
    I have always thought that version control (file histories, branching and atomic changes) would be nice to have at the file system level.

    Sounds like a job for an SVN plugin for Reiser4 file system. Anyone doing one already?
  • software? (Score:1, Interesting)

    by Alosja (787167) on Monday September 06, 2004 @11:28AM (#10168944) Homepage Journal
    This might be slightly off-topic, but is there any open source software for Windows that could do the same thing? I produce a lot of notes, and I want to be able to categorize my files.
  • by data1 (23016) on Monday September 06, 2004 @11:28AM (#10168945) Homepage
    The author is asking for help to move the project to Gnome.

    Quote: There is of course the hard choice of platform. I choose KDE because I am familiar with QT a bit, and because it is inherently object-oriented, being C++ and all. But in my mind GNOME is much closer to how I would like a desktop system to function. So I would like to go for the GNOME option. I leave KDE developers to do what they want with this, and I am offering them support. But I would like to focus my efforts on GNOME and implementing the above in GNOME.

    Any volunteers?
    In the first place we will need developers. Would you like to join, send me an email with DBFS and JOIN somewhere in the subject. If you are not a developer, but still would like to help, please revisit this page in a few weeks. There will probably be a community website by then somewhere.
  • by greppling (601175) on Monday September 06, 2004 @11:37AM (#10169004)
    ...when he said on LKML, slightly paraphrased: "The only reason I see to put filesystem semantic enhancements into the kernel is that it would be socially hard to get people to agree on a single userspace library."

    (In the course of the heated discussion about Reiser4.)

  • by DarkOx (621550) on Monday September 06, 2004 @11:41AM (#10169033) Journal
    Stop letting good ideas be victimized by the M$ marketing FUD machine. Someone said "database filesystem." That was a good idea. M$ came along and said, "Gee, let's steal that idea. Hey, there is no existing implementation to copy; how do we do it?" The answer is they did not do it. They put a database to keep rather mundane information on top of NTFS (a real filesystem) and called it WinFS (NTFS and an almost completely unrelated database, not a file system). Keeping data on the relationships between files is a nice idea. Putting it in user space is dumb. It's overhead. Look at most of the examples he gave: "find a Word document I worked on last month." All that info is already in the filesystem. A filesystem really is already a database in the strictest sense. It stores what's on which inode, associated with how many blocks, which you could think of as attributes. It also stores attributes like permissions and dates. Why not just put some more attributes into it, like subject and relatedtopicID? If you did that, then added the ability to maintain some other tables where you could put extended descriptions and stuff, and built up the query engine to efficiently answer the queries users are likely to ask, then you'd have what you're really looking for. Additionally, you would lose the overhead to a degree, because you'd be storing information once instead of in both the FS and the database.
  • by lcsjk (143581) on Monday September 06, 2004 @11:43AM (#10169053)
    I already use blinkx (beta) to automatically search my files along with internet keywords. It doesn't have search by date or extension and is not configurable to my liking, but it seems to do a good job of finding things I have misplaced. Integrated with the author's system, this could make a great search system.

    Normally I file things hierarchically by year and month and by project name (2004file/9sep/) or (2004file/workfile/projectname), but I still lose track now and then and need keywords. Change the "slash" slant to fit your OS.

  • by mooobo (804536) on Monday September 06, 2004 @11:43AM (#10169057)
    Have a look at this for a userspace filesystem

    There are Debian packages called fuse-source and fuse-utils. I.e. use all these nice kio plugins at the filesystem level.

    Just add

    deb unstable main
    deb-src unstable main

    to your /etc/apt/sources.list

    apt-get update
    apt-get install fuse-source fuse-utils

    Actually, I didn't test that yet. Someone else?
  • by BenjyD (316700) on Monday September 06, 2004 @11:45AM (#10169065)
    The problems with the hierarchical system are:

    - maintenance overhead on the part of the user to create the hierarchy and maintain it. Every time you save a file you have to think "where do I put this?"
    - Finding files can be hard. Is that letter about the planning application in Documents/Letters or Documents/Planning App?
    - keeping files in two or more places at once is hard (as in the previous example). You can use softlinks, but that's hardly ideal and doesn't survive moving things around.

    Basically, the current file system imposes a significant overhead. Most power users have restructured the way they work and use a computer in order to minimise that overhead without really noticing. It's just become one of those things you have to do, like remembering to save documents.

    Why not shift the burden of organising the files onto the enormously powerful computer, rather than take up valuable human mental resources.
  • by reporter (666905) on Monday September 06, 2004 @11:49AM (#10169095) Homepage
    These days, operating systems like both Linux and Windows XP have too many bells and whistles that I simply do not need. Unfortunately, these bells and whistles drastically increase the amount of space that I need on my hard drive.

    Adding a database layer to Gnome sounds like using another 300 megabytes of storage on my hard drive. I simply do not need the database.

    If the FSF/GNU folks really want to do something revolutionary, they should fork Linux+Gnome into 2 distinct paths: minimalist and maximalist. The maximalist is what we have now. The minimalist is a minimally featured Linux+Gnome distribution. It is the bare minimum in functionality that we need to have a decent operating system and desktop.

    Into this minimalist installation, I will then add the applications (e.g. MatLab) that I use daily.

  • by lkcl (517947) on Monday September 06, 2004 @11:59AM (#10169153) Homepage
    standard filesystems only have ONE index - a hierarchical one that contains a certain amount of real-time-updated indexing (such as the timestamps on a directory)

    but it is NOT a relational database: you CANNOT easily create or use an alternative index to your files.

    that's what all the fuss is about.

    some people mentioned here that they already organise their files. great. fantastic.


    and how long would it take to reorganise?

    with a relational database, all your indexes are updated AUTOMATICALLY.

    therefore, doing searches on a relational database filesystem (find me all music files with dates between last week and last month: SELECT * from files WHERE files.type = "music" and NOW() - 7days

    you _can't_ do that sort of thing on a traditional filesystem.

    sure, you can emulate it by creating symbolic links all over the place, but what happens when a file is deleted or moved? you need to manage / relocate the symbolic links...



    fantastic idea.

    now can we have them as a kernel module, pleeease?
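
    A minimal sketch of the kind of query described in this comment, using Python's built-in sqlite3 module. The table name `files`, its columns (`path`, `type`, `modified`), and the sample rows are assumptions made for illustration; they are not DBFS's actual schema. The date window is taken from the prose (between last week and last month), since the SQL in the comment is cut off.

```python
import sqlite3
from datetime import datetime, timedelta


def music_between(conn, now, newest_days=7, oldest_days=30):
    """Return paths of music files modified between oldest_days and newest_days ago."""
    newest = (now - timedelta(days=newest_days)).isoformat()
    oldest = (now - timedelta(days=oldest_days)).isoformat()
    rows = conn.execute(
        "SELECT path FROM files "
        "WHERE type = 'music' AND modified BETWEEN ? AND ? "
        "ORDER BY path",
        (oldest, newest),
    )
    return [path for (path,) in rows]


def build_demo(now):
    """Create an in-memory catalogue with a few hypothetical rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE files (path TEXT PRIMARY KEY, type TEXT, modified TEXT)")
    # A second index besides the path: this is what a plain hierarchy lacks.
    conn.execute("CREATE INDEX files_by_type_date ON files (type, modified)")
    sample = [
        ("music/new.ogg", "music", now - timedelta(days=2)),
        ("music/recent.ogg", "music", now - timedelta(days=10)),
        ("music/old.ogg", "music", now - timedelta(days=90)),
        ("docs/letter.txt", "document", now - timedelta(days=10)),
    ]
    conn.executemany(
        "INSERT INTO files VALUES (?, ?, ?)",
        [(p, t, m.isoformat()) for p, t, m in sample],
    )
    return conn


if __name__ == "__main__":
    now = datetime(2004, 9, 6, 12, 0, 0)
    print(music_between(build_demo(now), now))
```

    Timestamps are stored as ISO 8601 strings so that plain string comparison orders them correctly; a real catalogue would also have to be kept in step with the files it describes.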
  • by eschasi (252157) on Monday September 06, 2004 @12:00PM (#10169160)
    To quote from the article (which most folks have not read, as usual):
    The DBFS does not actually store files, it holds references to files on the underlying hierarchy based file system.
    That line alone should answer many of the questions re backup, speed of FS performance, etc.

    At a deeper technical level, many of the questions asked here have historical answers or clues in The Design and Implementation of the Inversion File System. The abstract reads:

    This paper describes the design, implementation, and performance of the Inversion file system. Inversion provides a rich set of services to file system users, and manages a large tertiary data store. Inversion is built on top of the POSTGRES database system, and takes advantage of low-level DBMS services to provide transaction protection, fine-grained time travel, and fast crash recovery for user files and file system metadata. Inversion gets between 30% and 80% of the throughput of ULTRIX NFS backed by a non-volatile RAM cache. In addition, Inversion allows users to provide code for execution directly in the file system manager, yielding performance as much as seven times better than that of ULTRIX NFS.
    Note that this paper was published in early 1993. Many of the issues it addresses are relevant to DBFS, and many of DBFS's advantages are foreseen by that paper. IMHO DBFS has chosen a direction that should have better performance than Inversion, not to mention lower risk and easier failure recovery.

    Inversion was built on POSTGRES, which makes one wonder what happened to the source.

  • by Anonymous Coward on Monday September 06, 2004 @12:04PM (#10169193)
    There is some non-free software which does this kind of thing, such as Tower Trim Context. We've been implementing this at my office. Users are often less than enthusiastic, because it changes the familiar Windows world and breaks things like links in Office documents which depend on HFS storage...
  • by mgkimsal2 (200677) on Monday September 06, 2004 @12:08PM (#10169226) Homepage
    therefore, doing searches on a relational database filesystem (find me all music files with dates between last week and last month: SELECT * from files WHERE files.type = "music" and NOW() - 7days

    you _can't_ do that sort of thing on a traditional filesystem.

    Argh! Don't say things like that - someone will throw down a shell script which WILL do it (probably in one line) combining find, file, grep, perl/python and some other crap. Which, while it may work for some, entirely misses the real point, which is that there's no *easy* way to do this. We've got GUI bindings for opening/saving/editing files, but nothing for harnessing the power of searching. Don't say it can't be done (because it can, somehow); just say that it can't be done easily, which is the bigger point. :)
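
    For comparison, the brute-force approach mentioned here can be sketched in Python rather than shell. This assumes that "music" can be recognised by file extension (the `MUSIC_EXTENSIONS` set is a made-up example) and that modification time is the date of interest. It walks the whole tree on every query, which is the cost an index is meant to avoid.

```python
import os
import time

MUSIC_EXTENSIONS = {".mp3", ".ogg", ".flac"}  # assumption: type is inferred from extension
DAY = 24 * 60 * 60  # seconds


def find_music(root, now=None, newest_days=7, oldest_days=30):
    """Walk root and return music files modified between oldest_days and newest_days ago."""
    now = time.time() if now is None else now
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if os.path.splitext(name)[1].lower() not in MUSIC_EXTENSIONS:
                continue
            path = os.path.join(dirpath, name)
            age = now - os.path.getmtime(path)  # seconds since last modification
            if newest_days * DAY <= age <= oldest_days * DAY:
                matches.append(path)
    return sorted(matches)
```

    It works, but every call touches every file under `root`, and none of it is reachable from a file dialog, which is the point being made above.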
  • by The Mighty Git (27891) on Monday September 06, 2004 @12:09PM (#10169237)
    A db fs with rich searching of metadata requires the orderly and accurate entry of said metadata.
    You can't get organisation out of nothing - you are just asking people to be organised in a slightly different manner.

    An organised person can already work effectively in a filesystem with current tools. The fact that they are organised is the key.

    A disorganised person is not helped as their metadata will be erratic or absent. In fact might they now have the capability to be even more disorganised?

    As I see it this is not solving an organisational problem as much as shifting the interface to the problem. I do not believe it to be either an easier or better way to organise personal data. Conversely I do not believe it to be inherently worse either. It's just different.

    Where's the gain?
  • by Anonymous Coward on Monday September 06, 2004 @12:11PM (#10169248)
    It feels like sharing files in the manner that SharePoint does is an afterthought. First, they built a web portal, then they decided to add that feature in, along with a few ActiveX controls. More important to SharePoint was the web portal component. Where I work, everyone has a SharePoint site for a default webpage. This allows for simple announcements and memos without choking down the email system. It's also fairly unobtrusive, which allows users to read said memos if/when they like, rather than forcing it into their inbox where it would inevitably get deleted.
  • by whovian (107062) on Monday September 06, 2004 @01:09PM (#10169621)
    This basically seems like giving find the capability to do file and then storing the results with locate. And add some time stamping, sorting capabilities and GUI.

    Since all but the GUI are basic commands, it would seem sensible to have an underlying library with hooks for use by your choice of desktop manager.
  • by solios (53048) on Monday September 06, 2004 @01:12PM (#10169636) Homepage
    Using a linux desktop is like using Win3.1 or 95 hyped up on really lame raver drugs. I've always found it sad and extremely frustrating that a venue with MASSIVE potential for innovation has instead been spending its time reimplementing Windows Explorer. :|

    I'm down with anything that makes the linux desktop experience a real linux desktop experience- not a pissass wannabe win95 experience or a solaris experience. There's so much cool stuff going on under the hood... but the thin candy shell feels like GPX or Coby* slapped onto BMW internals. I know it's possible to do better.... but after years of seeing Windows and MacOS features reimplemented (not nearly as well in most cases) in linux, I'm starting to lose faith. :-(

    *GPX and Coby make shit electronics that are cheap and break very, very easily. Coby specifically, in that their products (and LOGO!) are modeled after Sony... and the best they can do is to come across as a cheap ripoff.
  • by ctr2sprt (574731) on Monday September 06, 2004 @01:18PM (#10169676)
    OK, let's consider an example other than documents. Joe is a big music fan with a couple hundred CDs. He likes having instant access to any one song when he wants it and has a lot of hard drive space, so he's MP3-ized (or ogg-ized or flac-ized or whatever) his entire collection, plus all the MP3s he's acquired via other means.

    Joe is pretty good about organization, so he names every MP3 properly with the group, album, and track names, plus the track number. (Something like The Beatles/White Album/01-Back in the USSR.mp3.) This way if he knows, for example, the track name but not what album it's on, he can find it pretty quickly using a method like yours.

    The problem is if he wants to do something like put all the country music he has, for example, in a playlist. How does he do that? It can be done, certainly, but if he has a collection with several thousand MP3s it's so tedious and difficult as to be effectively impossible. What if he wants to listen to 60s rock? What if he wants to find a particular song he has, but all he knows is it's between 3 and 5 minutes long, came out between 2002 and 2003, and is probably categorized as either "pop" or "alternative?" What if he just wants a list of all the songs he never listens to because he's sick of what he's been playing lately? Or maybe he needs to free up disk space and wants to find out what he'll miss the least.

    All these things are impossible to do in an efficient and timely manner using our current system. He can certainly use a command-line ID3 tagger to strip out the things he cares about, something like

    find mp3 -type f -print0 | xargs -0 id3tag -l | grep 'Genre: Country'
    but that's painfully slow: a second or two per file means a big collection will take 15 minutes or longer to scan, and if you typed "Gerne" by mistake you have to do it all over.

    Now if you had a filesystem-like object which could be smart and store ID3 metadata in the filesystem, then it would be much faster: the main overhead to using the find/xargs/id3tag/grep approach described above comes from having to seek through the MP3 file to get at the metadata. The reason this needs to be a "core OS component," perhaps even part of the kernel, is that MP3 tags can change at whim and the filesystem needs to know about it or its metadata can get out of sync. It's possible, but impractical, to update this on a periodic basis, like the locatedb; it makes much more sense to have the kernel inform some plugin "Hey, this file just changed, do you care?" And if the plugin does care, it can look at the changes, see if it's affected, and possibly update the metadata to match. It could also go the other way, where Joe updates the filesystem metadata and the OS knows to update the MP3's ID3 tags too.
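
    A rough sketch of the periodic, locatedb-style variant described above: tags are cached in SQLite and a file is re-read only when its modification time has changed. The `read_tags` callable is a hypothetical stand-in for a real ID3 reader, and the `tracks` schema is invented for this example. It does not show the kernel-notification variant.

```python
import os
import sqlite3


def open_index(db_path=":memory:"):
    """Open (or create) the metadata cache."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tracks ("
        "path TEXT PRIMARY KEY, mtime REAL, genre TEXT, year INTEGER)"
    )
    return conn


def refresh(conn, paths, read_tags):
    """Re-read tags only for files whose mtime differs from the cached one.

    read_tags(path) is a hypothetical callable returning
    {"genre": ..., "year": ...}. Returns how many files were re-read.
    """
    reread = 0
    for path in paths:
        mtime = os.path.getmtime(path)
        row = conn.execute("SELECT mtime FROM tracks WHERE path = ?", (path,)).fetchone()
        if row is not None and row[0] == mtime:
            continue  # cached entry is still current, skip the slow tag read
        tags = read_tags(path)
        conn.execute(
            "INSERT OR REPLACE INTO tracks VALUES (?, ?, ?, ?)",
            (path, mtime, tags["genre"], tags["year"]),
        )
        reread += 1
    conn.commit()
    return reread


def by_genre(conn, genre):
    """Answer the playlist question from the cache, without opening any MP3."""
    rows = conn.execute("SELECT path FROM tracks WHERE genre = ? ORDER BY path", (genre,))
    return [path for (path,) in rows]
```

    The second and later calls to `refresh` are cheap as long as few files changed, but the cache is only as fresh as the last run, which is the out-of-sync problem the comment raises.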

  • Not Unix (Score:5, Interesting)

    by flossie (135232) on Monday September 06, 2004 @01:20PM (#10169692) Homepage
    the DBFS does not store system files: No shared libraries, no font files or others like that. These are not documents, not files you look up at a day to day basis, and have no place in a file system.

    Whether or not you look at system files every day probably depends on what you are doing with your machine and what you consider "system files" to be. Moreover, this idea would seem to go entirely against the whole UNIX "everything is a file" philosophy which is supposed to be one of the great strengths of UNIX.

  • by KrackHouse (628313) on Monday September 06, 2004 @01:33PM (#10169776) Homepage
    I'm using Subversion for a project and the idea of Atomic Commits seems like an obvious direction for file systems. If that other slashdot story is correct, storage becomes less of an issue and it would be possible to roll back the system to any point in time or to only roll back one file if need be. Now throw an intuitive way to navigate files on top of that and you've got a sure winner.

    In the grand scheme of things, only a very small handful of us on earth are aware of Linux or even know what an Operating System is for that matter. File systems seem to be the big stumbling block for new users. Anything that can make computers and therefore access to information easier for the coming waves of new computer users (maybe billions of people?) will be a good thing. Even if the "bloat" slows down the system by 10%.

    I hate to preach but that old quote comes to mind "With great power comes great responsibility". I don't think most of the people working on the OS that will soon dominate in developing nations (that's Linux) are aware of the harm they can do by slowing down Linux development with petty personal disputes. Like it or not, Linux is no longer just an edgy hacker tool. It has the potential to change the lives of Billions of people.
  • by Anonymous Coward on Monday September 06, 2004 @01:42PM (#10169831)

    Hans Reiser has a lot of very ambitious ideas. But having ambitious ideas is easy; it's the execution that's difficult. Take a look at the recent threads on lkml discussing ReiserFS 4 (in particular, posts from Linus and Al Viro), and you'll see that while there's at least some interest in ReiserFS 4-like features, there's disappointingly little evidence that the ReiserFS people have given sufficient thought to some fundamental implementation problems.

    Also note that while Hans Reiser may have talked about competing with WinFS and including more database-like features in the filesystem, and while ReiserFS may be designed to ease the future implementation of such features, as far as I know none of that is actually done.

  • Re:gnome people... (Score:3, Interesting)

    by Anonymous Coward on Monday September 06, 2004 @01:42PM (#10169832)
    Here is my opposite, hierarchical "document oriented" desktop. It does not have to be based on a database file system, only on intelligent caching and indexing.

    1) No "Start" application menu. You never "start" an application. You always open (read: look at) a document by clicking on it. Do you want to write a new one? You click on a "template" document and it immediately asks you for a new name (an unnamed document never exists, even in memory, because there is no difference between memory and disk). Do you want to read your email? You look inside the mail/inbox directory. Everything has a directory/file structure! The same goes for the web, although you can see (list) only "directories" (sites) where you have been. Otherwise you have to either type the URL into a location dialog or use a bookmark. Bookmarks look just like links in ~/bookmarks/whatever_link_page. You can list the same page in a virtual directory (of course, only if you have already bookmarked or visited it).

    2) No open/save dialog. The "invisible" application flushes updates to the document frequently, so it appears that you are writing directly and instantly to the file. On the other hand, you should be able to see "historic" (you could say undone) versions under some history view.

    3) No application "Close" (x, exit). You just remove (minimise) the document from your view. If you do not look at the document (work with, view url, ...) for some time, intelligent caching system invisibly closes application (of course the document is flushed, probably quite some time ago) and frees resource to the system. There is no difference between memory and disc allocated documents from user point of view.

    4) The taskbar is not a list of running applications but a "History" list of the last opened documents (URLs, ...) with a more or less constant number of items.

    5) The window name also functions as the "location" (URL) input, which saves space.

    6) There is no "all time" visible menu, toolbar, or location bar (see (5)) inside the window; those only take up space. The right mouse button opens a context menu; together with Alt it opens the "document" menu (the full-blown one, more or less what is now visible at the top of windows); with Ctrl it opens a "shortcut" menu (a la toolbar) under the cursor (so you do not need to move too far).

    Oh, and did I say no start/close application?

    Roman Kantor

  • by Cnik70 (571147) <<seven2170> <at> <>> on Monday September 06, 2004 @01:51PM (#10169882) Homepage
    Take a look at an AS/400 (iSeries). They've been around for more than 10 years. And before that you had the System 38 and System 36... and so on..... Why is this some big revelation now?
  • by Tablizer (95088) on Monday September 06, 2004 @03:28PM (#10170489) Journal
    What about backups though? How would a user be able to group (manually) related files together, so that the whole bunch can be backed up later, without having to search for all seemingly related (or unrelated) keywords to trace all hitherto-unrelated documents?

    Defining "group" is one of the problems with hierarchical systems. Which group you want depends on your needs at the moment. Relational makes it easier to create ad-hoc groups.

    For example, do you want to backup by file age, file type, and/or by topic? And in the real world topics naturally overlap. Set theory has an easier time with this because it is meant to deal with overlaps; but hierarchies get messy beyond about 3 orthogonal factors. Relational is closer to set theory than trees are.
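
    To make the overlap point concrete, here is a small sketch using Python sets. The file names, attributes, and helper functions are all invented for the example; the only claim is that union and intersection express an ad-hoc backup group directly, where a single tree would force one factor to be the top-level directory.

```python
from datetime import date

# Hypothetical catalogue: each file carries several independent attributes.
FILES = {
    "report.doc": {"type": "document", "topics": {"planning", "house"}, "modified": date(2004, 9, 1)},
    "letter.doc": {"type": "document", "topics": {"planning"}, "modified": date(2003, 1, 15)},
    "plan.png": {"type": "image", "topics": {"house"}, "modified": date(2004, 8, 20)},
}


def with_type(kind):
    """All files of one type."""
    return {name for name, attrs in FILES.items() if attrs["type"] == kind}


def with_topic(topic):
    """All files tagged with a topic; a file may appear under several topics."""
    return {name for name, attrs in FILES.items() if topic in attrs["topics"]}


def modified_since(day):
    """All files modified on or after the given day."""
    return {name for name, attrs in FILES.items() if attrs["modified"] >= day}


def backup_set():
    """Ad-hoc group: anything about planning or the house, touched this year."""
    return (with_topic("planning") | with_topic("house")) & modified_since(date(2004, 1, 1))
```

    Note that report.doc belongs to both topics without being copied or linked, which is the overlap a tree handles badly.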

    A drawback of sets is that most people are not familiar with non-tree arrangements. There will probably be a "training hump".

    More about sets versus trees.
  • by 0x0d0a (568518) on Monday September 06, 2004 @03:49PM (#10170630) Journal
    I hate GNOME-VFS and KIOSlaves and all that stuff that people try to roll into a higher layer than they should. The idea of combining VFS functionality and a *desktop environment* is *stupid* and exactly the sort of fragmentation that hurts everyone. Want a userspace VFS layer (like, other than the existing transparent systems like LUFS and FUSE, which are much better if you can use them, because existing apps continue to work)? Fine. Make one. Call it libvfs, and make KDE and GNOME bindings. But for the love of God, quit trying to roll filesystem functionality into desktop environments. It's ridiculous, and not where it belongs.

    That being said, I'm still not a huge fan of this.

    There are three main features that people seem to want with a DB-based FS:

    * Transaction/views/high-level-DB functionality.

    You don't want a FS that tries to do all this. DBMSes have worked for a long time to do this efficiently. If you want this, use a real DBMS, because it's going to smoke what someone tried to hack up into a filesystem.

    * Automated index updates. Basically, locate but automatically updated as changes occur.

    Mac OS Classic had synchronous index updates as the filesystem ran. This makes the filesystem slow. It sucks for servers.

    It's a much better idea to do asynchronous index updates, where the index approximates the filesystem quickly, but perhaps not right away -- you can do that without killing performance. Basically, you have the kernel notify a userspace daemon when changes occur, and the daemon rebuilds the affected part of the directory hierarchy (possibly waiting a while to see if it can use idle time). If a ton of changes occur to a small portion of the directory hierarchy, instead of trying to keep up with each change (say, a million changes to part of the directory hierarchy), the update daemon rolls all those queued-up updates into a single queued-up full update of that entire portion of the directory hierarchy. It does a bit of extra indexing, but doesn't get backlogged.
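
    The coalescing step described above could be sketched like this. It only illustrates the queueing idea and is not modelled on any existing daemon: the class name, the `limit` threshold, and the choice to coalesce at the parent directory are all assumptions.

```python
import posixpath
from collections import defaultdict


def _covers(ancestor, directory):
    """True if directory is ancestor itself or lies somewhere beneath it."""
    return directory == ancestor or directory.startswith(ancestor.rstrip("/") + "/")


class PendingUpdates:
    """Queue of changed paths that collapses bursts into one directory rescan."""

    def __init__(self, limit=3):
        self.limit = limit               # max individual changes queued per directory
        self.changed = defaultdict(set)  # directory -> changed file paths
        self.rescan = set()              # directories queued for a full rescan

    def notify(self, path):
        """Record one change for a POSIX-style path such as '/home/joe/a.txt'."""
        directory = posixpath.dirname(path)
        if any(_covers(d, directory) for d in self.rescan):
            return  # already covered by a queued rescan
        self.changed[directory].add(path)
        if len(self.changed[directory]) > self.limit:
            # Too many individual updates: replace them, and anything queued
            # for subdirectories, with a single rescan of this directory.
            for d in [d for d in self.changed if _covers(directory, d)]:
                del self.changed[d]
            self.rescan.add(directory)

    def drain(self):
        """Return the queued work as (kind, path) pairs and empty the queue."""
        work = [("rescan", d) for d in sorted(self.rescan)]
        for directory in sorted(self.changed):
            work.extend(("update", p) for p in sorted(self.changed[directory]))
        self.changed.clear()
        self.rescan.clear()
        return work
```

    With this shape, a million changes under one directory cost the indexer one rescan entry rather than a million queue entries, at the price of re-indexing some files that did not change.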

    This *could* be done under Linux, but there is one bit of kernel support that is missing -- currently, it is unusably inefficient for an app using Linux's existing dnotify() directory monitoring mechanism to deal with (a) changes to a directory containing many entries and (b) changes to directories anywhere on a filesystem (currently, dnotify() requires a FD for each directory to be monitored).

    There *is* an enhanced dnotify patch out that would improve Linux's dnotify() mechanism to the point where people can write daemons with this, but Linus has not yet rolled it in. Once he does so, we can get the ball rolling on such daemons.

    * Indexing of metadata.

    People want to be able to search for their files using metadata. Again, this can just as easily be done using such a daemon. Existing apps continue to work, people can choose where to put something in a unique hierarchy so that they can reach it again, but files can also be addressed via metadata. If you want to provide, say, a tabbed file selector containing a tab for selecting files via metadata, that's quite feasible.

    A major reason why you don't want to provide a full DB-based interface is that you lose the existing hierarchical representation, and every app in the world stops working. You don't want people to just "save this file to a filesystem" and have to address it via metadata -- you want the user to give the thing some kind of unique identifier.
  • Re:Performance? (Score:3, Interesting)

    by EddWo (180780) <eddwo&hotpop,com> on Monday September 06, 2004 @06:05PM (#10171666)
    Why don't people get it? It isn't about entering arbitrary keywords as file properties, it's about forming relationships between items. You schedule a meeting, you create a new meeting item and link in the attendee items, location items etc. You take some notes at the meeting, you link the notes to the meeting item. You link the meeting item to the project you are involved in.
    Now you can locate the notes by filetype, time/date, project, meeting, people, location, without entering any text by hand, just linking one or two critical bits of information per new item.
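
    One way to picture the linking described here is a generic link table. The sketch below uses Python's sqlite3 with an invented two-table schema (`items`, `links`) and invented sample rows; it is not how WinFS or DBFS actually store relationships.

```python
import sqlite3


def build_demo():
    """Create a tiny in-memory store of items and the links between them."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE items (id INTEGER PRIMARY KEY, kind TEXT, name TEXT);
        CREATE TABLE links (src INTEGER REFERENCES items(id),
                            dst INTEGER REFERENCES items(id));
        INSERT INTO items VALUES (1, 'project', 'DBFS port');
        INSERT INTO items VALUES (2, 'meeting', 'Kickoff');
        INSERT INTO items VALUES (3, 'person', 'Alice');
        INSERT INTO items VALUES (4, 'note', 'kickoff-notes.txt');
        INSERT INTO items VALUES (5, 'note', 'shopping-list.txt');
        INSERT INTO links VALUES (2, 1);  -- meeting belongs to project
        INSERT INTO links VALUES (2, 3);  -- meeting has attendee
        INSERT INTO links VALUES (4, 2);  -- notes were taken at meeting
    """)
    return conn


def notes_for_project(conn, project):
    """Follow note -> meeting -> project links; no keywords were ever typed."""
    rows = conn.execute(
        "SELECT note.name FROM items AS note "
        "JOIN links AS n2m ON n2m.src = note.id "
        "JOIN links AS m2p ON m2p.src = n2m.dst "
        "JOIN items AS proj ON proj.id = m2p.dst "
        "WHERE note.kind = 'note' AND proj.kind = 'project' AND proj.name = ? "
        "ORDER BY note.name",
        (project,),
    )
    return [name for (name,) in rows]
```

    The notes file was linked to only one thing, the meeting, yet it is reachable from the project because the meeting already carries that link.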
