KDE GUI Software Linux

Database File System 296

ozy writes "With all the fuss about searching and Spotlight and WinFS, check out the Database File System, a completely different interface for your files, implemented in KDE. There is actually a request for developers to join a project to implement this under GNOME and leave how we use the desktop today behind."
This discussion has been archived. No new comments can be posted.

  • gnome people... (Score:3, Informative)

    by alphan ( 774661 ) on Monday September 06, 2004 @11:18AM (#10168861) Homepage
    ...seem to have something more interesting: Storage
    • Re:gnome people... (Score:4, Informative)

      by Jellybob ( 597204 ) on Monday September 06, 2004 @11:20AM (#10168879) Journal
      Storage has been more or less dead for a while now, as far as I can see; however, Beagle is showing good progress on the same front, having been demoed at conferences recently.
    • Re:gnome people... (Score:3, Interesting)

      by Anonymous Coward
      Here is my opposite, a hierarchical "document-oriented" desktop: it does not have to be based on a database file system, only on intelligent caching and indexing.

      1) No "Start" application menu. You never "start" an application. You always open (read: look at) a document by clicking on it. You want to write a new one? You click on a "template" document and it immediately asks you for a new name (an unnamed document never exists, even in memory, because there is no difference between memory and disk). Do you want to read your
  • by manavendra ( 688020 ) on Monday September 06, 2004 @11:19AM (#10168870) Homepage Journal
    ...I can kind of see this would make it easy to search and locate documents. What about backups though? How would a user be able to group (manually) related files together, so that the whole bunch can be backed up later, without having to search for all seemingly related (or unrelated) keywords to trace all hitherto-unrelated documents?

    Secondly, with this mass of files being spread over several disks, surely this is in a way forcing the user to "search" for everything. Or isn't it? Will the underlying FS layer still be accessible in the general way that it is now?
  • Performance? (Score:2, Interesting)

    by saden1 ( 581102 )
    Isn't this thing with DB's getting a little excessive? You're adding another layer and step to storing data, which will in all likelihood hinder performance. I'm not sure the benefits outweigh the cost.
    • Re:Performance? (Score:5, Insightful)

      by Anonymous Coward on Monday September 06, 2004 @11:31AM (#10168970)
      Depends on what you are using your computer for of course.

      You can say the same thing for a GUI, and it's correct for certain applications of computers, but wrong in others.
    • Re:Performance? (Score:3, Informative)

      by bogie ( 31020 )
      Of course you're right. But knowing that pretty smart people are working on this, I don't think you're going to see them go ahead with an implementation that's only half the speed of current Linux file systems. I'm sure they'll only go ahead with this and integrate it into KDE when the performance is up to snuff. It's simply way too early to say whether the benefits outweigh the cost until the code is complete.
    • Re:Performance? (Score:5, Insightful)

      by psavo ( 162634 ) on Monday September 06, 2004 @11:34AM (#10168982) Homepage

      Isn't this thing with DB's getting a little excessive? You're adding another layer and step to storing data, which will in all likelihood hinder performance. I'm not sure the benefits outweigh the cost.

      Well, if it's only a name-translation thingy, then it shouldn't affect performance of file reading (when operating on sufficiently big files), only file opening/stat:ing.

    • Re:Performance? (Score:4, Informative)

      by kosmosik ( 654958 ) <kos&kosmosik,net> on Monday September 06, 2004 @11:48AM (#10169089) Homepage
      Nobody is suggesting using such a database FS for the entire system. Only for specific data (e.g. user documents), not for the parts of the system (binaries, libraries etc.) where such performance matters. In fact it will improve performance, since right now applications that need such indexing (the best examples are apps for organizing music (like iTunes) or digital picture collections (like Adobe Photo Album or Google Picasa)) do it themselves, which probably is not the fastest way and is not unified across the system. For *some* applications, a view on files that lets you query for specific files/objects and work with query results rather than directories of files has much benefit. But it is only for organizing data, and in limited scope (as I've said: digital music, photography, probably some other fields). I don't really believe that this would speed up searching for office documents over a LAN or something like that. When somebody's documents are in a mess, a DB-FS won't change anything, as the documents probably lack metadata and proper naming anyway.
    • These days, operating systems like both Linux and Windows XP have too many bells and whistles that I simply do not need. Unfortunately, these bells and whistles drastically increase the amount of space that I need on my hard drive.

      Adding a database layer to Gnome sounds like using another 300 megabytes of storage on my hard drive. I simply do not need the database.

      If the FSF/GNU folks really want to do something revolutionary, they should fork Linux+Gnome into 2 distinct paths: minimalist and maximal

      • You don't need to do anything to Gnome. There are plenty of existing Linux desktop environments that sound like they'd work for you. Take a look at xfce. If you want it even more stripped down, then try BlackBox, IceWM or WindowMaker. Most distros include all of the above.
    • Re:Performance? (Score:5, Insightful)

      by BenjyD ( 316700 ) on Monday September 06, 2004 @11:54AM (#10169126)
      Why not just run in console mode? All this GUI stuff is just getting in the way of absolute performance.

      If it adds 0.5 seconds to every time you save a file, but saves you 20 seconds of filesystem navigating every time you open the file, that's a worthwhile tradeoff. Add to that the fact that computers don't get tired or bored, while humans do, and it makes even more sense to shift as much of the burden of working onto the computer as is practical.
      • Re:Performance? (Score:2, Insightful)

        by kfg ( 145172 )
        Why not just run in console mode? All this GUI stuff is just getting in the way of absolute performance.

        Although this is a KDE related project the concept itself has nothing to do with whether you use a GUI or not and the performance hit comes at the level of the DB, not the GUI.

        As for shifting the burden to the computer it doesn't really do much of that either as a human mind still has to formulate and input the query terms as well as judge the validity of the query result.

        The DB as filesystem has a
    • Re:Performance? (Score:5, Insightful)

      by jgardn ( 539054 ) on Monday September 06, 2004 @11:59AM (#10169159) Homepage Journal
      Not necessarily. Consider the performance of finding a document you wrote two years ago. How long does it take you to walk through the directory hierarchy browsing file names? How fast is the file search tool? Wouldn't it be faster if you could say "Show me the documents I wrote two years ago" and then refine the search or browse the results?

      Storing data in a relational database is natural because it is more like the way we store data in our minds than the hierarchical structures of traditional file systems.

      Also, we allow a complete abstraction of the underlying database in relational systems. The database can store the data however it sees fit, and can arrange the data on disk without the users noticing a change.

      I look forward to experimenting with a relational filesystem. I think it would be a wonderful thing to try out and see if it actually has the advantages I outlined above. I'd also like to see the actual disadvantages.
      • "Storing data in a relational database is natural because it is more like the way we store data in our minds than the hierarchical structures of traditional file systems."

        Interesting, but it does remind me of the oft-repeated "There are no straight lines in nature." It seems that when it comes to even the most mundane things like mowing the lawn, planting a garden, or vacuuming the rug, we seek out or create our own straight lines to adhere to a requisite logic imposed to make sense of it all.

        Put another way, it'
    • Isn't this thing with DB's getting a little excessive? You're adding another layer and step to storing data, which will in all likelihood hinder performance. I'm not sure the benefits outweigh the cost.

      [sarcasm]Yeah, you sure wouldn't want to gratuitously use the 99.5% idle time your desktop computer is currently wasting.[/sarcasm]

      Actually, my sarcastic response isn't even right..this will only affect performance when you're searching for a file, or saving one. How much time do you really spend doing that?

  • by kosmosik ( 654958 ) <kos&kosmosik,net> on Monday September 06, 2004 @11:19AM (#10168873) Homepage
    Such a thing should be implemented at kernel level to be transparent to *any* application. Without this it will just lead to a mess (like 4 different implementations), with some apps working with it and most not. For example, you can browse an SMB network with Nautilus, but when you actually try to open a file (from SMB via Nautilus) in you will get a message that the viewer does not support this method... It must be a standard system routine, not another level between system and GUI.
    • Actually, database storage should be implemented in the filesystem rather than in the kernel or window/desktop managers. That would make much more sense and, theoretically, it will be faster.

      Just my uneducated opinion.

    • No, not in the kernel. It should be done as a daemon, like he did.
      • by justsomebody ( 525308 ) on Monday September 06, 2004 @12:40PM (#10169428) Journal
        Yeah right. And whole world will use this daemon.

        Problem with this logic is that not everybody is gonna use DBFS. For example: Some people would like to use Reiser4.

        The proper thing would be a dummy kernel (or some higher VFS, but making the whole thing independent of whether graphic mode is present or not) API for this kind of file access. If the accessed FS is not DB related, then it should convert standard functions (implementing some Metadata index on the basic FS). If the accessed system is a Database FS, then it should go through its native layer.

        Who says that

        # list every movie file in the directory, one extension at a time
        MOVIES="*.avi *.mpg *.qt"
        for ftype in $MOVIES; do
            ls /mnt/volume/Personal/$ftype
        done

        select from fs where (path = '/mnt/volume/Personal/*') and ((path = '*.avi') or (path = '*.mpg') or (path = '*.qt'))

        is different than

        select from fs where ((type='movie') and (location='Personal'))

        The second option could open up other options, like searching by actors and directors, MP3 info, Office tags etc. But all open and save routines could be done through and with Metadata, while Metadata and data would be connected.

        Let's say I copy MP3 files from a CD. The CD is not DBFS. All my tags go down the drain. Hope you don't expect that I will copy all files through this daemon, or in graphic mode at all. Just the same on a plain system as on DBFS.

        Of course, it would help if the VFS layer could detect MIME and act accordingly. Example: an old ISO9660 burned CD (or even better, an ssh-mounted drive) with MP3s. I copy these files to my computer remotely from a terminal console on some other computer, and the computer being copied to resolves ID3 and updates Metadata from the ID3 tag index on the fly, without having some cron job.
        The same thing goes for my Office files. All files have their creator and their description. Why wouldn't those go into the Metadata index when a file is received and saved from e-mail? If this problem is not solved before implementation, then all you can expect are holes in your Metadata with a lot of non-indexed data.

        Well, that example is nothing new. Reiser4 already does that with plugins. The question here is:

        Will everybody use Reiser4? NO
        Will everybody write metadata plugin for Reiser4? NO
        Will Gnome or KDE support Reiser4 directly? NO

        Would this have a better chance if universal file access were defined independently of the FS and independently of graphic mode? YES
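A rough illustration of the two query styles compared in the comment above. It assumes a hypothetical metadata table named `fs` with `path`, `type` and `location` columns; the column names come from the pseudo-SQL in the comment, while the sample rows and function names are made up for this sketch. Python's built-in `sqlite3` only stands in for whatever index a real DBFS or VFS layer would keep.

```python
import sqlite3
from fnmatch import fnmatch

# Hypothetical metadata index: one row per file, kept alongside the ordinary path.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fs (path TEXT PRIMARY KEY, type TEXT, location TEXT)")
conn.executemany(
    "INSERT INTO fs VALUES (?, ?, ?)",
    [
        ("/mnt/volume/Personal/holiday.avi", "movie", "Personal"),
        ("/mnt/volume/Personal/wedding.mpg", "movie", "Personal"),
        ("/mnt/volume/Personal/notes.txt", "text", "Personal"),
        ("/mnt/volume/Work/demo.qt", "movie", "Work"),
    ],
)


def by_path_glob(conn):
    """Path-based lookup, like the shell loop: match on directory and extension."""
    patterns = ["*.avi", "*.mpg", "*.qt"]
    rows = conn.execute(
        "SELECT path FROM fs WHERE path LIKE '/mnt/volume/Personal/%' ORDER BY path"
    )
    return [p for (p,) in rows if any(fnmatch(p, pat) for pat in patterns)]


def by_metadata(conn):
    """Metadata-based lookup: ask for what the file is, not what it is called."""
    rows = conn.execute(
        "SELECT path FROM fs WHERE type = 'movie' AND location = 'Personal' "
        "ORDER BY path"
    )
    return [p for (p,) in rows]


print(by_path_glob(conn))
print(by_metadata(conn))
```

Both functions return the same two files here, which is the commenter's point. The metadata form keeps working if a movie is saved with an unexpected extension, as long as something filled in the `type` column when the file arrived.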
    • Read about Beagle here. I posted about this on Slashdot a few days ago.
    • That does not need to be in kernel level.

      Enough is a standard API so that Gnome and KDE can make a "File Open Dialog" for opening files.

      Applications should have an "Open Recent" option anyway, or an option to open a browser with all files created by them.

      Bottom line, the term Database is IMHO misleading, or not needed. E.g. most iApps on a Mac have a root directory in the user's home directory. In that directory they make a subdirectory structure like this: yyyy/mm/dd for each year, month and day where a f
  • stubborn (Score:3, Insightful)

    by mn3m05yn3 ( 772460 ) on Monday September 06, 2004 @11:20AM (#10168882)
    Article doesn't address whether or not we can turn DBFS off and use the more traditional hierarchical method of file placement. Will we be dragged into this kicking and screaming?
    • Article doesn't address whether or not we can turn DBFS off and use the more traditional hierarchical method of file placement.

      I would imagine the article author thought it was a given: we currently have a choice of various file systems, with no one FS requiring that it be used to the exclusion of all others. I'd be highly surprised if DBFS was any different: people who want to use it will use it, people who want to use Reiser4 will use that, and people who want ext2 will use that.

    • Re:stubborn (Score:2, Insightful)

      by donkeyboy ( 191279 )
      Most of these DB File System implementations overlay the existing file system. The standard file access methods are still there. The DB functionality is implemented as extensions.

      The DB tables are then implemented as special files that coexist nicely with the existing directory files.

      Keep in mind that if a "new" file system breaks hierarchical file access, it also breaks every app in existence.
    • Here's the facts:
      Database filesystems REQUIRE traditional filesystems to write on top of (unless the SQL server is implemented IN the kernel, which everyone (me included) agrees is too much bloat). So, for DESKTOP machines and STORAGE SERVERS, this technology would rock. It improves the ability of a user to find his/her files efficiently.

      Meanwhile, for mission critical systems, for the underlying systems, for EVERYTHING ELSE, they will continue using ordinary hierarchy file systems like Reiser.
  • by zero-one ( 79216 ) on Monday September 06, 2004 @11:21AM (#10168883) Homepage
    I have always thought that version control (file histories, branching and atomic changes) would be nice to have at the file system level. Instead of storing myessay-firstdraft.doc, myessay-seconddraft.doc, myessay-final.doc, the file system should do the work. Then if I want to make a bunch of changes (perhaps I want to try a new page layout), I should be able to commit them as one atomic change (or throw them all away if I change my mind). Then, when I want to make a set of documents with US spelling, I should be able to branch the whole lot (using no disk space) and make the small changes from UK spelling while still being able to integrate other changes I have made.
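A minimal sketch of the idea in the comment above: file histories, atomic multi-file commits, and branches that cost no extra file data. Everything here (the `VersionedStore` class, its method names, the in-memory storage) is hypothetical and only illustrates the concept. It is not how any real file system or version control tool does it, and merging changes between branches is left out.

```python
import hashlib


class VersionedStore:
    """Toy in-memory store: each commit maps file names to content hashes."""

    def __init__(self):
        self.blobs = {}                 # content hash -> bytes, stored once
        self.commits = []               # list of (parent index, {name: hash})
        self.branches = {"main": None}  # branch name -> index of latest commit

    def _snapshot(self, branch):
        head = self.branches[branch]
        return dict(self.commits[head][1]) if head is not None else {}

    def commit(self, branch, changes):
        """Apply several file changes as one atomic step on a branch."""
        snapshot = self._snapshot(branch)
        for name, data in changes.items():
            digest = hashlib.sha256(data).hexdigest()
            self.blobs.setdefault(digest, data)
            snapshot[name] = digest
        self.commits.append((self.branches[branch], snapshot))
        self.branches[branch] = len(self.commits) - 1

    def branch(self, source, new):
        """Create a branch by copying a pointer; no file data is duplicated."""
        self.branches[new] = self.branches[source]

    def read(self, branch, name):
        return self.blobs[self._snapshot(branch)[name]]

    def history(self, branch, name):
        """Distinct versions of one file on a branch, oldest first."""
        versions, idx = [], self.branches[branch]
        while idx is not None:
            parent, snapshot = self.commits[idx]
            if name in snapshot:
                data = self.blobs[snapshot[name]]
                if not versions or versions[-1] != data:
                    versions.append(data)
            idx = parent
        return versions[::-1]


store = VersionedStore()
store.commit("main", {"myessay.doc": b"first draft, colour"})
store.commit("main", {"myessay.doc": b"final, colour"})
store.branch("main", "us-spelling")
store.commit("us-spelling", {"myessay.doc": b"final, color"})
print(store.history("us-spelling", "myessay.doc"))
```

One file name now carries the whole history, instead of myessay-firstdraft.doc, myessay-seconddraft.doc and so on. Creating the branch only copies a pointer; new data is stored only when a file on the branch actually changes.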
    • by ultrabot ( 200914 ) on Monday September 06, 2004 @11:27AM (#10168939)
      I have always thought that version control (file histories, branching and atomic changes) would be nice to have at the file system level.

      Sounds like a job for an SVN plugin for Reiser4 file system. Anyone doing one already?
    • MS SBS 2003 provides this through a "Previous Versions" addition to all XP computers on the domain. It is very poorly implemented and only affects files on the server, but has proved itself useful many times.

      I am definitely in favor of version control everywhere. Undo is arguably the best time saving feature ever, let's make an ULTRA-UNDO.
    • Exec 8 (Score:3, Informative)

      by jefu ( 53450 )
      Univac's Exec 8 (I think; it's been quite a while since I used it) had versioning integrated into the files themselves, at least for some files. That is, they marked changes to the files as part of the file content, up to 5 (I think) levels deep.

      This did mean that you had to use the right tools to get into the files or you had to cope with the changes in programs that worked on them.

      VMS also had file revision numbers on files as a couple of posters have noted.

      Both of these were nice in some ways, but

  • I like the looks of this and the way it can search the file system. Nice job! This would be a great way to keep track of multiple projects. Blows away the Winfs idea, I will try it out.
    • How does it blow away the WinFS idea? It basically is the WinFS idea. It stores metadata and file relationships in a database with a reference to the file location in the traditional hierarchy.

      WinFS does all that and more with automatic metadata promotion/demotion, synchronisation, and queries generated by virtual paths for legacy applications.
  • Disadvantages (Score:5, Insightful)

    by BHearsum ( 325814 ) on Monday September 06, 2004 @11:21AM (#10168891) Homepage
    How much performance overhead will this cause? The 'Desktop Environments' already eat a lot of RAM and CPU.
    How much disk space will you lose over this? All the metadata has to be stored somewhere, and just glancing over the link I read something about a versioning system, which will definitely take up quite a bit of space. Will a 20 GB hard drive become 15 GB with DBFS?
    • Performance penalties are hard to tell. I assume that for plain searches the whole thing is faster. Read access might be the same, because it goes plainly to the file system level. Write access probably adds a few milliseconds per file to have the index updated, or probably not even that if you have atomic commits at the filesystem level and can do the indexing multithreaded. (Reiser4 instantly comes to mind.)
    • Re:Disadvantages (Score:4, Insightful)

      by aodl ( 451686 ) on Monday September 06, 2004 @12:03PM (#10169179)

      While performance is something that should always be kept in mind, we are a long way away from the days of the original Macintosh where a desk accessory had to weigh in at 600 bytes in order to make the cut and fit into both memory and on a floppy disk. As current desktop machines outperform the high end servers of a few years ago, it would be nice to put a lot of that muscle to use in improving the user experience. I'm not excusing bloated and slow code here, but we don't really need to be counting bytes.

      In any case, database based operating systems have been around for decades, from OS/400 to the BeOS. Many BeOS users claimed it was hands down faster than any other shipping OS at the time, and it featured a journaling, database-styled file system. One of the primary developers of that file system is now working at Apple on Mac OS X 10.4's Spotlight functionality.

      The thing is - as our desktop storage continues to grow at the pace that it does, and as we curiously find ways to fill it up, new ways of looking at and finding the information we store are going to be needed.

      DBFS, Gnome Storage, Apple's Spotlight, and WinFS all take different routes to get there. It's worth looking at all of them, what they offer and where they differ. WinFS is a new storage layer that combines file system resources with more structured data in a Relational/XML hybrid system, with the aim (from what I gather) of turning the file system into a global "soup" of data. That sort of soup can be seen in office suites or PDA style applications, and in older operating systems like the Newton OS, where everything is a shared and available resource that is stored and available through common structures. Spotlight, on the other hand, combines file system searches and indexes (think 'locate') with full content indexes and a metadata index, which uses 'importers' to parse out other file formats. Spotlight is not a new file system, but an indexing system that acts on files in the file system. From what I remember of Gnome Storage, it is similar, using the VFS layer and Postgres triggers and callbacks, along with plug-ins, to parse and extract relevant metadata and contents out of files. DBFS looks to be like WinFS in that it purely wants to be a new kind of information store. I don't know which style will win out. My theory is that technologies like Spotlight will eventually evolve into a new kind of storage system, while remaining familiar and file based for today's users and developers. But this is an idea whose time has more than come. It's something that's been promised for the desktop for at least a decade, and has been shown to work, albeit in targeted OS's (the Newton) or ones that never achieved mass market penetration (BeOS).

      So I think that performance concerns aren't that big of a concern, so long as (like all development) there are good people working on the solution.
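The "importers" and plug-ins described above (one small parser per file format that pulls metadata out of a file for the index) can be sketched roughly as follows. This is only an illustration of the plug-in pattern; the registry, the decorator and the metadata fields are all invented for the example and do not reflect the actual Spotlight or Gnome Storage interfaces.

```python
import os

IMPORTERS = {}  # file extension -> function(bytes) -> metadata dict


def importer(*extensions):
    """Register a metadata extractor for one or more file extensions."""
    def register(func):
        for ext in extensions:
            IMPORTERS[ext] = func
        return func
    return register


@importer(".txt")
def import_text(data):
    text = data.decode("utf-8", errors="replace")
    first_line = text.splitlines()[0] if text else ""
    return {"kind": "text", "title": first_line, "words": len(text.split())}


@importer(".csv")
def import_csv(data):
    lines = data.decode("utf-8", errors="replace").splitlines()
    header = lines[0].split(",") if lines else []
    return {"kind": "table", "columns": header, "rows": max(len(lines) - 1, 0)}


def index_file(name, data, index):
    """Run the matching importer, if any, and store the result in the index."""
    ext = os.path.splitext(name)[1].lower()
    extract = IMPORTERS.get(ext)
    meta = extract(data) if extract else {"kind": "unknown"}
    index[name] = meta
    return meta


index = {}
print(index_file("letter.txt", b"Planning application\nDear council", index))
print(index_file("photo.raw", b"\x00", index))
```

Supporting a new format means adding one more importer function; the indexing code itself does not change, and files with no importer still get an entry, just without useful metadata.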

      • I agree that I want software that starts to exploit my system instead of forsaking features to minimize resource use... particularly when the resources are fairly abundant.

        Added to which, if someone wants "slimware", it's already out there - it was written in 1995. If you are stuck with a 486, boo-hoo, fortunately there was a time when this was the cutting edge, and during that time people wrote and optimized code like lynx and fvwm and xview. So the code is there if you still need it, stop complaini

  • by ACK!! ( 10229 ) on Monday September 06, 2004 @11:25AM (#10168921) Journal
    Perhaps I am more organized than most, but I already categorize my files and such in the hierarchical file structure.

    Isn't ReiserFS planning to do this at the kernel level?

    Where does that leave other FS choices, and Storage, and other ideas like DBFS?

    I see more and more people saying look what neat things you can do with these tools.

    But really why do you personally want something like this?

    Curious to see the response is all.

    • by BenjyD ( 316700 ) on Monday September 06, 2004 @11:45AM (#10169065)
      The problems with the hierarchical system are:

      - maintenance overhead on the part of the user to create the hierarchy and maintain it. Every time you save a file you have to think "where do I put this?"
      - Finding files can be hard. Is that letter about the planning application in Documents/Letters or Documents/Planning App?
      - keeping files in two or more places at once is hard (as in the previous example). You can use softlinks, but that's hardly ideal and doesn't survive moving things around.

      Basically, the current file system imposes a significant overhead. Most power users have restructured the way they work and use a computer in order to minimise that overhead without really noticing. It's just become one of those things you have to do, like remembering to save documents.

      Why not shift the burden of organising the files onto the enormously powerful computer, rather than take up valuable human mental resources?
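The "letter about the planning application" problem above goes away if a file can carry several labels at once instead of living in exactly one directory. A minimal sketch, assuming a made-up two-table layout in SQLite (`files` and `tags`) and hypothetical helper names; a real DBFS would presumably fill in the labels from metadata rather than ask the user for them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE files (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE tags  (file_id INTEGER REFERENCES files(id), tag TEXT,
                        UNIQUE (file_id, tag));
""")


def save(conn, name, tags):
    """Store a file name once and attach any number of tags to it."""
    cur = conn.execute("INSERT INTO files (name) VALUES (?)", (name,))
    conn.executemany(
        "INSERT INTO tags VALUES (?, ?)", [(cur.lastrowid, t) for t in tags]
    )


def find(conn, *tags):
    """Names of files carrying all of the given tags."""
    marks = ",".join("?" * len(tags))
    rows = conn.execute(
        f"SELECT f.name FROM files f JOIN tags t ON t.file_id = f.id "
        f"WHERE t.tag IN ({marks}) GROUP BY f.id "
        f"HAVING COUNT(DISTINCT t.tag) = ? ORDER BY f.name",
        (*tags, len(tags)),
    )
    return [name for (name,) in rows]


save(conn, "council-letter.sxw", ["letters", "planning"])
save(conn, "site-plan.png", ["planning"])
save(conn, "thanks.sxw", ["letters"])

print(find(conn, "letters"))
print(find(conn, "planning"))
print(find(conn, "letters", "planning"))
```

The letter shows up under both "letters" and "planning" without softlinks, and renaming or moving it does not break either view because the tags point at the file record, not at a path.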
      • - Finding files can be hard. Is that letter about the planning application in Documents/Letters or Documents/Planning App?

        find ~/Documents | grep "planning application"

        You've just taken a giant step into learning how to search for files in UNIX! In my next tutorial, I'll teach you the power of the "locate" command.
        • "grep" - why didn't anyone else on the planet think of that one? That's fantastic. Now I just HOPE to goodness I've always managed to always save everything under ~/Documents, never anywhere else, and then make sure I'm only searching for ASCII data inside any files. I'm all set! Thanks. You've made it much easier for me to find all my pictures and audio files related to various topics.
          • Meta data. All three formats you mentioned support embedded meta data.

            And YES, I would expect people to actually remember to enter proper meta data for their documents. If not, then they don't get the benefits of this type of system.
            • I'm not sure grep would always find the metadata in those - maybe it would. As another person pointed out, SXW files wouldn't easily be searchable (haven't looked at metadata for those). Assume that though all those file formats *do* have metadata available, wouldn't having a standard way to search and filter all those various metadata formats in one file dialog be a good thing? If this (or another) project would do that, I'd be all for it. And for those file formats that don't support metadata, this s
        • And if I called the file planning_app.sxw? plan_app.sxw? "letter to council.sxw"? "Planning Application" wouldn't even match your example. Yes, a bunch of regexs would probably find it, but that's a fairly involved process just to find a file I only need for a few seconds to look up a name or something.

          The point is, for most people the overhead of naming and organising files has become subconscious, and we have a bunch of tools to sort-of work around it. That doesn't mean an attempt to create a different sy
        • Exactly - doesn't matter how you organize files - sooner or later you have to search for one and searching with ReiserFS is damn fast.

          I guess the hoopla is mostly a Windows thing, where it is well nigh impossible to search for anything.

          Essentially, in a hierarchical system you sometimes search for stuff, while in a DB system, you *always* search for stuff - big deal...

        • by ctr2sprt ( 574731 ) on Monday September 06, 2004 @01:18PM (#10169676)
          OK, let's consider an example other than documents. Joe is a big music fan with a couple hundred CDs. He likes having instant access to any one song when he wants it and has a lot of hard drive space, so he's MP3-ized (or ogg-ized or flac-ized or whatever) his entire collection, plus all the MP3s he's acquired via other means.

          Joe is pretty good about organization, so he names every MP3 properly with the group, album, and track names, plus the track number. (Something like The Beatles/White Album/01-Back in the USSR.mp3.) This way if he knows, for example, the track name but not what album it's on, he can find it pretty quickly using a method like yours.

          The problem is if he wants to do something like put all the country music he has, for example, in a playlist. How does he do that? It can be done, certainly, but if he has a collection with several thousand MP3s it's so tedious and difficult as to be effectively impossible. What if he wants to listen to 60s rock? What if he wants to find a particular song he has, but all he knows is it's between 3 and 5 minutes long, came out between 2002 and 2003, and is probably categorized as either "pop" or "alternative?" What if he just wants a list of all the songs he never listens to because he's sick of what he's been playing lately? Or maybe he needs to free up disk space and wants to find out what he'll miss the least.

          All these things are impossible to do in an efficient and timely manner using our current system. He can certainly use a command-line ID3 tagger to strip out the things he cares about, something like

          find mp3 -type f -print0 | xargs -0 id3tag -l | grep 'Genre: Country'
          but that's painfully slow: a second or two per file means a big collection will take 15 minutes or longer to scan, and if you typed "Gerne" by mistake you have to do it all over.

          Now if you had a filesystem-like object which could be smart and store ID3 metadata in the filesystem, then it would be much faster: the main overhead to using the find/xargs/id3tag/grep approach described above comes from having to seek through the MP3 file to get at the metadata. The reason this needs to be a "core OS component," perhaps even part of the kernel, is that MP3 tags can change at whim and the filesystem needs to know about it or its metadata can get out of sync. It's possible, but impractical, to update this on a periodic basis, like the locatedb; it makes much more sense to have the kernel inform some plugin "Hey, this file just changed, do you care?" And if the plugin does care, it can look at the changes, see if it's affected, and possibly update the metadata to match. It could also go the other way, where Joe updates the filesystem metadata and the OS knows to update the MP3's ID3 tags too.
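A sketch of what the comment above describes: tag data kept in an index so queries do not have to open every MP3, plus a change hook that keeps the index in step when a file's tags are rewritten. The `songs` table, the `on_file_changed` hook and the sample paths are all hypothetical. In a real system the hook would be driven by the kernel or file system, whereas here it is simply called by hand.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE songs (
    path TEXT PRIMARY KEY, genre TEXT, year INTEGER,
    seconds INTEGER, plays INTEGER DEFAULT 0)""")


def on_file_changed(conn, path, tags):
    """Hypothetical hook, called whenever a file's tags are written.
    Updates one index row instead of rescanning the whole library."""
    values = (tags["genre"], tags["year"], tags["seconds"], path)
    cur = conn.execute(
        "UPDATE songs SET genre = ?, year = ?, seconds = ? WHERE path = ?", values
    )
    if cur.rowcount == 0:  # first time this file is seen
        conn.execute(
            "INSERT INTO songs (genre, year, seconds, path) VALUES (?, ?, ?, ?)",
            values,
        )


def playlist(conn, genre):
    """Every song filed under one genre."""
    rows = conn.execute(
        "SELECT path FROM songs WHERE genre = ? ORDER BY path", (genre,)
    )
    return [p for (p,) in rows]


def half_remembered(conn):
    """3 to 5 minutes long, released 2002-2003, filed as pop or alternative."""
    rows = conn.execute(
        "SELECT path FROM songs WHERE seconds BETWEEN 180 AND 300 "
        "AND year BETWEEN 2002 AND 2003 "
        "AND genre IN ('Pop', 'Alternative') ORDER BY path"
    )
    return [p for (p,) in rows]


on_file_changed(conn, "country/01-road.mp3", {"genre": "Country", "year": 1975, "seconds": 200})
on_file_changed(conn, "pop/02-summer.mp3", {"genre": "Pop", "year": 2003, "seconds": 240})
on_file_changed(conn, "pop/03-epic.mp3", {"genre": "Pop", "year": 2002, "seconds": 420})
# Joe retags one file; only that row in the index is touched.
on_file_changed(conn, "pop/03-epic.mp3", {"genre": "Country", "year": 2002, "seconds": 420})

print(playlist(conn, "Country"))
print(half_remembered(conn))
```

A typo such as "Gerne" now costs one more quick query rather than another full pass over the files. The "songs he never listens to" list would be a query on the `plays` column, assuming the player updates it.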

    • I find that a spatial interface to a hierarchical storage layout makes it very easy when I want to find my files. This kind of thing is more useful when trying to find files you didn't create / save.
    • I for one, would like faster searching. Google my filesystem, something like that. Personally.

  • by data1 ( 23016 ) on Monday September 06, 2004 @11:28AM (#10168945) Homepage
    The author is asking for help to move the project to Gnome.

    Quote: There is of course the hard choice of platform. I chose KDE because I am familiar with QT a bit, and because it is inherently object-oriented, being C++ and all. But in my mind GNOME is much closer to how I would like a desktop system to function. So I would like to go for the GNOME option. I leave KDE developers to do what they want with this, and I am offering them support. But I would like to focus my efforts on GNOME and implementing the above in GNOME.

    Any volunteers?
    In the first place we will need developers. Would you like to join, send me an email ( with DBFS and JOIN somewhere in the subject. If you are not a developer, but still would like to help, please revisit this page in a few weeks. There will probably be a community website by then somewhere.
  • The database file system originated from the ideas of an object-orientated database. Keywords and references are all part of the orientation objects of the database to index to files or other objects. It does away with the traditional hierarchal view, being rooted at some place. The OODB does not need to be rooted as it is more like a web. The DBFS seems to try to implement part of the concept of the OODB. Good. There are many more features an OODBFS can offer: dynamic organization, classification, an
  • by greppling ( 601175 ) on Monday September 06, 2004 @11:37AM (#10169004)
    ...when he said on LKML, slightly paraphrased: "The only reason I see to put filesystem semantic enhancements into the kernel, is that it would be socially hard to get people to agree on a single userspace library."

    (In the course of the heated discussion about Reiser4.)

  • SharePoint anyone? (Score:3, Informative)

    by Petronius ( 515525 ) on Monday September 06, 2004 @11:39AM (#10169020)
    While everybody is busy making fun of WinFS, Microsoft is very quietly and successfully letting their customers install SharePoint sites all over the place.
    As usual, Xerox came up with the concept years ago (DocuShare). Sigh.
    • by Anonymous Coward
      It feels like sharing files in the manner that SharePoint does is an afterthought. First, they built a web portal, then they decided to add that feature in, along with a few ActiveX controls. More important to SharePoint was the web portal component. Where I work, everyone has a SharePoint site for a default webpage. This allows for simple announcements and memos without choking down the email system. It's also fairly unobtrusive, which allows users to read said memos if/when they like, rather than for
    • Our company installed SharePoint. Nobody seems to like it very much.
  • by MobyDisk ( 75490 ) on Monday September 06, 2004 @11:40AM (#10169029) Homepage
    Maybe we could call it a "filing" system since it indexes files that are on another file system. Really, a file system IS a database, not an add-on that indexes files. Still, perhaps this is a better approach than trying to redo all the file-system internals. Although to be truly useful, this needs to be an API that is GUI-independent, with GUI-bindings as needed.
  • by DarkOx ( 621550 ) on Monday September 06, 2004 @11:41AM (#10169033) Journal
    Stop letting good ideas be victimized by the M$ marketing FUD machine. Someone said "database filesystem". That was a good idea. M$ came along and said, gee, let's steal that idea. Hey, there is no existing implementation to copy, so how do we do it? The answer is they did not do it. They put a database to keep rather mundane information on top of NTFS (a real filesystem) and called it WinFS (NTFS and an almost completely unrelated database, not a file system). Keeping data on the relationships between files is a nice idea. Putting it in user space is dumb. It's overhead. Look at most of the examples he gave: "find a Word document I worked on last month". All that info is already in the filesystem. A filesystem really is already a database in the strictest sense. It stores what's on which inode, associated with how many blocks, which you could think of as attributes. It also stores attributes like permissions and dates. Why not just put some more attributes into it, like subject and relatedtopicID? If you did that, and then added the ability to maintain some other tables where you could put extended descriptions and stuff, and built up the query engine to be able to efficiently solve the queries users will likely ask, then you'd have what you're really looking for. Additionally, you would lose the overhead to a degree, because you'd be storing information once instead of in the FS and in the database.
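To make the "it's already in the filesystem" point concrete, here is a minimal Python sketch of the "Word document I worked on last month" lookup done purely from inode metadata. The function name `recent_files` and its parameters are made up for illustration; nothing here comes from DBFS or WinFS.

```python
import os
import stat
import time
from pathlib import Path


def recent_files(root, suffix, max_age_days, now=None):
    """Return files under root with the given suffix modified within max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            info = path.stat()  # mode, size and timestamps all come from the inode
            is_regular = stat.S_ISREG(info.st_mode)
            if is_regular and path.suffix == suffix and info.st_mtime >= cutoff:
                matches.append(path)
    return sorted(matches)
```

Note that this walks the whole tree on every query, which is exactly the cost an index is supposed to remove. Attributes like subject or relatedtopicID have no place in `os.stat` output, so they would still need somewhere else to live.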
  • by lcsjk ( 143581 ) on Monday September 06, 2004 @11:43AM (#10169053)
    I already use blinkx (beta) to automatically search my files along with internet keywords. It doesn't have search by date or extension and is not configurable to my liking, but it seems to do a good job of finding things I have misplaced. Integrated with the author's system, this could make a great search system.

    Normally I file things in a hierarchical method by year and month and by project name (2004file/9sep/) or (2004file/workfile/projectname), but still I lose track now and then and need keywords. Change the "slash" slant to fit your OS.

  • Have a look at this for a userspace filesystem

    There are Debian packages called fuse-source and fuse-utils. I.e. you can use all these nice kio plugins at the filesystem level.

    Just add

    deb unstable main
    deb-src unstable main

    to your /etc/apt/sources.list

    apt-get update
    apt-get install fuse-source fuse-utils

    Actually, I didn't test that yet. Someone else?
  • by 3seas ( 184403 ) on Monday September 06, 2004 @11:49AM (#10169097) Homepage Journal
    Keeping the traditional file system is only logical as this database sort of file access is in fact a higher level of abstraction.

    Considering there are numerous projects for such higher-level file access abstraction going on, it does become a secondary choice for the user whether they want to use one of these higher abstraction level file access systems.

    To remove the traditional file system altogether would be a mistake, as then it could become a system of babel or keywords.... "what was I thinking when I created that keyword", and let's not even get into what crazy joe was thinking when he came up with his keywords...

    But hey, given how MS based developers would create some obscure dll name and place it in some obscure location in order to copy protect .....

    Higher level abstractions are useful only to the point that you can, if need be, drop down in abstraction level to get your bearings as to where you are. If you cannot touch physical reality then how do you know you are not floating around aimlessly?

    Being out of touch with physical reality can even be very dangerous and hard to correct.
  • Shit.... (Score:2, Offtopic)

    Link points to (University in Enschede, the Netherlands).

    My internet connection uses their network, so it may become sluggish for some time, while the /.ing is in effect...

  • by lkcl ( 517947 ) <> on Monday September 06, 2004 @11:59AM (#10169153) Homepage
    standard filesystems only have ONE index - a hierarchical one that contains a certain amount of real-time-updated indexing (such as the timestamps on a directory)

    but it is NOT a relational database: you CANNOT easily create or use an alternative index to your files.

    that's what all the fuss is about.

    some people mentioned here that they already organise their files. great. fantastic.


    and how long would it take to reorganise?

    with a relational database, all your indexes are updated AUTOMATICALLY.

    therefore, doing searches on a relational database filesystem (find me all music files with dates between last week and last month: SELECT * from files WHERE files.type = "music" and NOW() - 7days

    you _can't_ do that sort of thing on a traditional filesystem.

    sure, you can emulate it by creating symbolic links all over the place, but what happens when a file is deleted or moved? you need to manage / relocate the symbolic links...



    fantastic idea.

    now can we have them as a kernel module, pleeease?
    • therefore, doing searches on a relational database filesystem (find me all music files with dates between last week and last month: SELECT * from files WHERE files.type = "music" and NOW() - 7days

      you _can't_ do that sort of thing on a traditional filesystem.

      Argh! Don't say things like that - someone will throw down a shell script which WILL do it (probably in one line) combining find, file, grep, perl/python and some other crap. Which, while it may work for some, entirely misses the real po
    • therefore, doing searches on a relational database filesystem (find me all music files with dates between last week and last month: SELECT * from files WHERE files.type = "music" and NOW() - 7days

      you _can't_ do that sort of thing on a traditional filesystem.

      This is honestly where I have a problem with the concept of database filesystems and don't understand all the fuss over them.

      I can do exactly what you mention above (and more) using a user utility like "find" (in UNIX/Linux).

      As far as

  • by eschasi ( 252157 ) on Monday September 06, 2004 @12:00PM (#10169160)
    To quote from the article (which most folks have not read, as usual):
    The DBFS does not actually store files, it holds references to files on the underlying hierarchy based file system.
    That line alone should answer many of the questions re backup, speed of FS performance, etc.

    At a deeper technical level, many of the questions asked here have historical answers or clues in The Design and Implementation of the Inversion File System. The abstract reads:

    This paper describes the design, implementation, and performance of the Inversion file system. Inversion provides a rich set of services to file system users, and manages a large tertiary data store. Inversion is built on top of the POSTGRES database system, and takes advantage of low-level DBMS services to provide transaction protection, fine-grained time travel, and fast crash recovery for user files and file system metadata. Inversion gets between 30% and 80% of the throughput of ULTRIX NFS backed by a non-volatile RAM cache. In addition, Inversion allows users to provide code for execution directly in the file system manager, yielding performance as much as seven times better than that of ULTRIX NFS.
    Note that this paper was published in early 1993. Many of the issues it addresses are relevant to DBFS, and many of DBFS's advantages are foreseen by that paper. IMHO DBFS has chosen a direction that should have better performance than Inversion, not to mention lower risk and easier failure recovery.

    Inversion was built on POSTGRES, which makes one wonder what happened to the source.

  • "I am currently looking for a job. Interested in employing me? Drop me an email."

    Ozy has worked on this in his time available as a student. If he gets a job doing something different, he might drop DBFS, and it might die a lonely orphan. Email him only with DBFS job offers, please!
  • The smallish (Windows) app that you downloaded and that made an index of everything on your PC?

    (or was it from altavista?)
  • A db fs with rich searching of metadata requires the orderly and accurate entry of said metadata.
    You can't get organisation out of nothing - you are just asking people to be organised in a slightly different manner.

    An organised person can already work effectively in a filesystem with current tools. The fact that they are organised is the key.

    A disorganised person is not helped as their metadata will be erratic or absent. In fact might they now have the capability to be even more disorganised?

    As I see i
  • Apple's Spotlight (Score:5, Informative)

    by DuckWing ( 19575 ) on Monday September 06, 2004 @12:19PM (#10169287)
    For those of you who have not yet looked at the Mac OS X (Tiger) preview and WWDC webcast, the new Spotlight technology built into the next version of Mac OS X looks very much like a fully integrated database file system. And it's incredibly fast. Go check it out! Note: QuickTime required. MPlayer may work for us Linux heads, but I haven't tried it.

  • by Stevyn ( 691306 ) on Monday September 06, 2004 @12:19PM (#10169293)
    People can offer their opinions for or against this, but I think that any innovation benefits Linux. I've read about WinFS and it sounds like a good idea, but who knows when it will be ready. If people working in their spare time can get something like this working in Linux before Microsoft can get it out, I think that would just be another reason to trust the open source model of developing code and squash Ballmer's FUD.

    I don't have too much trouble using a hierarchy file system. I keep my stuff pretty organized, but computers are supposed to save time, not create more problems. If this database can do a good job, I'll give it a shot.
  • DBFS is a much more accurate model of our stored data, and how we use it, than the hierarchical databases we've used to bootstrap the world into using personal computers. But it's still really a prototype, proof of concept, because its data server calls the hierarchical filesystem API of the Linux filesystem model, on which it runs. Underneath DBFS, and above the disk (or other storage media) driver, is an inode database, which is hierarchical. A giant improvement in efficiencies, whether speed, space, or comp
  • by whovian ( 107062 ) on Monday September 06, 2004 @01:09PM (#10169621)
    This basically seems like giving find the capability to do file and then storing the results with locate. And add some time stamping, sorting capabilities and GUI.

    Since all but the GUI are basic commands, it would seem sensible to have an underlying library with hooks for use by your choice of desktop manager.
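A rough sketch of what the GUI-independent part could look like, in Python. `build_type_index` is a made-up name, and `mimetypes` is a weak stand-in for file(1): it guesses from the file extension only and never looks at the contents.

```python
import mimetypes
import os
from collections import defaultdict


def build_type_index(root):
    """Walk root once (like find) and group paths by guessed type (a crude locate db)."""
    index = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Extension-based guess only; file(1) would inspect the content instead.
            guessed, _encoding = mimetypes.guess_type(name)
            index[guessed or "unknown"].append(os.path.join(dirpath, name))
    for paths in index.values():
        paths.sort()
    return dict(index)
```

A desktop front end would then only need lookups such as `index.get("text/html", [])`, while rebuilding the index stays a batch job, much like updatedb.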
  • Using a Linux desktop is like using Win3.1 or 95 hyped up on really lame raver drugs. I've always found it sad and extremely frustrating that a venue with MASSIVE potential for innovation has instead been spending its time reimplementing Windows Explorer. :|

    I'm down with anything that makes the linux desktop experience a real linux desktop experience- not a pissass wannabe win95 experience or a solaris experience. There's so much cool stuff going on under the hood... but the thin candy shell feels lik
  • One of the greatest features of GNU/Hurd is the idea of translators. By getting rid of the traditional file system model, you can make many interfaces to systems that "look" like filesystems, and such keyword based indexing/browsing/searching could be implemented at the user level *as a filesystem*, rather than as a set of calls on top of a traditional filesystem requiring the application to "know" how to use it.

    The Hurd documents talk about everything from an "ftp filesystem" to ways to rewrite X. I can
  • Not Unix (Score:5, Interesting)

    by flossie ( 135232 ) on Monday September 06, 2004 @01:20PM (#10169692) Homepage
    the DBFS does not store system files: No shared libraries, no font files or others like that. These are not documents, not files you look up at a day to day basis, and have no place in a file system.

    Whether or not you look at system files every day probably depends on what you are doing with your machine and what you consider "system files" to be. Moreover, this idea would seem to go entirely against the whole UNIX "everything is a file" philosophy which is supposed to be one of the great strengths of UNIX.

  • Reluctance to change is a pretty common public reaction to just about anything especially when it's about something that people don't understand.

    Think about something as simple as USB technology. It was a great idea from the beginning, but we were all so used to "parallel for printers, serial for modems" mentality that we couldn't see into how useful and universal it could become. But how about today? Just about anything new can be given a USB interface.

    Now I'm reading about all kinds of reservations a
  • Today computers are dualistic; they have a screen area, this is where you input information, but it is volatile. And they have a disk area which is permanent storage, but you need to explicitly save to that storage. There used to be a good reason for this behaviour, but today there is nothing wrong with storing every keystroke you input.

    Shhh! Don't let David Blunkett hear you! If he finds out that computers can do this, he will make it illegal not to use keylogging.

  • by KrackHouse ( 628313 ) on Monday September 06, 2004 @01:33PM (#10169776) Homepage
    I'm using Subversion for a project and the idea of Atomic Commits seems like an obvious direction for file systems. If that other slashdot story is correct, storage becomes less of an issue and it would be possible to roll back the system to any point in time or to only roll back one file if need be. Now throw an intuitive way to navigate files on top of that and you've got a sure winner.

    In the grand scheme of things, only a very small handful of us on earth are aware of Linux or even know what an Operating System is for that matter. File systems seem to be the big stumbling block for new users. Anything that can make computers and therefore access to information easier for the coming waves of new computer users (maybe billions of people?) will be a good thing. Even if the "bloat" slows down the system by 10%.

    I hate to preach but that old quote comes to mind "With great power comes great responsibility". I don't think most of the people working on the OS that will soon dominate in developing nations (that's Linux) are aware of the harm they can do by slowing down Linux development with petty personal disputes. Like it or not, Linux is no longer just an edgy hacker tool. It has the potential to change the lives of Billions of people.
  • by Cnik70 ( 571147 ) on Monday September 06, 2004 @01:51PM (#10169882) Homepage
    Take a look at an AS/400 (iSeries). They've been around for more than 10 years. And before that you had the System 38 and System 36... and so on..... Why is this some big revelation now?
  • A few years ago I remember seeing a filesystem based access gateway to MySQL. That was pretty neat because you could access rows of information using standard Unix tools like grep, sed, awk, numsum and so on.
  • by zdzichu ( 100333 ) <zdzichu AT irc DOT pl> on Monday September 06, 2004 @03:30PM (#10170506) Homepage Journal
    Before inventing something you should check whether someone did it earlier. Because there you have GNOME Storage. Don't be fooled by the screenshots there. Storage isn't only a cool search facility with a native language parser ("computer, find me all porn I've downloaded yesterday" anyone?).
    Storage is, surprisingly, a method to store files decomposed into their contents. The great searching ability is a side effect.

    Imagine collaborating in a group of people over one document. Everyone gets some paragraphs to edit. With Storage, everyone can edit this document at the same time, seeing others' changes as letters are typed. Store version history and you have revision control. Throw in network transparency (you go to another department, connect your laptop and automagically you can work on that department's files) with OpenTalk (Zeroconf/Rendezvous) and you've got the best idea since hierarchical directories.

    Be sure to read the whitepaper about Storage available on the mentioned site. Also check for Storage-related entries in Seth's blog (Seth is one of the architects of GNOME Storage). Now if only KDE people worked on compatibility with Storage, the freenix desktop would rule the world.

    BTW, KDE, don't miss the chance for integration! KDE is planning to introduce Google-like search in the desktop. Don't reinvent the wheel! Beagle is here, working. Just integrate Beagle with the KDE desktop and we are set.
  • by Grendel Drago ( 41496 ) on Monday September 06, 2004 @03:47PM (#10170618) Homepage
    Joel on Software said it best:

    For example, WinFS, advertised as a way to make searching work by making the file system be a relational database, ignores the fact that the real way to make searching work is by making searching work. Don't make me type metadata for all my files that I can search using a query language. Just do me a favor and search the damned hard drive, quickly, for the string I typed, using full-text indexes and other technologies that were boring in 1973.
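For anyone wondering what "full-text indexes" means in practice, here is a toy inverted index in Python. It is only a sketch under simple assumptions (ASCII words, everything in memory, documents passed in as a dict of name to text); `build_inverted_index` and `search` are illustrative names, not the way Beagle, Spotlight or WinFS actually work.

```python
import re
from collections import defaultdict

WORD = re.compile(r"[a-z0-9]+")


def build_inverted_index(documents):
    """Map each lowercase word to the set of document names containing it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for word in WORD.findall(text.lower()):
            index[word].add(name)
    return index


def search(index, query):
    """Return sorted names of documents that contain every word in the query."""
    words = WORD.findall(query.lower())
    if not words:
        return []
    hits = set(index.get(words[0], set()))
    for word in words[1:]:
        hits &= index.get(word, set())
    return sorted(hits)
```

The point of the structure is that a query touches only the entries for the words typed, not every file on the disk, and nobody has to enter any metadata by hand.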
