Linux

New Linux Petabyte-Scale Distributed File System

An anonymous reader writes "A recent addition to Linux's impressive selection of file systems is Ceph, a distributed file system that incorporates replication and fault tolerance while maintaining POSIX compatibility. Explore the architecture of Ceph and learn how it provides fault tolerance and simplifies the management of massive amounts of data."
This discussion has been archived. No new comments can be posted.

  • Look at Google and Facebook, arguably among the top users of massive databases. They have petabytes upon petabytes of data stored and are constantly growing. But what happens if they lose some data?

    Nothing. They can always go back and regenerate that data. It's just a matter of time.

    So at this large scale, it doesn't make any sense at all to focus on data integrity beyond making sure that fopen() and fread() don't return garbage. It's the smaller databases that contain critical information that need data integrity. These are typically sub-terabyte, though some may creep over that limit in a few uncommon instances.

    And realistically, if you don't want your data to be hacked up, lost, then thrown out with a bad drive, ReiserFS or any other modern journaling filesystem is the right choice.

    I wouldn't bet money on distributed filesystems just yet.

  • by Hurricane78 ( 562437 ) <deleted @ s l a s h dot.org> on Wednesday May 05, 2010 @09:42PM (#32107018)

    I see far too many layers upon layers there, which to me always smells like the inner-platform anti-pattern [wikipedia.org] that an “enterprise consultant” would produce.
    But maybe I’m just misunderstanding things and that many layers are needed for large installations. Is there anyone here who actually administers such large storage systems and has read the article? It would be interesting to hear from someone with daily experience in this.

    Also, I could not find any mention of ZFS-like scrubbing. In my experience, that equals zero reliability with today’s unreliable drives. How would that system detect a controller creating corruption? Or data degradation? I had those problems, and they killed half my data, despite having a RAID, doing automatic backups with verification, and having a git-like history of changes (to protect from accidental overwriting). None of that helped me at all.
    Only constantly checking all data, and fixing it before the errors become big enough for ECC to stop working, can prevent this.

    Did I miss it, or did they really forget that crucial part?
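
For readers unfamiliar with the term, the scrubbing idea described above can be sketched roughly as follows. This is a minimal illustration in Python, not how ZFS or Ceph implement it: the names `checksum` and `scrub`, the choice of SHA-256, and the in-memory list of replicas are all assumptions made for the example.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Return a hex digest used to detect silent corruption (SHA-256 is an assumed choice)."""
    return hashlib.sha256(data).hexdigest()


def scrub(replicas: list[bytes], expected: str) -> list[int]:
    """Verify every replica against the stored checksum and repair bad copies.

    Returns the indices of the replicas that were overwritten with a good copy.
    Raises ValueError if no replica matches the stored checksum.
    """
    # Find any copy that still matches the checksum recorded at write time.
    good = next((r for r in replicas if checksum(r) == expected), None)
    if good is None:
        raise ValueError("no intact replica left to repair from")

    repaired = []
    for i, replica in enumerate(replicas):
        if checksum(replica) != expected:
            replicas[i] = good  # overwrite the corrupted copy
            repaired.append(i)
    return repaired


# Hypothetical demo data: three copies of a block, one silently corrupted.
block = b"important data"
stored_sum = checksum(block)
replicas = [block, b"important dbta", block]
repaired = scrub(replicas, stored_sum)
print(repaired)
```

The point of running such a pass periodically is that a damaged copy is found and rewritten while at least one intact copy still exists, rather than being discovered only when the data is next read.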

  • by SanityInAnarchy ( 655584 ) <ninja@slaphack.com> on Wednesday May 05, 2010 @10:29PM (#32107318) Journal

    Pick one.

    What you call a "rat's nest", we call "compatibility", and it works surprisingly well. Writing a game? Use OpenAL -- the distro will configure it to work. Need realtime audio for a DAW? Use JACK. Anything else? Use ALSA.

    What if you picked the "wrong one"? Doesn't really matter. If you managed to build a decent DAW on top of ALSA, it'll continue to work on top of ALSA. If you used OSS, that still works today.

    Video APIs? Flash has its own codecs, so all you need to know is xvideo.

    Seriously, you have even less of an excuse than people who bitch about how Linux has both GNOME and KDE, and oh, the horrors of actually having a choice.

  • by caffeinejolt ( 584827 ) on Wednesday May 05, 2010 @11:28PM (#32107664)
    I am not really familiar with Ceph, and after going through the pain of learning more about GlusterFS (http://www.gluster.org/), only to find that Gluster was not quite ready for primetime (this was about 6 months ago, so it may have changed), I am a bit skeptical. Does anyone know the main differences between Ceph and GlusterFS (besides that GlusterFS can run in userspace)?
  • by iknowcss ( 937215 ) on Wednesday May 05, 2010 @11:40PM (#32107750) Homepage
    Actually, I'm glad that he didn't link to it. I swear, every other story on Slashdot has some comment with a link to XKCD. Hey, we get the jokes. All of us read XKCD. You don't link to a video of Yakov Smirnoff every time you make a Soviet Russia joke, do you?
  • by Anonymous Coward on Wednesday May 05, 2010 @11:48PM (#32107812)

    Yes, but Google's file system makes no attempt to implement either the POSIX standard or the Linux VFS. It's highly specialized to deal only with the types of loads that Google sees. As a general solution, its worth is debatable.
