New Linux Petabyte-Scale Distributed File System
An anonymous reader writes "A recent addition to Linux's impressive selection of file systems is Ceph, a distributed file system that incorporates replication and fault tolerance while maintaining POSIX compatibility. Explore the architecture of Ceph and learn how it provides fault tolerance and simplifies the management of massive amounts of data."
In soviet Russia (Score:1, Funny)
Re: (Score:1, Funny)
Re:In soviet Russia (Score:5, Funny)
Let me guess - you work for the SEC and need it for your porn collection
Re: (Score:2)
No, I think that's PEDObyte.
Just keep them away from children (with guns and horrendous megaviolence preferably) and you're golden.
Re: (Score:2)
(with guns and horrendous megaviolence preferably)
Why stop at Megaviolence these days when you have Giga, Tera, and even Petaviolence at your disposal? :D
Also, that's why you simply don't mess with the animal activists. They will go all spatio-temporal distortion on your ass. 8I
Re:In soviet Russia (Score:4, Funny)
640 petabytes should be enough for everybody.
Re: (Score:1)
only if I trim my porn collection to only include the actual sex acts...
Re: (Score:1)
Re: (Score:3, Funny)
If you woke up one morning in Tokyo to discover that someone had blurred your genitalia during the night, I'd bet you would consider puking on someone too.
Re: (Score:1)
Re: (Score:2)
Those were installed by the same guy who mosaic'd your junk.
History (Score:4, Informative)
Re:History (Score:5, Informative)
FILE SYSTEMS SOFTWARE ENGINEER
Los Angeles, CA
New Dream Network has a vacancy for a Senior File Systems Software Engineer in Los Angeles, CA. Minimum requirements – Master’s degree in Computer Science or Computer Engineering, minimum of 2 years experience in storage programming, and background in Linux kernel programming, file systems development, network programming and Operating Systems design.
Qualified applicants should send a plain text resume to cephjobs@dreamhost.com
Re: (Score:3, Funny)
Qualified applicants should send a plain text resume
Ha! That'll cut down on the noise. I wonder how many job seekers have ever heard of plain text?
Re:History (Score:4, Funny)
"Plain text". That's just a Microsoft Word document with no embedded images or graphs or anything, right?
Re: (Score:2)
Cue image of hordes of Microsoft-Certified job seekers searching in vain for a font titled "Plain Text".
Re: (Score:2)
Re: (Score:2)
Don't quit your day job just yet.
Re: (Score:2)
Not quite, it's an OLE compound document with an embedded Plain Text object.
Re:History (Score:5, Funny)
I sent mine in ANSI format so I could blink my contact info...
Re: (Score:1)
%!PS
/Courier findfont 12 scalefont setfont
/row 769 def
1.00000 0.99083 scale
0 0 translate
85 {/col 18 def 6 {col row moveto (
That is a hilariously good start to a Thursday morning on the UK election day. Wish I could mod it funny.
) show /col col 90 add def} repeat /row row 9 sub def} repeat
showpage save restore
Re: (Score:2)
hilariously good start to a Thursday morning on the UK election day.
You chaps still having those? Jolly Good!
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
BAUDOT thanks.
Re: (Score:2, Insightful)
Is data integrity really necessary for large data? (Score:2, Interesting)
Look at Google and Facebook, arguably among the top users of massive databases. They have petabytes upon petabytes of data stored and are constantly growing. But what happens if they lose some data?
Nothing. They can always go back and regenerate that data. It's just a matter of time.
So at this large scale, it doesn't make any sense at all to focus on data integrity beyond making sure that fopen() and fread() don't return garbage. It's the smaller databases that contain critical information that need data in
Re:Is data integrity really necessary for large da (Score:5, Informative)
Google's BigFile/BigTable architecture is a distributed filesystem. If a node goes down, the data that was on that node gets copied to other nodes to keep the replication count up.
Facebook is using Apache Cassandra, which adopts a similar design.
Re: (Score:2, Interesting)
Yes, but Google's file system makes no attempt to implement either the POSIX standard or the Linux VFS. It's highly specialized to only deal with the types of loads that Google sees. As a general solution, its worth is debatable.
Re: (Score:1)
But that is not what the original question was about. The original question was about sites like Google or Facebook using anything like a distributed file system to keep from losing data.
Re: (Score:2)
Facebook uses MySQL/memcached; Cassandra is only used for the systems running the statistical analysis.
Re:Is data integrity really necessary for large da (Score:5, Insightful)
One of the remaining replicas of each block on the failed node is copied so the total replication count does not go down. The original was perhaps poorly phrased, no need to be a dick about it, though.
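The re-replication step described above can be sketched roughly as follows. This is a toy model, not how GFS, Cassandra, or Ceph actually implement it: the `placement` map, the `rereplicate` function, the node names, and the target of three replicas are all assumptions made for illustration.

```python
def rereplicate(placement, failed_node, live_nodes, target=3):
    """Restore the replica count of every block that lived on failed_node.

    placement maps a block id to the set of node names holding a replica.
    Returns the list of (block, source, destination) copies that were planned.
    """
    copies = []
    for block in sorted(placement):
        holders = placement[block]
        holders.discard(failed_node)  # that replica is gone
        # Copy from a surviving replica until the target count is restored.
        while holders and len(holders) < target:
            candidates = sorted(n for n in live_nodes if n not in holders)
            if not candidates:
                break  # not enough live nodes left to reach the target
            source = sorted(holders)[0]
            destination = candidates[0]
            holders.add(destination)
            copies.append((block, source, destination))
    return copies
```

Note that the copy is always made from a surviving replica, never from the failed node itself, which is the distinction the comment above is drawing. A real system would also balance load when choosing the destination instead of taking the first candidate.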
Re: (Score:2)
I suppose that's fine, you can always throw in extra layers, but if a system is doing distribution already, why can't you throw in location transparency on top and add a POSIX layer?
And none
Re: (Score:1)
Oh, and I forgot about Amazon Dynamo.
Re: (Score:2)
Re:Is data integrity really necessary for large da (Score:5, Insightful)
While Google may be able to go ahead and re-index websites if it loses that data, "regenerating" Gmail and Google Docs stuff isn't quite so easy, and even small amounts of data loss would kill those applications (especially among paid users).
Re:Is data integrity really necessary for large da (Score:5, Insightful)
You just contradicted yourself. You're right; it's just a matter of time. Only thing is, this is the Internet. How long to recreate that data? Weeks? Months? Years? 6 months is an eternity on the Net.
If all the accounts and stories were lost on Slashdot due to a massive database failure, how many people would come back, creating a new account and so forth? How long would it take before there was enough content and accounts to make it interesting again? Now realize that Slashdot is a drop in the bucket compared to Google.
Re: (Score:1)
Re: (Score:1)
Why? Is there something special about those?
Re: (Score:3, Funny)
Why? Is there something special about those?
You must be new here!
Nope (Score:3, Informative)
Nothing special at all. It only means Taco used sequential instead of randomised integers for user ids, which in turn can be viewed as a very loose chronology of user registrations.
In other words, no.
Re: (Score:1)
Good to see you too, old timer.
Re: (Score:1)
Re: (Score:2)
Yeah, but it's a pretty big prime.
Re: (Score:2)
Yeah, but it's a pretty big prime.
Yes but it's not an "optimus prime".
Re:Is data integrity really necessary for large da (Score:5, Informative)
Second, you have other sectors producing large amounts of data besides your favourite networking website. One example is the LHC. It is going to produce terabytes of data per DAY (15 petabytes per year). Another is space telescopes. That data can't just be 'regenerated'. One day's worth of data is incredibly expensive to produce.
Distributed file systems are already there, and people use them. Maybe not on your level of computer usage.
When you don't know what you are talking about, I think it is better to just keep quiet.
Re: (Score:2)
That would reduce the number of posts on slashdot by about 99%.
Re: (Score:1)
Actually, your RAID array can't regenerate your data in most failure scenarios because of idiotic design :
Bit error in RAID 1 :
disk A : 000000111011011
disk B : 001000111011011
that's the information your raid array has in case of a bit error. Do tell, which is the correct one ?
Or, better, yet, a 3 disk RAID-5 array :
disk A : 000000111011011
disk B : 001001010011001
parity disk : 001101101000010
clearly something is wrong ... now fix the problem.
RAID is worthless unless you know which data set is wrong.
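A small sketch of the point above, using the same three bit strings from the RAID-5 example. It shows that XOR parity detects the mismatch but that flipping the offending bit on any one of the three members makes the stripe consistent again, so parity alone cannot say which member is wrong. The per-member checksum at the end is an assumption added for illustration (it is the general idea behind block checksums in filesystems such as ZFS and btrfs, not something a plain RAID array stores), and all function names are made up.

```python
import zlib

def xor_bits(a, b):
    """Bitwise XOR of two equal-length bit strings."""
    return "".join("1" if x != y else "0" for x, y in zip(a, b))

def flip(bits, i):
    """Return bits with the bit at index i inverted."""
    return bits[:i] + ("1" if bits[i] == "0" else "0") + bits[i + 1:]

# The stripe from the comment: two data disks and one parity disk.
stripe = {"A": "000000111011011", "B": "001001010011001", "P": "001101101000010"}

def is_consistent(s):
    return xor_bits(s["A"], s["B"]) == s["P"]

def mismatch_positions(s):
    """Bit positions where the stored parity disagrees with A XOR B."""
    expected = xor_bits(s["A"], s["B"])
    return [i for i, (x, y) in enumerate(zip(expected, s["P"])) if x != y]

def candidate_repairs(s):
    """One 'repaired' stripe per member, each assuming that member is the bad one."""
    repairs = {}
    for name in s:
        fixed = dict(s)
        for i in mismatch_positions(s):
            fixed[name] = flip(fixed[name], i)
        repairs[name] = fixed
    return repairs

def members_failing_checksum(s, checksums):
    """Assumed extra metadata: a CRC32 per member, recorded when the data was written."""
    return [n for n, bits in s.items() if zlib.crc32(bits.encode()) != checksums[n]]
```

Here `is_consistent(stripe)` is false and `mismatch_positions(stripe)` reports a single bad position, yet all three stripes returned by `candidate_repairs(stripe)` pass the parity check. Only the independent checksum narrows it down to one member.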
Re: (Score:1)
Why do you assume that:
A: PB storage is very rare and only used by several large organizations.
B: PB storage is used to house generated data that can easily be replaced.
- Gilboa
Re: (Score:2)
Nothing. They can always go back and regenerate that data. It's just a matter of time.
No, they can't. This is a really, really important distinction to make. They cannot "regenerate" the data. They *might* (perhaps even "probably") be able to "recopy" the data, *assuming the original source is still available*.
Is it ready for primetime? (Score:5, Informative)
Re: (Score:1, Redundant)
Thanks. I was about to download it to service my rather large storage requirements for porn, but it seems too risky now.
Re: (Score:2)
For me this is quite a coincidence. I just spent all of yesterday reading up on fault-tolerant distributed file systems, and Ceph seemed quite promising until I realised they are also waiting on kernel 2.6.34, as it has their patches merged.
For anyone who knows more about this stuff, I was quite interested in xtreemfs as it seems to allow you to add nodes anywhere on the internet and it will deal
Re: (Score:2)
Yep and they are using btrfs for the underlying filesystem which is also not at the production use stage.
Would you clarify what the difference between Ceph and BTRFS is? From the description I thought that is what BTRFS and ZFS were supposed to be.
Totally not ripped from a webcomic... (Score:3, Insightful)
"Do you have support for smooth, full-screen Flash video yet?"
"No, but who uses that?"
Re:Totally not ripped from a webcomic... (Score:5, Insightful)
"Do you have support for smooth, full-screen Flash video yet?"
Frankly, that's Adobe's fault, not ours.
Re:Totally not ripped from a webcomic... (Score:4, Insightful)
Yes it is ours. If “ours” means: Us idiots who made Flash dominant in the first place, by using it in any way.
It always takes two. The ass doing it, and the idiot letting him do it. That guy with the narrow mustache from the 40s would agree to that: “What luck for rulers that men do not think.” ^^
Re: (Score:2, Insightful)
Frankly, that's Adobe's fault, not ours.
It could be our fault if you wanted it to be:
http://www.gnu.org/software/gnash/ [gnu.org]
http://swfdec.freedesktop.org/wiki/ [freedesktop.org]
Re: (Score:2)
Re: (Score:2)
I see the adobe developer made it here alright.
Dude, get another job if you hate this one so much.
Re: (Score:1)
Wow, there's a +1 insightful and a -1 troll in the same post. I've got mod-points, but was really not able to decide which way to go with this one.
Re:Totally not ripped from a webcomic... (Score:4, Interesting)
Pick one.
What you call a "rat's nest", we call "compatibility", and it works surprisingly well. Writing a game? Use OpenAL -- the distro will configure it to work. Need realtime audio for a DAW? Use JACK. Anything else? Use ALSA.
What if you picked the "wrong one"? Doesn't really matter. If you managed to build a decent DAW on top of ALSA, it'll continue to work on top of ALSA. If you used OSS, that still works today.
Video APIs? Flash has its own codecs, so all you need to know is xvideo.
Seriously, you have even less of an excuse than people who bitch about how Linux has both GNOME and KDE, and oh, the horrors of actually having a choice.
Re: (Score:2)
Our tools are better. Your "freetard" rhetoric doesn't matter. Neither does your "market share" rhetoric.
Adobe doesn't have any real excuse for being shown up by ALL of the "freetard" developers.
Re:Totally not ripped from a webcomic... (Score:5, Insightful)
So then you freetards need to stop whining when 99% of the world chooses not to use or support your shitty OS.
99% of the world does use our OS. You're likely doing it right now. Or did you think Slashdot runs on IIS?
And not that it'd make much difference to an obvious troll, but I use proprietary software when appropriate, and I am in favor of open source, not necessarily "free software." Not every Linux user is RMS. (And if they were, they probably wouldn't be Linux users.)
Re: (Score:1, Funny)
Not every Linux user is RMS. (And if they were, they probably wouldn't be Linux users.)
Ahem ... That should be: "Not every GNU/Linux user is ... "
Re: (Score:2)
Guess what their renderfarm runs.
Re:Totally not ripped from a webcomic... (Score:5, Insightful)
> Having a rats nest of audio and video apis doesn't help the situation. You freetards should be happy what you get for your piece of shit OS.
The ffmpeg developers can manage, yet the "professionals" at Adobe can't?
"freetardry" is the only reason h264 acceleration is supported under Linux.
If we waited for the nickel-and-dime-you approach to come to the rescue we would still be waiting.
At least with MacOS, Adobe had a real excuse.
Re: (Score:2, Redundant)
At least link to the comic you're totally not ripping from. ;)
http://xkcd.com/619/ [xkcd.com]
Re: (Score:3, Interesting)
Re: (Score:1, Troll)
Re: (Score:2, Insightful)
I don't read XKCD...
Re: (Score:2)
that's actually a good idea.
In Soviet Russia, Yakov Smirnoff [youtube.com] links you!
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
Even so, this whole argument is mindless nonsense. Adobe only just recently offered partial acceleration support, even for Windows.
The idea that any variant of Flash is any better than any other (or worse) is just Lemming nonsense.
Re: (Score:2)
Yes, but users of OSs that don't can't understand why anyone would use an OS that doesn't.
Re:Totally not ripped from a webcomic... (Score:4, Insightful)
A) Yes, I do. MPlayer will play any Flash videos, with a bare minimum of resources, and fully supports multiple video output methods, like xv and gl.
The PROBLEM is that Flash videos aren't directly available anywhere... You have to parse through a SWF video player object to even determine where to FIND the URL of the actual FLV or MP4 file. And add to that extremely aggressive plugin detection scripts on many sites, which will refuse to even embed the SWF if you happen to have an unknown VERSION of the flash player. Unfortunately, I've mentioned this before, and got several interested replies, but nobody has thus far written a browser plug-in that will masquerade as Flash 10, and understand just enough SWF to find the URLs, and either present them to the users, or automatically pass them to MPlayer. A sad, sad failing, to be sure, since
B) I (and many, many others) care VASTLY more about Linux's support for massive storage arrays than we do for its support of Flash and other user-level fluff. My servers never need to visit YouTube... But booting from a hard drive larger than 2 terabytes??? Don't expect Windows to let you do that without very specialized hardware (EFI firmware). Linux, however, can do it out of the box with many common distros.
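For anyone wondering where the 2 terabyte figure comes from: the comment doesn't say, but it presumably refers to the classic MBR partition table, which stores sector addresses in 32-bit fields (GPT, which UEFI firmware boots from, uses 64-bit ones). A quick back-of-the-envelope sketch, with the function name and defaults being illustrative only:

```python
def mbr_max_bytes(sector_bytes=512, lba_bits=32):
    """Largest capacity addressable with lba_bits-wide sector numbers."""
    return (2 ** lba_bits) * sector_bytes

limit = mbr_max_bytes()
print(limit)            # capacity in bytes with 512-byte sectors
print(limit / 2 ** 40)  # the same capacity expressed in TiB
```

With 512-byte sectors this works out to 2^41 bytes, i.e. 2 TiB. Drives that expose 4096-byte sectors push the same 32-bit limit to 16 TiB.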
pet-a-byte? (Score:2)
I'm not really sure how much a petabyte is. Could someone please translate to Natalie Portmans? or Station wagons full of congresses? or Rods to the Hogshead?
Zetta = Peta * 1,000,000 (Score:2)
I think I'll stick with ZFS. It's a million times better, give or take.
Re: (Score:1)
Don't quote me on it, as I'm too tired to look it up, but I believe a petabyte is 1000 terabytes... and last I checked that's like billions of rods of hogsheads' worth of Natalie Portmans being used as station wagons full of congresses.
Re: (Score:3, Informative)
Tera -> Tetra -> 4 -> 1000^4
Peta -> Penta (like Pentagram) -> 5 -> 1000^5
Exa -> Hexa (like Hexagon) -> 6 -> 1000^6
Zetta -> Setta (like 7 in many languages) -> 7 -> 1000^7
Yotta -> Otta -> 8 -> 1000^8
Or use 1024 if you don't like IEEE/IEC norms...
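The mnemonic above boils down to "the n-th prefix is 1000^n, or 1024^n if you count in binary". A minimal sketch of that arithmetic; the list and helper names are made up for illustration:

```python
# Prefixes in order; the position in the list (starting at 1) is the exponent.
PREFIXES = ["kilo", "mega", "giga", "tera", "peta", "exa", "zetta", "yotta"]

def decimal_bytes(prefix):
    """Bytes in one <prefix>byte using powers of 1000."""
    return 1000 ** (PREFIXES.index(prefix) + 1)

def binary_bytes(prefix):
    """Bytes in the binary counterpart (kibi, mebi, ...) using powers of 1024."""
    return 1024 ** (PREFIXES.index(prefix) + 1)

print(decimal_bytes("peta"))                            # 1000^5
print(binary_bytes("peta"))                             # 1024^5
print(decimal_bytes("zetta") // decimal_bytes("peta"))  # the "million times" joke
```

At the peta level the binary value is already about 12.6% larger than the decimal one, which is why the distinction starts to matter at this scale.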
"Enterprisey" design? Yet no scrubbing? (Score:2, Interesting)
I see far too many layers upon layers there. To me, that always smells like the inner-platform anti-pattern [wikipedia.org] that an “enterprise consultant” would produce.
But maybe I’m just misunderstanding things and that amount of layers is needed for large installations. Anyone here, who actually administers such large storage systems and read the article? Would be interesting to hear from someone with daily experience in this.
Also, I could not find any mention of any ZFS-like scrubbing going on. Which
Re: (Score:2, Informative)
Did I miss it, or did they really forget that crucial part?
You missed it. There is a scrubbing mechanism in ceph.
Re: (Score:2)
Also, it uses BTRFS as the local filesystem, which does quite a few checks as well.
Linux® (Score:3, Insightful)
The first word in the article summary is "Linux®"
Does that look weird to anyone else? I realize it's technically correct for the registered trademark symbol to be there, but somehow it just doesn't seem right.
Re: (Score:3, Informative)
Definitely looks weird. I always write it in all-lowercase. But apparently the trademark is either all-caps ("LINUX®") or the standard capitalized form ("Linux®") [linuxmark.org]
Someone should remind them to register "linux®" (all lowercase), before Darl tries to. A capital first letter just doesn't look right.
Re: (Score:3, Informative)
A word mark is always registered as all upper case. Lower and mixed case are still covered.
Re: (Score:2)
This is simply not true. For example, in Australia, the word "iPAD" (Siemens) vs "IPAD" (Apple) http://pericles.ipaustralia.gov.au/atmoss/Falcon_Users_Cookies.Run_Create [ipaustralia.gov.au]
How does this differ from glusterfs? (Score:2, Interesting)
Re: (Score:2)
Ceph reminds me more of Coda than glusterfs. Anyone remember coda?
Re: (Score:2)
They don't steal everything. (Score:2)
Re:Do niggers use linux? (Score:5, Insightful)
Distributed file systems work quite well when you have a single source of truth, but when you have multiple data stores, you can have multiple sources of truth. It essentially adds a temporal dimension to your data. As in, John Smith is a debtor of XYZ corp on Monday morning, but due to the server being down, we haven't realised on Tuesday morning that he paid his bill on Monday afternoon. Add late fee penalties.
It adds another layer of complexity to the application: delayed events have to roll back transitive actions between actors in the ecosystem. In the example, that would mean sending out another letter stating that the late fee penalties have been removed and, if they were already paid, that a refund is to be issued.
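The late-fee scenario can be sketched as a compensating action. This is a toy model under stated assumptions, not a description of any real billing or storage system: the `Account` class, both functions, and the use of plain integers for times and amounts are all invented for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Account:
    balance_due: int
    late_fee_charged: int = 0
    late_fee_paid: bool = False
    log: list = field(default_factory=list)

def charge_late_fee(acct, fee):
    """What the stale replica does on Tuesday: it has not seen the payment yet."""
    acct.late_fee_charged = fee
    acct.log.append(("late_fee", fee))

def apply_late_payment_record(acct, amount, paid_at, deadline):
    """Apply a payment record that reached this replica after the fact.

    paid_at is when the payment really happened, not when the record arrived.
    """
    acct.balance_due -= amount
    acct.log.append(("payment", amount))
    # Compensate: the fee was charged on stale data if the payment beat the deadline.
    if paid_at <= deadline and acct.late_fee_charged:
        action = "refund" if acct.late_fee_paid else "waive"
        acct.log.append((action, acct.late_fee_charged))
        acct.late_fee_charged = 0
```

The key design point is that every record carries the time the event occurred, separately from the time it was observed, so the application can tell that an earlier decision was made on incomplete data and undo it.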
Re:Do niggers use linux? (Score:5, Insightful)
Would it hurt to at least change the title while you strive for visibility and relevance? When I saw the title of your post, I half-expected to see a poorly-written diatribe against Jamal Jackson for playing basketball and chasing caucasian women.
Thank you, kind sir, for listening. We all must do our part to prevent trolling!
Thread titles vs Trolling (Score:5, Funny)
Would it hurt to at least change the title while you strive for visibility and relevance?
Well you didn't change it
I did. (Score:2)
Re: (Score:2)
Re: (Score:2)
It was noble of you to try to wrest control of a troll thread, but your comment loses a lot of credibility for being titled "Re: Do niggers use linux?"
While it's off-topic, it's at least an honest question! I'm sure the slashbots want to know the answer.
(retitled) Do POSIX stds require atomicity? (Score:1)
Re: (Score:2)
this article has some comparisons with Lustre:
http://www.linux-mag.com/cache/7744/1.html [linux-mag.com]