
What's The Best Linux Distribution For Clustering? 57

syn1 asks: "There has been a proliferation of Linux distros over the last couple of years. Many are specialized for specific tasks or needs. In terms of Beowulf clusters, there are a growing number of distros specialized for these clusters. Although the old favorite among specialized Beowulf distros is Extreme Linux, other distros such as Scyld Linux and Scali Linux are catching up in terms of user share. Additionally, more people are using conventional distros (Red Hat, Debian, Mandrake, SuSE, etc.) and adding Beowulf support. I am just wondering what fellow Slashdotters think about these various distros when it comes to Beowulf clusters and which ones they think are best."
This discussion has been archived. No new comments can be posted.

  • What about Mosix [mosix.org]?
  • Debian it is.
  • ...on what you're doing. If you're doing something mission-critical, you might want to find one of the $1,000+ cluster distros, for the support that comes with them. Otherwise, I'd say just pick your favorite. Now, carpe dium or whatever - I've never set up a cluster. I've used them, but don't consider my words to be infallible.

    ...............
    SUWAIN: Slashdot User Without An Interesting Name

  • People should do a little more reading before they post things like this to slashdot! *huff*

    You should have read the all-encompassing Linux-HOWTO!

    Or better yet, the more specific, completely non-generic Beowulf-HOWTO!

    Everyone knows that.

    -------
    CAIMLAS

  • I'm using Mosix for a Blender render farm. Works very well. The restriction on the mmap call, though, can be a problem, as it prevents process migration from the home node.

  • one of the more clever tricks to lure the slashdot editors into posting yet another "my distro is beeger than yours" holy war.

    (note: looking at how obviously the submitter tries to start a distro-war, it doesn't take too much cleverness to lure the editors)

    Wanna bet there will not be any useful discussion in this thread?

    I really wish Slashdot would start moderation on articles too, then this would be dismissed as Flamebait fast enough.

  • by eleitl ( 251761 ) on Sunday November 05, 2000 @05:03AM (#648641) Homepage

    It's a second generation Beowulf, with some
    very interesting features (see below). You can
    download it for free or purchase it for cheap
    (see link at http://www.scyld.com/ )

    http://www.scyld.com/clustering_overview.html

    [...]
    Scyld Beowulf installation is easy. It's like loading Linux onto a single PC.

    The Scyld Beowulf software provides the capability to start, observe, and control
    processes on cluster nodes from the cluster's front-end computer.

    Scyld Beowulf's cluster process control, BProc, decreases time to start processes
    remotely. With process migration times of ten milliseconds, BProc provides an
    order of magnitude improvement over other job spawning methods. Additionally
    BProc provides insight into job and cluster performance.

    Scyld Beowulf features Large File Summit (LFS) support via Scyld's Linux kernel
    updates and GNU C library which support 64 bit file access on the ext2 filesystem.
    Scyld Beowulf also includes utilities modified to take advantage of this. (Basic text
    utilities, scp, ftp client and server).

    Scyld Beowulf includes GUI-based cluster node configuration, control and status
    tools.

    Scyld Beowulf ships with a customized version of the popular MPICH message
    passing library. This version is modified to take advantage of the unique process
    creation and management facilities provided by BProc which makes running MPI
    applications easier than before.

    Scyld Beowulf includes MPI-enabled linear algebra libraries and Beowulf
    application examples.
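
    To give a feel for the kind of code such a cluster actually runs, here is a
    minimal MPI "hello world" sketch - plain MPICH conventions (mpicc/mpirun),
    nothing Scyld- or BProc-specific assumed:

    /* Minimal MPI example: each rank reports where it is running. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, size, len;
        char host[MPI_MAX_PROCESSOR_NAME];

        MPI_Init(&argc, &argv);                 /* start the MPI runtime      */
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's rank        */
        MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total processes in the job */
        MPI_Get_processor_name(host, &len);     /* which node we landed on    */

        printf("rank %d of %d running on %s\n", rank, size, host);

        MPI_Finalize();
        return 0;
    }

    Compile with mpicc and launch with something like "mpirun -np 8 ./hello";
    each rank then prints which node it was placed on.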
  • by Crutcher ( 24607 ) on Sunday November 05, 2000 @05:07AM (#648642) Homepage
    There are two basic types of clustering:

    1) Process clustering - This is Beowulf; it is designed to rip every last shred of CPU time out of boxen. It is a VERY custom, machine-dependent thing. A good B-cluster will be so hand-tweaked as to be almost unrecognizable as whatever distro it started from.

    2) Server clustering - this is failover stuff, and distros can do this much better. Most people call it something like High Availability. But you are still likely to tweak it up.

    This is not a very good question, because clusters tend to be so custom. It's like asking: "What's the best frame to base a kit car on?" There /is/ a valid answer, but it simplifies more than it educates.

    -- Crutcher --
    #include <disclaimer.h>
  • I won't touch on PVM or MPI clustering, but as far as High Availability clustering goes, most of the distributions will use some form of lvs [linuxvirtualserver.org]. Since it uses nice command line utilities you can write your own scripts, or you could use the GUI they offer as well. Slap that software on any distro (make sure that the kernel's patched right) and you're ready to go.

    I've done this myself, and without starting a flame war, I've found that the easiest setup was achieved using RedHat. Their piranha tools make things easier, and since the servers came with RedHat, I didn't have to waste too much time, nor did I have to drop a couple thousand dollars for their cluster distro; it all comes in the general distribution. During research for this project I read quite a bit about the TurboLinux distribution. The internals aren't much more than lvs, but the price tag scares you away (not that you couldn't do it with a stock TL and LVS, but to use their special distro it costs ... just like RedHat's. You're not really paying for the software, but rather the tech support). Whatever you decide, keep in mind a few things:
    1. Any distro can do it.
    2. When you get the cluster up, do what you can to keep the distro/OS in the cluster the same. You'll save yourself a good bit of headaches in administration and make using the weighted algorithms a reality (ex: NT won't respond to the uptime or ruptime polling requests, so you're stuck with the static weight that you assigned; read the HOWTO for more).
    3. If you are using lvs, use direct routing. It's fast.

  • What is all this Beowulf crap? For highly-available systems, clustering usually means server fail-over. It means an active-standby configuration with a shared disk. If the active server dies, the standby mounts the disk, starts up the app, and carries on.

    For examples of shrink-wrapped versions, see Sun Cluster [sun.com], Veritas Cluster Server, and a Linux-based one, Turbo Linux Cluster Server [turbolinux.com].

    A lot of services have to be active-standby; only one server can be doing the job at a time. Any database falls into this category, including SQL-based, LDAP, and mail stores. This is where the above products would get used. For services that can be active-active, like web servers, DNS, mail relays, some form of load balancing is better and cheaper.

    There are distributed databases on the horizon, but few of them are ready for primetime. These would feel more like a Beowulf cluster.

    I'm not trying to tell you that calling Beowulf a cluster is wrong, but limiting clustering to just Beowulf is.

  • Carpe Diem = Seize the day
    Caveat Emptor = Buyer Beware

    ;P
  • What are your goals, how many concurrent jobs will you be running (and with what priorities), and do you know where the bottlenecks reside?

    Clustering, and high-performance computing in general, encompasses a huge number of problems and solutions. There are literally gobs of different routes one could take. Beowulf and benchmarks, while easy to remember and look at, are not the solution to everything. Perhaps you need the vector performance of a Cray or maybe the cache-coherent shared-memory system of a Data General AViiON or Silicon Graphics Origin. It all depends on your needs. Do the research before assuming you need one exact solution.

    FWIW, you may want to look at SGI's Advanced Clustering Environment [sgi.com] for an all-inclusive, free, open-source solution. It's available for both SGI MIPS IRIX and IA-32/Intel Linux and works quite well with SGI's great Performance Copilot analysis software. They also know a thing or two about high performance computing. If you need more power you can build a warehouse of Linux boxes or buy a 512-processor Origin 3000 [sgi.com] (w/ 1TB RAM and 714 GByte/sec bandwidth)... or a cluster of those!

    My $0.02
  • Who's off to the cloister?
  • Mosix will be the best one when they finish the implementation of the almost-not-unlike-NUMA mmap wrapper.

    --
  • by GC ( 19160 ) on Sunday November 05, 2000 @06:19AM (#648649)
    I think the last version of Extreme Linux (searches for his Extreme Linux CD) is based on RedHat Linux 5.0 - it's a little out of date now - code has moved on considerably.

    For you I would like to recommend some reading:

    Building Linux Clusters by David HM Spector [oreilly.com] published by O'Reilly, (hmmm site seems to be down, come back later, or check Google cached version [google.com])

    This book comes with a CD containing clustering software, along with step-by-step instructions. I believe, however, that there are some errata, which means that some hacking will need to be done to get your cluster online.
    It also goes through some aspects of choosing hardware, etc...

    A more in-depth resource, without step-by-step instructions, but with in-depth discussions of the granularity of Beowulf systems and whether they are actually good for the tasks you have at hand, is:
    How to Build a Beowulf: A Guide to the Implementation and Application of PC Clusters [mit.edu] by the MIT Press

    Also check the Beowulf Project site [beowulf.org] and the Beowulf Underground site [beowulf-underground.org]

    Have fun!
    ---
  • 2) Server clustering - this is failover stuff, and distros can do this much better. Most people call it something like High Availability. But you are still likely to tweak it up.
    I've never really seen number 2 as a cluster solution. But wasn't that the first NT "cluster"? I think it was because MS called their failover system a cluster at that time that people started calling a failover system a cluster. I might be wrong, but it's just my impression. Oh well, never mind.
    --------
  • by yerricde ( 125198 ) on Sunday November 05, 2000 @07:08AM (#648651) Homepage Journal

    I really wish Slashdot would start moderation on articles too

    You really wish you were looking at Kuro5hin [kuro5hin.org]. All logged-in users are always moderators at all times, and all logged-in users can vote +1 or -1 (remind you of [e2] [everything2.com]?) on story submissions in the public queue.

  • well if I had some mission-critical task, and money was not an issue, I'd use SUN. Not that Sun is better than linux, but when you buy a cluster of machines from Sun they are accountable for the performance.

    When you set one up on your favorite distro of linux, you are accountable.

    For me a bigger issue is the reusability of code written with one type of cluster in mind on another type of cluster. Anyone have any experience with this type of thing?
    NonyMouse the Coward
  • You'll find it hard to find anyone who doesn't reply to that question (best Linux for clustering) with their current flavour of the month home Linux. We're too damn partisan.

    I've used 4 distributions in the last 2 years. I only use Linux; at the moment, however, I rank them by which is least sucky.

    My intention is to have my N machines properly clustered, so I read this thread with excitement. However, if the conclusions are XxxXxx or XxXX, then I'll give up right away.

    FatPhil
    (Xxxxxx is least sucky presently)
  • TurboLinux Cluster Server [turbolinux.com] provides High Availability functions that boost uptime for services such as Web serving, mail hosting, news, and FTP. TurboLinux also has a high-performance clustering product called EnFuzion [turbolinux.com].

    Red Hat provides a package called High Availability Server [redhat.com] that includes load balancing, fault tolerance, and improved scalability for IP-based applications.

    --Loge
  • SuSE 7.0 will soon be available for SPARC if it is not already. SuSE comes with Beowulf and pvmake. I cannot comment on how good it will be. At the moment I'd stick with Red Hat 6.1 and install the clustering RPMs from SRPMs. See if they build.
  • Actually, I think he was drunk at the time ;)
  • Apparently DIPC [cei.net] (Distributed IPC) can run with MOSIX [jhu.edu], although DIPC a few months ago did not optimize migrated processes [huji.ac.il]. It could work, but works better when DIPC realizes that processes are able to run on other systems.
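    For context, DIPC's pitch is to make ordinary System V IPC - shared memory, semaphores, message queues - work between processes on different cluster nodes. Here is a minimal sketch of the plain, local SysV message-queue style it builds on; only standard calls are shown, nothing DIPC-specific:

    /* Plain System V message-queue IPC; DIPC aims to make this style work
       across nodes, but everything below is ordinary local SysV code. */
    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/ipc.h>
    #include <sys/msg.h>

    struct job_msg {
        long mtype;        /* required first field for SysV messages */
        char text[64];     /* payload: a work-unit description       */
    };

    int main(void)
    {
        key_t key = ftok("/tmp", 'D');            /* well-known key          */
        int qid = msgget(key, IPC_CREAT | 0600);  /* create/open the queue   */
        if (qid < 0) { perror("msgget"); return 1; }

        struct job_msg out = { 1, "render frame 42" };
        msgsnd(qid, &out, sizeof(out.text), 0);   /* enqueue a work unit     */

        struct job_msg in;
        msgrcv(qid, &in, sizeof(in.text), 1, 0);  /* dequeue it again        */
        printf("received: %s\n", in.text);

        msgctl(qid, IPC_RMID, NULL);              /* remove the queue        */
        return 0;
    }

    Whether code like this actually gains anything cluster-wide is up to DIPC and its process migration; the sketch is only the local starting point.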
  • The question was about which distros support linux clustering, and which people thought were good or not. It wasn't a call for a distro holy war, it wasn't a question about "How do I make a linux cluster?" or "What software packages are out there to cluster linux boxes?"

    Personally, I find that while Red Hat is not my favorite of the Linux distros, Red Hat offers Red Hat Professional Services, and this is a very nice thing for management if the cluster in question is going to be in a production environment at a company or business somewhere. If it's for your home use, do what you like, but most PHBs tend to take extreme comfort in the fact that if something Linux-related breaks, they can call Red Hat if the cluster admin on-site can't fix it, and Red Hat will either try to help on the phone, or you can pay for RH Prof. Services to come out to your site and take a look.

  • Didn't they remove OpenGL support in BeOS 5? I'm sure you can put it back, but still.... Be is very professional, I suppose, and refuses to put something in a distribution before it's completely polished.
  • Actually, Caveat is a subjunctive so the proper translation is "Let the buyer beware."
  • Personally, I prefer Blacklab Linux [blacklablinux.com] as a professional solution, although unfortunately it is not free. Still, the high price of Apple PowerPCs makes this a viable solution only under a few conditions:
    1. You need to do mega number crunching: Nothing can really beat an AltiVec PPC processor with compiled Fortran.
    2. For some reason, the amount of wattage your cluster uses is important to you.

    I can't really think of any others, unless you're a crazy mac user. But don't listen to me; I'm just a crazy mac user.

    --
    Lagos
  • Using the term cluster in that sense is common with Solaris boxes; in fact Veritas sells a product ("Veritas Cluster Server") which does just that.

    I think that someone just made it up to confuse people; I try to refer to them as an "HA Cluster" or a "Computing Cluster"

    /*
    *Not a Sermon, Just a Thought
    */
  • Clustering takes on many forms. I would suggest Debian for distributed processing environments because of its stability. But for HA clustering, it is really up to you. Figure out which distro *YOU* are most comfortable installing, then check out http://oss.missioncriticallinux.com. Their Kimberlite cluster will run on *ANY* distro.
  • Well, SuSE and SGI are porting FailSafe to the linux-ha.org High Availability project.. and SGI supposedly has much experience with that package. I'd look more at RedHat if their installation process didn't suck the big snarfborg.

    (as the SuSE liker nevertheless ends up developing for RH..)

    But missioncriticallinux.com's Convolvo says "any deestro".

    But for actual stability like trying to get the job done? The last two VA Linux boxes I bought had RedHat on them already, and hardware cost is a pretty big factor. Or did you want to start repartitioning that 50GB RAID array? Is there such a big difference between deestros after you shut everything down? How about which HA distros not which Linux distro?

    Someone's going to say BSD or die, etc etc. I'd much rather see people with actual experience responding and backing up what they say, and hear people with experience using the HA tools.

    Better yet screw the distro idea, someone just post a list of tools they like and ideas about compiling, resource management, and security.
  • For you I would like to recommend some reading:

    Building Linux Clusters by David HM Spector published by O'Reilly, (hmmm site seems to be down, come back later, or check Google cached version)

    That book is not very good. I wouldn't recommend it to anyone building a Beowulf Cluster. The examples are broken and there is a LOT of errata (a whole new ch2 from what I've heard?).

    Go get the Scyld Beowulf2 Beta... really, it's the way HPLC (High Performance Linux Clusters) are going... it's easy to admin, easy to set up, easy to understand. It's a big step in usability for Linux clusters.

    .laz

    oh, and how much will you pay for /. ID 87 ;)
    --
    My car is orange, my sig is not.

  • ... your choice should also depend on the hardware and the amount of time you want to spend tweaking the config. The Beowulf I help admin has bleeding-edge hardware that requires proprietary (closed-source, commercial) drivers that are usually packaged for RedHat, even tested against RH-specific kernels. Yes, I could probably take the RPM apart and install them on another distro, but then I couldn't really use the OEM's support as they would come back with 'we don't support that'.

    So, in *practice* your best choices would be, in my experience, RedHat for Beowulf-type clustering (process distribution) and TurboLinux for high-availability clustering (fail-overs)...
  • That's the earliest I've seen, I wrote the sig as a joke.

    How much anyone would pay would depend on your Karma!!

    Here we go again!
    ---
  • Debian has all the beowulf stuff you need prepackaged, like MPI and I think it has some batch programs. Just makes it easier to maintain if you ask me.
  • All logged-in users are always moderators at all times, and all logged-in users can vote +1 or -1

    While we're on the subject, why does it need to be limited to the -1 to +5 range? How about a -100 to +100 range? Just imagine the bragging rights to get +100 Karma!

    For the record, I'm in favor of NOT letting all logged-in users moderate. I think it's wise to make newbies wait a while to get a feel for the website first.

    I'd be interested in finding out from an experienced Website Admin. just how much extra webserver load (if any) would result from letting all logged in experienced users moderate.
  • Quite right. I guess the question asked just isn't specific enough; the setup needed for high availability failover stuff is quite different compared to load balancing / process distribution high performance clusters.

    A very good place to start looking at various stuff available for linux clustering is www.linux-ha.org [linux-ha.org].

    Also worth mentioning if you think about the high availability (active/standby) configuration: if there's more than one service to be provided, you can get quite nice performance boosts by distributing active / standby roles on the machines in your cluster - having a database server for an ISP with oracle active on one node and postgres / mysql on the 2nd node gives you both great performance and high availability.

    It means an active-standby configuration with a shared disk

    Not necessarily; personally I like the solution of having separate, local raid0 (or raid5) disk arrays in each of the nodes and keeping them synchronized over the network.
    • You don't need the special hardware for shared disk stuff.
    • It's much easier to physically separate the nodes - all that's needed is a reliable network connection between the nodes.
    • You avoid the single point of failure you'd get with the shared disk device.

    For a practical implementation of disk synchronisation at the block-device level have a look at drbd [tuwien.ac.at].

    If you do want to go with shared media you'd best consider two separate raid0 or raid5 devices, each connected to both boxes (separate scsi bus for each device). The two devices are then configured as raid1 (mirror); if you throw in some scsi separators you should be set - the aim is to avoid the problems arising from a single device rendering the whole scsi bus unusable if it fails in a nasty way.

    You'll still want to have some additional hardware for your cluster: having a good method for I/O fencing (preventing both nodes from trying to write to a device at the same time and scrambling the data) is a really good idea; the easiest way to achieve this is to provide a method for one node to control the other's power supply; in case a node decides it has to take over functionality because the previously active node is no longer responding, it can power down or at least power cycle the other node to make sure it's REALLY down and not just hung for a few seconds.

    Designing and building clusters can be fun :-)
  • Actually, you just posted the first distro-war related thread. Sad for you, you're making your prediction come true!

  • by Coolfish ( 69926 ) on Sunday November 05, 2000 @10:53AM (#648672)
    Wow, imagine a Beowulf cluster of these...

    oh wait.
  • > For you I would like to recommend some reading:
    >
    > Building Linux Clusters by David HM Spector published by O'Reilly, (hmmm site seems to be
    > down, come back later, or check Google cached version)

    Check the reader comments section: 15 are extremely negative, only 1 is positive.
  • Blender can't cluster! Check the official docs!

    Fortunately for SkyWriter, MOSIX isn't "clustering", since it turns a cluster of machines essentially into a single very large SMP machine. All the program needs to know is how to thread or fork itself to use multiple processors. At least that's the theory. Never used it myself ;)
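
    In practice that means something as simple as the fork-per-worker sketch below is all the application has to do; MOSIX, in theory, decides where each child actually runs. NWORKERS and the busy-loop are purely illustrative:

    /* Fork-per-worker pattern: nothing here is MOSIX-specific. */
    #include <stdio.h>
    #include <unistd.h>
    #include <sys/types.h>
    #include <sys/wait.h>

    #define NWORKERS 8     /* illustrative: roughly the number of nodes/CPUs */

    static double burn_cpu(int id)
    {
        double x = id + 1;
        for (long i = 0; i < 50000000L; i++)  /* stand-in for real work */
            x = x * 1.0000001 + 0.5 / x;
        return x;
    }

    int main(void)
    {
        for (int i = 0; i < NWORKERS; i++) {
            pid_t pid = fork();
            if (pid == 0) {                   /* child: do one work unit */
                printf("worker %d -> %f\n", i, burn_cpu(i));
                _exit(0);
            }
        }
        for (int i = 0; i < NWORKERS; i++)    /* parent: reap the children */
            wait(NULL);
        return 0;
    }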
  • Is there any clustering technology available for *BSD? I mean, for someone who has assorted old hardware with many different processors and platforms, NetBSD seems ideal. And for a group of Intel boxes, FreeBSD has always been the choice. (Sorry: it's just more stable & higher performance than most Linux distros - please don't take this as flame-bait, it's just my opinion; what I'm really looking for here is BSD alternatives to Beowulf clustering.) And of course, anything that runs on OpenBSD is just kick-ass, although if you're running a cluster that's not connected to the Internet, securing everything is not really necessary; performance is more key. But I guess this might be a case of "just because I can."

    ------------------------------------------
    best slash sites:infantililsm.org [infantilism.org]
  • Speaking of High Availability clusters, check out this site [linux-ha.com]

    I'm also quite keen on clustering, so when I'm back at my PC I'll rummage through my bookmarks and post some more links...

    Off the top of my head, I also remember Cplant... [sandia.gov]

    Then, there is Plan 9 [fywss.com]... Do check out their "Related Links" section!

    Trian

  • I have found a minimalist Slackware good for running the Linux-HA (heartbeat) software, with a bit of tweaking.
  • This may be out of your price range, but Sandia Labs [sandia.gov] has a nice little machine they call the Cplant [sandia.gov] (Computational Plant). It's a cluster of about 500 Linux boxes with supercomputer power [top500.org]. It's ranked 84th in the list of TOP500 Supercomputers in the world as of November 3rd.
  • ..a Beowulf cluster of the Best Linux Distribution for Clustering???

    OR a Beowulf cluster of /.ers discussing the best Linux distro for a 420-node cluster to generate the world's largest and most detailed ascii penis bird for use in sigs. Or viewing fake Natalie Portman porn.

  • Beowulf means parallel processing and distributed queueing mainly. Not at all HA, which is more suited to "new-economy" business types anyway. :)

    For a real beo, go for Debian.

    Those so-called "beowulf specific" distros just won't cut it.

    Thanks,
  • Polyserve Understudy [polyserve.com] will let you cluster any combination of FreeBSD, Linux, Solaris and NT for loadsharing and failover. You used to be able to get prices on the web site, but I can't find them there now.
  • My 9 to 5 job uses Linux-HA.. been using it for almost 2 years.. It's worked quite well..
  • I have seen a couple of Novell clusters. I don't quite know how they fit in the picture. They all run at one time and are online. If one server fails, another server can take over the users and server programs (mail server, etc.). The users normally won't see anything except that services freeze for 1-3 seconds.
    Oh well, they are doomed anyway :-)

    --------
  • Sure, clustering ain't just Beowulf, and even then Beowulf is not the only High Performance solution.

    But even in High Performance solutions, availability and scalability are things that are not to be forgotten. (Unless you don't mind your High Performance cluster crashing every week or so due to hard disk failures and overheating Pentium chips.)

    To come back to the submission question, in my opinion a distribution for a Beowulf cluster should have:
    - a means to automate installation completely. If you miss this, it will take you a lot of time to install a new machine each time one of the machines in your (500 node) cluster crashes.
    - an easy way to update your nodes (for the same reason)
    - a clear and understandable filesystem layout that also protects your nodes from the clustered processes (you wouldn't want your /var to be filled up any time a job exceeds its limits)
    - hardware support for the devices of your choice (which include hardware RAID mirrored disks, gigabit ethernet cards, fast I/O devices)
    - a good 'out of the box' security policy

    Now I don't know which distro has these features, but I might need to know sometime soon, as I try to find alternatives to the really expensive O2000 10-proc R10000 HPC cluster and the equally expensive Sun E3500 / A3500 HA cluster we use at my job.

    Besides that, I think the points noted above are also true for a distribution that just has to provide a platform for a serious business server.

    As far as I'm concerned there are some key functions I miss in scalability for the Linux solution (or they do exist and I just haven't looked hard enough):
    - a filesystem that is journaled, live-growable, has excellent performance at >1 terabyte sizes, and can be attached to two or more machines enabling failover (like Veritas vxfs) (could Coda help?)
    - an architecture that has the capability of real number crunching (like the O2000/O3000) while maintaining reliability and low prices (maybe Alphas are a solution here, or I just have to cool down and settle for 1000 MHz Pentium IV machines)
  • I'd be interested in finding out from an experienced Website Admin. just how much extra webserver load (if any) would result from letting all logged in experienced users moderate.

    All experienced users can moderate on [Everything 2] [everything2.com]. Each user who has [at least 50 XP] [everything2.com] (like Karma but you also get one for each write-up) is given 10 to 100 or more points per day with which to vote +1 or -1 on a particular write-up and cannot see other users' write-ups' scores until after voting on them.

  • Actually, "buyer beware" is subjunctive in English! It's not "Buyer, beware!" which is an imperative (with or without the !), but rather without the comma to set off what in Latin would be the vocative. Compare: God bless you. (In which case, clearly, you are not addressing God, since he is contrasted with "you", but rather are saying "may God bless you") The devil take you!
  • Has anyone tried using one of the tiny distros?
  • Agreed, Blacklab on G4 hardware is effortless to install (even without gfx) and it tears through double-precision floating point like nobody's business. But if you're a fella who's just looking to _play_ with a cluster, you're probably MUCH better off buying a dozen eMachines/PeoplePC/whatever IA-32/x86 boxes.

    Now that Apple's G4s ship with onboard 10/100/1000 gigabit ethernet, I wonder if gigabit switches will come down in price a bit.
  • Whenever I hear anything like this, I think of the word 'clusterfuck'. I don't know why, but I do :)
  • Thanks to previous posters we now know the Beowulf flavor of clustering (supercomputing), the load balancing flavor of clustering (mostly web servers), and also the High Availability flavor of clustering (duplicated servers with or without load balancing). There is one area I have not seen addressed, and that is disk clustering, where the disk farm (usually a RAID array) has (hardware) access by two or more machines. Microsoft calls their version MSCS. It is useful for database servers because you do not need two copies of the database. One DB copy on disk can be read/written by more than one machine directly over a SCSI (or fiber) channel. (Disk redundancy is provided by the RAID-5.) Is there such a thing for Linux? (Not RAID-5 - it works fine right out of the box. It's the dual-porting software I have not seen.)
  • I love you too, let's meet up for a glass of wine and a chit-chat.
