Flattening Out The Linux Cluster Learning Curve

editingwhiz writes "IT Manager's Journal has a good look at a forthcoming book, The Linux Enterprise Cluster, that explains in clear, concise language how to build a Linux enterprise cluster using open source tools. Writer Elizabeth Ferranini interviews author Karl Kopper in a Q&A. Is this more complicated than it appears to be? (IT Manager's Journal is part of OSTG.)"
  • don't you mean... (Score:3, Insightful)

    by beckett ( 27524 ) on Sunday October 31, 2004 @08:09AM (#10678186) Homepage Journal
    • by Anonymous Coward
      Um, all they did was swap the axes. Just plot time on the other axis. Duh.
    • They were having problems with too many people learning how to cluster Linux. Mentions in various forums of "imagine a Beowulf cluster of these" had reached epidemic proportions, so they decided something had to be done.

      Thanks to this book the learning curve has been flattened down to something more appreciable and amenable to those who have complained about the problem. The curve has been flattened far enough that it takes two years to learn that clustering "will likely require more than one computer to operate correctly" (Chapter 403 pg. 8729). I count this as a big win for society.

      Ignore the anonymous coward who replied before me.
    • A better term might be

      "steep prerequisite" curve

      i.e. each advancement in the learning process requires a much higher prerequisite.
      If step 1 requires a high prerequisite, you get the "running into a wall" effect.
  • by Amiga Lover ( 708890 ) on Sunday October 31, 2004 @08:11AM (#10678194)
    Now it's not just geeks, but also IT Managers who can imagine a beowulf cluster!
    • by Anonymous Coward
      Umm dude.. Enterprise cluster != beowulf cluster

      • by Anonymous Coward
        > Umm dude.. Enterprise cluster != beowulf cluster

        Oh for fuck's sake, spare the geeky overliteral bullshit and grow a sense of humor and some perspective. Thank fuck it's not GEEKS who are IT managers but true managers; heaven help us if someone ever asked a geek to look at the big picture in an organisation.

        "I fail to see how a examining a large painting would enhance our productivity".

        Turn your coder brain off once in a while.
      • You bring up an interesting point. I wish each book on the topic of clusters mentioned which type(s) of clusters it dealt with...

        Looking for a good book on High-Availability clusters would be so much simpler
    • Rather than ridicule the level of expertise, I think it's important for IT management types to have their imagination fired up about clusters.
      They're the ones who can get funding and support for you to put one together.
      • They're the ones who can get funding and support for you to put one together.
        Hahaha!

        I proposed a minor change to the products we subscribe to from our ISP that would save money, and I'm still fighting to keep it from going to a committee to decide whether we should get quotes from other ISPs.

    • by Anonymous Coward
      now just imagine a beowulf cluster of beowulf clusters...
      • *Disclaimer: I am tired. It is 6:30 on a Sunday morning. I have done the one task I gave myself before I allowed myself to sleep, which was to make pawgloves for my Halloween costume. Thus, sanity is overrated right now.

        Okay, the classic beowulf cluster is a 4x4 matrix of computers. Now, to have a beowulf of beowulfs, each of those computers in the cluster must be connected to its own 4x4 grid, so you now have a cluster of 256 computers, arranged somewhat suboptimally. Now, in order to communicate with these

  • OSTG? (Score:5, Informative)

    by ricotest ( 807136 ) on Sunday October 31, 2004 @08:13AM (#10678199)
    I must have missed this, so for anyone else who didn't know: OSTG is the new name for the Open Source Development Network (OSDN), the network Slashdot is a part of. They're now called the Open Source Technology Group [ostg.com].
  • by Timesprout ( 579035 ) on Sunday October 31, 2004 @08:19AM (#10678216)
    The guy puts a single 10-node cluster together and this qualifies him to write the 'definitive guidebook called "The Linux Enterprise Cluster"'.

    Don't think so.
  • Very nice (Score:5, Informative)

    by a_hofmann ( 253827 ) on Sunday October 31, 2004 @08:40AM (#10678257) Homepage
    Installing and administering the various [linux-ha.org] open [linuxvirtualserver.org] source [systemimager.org] tools [kernel.org] can be tedious work, especially without documentation of how to put things together.

    A quick Google search [google.com] though reveals a lot of free papers and manuals on this very topic.
  • by Xpilot ( 117961 ) on Sunday October 31, 2004 @08:41AM (#10678262) Homepage
    ...is that there are a gazillion ways to do it, every cluster vendor comes up with their own way, and there is no agreed-upon standard yet to easily deploy these things (AFAIK). The fact that no single vendor controls how clustering works is a good thing, but without a good standard for what a clustered environment offers the application developer, the task of setting up clusters for different types of applications remains tedious.

    Lars Marowsky-Brée had a paper in the proceedings of OLS 2002 [linux.org.uk] describing the problem and a suggested solution, entitled "The Open Clustering Framework". I'm not sure how far standardized clustering has come since then. Does anyone have any insight on the matter?

    • by barks ( 640793 ) on Sunday October 31, 2004 @09:28AM (#10678376) Homepage
      I have yet to meet anyone who's running a homebrew cluster and can tell me which distro they're running... as awesome as these setups appear to be.

      What I gathered from my introductory Operating Systems class was that this was the next frontier and an exciting market to keep an eye on... and that creating applications for these setups was not, as you said, standardized yet. Can Linux applications that normally run on a single-box setup migrate relatively easily to a cluster setup yet?
      • I've personally set up two clusters, both in the 40-CPU range. I've used various versions of Red Hat; I'm now using White Box Enterprise Linux.

        I use a PXE boot system with Anaconda kickstarts to get the software installed. A post-install script then configures everything else on the machine. When it reboots, the machine appears in the cluster and is ready to use. I use the Torque batch scheduling system.

        You don't need the cluster toolkits to set up a cluster! DHCP, TFTP and a configured kickstart file are all it takes.
      • The answer is "it depends". First, there is no such thing as a generic "cluster". A cluster is just a bunch of machines cooperating to solve a problem (whether that problem is serving a website, computational physics, or the requirement for redundancy).

        For some types of applications it's easy to visualize how to get a dozen or a hundred computers to help with the problem (serving static web pages). For others, it's not (databases).

      • by Anonymous Coward
        It depends on the application.

        OpenMosix is a clustering technology that allows you to use regular apps and benefit from the cluster.

        It works by migrating processes from one computer to another. So it's like an SMP machine: the fastest any single thread can finish is limited by the fastest CPU, but it allows you to do more at once.

        For example, with a 2-system cluster, say you're compiling the kernel. If you set it up to use only one thread at a time, then you get 100% of the original single machine's speed
      • I think you'll want to take a look at openMosix. http://www.openmosix.org/ [openmosix.org] Instead of writing your app to be cluster-aware, it hides all that good stuff in the kernel. All you need to do is have an app that'll fork/pthread appropriately and Mosix takes care of all the messiness behind the scenes.
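        To make that concrete, here's a rough sketch of the kind of fork-based program openMosix should be able to spread across nodes. This is my own toy example, not from the article or the openMosix docs; the worker count and the prime-counting busywork are invented for illustration. Note there's nothing cluster-specific in the code, which is the whole point. (One caveat from memory: processes that share an address space, i.e. pthreads, reportedly don't migrate, which is why this sticks to fork() and pipes.)

          /* Toy fork-based workload: four independent child processes,
           * results collected over pipes.  On a plain kernel this just
           * uses the local CPUs; with openMosix loaded, the children
           * become candidates for migration to idle nodes. */
          #include <stdio.h>
          #include <stdlib.h>
          #include <unistd.h>
          #include <sys/wait.h>

          #define WORKERS 4
          #define CHUNK   200000L

          /* CPU-bound busywork: count primes in [lo, lo + CHUNK). */
          static int count_primes(long lo)
          {
              int count = 0;
              for (long n = lo < 2 ? 2 : lo; n < lo + CHUNK; n++) {
                  int prime = 1;
                  for (long d = 2; d * d <= n; d++)
                      if (n % d == 0) { prime = 0; break; }
                  count += prime;
              }
              return count;
          }

          int main(void)
          {
              int pipes[WORKERS][2];

              for (int i = 0; i < WORKERS; i++) {
                  if (pipe(pipes[i]) < 0) { perror("pipe"); exit(1); }
                  pid_t pid = fork();
                  if (pid < 0) { perror("fork"); exit(1); }
                  if (pid == 0) {                 /* child: do one chunk */
                      int c = count_primes((long)i * CHUNK);
                      write(pipes[i][1], &c, sizeof c);
                      _exit(0);
                  }
              }

              int total = 0;
              for (int i = 0; i < WORKERS; i++) { /* parent: gather */
                  int c = 0;
                  read(pipes[i][0], &c, sizeof c);
                  total += c;
                  wait(NULL);
              }
              printf("primes found: %d\n", total);
              return 0;
          }

        Nothing above knows it's on a cluster; a migration-capable kernel farms the children out on its own, and on an ordinary kernel the same binary just uses the local box.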
  • by egnop ( 531002 )
    End users would often complain about the system's slow response time.
    He says, "Because we couldn't print the forms for the warehouse people to select the products to be put on the truck, we'd have dozens of truck drivers sitting in the break room each day for more than 10 minutes.


    I actually don't get it; most logistics operations have had wireless for about a decade now...
    and the truck driver has no right to a break...
    • If the system is taking a long time to do the work required to print a form, how is wireless going to help?

      and the truck driver has no right to a break...

      What do you propose the driver do? Drive around the block for 10 minutes until they are ready to load?

  • by wombatmobile ( 623057 ) on Sunday October 31, 2004 @09:23AM (#10678370)

    Publications like this play an important role in establishing best practices and community, two key enablers of standardization.

    These in turn will lead to greater adoption, and more publications. A virtuous cycle.

  • VMS clusters (Score:4, Interesting)

    by Anonymous Coward on Sunday October 31, 2004 @09:24AM (#10678371)
    Want practice with decades-mature enterprise clusters? Why not get a few old VAX or Alpha systems on eBay, and/or fire up a few instances of the simh [trailing-edge.com] emulator, then join the free OpenVMS hobbyist [openvmshobbyist.com] program (I recommend the also-free-to-hobbyists Process Software's Multinet [process.com] TCP/IP stack and server software).

    And please, don't be put off VMS just because DCL, probably your first exposure to a VMS system, feels more awkward than bash (in many ways, it certainly is!). It's in the underlying architecture of the OS where the fruits of tight engineering are really demonstrated.

    • Re:VMS clusters (Score:3, Informative)

      by hachete ( 473378 )
      This seems fairly active:

      http://gnv.sourceforge.net/

      It includes a port of bash to VMS. Not sure how good it is.

      Having used and programmed in DCL myself, I'd say it's not that bad.

      h
      • As I recall from my DECUS days, there were a few alternative command line interpreters available; none, however, exceeded DCL's presence.

        Clustering VMS/VAXen was straightforward, reliable, fast and exceedingly well-supported by DEC (for a fee, anyway).

  • Unless he built a ten-node cluster OF ten-node clusters, this is lame, and he is (most likely) under-qualified to write the book, as most ACTUAL enterprise clusters are at least 20 nodes, possibly more if it's clusters of blade servers.
    • All enterprise clusters I know of are in the 2-to-10-node range. The only reason there is a cluster at all is automated failover in the case of node or site failure. Performance and scalability are of no importance; they just buy a bigger box if necessary.

      Some of the sites have hundreds of machines, all clustered in small, manageable units. These are high-end IBM AIX boxes with high-end fiber storage, the two sites 50 km apart.

      Enterprises buy clusters for availability, not for scalability!

      Markus

    • I was responsible for a 6-node VAX cluster that topped out at over 3300 simultaneous interactive users running All-in-1 (e-mail, word processing) and database applications.

      You don't need 20 nodes if 6 can do the job.
    • Guess what? In clustering environments, the different boxes are usually similar and run the same software, so adding boxen is very straightforward. He would be totally qualified after testing on a 2-node cluster, because anything with more nodes is Very Similar. That is, if you know how to network 2 computers, adding computers to your switch is not going to be a challenge.
  • Editor needed (Score:1, Informative)

    by Bleeblah ( 602029 )
    Who edited that article before it went live? It is a mess!
  • Mandrake CLIC (Score:5, Informative)

    by bolind ( 33496 ) on Sunday October 31, 2004 @10:07AM (#10678509) Homepage
    I will start by admitting that I am just a dumb university student talking out my ass. I have never set up an enterprise scale cluster.

    However, last January we set up a small (six-node) cluster with the help of CLIC [mandrakesoft.com]. Once we realized the link between Mandrake and consecutive dead CD drives [newsforge.com], we had the cluster installed in little time.

    CLIC might focus a little too much on user-friendliness and a little too little on flexibility, but for our purposes it was great. It sports ganglia, gexec, distcc and MPI (and probably more), and administration and deployment of nodes is a breeze.

    I heartily recommend CLIC for student/test/proof-of-concept projects.

  • I thought at first it said "author Karl Popper"...now that would've been a trick.
  • by Phoenix666 ( 184391 ) on Sunday October 31, 2004 @10:29AM (#10678589)
    I read about setting up a cluster about six months back, and they said that you can only really run programs that are specifically designed to run on a beowulf cluster. It seems like if you could set up a cluster and run any old app on it without special coding, then you'd have your massive adoption of Linux. A plug-n-play supercomputer, using the crappy old boxes gathering dust under the cubicles.

    Are there any plans to take beowulf in this direction? Is it already possible, and I was just reading the wrong FAQ?
    • I'm by no means an expert, but I was under the impression that a cluster of yesterday's computers would easily be outperformed by a single top of the line computer. So, except for fun and learning, clustering with old computers is just a waste of time.

      • ...It depends. If you had a well-designed database, and reasonably partitioned datasets, a given query could be parcelled out to each server, and get you results far quicker than on a monolithic computer.

        If it wasn't so, then it wouldn't be a feature of Oracle, SQL Server, DB2, etc...
    • by photon317 ( 208409 ) on Sunday October 31, 2004 @11:29AM (#10678887)

      Inevitably high-performance clusters require software designed to run on high-performance clusters. It is better not to think of such a cluster as a single system, but rather as a network of individual machines with a tight network connection. Some of the clustering add-ons for linux approach and even achieve certain aspects of a "Single system image" type of configuration, but it's never completely like a single system.

      Back in 1997 or so I tried to get as close as I could to a true Single System Image by building off the beowulf patchsets combined with patches for distributed SysV IPC/SHM and a globally-shared root filesystem using CNFS (cluster-nfs, so that a few essential config files can have unique copies per cluster node). It was very daunting work to get those patches integrated together, and the end result was that without a network interconnect as high-speed and low-latency as a processor's FSB, there was always going to be a big performance hit doing things this way.

      Of course, if an application happens to be perfect for simple HPC clusters (all CPU-intensive, very little I/O, and the work is easily divisible without tons of IPC between the workers), then it runs fantastically on such a Single System Image cluster, but then again it would have run fantastically on a simple cluster that doesn't look like a single system, too. So what the Single System concept really bought me was a nice abstraction layer that made everything easy to deploy, configure and manage. But it came at a severe initial cost of human labour. It's not worth the trouble.
    • An openMosix cluster would behave more along the lines of what you are thinking, but ultimately, for HPC applications at scale, it is generally more efficient to skip openMosix and write the programs explicitly for parallelism, mindful of the layout of processing elements (i.e. the network topology, or the balance between SMP-connected processing elements and the network connections between nodes).
    • Okay, so I set up my first cluster a few months back (nothing too much, just 4 computers running BCCD [uni.edu]). I did this for an exhibition. I found BCCD easily configurable (but it is like Knoppix, so you have to configure everything again every time you reboot). We ran some programs using PVM. Now, PVM needs special coding, but MOSIX (both PVM and MOSIX come with BCCD) does not (well, I haven't tried it out for myself, but that's what the docs say). It can redistribute the work to other nodes based on the parent node
    • Not really answering your question, but there is a thing called MPI [anl.gov] (Message Passing Interface), which is a cross-platform standard for parallelized programs. You write your program, and it will run on your Beowulf, on that massive 24-way Sun, or even locally on your Linux box, if written properly (see the sketch at the end of this comment). Of course this will always be slower in the single-CPU case, compared to code written the old-fashioned way.

      Another very important thing to remember is power consumption and cooling. You might be able to get fifty PI
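      Since "written properly" is doing a lot of work there, here is a minimal sketch of what MPI code looks like. It's my own toy example, assuming any standard MPI implementation of the era (MPICH, LAM/MPI), and the workload is invented; the point is that the same source runs on one box or fifty:

        /* Minimal MPI sketch: each rank sums part of 1..N and rank 0
         * collects the total.  Build with: mpicc -o mpisum mpisum.c */
        #include <stdio.h>
        #include <mpi.h>

        #define N 50000L

        int main(int argc, char **argv)
        {
            int rank, size;
            long local = 0, total = 0;

            MPI_Init(&argc, &argv);
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* who am I?       */
            MPI_Comm_size(MPI_COMM_WORLD, &size);  /* how many of us? */

            /* Each rank takes every size-th number; the same binary
             * works on a Beowulf node, a big SMP box, or a lone PC. */
            for (long i = rank + 1; i <= N; i += size)
                local += i;

            /* Combine the partial sums on rank 0. */
            MPI_Reduce(&local, &total, 1, MPI_LONG, MPI_SUM, 0,
                       MPI_COMM_WORLD);

            if (rank == 0)
                printf("sum 1..%ld = %ld (%d ranks)\n", N, total, size);

            MPI_Finalize();
            return 0;
        }

      You'd launch it with something like "mpirun -np 4 ./mpisum"; the rank bookkeeping and the reduction are the only cluster-specific code in there.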
  • Of course it's much more difficult than it appears to be... Ehm, look... Don't tell anyone, OK.

  • by double_h ( 21284 ) on Sunday October 31, 2004 @12:45PM (#10679310) Homepage

    A flat learning curve is a bad thing.

    The term "learning curve" was invented by the aerospace industry in the 1930s as a way to quantify improved efficiency from mass production (basically, the more you do a task, the easier it becomes). The term was later adopted by psychology and the social sciences, where most people first encounter it.

    In both cases, the horizontal axis of a learning curve represents time or effort, and the vertical axis represents amount learned or productivity. Therefore something that is intuitively obvious in fact has a steep learning curve.

    "Learning curve" was a technical term with a specific definition for decades before it was ever a (misused) marketing buzzword.

    Thank you for your time :)
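    P.S. For the curious: the original aerospace version even has a standard formula, usually credited to T. P. Wright (1936). From memory, so double-check before quoting it anywhere; in LaTeX notation:

        \bar{C}(x) = C_1 \, x^{\log_2 r}

    where C_1 is the cost of the first unit, x is cumulative production, and r is the learning rate (r = 0.8 means each doubling of output cuts the average cost per unit to 80% of what it was). The steeper the curve, the faster the improvement, which is exactly why a "steep learning curve" was originally a good thing.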

  • What might I use to cluster together Linux servers such that they actually act as a single SMP machine?

    I'd want to do more than just load-balance web services with it.

    I want shell accounts on this cluster that act as one mainframe. People would shell into their home directories (which I suppose would all be served from one big NFS), and run processes and whatnot on the entire cluster.

    Any ideas?

    Love zaq
    • >Any ideas?

      Yeah - go to Google or some clustering forum/mailing list.
    • Re:little advice (Score:3, Informative)

      by dougnaka ( 631080 ) *
      The first part, acting as one big SMP machine, is what clustering does.

      The second part, shell accounts and home directories, is a set of problems already solved by NIS/NFS. You could set up a pool of machines that all share the same NIS/NFS info, so anywhere the user logged in they'd have the same files/passwords, and load-balance it via IPVS or DNS.

      AFAIK the current state of clustering works well for custom-code situations, where you write your app to run on the cluster, but doesn't transparently make your 4 boxes look like one big machine

  • I have been looking at network-filesystem-level clustering and failover, and NFS, SMB/CIFS and OpenAFS look like good choices for that. With NFS and CIFS you can have an active/inactive failover cluster.

    I don't know about NFS, but in the case of CIFS the protocol spec has provisions for renegotiating locks if a connection is broken, though I don't know if there are bugs in Win2k/XP clients with Samba 3 servers. OpenAFS can have a sort of active/active setup, but the architecture is such that only one server handles the writes and the rest are read-only. In all of these you can have a semi-active/active failover cluster if you move half of the active volumes to the backup server, but this adds a lot to the complexity of your failover system.

    Those services keep a low to moderate amount of state information on the server. In the case of a graphical (VNC) terminal server, I don't know of any open source project that would let a GNOME session live on one server, have that server go down, and have another server take over its Ethernet MAC and IP address and continue processing where the first left off. The best I can think of is OpenMosix or maybe OpenSSI, which are two single-system-image clustering systems. If anyone knows anything, please reply and let me know. Thanks.
  • I think this is a very good idea, because people where I live are always talking about "if I had a cluster" when they know almost nothing about Linux.
  • I'd like to jump in here and make a few comments.

    First, about the book being a "definitive" guide: I cannot possibly claim to be an expert on every topic in the book--in fact, no one person can. The book is definitive, however, in that project leaders from each of the open source projects participated in editing and reviewing the material for the book.

    It is an overly broad statement to say it is the definitive guide for building any and all types of Linux clusters. The book describes how to build a cluster
