Flattening Out The Linux Cluster Learning Curve
editingwhiz writes "IT Manager's Journal has a good look at a forthcoming book, The Linux Enterprise Cluster, that explains in clear, concise language how to build a Linux enterprise cluster using open source tools. Writer Elizabeth Ferranini interviews author Karl Kopper in a Q&A. Is this more complicated than it appears to be? (IT Manager's Journal is part of OSTG.)"
The problem with clustering in Linux... (Score:5, Interesting)
Lars Marowsky-Brée had a paper in the proceedings of OLS 2002 [linux.org.uk] describing the problem and a suggested solution, entitled "The Open Clustering Framework". I'm not sure how far standardized clustering has come since then. Does anyone have any insight on the matter?
Logistics gone digital? (Score:2, Interesting)
He says, "Because we couldn't print the forms for the warehouse people to select the products to be put on the truck, we'd have dozens of truck drivers sitting in the break room each day for more than 10 minutes.
I actually don't get it; most logistics operations have been wireless for about a decade now...
and the truck driver has no right to a break...
VMS clusters (Score:4, Interesting)
And please, don't be put off VMS just because DCL, your first exposure to a VMS system, feels more awkward than bash (in many ways, it certainly is!). It's in the underlying architecture of the OS where the fruits of tight engineering are really demonstrated.
Re:The problem with clustering in Linux... (Score:4, Interesting)
What I gathered from my introductory Operating Systems class was that this was the next frontier and an exciting market to keep an eye on... and that, as you said, creating applications for these setups was not yet standardized. Can Linux applications that normally run on a single box migrate relatively easily to a cluster setup yet?
Re:This is the kind of book we need... (Score:1, Interesting)
Oh for fuck's sake, spare us the geeky over-literal bullshit and grow a sense of humor and perspective. Thank fuck it's not geeks who are IT managers but true managers; heaven help us if someone ever asked a geek to look at the big picture in an organisation.
"I fail to see how examining a large painting would enhance our productivity."
Turn your coder brain off once in a while.
Re:The problem with clustering in Linux... (Score:2, Interesting)
I use a PXE boot system with Anaconda kickstarts to get the software installed. A post-install script then configures everything else on the machine. When it reboots, the machine appears in the cluster and is ready to use. I use the Torque batch scheduling system.
You don't need the cluster toolkits to set up a cluster! DHCP, TFTP, and a configured kickstart file work just fine with Red Hat.
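The moving parts in that setup can be sketched in two config fragments (the hostnames, IPs, and file paths below are hypothetical, not from the parent post): the DHCP server points PXE clients at a TFTP boot image, and the PXELINUX config hands the installer a kickstart URL.

```
# /etc/dhcpd.conf (sketch) -- point PXE clients at the TFTP server
subnet 192.168.1.0 netmask 255.255.255.0 {
    range 192.168.1.100 192.168.1.200;
    next-server 192.168.1.1;       # TFTP server
    filename "pxelinux.0";         # PXELINUX boot loader
}

# /tftpboot/pxelinux.cfg/default (sketch) -- pass Anaconda a kickstart
default ks
label ks
    kernel vmlinuz
    append initrd=initrd.img ks=http://192.168.1.1/node-ks.cfg
```

From there, the post-install (`%post`) section of the kickstart file is where the poster's cluster-specific configuration script would run.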
Re:The problem with clustering in Linux... (Score:2, Interesting)
For some types of applications it's easy to visualize how to get a dozen or a hundred computers to help with the problem (serving static web pages); for others, it's not (databases).
Beowulf Newbie Question (Score:3, Interesting)
Are there any plans to take Beowulf in this direction? Is it already possible, and I was just reading the wrong FAQ?
Re:This is the kind of book we need... (Score:2, Interesting)
Okay, the classic Beowulf cluster is a 4x4 matrix of computers. Now, to have a Beowulf of Beowulfs, each of those computers in a cluster must be connected to its own 4x4 grid, so you now have a cluster of 256 computers, arranged somewhat suboptimally. Now, in order to communicate with these systems, you are going to need some library functions. Classic Beowulfs work well with the industry-standard PVM libraries. They can also use openMosix if the application is not natively cluster-aware. As we are dealing with clusters of clusters, some applications may not function properly if they were designed to work on just a single cluster. So, most likely, we'll end up needing a variety of techniques to Beowulf-square an application, such as combining PVM and openMosix.
Re:This is the kind of book we need... (Score:3, Interesting)
Looking for a good book on high-availability clusters would be so much simpler.
Re:The problem with clustering in Linux... (Score:1, Interesting)
OpenMosix is a clustering technology that lets you run regular apps and still benefit from the cluster.
It works by migrating processes from one computer to another. So, much like on an SMP machine, the fastest any single thread can finish is limited by the fastest CPU, but it allows you to do more at once.
For example, with a two-system cluster, suppose you're compiling the kernel. If you set it up to use only one thread at a time, you get 100% of the original single-machine performance; if you set it up to compile 4 things at the same time, you can get 160%-180% of the performance of a single machine.
All apps can benefit from it; any sort of heavy multitasking can benefit from it. No extra programming needed, just a custom kernel that is patched and some services.
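The speedup described above is the usual win from running independent jobs concurrently; what openMosix adds is process migration, so the jobs land on different nodes. A rough single-machine sketch of the same effect (the `sleep` jobs stand in for compile units; nothing here is openMosix-specific):

```shell
# Four independent 0.2s jobs, run one at a time: roughly 0.8s total.
seq 4 | xargs -P 1 -I{} sleep 0.2

# The same four jobs, four at once: roughly 0.2s total.
# On an openMosix cluster, "make -j4" gets its speedup the same way,
# except the kernel can migrate the compiler processes to other nodes.
seq 4 | xargs -P 4 -I{} sleep 0.2
```

This is also why the single-thread case stays at 100%: migration parallelizes independent processes, it does not make any one process faster.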
apps not designed for cluster with lots of state? (Score:3, Interesting)
I don't know about NFS, but in the case of CIFS the protocol spec has provisions for renegotiating locks if a connection is broken, though I don't know if there are bugs in Win2k/XP clients against Samba 3 servers. OpenAFS can have a sort of active/active setup, but the architecture is such that only one server handles the writes and the rest are read-only. In all of these you can have a semi-active/active failover cluster if you move half of the active volumes to the backup server, but this adds a lot to the complexity of your failover system.
Those services keep a low to moderate amount of state information on the server. In the case of a graphical (VNC) terminal server, I don't know of any open source projects that will allow a GNOME session to be on one server, have that server go down, and have another server take over its Ethernet MAC and IP address and continue processing where it left off. The best I can think of is OpenMosix or maybe OpenSSI, which are two single-system-image clustering systems. If anyone knows anything, please reply and let me know. Thanks.
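For the lower-state services mentioned above (file shares, virtual IPs), the Linux-HA "heartbeat" package already handles the address-takeover half of this. A minimal haresources sketch (the node name, address, and service here are hypothetical): on failover the surviving node brings up the virtual IP, sends gratuitous ARPs so clients update their ARP caches, and starts the service.

```
# /etc/ha.d/haresources (sketch) -- same file on both nodes.
# node1 normally owns the virtual IP 192.168.1.50 and runs smb;
# if heartbeat declares node1 dead, node2 takes both over.
node1 IPaddr::192.168.1.50/24/eth0 smb
```

What this does not do, of course, is preserve in-memory session state such as a running GNOME desktop, which is exactly the gap the parent post is asking about.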
Comments From the Author (Score:2, Interesting)
First, about the book being a "definitive" guide. I cannot possibly claim to be an expert on every topic in the book--in fact, no one person can. The book is definitive, however, in that project leaders from each of the open source projects participated in editing and reviewing the material for the book.
It would be an overly broad statement to say it is the definitive guide for building any and all types of Linux clusters. The book describes how to build a cluster that can be used to run mission-critical applications to support an enterprise (it has little or nothing to do with working on the "Big Problem," as Pfister would call it).
(The book took four years to write by the way.)
I do hope it helps with the learning curve, but this is one of the advantages of building what I'm calling a Linux Enterprise Cluster: system administrators can leverage their knowledge of Linux and add concepts that will allow them to build a cluster capable of supporting the enterprise.
I did not invent anything new for this book, and you CAN already find just about everything on-line that is in this book. I started work on the book in 2000 because, at the time, I wanted to have a guide book like this one that would hold my hand through the process of building a cluster that could support mission critical applications running GNU/Linux.
Finally, let me just agree with the comments about the number of nodes ("You don't need 20 nodes if 6 can do the job"). This book is not about building clusters for scientific applications where thousands of nodes and sophisticated batch job scheduling systems are required. How many nodes does it take to build the ideal cluster for your environment? I think that will depend on a lot of things including your budget, the impact of the failure of a single node in the cluster, how many instances of your application can run concurrently on a single node, performance bottlenecks from your node hardware, and so on. In my opinion, the ideal number of cluster nodes for an enterprise cluster--from the system administrator's standpoint--is about 10 (in a pinch you can log on to every node fairly quickly).
The cluster this book was based on has been in production long enough (over 18 months) to have undergone a complete hardware refresh, by the way; so the text is based on actual experience (not just theory) and, as I mentioned earlier, it has been reviewed by subject matter experts to ensure its technical accuracy.