Linux Software

Practical Beowulf

elsewhere sent us "Linux gushes savings for oil giant", where you can read about the 32-node Beowulf cluster being used by an oil company to replace IBM supercomputers.

  • it has turned into a little tool so that
    unpopular ideas are pushed down. i can't
    believe you are letting this happen.

    i can't believe people thought this sort of bias
    wouldn't happen when moderation first started.
    power corrupts. please please please please please
    do something about it. it's a horribly broken kludge
    and it's not working.
  • by Anonymous Coward
    I believe there is a package called alien that will convert RPMs to tar.gz. Check out freshmeat for more.
  • by Anonymous Coward
    So, since it's essential for Amerada Hess to find oil to stay in business, and this analysis is a fairly important part to finding oil, do you think a fairly good argument could be made that this company is betting a good chunk of its operating strategy on Linux, thereby squashing the big FUD argument that "Nobody bets the business on Linux."? Maybe they're not "betting the business" but it sure sounds like they're thinking about at least throwing a big chunk into the pot.
  • to Hess gasoline... Now, if they'd only bring back 101 octane....
  • Posted by AnnoyingMouseCoward:

    ...is what the article said. Their software *could* have been ported to NT, but the article also mentioned that "...it would take several years..." to do so.

    This is the fundamental problem with Win Anything. M$ programming languages are totally non-standard. Having to port a 2E+6 line program from Unix to NT would have been *nasty*, so it's pretty easy to see why they went to Linux instead.
  • I think you guys are missing the point. The comment about saving money with NT was meant (i'm guessing) to show that with NT, they could have saved over their $2,000,000 initial price tag.

    Anyway, the beauty of this is that when big companies like this start using Linux, it's easier for me to convince my boss that it's a good choice for our small company.

    World domination: the sooner the better.

  • The Extreme Linux CD and most of the RPMs it contains are woefully out of date. The EL CD has a hacked RH 5.0 install with some kernel modifications, none of which are really necessary. Most clusters are being moved to 2.2.x kernels (for network and SMP reasons) and at least the channel bonding modifications have been moved to 2.2. My 24-node cluster is running 2.2.3 which I hand patched with the TCP_NoDelay and Channel Bonding modifications.

    If you are serious about getting a cluster running, take a look at:

    http://www.xtreme-machines.com/x-cluster-qs.html [xtreme-machines.com]

    This is very up-to-date and basically begins by advising you to use your EL CD as a coaster.

    Mike Prinkey
  • I don't think that moderation should be stopped, but it does need some tweaking.

    I too have seen some articles moderated down not because they were trolls, flamebait, or not informative, but because they were unpopular.

    OTOH, I have seen articles moderated WAY up (like to +5) without justification (like some of mine). Now, it looked to me that the subject just happened to strike an emphatic response with the moderator, and not be of Pulitzer writing quality.

    I think that if an article is moderated up as well as down, then at least one of the moderators is judging content as opposed to writing quality and subject relevance. This should be investigated since that isn't what a moderator should do.

  • Have you priced out an SP2 lately? This was a tremendous price/performance savings.
  • This also doesn't take into account software costs, in which Linux definitely saves a bundle because it's free. So far as I remember, the SP2 software is not free and has fairly hefty support/maintenance costs.
  • Speaking only for myself, not my company, we don't want to stain or destroy anything. Maybe when people stop using oil-derived products, all this might stop. We love to buy fast cars, products made from plastics (sneakers, computer parts, etc.), products made from chemicals (hair gel, medicines, etc.), and power to run a computer, home, or car. Supply is a result of demand. If you want to decrease the use of oil, maybe you should try to work on the demand. Lower the demand for oil and I bet oil companies will stop poking holes. No one complains when Linux is used for nuclear weapons research. What do people think Linux is being used for in all these universities and labs? If Linux is to grow, it needs to be used by as many types of apps as possible. Besides, Linux is free for use by all. Who should determine how it should be used? I think the spirit of Linux means that everybody can use Linux for whatever results they want. Anyway, that is just my opinion (not my company's, as I do work at a corporation and have to deal with suits).
  • Check out gas prices over the past 20 years. I bet you'll see that gas is one of the best bargains around. How much did a car cost about 10 years ago? I remember not paying much more now than I did then for gas. People always complain about the price of gas, but in fact, factoring in inflation, it has gone up very little. In fact a lot of the price of gas goes to taxes. Maybe you should contact your local politicians and complain to them.
  • I agree. I wish my company would do just that. Makes for a longer career and probably a better environment, which I think everyone is in favor of. Good points!!!! I believe lightsabers use some of those generic batteries you can get at Wal-Mart.
  • The key point is that our cluster is now up to 96 nodes. Hence we can now run 3 jobs at the same time. Maybe my math is wrong, but my guess is that our total throughput is 2.5 times more than the SP2 and we still save $1.5 million.
  • 1. Need better multi-system performance monitoring tools. IBM has a really nice performance tool called Performance Toolbox, which lets you create widgets to monitor a wide variety of system characteristics on a variety of systems, all within the same window. We would also have liked native drivers for our IBM 3590 tape drives. On AIX we use those features to put the label of the tape on the tape drive LED. It would be nice to have better distributed management tools for installation and management. However, none of these things would dissuade us from using Linux. Linux has been a big success here at Amerada Hess.

    2. I would like to see journaled filesystems, more drivers for various hardware devices, better monitoring utilities, better installation tools for mass installs and upgrades. We would like to see gcc integrate the P3 mmx instruction set for floating point purposes.

    3. We chose RedHat really because that was what Harry Duffey and myself were most familiar with. For our programmers and users, the version of Linux was irrelevant. It doesn't hurt with upper management that RedHat is probably the most visible and known Linux vendor at this time. In fact, we are in the process of taking a look at FreeBSD, Debian, and SuSE just to level check RedHat. I have not relied upon RedHat for any support. The few times we called RedHat, we had a hard time getting support. I typically use Dejanews and find almost all the support I need.

    4. IBM is aware of what we are doing, as they had an opportunity to sell us Netfinity PCs for our project. IBM has real nice systems (PCs and RS/6000s) and they did a good job in our evaluation. However, all the PIII 500s performed almost exactly the same in our seismic processing benchmarks. Hence, we had no good reason to switch from Dell. I think IBM is very interested in the market. We have worked with IBM for years and they are good partners, and I suspect if they are smart they will have an excellent Beowulf offering. I am not sure what this will mean for the SP2, though I suspect that for the time being many large companies will buy the SP2 for name and reputation regardless of the price.

  • The beauty of Linux is that we just installed standard RedHat (no tweaking) on the systems, hooked them up on a switch, and we have a supercomputer.
  • Actually our exploration IT budget is probably closer to 4-5% of our exploration budget. You are only counting one project. We also have 150 Sun workstations, several hundred Windoze PCs, 6 Sun servers, the SP2, Auspex, employees, software, etc. Also, this sea-floor information is extremely expensive, and adding just a little improvement could well cost close to a million dollars, so it does really help. When you have all the information, it doesn't seem so bizarre, and one would have to think that Jed's career here might be in jeopardy.
  • I think the statement means that by using NT we could have saved money in comparison to the SP2. However, we really never even considered NT as an option, for several reasons:

    1. Costs - Linux is clearly cheaper

    2. Reliability - Linux was clearly more reliable, which has been borne out. Our clusters have now been up over 3 months straight.

    3. Scalability - How does one manage 96 NT boxes in a cluster? I had no idea. Unix has things like rsh and robust scripting, making this a snap.

    4. Portability - it was relatively easy porting our sp2 apps to linux. by the way we use Linda from SCA (Scientific Computing).

  • here it is. it is very simple and i know you guys/most anybody can do better but since you asked.

    here is the code for my little dsh script. very simple and rudimentary but it works. all you do is create a file with node names in it say /collection1. no security per se so maybe someone can show me how to add ssh or something else.

    Let /collection1 contain

    node1
    node2
    node6
    node8

    Usage Examples:

    export BCOLL=/collection1

    dsh date

    dsh /nfs/tech/update_kernel.22

    dsh reboot

    dsh "echo node1:/nfs/data /nfs/data nfs rw,bg,hard,intr >> /etc/fstab"

    The code

    #!/bin/ksh
    # dsh: run a command on every node listed in the file named by $BCOLL.
    if [ "$BCOLL" = "" ]
    then
        echo "Environment Variable BCOLL not set"
        echo "Please set BCOLL to a file with a list of nodes in it"
        echo "export BCOLL=/root/allnodes"
        echo
        exit 1
    fi

    if [ "$1" = "" ]
    then
        echo "Please provide command to run"
        echo "Usage: dsh command"
        echo
        exit 1
    fi

    # Log every invocation.
    echo "Collection: $BCOLL - Command: $* - Date: `date`" >> /root/dsh.log

    # Start the command on all nodes in parallel, one output file per node.
    for i in `cat $BCOLL`
    do
        echo > /tmp/dsh$$.$i 2> /tmp/dsh$$err.$i
        echo "Running $* on $i" >> /tmp/dsh$$.$i 2>> /tmp/dsh$$err.$i
        echo >> /tmp/dsh$$.$i 2>> /tmp/dsh$$err.$i
        # -n keeps the backgrounded rsh from trying to read the terminal.
        rsh -n $i "$@" >> /tmp/dsh$$.$i 2>> /tmp/dsh$$err.$i &
    done
    wait

    # Print each node's output in list order, then clean up.
    for i in `cat $BCOLL`
    do
        echo "-----------------------------------------------------------------"
        cat /tmp/dsh$$.$i /tmp/dsh$$err.$i
        rm -f /tmp/dsh$$.$i /tmp/dsh$$err.$i
    done

    Jeff Davis
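
On the question of adding ssh: below is a minimal sketch of the same loop with ssh in place of rsh, assuming key-based logins are already set up on every node. The function name dsh_ssh and the DSH_REMOTE override are hypothetical and not part of the script above, and mktemp -d is assumed to be available on the head node.

```shell
#!/bin/sh
# dsh_ssh: run one command on every node listed in the file named by $BCOLL.
# DSH_REMOTE is a hypothetical override for the remote-shell program
# (default: ssh); it is not part of the original dsh script.
dsh_ssh() {
    if [ -z "$BCOLL" ] || [ ! -r "$BCOLL" ]; then
        echo "BCOLL must name a readable file of node names" >&2
        return 1
    fi
    if [ $# -eq 0 ]; then
        echo "Usage: dsh_ssh command" >&2
        return 1
    fi

    remote=${DSH_REMOTE:-ssh}
    tmpdir=$(mktemp -d) || return 1

    # Start the command on every node in parallel, one output file per node.
    while read -r node; do
        [ -n "$node" ] || continue
        # </dev/null stops the remote shell from consuming the node list.
        $remote "$node" "$*" > "$tmpdir/$node.out" 2> "$tmpdir/$node.err" < /dev/null &
    done < "$BCOLL"
    wait

    # Print the results in node-list order, then clean up.
    while read -r node; do
        [ -n "$node" ] || continue
        printf '%s\n' "----- $node -----"
        cat "$tmpdir/$node.out" "$tmpdir/$node.err"
    done < "$BCOLL"

    rm -rf "$tmpdir"
}
```

With BCOLL exported as in the usage examples, `dsh_ssh date` should behave much like `dsh date`. Setting DSH_REMOTE="ssh -o BatchMode=yes" makes a node fail outright instead of stopping at a password prompt.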

  • We actually evaluated IBM Netfinity PIII 500s. They were good performers. However, they were about equal to the Dells and since we have already been using Dells, we saw no reason to switch at this time. But we will continue to evaluate systems based on price/performance as we grow our system larger than 96 nodes.
  • by Jeff Davis ( 2828 ) on Tuesday May 04, 1999 @07:49AM (#1905596) Homepage
    We work for Amerada Hess on the Beowulf project in the article. The author of the article was genuinely interested in how we used Linux and the difficulties in getting our upper management to sign off on the project. The Beowulf project at Amerada Hess has really opened some eyes about the power of Linux, and hopefully this article will spread the word. Slashdot, Freshmeat, Redhat, Dejanews, Dell, and Paralogic have really helped this project run smoothly. I would highly recommend any company with a need for serious computation and a few programmers to give Beowulf a shot. It really works and it can really save you money.

    Jeff Davis
    Harry Duffey
    Amerada Hess Corporation
    Houston, TX
  • by Jeff Davis ( 2828 ) on Tuesday May 04, 1999 @08:20AM (#1905597) Homepage
    1) The main tool we had to create was a program similar to IBM's PSSP dsh (distributed shell). However, we didn't take the time to put in all the features that IBM has in their dsh. As of right now, there is no corollary to PSSP on Linux that I'm aware of. There is a product called SMILE (SCMS) [ku.ac.th], which wasn't quite ready when we were implementing Beowulf. There is also a product called masshosts by John Mechalas which we are looking at. One nice thing about Linux/Unix is that there isn't really that much need to use these tools that often, as the systems run and run without much need for interference. We also used the kickstart process to install the nodes. The SP2 definitely has a one-up on installation, as you basically set up the install process and say go. Several hours later the system is ready. However, Linux installs about 6 times faster than AIX.

    2) I guess I could put my little tool out somewhere, but I really think others could probably do a much better job of writing a more complete tool. I will try to compile our config and post it.

    Jeff Davis and Harry Duffey
  • I was thinking the same thing while reading the article when I remembered that IBM not only knows of Linux's capability in Beowulf clusters but has demonstrated it. Anyone remember the 17 Netfinity cluster IBM demonstrated at LinuxWorldExpo that matched a Cray on the PovRay benchmark?

    I feel IBM will be fine for quite a while as far as their AIX systems are concerned. The Beowulf project does great work on high-computation problems but falls relatively short on applications that require large and fast disk space, mostly due to the lack of a fast distributed filesystem. At least this is my understanding. Down the road maybe IBM will go back to being a hardware manufacturer primarily.
  • Please do post it. I, personally, may not use it. But that's how the community gets more robust applications. Someone does the basic fleshing out, and others add onto it what they feel they need.

    Glad Linux worked out so well for you guys.


    Chas - The one, the only.
    THANK GOD!!!

  • Seeing as how IBM has a team porting Linux to their RS/6000 workstations, I think IBM is planning on using Linux to help profit margins.
    Christopher A. Bohn
  • Is that it is distributed as RPM only. I downloaded
    RPM for the sole reason of trying it, and frankly,
    nothing worked as promised. RPM refused to unpack
    the package...

    I think it would be nice of them to include a
    .tar.gz package as well. The entire world does not
    run Red Hat right yet.

    /* Steinar */
  • Also note that the boys from OPEC have a LOT to say about gas/oil prices....

    I'm not convinced that gas is a 'bargain', however. Non-renewable, not horribly clean. I would like to see more oil companies become 'energy' companies, and spend more on alternative fuel sources.

    And what kind of batteries DO they use in lightsabers, anyway?

  • NOT better performance by Anonymous Coward on Tuesday May 04, @03:16PM EDT

    I have the Computerworld article, in print, and in both the text and in a little side box it says that the Linux Beowulf performs 80% as fast as the SP2. In other words, the SP2 is 125% faster than the 32 node Beowulf.

    Ummm.... Shouldn't you have said that the SP2 is 25% faster than the Beowulf? Where did you get 125%?

    By my sample calculations, if the SP2 runs at a "speed" of 100, and the Beowulf at 80, then the Beowulf is 80% as fast as the SP2 (80/100)*100 = 80%

    However, to see how much "faster" the SP2 is, do a percent difference calculation:

    ((100-80)/80)*100 = 25%

    OTOH, 80*1.25 = 100, so the SP2 is 1.25 times as fast, but not 125% faster.
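
For anyone who wants to check the two calculations, here is a small sketch using awk. The helper names percent_as_fast and percent_faster are made up for this example; the speeds 100 and 80 are the sample figures used above.

```shell
# percent_as_fast A B: speed of A expressed as a percentage of B.
percent_as_fast() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.0f\n", a / b * 100 }'
}

# percent_faster A B: how much faster A is than B, as a percent difference.
percent_faster() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.0f\n", (a - b) / b * 100 }'
}

percent_as_fast 80 100   # Beowulf relative to the SP2
percent_faster 100 80    # SP2 compared with the Beowulf
percent_as_fast 100 80   # SP2 relative to the Beowulf
```

The first call gives the "80% as fast" figure, the second the "25% faster" figure, and the third shows where 125 comes from: it is the SP2's speed as a percentage of the Beowulf's, not the amount by which it is faster.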

  • Set your comment threshold to -1 and quit whining...
  • In one project at the University of Texas at Austin, researchers in the Department of Petroleum and Geosystems Engineering have found that a Beowulf cluster with 16 400-MHz Pentium II processors could perform oil-reservoir simulation calculations about as quickly as a comparable SP2.

    Ok, doesn't the fact that they are comparable make them perform about the same? What factors were they using to determine if they were comparable? Case color? :)

    I just thought this was a little odd.

    paul
    ---------------------------------------
    The art of flying is throwing yourself at the ground...
    ... and missing.
  • I think it's a shame that this has been moderated to -1. That's not a flame after all. I know that "the community" won't agree on any single application of a free OS to be morally good or bad (and subsequently our licenses usually and rightfully do not include clauses to prevent code use by certain entities). However, I can understand why a person chooses to not jump up and down because of the use of Linux by an oil corp, and Slashdot should be a place where it's possible to express such concerns. I too would be more happy if the news read "Linux used in scientific effort to significantly reduce oil usage worldwide". OTOH, the chance that penguins are washed up on Norwegian shores is quite small, considering that Antarctica is quite far away from Norway :) Dead auks are not better, though.
  • Linux has proven itself in the reliability field for single nodes. Beowulf is still a fairly ad hoc thing, it may work for some things, maybe not for others. Linux just doesn't have the High Availability features some other OS's do, partly because of its origins on hardware that isn't itself HA. If you yank the plug on a single Beowulf node, it's likely to scream and die (the process that depends on it, the cluster in general should probably manage ... probably). Ironically, the folks who did implement this failover HA feature in some Linux cluster are none other than ... IBM.
  • FYI, Paralogic ( http://www.plogic.com/ [plogic.com]) is housed in Lehigh University's mountaintop campus. I'm pretty sure that there's no real affiliation, but I've seen this outfit. It is sharp.

    Isn't it interesting just how small the world seems to be some days?

  • As the member of the UT research group that actually assembled the cluster and ran the tests, I can say they were comparable in terms of execution time and speedup. With 300 MHz PIIs on 440LX motherboards, the cluster was about 1.8 times slower than the MHPCC SP, which uses 160 MHz P2SC thin nodes. Using 400 MHz PIIs with 440BX motherboards (and also 400 MHz Xeons), the cluster is as fast as the MHPCC SP.

    As for case color, the SP definitely outperforms the cluster. Rows of beige cases on metal shelves hardly compare to black refrigerator-sized boxes with more blinking lights. Maybe if we build a cluster of the new SGI boxes...

    I would have liked to see more details of our work in the article, but the reporter just wanted to verify that the Hess results weren't out of line. You can find more details of our results at http://topeka.cpge.utexas.edu [utexas.edu]. Feel free to email me if you'd like specific details that aren't on the webpage.


    -jason
  • I think this is definitely going to happen. I talked with the reporter who wrote this Computerworld article (I'm part of the group at the University of Texas that is mentioned at the end of the article). He said that Computerworld is mainly read by business-types who want to use technology to cut costs. I think Linux is just starting to be accepted in this community, and will soon take off there like it has in other areas.

    -jason
  • You don't need the rpms from beowulf.org unless you want to use extensions to the standard kernel such as channel bonding or distributed pids. Otherwise, just install your favorite distribution on several machines, grab MPICH or PVM, and start experimenting.

    Feel free to send questions my way if you need help with this. Or check out any of the documentation at http://www.beowulf-underground.org/ [beowulf-underground.org].

    -jason
  • Since they are saving so much money, do you suppose gas prices will go down now?

    Oh wait, what was I thinking?
  • Just as a side note... Linux DOES run kerberos (case in point CMU [cmu.edu] and MIT [mit.edu]) ... So it can be almost as secure (if not equally secure) as the SP.
  • Hardly think that is in the offing. Take a look at the new G6 specs and more importantly, IBM's financial expectations based on this system.
  • It really depends on the package. Certain packages have built-in functions to handle node failure. Try running Oracle with the PS option on Linux and you will see that it will not go down if only one machine is dead. Similarly with POV clustering. You have to ask yourself whether PVM and similar daemons can handle node failure. Do you know the answer?
  • The real question is: does the application handle node failure?

    PVM is just a bunch of functions that allow you to pass messages between threads. It contains functions to handle node failures. However, it is left up to the application programmer to catch them.
  • The PHBs are coming! The PHBs are coming!
  • Though the company could have saved at least hundreds of thousands of dollars by opting to set up Windows NT clusters, porting its Unix rendering application would have been a huge chore, Forsyth said. The application is about 2 million lines of code and might have taken years to rewrite for Windows, he said. "We thought about that for three nanoseconds."

    I was quite confused about this for a second. How could they possibly have saved anything by buying NT instead of using Linux? They would have to port 2M LoC rather than write a bit of custom management for the cluster.

    But then I realised that they would have saved (only) a few $100.000 if they used NT instead of the IBM SP2. Now they seem to save $1.870.000 by opting for Linux! (Minus the cost of writing custom code)

  • It said they could save a few hundred thousand dollars by going with NT. Since they only spent 130K on Linux, they clearly weren't talking about saving "a few hundred thousand" more than the Linux solution.

    Also, the few hundred K was apparently before the cost of rewriting everything.

  • Probably long enough that you'd never live to see the completion. 32 PIII's wouldn't do squat. As of May 4th, Distributed.net has the equivalent of 72,000 PII/266's running around the clock, and they are completing about .03% of the keyspace per day. Quite frankly, 32 PIII's don't amount to anything at that kind of scale.

    On the subject, I really like what they're doing with the cosm project [mithral.com]. Think generic distributed.net, people can put in their own projects, put up their own reward system, it's the world's largest beowulf cluster -- okay, not strictly speaking but it's still a hell of a lot of horsepower.
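
A rough back-of-envelope version of that estimate, using only the distributed.net figures quoted above (72,000 PII/266 equivalents, .03% of the keyspace per day). It assumes throughput scales linearly with node count and clock speed, so a 500 MHz PIII counts as 500/266 of a PII/266; that scaling is an assumption, not a benchmark, and rc5_years is a made-up helper name.

```shell
# rc5_years NODES MHZ: years for NODES machines at MHZ to sweep the whole
# RC5-64 keyspace, scaled from the distributed.net rate quoted above.
# Assumes linear scaling with node count and clock speed (an assumption).
rc5_years() {
    awk -v n="$1" -v mhz="$2" 'BEGIN {
        days_full = 100 / 0.03     # days for 72,000 PII/266s to cover 100%
        equiv = n * mhz / 266      # cluster size in PII/266 equivalents
        printf "%.0f\n", days_full * 72000 / equiv / 365
    }'
}

rc5_years 32 500
```

Under those assumptions the 32-node cluster would need on the order of ten thousand years to sweep the whole keyspace (about half that on average to hit the key), which is the point: it doesn't amount to much next to distributed.net.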
  • IBM built a Beowulf cluster themselves. They sell hardware mainly. If people are using their "NT" servers running linux in parallel, it's all good.
  • by trichard ( 28185 ) on Tuesday May 04, 1999 @08:03AM (#1905623)
    As an IBM SP specialist, and a Linux advocate I am impressed and intrigued by the Beowulf configuration built by Hess.

    The SP does have a fairly robust management toolset (PSSP) and uses kerberos with parallel commands for secure multi-node management.

    To be honest, I haven't read up on the capabilities of Beowulf but the impression I get from the article is that Hess had to write some apps to approximate the management capabilities of the SP.

    Two questions:

    1) Did Hess have to write these apps because Beowulf has a weak management system, or did they have to write them because Beowulf has a functional system that is different from what they were used to?

    2) Any chance that these management apps would be available under the GPL? Or even that more details about how they configured their systems will be posted?

    I'd love to be able to show my customers what Linux can do. Maybe Jeff Davis from the above post can shed some light on this.

    Ted Richardson
  • I find it rather curious that the article claims that CIOs seem to think that linux has not proven itself in the reliability/stability field. We're (at least in theory) talking about folks that "should" have some tech knowledge ... aren't they used to BSOD's? Rebooting?

    Perhaps somebody should organize a contest to test reliability. Anybody wanna bribe^H^H^H^H^H er... pay off^H^H^H^H ... uh ... I meant, ask Mindcraft to do it? ;)

  • by D3 ( 31029 )
    This warms my heart to see $150k vs. $2M gets _better_ performance. Nice specs on the systems too: 500 MHz PIII with 1G RAM each! I wonder how long the RC5-64 would hold up to this kind of power?
  • In the article they claim the IBM machine can be sub-leased out to another company and recoup costs. Also, if they did run out of their lease, they'd have to lease it again for another $2M not $150k so they did save quite a bit.
  • Yes, in order to save the wildlife, I've been fighting against strategic oil drilling my entire life, I never drive anywhere, buy any petroleum based products, or buy from anyone who does support petroleum... Wait, you yutz, unless you can find that magical elixir for all the world's (around 75%) energy needs, don't freak about Linux being used to drill a hole in the ground. Besides, I'd rather see a beowulf cluster being used to draw up detailed maps of the ocean floor than being used to play quake at 1600x1200 at 32bit color. Don't get me wrong, I think that we need to figure out alternate means, but right now we're slaves to our addiction to things like electricity, mobility, and civilization. If it weren't for oil drilling, you wouldn't even be running that fancy pants computer on your desk. Just a little rant. Zealots piss me off.
  • It's just you.
    Seriously though, while the article states that they would have saved several hundred thousand dollars by going to NT, they were able to save even MORE money by going with Linux. Also, it appears (and I may be reading too much between the lines here) that they also felt that the programming aspect of the move would be easier if porting to a *nix-like operating system, rather than to Windows.
    Just some thoughts from a guy ripping out all the NT server machines at his work and replacing them and the RS6000's with linux boxen.
  • the 3 nanoseconds comment about porting to NT is pretty funny. But I can imagine how frustrating it would be to have Dr Watson make a house call 1 week and 6 days into an analysis! ;)

    seriously though, I ain't no Howard Hughes, but it seems bizarre to me that from a budget of $250 million they spend .05 of a percent on the analysis equipment! and they say they could probably spend the money saved on better quality sea-floor information... well duh! surely decent information to start with is kind of important.

    Who's running that show, the Clampetts?
  • come on Al, fess up!
  • Thanks for insight on the Beowulf project at Amerada Hess. And thanks for posting the dsh script that you guys used.

    I have a few more questions:
    • Were there any negative aspects on implementing the Beowulf cluster? Management tools, documentation, etc.?
    • What things would you like to see improved in Linux in general?
    • Why the Red Hat distribution? Do you consider its RPMs better than Debian or SuSe installation methods, or were there other reasons for this choice?
    • Have you heard IBM's reaction?
    These questions are not intended to be Flame-bait material! I have examined various distributions for smaller projects, and I would like your opinion on why Red Hat was used. I have a couple of "business case" reasons for recommending Red Hat to certain clients, but I am wondering if there were any other technical or support-related issues involved in your decision.
  • I wonder how long IBM will be so supportive of Linux when it starts to cut into their high profit market. Although, I suppose with the right leadership, they'll roll with the punches and realize that markets change, and they won't get stuck like last time. Has anyone else noticed the "small company" attitude that IBM has been trying to create? Almost like, "We're big and reliable", but at the same time they're saying, "Hey, we're small and flexible, and new!" Just a few thoughts. Really this is just a ploy to get a score greater than 1.

    Vidar Leathershod
  • Yeah, it would only take *4* billion years!!!
  • Amen. Gas prices really aren't that bad! Look at what they pay in Europe, it's about four times as much as we pay here!
  • and my cynical ways, but did the article make it seem like they could have "saved a lot of money" by moving to NT clusters, but that was too much work, so they decided to use Linux instead? plus, i was just reading in dr. dobbs last night how linux is such a great choice for beowulf because of its stability (don't have to worry about keeping 30 or so machines up & running through the course of your simulation).
  • One group at Los Alamos did just that with previous generation (21164) Alpha boxes and were quite happy with the results. I understand that they're now looking at building a cluster from 21264s, which have significantly better FP performance. Pretty zippy little buggers.

    http://cnls.lanl.gov/Internal/Computing/Avalon/
