Cray CTO: Linux clusters don't play in HPC 435

jagger writes "Linux clustering was touted as the next big thing by many vendors last week at ClusterWorld Conference & Expo 2004. But supercomputer vendor Cray Inc. scoffed at the notion of putting Linux clusters in the high-performance computing (HPC) category. "Despite assertions made by Linux vendors, a Linux cluster is not a high performance computer," said Dr. Paul Terry, CTO of Cray Canada."
This discussion has been archived. No new comments can be posted.

  • Marketing (Score:5, Insightful)

    by Allen Zadr ( 767458 ) * <Allen.Zadr@g m a i l . com> on Tuesday April 13, 2004 @11:42AM (#8848731) Journal

    Paul Terry makes some good points in his statements, including the one partially quoted in the post: "Despite assertions made by Linux vendors, a Linux cluster is not a high performance computer," said Dr. Paul Terry, CTO of Cray Canada. "At best, clusters are a loose collection of unmanaged, individual, microprocessor-based computers."

    Remember to take this with a grain of salt. The inflammatory nature of the comment is nothing more than a marketing ploy to increase visibility of, and sell, the new Cray XD1.

    • by Total_Wimp ( 564548 ) on Tuesday April 13, 2004 @11:54AM (#8848941)
      "At best, clusters are a loose collection of unmanaged, individual, microprocessor-based computers."

      I'm sure Paul Terry is nothing more than a loose collection of unmanaged, individual human cells too. But I'm sure, with hard work and love, he can become a _real_ boy! Let's all have a hug.
    • Help me here... (Score:3, Insightful)

      by ScottGant ( 642590 )
      I'm a layman...I have no idea what I talk about, but of course that doesn't stop me.

      I know I keep coming back to Virginia Tech, but aren't all those G5s linked together to make the 3rd fastest supercomputer themselves a cluster? Or are they considered something else?

      And if it IS considered a cluster, then why wouldn't a Linux based cluster (along with the *BSD based G5s) be able to make a fast supercomputer?

      If so, then what Paul Terry is spouting is just FUD and marketing to help sell his product, yes?

      Just wondering.
      • Re:Help me here... (Score:5, Insightful)

        by maan ( 21073 ) * on Tuesday April 13, 2004 @12:04PM (#8849055)
        You're right in saying that the Virginia Tech cluster is the 3rd fastest supercomputer (LINPACK tests). I think that for some other tasks, however, it would be slower. Sure, they use InfiniBand as an interconnect (very fast & low latency), but that doesn't change the fact that it's many separate nodes, each with its own memory. So if one processor were to access some memory on a different node, it would slow things down a little.

        So depending on the task at hand, the cluster might perform very well, or perhaps a little less well. Cray supercomputers are a big number of processors all in the same machine, and more importantly all sharing the same memory. Each processor has the same delay to access any memory content.

        The argument in favor of clusters, however, is that it's still cheaper to throw more computers in than to buy a Cray that would perform the same task in less time.

        In the end, there's a lot of marketing involved in all of this...

        Hope this helps (and that I'm not completely wrong!),

        Maan
        • Re:Help me here... (Score:5, Interesting)

          by krlynch ( 158571 ) on Tuesday April 13, 2004 @12:17PM (#8849230) Homepage

          So depending on the task at hand, the cluster might perform very well, or perhaps a little less well.

          Surely what you meant to say is that, depending on the task at hand, a cluster might perform very well, or perhaps perform atrociously. :-)

          Clusters tend to work well when the various nodes don't need to communicate very often but you need lots of cycles for the subtasks, while dedicated supercomputers tend to perform very well in tasks requiring vast amounts of internode communications bandwidth along with large numbers of cycles. If you need vast bandwidth and relatively low numbers of cycles, your pricepoint is likely a mainframe. And if you don't need either, you get a cheap desktop machine.

          Certain problems parallelize well on a cluster ... others don't. Some don't parallelize at all, and a cluster won't do you a darn bit of good. The different machines are designed for different uses ... and one should be careful not to push a "one size fits all" solution. The Cray guy clearly got it wrong on that point, and likely knows it, but he was marketing, not teaching a course in choosing hardware for the task at hand.
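The observation that some problems parallelize well and others not at all is usually put in numbers with Amdahl's law. What follows is a minimal sketch, not anything taken from the thread: `parallel_fraction` is a hypothetical share of the work that can be spread across nodes, and communication cost is ignored completely, which flatters the cluster.

```python
def amdahl_speedup(parallel_fraction: float, nodes: int) -> float:
    """Ideal speedup on `nodes` processors when only `parallel_fraction`
    of the work can run in parallel (Amdahl's law, no communication cost)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / nodes)


if __name__ == "__main__":
    # The fractions below are illustrative values, not measurements.
    for fraction in (1.0, 0.99, 0.5, 0.0):
        speedup = amdahl_speedup(fraction, 1024)
        print(f"parallel fraction {fraction:.2f}: {speedup:.1f}x on 1024 nodes")
```

Under these assumptions a job that is only half parallel never reaches a 2x speedup however many nodes are added, while a fully serial job gains nothing at all.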

        • Re:Help me here... (Score:3, Interesting)

          by CatOne ( 655161 )
          It slows things down a little, yes, but it's not a huge difference. Infiniband can do DMA across machines -- so the memory on machine 2 *can* be directly accessed by the CPU on machine 1 (i.e. the CPU on machine 1 doesn't need to be consulted).

          Sure, this reduces peak efficiency. I think on the VT cluster it was in the 50-60% range (I could Google search but I'm lazy... shoot me)... that is, the total performance is about .5 or .6 times (2200 CPUs). This is pretty good, overall, compared to other system
      • There isn't a Cray system that can touch the brute parallel power of a big cluster like Virginia Tech's G5s. But depending on the kind of problem you're working on, there are Cray systems that would walk all over that G5 cluster.

        With problems that can be split up into hundreds or thousands of more-or-less independent subtasks, a cluster is the way to go. But for problems that can't be divided up like that, a smaller system with a few very tightly coupled extremely fast vector processors, like what Cray s
        • Valid Question, then (Score:3, Interesting)

          by Allen Zadr ( 767458 ) *

          At what price point does the Cray XD1 come in? While huge clusters are (supposedly) cheap individual computers -- I would argue that G5s are not inherently cheap -- how many G5s that make up the Virginia Tech cluster would you have to get to before you've paid for a Cray XD1?

          I mention this because the article implies that Cray is planning on selling the XD1s at a price point cheaper than equivalent clusters. If they succeed at making the XD1 cheap enough, then it may be more cost effective to [[ effectiv

      • Re:Help me here... (Score:3, Interesting)

        by starm_ ( 573321 )
        I'm not that familiar with HPC either but I'll try to explain what I know in layman's terms. A cluster is nothing more than a bunch of computers networked together intelligently with an OS that is capable of separating tasks between these computers. A Cray, on the other hand, acts more like one big computer. Like the cluster, it also has hundreds of CPUs but they are all on the same "motherboard" (if you can call it that). Some of them share memory. The memory is very high speed. (sometimes in configuration equivalent
  • by Anonymous Coward on Tuesday April 13, 2004 @11:43AM (#8848740)
    ...a Beowulf... cluster... thingy... doesn't that count?
  • by JargonScott ( 258797 ) on Tuesday April 13, 2004 @11:43AM (#8848751)
    A quote I've seen before:

    "If you were plowing a field, which would you rather use? Two strong oxen or 1024 chickens?"

    Maybe he meant penguins?
  • Business or science? (Score:2, Interesting)

    by grub ( 11606 )

    Dr. Terry's assertions remind me of a Seymour Cray quote I had as my /. sig a while back:
    "If you were plowing a field, which
    would you rather use? Two strong oxen or 1024 chickens?"

    I'm not picking a side, it just seems interesting that the Cray CTO would echo Seymour's thoughts. I guess it's for business and marketing reasons though, sadly.
    • Even though the chickens might be 80% more efficient, there are other considerations: Can you imagine the ridicule you'd get when you went into town?

      "Here he comes, get ready boys! Cluck cluck cluck cluck cluck, here chickey chikcey, Haw Haw Haw!", etc.

      BTM
  • CTO of Cray? (Score:2, Interesting)

    by shachart ( 471014 )
    You did notice he is the CTO of Cray... Canada??
  • by stevens ( 84346 ) on Tuesday April 13, 2004 @11:45AM (#8848776) Homepage
    Company officer claims competitor isn't as good as his product. Film at 11.
  • by heironymouscoward ( 683461 ) <heironymouscowar ... .com minus punct> on Tuesday April 13, 2004 @11:45AM (#8848779) Journal
    Oracle disclaim MySQL and PostgreSQL as "toy databases", Microsoft claims that "Apache cannot be used for real web serving", and Sun announces that "Intel and Linux simply cannot be used for enterprise computing".

    So all those supercomputing labs that use Linux clustering (that invented Linux clustering, even) have been wasting their time?
    • by dasmegabyte ( 267018 ) <das@OHNOWHATSTHISdasmegabyte.org> on Tuesday April 13, 2004 @12:54PM (#8849700) Homepage Journal
      All of those statements are true. And a cluster is not a mainframe, and the products sold by Oracle, Microsoft and Sun *DO* go far beyond their Open Source competitors in terms of functionality.

      The problem for these guys is that, in terms of real world enterprise usage, not everybody needs the features they offer. My business doesn't need the easy management and clustering features in IIS; heck, the website hasn't been updated in months and this time last year nobody even knew which machine it ran on. We don't need the task scheduling, file striping, data transformation, replication or XML features of Oracle. In fact, we only need a tiny sliver of the possible functionality of these great products...but we're unable to pay a sliver of the price. With OSS ramping up its feature set daily, for a lot of companies with our needs it makes more sense to train a guy on Linux than to drop five digits on Windows Server 2003 and SQL Server.

      As for supercomputing...well, a cluster is NOT a mainframe. They're two similar, but different things, with the main difference being the databus. If your task is to perform a lot of calculations on a trivial dataset, clustering is the way to go. If your task is to perform a few calculations on a massive dataset, you want a mainframe. The mainframe is simply more efficient at processing massive inputs and providing massive outputs because it was designed to efficiently pass data between processors -- give the same dataset to a cluster and most of your time is wasted negotiating the network.

      Of course, these days networking is so fast that a cluster will probably do for most of the things people used to do on mainframes...but a cluster is still best for tasks which are easy to split apart and process in pieces.
    • Oracle disclaim MySQL and PostgreSQL as "toy databases"

      Yup kid, and for very good reasons. Take some time off from playing with your toys and register at OTN to learn what the Oracle 8i and 9i databases are capable of. You'd literally blow your head off if you saw what their Apps are capable of.

      For a start, consider 4 page long nested views that pull data from more than 40 tables, some of which contain upwards of a million rows. And these views are accessed thousands of times a day in their apps.

      Here is an

  • Are too (Score:5, Interesting)

    by Anonymous Coward on Tuesday April 13, 2004 @11:45AM (#8848780)
    "Most cluster [experts] know now that users are fortunate to get more than 8% of the peak performance in sustained performance."
    Tell that to PIXAR. I don't believe it either.

    I guess that the simple problem is just that the algorithm applied is usually not suitable for massively parallel computing.

    • Re:Are too (Score:3, Insightful)

      by Technician ( 215283 )
      Tell that to PIXAR. I don't believe it either.

      Ya beat me to that one. I won't post it because it would be modded redundant, but I would have mentioned Google also.
    • Re:Are too (Score:5, Insightful)

      by dead sun ( 104217 ) <aranach@gma i l .com> on Tuesday April 13, 2004 @12:06PM (#8849083) Homepage Journal
      Pixar doesn't need telling, their problem breaks up so miraculously well that they'll see the best performance you could possibly expect from a cluster. The big problem, rendering a movie, decomposes into thousands of small problems, rendering a frame. Each machine in their cluster can handle a group of frames at a time with zero need to communicate or worse, share computation, with other machines in the cluster. It's the best case scenario.

      Many other computing problems don't decompose nearly so nicely. So there are certainly problems that probably won't see more than 8% of peak performance. If you were particularly inclined you could probably invent a problem that had to be done serially, leaving percent of peak performance equal to what percent of your cluster one box was. Cray is right to that extent and if you're solving a problem that falls into the category of not easily parallelized then perhaps one of their machines is the better tool for the job. But, like you mention there are instances where the cluster is a great tool and cost effective to boot.

      Heck, ever check out some of the faster interconnects like Myrinet? They're insane and exist because fast ethernet just doesn't cut it in some places. Just using a slow interconnect is enough to bring real performance down below theoretical peak. Luckily for Pixar off the shelf fast or gigabit ethernet is likely enough.

      Anyway, use the best tool available. If your problem falls into the category of trivially parallelizable like rendering a movie is then don't bother wasting your money on a Cray. If your problem isn't suited to a cluster, however, then maybe a cluster isn't the right answer. If you have a big problem that needs serious computation take the time to figure out what you need before taking a marketing drone's spiel for gospel in your situation.
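The "best case scenario" described in this comment, where each machine takes a group of frames and never talks to the others, can be sketched as follows. This is only an illustration: `render_frame` is a hypothetical stand-in for a real renderer, and worker threads stand in for cluster nodes, so it shows the decomposition rather than any real speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def split_frames(total_frames: int, nodes: int) -> list[range]:
    """Divide frame numbers 0..total_frames-1 into one contiguous chunk per node."""
    base, extra = divmod(total_frames, nodes)
    chunks, start = [], 0
    for node in range(nodes):
        size = base + (1 if node < extra else 0)  # spread the remainder evenly
        chunks.append(range(start, start + size))
        start += size
    return chunks


def render_frame(frame: int) -> str:
    """Hypothetical renderer: needs no data from any other frame."""
    return f"frame-{frame:06d}.img"


def render_chunk(frames: range) -> list[str]:
    """Work done by one node: render its own frames, in isolation."""
    return [render_frame(frame) for frame in frames]


def render_movie(total_frames: int, nodes: int) -> list[str]:
    """Render all chunks independently, then concatenate them in frame order."""
    with ThreadPoolExecutor(max_workers=nodes) as pool:
        results = list(pool.map(render_chunk, split_frames(total_frames, nodes)))
    return [name for chunk in results for name in chunk]
```

Because no chunk reads another chunk's output, the only coordination is handing out frame ranges at the start and collecting files at the end, which is why an ordinary interconnect is enough for this kind of job.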

      • Re:Are too (Score:4, Interesting)

        by bugnuts ( 94678 ) on Tuesday April 13, 2004 @01:27PM (#8850146) Journal
        All tests for the top 500 supercomputers [top500.org] are done solving a problem using Linpack [top500.org], not some trivially parallel code such as raytracing 100,000 frames of a movie.

        Message passing is the biggest issue with such solvers, and in a way, Cray was absolutely right about Linux, although misleading. There are some tests going on now with a modified Linux kernel for doing true HPC, and it's been done in the past (I know, I've used it). Things like disk swapping pretty much immediately disqualify you for high performance computing. It has its place, of course; trivially parallelizable codes are one example (Pixar).

        Myrinet was out before Gbit ethernet was really available, and also has some nifty routing capabilities. And since the bottleneck for HPC is usually message passing, high performance computing will better realize its theoretical performance as the communication speed catches up to the processor speed.

        But, to Cray's discredit, making a blanket statement that Linux can't do HPC is like saying Macintoshes can't do HPC. [top500.org]
      • Re:Are too (Score:4, Informative)

        by GauteL ( 29207 ) on Tuesday April 13, 2004 @06:27PM (#8853929)
        In fact even rendering a single frame decomposes easily into lots of separate tasks, because it involves raytracing backwards from every single pixel trying to calculate its colour. And furthermore, each of these tracings is completely separate and just begs to be parallelised.

        Raytracing is sometimes referred to as "embarrassingly parallel", because of this.

        Mathematical dependencies are the real destroyer of parallelism. Any situation where the next calculation depends on the result of the previous one is a typical serial calculation that would do badly on any super-computer and might as well be run on a single scalar processor like the Athlon or P4.
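The contrast drawn here between independent per-pixel work and a calculation where each step depends on the previous one can be shown with two small functions. Both are illustrative assumptions: the "shading" formula and the logistic-map recurrence were chosen only because they are short, not because the thread mentions them.

```python
def shade_pixels(pixels: list[int]) -> list[int]:
    """Each output depends only on its own input, so every element could be
    computed on a different node, in any order."""
    return [(p * p) % 256 for p in pixels]


def iterate_recurrence(x0: float, steps: int) -> float:
    """Each step needs the previous result, so step n cannot start before
    step n-1 has finished, no matter how many processors sit idle."""
    x = x0
    for _ in range(steps):
        x = 3.7 * x * (1.0 - x)  # logistic map, used purely as an example
    return x
```

Reordering or splitting the input of `shade_pixels` changes nothing about the individual results, whereas `iterate_recurrence` offers no step that can be handed to a second machine.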
  • by Anonymous Coward on Tuesday April 13, 2004 @11:47AM (#8848811)
    Regardless of whether I agree with the article or not I feel compelled to point out that:

    The 1100 node Apple G5 cluster in Virginia has yet to run any real scientific code. So far it has only run benchmarks.
  • by WarlockD ( 623872 ) on Tuesday April 13, 2004 @11:47AM (#8848813)
    "We are dropping our line of Cray supercomputers and replacing them with rack mounted Beowulf cluster of 486's!"

    I am not saying Cray isn't worth it, but there is something to be said on replacing/fixing your supercomputer with over the counter parts.
    • yeah, like how to up your power bill by a factor of 10 or more. Seriously, think how much condensed power a Cray computer has that doubles as a desk or bench. How many top of the line G5 Macs would it take to equal that, and compare the power requirements and size. I'm not saying that the Linux supers are weak, but that in some cases it is easier to use Crays because you can pack more of them into a smaller area with less power, and get better results per cubic foot and kilowatt.

      Also remember, these gu
  • by huhmz ( 216967 ) on Tuesday April 13, 2004 @11:47AM (#8848814)
    Reading the article, it's fairly obvious that Cray's CTO has an agenda. However, assuming he's right, what does play in HPC? Cray Proprietary Cluster OS (TM) or what?
  • Sure... (Score:5, Insightful)

    by avalys ( 221114 ) * on Tuesday April 13, 2004 @11:47AM (#8848821)
    In other news...

    "Despite assertions made by Toyota salesmen, a Lexus sedan is not a luxury car," said Bill Taylor, CEO of Mercedes-Benz.
    • Re:Sure... (Score:2, Funny)

      by boisepunk ( 764513 )
      "Windows(TM) has a lower TCO than Linux"

      -Microsoft ad campaign

      (mods: don't hurt me. I mean nothing but to contribute to good discussion.)
  • He's got a point (Score:5, Insightful)

    by PissingInTheWind ( 573929 ) on Tuesday April 13, 2004 @11:47AM (#8848825)
    Clusters can get high performance on some types of tasks. But sometimes, you need fine-grained parallelism that just isn't available on a cluster.

    On the other hand, high performance usually comes through special hardware. And on that hardware, I think Linux could be the right thing (modulo some patches).
    • Clusters can get high performance on some types of tasks. But sometimes, you need fine-grained parallelism that just isn't available on a cluster.
      And sometimes you need high performance on code that doesn't have fine-grained parallelism, meaning a Cray vector machine would be slow. So I guess Crays aren't High Performance Computers either.
  • Flame Bait? (Score:3, Funny)

    by Natchswing ( 588534 ) on Tuesday April 13, 2004 @11:48AM (#8848833)
    This is news? This is the equivalent of posting, "My AMD processor is better than your Intel processor!" It's a quote designed to ignite a fact-less argument on who has bigger ones.

    Now, if the CTO of Cray Canada started talking about your mother then I think you're morally entitled and required to respond.

  • Of course he is going to say this. He is an exec at Cray, what would he say "Oh yeah, our machines are good at HPC but of course you could build a Beowulf cluster fairly cheaply and efficiently and you wouldn't have to rely on us to do it."

    Cray used to be a big name in computing but unfortunately for them, they are a relic now. They had their day and it is hard to believe that they will be able to compete effectively against Beowulf clusters and the Linux mainframes that IBM is pushing. With IBM's public love and
  • First they laugh at you... check!

    I guess all those universities using Linux clusters are a figment of our imaginations.
  • by Anonymous Coward
    It's not the vendors who are claiming that Linux clusters are real supercomputers, it's the people who are using them to do real supercomputer work. They sell themselves based on actual price and performance.

    Methinks Cray is feeling a little threatened...
  • Maybe so (Score:3, Insightful)

    by pair-a-noyd ( 594371 ) on Tuesday April 13, 2004 @11:49AM (#8848858)
    "Despite assertions made by Linux vendors, a Linux cluster is not a high performance computer,"

    Maybe so but not everyone can pull a Cray out of his ass when they need horsepower. A Linux cluster is affordable, a Cray is the thing of wet dreams..
  • Despite assertions made by Linux vendors, a Linux cluster is not a high performance computer, said Dr. Paul Terry, CTO of Cray Canada. "At best, clusters are a loose collection of unmanaged, individual, microprocessor-based computers."

    I guess they're not happy about being only #19 on the Top 500 Supercomputer List [top500.org]. Linux is considered faster than they are according to the list.

    The ol' ad-hominem attack of "if you can't beat them legitimately, attack them personally" just doesn't cut it, Paul. Build a be
    • by flaming-opus ( 8186 ) on Tuesday April 13, 2004 @12:34PM (#8849429)
      Cray could easily be at or close to the top of the top500 list; their X1 architecture will extend that far. However, for a lot of really important supercomputing codes, it's no contest: the Cray will trounce the clusters (Linux or otherwise). Those #19 Crays are only 256 processors. To get similar performance a stack of Xeons requires thousands of processors. Some tasks just can't be split apart that easily.

      A Cray processor has eight floating-point units running at 800MHz. The big Mac cluster (for example) uses G5 processors which have 2 FPUs at 2000MHz. Thus the Cray has a ~40% advantage. However, the G5 processor has ~4GB/s memory bandwidth. The Cray has ~50GB/s memory bandwidth. If you have a problem that needs to do a HUGE amount of math on a tiny amount of data, the G5 will rock. If you have a problem that needs to do a HUGE amount of math on a GINORMOUS amount of data, buy the Cray. (for a GINORMOUS amount of money too)

      Similarly, InfiniBand (à la the big Mac) is really hot in the cluster interconnect space because it gives 2.5GB/s per node. The Cray gives you 51GB/s.
      You need to move a little data, buy a cluster. You need to move a lot of data, buy the Cray.

      There's no one solution for all problems.
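The arithmetic behind the FPU and memory-bandwidth comparison in this comment can be worked through directly from the figures it quotes. The sketch below assumes one operation per FPU per clock cycle, which is a simplification introduced here and not something the comment states; the helper names are likewise made up for illustration.

```python
def peak_ops_per_second(fpus: int, clock_hz: float) -> float:
    """Naive peak rate: assumes one operation per FPU per clock cycle."""
    return fpus * clock_hz


def bytes_per_op(memory_bandwidth: float, peak_ops: float) -> float:
    """Bytes of memory traffic available for each peak operation."""
    return memory_bandwidth / peak_ops


# Figures as quoted in the comment above.
cray_peak = peak_ops_per_second(fpus=8, clock_hz=800e6)
g5_peak = peak_ops_per_second(fpus=2, clock_hz=2000e6)
cray_bytes = bytes_per_op(50e9, cray_peak)
g5_bytes = bytes_per_op(4e9, g5_peak)

if __name__ == "__main__":
    print(f"peak ratio (Cray / G5): {cray_peak / g5_peak:.2f}")
    print(f"bytes per operation, Cray: {cray_bytes:.2f}")
    print(f"bytes per operation, G5:   {g5_bytes:.2f}")
```

On this naive count the Cray processor's peak is 1.6 times the G5's (equivalently, the G5 figure is a little under 40% lower), but the Cray can feed each operation with several times more memory traffic, which is the distinction the comment draws between "a tiny amount of data" and "a GINORMOUS amount of data".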
  • by yppiz ( 574466 ) on Tuesday April 13, 2004 @11:49AM (#8848866) Homepage
    The Cray CTO makes the point that Linux clusters get, at best, just under 10% peak as sustained performance and uses this as a justification that Linux clusters are not HPCs. This is a reasonable criticism. Let's take the percentage he cites as real for a moment. Now what is the cost difference between a Linux cluster and a Cray (not some future offering, but today) and how much more of a Linux cluster could you afford? Would that offset the quoted inefficiency? Would the flexibility of being able to use commodity components further offset any advantage Cray might have? What about 24hr or same-day parts replacement without a hyper-expensive service contract? At the end of the day, I suspect the Linux cluster wins out even given the sub-10% efficiency figure Cray cites. --Pat / zippy@cs.brandeis.edu
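The question posed here, whether a cheaper cluster can buy back its lower efficiency, comes down to sustained performance per dollar. A minimal sketch of that comparison follows; the function names are invented for illustration, and any efficiency or price plugged in for the Cray side would be a placeholder, since the thread gives neither.

```python
def sustained_per_dollar(peak_gflops: float, efficiency: float, price: float) -> float:
    """Sustained GFLOPS bought per dollar, where efficiency = sustained / peak."""
    if price <= 0:
        raise ValueError("price must be positive")
    return peak_gflops * efficiency / price


def break_even_price_ratio(cluster_efficiency: float, other_efficiency: float) -> float:
    """How many times cheaper per peak GFLOPS the cluster must be in order to
    match the other machine on sustained GFLOPS per dollar."""
    if cluster_efficiency <= 0:
        raise ValueError("cluster_efficiency must be positive")
    return other_efficiency / cluster_efficiency
```

For example, taking the roughly 8% sustained figure quoted elsewhere in the thread for clusters and a purely hypothetical 40% for the competing machine, `break_even_price_ratio(0.08, 0.40)` says the cluster has to cost about one fifth as much per peak GFLOPS to come out even, before service contracts and parts replacement are counted.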
  • the list (Score:3, Funny)

    by hakr89 ( 719001 ) <8329650d-c1bd-41 ... fbec8928.faku@me> on Tuesday April 13, 2004 @11:50AM (#8848874)
    well i guess now we'll have to add them to the list...after SCO and Microsoft of course
  • Funny... (Score:3, Interesting)

    by Greyfox ( 87712 ) on Tuesday April 13, 2004 @11:51AM (#8848887) Homepage Journal
    Works for Google...

    Wish I'd been there so I could have slapped him after about 3 seconds of stunned silence.

  • Marketdroidspeak again. Linux clusters have been a pretty common technique here for many (10?) years. Back then, you could call a bubble sort algorithm "research" if you ran it on a beowulf.
  • Problem (Score:5, Interesting)

    by rawgod0122 ( 574065 ) on Tuesday April 13, 2004 @11:51AM (#8848896)
    It all depends on the problem you are trying to solve. I have been doing some work of late that would not complete in my lifetime on the 108 node cluster that we have. But when programmed for and run on two Cray X1s it should complete inside of a week.

    Granted there are many codes (and more every day) that will run on clusters, the big iron will never die.
  • by foooo ( 634898 ) on Tuesday April 13, 2004 @11:52AM (#8848908) Journal
    Just because we love Linux doesn't mean that clusters are HPCs.

    There are real issues that differentiate mainframe/supercomputers from large, powerful, clusters.

    Of course this all depends on your definition of an HPC. But I believe that it's reasonable to say that if parts of your computer are connected with low bandwidth connections (10/100, gigabit) they just can't handle the same kinds of transactions as a computer whose parts are connected by 10 gigabit or 1000 gigabit connections, or whatever it is nowadays.

    As far as I know if you're deploying a large database it's still advisable to have a big huge IBM mainframe or a Unisys box or a Sun 10k instead of 4, 8 or 16 clustered 8 proc machines.

    My point is there are valid arguments for not including clusters of commodity hardware in the HPC category.

    In my mind they aren't High Performance Computers... they are High Performance Clusters of Commodity Computers.

    ~foooo
  • by Meor ( 711208 ) on Tuesday April 13, 2004 @11:52AM (#8848909)
    Where's your God now hippies? Where's your God now?
  • by ERJ ( 600451 ) on Tuesday April 13, 2004 @11:53AM (#8848915)
    How well a cluster will do depends on the application that it is performing. Some problems can be divided into several small problems with little reliance on other parts of the problem (SETI / Encryption breaking). These things can be easily distributed to hundreds or thousands of "small" boxes for processing and are what a beowulf cluster would be good at.

    Other applications require the breakneck interconnect speeds that large Cray / Sun / etc. machines are built on. When the data being calculated on one CPU requires data from CPU2 to continue its calculations, you don't want to have it wait at 100mbit or even 1gbit ethernet speeds. Even quicker interconnects such as SCALI [scali.com] are going to be slowed by PC bus speeds.

    Cray fills an important niche for those who can afford it.
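The cost of waiting on the interconnect, as described in this comment, can be estimated with the usual latency-plus-bandwidth model. This is a rough sketch: the model ignores protocol overhead and contention, and any latency value passed in is an assumption, since the comment gives only the 100mbit and 1gbit link speeds.

```python
def transfer_seconds(message_bytes: int, bandwidth_bits_per_s: float,
                     latency_s: float = 0.0) -> float:
    """Time to move one message: fixed latency plus size divided by bandwidth."""
    if bandwidth_bits_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return latency_s + (message_bytes * 8) / bandwidth_bits_per_s


if __name__ == "__main__":
    one_megabyte = 1_000_000
    for name, bits_per_s in (("100mbit", 100e6), ("1gbit", 1e9)):
        seconds = transfer_seconds(one_megabyte, bits_per_s)
        print(f"1 MB over {name} ethernet: {seconds * 1000:.1f} ms")
```

For large messages the bandwidth term dominates, while for the small, frequent exchanges typical of tightly coupled codes the fixed latency term is what a CPU spends its time waiting on.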
  • Different tools (Score:5, Insightful)

    by BoneFlower ( 107640 ) <anniethebruce@ g m a i l . c om> on Tuesday April 13, 2004 @11:53AM (#8848923) Journal
    The comment was stupid, yes, but not all jobs that you'd use supercomputers for can be broken down into many threads as others can. A linux cluster will do well for some jobs, a cray box will do well for others. There *will* be times when a Cray system is so far superior to anything you could do with Linux that it becomes the only real option.

    However, dismissing linux cluster technology automatically is dumb. In many cases, it provides more than enough cpu power and I/O bandwidth to support your reason for getting a supercomputer, and probably at less cost than the other options.

    It's all a matter of determining what you need the computer to do, determining your budget, and getting the best system in your budget for the uses you have for it. Sometimes that will be a Cray, sometimes a Linux cluster.
  • After all, he's in the competition. Duh.

    The simplest rebuttal is that it's not what you call it that matters but what you can do with it. And, judging by the ubiquitous deployment of linux clusters, the answer seems to be "almost anything under the sun".

  • Says who? (Score:4, Funny)

    by dagnabit ( 89294 ) on Tuesday April 13, 2004 @11:55AM (#8848953)
    Who is this guy and what does a company like Cray know about... oh... never mind.
  • by LostCluster ( 625375 ) * on Tuesday April 13, 2004 @11:55AM (#8848959)
    Clusters can rival a supercomputer when they are assigned a task that's suitable for distributed computing. That is, work units can be divided up and worked on in any sequence... the result of segment 45 doesn't depend on knowing the result of 44 and such. Effectively, you can have the sum of all of the processors minus just a little overhead for the clustering.

    What Cray's rightfully pointing out is that for most business applications, however, distributed computing is not a viable option. When processing on a transaction basis, the transactions often need to be posted in the exact order they were received, which means they must be taken serially. In those situations, the programs can't multithread work out to the other processors so well, and the cluster will end up running at roughly the speed of just one processor while the others waste clock cycles waiting for something to do.

    The cluster isn't the solution to everything. Nor is the supercomputer. You've gotta think about the job, then figure out which tool is right for the task.
    • No supercomputer (cluster or traditional) is going to work well if your app can't multi-thread as none of them derive their power from a small number of super powerful CPUs. For that you want something more like a traditional mainframe (and guess what, many banks still use them). The real difference between the Cray model and the cluster model is shared vs separate memory. The question becomes "can your application be broken down into small chunks which are entirely self-contained". So rendering a movie wor
  • by RedLeg ( 22564 ) on Tuesday April 13, 2004 @11:56AM (#8848965) Journal
    Buy MY droids instead..... Move along.....


    His rhetoric is quite predictable, actually. He talks at some length about how and why clusters of PCs can't get the job done, and how clustering is inherently inferior to a REAL SuperComputer, then goes on to describe how their new product (which sounds surprisingly like a cluster of proprietary machines) can work. Repeat the above as it applies to the management software.


    If clustering doesn't work, and Supers are better / cheaper, explain why large companies (Pixar, NVidia, ...) and Government Labs (Los Alamos National Labs, Sandia National Labs, ...) have invested, and are continuing to invest in and support their clusters.


    Note that this does NOT mean that clusters are suitable for ALL traditional SuperComputing tasks. It really depends on the problem. If the problem is better solved with a vector processor, then a vector machine (like a Cray) is what you want. If the problem is solvable in parallel, then a cluster might be the right answer.

  • by Saeed al-Sahaf ( 665390 ) on Tuesday April 13, 2004 @11:59AM (#8849009) Homepage
    Dr. Paul Terry's comments are obviously self-serving, especially since, in a way, the Cray XD1 is itself based on multiple AMD processors rather than proprietary Cray processors. Still, he does have a point about the overhead of running the OS on each machine in a cluster, and in the statement "The Cray XD1 is not a traditional cluster; it does not use I/O interfaces for memory and message passing semantics."

    In truth, such a machine will always have a certain performance advantage over traditional clusters. The question is, will the price point be low enough to invalidate the idea of just adding more boxes to the traditional cluster?

  • by Richard Mills ( 17522 ) on Tuesday April 13, 2004 @12:01PM (#8849024)
    While I certainly disagree that you can't build a very high performance computer out of a cluster of computers (Linux or otherwise), there is a lot of merit to the fact that clusters just don't scale well for certain classes of applications. Hence the renaissance of the vector supercomputer (à la the Earth Simulator [jamstec.go.jp]).

    Obviously, this guy is plugging the new Cray X1 architecture, which really is quite promising. For instance, check out this paper [sc-conference.org] by some folks at Oak Ridge National Lab that appeared in Supercomputing 2003.

    Of course, since this is Slashdot, I expect that there will be a deluge of posts decrying everything about the new Cray machine because it commits the cardinal sin of NOT USING LINUX. Oh, the horror!
  • by Anonymous Coward on Tuesday April 13, 2004 @12:02PM (#8849036)
    The "interconnect" latency (especially) and bandwidth in a cluster, even using very high-end network hardware, is much worse than that of a Cray-style supercomputer. This does make certain applications run slower, especially if they are not specifically tailored to a clustered architecture. Some applications are very difficult to break down into small pieces and require extensive memory sharing between nodes, which clusters just can't do well.
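
    The latency-versus-bandwidth point above can be sketched with the usual "startup cost plus size over bandwidth" model of a message. This is only an illustration: the two sets of link figures below are hypothetical round numbers, not measurements of any real cluster or Cray interconnect.

```python
def transfer_time(message_bytes, latency_s, bandwidth_bytes_per_s):
    """Time to deliver one message: fixed startup latency plus size / bandwidth."""
    return latency_s + message_bytes / bandwidth_bytes_per_s


# Hypothetical link parameters, chosen only to show the shape of the effect.
CLUSTER_LINK = {"latency_s": 50e-6, "bandwidth_bytes_per_s": 100e6}
TIGHT_LINK = {"latency_s": 2e-6, "bandwidth_bytes_per_s": 1e9}


def slowdown(message_bytes):
    """How many times slower the cluster link is for a message of this size."""
    return (transfer_time(message_bytes, **CLUSTER_LINK)
            / transfer_time(message_bytes, **TIGHT_LINK))


if __name__ == "__main__":
    for size in (8, 1_000, 1_000_000):
        print(f"{size:>9} bytes: cluster link is {slowdown(size):.1f}x slower")
```

    Under these assumed numbers the gap is widest for tiny messages, where latency dominates, and narrows toward the bandwidth ratio for large ones. That is one way to read why codes that exchange many small updates suffer most on a cluster.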
  • by nacks1 ( 60717 ) on Tuesday April 13, 2004 @12:05PM (#8849065) Homepage Journal
    I happen to work in a facility that has had both large supercomputers (cray t3e, j90, sgi) and linux and *nix based clusters (beowulf/linux, compaq/Tru64). The Cray CTO is correct that you can't just call every linux cluster out there HPC. Just about anyone with networking and linux knowledge can build a linux cluster.

    What really makes a difference between an HPC cluster and your normal every day cluster is the hardware interconnects used. There is a comment in the article that refers to not using I/O for memory and message passing. I am not quite sure what he means by that, but I am guessing that he is saying that the network is not used for shared memory/message passing (MPI/openMP/SHMEM).

    If a cluster can limit the impact of latency between nodes either through smarter software or faster interconnects then I can't see any reason not to consider a linux cluster as HPC.

    Clusters without smarter software tend to be really difficult coding platforms. Some developments with things like globally shared memory might make the difference, but there will still be the problem of latency between nodes.
  • by infinite9 ( 319274 ) on Tuesday April 13, 2004 @12:11PM (#8849145)
    "Despite assertions made by Linux vendors, a Linux cluster is not a high performance computer,"

    That's like saying that the automobile is not a high performance team of clydesdales. That's true, but it may be irrelevant. If it can get you there faster or better, I guess it doesn't matter.
  • Yawn. (Score:3, Insightful)

    by adrianbaugh ( 696007 ) on Tuesday April 13, 2004 @12:29PM (#8849370) Homepage Journal
    For some tasks distributed clusters are better, for others ultra-high-bandwidth Cray-type monsters are better. So what's new?
  • by 4of12 ( 97621 ) on Tuesday April 13, 2004 @12:31PM (#8849392) Homepage Journal

    but practically, the performance is "high enough" and certainly a helluva lot cheaper than buying a custom system.

    It's just like the old days, except more so:

    Performance = log(Price)
    and you can end up paying a lot of money to squeeze out that extra performance.

    Given that Linux clusters can achieve speeds in excess of a teraflop, that available dollars for computer purchases are finite, and that per processor performance and price performance is increasing, the market size for the world's highest performing machine is rapidly vanishing to a set of measure zero.
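
    Taking the "Performance = log(Price)" rule of thumb above literally (it is the commenter's shorthand, not a measured law), a quick sketch shows why the last bit of performance gets so expensive. The budgets and units below are arbitrary assumptions.

```python
import math


def performance(price):
    """The rule of thumb taken at face value: performance = log(price)."""
    return math.log(price)


def gain_from_doubling(price):
    """Extra performance bought by doubling the budget from `price`."""
    return performance(2 * price) - performance(price)


if __name__ == "__main__":
    for budget in (10_000, 1_000_000, 100_000_000):
        extra = gain_from_doubling(budget)
        share = extra / performance(budget)
        print(f"budget {budget:>11}: doubling adds {extra:.3f} ({share:.1%} more)")
```

    Under this model every doubling of the budget buys the same fixed increment, log(2), so the relative improvement keeps shrinking as the machine gets more expensive.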

  • by Performer Guy ( 69820 ) on Tuesday April 13, 2004 @12:38PM (#8849474)
    A lot of what he said isn't much of a surprise. However, definitive statements about clusters not being supercomputers and being unmanaged loose collections of machines are a bit overblown. Management software exists for clusters, and they are rather easy to program for with available popular and industrial strength libraries.

    Moreover many HPC applications actually scale quite well on clusters of Linux systems. Affordable interconnect infrastructure is increasing in bandwidth and reducing in latency, further broadening the scope of the problems these clusters can tackle. In addition each node can now comfortably have two or four processors, giving even better bandwidth between CPUs sharing a node. With 64 bit processors and operating systems now available, the final barriers to very impressive, easy to use HPC Linux clusters have been removed, which is exactly why Cray now sees them as a threat. Now is probably the worst time to talk of how a cluster is not a supercomputer. Clusters form a class of supercomputer that can now handle most supercomputer tasks. True, there are classes of problems that the dedicated supercomputer systems Cray sells will excel at; however, clusters are useful workhorses in the supercomputer world and hold their own.

    Today's supercomputer problems are tomorrow's computer problems, and Cray must continue to find new classes of problems to solve as they always have, rather than attacking competing technologies. People will use clusters where the clusters meet their needs.
  • by jellomizer ( 103300 ) * on Tuesday April 13, 2004 @12:39PM (#8849486)
    True parallel programming on computers with more than 16 or so CPUs requires a slightly different mindset than the way most people program. In PP you can write a sort routine that runs in O(log(x)) given enough processors, while on a one-processor system the best you can do is O(x log(x)). Most programs today that are threaded tend to run a bunch of code on one processor with its own memory. That is much the way that Linux clusters work: you write programs that minimize the amount of communication needed so that they provide high performance. But Crays and similar supercomputers allow all the processors to communicate with each other and with the shared memory a lot faster, making some algorithms run orders of magnitude faster.
    An example: when I took a course in parallel processing we used a MasPar system which had 1024 processors in a grid formation. Working on that system I was able to sort a list of a million random numbers way faster than my dual-processor PC could.
    But on the flip side, when I ran a program on the MasPar that wasn't designed for parallel processing (emacs), it took upwards of 3 minutes to load due to the age of the computer, while my PC could open up emacs in a split second. So even against the fastest cluster in the world, a Cray that is not the fastest machine on paper could actually beat it on many applications because of its faster bus communication.
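
    To illustrate how many processors shorten a sort, here is a sketch of a bitonic sorting network, a different and simpler technique than whatever ran on the machine described above. It is simulated serially, and it counts "stages", where all compare-exchanges inside one stage are independent and could run at the same time given one processor per pair. The stage count grows as O(log^2 n) rather than the O(log n) of the best known parallel sorts; the function name and the power-of-two restriction are choices made for this example.

```python
def bitonic_sort(values):
    """Sort a power-of-two-length list; return (sorted_list, parallel_stages)."""
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    a = list(values)
    stages = 0
    k = 2
    while k <= n:              # size of the bitonic sequences being merged
        j = k // 2
        while j >= 1:          # distance between compared elements
            # Every pair in this loop touches distinct elements, so on real
            # parallel hardware the whole loop would be a single time step.
            for i in range(n):
                partner = i ^ j
                if partner > i:
                    ascending = (i & k) == 0
                    if (a[i] > a[partner]) == ascending:
                        a[i], a[partner] = a[partner], a[i]
            stages += 1
            j //= 2
        k *= 2
    return a, stages


if __name__ == "__main__":
    data = list(range(1024, 0, -1))
    result, stages = bitonic_sort(data)
    print(result == sorted(data), "stages:", stages)
```

    For 1024 elements the network needs log2(n) * (log2(n) + 1) / 2 stages, far fewer sequential steps than a single processor's roughly n log2(n) comparisons, but only if the interconnect can shuffle the pairs fast enough.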
  • Whatever (Score:3, Interesting)

    by hackstraw ( 262471 ) * on Tuesday April 13, 2004 @12:49PM (#8849638)
    I'd like to see Paul Terry say this in front of everybody at the Super Computing conference where they announce the Top 500 Computers [top500.org]. It's worth noting that he is not bashing Linux per se, but "Linux Clusters", which is pretty arbitrary, because he should be saying "all clusters", because the OS really doesn't have too much to do with it. Supercomputing apps run in userspace, not kernel space, and the hardware, including interconnects or some kind of interprocessor communication, drives the performance.

    The Cray XD1 looks like a nice system, but there are only theoretical performance values given, and no one can go out and buy one of these things yet. I also don't know how much these guys cost.

    I love this statement:

    Linux clusters do have a place. "For applications that require low performance, they are a cheaper solution," said Terry.

    Yeah, when we spend a million+ dollars on a supercomputer, we are thinking of low performance, because our applications require it. Thanks.

    I'm guessing this guy is a wannabe marketer who got stuck in a CTO position. There are plenty of HPC vendors out there, and trust me, if this XD1 has good price/performance and it works (this is key), then people will buy it with few questions asked. Otherwise, this whole article is just an advertisement that makes many statements without any evidence that the XD1 is any better than 4 Xboxes connected together over a serial connection. Next....
  • by MoFoQ ( 584566 ) on Tuesday April 13, 2004 @12:59PM (#8849772)
    doesn't this CTO of cray remind you of someone?
    "There IS no Linux in high-performance clusters."

    "There IS no Americans in Iraq."

    OMG! It's the former Iraqi mis-Informed-ation minister!

    Especially when 2004 has been dubbed the year of the penguin, it's reckless to claim that Linux can't be used in HPC's.
    Hell, just look at the current top500 list [top500.org]. There's no Cray in the top 10 but there are two Linux based clusters there (and one based on OSX [FreeBSD based]).

    Here's a few:
    NCSA's IA32 Linux cluster [uiuc.edu]
    NCSA's IA32 Linux cluster [uiuc.edu]
    Space Simulator Cluster at Los Alamos [lanl.gov] (SS51G based; makes me proud as I have a SS51G too)
    Beowulf - used in many Linux clustering projects [beowulf.org]
    Linux clusters at Lawrence Livermore [llnl.gov] (they seem to have more than one)
    Virginia Tech's Supercomputer X [vt.edu]
  • by pragma_x ( 644215 ) on Tuesday April 13, 2004 @01:07PM (#8849900) Journal
    "Despite assertions made by Linux vendors, a Linux cluster is not a high performance computer," said Dr. Paul Terry, CTO of Cray Canada. "At best, clusters are a loose collection of unmanaged, individual, microprocessor-based computers."

    Although this statement reeks of FUD, he's right about one thing: a cluster is not an HPC... that's why it's called a cluster. But to say that a cluster is 'unmanaged' is one hell of a stretch IMO. All in all, he's just arguing semantics: nothing to see here, put down your flamethrowers, move along folks.

    Since this is slashdot, I'll add that the rest of the article is full of choice quotes all of which point squarely at basic FUD + marketing spin for their new cluster-cost-like product.

    It seems to me that Cray is just plain bitter that Linux (through all the cluster solution providers) has managed to steal Cray's thunder at a mere fraction of the cost. Cray's probably even more bitter that folks are willing to sacrifice performance (at least from Cray's perspective) just to save a buck.

    Okay, this is Cray we're talking about here: people are saving millions of bucks all over the place by using clusters instead of big expensive machines.

    And guess who wants 'their' slice of the pie back.
  • by Listen Up ( 107011 ) on Tuesday April 13, 2004 @01:26PM (#8850135)
    There are certain types of computing which simply cannot be done with microprocessor based platforms including clustering. One of these calculation types is vector processing. A Cray supercomputer is a vector processing based unit. When comparing a cluster of PC systems being used to calculate what a single Cray is designed to calculate, the Cray CTO is perfectly correct in his statement.
  • by shaitand ( 626655 ) * on Tuesday April 13, 2004 @01:39PM (#8850303) Journal
    I'm seeing a lot of single threaded versus multi-threaded arguments.

    That's great and all, but for a single threaded application a Cray isn't even going to smash your modern top of the line home PC by too terribly much.

    Crays are massive SMP systems; they need a multi-threaded app to take advantage just as much as a cluster does. The difference is in the bus speed. A Cray has a much faster bus, and with equivalent processing and memory it will excel with a number of small, quickly terminated threads, whereas a cluster will do as well or better with larger, more processor-consuming threads.

    Why would a cluster ever do better? Simple: although a cluster has a drastically slower bus, there is memory local to the processor in question, so there is much less congestion on the bus. And since, if you're shelling out for a cluster, you will be using switches rather than hubs, whatever you do will run almost without collisions and bus contention. Each node has its own RAM, so there isn't much contention for the bus and there is much greater memory throughput.

    So like I said, it's all about how fast threads spawn and terminate. If you're rapid-firing threads, then you will be doing a lot of communicating between nodes over the slow bus (the network); if you're sending good-sized chunks of data to work on and keeping your nodes busy, they will spend more time working and less time communicating results, and your cluster will tromp all over that Cray.
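
    The granularity argument above can be put into a rough model: split a fixed amount of work into chunks, charge each chunk one message of overhead, and see how much of the ideal speedup survives. The workload, node count and 1 ms per-message cost below are all hypothetical, and the model ignores load imbalance and overlap of communication with computation.

```python
def parallel_efficiency(work_s, chunks, nodes, per_message_s):
    """Fraction of ideal speedup kept when each chunk costs one message.

    Assumes chunks are spread evenly across nodes and that a node cannot
    compute while it is handling a message.
    """
    compute_per_node = work_s / nodes
    messages_per_node = chunks / nodes
    elapsed = compute_per_node + messages_per_node * per_message_s
    return compute_per_node / elapsed


if __name__ == "__main__":
    # 100 s of total work on 10 nodes, with an assumed 1 ms cost per message.
    for chunks in (100, 10_000, 1_000_000):
        eff = parallel_efficiency(100.0, chunks, 10, 1e-3)
        print(f"{chunks:>9} chunks: efficiency {eff:.1%}")
```

    With a few large chunks the nodes spend nearly all their time computing; with a million tiny ones the same cluster spends most of its time talking, which is the case where a faster interconnect pays for itself.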
  • by Animats ( 122034 ) on Tuesday April 13, 2004 @03:22PM (#8851538) Homepage
    Supercomputer CPUs are dead. It's been a long time since Cray made CPUs. Current-generation supercomputers are composed of large numbers of commercial microprocessors. The only major exception is the current fastest supercomputer, the Earth Simulator in Japan. It uses custom vector processors. They seem vaguely similar in architecture to Playstation 2 vector units, but they probably are unrelated.

    Every other machine in the top 10 is built from standard processors. The old DEC Alpha, PowerPCs, and IA-32 predominate, with a few Itanium machines.

    Because supercomputers today have several thousand processors, they can't even be big shared-memory multiprocessors. Speed of light lag in the interconnects would slow everything down. It just takes too long for the signals to make it across the room.
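
    The speed-of-light remark can be checked with back-of-the-envelope arithmetic. The 20 m machine-room distance, the 2 GHz clock and the 0.66 cable velocity factor below are assumptions picked for illustration, not figures from any particular machine.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458


def cycles_lost(distance_m, clock_hz, velocity_factor=1.0):
    """Clock cycles that pass while a signal travels distance_m one way.

    velocity_factor scales the speed of light to model signals in cable,
    which travel slower than light in a vacuum.
    """
    travel_s = distance_m / (SPEED_OF_LIGHT_M_PER_S * velocity_factor)
    return travel_s * clock_hz


if __name__ == "__main__":
    print(f"vacuum: {cycles_lost(20, 2e9):.0f} cycles")
    print(f"cable:  {cycles_lost(20, 2e9, velocity_factor=0.66):.0f} cycles")
```

    Even in the best case a one-way trip across the room costs on the order of a hundred clock cycles before any switching or protocol overhead, so a single flat shared memory across thousands of processors cannot respond at anything like local memory speed.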

    So all supercomputers today are clusters of one kind or another, fast machines with slower interconnects between them. The hardware architecture revolves around interconnect schemes. The software architecture revolves around working around the limitations of the interconnect schemes. Tightly coupled problems don't map well to such machines.

    Bear in mind that we're talking about clusters of uniform machines located near each other with gigabit or better interconnects. We're not talking about "clusters" consisting of spare-time programs out at the end of Internet connections. Those are useful only for problems with almost no coupling between parts. Such problems are usually low hit rate search problems, like cryptanalysis, SETI@HOME, and such.

    Yes, there's the Cray X1, the last of the liquid-cooled monsters, but it looks like the only customers who bought one were Government agencies with old Cray machines.

  • Wrong Source (Score:3, Insightful)

    by Ra5pu7in ( 603513 ) <ra5pu7in@gm a i l . com> on Tuesday April 13, 2004 @03:39PM (#8851749) Journal
    Most readers have the right idea - you don't listen to a competitor's opinion when judging whether something is viable or not. It is very easy to twist the words to be "true" while misleading.

    A cluster isn't a supercomputer, by definition, but for many jobs can be equal or better. In other words: Those 2 oxen cost more, consume more resources, are only useful for the one job (pulling a plow) and only benefit a single owner. Those 1024 chickens cost less, consume fewer resources, are useful for many jobs besides the one (including laying eggs) and benefit their many owners.

"Protozoa are small, and bacteria are small, but viruses are smaller than the both put together."