Cray CTO: Linux clusters don't play in HPC
jagger writes "Linux clustering was touted as the next big thing by many vendors last week at ClusterWorld Conference & Expo 2004. But supercomputer vendor Cray Inc. scoffed at the notion of putting Linux clusters in the high-performance computing (HPC) category. "Despite assertions made by Linux vendors, a Linux cluster is not a high performance computer," said Dr. Paul Terry, CTO of Cray Canada."
Business or science? (Score:2, Interesting)
Dr. Terry's assertions remind me of a Seymour Cray quote I had as my
I'm not picking a side; it just seems interesting that the Cray CTO would echo Seymour's thoughts. I guess it's for business and marketing reasons though, sadly.
CTO of Cray? (Score:2, Interesting)
Are too (Score:5, Interesting)
I guess that the simple problem is just that the algorithm applied is usually not suitable for massively parallel computing.
So what DOES play in HPC? (Score:3, Interesting)
If it walks like a duck, and talks like a duck.... (Score:2, Interesting)
I guess they're not happy about being only #19 on the Top 500 Supercomputer List [top500.org]. Linux is considered faster than they are according to the list.
The ol' ad hominem attack of "if you can't beat them legitimately, attack them personally" just doesn't cut it, Paul. Build a better computer.
Efficiency and cost argument (Score:4, Interesting)
Funny... (Score:3, Interesting)
Wish I'd been there so I could have slapped him after about 3 seconds of stunned silence.
Problem (Score:5, Interesting)
Granted, there are many codes (and more every day) that will run on clusters, but the big iron will never die.
Re:And in other news... (Score:1, Interesting)
Show me a link where MS Says "Apache cannot be used for REAL web serving", you won't find it because that's not what they've said.
As to Sun announcing that Intel and Linux cannot be used for enterprise computing, well, they're partially right. I've seen Sun boxes trudge through loads that would hardlock a Linux box, on a daily basis; that's due to the hardware and the OS being built to work with each other. No one ever accused Slowlaris of having the "snappiness" of Linux, but if I had my choice of either a Sunfire with Solaris or a Dell with Redhat on it to take the brunt of my business, guess what, Sun is getting my money.
Well... he is sort of correct... (Score:5, Interesting)
What really makes the difference between an HPC cluster and your normal everyday cluster is the hardware interconnects used. There is a comment in the article that refers to not using I/O for memory and message passing. I am not quite sure what he means by that, but I am guessing that he is saying that the network is not used for shared memory/message passing (MPI/OpenMP/SHMEM).
If a cluster can limit the impact of latency between nodes, either through smarter software or faster interconnects, then I can't see any reason not to consider a Linux cluster as HPC.
Clusters without smarter software tend to be really difficult coding platforms. Some developments with things like globally shared memory might make the difference, but there will still be the problem of latency between nodes.
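To put rough numbers on the latency point, here is a minimal sketch of the usual "latency plus size over bandwidth" cost model for a message between nodes. The function names and the two sets of interconnect parameters are made up for illustration; they are not measurements of any real network or of anything in the article.

```python
def message_time(size_bytes, latency_s, bandwidth_bytes_per_s):
    """Time to deliver one message: fixed latency plus transfer time."""
    return latency_s + size_bytes / bandwidth_bytes_per_s


def step_time(compute_s, n_messages, size_bytes, latency_s, bandwidth_bytes_per_s):
    """One solver iteration: local compute, then n_messages sent one after another."""
    per_message = message_time(size_bytes, latency_s, bandwidth_bytes_per_s)
    return compute_s + n_messages * per_message


# Hypothetical interconnects (assumed values, for illustration only).
SLOW = {"latency_s": 100e-6, "bandwidth_bytes_per_s": 100e6}
FAST = {"latency_s": 5e-6, "bandwidth_bytes_per_s": 1e9}

if __name__ == "__main__":
    compute_s = 1e-3  # assumed 1 ms of useful work per iteration
    for name, link in (("slow", SLOW), ("fast", FAST)):
        t = step_time(compute_s, n_messages=50, size_bytes=1024, **link)
        print(f"{name}: {t * 1e3:.3f} ms per step, "
              f"{compute_s / t:.0%} of it spent computing")
```

With many small messages per step, the fixed latency term dominates, which is why a faster interconnect (or software that sends fewer, larger messages) matters more than raw CPU speed in this toy model.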
Re:VA Cluster yet to be used (Score:2, Interesting)
There were issues with unbuffered RAM, so they have decided to make a new cluster with the new 2 GHz X-Serves, which use ECC RAM (and new IBM 970FX chips).
This has resulted in massive shipping delays for the dual X-Serves, but should mean that a very, very good machine is created at VT...
not a real supercomputer? (Score:2, Interesting)
Basically, I disagree with Dr. Terry.
Our project writeup is here. [csuchico.edu]
(Please forgive any mistakes or stupidity therein; we were 15, 15, and a 30something non-geek at the time.)
Re:And in other news... (Score:1, Interesting)
He's spouting the same ole FUD common to all Linux zealots; it just so happens he tried to present his argument using BS quotes.
Anyone that's actually used those software packages will realize that he's an idiot.
Re:Help me here... (Score:5, Interesting)
So depending on the task at hand, the cluster might perform very well, or perhaps a little less well.
Surely what you meant to say is that, depending on the task at hand, a cluster might perform very well, or perhaps perform atrociously. :-)
Clusters tend to work well when the various nodes don't need to communicate very often but you need lots of cycles for the subtasks, while dedicated supercomputers tend to perform very well in tasks requiring vast amounts of internode communications bandwidth along with large numbers of cycles. If you need vast bandwidth and relatively low numbers of cycles, your price point is likely a mainframe. And if you don't need either, you get a cheap desktop machine.
Certain problems parallelize well on a cluster ... others don't. Some don't parallelize at all, and a cluster won't do you a darn bit of good. The different machines are designed for different uses ... and one should be careful not to push a "one size fits all" solution. The Cray guy clearly got it wrong on that point, and likely knows it, but he was marketing, not teaching a course in choosing hardware for the task at hand.
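For anyone who wants the "some problems don't parallelize" point as arithmetic, Amdahl's law is the standard back-of-envelope formula: if a fraction p of the work can be spread over n nodes and the rest is serial, the best possible speedup is 1 / ((1 - p) + p / n). A minimal sketch follows; the function name is my own, and the model ignores communication cost entirely, so real clusters do worse than this.

```python
def amdahl_speedup(parallel_fraction, n_nodes):
    """Upper bound on speedup when only part of the work can be parallelized."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if n_nodes < 1:
        raise ValueError("n_nodes must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_nodes)


if __name__ == "__main__":
    # Parallel fractions below are arbitrary examples, not data from any real code.
    for p in (0.0, 0.5, 0.95, 1.0):
        row = ", ".join(f"n={n}: {amdahl_speedup(p, n):.1f}x" for n in (2, 16, 1024))
        print(f"p={p:.2f} -> {row}")
```

A job with no parallel part gets nothing from more nodes, and even a job that is 95% parallel can never go more than 20 times faster, however many boxes you rack up.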
I'm confused... (Score:1, Interesting)
SGI has bought and sold the company so many times I lost track.
The funny thing about that is, SGI is now having the same problems Cray is having: trying to sell single supercomputer machines in a market that is heading to clusters because of price.
...the same ole FUD? (Score:3, Interesting)
"Anyone using the terms 'zealot' or 'FUD' in a Slashdot discussion is immediately declared the loser of the thread and discussion stops at that point".
Of course I'm forced to break my own corollary to make this point.
But to call me a "Linux zealot spouting FUD" (and excuse me for paraphrasing your lucid comment) because I mock a commercial vendor who says that the free alternative is no competition... WTF?
As it happens: I have 20+ years of experience in IT and I've used every one of those packages (except the Cray). Oracle, MySQL, IIS, Apache, Sun, Solaris, Linux. And hundreds of other platforms, as well.
My opinions are not those of a zealot, but pretty impartial and generally very accurate. There is a good reason, for instance, why the most critical servers in my business all run Debian Linux, why the desktops use Xandros, why our applications use MySQL, and why we're phasing out our Microsoft/COM+/IIS/SQLServer platforms. Zealotry has little to do with it, but good sense does.
The facts are these: open source, free, commodity IT has become good and cheap enough to exceed the capabilities (at any price) of many commercial systems. Most specifically, Cray, Oracle, Microsoft, and Sun find themselves spot center of the area that has been commoditized.
Re:Marketing (Score:2, Interesting)
Whatever (Score:3, Interesting)
The Cray XD1 looks like a nice system, but there are only theoretical performance values given, and no one can go out and buy one of these things yet. I also don't know how much these guys cost.
I love this statement:
Linux clusters do have a place. "For applications that require low performance, they are a cheaper solution," said Terry.
Yeah, when we spend a million+ dollars on a supercomputer, we are thinking of low performance, because our applications require it. Thanks.
I'm guessing this guy is a wannabe marketer who got stuck in a CTO position. There are plenty of HPC vendors out there, and trust me, if the XD1 has good price/performance and it works (this is key), then people will buy it with few questions asked. Otherwise, this whole article is just an advertisement that makes many statements without any evidence that the XD1 is any better than 4 Xboxes connected together over a serial connection. Next....
How is the XD1, as advertised, any different? (Score:2, Interesting)
FUD and Thunder-Mongering (Score:3, Interesting)
Although this statement reeks of FUD, he's right about one thing: a cluster is not an HPC... that's why it's called a cluster. But to say that a cluster is 'unmanaged' is one hell of a stretch IMO. All in all, he's just arguing semantics: nothing to see here, put down your flamethrowers, move along folks.
Since this is slashdot, I'll add that the rest of the article is full of choice quotes all of which point squarely at basic FUD + marketing spin for their new cluster-cost-like product.
It seems to me that Cray is just plain bitter that Linux (through all the cluster solution providers) has managed to steal Cray's thunder at a mere fraction of the cost. Cray's probably even more bitter that folks are willing to sacrifice performance (at least from Cray's perspective) just to save a buck.
Okay, this is Cray we're talking about here: people are saving millions of bucks all over the place by using clusters instead of big expensive machines.
And guess who wants 'their' slice of the pie back.
Re:...the same ole FUD? (Score:3, Interesting)
True. But I spout what are called "stealth opinions": being understated (or even unstated) makes them harder to criticize, and I have the advantage of being able to change opinion in mid-spout to dodge the zealots.
Re:Are too (Score:4, Interesting)
Message passing is the biggest issue with such solvers, and in a way, Cray was absolutely right about Linux, although misleading. There are some tests going on now with a modified Linux kernel for doing true HPC, and it's been done in the past (I know, I've used it). Things like disk swapping pretty much immediately disqualify you for high performance computing. A stock Linux cluster has its place, of course; trivially parallelizable codes are one example (Pixar).
Myrinet was out before Gbit Ethernet was really available, and also has some nifty routing capabilities. And since the bottleneck for HPC is usually message passing, high performance computing will better realize its theoretical performance as the communication speed catches up to the processor speed.
But, to Cray's discredit, making a blanket statement that Linux can't do HPC is like saying Macintoshes can't do HPC. [top500.org]
Re:Help me here... (Score:3, Interesting)
Re:Marketing (Score:1, Interesting)
The XD1 is NOT the same as the big vector-processor X1s.
Re:Help me here... (Score:3, Interesting)
Sure, this reduces peak efficiency. I think on the VT cluster it was in the 50-60% range (I could Google search but I'm lazy... shoot me)... that is, the total performance is about
But the Cray guy is full of hot air. Of course you're going to sing the praises of massive SMP when that's what you have to sell. The fact is, if 1100 dual CPU machines clustered together can significantly outperform the Cray, for less money, and they're easy to manage (they are...), then why not go that route?
So Cray sells FUD, because it's their last option.
Valid Question, then (Score:3, Interesting)
At what price point does the Cray XD1 come in? While huge clusters are (supposedly) cheap individual computers -- I would argue that G5s are not inherently cheap -- how many of the G5s that make up the Virginia Tech cluster would you have to buy before you've paid for a Cray XD1?
I mention this because the article implies that Cray is planning on selling the XD1s at a price point cheaper than equivalent clusters. If they succeed at making the XD1 cheap enough, then it may be more cost effective to (effectively) cluster a couple of these Crays, with less power consumption, heat dissipation and plain old real estate.
It seems to me that TCO would be cheaper for the Cray, especially considering that the best clusters expect 5% of the member computers to be broken at any given time.
So, does anybody have Cray XD1 pricing? That seems to me to be the only way to rationally decide on the 'better' solution.
Provably non-Parallelizable? (Score:2, Interesting)
I asked this earlier in the thread: Provably non-Parallelizable? [slashdot.org]
Allow me to ask it again: What's the state of the art of proofs of parallelizability [and non-parallelizability]?
Is there a standard list of problems that have been proven to be non-parallelizable? Are there any problems that have been proved to be parallelizable, but for which no parallelizing algorithm has yet been discovered? Is there anything analogous to the NP-completeness conjecture in this field?