Cray CTO: Linux clusters don't play in HPC
jagger writes "Linux clustering was touted as the next big thing by many vendors last week at ClusterWorld Conference & Expo 2004. But supercomputer vendor Cray Inc. scoffed at the notion of putting Linux clusters in the high-performance computing (HPC) category. 'Despite assertions made by Linux vendors, a Linux cluster is not a high performance computer,' said Dr. Paul Terry, CTO of Cray Canada."
It's not the vendors... (Score:2, Informative)
Methinks Cray is feeling a little threatened...
Checking with the TOP 500 Supercomputers I find (Score:0, Informative)
Who doesn't play in what?
AC repost - STOP KARMAWHORES NOW ! (Score:1, Informative)
By Jan Stafford, Editor
12 Apr 2004 | SearchEnterpriseLinux.com
SAN JOSE, Calif. -- Linux clustering was touted as the next big thing by many vendors last week at ClusterWorld Conference & Expo 2004.
But supercomputer vendor Cray Inc. scoffed at the notion of putting Linux clusters in the high-performance computing (HPC) category. In fact, Cray showcased a system -- Cray XD1 with Active Manager -- that will compete in performance and price with some Linux clusters upon its release.
"Despite assertions made by Linux vendors, a Linux cluster is not a high performance computer," said Dr. Paul Terry, CTO of Cray Canada. "At best, clusters are a loose collection of unmanaged, individual, microprocessor-based computers."
Businesses shouldn't expect supercomputer performance from Linux clusters, Terry warned.
"Cluster vendors would have you believe that their performance is the linear sum of each of their respective GFLOPS [Giga Floating Point Operations Per Second]," he said. "Most cluster [experts] know now that users are fortunate to get more than 8% of the peak performance in sustained performance."
Linux clusters do have a place. "For applications that require low performance, they are a cheaper solution," said Terry.
With XD1, Cray intends to make HPC a cheaper solution, too. "With the Cray XD1, Cray will introduce new price points that should make HPC solutions more available to industries that before couldn't afford such devices," Terry said.
Cray XD1 was developed by OctigaBay Systems Corp., a Vancouver, B.C., Canada-based company acquired by Cray on April 2. Formerly OctigaBay 12K, Cray XD1 will be released to some companies for testing in May. Full release is expected later this year.
The acquisition of OctigaBay's technology will allow Cray to move into new markets by "doing supercomputing on a smaller scale with some commercial, off-the-shelf components," said analyst Richard Partridge, vice president of Enterprise Server Solutions for DH Brown in Port Chester, N.Y. "Cray just can't shrink its custom-built supercomputer designs," he said. Having the ability to put a value-added HPC solution on AMD processors is a good way to move downmarket.
Cray XD1 marries the performance of large SMPs with the economics of cluster solutions, according to Terry. It will also pair new interconnect and management technologies with AMD Opteron 64-bit processors in a direct-connected processor architecture. Its parallel-processing capabilities will directly link together processors to relieve memory contention and interconnect bottlenecks found in cluster systems.
"The Cray XD1 is not a traditional cluster; it does not use I/O interfaces for memory and message passing semantics," said Terry. "For HPC, the most important thing is application performance, and the Cray XD1 is specifically designed to maximize application performance."
In some situations, XD1 would be a good replacement for very high-end Linux clusters, Partridge said. He sees the XD1 providing more "compute performance for the dollar" for organizations that do heavy number and data crunching and analysis. He noted, however, that Cray has shown analysts a limited amount of information about the new products.
Terry believes that individual copies of Linux used for HPC today are intrinsically "heavy" and run independently on multiple processors, significantly adding to the difficulty of managing clusters.
XD1's integrated management software -- Active Manager -- will eliminate the "FCAPS" management ills common to clusters. "Fault, configuration, accounting, provisioning and security" are not handled well by current cluster management solutions, he said. "Often times, [management] appears to be done as an afterthought instead of being designed into the system from the ground up," he said.
Active Manager, which was demonstrated at ClusterWorld, offers a single point of system administration and control.
He's wrong, but he's also right. (Score:5, Informative)
Obviously, this guy is plugging the new Cray X1 architecture, which really is quite promising. For instance, check out this paper [sc-conference.org] by some folks at Oak Ridge National Lab that appeared in Supercomputing 2003.
Of course, since this is Slashdot, I expect that there will be a deluge of posts decrying everything about the new Cray machine because it commits the cardinal sin of NOT USING LINUX. Oh, the horror!
Re:Linux not usable for HPC? (Score:2, Informative)
Cray isn't anti-Linux per se, just anti-cluster.
Somehow I wouldn't be surprised; the next step seems to be Cray-marketed cluster nodes with a proprietary high-speed interconnect. (If you can't beat them, join them.)
The Cray will scale up (Score:5, Informative)
Re:Can you multithread your application? (Score:3, Informative)
Re:Checking with the TOP 500 Supercomputers I find (Score:3, Informative)
Every task has a maximum number of threads it can be broken into, beyond which adding another parallel thread just won't make it any faster. For some tasks, that number is in the stratosphere and doesn't have to be worried about. For others, though, it's in the single digits. Those tasks aren't going to be helped much by a cluster that exceeds that number of processors.
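That cutoff is basically Amdahl's law. A quick Python sketch (the 5% serial fraction is just an illustrative number, not taken from any real workload):

```python
def amdahl_speedup(serial_fraction, processors):
    """Best-case speedup when serial_fraction of the work can't be parallelized."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / processors)

# With even 5% serial work, piling on processors stops paying off fast:
for p in (2, 8, 64, 1024):
    print(p, round(amdahl_speedup(0.05, p), 1))
# 2 -> 1.9, 8 -> 5.9, 64 -> 15.4, 1024 -> 19.6 (capped near 1/0.05 = 20)
```

No matter how many nodes you buy, the serial fraction sets a hard ceiling on the speedup.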
Re:Help me here... (Score:5, Informative)
http://www.ibiblio.org/pub/Linux/docs/HOWTO/oth
has a great explanation using a grocery store analogy that makes it really easy to understand what kinds of tasks will work well and what kinds will suck. And unlike the cheerleaders that have been showing up since clusters became big business, it's very balanced about it.
Still worth reading.
Re:Help me here... (Score:3, Informative)
With problems that can be split up into hundreds or thousands of more-or-less independent subtasks, a cluster is the way to go. But for problems that can't be divided up like that, a smaller system with a few very tightly coupled extremely fast vector processors, like what Cray specializes in, is what you need.
There are certainly plenty of HPC problems that aren't well suited for large clusters, but it sounds like the Cray guy might have been significantly overstating his point.
Still Linux (Score:2, Informative)
Re:If it walks like a duck, and talks like a duck. (Score:5, Informative)
A Cray processor has eight floating-point units running at 800MHz. The Big Mac cluster (for example) uses G5 processors, which have 2 FPUs at 2000MHz. That gives the Cray roughly a 60% peak advantage (8 x 800 = 6400 vs. 2 x 2000 = 4000). However, the G5 processor has ~4GB/s of memory bandwidth. The Cray has ~50GB/s of memory bandwidth. If you have a problem that needs to do a HUGE amount of math on a tiny amount of data, the G5 will rock. If you have a problem that needs to do a HUGE amount of math on a GINORMOUS amount of data, buy the Cray. (For a GINORMOUS amount of money, too.)
Similarly, InfiniBand (a la the Big Mac) is really hot in the cluster interconnect space because it gives 2.5GB/s per node. The Cray gives you 51GB/s.
You need to move a little data, buy a cluster. You need to move a lot of data, buy the Cray.
There's no one solution for all problems.
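The numbers above boil down to "machine balance": bytes of memory bandwidth per peak FLOP. A back-of-the-envelope check in Python, using the rough figures quoted in the comment (not vendor-verified specs):

```python
# "Machine balance": bytes of memory bandwidth available per peak FLOP.
# The peak numbers are the rough figures from the comment, not official specs.
def balance(peak_gflops, mem_gb_per_s):
    return mem_gb_per_s / peak_gflops  # bytes per floating-point op

g5   = balance(2 * 2.0, 4.0)    # 2 FPUs x 2.0GHz -> 4 GFLOPS, ~4GB/s
cray = balance(8 * 0.8, 50.0)   # 8 FPUs x 800MHz -> 6.4 GFLOPS, ~50GB/s

print(g5)    # 1.0 byte/FLOP: starves once the working set leaves cache
print(cray)  # ~7.8 bytes/FLOP: can keep streaming huge data sets
```

A data-hungry vector code cares about that second number far more than about raw GFLOPS.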
Parallel Programming. (Score:3, Informative)
For example, when I took a course in parallel processing we used a MasPar system which had 1024 processors in a grid formation. Working on that system, I was able to sort a list of a million random numbers way faster than my dual-processor PC could.
But on the flip side, when I ran a program on the MasPar that wasn't designed for parallel processing (emacs), it took upwards of 3 minutes to load due to the age of the machine, while my PC could open emacs in a split second. So even the fastest clusters in the world could be beaten on many applications by a Cray that isn't the fastest on paper, because of its faster bus communication.
Re:Efficiency and cost argument (Score:2, Informative)
Re:VA Cluster yet to be used (Score:4, Informative)
The cluster remains; they have not shut it down and were swapping out individual racks for the upgrade (something like one rack of Xserves replaces three racks of towers).
I don't think it's been published whether they have run any data besides benchmarks.
Re:Help me here... (Score:5, Informative)
Format links like this: <a href="http://somelink">link text</a>
It takes virtually no extra time and we don't have to trim the fucking slashcode spaces.
Oh, and here's [ibiblio.org] the link.
You guys are giving cray too much credit (Score:3, Informative)
That's great and all, but for a single-threaded application a Cray isn't going to beat your modern top-of-the-line home PC by all that much.
Crays are massive SMP systems; they need a multi-threaded app to take advantage of the hardware just as much as a cluster does. The difference is in the bus speed. A Cray has a much faster bus, and with equivalent processing and memory it will excel with a number of small, quickly terminated threads, whereas a cluster will do as well or better with larger, more processor-consuming threads.
Why would a cluster ever do better? Simple: although a cluster has a drastically slower bus, the memory is local to the processor in question, so there is much less congestion on that bus. And since, if you're shelling out for a cluster, you'll be using switched rather than hub-style networking, whatever you do will run almost without collisions and bus contention. Each node has its own RAM, so there isn't much of an issue with contention for the bus, and memory throughput is much greater.
So like I said, it's all about how fast threads spawn and terminate. If you're rapid-firing threads, you will be doing a lot of communicating between nodes over the slow bus (the network); if you're sending good-sized chunks of data to work on and keeping your nodes busy, they will spend more time working and less time communicating results, and your cluster will tromp all over that Cray.
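That granularity tradeoff fits a toy cost model (all numbers hypothetical; a real interconnect model would include bandwidth terms too, this only counts a fixed per-message overhead):

```python
def runtime(total_work_s, n_chunks, msg_cost_s, procs):
    """Toy model: perfectly parallel compute plus one message round per chunk."""
    return total_work_s / procs + n_chunks * msg_cost_s

# 100 seconds of work on 16 nodes, 1ms per message round:
fine   = runtime(100.0, 10_000, 0.001, 16)  # rapid-fire tiny chunks
coarse = runtime(100.0, 100, 0.001, 16)     # big chunks, nodes stay busy
print(fine, coarse)  # 16.25 vs 6.35 seconds
```

Same work, same nodes; the fine-grained version spends more time on the slow "bus" (network) than on computing.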
Re:Marketing (Score:2, Informative)
Re:Marketing (Score:5, Informative)
As for research, it's more a case of researchers doing the old "Damn, I'll have to make do with this". And Origin and Altix systems are still selling well in the research market.
And don't forget, Cray is backed by US government departments such as the NSA. The X1 received a lot of such support, which Cray even admits themselves: http://www.cray.com/products/systems/x1/
Re:So what DOES play in HPC? (Score:2, Informative)
Actually, if you read the datasheet, the XD1 runs Linux 2.4.21 with some modifications (see xd1_datasheet.pdf [cray.com]).
So does the SGI Itanium machine. What sets these computers apart is that they offer better interconnections between the processors than clusters do.
The Big Mac has 1.2GB/s between two nodes through InfiniBand, whereas an SGI machine has 6.4GB/s.
To summarize: in a cluster you use slower links with higher latency, and your processors communicate through messages.
In a SGI or Cray machine, you use fast and expensive links (think more wires, more expensive controllers) and your processors can work as though they all shared the same memory.
SGI sells systems with 128 processors where there is only ONE Linux kernel (as opposed to 128 in a Linux cluster).
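The two programming models contrasted above can be sketched with Python's multiprocessing module (a toy illustration of the models only, obviously not of the hardware speeds):

```python
from multiprocessing import Pipe, Process, Value

# Cluster-style message passing: data is explicitly copied over a channel.
def msg_worker(conn):
    x = conn.recv()        # a copy of the data arrives as a message
    conn.send(x * 2)       # the result goes back as another message

# SMP/Cray-style shared memory: workers touch the same memory directly.
def shm_worker(v):
    with v.get_lock():
        v.value *= 2       # no copies; everyone sees the same location

if __name__ == "__main__":
    parent, child = Pipe()
    p = Process(target=msg_worker, args=(child,))
    p.start()
    parent.send(21)
    print(parent.recv())   # 42
    p.join()

    v = Value("i", 21)     # one shared integer
    q = Process(target=shm_worker, args=(v,))
    q.start()
    q.join()
    print(v.value)         # 42
```

On a single-kernel machine like the SGI boxes described above, the second style comes naturally; on a cluster, every exchange has to look like the first.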
Re:Are too (Score:4, Informative)
Raytracing is sometimes referred to as "embarrassingly parallel" because of this.
Mathematical dependencies are the real destroyer of parallelism. Any situation where the next calculation depends on the result of the previous one is a typically serial computation that would do badly on any supercomputer and might as well be run on a single scalar processor like the Athlon or P4.
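Concretely (a minimal illustration; the logistic map here is just a stand-in for any recurrence):

```python
# Independent work: every element could go to a different processor.
squares = [x * x for x in range(8)]     # order of evaluation doesn't matter

# Loop-carried dependence: step i needs the result of step i-1,
# so no number of processors makes it faster.
def iterate_map(x0, r, steps):
    x = x0
    for _ in range(steps):
        x = r * x * (1 - x)             # must run strictly in sequence
    return x

print(squares)                 # [0, 1, 4, 9, 16, 25, 36, 49]
print(iterate_map(0.5, 2, 5))  # 0.5 (fixed point of the r=2 logistic map)
```

The first loop is raytracing-shaped; the second is the kind of serial chain a cluster can't help with.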
Re:Are too (Score:2, Informative)
When I worked at Oak Ridge National Labs, there were several applications people ran on our clusters that were serious computations. Very few of the people there really cared one way or another whether it ran on the IBM SP-2 or on the Intel clusters; they just ran on whatever hardware had the shortest runtime.
We generally got well over 8% utilization; if that was all you were getting, then you were not managing the cluster well. Basically, both machines had similar problems: if one piece of software only utilized 10% of the machine (and that is possible, even probable, in either world), then you ran more than one person's jobs; they did it and so did we. It was rare that a single person got exclusive use of the machines (they either shared individual nodes or the overall machine was split into smaller clusters/supercomputers). The lines between the two are very blurry, but of course Cray wants you to think differently.
This article reminds me of one of the researchers there who ran the Big Iron stuff. When I was still an intern, I overheard him telling the new director about how clusters sucked because they cost so much more in salaries to maintain. While true, he overlooked that their service contract with IBM cost more than triple what it would have cost us, per year, to replace the whole cluster and hire four full-time people to manage it, and they never got any hardware upgrades for it.
Each has their strong points and weaknesses, and never trust someone who is trying to sell you something to give you the full story.