12-Core ARM Cluster Beats Intel Atom, AMD Fusion
An anonymous reader writes "Phoronix constructed a low-cost, low-power 12-core ARM cluster running Ubuntu 12.04 LTS and made out of six PandaBoard ES OMAP4460 dual-core ARMv7 Cortex A9 chips. Their results show the ARM hardware is able to outperform Intel Atom and AMD Fusion processors in performance-per-Watt, except it sharply loses out to the latest-generation Intel Ivy Bridge processors." This cluster offers a commendable re-use of kitchenware. Also, this is a good opportunity to recommend your favorite de-bursting tools for articles spread over too many pages.
Were they bored? (Score:3)
Or, could they just not do the MIPS/Watt calculations without actually building the thing?
Re:Were they bored? (Score:5, Informative)
What would calculating the theoretical peak tell them about the (real) sustained performance?
Partitioning the problem into chunks that can be distributed to the nodes in the cluster adds overhead. Assembling the finished results does the same. It is kind of hard to predict what this overhead will be, as it depends on the interconnect. In this case they used 100Mb/s Ethernet, but there was contention from running NFS over the same network. Building it and measuring it is the only way to find out what kind of performance you really get.
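To make the point concrete, here is a toy model of that overhead. It is only a sketch: the function name, the job size and the data volume are all made up for illustration, and the only figure taken from the setup is the 100Mb/s link. It also ignores the NFS contention, which would make things worse.

```python
def cluster_time(serial_seconds, nodes, bytes_per_node, link_bytes_per_sec):
    """Toy estimate of wall-clock time for a perfectly divisible job.

    Assumption: every node's input and output crosses one shared link,
    so the transfers serialize instead of overlapping.
    """
    if nodes == 1:
        return serial_seconds  # single board: nothing to distribute
    compute = serial_seconds / nodes
    transfer = nodes * bytes_per_node / link_bytes_per_sec
    return compute + transfer


LINK = 100e6 / 8  # 100 Mb/s Ethernet expressed in bytes per second

# Hypothetical job: 600 s on one board, 50 MB moved per node.
for n in (1, 2, 6):
    t = cluster_time(600, n, 50e6, LINK)
    print(f"{n} node(s): {t:.1f} s, speedup {600 / t:.2f}x")
```

With these invented inputs, six nodes come out at roughly a 4.8x speedup rather than 6x, and the gap moves a lot as soon as you change the data volume or the link. That sensitivity is why measuring beats guessing.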
Re: (Score:1)
Well, it's hard to get first post and simultaneously develop a complete explanation of the concept, but...
They have provided yet another valuable datapoint in the theoretical peak vs actual sustained performance testing set, but, again, this is widely studied, characterized fairly well and predictable with a bit of research and thought experiment.
Reading the article (also impossible to do in a first post time constraint), reveals that they had a particular idea about using a wooden dish strainer to rack the
Re:Were they bored? (Score:5, Insightful)
Re:Were they bored? (Score:5, Informative)
What I don't understand is why the summary is focused on ARM beating Atom when the overall winner - in performance, in performance per watt, and in cost - was the Intel Ivy Bridge... by a huge margin.
Re: (Score:1)
OMAP4460 45nm
Atom 45nm, or 32nm for newer models
Ivy Bridge 22nm
Re: (Score:3)
Whatever the excuse, a loss is still a loss.
ARM is way better for low power consumption stuff, but if you want performance/watt, Intel still leads.
Re:Were they bored? (Score:4)
News is "Ford Fiesta better than Fiat Panda". News is not "Ford Fiesta worse than BMW 7 series".
Of course Ivy Bridge is better. It'd be pretty shocking if it weren't.
Re:Were they bored? (Score:5, Insightful)
Re: (Score:1)
Well, it was an SoC from TI, but any current-generation SoC with the same core will get reasonably similar performance for CPU-bound tasks.
And the Atom they benchmarked wasn't an SoC anyway (only Medfield is), so... enjoying your trip?
Re: (Score:2)
How many cores can you fit in a cubic meter, for example? What's the performance per watt per cubic meter? What's the performance of a solution like the Tegra? How do you measure the difference between added hardware, like radios or GPUs, etc.?
Re:Were they bored? (Score:5, Insightful)
I think the confusion is that people think Atom is analogous to ARM. People keep overlooking the fact that ARM is a processor core and Atom an SoC solution. It makes no sense comparing apples to oranges. An appropriate comparison would be an SoC from TI, Qualcomm or Samsung.
But then how could they generate media hype by announcing they are outperforming Intel?
Re: (Score:2)
Why can't you compare apples to oranges? They are both fruits, are picked from trees, are spherical, and have mass and volume, colour and taste. Of course you can compare them! Such a comparison will most likely conclude they are not very similar, but the same conclusion would probably be reached by comparing a car to a navel.
Re: (Score:1)
To get readers?
Re: (Score:3)
What I don't understand is why the summary is focused on ARM beating Atom when the overall winner - in performance, in performance per watt, and in cost - was the Intel Ivy Bridge... by a huge margin.
Because this is slashdot and the AMD/ARM vs Intel bias is almost as strong as Linux vs Windows? Their best selling point is the APUs but in reality Intel is the one favored most by the move to decent integrated graphics, people still buy Intel but now instead of an AMD/nVidia entry level card many just stick with the integrated one, making GPU market share become more like CPU market share. And Intel is the one with a half-decent ARM competitor (Intel Medfield), AMD isn't ready to play in that arena at all.
Re: (Score:3)
Re: (Score:1)
Actually, I bet they did it to attract women.
Re: (Score:2)
Re: (Score:1)
You are thinking like an engineer not a marketeer. And any guess which side writers/editors are going to lean towards? Even in the field that writes about engineering stuff?
Re: (Score:2)
Re: (Score:2)
Applications are SLOWLY making better use of multiple core machines, and that means that as time goes forward, more cores make for a better experience. The problem you are seeing, that many people are not stressing the system, is caused by applications not making good use of system resources. In most cases, even multi-threaded apps are using what, two or three threads when they should be using six or more for what is being done.
Basically, we are seeing most developers failing to re-write applications
Re: (Score:2)
A lot of apps simply can't be threaded that well.
Even games, with all their graphical and sound goodness, can't use multiple cores that effectively.
You will have one heavy thread which is doing all the graphics. You can throw AI on to one or two threads, put sound on another and UI on another, plus networking and other IO could be on additional threads, but the graphics thread will be the really heavy one, and the rest will be very lightweight in comparison. You can't break the graphics thread out to multiple
Re: (Score:2)
price much? (Score:1, Interesting)
I don't know the exact model used, but the first one I could find online was 182 bucks * 6. That's over a grand just to prove a point (+ other hardware). Hope it was worth it to beat a $60 Atom.
Re: (Score:2)
They were given these. Didn't pay full price.
Unless you're also going to get whatever deal Phoronix got, you're paying close to retail. So yeah, price matters.
Re: (Score:2)
Re: (Score:1)
Really, but if you don't understand why this is interesting you better turn in your geek card.
Re: (Score:3)
Nope. We just did this kind of thing back when something as powerful as that ARM hardware was considered leading edge. We also did real work with it.
12 ARMs to replace a trailing edge x86? Funny.
Re: (Score:1)
Except they didn't need all 6. 2 Pandaboards = 1 Atom 330 nettop. (A shade less in one benchmark, a bit more in the other.)
And I'm not sure where you pulled that $60 figure from, but I haven't seen any 330 nettops that cheap. Is this that thing where you count the whole system for one side, and just the CPU for the other side?
Re: (Score:2)
How is that board worth $182?
Re: (Score:2)
'Cause that's what they ask for it.
Re: (Score:2)
Is this a niche product that is made in small quantities?
Re: (Score:2)
probably, I dunno, never heard of it until this article
Article summary says it all (Score:5, Insightful)
"Besides winning on performance and efficiency, the Core i7 3770K system would cost less than the cost of a six PandaBoard ES cluster setup."
So a single Ivy Bridge system, which takes up much less rack space and needs no cluster network ports, outperforms and costs less than the ARM cluster. Is that the definition of a no-brainer?
Re:Article summary says it all (Score:5, Funny)
And yet Phoronix managed to squeeze 16 pages out of it. Good job.
Re: (Score:3)
What I find interesting is the switch probably uses more power than the cluster.
Re: (Score:3)
Re: (Score:1)
No, that's the definition of "clearly not as interesting or cool a setup as a cluster of Pandaboards" ;)
NEW: Biological and eco-friendly pandaboards. Reinforced with eucalyptus fiber.
Re: (Score:1)
Complete with Lisa Simpsons face and all directly from the recycling plants of Mr Burns? :)
Re: (Score:3)
Now what WOULD BE interesting is a cluster of NUCs [liliputing.com] with Ivy Bridge Core i3s
Re: (Score:3)
That cluster would probably be more valuable if you melted it down to sell the precious metals inside it.
I can't believe they bothered, I can't believe someone wrote an article about it.. somehow I can believe it would get posted to slashdot, though.
Loses to Ivy Bridge (Score:3)
I must have been under a rock for the past few years, but are Ivy Bridge processors really more power-efficient than Atoms, Fusions and even ARMs? I thought they were designed more for speed than efficiency, while the others were made for low consumption. Was I wrong? On the internet?
Re: (Score:3)
Re:Loses to Ivy Bridge (Score:5, Interesting)
With the EP.C workload on all twelve ARM cores, the average power consumption was 30.4 Watts for all six PandaBoards, which is in line with each PandaBoard burning through 5~6 Watts under load. When it comes to the performance-per-Watt, the EP.C test was yielding an average of 1.78 Mop/s per Watt, which was an increase over the single PandaBoard ES at 1.60 Mop/s per Watt.
Page 8 of TFA (yes, my quote was the entire text on that page) claims otherwise, that efficiency of the cluster is even better than that of a single board. I really have no idea how they managed that.
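For anyone who wants to check the arithmetic behind that quote, a quick sketch follows. The three input figures are the ones quoted from page 8; the variable names are mine, and the derived numbers are just what falls out of them, not additional measurements.

```python
# Figures quoted from page 8 of TFA.
CLUSTER_WATTS = 30.4          # average draw of all six boards under EP.C
CLUSTER_MOPS_PER_WATT = 1.78  # six-board cluster
SINGLE_MOPS_PER_WATT = 1.60   # one PandaBoard ES
BOARDS = 6

# Derived values (not stated in the article).
watts_per_board = CLUSTER_WATTS / BOARDS
cluster_mops = CLUSTER_MOPS_PER_WATT * CLUSTER_WATTS
efficiency_gain = CLUSTER_MOPS_PER_WATT / SINGLE_MOPS_PER_WATT - 1

print(f"per-board draw:       {watts_per_board:.2f} W")
print(f"cluster throughput:   {cluster_mops:.1f} Mop/s")
print(f"efficiency vs single: {efficiency_gain:.1%}")
```

So the quoted numbers imply about 5 W per board, about 54 Mop/s in total, and a per-watt figure roughly 11% better than the single board. A cluster with real communication overhead should not normally come out ahead per watt, which is why the result looks odd.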
Re: (Score:1)
No, each board had its own AC adapter.
Re: (Score:1)
I also noticed that the combined wattage requirement was less than that of a single system multiplied by the number of units. I'm guessing that their simple meter is not accounting for all the load, since there are transformers in the AC power supplies.
Re: (Score:2)
Maybe they count the switch in both cases.
Re: (Score:2)
My guess is they may be using a different power supply. The PandaBoard takes 5V @ 4 amps - hardly anything, really. A single quality 90% efficiency desktop PSU with six 5V rails will supply that much power and, even if not operating at peak efficiency (low-amp high-efficiency PSUs are hard to find), it may have beaten out the common wallwarts used for the devices.
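A back-of-the-envelope sketch of how much the supply alone could matter. Everything here except the 5V/4A rating and the 90% figure is an assumption: the 4.5 W per-board DC load and the 75% wall-wart efficiency are invented purely to show the shape of the effect.

```python
def wall_watts(dc_watts, efficiency):
    """AC power drawn at the wall for a given DC load and supply efficiency."""
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    return dc_watts / efficiency


RATED_DC_WATTS = 5 * 4  # 5 V x 4 A supply rating per board (a ceiling, not the draw)

# Hypothetical: each of the six boards really needs 4.5 W DC under load.
dc_load = 6 * 4.5

wallwarts = wall_watts(dc_load, 0.75)   # assumed efficiency of cheap adapters
shared_psu = wall_watts(dc_load, 0.90)  # the 90% desktop PSU figure

print(f"six wall-warts: {wallwarts:.1f} W at the wall")
print(f"one shared PSU: {shared_psu:.1f} W at the wall")
```

Under those assumptions the same boards read several watts lower at the meter on the shared supply, which is the same order as the efficiency difference being discussed.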
Re: (Score:3)
You're confusing efficiency with total power consumption. A desktop Ivy Bridge certainly pulls more watts than the E-350 or Atom boards, but the amount of work that Ivy can do for each of those watts is higher, which gives Ivy the efficiency lead but not a total power-consumption lead.
Re: (Score:2)
I don't know about that. Our house has a last-die Sandybridge i5 which runs circles around the E350 we also have. Power use is roughly par between the two.
Re:Loses to Ivy Bridge (Score:5, Interesting)
Re: (Score:2)
In addition to the much-increased overhead of the cluster (all the mainboards, memory, storage, etc), the Ivy Bridge chip is on the brand-new 22nm process size while the Atom and ARM chips they tested are stuck on the old 45nm. They could have at least gone with 32nm Atoms and ARMs.
Re: (Score:2)
Most people don't have clusters, but in that case you are interested in the power usage of your cluster while the thing is running at full speed. It's not something you're going to put in your phone, but Intel manages some efficient processors.
Re: (Score:2)
Sandybridge, and now Ivybridge, are drastic hand over fist improvements over their previous architectural designs - particularly in terms of power use. An i5 at idle, for instance, is more power efficient than the first generation Atoms as well as the first-generation AMD Bobcat boards (e.g. Hudson), but can do a whole lot more while not idled and still maintains a relatively low power usage.
I suspect that the reason we never saw the Atom SoC (Atom 2) was because the power savings engineering went into Sandy
SPIN (Score:4, Informative)
I'm getting Dramamine for everyone on Slashdot to counteract the ARM FUD.
1. Look at both the AMD and Intel boards for the low-end processors... notice anything? They have all of these... features like PCIe, real memory interfaces, SATA controllers, etc. etc. All of these features consume power. Huge amounts? Not really, but compared to both the E-350 and the Atom CPUs, the amount of power being measured for each board is including a very large amount of power that has zero to do with the CPU. Guess what would happen if I took an E-350 or Atom and put it in an equivalent to the Panda board?
2. Apparently ARM's marketing department ran out of money to pay the poster to describe the Ivy Bridge system used in this test. Here's the short results:
a. In the parallel benchmarks used in this test that are a (probably unrealistically) best-case scenario for the ARM cluster, a single Ivy Bridge CPU was 5 times faster.
b. Oh but ARM says: So what if Ivy is faster! It's a power hog... look it used over 100 WATTS OMG!!!! Well guess what? On a performance-per-watt scale, the Ivy Bridge system is THREE TIMES BETTER THAN ARM.
c. Oh but the ARM fanboys will say that Intel cheated by using a better lithographic process!! Well guess what: ARM loudly brags that it is better because it is an IP only company, so you have to take the good with the bad.
4. Oh one more thing... the Ivy Bridge system had REAL PERIPHERALS like real memory, real PCIe, a real SSD, etc. etc. that by themselves probably used more power than at least one of the ARM boards, probably 2 of them. Oh and by the way.. the power used for the network fabric needed to network those ARM boards... *NOT* included in the power consumption figures so ARM had that as an extra advantage! So in many ways the Ivy Bridge system was intentionally disadvantaged.. and was still THREE TIMES MORE EFFICIENT ON A PER-WATT BASIS THAN ARM IN A SERIES OF BENCHMARKS THAT ARE BEST-CASE-POSSIBLE SCENARIOS FOR ARM.
5. For all of those ARM fanbois who are about to say that PCIe, real RAM interfaces, real SATA support, etc. etc. are inelegant artifacts of the stupid x86 instruction set well.. bite me. The last 5 years of ARM trolls who have literally gone down the feature list of every feature that x86 has that ARM doesn't and found a way to call the features that ARM lacks stupid and moronic (until ARM implements them years later and then claims to have come up with them first) is pissing me off.
Re: (Score:3)
Oh one more thing: The Ivy Bridge system is also cheaper not only for up-front price but also for long-term power efficiency, and you don't have to worry about maintaining 6 sets of hardware and updating software on 6 different nodes in a cluster.
Re: (Score:3)
Point 3 was intentionally omitted and left as an exercise for the reader. If I had been using decimal points, I would have chalked it up to the FDIV bug. ;-)
Re: (Score:1)
Intel spinbot much?
Re: (Score:2)
Considering I also said that the AMD E-350 was misrepresented in these tests, and since the E-350 is a faster CPU part than the Atom, I must not be a very efficient Intel spinbot...
Re: (Score:2)
Re: (Score:2)
Okay, I'll do some counteract to counteract the ARM FUD.
Do you mean that OMAP doesn't have PCIe, real memory interfaces (what do you mean by "real memory interface"? Is there something like a "fake" memory interface?), SATA controllers, etc. etc. etc.? Sorry, but they DO HAVE THEM. Plus, the OMAP 4 series has a GPU, video encoder/decoder, its own 2D accelerator and whatever interface it requires to create a smartphone. Guess what would happen if the OMAP lacked all that stuff?
So, maybe it can be an Intel vs
Re: (Score:2)
I don't suppose it does much good mentioning at this point that the Pandaboard has what is at this point a fairly dated CPU with a fairly low clock. When it came out, it was decent, but at this point it's almost 2 years old. The Tegra 3, for instance, puts it to shame in pretty much every regard.
Are we back to Beowulf cluster comments? (Score:2)
any time soon?
Readability works, but no performance images (Score:2)
http://www.readability.com/articles/sagdka0j [readability.com]
I was impressed that it gets the first 11 pages, and then it includes a 'Next Page' link to in-line the remaining pages. The problem is it didn't get the performance images, which are in separate iframes.
Re: (Score:2)
(self-reply) I just noticed that the iframes are "image/svg+xml" so maybe it's the content-type that's the problem, not that they're in a separate iframe.
It would be nice if Readability had page numbers and links to the original page, for problems like this.
This is fascinating (Score:2)
So Intel finally beats ARM in performance/watt, but a 2-board cluster beats Intel's lowest-power offering. So, basically Intel has finally eroded the advantage ARM has in servers, but ARM still maintains an edge in small, low-power devices. I love that ARM has been so competitive in certain areas. It's good to see something other than x86 everywhere. Imagine if there was no iPhone. Imagine if there was no competition and ARM was still just a slow, but modern and power-efficient core? ARM has come a long, long
Did you know Atom consumes 30W? (Score:4, Informative)
So I'm asking myself how 12 ARMs equal the power consumption of one Atom. So I sat through all the page loads. The "Atom" is a complete off-the-shelf "Net Top" box designed to maximize performance (spinning hard drive and high-end graphics card) with the sole constraint of being noiseless -- i.e. the Atom was chosen by the NetTop manufacturer for low heat, not for low energy consumption.
OK, then for the comparison with Ivy Bridge, I wasn't surprised. I've been salivating about the low-power versions of Ivy Bridge for several months now. But this comparison wasn't even against that. They used the highest-clock, highest-power 3770K variant, which is rated at 77W [wikipedia.org]. There is a 45W version with a slightly lower clock speed. (BTW, Intel "produces" low-power variants the same way they "produce" high-clock variants -- they test the chips after manufacturing to see which ones draw less power.)
So, basically, the comparison is completely pointless and a waste of time.
Re: (Score:2)
To a first approximation, heat = energy consumption. You have to dissipate all that energy you use as heat, after all, and that's why lower wattage parts always run cooler, all other things being equal.
Re: (Score:2)
Re: (Score:2)
Parallelizable (Score:3)
Re: (Score:1)
Re: (Score:2)
It should be, and probably would be, if using the GPU for general computing purposes as you would a CPU was possible yet. But it isn't, so it hardly matters.
de-pagination tools (Score:3)
Safari's reader seems to make good work of that. One long page, all the photos and no ads.
Wow! (Score:1)
Wow! Imagine a Beowulf cluster of these!
Comment removed (Score:3)
Cortex-A15? (Score:2)
The quad core Cortex-A15s have even better perf/watt. Better cache architecture in them. Support for 40-bit physical addressing. ARM is quickly catching up to Atom and Fusion in terms of performance.
Does ARM support ECC? (Score:2)
Real servers need ECC RAM. I'd be reluctant to even run a home file server without it, if that server contains critical data.
Does ARM support ECC? If not, then it can be ruled out on that basis alone. Atom and Bobcat can also be ruled out at this time since neither supports ECC RAM.
A while back Intel announced a 2-core, 1.2 GHz Sandy Bridge "Pentium 350" that has a max TDP of 15W and has the standard server chip package, including ECC support. This would be nice for small, low-power servers. But for some rea