
12-Core ARM Cluster Beats Intel Atom, AMD Fusion

Posted by timothy
from the 16-pages-seriously dept.
An anonymous reader writes "Phoronix constructed a low-cost, low-power 12-core ARM cluster running Ubuntu 12.04 LTS and made out of six PandaBoard ES OMAP4460 dual-core ARMv7 Cortex A9 chips. Their results show the ARM hardware is able to outperform Intel Atom and AMD Fusion processors in performance-per-Watt, except it sharply loses out to the latest-generation Intel Ivy Bridge processors." This cluster offers a commendable re-use of kitchenware. Also, this is a good opportunity to recommend your favorite de-bursting tools for articles spread over too many pages.
This discussion has been archived. No new comments can be posted.

  • by JoeMerchant (803320) on Saturday June 16, 2012 @11:34AM (#40344499)

    Or, could they just not do the MIPS/Watt calculations without actually building the thing?

    • Re:Were they bored? (Score:5, Informative)

      by smallfries (601545) on Saturday June 16, 2012 @11:39AM (#40344537) Homepage

      What would calculating the theoretical peak tell them about the (real) sustained performance?

      Partitioning the problem into chunks that can be distributed to the nodes in the cluster adds overhead. Assembling the finished results does the same. It is kind of hard to predict what this overhead will be, as it depends on the interconnect. In this case they used 100Mb/s ethernet, but there was contention from running NFS over the same network. Building it and measuring it is the only way to find out what kind of performance you really get.
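The overhead argument above can be made concrete with a deliberately crude model. The sketch below is an illustration only, not anything taken from the article: it assumes a perfectly divisible job plus fixed partition and gather costs, and the function names and every number in it are hypothetical.

```python
def cluster_time(work_s, nodes, partition_s, gather_s):
    """Wall-clock time for a job split evenly across `nodes`.

    Assumes the compute part divides perfectly and that partitioning and
    gathering are serial costs paid once. That is a simplification: on a
    shared 100Mb/s link these costs would likely grow with node count.
    """
    return partition_s + work_s / nodes + gather_s


def speedup(work_s, nodes, partition_s, gather_s):
    """Speedup relative to one node running the job with no overhead."""
    return work_s / cluster_time(work_s, nodes, partition_s, gather_s)


if __name__ == "__main__":
    # Hypothetical job: 120 s of compute, 2 s to partition, 3 s to gather.
    for nodes in (1, 2, 6):
        print(nodes, "nodes ->", round(speedup(120.0, nodes, 2.0, 3.0), 2))
```

With these made-up inputs six nodes give a speedup of 4.8 rather than 6. The real partition and gather costs are exactly the part that cannot be guessed without measuring, which is the point being made.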

      • Well, it's hard to get first post and simultaneously develop a complete explanation of the concept, but...

        They have provided yet another valuable datapoint in the theoretical peak vs actual sustained performance testing set, but, again, this is widely studied, characterized fairly well and predictable with a bit of research and thought experiment.

        Reading the article (also impossible to do in a first post time constraint), reveals that they had a particular idea about using a wooden dish strainer to rack the

    • by davydagger (2566757) on Saturday June 16, 2012 @11:42AM (#40344551)
      Half the fun is building it. Good excuse to build a 12-core mini-cluster. I think this is nothing more than some nerd showing off his latest toy, which is not a bad thing. This 12-core cluster might be useful, at the very least as a proof of concept. I could imagine the uses for a highly parallel mini-supercomputer on an affordable budget.
    • Re:Were they bored? (Score:5, Informative)

      by timeOday (582209) on Saturday June 16, 2012 @11:43AM (#40344557)
      I think independent testing of this sort is tremendously valuable.

      What I don't understand is why the summary is focused on ARM beating Atom when the overall winner - in performance, in performance per watt, and in cost - was the Intel Ivy Bridge... by a huge margin.

      • by Anonymous Coward

        OMAP4460 45nm
        Atom 45nm, or 32nm for newer models
        Ivy Bridge 22nm

        • by TheLink (130905)
          Yeah and my car is based on old tech so it has crappier performance than more modern cars.

          Whatever the excuse, a loss is still a loss.

          ARM is way better for low power consumption stuff, but if you want performance/watt, Intel still leads.
      • by Idbar (1034346) on Saturday June 16, 2012 @12:10PM (#40344721)
        I think the confusion is that people think Atom is analogous to ARM. People keep confusing the fact that ARM is a core processor and Atom an SoC solution. It makes no sense comparing apples to oranges. An appropriate comparison would be an SoC from TI, Qualcomm or Samsung.
        • by Anonymous Coward

          Well, it was an SoC from TI, but any current-generation SoC with the same core will get reasonably similar performance for CPU-bound tasks.

          And the Atom they benchmarked wasn't an SoC anyway (only Medfield is), so... enjoying your trip?

          • by Idbar (1034346)
            I'm not against this type of benchmarking, I actually enjoy reading people writing them up. Now, on the other hand, I don't think it's fair to compare a cluster of laptops vs. a cluster of desktops. It's fun... but without the proper metrics, it's useless.

            How many cores can you fit in a cubic meter, for example? What's the performance per watt per cubic meter? What's the performance of a solution like the Tegra? How do you measure the difference between added hardware, like radios or GPUs, etc.?
        • by Dcnjoe60 (682885) on Saturday June 16, 2012 @01:18PM (#40345217)

          I think the confusion is that people think Atom is analogous to ARM. People keep confusing the fact that ARM is a core processor and Atom an SoC solution. It makes no sense comparing apples to oranges. An appropriate comparison would be an SoC from TI, Qualcomm or Samsung.

          But then how could they generate media hype by announcing they are outperforming intel?

        • by the_arrow (171557)

          comparing apples to oranges

          Why can't you compare apples to oranges? They are both fruits, are picked from trees, are spherical, and have mass, volume, colour and taste. Of course you can compare them! The result of such a comparison will most likely be that they are not very similar, but the same conclusion will probably be reached by comparing a car to a navel.

      • by aliquis (678370)

        To get readers?

      • by Kjella (173770)

        What I don't understand is why the summary is focused on ARM beating Atom when the overall winner - in performance, in performance per watt, and in cost - was the Intel Ivy Bridge... by a huge margin.

        Because this is slashdot and the AMD/ARM vs Intel bias is almost as strong as Linux vs Windows? Their best selling point is the APUs but in reality Intel is the one favored most by the move to decent integrated graphics, people still buy Intel but now instead of an AMD/nVidia entry level card many just stick with the integrated one, making GPU market share become more like CPU market share. And Intel is the one with a half-decent ARM competitor (Intel Medfield), AMD isn't ready to play in that arena at all.

      • Agreed... and frankly... I thought the comparison was utter crap... Really... a first-generation Atom against a modern ARM? First-generation Atom was utter crap and solved no issue other than providing a cheap Atom-based platform to play with. What about the N2600 or, even better... the Medfield (had to google for ages for that name haha)? Atom 330 was just not worth it.
    • by ddd0004 (1984672)

      Actually, I bet they did it to attract women.

    • by hairyfeet (841228)

      Not to mention the point is.....what exactly? The whole selling point of ARM is how long it will run on a battery. Plugged in, the difference between, say, 7W and 12W really isn't enough to get your panties in a twist over, and while ARM may get lower power usage while doing work, there is no denying that Intel and AMD have the IPS crown by a pretty wide margin, even more so if your code is able to use OpenCL so you can use both halves of a Fusion APU to do useful work.

      I just don't get all the "Us VS Them" bulls

      • You are thinking like an engineer not a marketeer. And any guess which side writers/editors are going to lean towards? Even in the field that writes about engineering stuff?

        • by hairyfeet (841228)

          Well the nice thing I've found, which is why I ignore most of the benches, is that X86 has gone so far beyond good enough and into insanely overpowered that even a low end system simply never gets stressed, the users simply can't come up with enough work for the chip to do.

          Last year I built my dad a Phenom I quad because I found a kit cheap. Now most here would consider that a pretty weak chip; we're talking a 2.1GHz first-gen Phenom. Now guess what I found? That after 3 months he had simply never gone abo

          • by Targon (17348)

            Applications are SLOWLY making better use of multiple core machines, and that means that as time goes forward, more cores makes for a better experience. The problem you are seeing, that many people are not stressing the system is caused by applications not making good use of system resources. In most cases, even multi-threaded apps are using what, two or three threads when they should be using six or more for what is being done.

            Basically, we are seeing most developers failing to re-write applications

            • by vivian (156520)

              A lot of apps simply can't be threaded that well.
              Even games, with all their graphical and sound goodness, can't use multiple cores that effectively.
              You will have one heavy thread which is doing all the graphics. You can throw AI onto one or two threads, put sound on another and UI on another, plus networking and other IO could be on additional threads, but the graphics thread will be the really heavy one, and the rest will be very lightweight in comparison. You can't break the graphics thread out to multiple

              • by hairyfeet (841228)

                That is why I never understood AMD having identical cores as it seems to be a waste with the exception of a few apps like video processing. That is why I snatched a Thuban when they were cheap as at least turbocore will ramp up when you are only using one to three cores heavily but a better design would probably be an uber-powerful Core 0, followed by a decently powerful Core 1 & 2, with the Cores after that being slightly more powerful than Bobcats.

                Because as you so rightly pointed out frankly it doe

  • price much? (Score:1, Interesting)

    by Osgeld (1900440)

    I don't know the exact model used, but the first one I could find online was 182 bucks * 6; that's a grand just to prove a point (+ other hardware). Hope it was worth it to beat a $60 Atom.

    • by Anonymous Coward

      Really, but if you don't understand why this is interesting you better turn in your geek card.

      • by jedidiah (1196)

        Nope. We just did this kind of thing back when something as powerful as that ARM hardware was considered leading edge. We also did real work with it.

        12 ARMs to replace a trailing edge x86? Funny.

    • by Anonymous Coward

      Except they didn't need all 6. 2 Pandaboards = 1 Atom 330 nettop. (A shade less in one benchmark, a bit more in the other.)

      And I'm not sure where you pulled that $60 figure from, but I haven't seen any 330 nettops that cheap. Is this that thing where you count the whole system for one side, and just the CPU for the other side?

    • by reub2000 (705806)

      How is that board worth $182?

  • by Glasswire (302197) <glasswire@gmail. ... minus herbivore> on Saturday June 16, 2012 @11:45AM (#40344565) Homepage

    "Besides winning on performance and efficiency, the Core i7 3770K system would cost less than the cost of a six PandaBoard ES cluster setup."
    So a single Ivy Bridge system, which takes up much less rack space, no cluster network ports, outperforms and costs less than the ARM cluster. Is that the definition of a no-brainer?

    • by Noughmad (1044096) <miha.cancula@gmail.com> on Saturday June 16, 2012 @11:52AM (#40344605) Homepage

      And yet Phoronix managed to squeeze 16 pages out of it. Good job.

      • by KreAture (105311)
        It's called page views and ad reloads.

        What I find interesting is the switch probably uses more power than the cluster.
    • So a single Ivy Bridge system, which takes up much less rack space, no cluster network ports, outperforms and costs less than the ARM cluster. Is that the definition of a no-brainer?

      No, that's the definition of "clearly not as interesting or cool a setup as a cluster of Pandaboards" ;)

      • by aliquis (678370)

        No, that's the definition of "clearly not as interesting or cool a setup as a cluster of Pandaboards" ;)

        NEW: Biological and eco-friendly pandaboards. Reinforced with eucalyptus fiber.

        • by aliquis (678370)

          Complete with Lisa Simpson's face and all, directly from the recycling plants of Mr Burns? :)

      • by Glasswire (302197)

        Now what WOULD BE interesting is a cluster of NUCs [liliputing.com] with Ivy Bridge Core i3s

    • To the ARM fanboys it is, apparently. The whole exercise seems fairly pointless to me. Intel netbook CPU outperformed by 12 competing CPUs...

      That cluster would probably be more valuable if you melted it down to sell the precious metals inside it.
      I can't believe they bothered, I can't believe someone wrote an article about it.. somehow I can believe it would get posted to slashdot, though.
  • by Noughmad (1044096) <miha.cancula@gmail.com> on Saturday June 16, 2012 @12:00PM (#40344667) Homepage

    I must have been under a rock for the past few years, but are Ivy Bridge processors really more power-efficient than Atoms, Fusions and even ARMs? I thought they were designed more for speed than efficiency, while the others were made for low consumption. Was I wrong? On the internet?

    • by scheme (19778)
      They're more power efficient if you're looking for high performance at reasonable power levels. The ARMs might be much better for tasks that don't need much computation but if you end up needing to combine a bunch of ARM boards into a cluster to get the performance you need then there's a lot of overhead that adds to the power consumption without giving you much.
      • by Noughmad (1044096) <miha.cancula@gmail.com> on Saturday June 16, 2012 @12:18PM (#40344755) Homepage

        With the EP.C workload on all twelve ARM cores, the average power consumption was 30.4 Watts for all six PandaBoards, which is in line with each PandaBoard burning through 5~6 Watts under load. When it comes to the performance-per-Watt, the EP.C test was yielding an average of 1.78 Mop/s per Watt, which was an increase over the single PandaBoard ES at 1.60 Mop/s per Watt.

        Page 8 of TFA (yes, my quote was the entire text on that page) claims otherwise, that efficiency of the cluster is even better than that of a single board. I really have no idea how they managed that.
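For what it's worth, the quoted figures can be run through some quick arithmetic. The sketch below uses only the three numbers in the quote (30.4 W, 1.78 and 1.60 Mop/s per Watt). The last step is an assumption, not something the article states: it supposes each board delivers the same throughput in the cluster as it does alone, and asks what single-board power draw that would imply.

```python
# Figures quoted from page 8 of the article.
CLUSTER_POWER_W = 30.4       # average draw of all six boards under EP.C
CLUSTER_MOPS_PER_W = 1.78    # cluster efficiency
SINGLE_MOPS_PER_W = 1.60     # single PandaBoard ES efficiency
BOARDS = 6

# Throughput implied by the cluster's power and efficiency figures.
cluster_mops = CLUSTER_POWER_W * CLUSTER_MOPS_PER_W
per_board_w = CLUSTER_POWER_W / BOARDS
per_board_mops = cluster_mops / BOARDS

# ASSUMPTION (not from the article): perfect scaling, i.e. a lone board
# produces the same Mop/s as each board does inside the cluster.
implied_single_w = per_board_mops / SINGLE_MOPS_PER_W

if __name__ == "__main__":
    print(f"cluster throughput:   {cluster_mops:.2f} Mop/s")
    print(f"power per board:      {per_board_w:.2f} W (in cluster)")
    print(f"implied single board: {implied_single_w:.2f} W (if scaling were perfect)")
```

Under that assumption a lone board would be drawing about 5.64 W against roughly 5.07 W per board in the cluster, both inside the quoted 5~6 W range. So the better cluster figure may come down to how power was measured per board rather than to any real scaling gain, though the quote alone cannot settle that.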

        • by CityZen (464761)

          I also noticed that the combined wattage requirement was less than that of a single system multiplied by the number of units. I'm guessing that their simple meter is not accounting for all the load, since there are transformers in the AC power supplies.

        • by fa2k (881632)

          Maybe they count the switch in both cases.

        • by CAIMLAS (41445)

          My guess is they may be using a different power supply. The PandaBoard takes 5V @ 4 amps, hardly anything, really. A single quality 90% efficiency desktop PSU with six 5V rails will supply that much power and, even if not operating at peak efficiency (low-amp high-efficiency PSUs are hard to find), it may have beaten out the common wallwarts used for the devices.

    • You're confusing efficiency with total power consumption. A desktop Ivy Bridge certainly pulls more watts than the E-350 or Atom boards, but the amount of work that Ivy can do for each of those watts is higher, which gives Ivy the efficiency lead but not a total power-consumption lead.
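The distinction between efficiency and total draw is easy to show with a toy comparison. The sketch below is illustrative only: the `System` class is a hypothetical helper, and both sets of numbers are invented for the example rather than taken from the article's measurements.

```python
from dataclasses import dataclass


@dataclass
class System:
    name: str
    power_w: float          # total draw under load, in watts
    throughput_mops: float  # work done per second, in Mop/s

    @property
    def mops_per_watt(self) -> float:
        """Efficiency: work per second delivered for each watt drawn."""
        return self.throughput_mops / self.power_w


# Hypothetical figures chosen only to illustrate the two rankings.
systems = [
    System("low-power board", power_w=20.0, throughput_mops=30.0),
    System("desktop chip", power_w=100.0, throughput_mops=500.0),
]

lowest_power = min(systems, key=lambda s: s.power_w)
most_efficient = max(systems, key=lambda s: s.mops_per_watt)

if __name__ == "__main__":
    for s in systems:
        print(f"{s.name}: {s.power_w:.0f} W, {s.mops_per_watt:.1f} Mop/s per W")
    print("lowest total power:", lowest_power.name)
    print("best efficiency:   ", most_efficient.name)
```

The two rankings need not agree: in this made-up case the part that pulls five times the watts is still the more efficient one, because it does more than five times the work.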

      • by CAIMLAS (41445)

        I don't know about that. Our house has a last-die Sandybridge i5 which runs circles around the E350 we also have. Power use is roughly par between the two.

    • by wbr1 (2538558) on Saturday June 16, 2012 @12:29PM (#40344841)
      Ivy Bridge is more efficient in work done per watt, yes, but ARM still wins for low-power devices like phones because it draws so much less power. The fact that it does less with that power is moot, because it does enough and lets your battery last much longer.
    • In addition to the much-increased overhead of the cluster (all the mainboards, memory, storage, etc), the Ivy Bridge chip is on the brand-new 22nm process size while the Atom and ARM chips they tested are stuck on the old 45nm. They could have at least gone with 32nm Atoms and ARMs.

    • Because they are looking at performance per watt, not "power usage during normal use." Most people think of "power usage during normal use" when they are talking on the internet, because they're thinking of power usage in their phone.

      Most people don't have clusters, but in that case you are interested in the power usage of your cluster while the thing is running at full speed. It's not something you're going to put in your phone, but Intel manages some efficient processors.
    • by CAIMLAS (41445)

      Sandybridge, and now Ivybridge, are drastic hand-over-fist improvements over their previous architectural designs, particularly in terms of power use. An i5 at idle, for instance, is more power efficient than the first-generation Atoms as well as the first-generation AMD Bobcat boards (e.g. Hudson), but can do a whole lot more while not idle and still maintains a relatively low power usage.

      I suspect that the reason we never saw the Atom SoC (Atom 2) was because the power savings engineering went into Sandy

  • SPIN (Score:4, Informative)

    by CajunArson (465943) on Saturday June 16, 2012 @12:20PM (#40344763) Journal

    I'm getting Dramamine for everyone on Slashdot to counteract the ARM FUD.

    1. Look at both the AMD and Intel boards for the low-end processors... notice anything? They have all of these... features like PCIe, real memory interfaces, SATA controllers, etc. etc. All of these features consume power. Huge amounts? Not really, but compared to both the E-350 and the Atom CPUs, the amount of power being measured for each board is including a very large amount of power that has zero to do with the CPU. Guess what would happen if I took an E-350 or Atom and put it in an equivalent to the Panda board?

    2. Apparently ARM's marketing department ran out of money to pay the poster to describe the Ivy Bridge system used in this test. Here's the short results:
          a. In the parallel benchmarks used in this test that are a (probably unrealistically) best-case scenario for the ARM cluster, a single Ivy Bridge CPU was 5 times faster.
          b. Oh but ARM says: So what if Ivy is faster! It's a power hog... look it used over 100 WATTS OMG!!!! Well guess what? On a performance-per-watt scale, the Ivy Bridge system is THREE TIMES BETTER THAN ARM.
          c. Oh but the ARM fanboys will say that Intel cheated by using a better lithographic process!! Well guess what: ARM loudly brags that it is better because it is an IP only company, so you have to take the good with the bad.

    4. Oh one more thing... the Ivy Bridge system had REAL PERIPHERALS like real memory, real PCIe, a real SSD, etc. etc. that by themselves probably used more power than at least one of the ARM boards, probably 2 of them. Oh and by the way.. the power used for the network fabric needed to network those ARM boards... *NOT* included in the power consumption figures so ARM had that as an extra advantage! So in many ways the Ivy Bridge system was intentionally disadvantaged.. and was still THREE TIMES MORE EFFICIENT ON A PER-WATT BASIS THAN ARM IN A SERIES OF BENCHMARKS THAT ARE BEST-CASE-POSSIBLE SCENARIOS FOR ARM.

    5. For all of those ARM fanbois who are about to say that PCIe, real RAM interfaces, real SATA support, etc. etc. are inelegant artifacts of the stupid x86 instruction set well.. bite me. The last 5 years of ARM trolls who have literally gone down the feature list of every feature that x86 has that ARM doesn't and found a way to call the features that ARM lacks stupid and moronic (until ARM implements them years later and then claims to have come up with them first) is pissing me off.

    • Oh one more thing: The Ivy Bridge system is also cheaper, not only in up-front price but also in long-term power efficiency, and you don't have to worry about maintaining 6 sets of hardware and updating software on 6 different nodes in a cluster.

    • Intel spinbot much?

      • Intel spinbot much?

        Considering I also said that the AMD E-350 was misrepresented in these tests, and since the E-350 is a faster CPU part than the Atom, I must not be a very efficient Intel spinbot...

    • by ihavnoid (749312)

      Okay, I'll do some counteracting to counteract the ARM FUD.

      Do you mean that OMAP doesn't have PCIe, real memory interfaces (what do you mean by "real memory interface"? Is there something like a "fake" memory interface?), SATA controllers, etc.? Sorry, but they DO HAVE THEM. Plus, the OMAP 4 series has a GPU, video encoder/decoder, its own 2D accelerator and whatever interface it requires to create a smartphone. Guess what would happen if the OMAP lacked all that stuff?

      So, maybe it can be a Intel vs

    • by CAIMLAS (41445)

      I don't suppose it does much good mentioning at this point that the Pandaboard has what is at this point a fairly dated CPU with a fairly low clock. When it came out, it was decent, but at this point it's almost 2 years old. The Tegra 3, for instance, puts it to shame in pretty much every regard.

  • http://www.readability.com/articles/sagdka0j [readability.com]

    I was impressed that it gets the first 11 pages, and then it includes a 'Next Page' link to in-line the remaining pages. The problem is it didn't get the performance images, which are in separate iframes.

    • by mounthood (993037)

      (self-reply) I just noticed that the iframes are "image/svg+xml" so maybe it's the content-type that's the problem, not that they're in a separate iframe.

      It would be nice if Readability had page numbers and links to the original page, for problems like this.

  • So Intel finally beats ARM in performance/watt, but a 2-board cluster beats Intel's lowest-power offering. So, basically Intel has finally eroded the advantage ARM has in servers, but ARM still maintains an edge in small, low-power devices. I love that ARM has been so competitive in certain areas. It's good to see something other than x86 everywhere. Imagine if there was no iPhone. Imagine if there was no competition and ARM was still just a slow, but modern and power-efficient core? ARM has come a long, long

  • by michaelmalak (91262) <michael@michaelmalak.com> on Saturday June 16, 2012 @12:41PM (#40344939) Homepage

    So I'm asking myself how 12 ARMs equal the power consumption of one Atom. So I had to sit through all the page loads. The "Atom" is a complete off-the-shelf "Net Top" box designed to maximize performance (spinning hard drive and high-end graphics card) with the sole constraint of being noiseless -- i.e. the Atom was chosen by the NetTop manufacturer for low heat, not for low energy consumption.

    OK, then for the comparison with Ivy Bridge, I wasn't surprised. I've been salivating about the low-power versions of Ivy Bridge for several months now. But this comparison wasn't even against that. They used the highest-clock, highest-power 3770K variant, which is rated at 77W [wikipedia.org]. There is a 45W version with a slightly lower clock speed. (BTW, Intel "produces" low-power variants the same way they "produce" high-clock variants -- they test the chips after manufacturing to see which ones draw less power.)

    So, basically, the comparison is completely pointless and a waste of time.

    • by Nimey (114278)

      To a first approximation, heat = energy consumption. You have to dissipate all that energy you use as heat, after all, and that's why lower wattage parts always run cooler, all other things being equal.

      • You missed my point. The Net Top included, besides the Atom, other devices such as the hard drive and high-end graphics card, which were power-hungry but did not happen to require a fan.
  • by Luthair (847766) on Saturday June 16, 2012 @12:43PM (#40344953)
    If you're looking at highly parallelizable workloads shouldn't the GPU in the AMD part be part of the equation?
    • by CAIMLAS (41445)

      It should be, and probably would be, if using the GPU for general computing purposes as you would a CPU was possible yet. But it isn't, so it hardly matters.

  • by burne (686114) on Saturday June 16, 2012 @12:48PM (#40344995)

    Safari's reader seems to make good work of that. One long page, all the photos and no ads.

  • by Anonymous Coward

    Wow! Imagine a Beowulf cluster of these!

  • by scsirob (246572) on Saturday June 16, 2012 @02:47PM (#40345661)

    .. Imagine a Beowulf cluster of those!!

  • The quad core Cortex-A15s have even better perf/watt. Better cache architecture in them. Support for 40-bit physical addressing. ARM is quickly catching up to Atom and Fusion in terms of performance.

  • Real servers need ECC RAM. I'd be reluctant to even run a home file server without it, if that server contains critical data.

    Does ARM support ECC? If not, then it can be ruled out on that basis alone. Atom and Bobcat can also be ruled out at this time since neither supports ECC RAM.

    A while back Intel announced a 2-core, 1.2 GHz Sandy Bridge "Pentium 350" that has a max TDP of 15W and has the standard server chip package, including ECC support. This would be nice for small, low-power servers. But for some rea
