
News for nerds, stuff that matters


Virtualized Linux Faster Than Native?

Posted by ScuttleMonkey on Wed May 31, 2006 06:31 AM
from the too-specialized-to-count dept.
^switch writes "Aussies at NICTA have developed a para-virtualized Linux called Wombat that they claim outperforms native Linux. From the article: 'The L4 Microkernel works with its own open source operating system Iguana, which is specifically designed as a base for use in embedded systems.'" Specific performance results are also available from the NICTA website.
This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • by LiquidCoooled (634315) on Wednesday May 31 2006, @06:32AM (#15434194) Homepage Journal
    Warning

    Running a virtual Iguana OS from within a virtualised Linux environment is dangerous.
    ETROS and NICTA will not be held responsible for any resulting time paradoxes.


    hmmmm
  • I can see Linus already gearing up to defend his position that microkernels are crap.

    However, I thought the purpose of a microkernel was stability, not performance.
  • Bad Second Link (Score:5, Informative)

    by Ctrl+Alt+De1337 (837964) on Wednesday May 31 2006, @06:36AM (#15434205) Homepage Journal
    Ignore the second link. The actual performance results are here [nicta.com.au].
  • by DaHat (247651) on Wednesday May 31 2006, @06:36AM (#15434207) Homepage
    Just how fast would a virtualized Linux instance running inside of a virtualized Linux instance running on hardware be?
  • by XoXus (12014) on Wednesday May 31 2006, @06:37AM (#15434210)
    The summary is misleading a bit - it's only faster on ARM v4 or v5 processors.

    From TFA:

    Wombat, NICTA's architecture-independent para-virtualised Linux for L4-embedded, can be faster than native Linux on the same hardware. Specifically on popular ARM v4 or v5 processors, such as ARM9 cores or the XScale, Wombat benefits from the fast address-space switch (FASS) technology implemented in L4-embedded, while this is not supported in native Linux distributions.
    • Only? (Score:4, Interesting)

      by Anonymous Coward on Wednesday May 31 2006, @06:53AM (#15434261)
      I'm not sure if you realize the market penetration of ARM-based processors. They're basically everywhere. One popular use is in routers. Many printers also have ARM chips. They're also very widely used in cell phones and other mobile technology.

      It benefits us all if more performance can be extracted from such chips, simply because they're so widely used. Getting more performance out of a device already in use can lead to lower-cost systems. To suggest that this is of limited use is naive, given how prevalent these processors are.

      • Re:Only? (Score:5, Informative)

        by JanneM (7445) on Wednesday May 31 2006, @07:10AM (#15434323) Homepage
        It benefits us all if more performance can be extracted from such chips, just because they're so widely used.

        The reaction is not against the performance but the disingenuous presentation. A cursory reading makes it seem as if the performance gain was somehow tied to it being a microkernel, or that the virtualization step somehow magically sped things up. It wasn't: their kernel is using some platform-specific optimizations that Linux doesn't, that's all.

      • However, that quote gives the reason for the performance boost: fast address-space switch (FASS) is supported in L4-embedded, but not in native Linux. IOW, it's not really "virtualized faster than native", but "using FASS faster than not using it". I guess you'd get even better performance if you made native Linux support FASS.
    • Microkernels will not come of age until CPUs support true modularization. Previously on /. :

      http://linux.slashdot.org/comments.pl?sid=185800&cid=15341069 [slashdot.org]

  • Neato but... (Score:3, Interesting)

    by tomstdenis (446163) <tomstdenisNO@SPAMgmail.com> on Wednesday May 31 2006, @06:43AM (#15434226) Homepage
    They sacrificed portability by performing some TLB caching hacks. It's a good idea but comparing it to Linux as a whole is a bad idea as Linux runs on more than the ARM they're testing on. If you look at all of the results most are comparable and exec/fork favour Linux.

    Tom
  • Twice the buffering (Score:4, Interesting)

    by jellomizer (103300) * on Wednesday May 31 2006, @06:45AM (#15434233)
    It is possible. Start with drive access. Normally data is buffered in memory and written out to the drive when the OS sees fit, and while it is in memory it can be accessed faster. Once the hardware is virtualized, a guest write to the hard drive goes to the host OS, which buffers it in memory again and writes it to the drive when it sees fit. The files stay buffered in memory for twice as long, which doubles the window in which the faster copy can be used. Drive access is usually the largest slowdown on a system, and while the host OS is writing to the drive the virtualized Linux kernel is free to do what it wants. I am sure that an application needing a lot of interrupt calls or a lot of screen output will slow down (unless the virtualized video drivers are much more optimized than the normal ones).
    So it is possible, as long as you have a system powerful enough to run both OSs well, with a lot of RAM.

    • by tomstdenis (446163) <tomstdenisNO@SPAMgmail.com> on Wednesday May 31 2006, @06:57AM (#15434272) Homepage
      This is OT.

      The speedup comes from TLB caching between processes. Not from "double caching".

      In Linux when you switch processes the TLB is flushed [e.g. reloading CR3 on x86-*]. This is a safe [but slow] way to ensure your virtual memory for a given process is mapped correctly. I'm guessing [didn't fully read the linked research papers] that they share a virtual memory base between processes but map processes to different regions or something. Unless they have segment limits this will cause problems with process isolation.

      For those not in the know, a TLB caches the translation of a virtual address into a physical one. Translating a typical 32-bit address requires several layers of table lookup [with 4KB pages on non-PAE x86 it's two], which would be slow if you had to do it for every memory access. For example, take your 32-bit address: the lower 12 bits are the byte offset within a 4KB page, the next 10 bits select one 4KB page, and the top 10 bits select one 1024-entry array of pointers to 4KB pages [iirc]. It's even worse in x86-64 mode, where a 48-bit virtual address has to be walked.

      So the processor caches the results of these lookups in the TLB. When you switch processes you have to flush it, because the cached translations don't map to the new process's physical memory.

      Tom
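The address split described in the comment above can be sketched in a few lines. This is a minimal illustration that assumes the classic two-level, non-PAE x86 layout with 4KB pages (10-bit directory index, 10-bit table index, 12-bit offset); the function name and constants are made up for the example and do not come from the linked papers.

```python
# Two-level split of a 32-bit virtual address with 4 KB pages.
# Assumed layout (non-PAE x86): 10 bits directory | 10 bits table | 12 bits offset.
PAGE_OFFSET_BITS = 12
TABLE_INDEX_BITS = 10


def split_virtual_address(vaddr):
    """Return (directory index, table index, byte offset) for a 32-bit address."""
    if not 0 <= vaddr < 2**32:
        raise ValueError("expected a 32-bit address")
    # Low 12 bits: byte within the 4 KB page.
    offset = vaddr & ((1 << PAGE_OFFSET_BITS) - 1)
    # Next 10 bits: which entry of the 1024-entry page table.
    table_index = (vaddr >> PAGE_OFFSET_BITS) & ((1 << TABLE_INDEX_BITS) - 1)
    # Top 10 bits: which page table, selected via the page directory.
    dir_index = vaddr >> (PAGE_OFFSET_BITS + TABLE_INDEX_BITS)
    return dir_index, table_index, offset


print(split_virtual_address(0xC0101234))  # (768, 257, 564)
```

Each translation that misses the TLB costs one memory read per level, which is why caching the final result matters so much.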
    • This is interesting... quite often I've seen Windows XP start up faster under qemu than it would natively as Linux has kept an amount of the disk image in the cache. (Of course, if I start it from cold, it spends about 20 seconds just transferring a portion of the image into RAM, and then the rest of the startup is very quick.)
    • Also, a lot of software tends to sync after every write, believing that this is the holy grail for data consistency. This is not true, as losing power already pretty much guarantees data loss. Plus, most hardware faults cause something worse than just a clean, nice instant shutdown. Having the disks synced won't save you from most motherboard glitches, slightly faulty memory, and so on. Plus, a sudden poweroff is something that can be easily handled with a UPS; there is no such easy way to ensure the rest of t
  • Hm (Score:5, Informative)

    by FidelCatsro (861135) * <fidelcatsro.gmail@com> on Wednesday May 31 2006, @06:49AM (#15434249) Journal
    Could it be because Linux for ARM is not that well optimised? I can't imagine such massive performance gains otherwise, bar a massive bug in the kernel.

    Fast Address-Space Switching for ARM Linux Kernels [nicta.com.au]

    The Fast Address-Space Switching (FASS) project aims to utilise some of the features of the Memory Management Unit in the ARM architecture to improve the performance of context switches under the L4 kernel and ARM Linux.
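To see why avoiding TLB flushes on a context switch can matter, here is a toy model. It is only a sketch under loud assumptions: it is not the actual FASS mechanism (the thread says only that FASS uses features of the ARM MMU), the TLB has unlimited capacity, and the class and function names are invented for the illustration. It compares an untagged TLB that is flushed on every switch with one whose entries are tagged per address space.

```python
class ToyTLB:
    """Toy TLB with unlimited capacity; counts translation misses."""

    def __init__(self, tagged):
        self.tagged = tagged      # True: entries carry an address-space tag
        self.entries = set()
        self.misses = 0
        self.current = None

    def switch_to(self, pid):
        if not self.tagged:
            # Untagged entries cannot be told apart, so flush on every switch.
            self.entries.clear()
        self.current = pid

    def access(self, vpage):
        key = (self.current if self.tagged else None, vpage)
        if key not in self.entries:
            self.misses += 1      # would trigger a page-table walk
            self.entries.add(key)


def run(tagged, rounds=10, pages=range(8)):
    """Alternate between two processes that each touch the same 8 pages."""
    tlb = ToyTLB(tagged)
    for _ in range(rounds):
        for pid in ("A", "B"):
            tlb.switch_to(pid)
            for page in pages:
                tlb.access(page)
    return tlb.misses


print("flush on switch:", run(tagged=False))  # 160 misses
print("tagged entries: ", run(tagged=True))   # 16 misses
```

In this model the flushing variant re-walks the page tables for every page after every switch, while the tagged variant pays only once per process and page.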
  • by VincenzoRomano (881055) on Wednesday May 31 2006, @06:54AM (#15434264) Homepage Journal
    I think that the whole L4 family [l4hq.org] of microkernels deserves some more attention from IT professionals.
    As far as I know, L4 is one of the microkernels with the most development effort behind it. Along with MinixV3 [minix3.org], of course.
    • Sure it deserves attention, but what's the point of using L4 to run.....a monolithic kernel?

      When running Linux under L4 (as in L4Linux), when the Linux process dies because of a bug, the system DIES. Sure, you can restart it, but you can do the same in Linux when something oopses, using kexec.

      L4 was written to run real microkernels on top of it. If you want to run Linux instances so that a crash of the kernel doesn't crash the system, you'd be surprised to know that Linux already includes in its heart a vm-ish/mic
  • Do those performance wins also hold outside the embedded world?
    I fear that Linux running on a "normal" x86 CPU outperforms almost everything else.

  • Pet maths peeve (Score:3, Interesting)

    by Emil Brink (69213) on Wednesday May 31 2006, @07:17AM (#15434350) Homepage
    The performance results [nicta.com.au] page states:

    The result is that context-switching costs of virtualised Linux are up to thirty times less than in native Linux.

    (Emphasis in the original text). This is one of my pet peeves, since I think it's such a sloppy use of maths. How can something be "thirty times less"? So, if it takes one second in Linux, it takes them ... what? 1 - 30 * 1 = -29 seconds? I guess they mean 1/30th of a second, but still, that should have been caught before being published, imo. Or maybe it annoys me so just because I'm not a native speaker of English.

    • How can something be "thirty times less?" So, if it takes one second in Linux, it takes them ... what? 1 - 30 * 1 = -29 seconds?


      Definitely a problem with your interpretation of the English. "smaller by 30 times", "smaller by a factor of 30".

      The "less" there isn't read as a (subtraction) operator.
    • It's mathematically and grammatically quite sound: if X is 30 times more than Y, then Y is 30 times less than X.
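The two readings argued over in this thread can be put side by side. A minimal sketch, using the one-second figure from the parent comment as a purely hypothetical cost (it is not a measured result), with a helper name invented for the example:

```python
def times_less(value, factor):
    """Read 'factor times less' as division by the factor."""
    return value / factor


native_cost = 1.0  # hypothetical: one second, as in the comment above

# Multiplicative reading: one thirtieth of the native cost.
print(times_less(native_cost, 30))

# Subtractive reading from the complaint: a negative, meaningless duration.
print(native_cost - 30 * native_cost)  # -29.0
```

Only the multiplicative reading gives a physically sensible cost, which is presumably what the NICTA page intends.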

  • by agent dero (680753) on Wednesday May 31 2006, @07:19AM (#15434360) Homepage
    I've been researching NICTA's microkernel and virtualization more and more (for my L4::BSD [google.com] idea), and one thing that is important to understand is that NICTA's development is mainly on ARM. The Kenge toolset and the Iguana OS are both much further along on ARM than on i386.

    Considering the work that NICTA does with companies that produce embedded hardware like Qualcomm [nicta.com.au], this isn't surprising, but don't go crazy about this.

    Linux development is much more fine tuned on x86, and Kenge/Iguana development is much more fine tuned on ARM; no need to start holy wars here ;)

    That said, nice work benno, chuck, and gernot (and whoever else I'm forgetting)
  • by waif69 (322360) on Wednesday May 31 2006, @07:39AM (#15434440) Journal
    If one were to use 33 levels of virtualization on the ARM processor, the efficiency is so great that power may be removed and the system runs on its own efficiency. Yeah! We don't need oil anymore.
  • by pmbuko (162438) <.pmbuko. .at. .gmail.com.> on Wednesday May 31 2006, @07:39AM (#15434443)
    Even better than the real thing....
  • by mikael (484) on Wednesday May 31 2006, @08:30AM (#15434723)
    ^switch writes "Aussies at NICTA have developed a para-virtualized Linux called Wombat that they claim outperforms native Linux.

    So if a para-virtualised microkernel runs a para-virtualised microkernel running Linux, then there should be an even greater speedup?
  • Strange (Score:3, Insightful)

    by Sgt Pinback (118723) on Wednesday May 31 2006, @08:37AM (#15434788)
    So, what are they trying to show? "Because we've implemented support for a certain MMU feature and native Linux hasn't, we've demonstrated that virtualizing Linux on L4 is a good idea"? Doesn't sound perfectly logical to me. Apples and oranges come to mind.
  • Welcome (Score:3, Funny)

    by Anonymous Coward on Wednesday May 31 2006, @09:33AM (#15435312)
    I for one welcome our new Fast Address-Space Switching overlords!