Linux Software

Non-Deathmatch: Preempt v. Low-Latency Patch

Posted by timothy
from the floor-wax-and-dessert-topping dept.
LiquidPC writes: "In this whitepaper on Linux Scheduler Latency, Clark Williams of Red Hat compares the performance of two popular ways to improve Linux kernel preemption latency -- the preemption patch pioneered by MontaVista and the low-latency patch pioneered by Ingo Molnar -- and discovers that the best approach might be a combination of both."

  • by BrianGa (536442) on Thursday March 21, 2002 @11:26PM (#3205568)
    Check out this [gatech.edu] comprehensive guide to Linux latency.
  • by cs668 (89484) <cservin@cromagnon.com> on Thursday March 21, 2002 @11:52PM (#3205683)
    I agree with you, except that there is a critical difference between assuming that this would be the case and demonstrating that the assumptions are true.

    Clark Williams did a lot of work to prove that the assumptions you would make about combining the two patches actually hold.
  • by Anonymous Coward on Friday March 22, 2002 @12:13AM (#3205750)

    Ingo Molnar's O(1) scheduler was integrated into the development tree back around Linux 2.5.4, so it's already in there.

    Preemption was integrated about the same time.
  • by A Commentor (459578) on Friday March 22, 2002 @12:33AM (#3205811) Homepage
    The low-latency patches had a maximum recorded latency of 1.3 milliseconds, while the preemption patches had a maximum latency of 45.2ms.

    A 2.4.17 kernel patched with a combination of both preemption and low-latency patches yielded a maximum scheduler latency of 1.2 milliseconds, a slight improvement over the low-latency kernel. However, running the low-latency patched kernel for greater than twelve hours showed that there are still problem cases lurking, with a maximum latency value of 215.2ms recorded. Running the combined patch kernel for more than twelve hours showed a maximum latency value of 1.5ms.

    So after only 12 hours, the low-latency patch degraded by an ungodly amount (1.3 -> 215.2 ms)!! And even the combined patch showed a 25% degradation in performance (1.2 -> 1.5 ms)!

    Embedded systems must have very high uptime; it's not acceptable to reboot the machine every day to maintain performance. Many embedded systems require a downtime of less than 5 minutes per year. That doesn't give you much time to reboot the machine just for performance issues.
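
    For a sense of scale, here is a quick back-of-the-envelope check of that downtime budget in Python. The helper names and the two-minute reboot time are assumptions made up for illustration; only the 5-minutes-per-year figure comes from the comment above.

```python
# Hypothetical helpers: turn an annual downtime budget into an availability
# figure and a reboot allowance. Only the 5 min/year budget is from the thread.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def availability(downtime_minutes_per_year: float) -> float:
    """Fraction of the year the system is up, given a downtime budget."""
    return 1.0 - downtime_minutes_per_year / MINUTES_PER_YEAR


def reboots_allowed(budget_minutes: float, reboot_minutes: float) -> int:
    """Number of reboots of a given length that fit in the annual budget."""
    return int(budget_minutes // reboot_minutes)


if __name__ == "__main__":
    # 5 minutes per year is roughly "five nines" of availability.
    print(f"availability: {availability(5) * 100:.4f}%")
    # Assumed reboot time of 2 minutes (not a figure from the thread).
    print("reboots per year:", reboots_allowed(5, 2))
```

    With an assumed two-minute reboot, the budget covers about two reboots a year, which is why a daily reboot is out of the question.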

  • by Fluffy the Cat (29157) on Friday March 22, 2002 @01:04AM (#3205857) Homepage
    So after only 12 hours, the low-latency patch degraded by an ungodly amount (1.3 -> 215.2 ms)!!

    You're misinterpreting the figures. After a short benchmarking run, the worst figure recorded was 1.3ms. After the machine had been left up for 12 hours (thereby allowing much more time for something odd to crop up), the worst figure recorded was 215.2ms. That doesn't mean that the performance had degraded - it means that over the course of those 12 hours, something happened that caused latency to peak at 215.2ms. It might be something that happens once every 12 hours, for instance.
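
    A small Python sketch of the same point: the recorded maximum is a running maximum, so it can only grow as the observation window gets longer, even if typical behaviour never changes. The sample values below are synthetic and made up; only the 1.3 ms and 215.2 ms figures echo the whitepaper.

```python
from itertools import accumulate
from statistics import median


def worst_case_so_far(samples_ms):
    """Running maximum: the worst latency observed up to each sample."""
    return list(accumulate(samples_ms, max))


# Synthetic samples in milliseconds (assumed, not measured data):
# a short run, then a longer run that happens to catch one rare spike.
short_run = [0.9, 1.3, 1.1, 1.0]
long_run = short_run + [1.2, 1.0, 215.2, 1.1, 0.8]

if __name__ == "__main__":
    print("short run max:", max(short_run))
    print("long run max: ", max(long_run))
    # The typical (median) latency barely moves despite the spike.
    print("medians:", median(short_run), median(long_run))
    print("running max:", worst_case_so_far(long_run))
```

    The longer run reports a far worse maximum while its median stays about the same, which is what "a rare event was finally observed" looks like, as opposed to "performance degraded".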
  • by SurfsUp (11523) on Friday March 22, 2002 @03:07AM (#3206113)
    It might even be something that happens just as often with the combination patch as with the low-latency patch, except the combo got lucky.

    If you'd actually read the article you'd know that this can't happen with the preempt patch + low-latency, not unless a spinlock gets jammed, then you have much worse problems. The preempt patch takes care of scheduling events that occur during normal kernel execution (and it does this much more reliably than the low latency patch) but since preemption isn't allowed while spinlocks are held, it can't do anything about latency due to spinlocks. This explains the apparently worse performance of the preempt patch - you're just seeing the spinlock latency there.

    The low latency patch breaks up the spinlocks with explicit scheduling points, which is pretty much the only approach possible without really major kernel surgery. That's why the combination works so well. In fact, the parts of the low latency patch that aren't connected with breaking up spinlocks aren't doing anything useful and should be omitted. The worst-case performance won't change.
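
    To put rough numbers on the lock-breaking idea, here is a toy model in Python (not kernel code). The function name, the per-iteration cost, and the iteration counts are all hypothetical; the model only assumes that nothing else can be scheduled while the lock is held.

```python
def worst_wait_us(n_items, cost_us, break_every=None):
    """Longest stretch, in microseconds, during which the lock is held
    and a newly runnable task therefore cannot be scheduled.

    n_items     -- iterations the loop performs under the lock
    cost_us     -- assumed cost of one iteration, in microseconds
    break_every -- drop the lock and offer to reschedule after this many
                   iterations; None models the original, unbroken loop
    """
    if break_every is not None and break_every <= 0:
        raise ValueError("break_every must be positive")
    if break_every is None or break_every >= n_items:
        # The lock is held for the whole loop.
        return n_items * cost_us
    # The lock is never held for more than one chunk at a time.
    return break_every * cost_us


if __name__ == "__main__":
    # Assumed workload: 100,000 iterations at 2 us each.
    print("unbroken loop:  ", worst_wait_us(100_000, 2), "us")
    print("break every 500:", worst_wait_us(100_000, 2, break_every=500), "us")
```

    In this model the worst-case wait depends on the chunk size, not on how much total work the loop does, which is the property the explicit scheduling points are after.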
  • QNX vs. Linux (Score:1, Informative)

    by Anonymous Coward on Friday March 22, 2002 @03:55AM (#3206208)
    I would like to see similar response graphs for QNX or other RTOSes for comparison's sake.

    Anyway, IMHO making a real assessment for any 'hard' realtime tasks is far too much effort for most of the readers here. =)

    But here are more white papers than you can shake a stick at....

    http://www.ece.umd.edu/serts/bib/index.shtml
  • by jawahar (541989) on Friday March 22, 2002 @04:24AM (#3206269) Homepage Journal
    Well, Windows CE 3.0 provides 50 ms latency response time running on a 166 MHz Pentium.
  • by mikera (98932) on Friday March 22, 2002 @06:12AM (#3206423) Homepage Journal
    Some very thoughtful analysis clearly went into this. It's well written up with explanations that hit the right balance of having the key technical details but focusing on the big picture of how to make applications run better under Linux. As a casual follower of kernel development, I now understand far more of the trade-off than I used to.

    I always think that tests and write-ups like this are a great way for people to contribute to Linux development without having to hack the kernel directly. There's no substitute for thorough testing to help you improve your designs and theories.

    Nice job!
  • Another article (Score:2, Informative)

    by Onan The Librarian (126666) on Friday March 22, 2002 @08:58AM (#3206777)
    I wrote an article about low latency for audio applications under Linux; you can read it here if interested:

    http://linux.oreillynet.com/pub/a/linux/2000/11/17/low_latency.html

    It's more of a hands-on article that tells you how to do it yourself with Andrew Morton's patches.
  • by sagei (131421) <rlove@nOSPAM.rlove.org> on Friday March 22, 2002 @09:02AM (#3206789) Homepage
    First, I wanted to give my view of the results - what they mean and what that means. Note there are multiple notions of latency performance: average latency and worst-case latency, among others, but those two are the most important. This test measured worst-case latency. Both are important - for user experience average case is very important and for real-time applications worst-case is very important.

    It is not a surprise the low-latency patches scored better, or that the ideal scenario was using both. The preemptive kernel patch is not capable of fixing most of the worst-case latencies. This is because, since we cannot preempt while holding a lock, any long durations where locks are held now become our worst-case latencies. We have a tool, preempt-stats [kernel.org], that helps us find these. With the preempt-kernel, however, average case latency is incredibly low, often measured around 0.5-1.5 ms. Worst-case depends on your workload, and varies under both patches.

    Now, the results don't mention average case (which is fine), but keep in mind with preempt-kernel it is much lower. The good thing about these results is that they do indeed show that certain areas have long-held locks and that the preempt-kernel does nothing about them. Thus a combination of both gives an excellent average latency while tackling some of the long-held locks. Note it is actually best to use my lock-break [kernel.org] patch in lieu of low-latency in combination with preempt-kernel, as they are designed for and optimal with each other (lock-break is based on Andrew's low-latency).

    So what is the future? preempt-kernel is now in 2.5 and, as has been mentioned, Andrew and I are working on the worst-case latencies that still exist. Despite what has been mentioned here, however, we are not going to adopt a low-latency/lock-break explicit schedule and lock-breaking approach. We are going to rewrite algorithms, improve lock semantics, etc. to lower lock-held times. That is the ease and cleanliness of the preemptive kernel approach: no more hackery and such to lower latency in problem areas. Now we can cleanly fix them and voila: preemption takes over and gives us perfect response. I did some lseek cleanup in 2.5 (removed the BKL from generic_file_llseek and pushed the responsibility for locking into the other lseek methods) and this reduced latency during lseek operations -- a good example.

    So that is the plan ... it is going to be fun.

  • by Nick Barnes (11927) on Friday March 22, 2002 @10:47AM (#3207350)
    Now, the results don't mention average case (which is fine), but keep in mind with preempt-kernel it is much lower.

    The results do mention the average latency. For the vanilla kernel it is 88.3 microseconds. For the low-latency patch it is 54.3 microseconds. For the preemption patch it is 52.9 microseconds. Is 52.9 much lower than 54.3?
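
    For anyone who wants the comparison as percentages, here is a short Python calculation using the three averages quoted above. The dictionary and function names are just illustrative.

```python
# Average scheduler latencies in microseconds, as quoted from the whitepaper.
avg_us = {"vanilla": 88.3, "low-latency": 54.3, "preempt": 52.9}


def reduction_pct(before, after):
    """Percentage by which `after` is lower than `before`."""
    return (before - after) / before * 100.0


if __name__ == "__main__":
    for patch in ("low-latency", "preempt"):
        pct = reduction_pct(avg_us["vanilla"], avg_us[patch])
        print(f"{patch} vs vanilla: {pct:.1f}% lower")
    pct = reduction_pct(avg_us["low-latency"], avg_us["preempt"])
    print(f"preempt vs low-latency: {pct:.1f}% lower")
```

    Both patches cut the vanilla average by roughly 40%, while the gap between the two patches themselves is under 3%.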

  • by Linus Torvalds (568277) on Friday March 22, 2002 @11:56AM (#3207774)
    Well actually we have been discussing this recently on the kernel mailing lists. I am currently deciding whether this should be incorporated into the main tree, but am concerned that it may lower throughput.

    Alan has suggested I include both patches in the next 2.5 release (though there is quite a lot going on there so it may not make it in until the next one after that) and it will be fun to see the effects on latency and throughput, especially with the new I/O subsystem, in widespread use on various architectures; Clark Williams only compared the patches on single-processor machines, for example, whereas we have to pay special attention to the various SMP architectures out there.

    But remember! Linux is not an RTOS and I have no intention of making it one, although there are forked kernels that do exactly that.
