Forgot your password?
typodupeerror
Programming Linux

Con Kolivas Returns, With a Desktop-Oriented Linux Scheduler 333

Posted by timothy
from the dare-not-speak-its-name dept.
myvirtualid writes "Con Kolivas has done what he swore never to do: returned to the Linux kernel and written a new — and, according to him — waaay better scheduler for the desktop environment. In fact, BFS appears to outperform existing schedulers right up until one hits a 16-CPU machine, at which point he guesses performance would degrade somewhat. According to Kolivas, BFS 'was designed to be forward looking only, make the most of lower spec machines, and not scale to massive hardware. i.e. [sic] it is a desktop orientated scheduler, with extremely low latencies for excellent interactivity by design rather than 'calculated,' with rigid fairness, nice priority distribution and extreme scalability within normal load levels.'"
This discussion has been archived. No new comments can be posted.

Con Kolivas Returns, With a Desktop-Oriented Linux Scheduler

Comments Filter:
  • by Fizzl (209397) <fizzl@@@fizzl...net> on Sunday September 06, 2009 @05:39AM (#29330055) Homepage Journal

    from the dare-not-speak-its-name dept.

  • Re:Glory! (Score:5, Informative)

    by Trepidity (597) <delirium-slashdot@h a c k i sh.org> on Sunday September 06, 2009 @05:52AM (#29330111)

    My question is: is it in the kernel tree yet? Is this that 2.6.31 scheduler change I heard about earlier yesterday, or is it something Completely Different?

    No, and probably won't ever be, though perhaps some ideas will be borrowed.

    From his FAQ:

    Are you looking at getting this into mainline?

    LOL.

    No really, are you?

    LOL.

    Really really, are you?

    No. They would be crazy to use this scheduler anyway since it won't scale to
    their 4096 cpu machines. The only way is to rewrite it to work that way, or
    to have more than one scheduler in the kernel. I don't want to do the former,
    and mainline doesn't want to do the latter. Besides, apparently I'm a bad
    maintainer, which makes sense since for some reason I seem to want to have
    a career, a life, raise a family with kids and have hobbies, all of which
    have nothing to do with linux.

    Can it be made to scale to 4096 CPUs?

    Sure I guess you could run one runqueue per CPU package instead of a global
    one and so on, but I have no intention whatsoever at doing that because it
    will compromise the performance where *I* care.

    The "bad maintainer" part is referring to bad blood over the adoption of Ingo Molnar's CFS [kerneltrap.org] over Kolivas's own RSDL, in particular at least one LKML poster suggesting that, all else being equal, it'd be better to merge Molnar's code, as he was more likely to be a reliable maintainer (Molnar's more tied into the workings of the mainline kernel development/merging/etc.).

  • Re:Glory! (Score:5, Informative)

    by kav2k (1545689) on Sunday September 06, 2009 @05:54AM (#29330117)

    Oh yeah, and which other scheduler's, if any, did this guy write?

    SD scheduler [kerneltrap.org]

  • forward looking (Score:5, Informative)

    by Trepidity (597) <delirium-slashdot@h a c k i sh.org> on Sunday September 06, 2009 @06:02AM (#29330141)

    Took me a while to figure out what "forward looking" means in this context, since "forward-looking scheduler" doesn't seem to be common terminology, and I assumed he wasn't talking about his grand forward-looking vision for schedulerdom.

    Based on some previous arguments he's had, it sounds like he opposes the common heuristic of upping interactive process priority by keeping track of how long processes sleep--- processes that sleep a lot are probably I/O bound, and should get a priority boost so they can run on the (less frequent than for CPU-bound processes) occasions when they're ready. Kolivas wants schedulers to be forward-looking in the sense that they decide how to schedule without looking at process run history, by looking purely at who's ready to run, available timeslices, priorities, etc.

  • by Trepidity (597) <delirium-slashdot@h a c k i sh.org> on Sunday September 06, 2009 @06:06AM (#29330165)

    He means something different by it--- that the scheduler should only look forward, not look back to per-process history in making its scheduling decisions. A common hack/heuristic to improve interactive performance is to boost the priority of processes that sleep a lot, since CPU-bound jobs sleep rarely, while interactive processes sleep a lot. Kolivas think that's a hack that obscures the real problems with interactive performance, and leads to unpredictable performance since it doesn't fix the underlying issues. So wants to design schedulers with good interactive performance that make decisions based purely on the current set of running processes and priorities, and the upcoming timeslices.

  • Re:Glory! (Score:5, Informative)

    by Trepidity (597) <delirium-slashdot@h a c k i sh.org> on Sunday September 06, 2009 @06:21AM (#29330211)

    Yeah, that makes sense, but he seems to have taken it personally. It sounds like part of it stems from his feeling [lkml.org] that Molnar unnecessarily wrote a replacement using his ideas and got credit for it, instead of helping out to turn one of Kolivas's fair-scheduling proposals into something that could be merged. Though from what I can tell Molnar's replies are all pretty friendly, and he seemed keen to provide appropriate credit.

  • Re:Glory! (Score:2, Informative)

    by kojot350 (1330899) on Sunday September 06, 2009 @07:24AM (#29330367)
    "...While all that pervasive multithreading made for impressive technology demos and a great user experience, it could be extremely demanding on the programmer. BeOS was all about threads, going so far as to maintain a separate thread for each window. Whether you liked it or not, your BeOS program was going to be multithreaded."

    "GCD embodies a philosophy that is at the opposite end of the spectrum from BeOS's "pervasive multithreading" design. Rather than achieving responsiveness by getting every possible component of an application running concurrently on its own thread (and paying a heavy price in terms of complex data sharing and locking concerns), GCD encourages a much more limited, hierarchical approach: a main application thread where all the user events are processed and the interface is updated, and worker threads doing specific jobs as needed."
    Very good in-depth article btw. http://arstechnica.com/apple/reviews/2009/08/mac-os-x-10-6.ars/1 [arstechnica.com]
  • Re:Glory! (Score:5, Informative)

    by Hurricane78 (562437) <deleted @ s l a s h d ot.org> on Sunday September 06, 2009 @08:15AM (#29330541)

    What is that? You don't have the choice of scheduler in your kernel? I'm using the Zen sources [zen-sources.org], and I get to choose between least half a dozen schedulers, including other settings. I am certain that this scheduler will make it into that patchset, and that I will enable it, as soon as zen-sources-2.6.31 get installed on my system.

    After all this is Linux! Not some one-company-one-kernel monoculture!

  • by markdavis (642305) on Sunday September 06, 2009 @08:15AM (#29330547)
    Almost all runaway processes are due to bugs in the end applications, not some situation created by the kernel.
  • Re:great news (Score:3, Informative)

    by myvirtualid (851756) <pwwnow@ g m a i l .com> on Sunday September 06, 2009 @09:53AM (#29330999) Journal

    ...why does the summary [sic] i.e

    Because the 'i' should have been capitalized since it was the beginning of a new sentence. Had Kolivas written "hardware, i.e." there would be no sic.

  • Re:Who cares? (Score:3, Informative)

    by Mprx (82435) on Sunday September 06, 2009 @09:59AM (#29331033)

    Compiling with SSD vs. mechanical HD:
    http://anandtech.com/storage/showdoc.aspx?i=3631&p=25 [anandtech.com]

    Compiling is CPU bound.

  • Re:Who cares? (Score:3, Informative)

    by TheRaven64 (641858) on Sunday September 06, 2009 @10:31AM (#29331233) Journal

    Also, compilation is not all that I/O bound, it is more CPU bound.

    Depends a lot on what you're compiling. A typical program on OS X, for example, begins with #import <Cocoa/Cocoa.h>. This includes a header which brings in around a hundred other headers for a total of about 3MB of preprocessed source. Most of the time you'll be using a precompiled header for this, but you still often get a spike of read activity at the start of a compilation, then a CPU-bound chunk, then a write-bound part as it generates the object code. This is why, when you use -j, you are recommended to use a few more processes than you have cores, so you can overlap the I/O-bound parts in one compile with the CPU-bound parts in the next.

  • Re:Who cares? (Score:4, Informative)

    by TheRaven64 (641858) on Sunday September 06, 2009 @01:16PM (#29332467) Journal
    If you're interested, the clang team have done a lot of profiling of exactly what takes time when compiling. It's particularly interesting how much of a bottleneck preprocessing is with gcc and, more importantly, distcc (which sends the preprocessed sources over the network for compilation). Most of the results are on the web site, with a few in the mailing list archives.
  • Re:Glory! (Score:5, Informative)

    by shutdown -p now (807394) on Sunday September 06, 2009 @03:18PM (#29333489) Journal

    The same is pretty much true of .Net's Windows.Forms. It's a bit faster than Swing, although not by much (some parts are actually slower - System.Drawing vs Java2D, for example), so it's a little more forgiving of doing work in the UI thread. It will still bite you in a non-trivial application. Of course, the framework provides absolutely no help in writing a multithreaded application, and all of the tools, examples and documentation make writing a multi-threaded application far more difficult than it should be.

    Yes, and things like Control.Invoke [microsoft.com] to marshal invocations from background threads to UI, and especially BackgroundWorker [microsoft.com], which are there specifically to provide a high-level (i.e. without locks) API for worker threads, with progress reporting and cancellation, must be just figments of my imagination?

    Have you actually written any WinForms code in .NET 2.0+?

  • Re:4096 cpu machines (Score:3, Informative)

    by SL Baur (19540) <steve@xemacs.org> on Sunday September 06, 2009 @07:07PM (#29335093) Homepage Journal

    If Cons new scheduler is as good as he tries to paint it, build kernel and use it in thousands.

    Ingo did some benchmarking. The following landed in my lkml mailbox about an hour ago:

    hi Con,

    I've read your BFS announcement/FAQ with great interest: ...

    As can be seen in the graph BFS performed very poorly in this test:
    at 8 pairs of tasks it had a runtime of 45.42 seconds - while
    sched-devel finished them in 3.8 seconds.

    I saw really bad interactivity in the BFS test here - the system
    was starved for as long as the test ran. I stopped the tests at 8
    loops - the system was unusable and i was getting IO timeouts due
    to the scheduling lag:

      sd 0:0:0:0: [sda] Unhandled error code
      sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
      end_request: I/O error, dev sda, sector 81949243
      Aborting journal on device sda2.
      ext3_abort called.
      EXT3-fs error (device sda2): ext3_journal_start_sb: Detected aborted journal
      Remounting filesystem read-only

    I measured interactivity during this test:

        $ time ssh aldebaran /bin/true
        real 2m17.968s
        user 0m0.009s
        sys 0m0.003s

    A single command took more than 2 minutes. ...

    (Lots of text elided) Apparently he did a lot of benchmarking and BFS didn't fare very well. Ah well.

    I hope this time Con takes him on. Competition Is Good and concentration on desktop interactivity is certainly high on my wishlist of desired optimizations.

  • Re:great news (Score:5, Informative)

    by pthisis (27352) on Monday September 07, 2009 @04:28AM (#29337931) Homepage Journal

    The impression I've gotten from reading various past "Con threads", is that while he tries in the beginning, Con doesn't deal well with this process; he can't keep his ego submerged, gets frustrated, and everything (perhaps including Con himself last time I read one of these threads) ends up unravelling.

    Agreed; Con seems not to be able to work well in the process.

    e.g. Ingo ran a bunch of benchmarks on BFS and made a long post to LKML explaining his results, that, while critical of its performance on a series of benchmarks, bent over backwards to be very polite in tone, with things like:

    First and foremost, let me say that i'm happy that you are hacking the Linux scheduler again. It's perhaps proof that hacking the scheduler is one of the most addictive things on the planet ;-) ...

    General interactivity of BFS seemed good to me - except for the pipe test when there was significant lag over a minute. I think it's some starvation bug, not an inherent design property of BFS, so i'm looking forward to re-test it with the fix. ...
    I hope to be able to work with you on this, please dont hesitate sending patches if you wish - and we'll also be following BFS for good ideas and code to adopt to mainline.

    And Con responded with a very defensive and confrontational tone:
    I'm not interested in a long protracted discussion about this since I'm too busy to live linux the way full time developers do, so I'll keep it short, and perhaps you'll understand my intent better if the FAQ wasn't clear enough.

    Do you know what a normal desktop PC looks like? No, a more realistic question based on what you chose to benchmark to prove your point would be: Do you know what normal people actually do on them?

    Feel free to treat the question as rhetorical.

    Full exchange here:
    http://thread.gmane.org/gmane.linux.kernel/886319 [gmane.org]

  • Re:great news (Score:3, Informative)

    by smash (1351) on Monday September 07, 2009 @05:49AM (#29338327) Homepage Journal
    Well, he has a point.

    For desktop use, I doubt many users *care* whether or not they drop some percentage of throughput on interactive apps, if it means that processes actually run "properly" (eg, video playback, gaming, audio processing, etc).

    ingo benchmarking some abstract processes that no desktop user would actually run day to day merely reinforces con's point.

    Yes, con may have come off as a bit of an arse, but given his previous "do not contact me regarding kernel matters" posting to LKML, only to be e-mailed with benchmarks on non-desktop hardware, performing non-desktop tasks that shows CFS to be "superior", I'm not surprised.

    I'm guessing he's pretty much "over" banging his head against the wall, trying to get people to "see the light" (or understand that the point is improving interactivity, rather than benchmark numbers).

  • Re:Glory! (Score:3, Informative)

    by Chrisq (894406) on Monday September 07, 2009 @06:34AM (#29338507)

    That's not two schedulers, it's just some tunables. See pages 391 to 444 of Windows Internals, 5th Edition (or comparable pages in earlier editions).

    I'd mod you informative, given that this is Linus's preferred option this is an important distinction

  • Re:great news (Score:3, Informative)

    by BhaKi (1316335) on Monday September 07, 2009 @08:03AM (#29338883)

    Alternatively, maybe pluggability would have to be done with self modifying code which left no indirection in place?

    No. In most modern CPU architectures, schedulers are implemented by handling a timer interrupt. The address for the handling code is put in the interrupt vector table during kernel start-up. For implementing pluggable-scheduling, all you need to do is to change the contents of the interrupt vector table. Once that is done, scheduling happens the same way as when there's only a single scheduler. So no. It doesn't require self modifying code and it's not a performance overhead to have pluggable schedulers.

What this country needs is a dime that will buy a good five-cent bagel.

Working...