Linux Software

Torvalds Explains Scheduler Decision (411 comments)

Posted by kdawson
from the it's-the-guy-not-the-code dept.
Firedog writes "There's been a lot of recent debate over why Linus Torvalds chose the new CFS process scheduler written by Ingo Molnar over the SD process scheduler written by Con Kolivas, ranging from discussing the quality of the code to favoritism and outright conspiracy theories. KernelTrap is now reporting Linus Torvalds' official stance as to why he chose the code that he did. 'People who think SD was "perfect" were simply ignoring reality,' Linus is quoted as saying. He goes on to explain that he selected the Completely Fair Scheduler because it had a maintainer who has proven himself willing and able to address problems as they are discovered. In the end, the relevance to normal Linux users is twofold: one is the question as to whether or not the Linux development model is working, and the other is the question as to whether the recently released 2.6.23 kernel will deliver an improved desktop experience."
This discussion has been archived. No new comments can be posted.

  • Re:good for you (Score:4, Informative)

    by larry bagina (561269) on Saturday July 28, 2007 @03:51PM (#20025499) Journal
    It's not closed source. It's available as part of OpenSolaris (CDDL). FreeBSD didn't have a problem integrating it.
  • by larry bagina (561269) on Saturday July 28, 2007 @03:54PM (#20025517) Journal

    CFS = completely fair scheduler

    SD = staircase deadline.

    That probably didn't clarify anything :/

  • by loserMcloser (748327) on Saturday July 28, 2007 @04:06PM (#20025637)
    The official kernel site [kernel.org] says 2.6.23 is only on release candidate 1.
  • by irwiss (1122399) on Saturday July 28, 2007 @04:20PM (#20025747)
    "If a scheduler makes games better but hurts general server performance..."

    IIRC that is the reason Con, together with another person whose name I
    can't be bothered to look up, wanted to merge plugsched, to which they got a
    reply along the lines of "too much choice will split contributors" [kerneltrap.org] or some such

    And then Ingo turns around on himself, and claims something along the lines of
    "Oh okay, you should work on plugsched, maybe it'll get merged" [kerneltrap.org]
  • Re:Nerds (Score:1, Informative)

    by Anonymous Coward on Saturday July 28, 2007 @04:26PM (#20025799)
    Oh, I'm sorry. Please go under your preferences and change them to show apple stories only.
  • by Anonymous Coward on Saturday July 28, 2007 @04:32PM (#20025855)
    I think the apparent rudeness stems from something deeper than a mere incomplete mastery of English.

    Linus is (as I am) a Finn by birth. No matter how long he has been abroad, he still follows Finnish habits and speech patterns at least to a degree. And they differ significantly from the Western European tradition. For example, small talk is considered unnecessary or even rude in some situations. Getting to the point is a virtue in any conversation. To someone not familiar with this pattern, it will sound unfriendly! It's a two-way street: to me many English speakers sound terribly smarmy and guarded.

    Of course, Linus is apparently also rather clever. The downside of cleverness is, for many, having little tolerance for fools, real or perceived.

    An AFC (Anonymous Finnish Coward)
  • Re:Why not both? (Score:5, Informative)

    by MBCook (132727) <foobarsoft@foobarsoft.com> on Saturday July 28, 2007 @05:10PM (#20026207) Homepage

    Linux doesn't support that, as far as I know. There are variables you can tune though. More on this later.

    Something like that is very risky. Whereas a filesystem can be used or not, and the code is only hit when accessing it, the scheduler is used constantly. If the scheduler could be switched at runtime, that means that either you have to have some kind of if statement on every scheduler entry point, or hide it all behind a pointer and a structure. Neither is as efficient as just having it hard wired in. You also have the complexities of being able to hand stuff off from one scheduler to another. Also, debugging gets much harder (you have a problem with slowness X, now which of the 3 schedulers are you using? Which version? What are the variables set to?).
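
A rough sketch of the two runtime-switching options described above, written in Python purely for illustration (a real kernel would do this in C). The names `Task`, `SchedOps`, `pick_fifo`, and `pick_least_runtime` are hypothetical and do not correspond to anything in the Linux source.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Task:
    name: str
    runtime: int  # CPU time consumed so far, in arbitrary units


# Two hypothetical toy policies; each picks the next task from a run queue.
def pick_fifo(run_queue: List[Task]) -> Task:
    return run_queue[0]


def pick_least_runtime(run_queue: List[Task]) -> Task:
    return min(run_queue, key=lambda t: t.runtime)


# Option 1: an if statement at the scheduler entry point.
def schedule_with_branch(run_queue: List[Task], active: str) -> Task:
    if active == "fifo":
        return pick_fifo(run_queue)
    if active == "fair":
        return pick_least_runtime(run_queue)
    raise ValueError(f"unknown scheduler: {active}")


# Option 2: hide the policy behind a structure holding a function reference.
@dataclass(frozen=True)
class SchedOps:
    name: str
    pick_next: Callable[[List[Task]], Task]


FIFO_OPS = SchedOps("fifo", pick_fifo)
FAIR_OPS = SchedOps("fair", pick_least_runtime)


def schedule_with_ops(run_queue: List[Task], ops: SchedOps) -> Task:
    # One extra indirection on every scheduling decision.
    return ops.pick_next(run_queue)


if __name__ == "__main__":
    queue = [Task("editor", 30), Task("compiler", 10)]
    print(schedule_with_branch(queue, "fifo").name)
    print(schedule_with_ops(queue, FAIR_OPS).name)
```

Either way, every scheduling decision pays for an extra branch or an extra indirect call compared with a single hard-wired policy, and a bug report now has to say which policy was active.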

    As for selectable at compile time, that means you have to have a well-designed interface that lets you swap things out. That means it either has to be generic, or would favor one scheduler to the detriment of others. Sometimes this tradeoff is acceptable, sometimes it isn't.

    Now my understanding is that Linux doesn't support plugging in full schedulers. There were patches for that a few years ago. Linus and others (Ingo especially, I think) said no, and the patches never made it in. Recently a system was developed that would allow a part of the scheduler to be plugged in. This way it could be better tuned for different workloads, without the full detriment of a fully pluggable scheduler. They were called out on this flip, and explained quite well that they had been a little hard earlier and that this was a little different.

    Go read LWN [lwn.net]'s kernel pages. They talked about this in the last month or so, so it should be available to non-subscribers by now (although you should subscribe, they're great).

  • by the_greywolf (311406) on Saturday July 28, 2007 @05:18PM (#20026283) Homepage

    A few moments ago, Linus posted a message explaining why he rejected plugsched: He detests politically-motivated code.

    > So I absolutely detest adding code for "political" reasons.

    > I personally feel that modal behaviour is bad, so it would introduce what is in my opinion bad code, and likely result in problems not being found and fixed as well (because people would pick the thing that "works for them", and ignore the problems in the other module).

    And while I disagree with the choices he made regarding SD and plugsched, I do see his point: Each of those schedulers would become too specialized and would ignore the issues exposed in other workloads. SD works extremely well in most situations, but has trouble in specific high-load server environments. CFS, likewise, has problems with desktop usage. Both developers had different goals.

    (I just think Con met goals beyond his own better than Ingo did.)

  • Re:good for you (Score:5, Informative)

    by QuoteMstr (55051) <dan.colascione@gmail.com> on Saturday July 28, 2007 @05:24PM (#20026331)

    Since Linux is released under the GPL, every module must be under GPL compatible licenses, irrespective of whether they contain, or depend on, any GPL'd code.


    That's not true [wikipedia.org]. Non-GPLed kernel modules are "tolerated" by the Linux kernel developers, and in principle, a ZFS module could be created and loaded with no problems, assuming it doesn't rely on GPL-only symbols. AFAIK, the VFS doesn't have many of those.

    What can't happen under the CDDL is the ZFS code being included in the kernel source tree the same way XFS, ext3 and so on are. That doesn't mean you can't maintain and distribute a module separately! The only reason a ZFS module doesn't exist today is that nobody's gone through the trouble of creating one.
  • by acidrain (35064) on Saturday July 28, 2007 @05:32PM (#20026379)

    Whether or not Con was aware of it, when he tried to integrate into mainline, Ingo was his main customer: specifically, the person he was trying to deliver work to. And Con committed the cardinal sin of telling a customer that the customer was wrong about what he wanted. Even if Ingo were too coked up to operate a keyboard reliably and had it all wrong, trying that never seems to work.

    Did Con gain anything by refusing to re-introduce the hack to get X working the way it had previously under load? Even if he'd just put in a #define that allowed it, and then spent the next year arguing to take it out, there wouldn't have been this breakdown.

  • by bconway (63464) on Saturday July 28, 2007 @06:03PM (#20026641) Homepage
    This whole article is a sick read. Con never claimed SD was perfect. And he argued with people who said his argument and ideas were flawed (Ingo, etc.), who denied there were scheduling problems on their P4 3GHz, 2GB RAM machines, and incidentally those very same people turned around and practically copied the whole concept.

    Anybody who subscribed to the -ck mailing list will be very aware how receptive Con was to bug reports and it's quite disgusting to see Linus make such sweeping statements to the contrary. Sadly, since Linus' word is gospel - even if he is speaking utter shit - then Con will get publicly slammed by people like you who think it's fine to comment on what they don't know about.

    Linus is trolling with that email and now people who don't know the situation will simply take his word for it. This is exactly why Con gave up.

    The LKML has failed to acknowledge its problems yet again....
  • by Anonymous Coward on Saturday July 28, 2007 @06:56PM (#20026999)

    No help up, no sorry, just "Ho!".
    That's "oho!", not "ho!". It's a bit like "ohh!" or something, I think.

    And I don't know if it's only here in Finland, but generally whenever we fall we just try to get up as fast as possible, proceed with whatever we were doing acting like nothing happened, and hope that nobody noticed this embarrassing situation. So somebody helping us up and repeatedly saying how sorry he/she is would just make it worse ;)

    -- Another Anonymous Finnish Coward
  • by Anonymous Coward on Saturday July 28, 2007 @07:07PM (#20027079)
    Quoting someone on the -ck mailing list; I think this is worth reading:
    ---8<---

    I don't really want to keep all that -ck flamewar going but this sum-up is
    a little strange for me:

    If Con thought SD was "perfect", why did he release 30+ versions of it?
    And who knows how many versions of his previous scheduler?

    Besides Con always tried to help people and improve his code if some bugs
    or problems were reported. Archives of this list prove that. I reported
    several problems (on list and privately) and all were fixed very fast and
    with very kind responses. I had run -ck for months and years and it was
    always very stable (I remember one broken "stable" version).

    I don't know what exactly you are referring to when you talk about those
    unaddressed reports but maybe it depends on who was asking, how and to do
    what (for example - purely theoretical one, I don't remember the exact emails
    you are referring to so I am not saying it happened - stating at the beginning
    that the whole design is unacceptable and interactivity hacks are a
    must-have won't make a friend from any maintainer and for sure lowers his
    desire to get anything fixed for that guy). Or maybe Con had some bad day
    or was depressed. Happens. But I really don't remember Con ignoring too
    many valuable user reports in last 3 years...

    And no - I am not thinking that SD was "perfect". Nothing is perfect,
    especially not software. But it was based on months and years of Con's
    experience with desktop and gaming workloads and extensively tested in
    similar uses by _many_ others. In nearly all possible desktop
    configurations, with most games and all video drivers. This is why it was
    perfectly designed and tuned for such workloads while still being general
    enough and without any ugly hacks. And because of these tests and Con's
    belief that the desktop is very (most?) important all bugs and problems
    in this area were probably killed long ago. I think even the design was
    changed and tuned a little at the early stages to help solve such
    interactivity/desktop/gaming problems.

    So it does not surprise me that CFS is worse in such workloads (at least
    for some people) because I strongly suspect that the number of people who
    played games with current version of CFS is limited to about 5, maybe 10.
    And I also suspect that you (and Ingo) will get many regression reports
    when 2.6.23 is released (and months later too... or maybe you won't
    because users will be too "scared" to report such hard to measure and
    reproduce "unimportant" bugs). Hopefully such problems when reported will
    be addressed as soon as they can. And hopefully they will be easy enough
    to solve without rewriting or redesigning CFS and causing that way even
    more regressions in other areas. If not people will probably be patching
    O(1) scheduler back privately...

  • by Anonymous Coward on Saturday July 28, 2007 @07:35PM (#20027295)

    >> Con never claimed SD was perfect.
    > Care to share with the group where exactly Linus says he did?
    In the article. That's the whole point of the article:

    "People who think SD was 'perfect' were simply ignoring reality," Linus Torvalds began in a succinct explanation as to why he chose the CFS scheduler written by Ingo Molnar instead of the SD scheduler written by Con Kolivas. He continued, "sadly, that seemed to include Con too"
  • by paleshadows (1127459) on Saturday July 28, 2007 @09:28PM (#20028005)

    A few months before Ingo wrote the O(1) scheduler, he flamed anyone who dared to suggest that an O(n) scheduler is a bad idea. He was *very* aggressive about it, going on and on about why O(n) is best and how O(1) would be worthless. Using Linus's words (about Con), Ingo "ended up arguing against people who reported problems [scheduler linearity], rather than trying to work with them". It therefore seems a bit strange that Linus uses this statement to describe Con, arguing this is why he favors Ingo...

    Importantly, Ingo was dead wrong back then (indeed, this is why months after, Ingo came up with the O(1), announcing it as if it was his idea and as if nothing ever happened, not *ever* saying something like "I was wrong, sorry for the flames").

    In contrast, Con was right in refusing to pollute the design of SD with Ingo's unfairness discipline. (This is what Linus referred to when he made the "arguing against" statement.) And what do you know? A few years after, Ingo comes up with a "Completely Fair Scheduler"...

    I've been in scheduling research for many years. I followed the long Linux scheduling saga, which actually started way before Con was in business. Please believe me when I tell you: Linus's comments about Con are ludicrous, and petty. This is not Linus's finest hour.

    Note however that this does not mean that Linus made the wrong decision: Even though SD is somewhat better than CFS, Ingo is orders of magnitude a better programmer than Con, orders of magnitude more knowledgeable, he gets paid to do the work, has gotten along with Linus for years, and will eventually make CFS as good as SD and even better. This is the real reason for Linus's decision. (Or at least, it should be.)

    But the stuff Linus said about Con... well, that's just Linus being small.

  • by trybywrench (584843) on Saturday July 28, 2007 @09:46PM (#20028169)
    I will say this, in the mid-'90s I used X-windows under Unixware on *Pentium 1s* as a desktop machine. I now use X-windows under Linux on a Pentium 4 (with 5-10x more main memory) as a desktop machine. I would argue that my desktop user experience is as problematic now as it was then *despite* the hardware improvements.

    I'm late to the thread but I could not agree more. I got into Linux in '95 and used it as a desktop then with fvwm and then later with Afterstep. After that I ended up operating Linux from telnet then ssh exclusively. I recently tried out a desktop Linux (Ubuntu something.. don't remember which) and I was amazed how much things have stayed the same. Sure, GNOME and a GUI installer are nice but it still feels, and runs, like the same old desktop OS from '95.
  • Re: Agreed... (Score:1, Informative)

    by Anonymous Coward on Saturday July 28, 2007 @10:27PM (#20028469)
    This is written by Linus in one of his emails to LKML:
    > That was where the SD patches fell down. They didn't have a maintainer
    > that I could trust to actually care about any other issues than his own.

    I have been on Con's SD mailing list for >2 years now, and not once have I seen this to be the case.
    He has always (time, etc. permitting) followed up on user-reported problems.

    So... what happens if Ingo suddenly disappears?
    I'm sure others will continue his work.
    The same should be said for Con's scheduler also, which makes Linus's comments even stranger.
  • Re:Why not both? (Score:1, Informative)

    by Anonymous Coward on Sunday July 29, 2007 @12:06AM (#20029089)
    Well, Con did offer a pluggable scheduler patch. That was rejected on the grounds that the line count was positive, i.e. the existing scheduler plus the plug-in code had more lines than just the existing scheduler. NO SHIT! I mean that's the PHB argument if I ever heard one (and I did. Loads).

    Frankly,

    1a) Linus rants against Subversion, because it's not his way of doing shit, therefore it's crap.
    1b) Con's pluggable scheduler is rejected for the reason outlined above.
    2) "Ingo's" new scheduler - aka my stuff was shite, but I'd rather die than accept somebody else's code so I'll redo it.
    3) Ingo defending his new scheduler as being O(1) - when it's NOT - alleging that if you give an upper bound to n, O() has a limit as well, which is at best dishonest, when he is the one who marketed his previous contraption as being O(1).
    4) Now this.
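
To make the objection in point 3 concrete, here is a minimal sketch, assuming the runnable tasks sit in an idealized, perfectly balanced binary tree (an assumption for illustration only; the comment does not say which data structure is involved). `MAX_TASKS` is a hypothetical cap, not a real kernel limit.

```python
def tree_height(n_tasks: int) -> int:
    """Number of levels in a complete binary tree holding n_tasks nodes."""
    if n_tasks < 1:
        raise ValueError("need at least one task")
    # floor(log2(n)) + 1, computed exactly with integer arithmetic.
    return n_tasks.bit_length()


MAX_TASKS = 32768  # hypothetical upper bound on runnable tasks


def worst_case_steps() -> int:
    """Deepest possible walk once the task count is capped at MAX_TASKS."""
    return tree_height(MAX_TASKS)


if __name__ == "__main__":
    for n in (1, 10, 1000, MAX_TASKS):
        print(n, tree_height(n))
```

Capping n does give a constant ceiling on the work per operation, but the cost still grows with the number of tasks below that ceiling. Big-O notation describes that growth, which is why "O(log n) with bounded n" is not the same claim as "O(1)".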

    There's something really wrong with the (some of the) core Linux kernel developers. Ego I guess.

    The irony is that when you have a more centralized repository (svn), you actually get more distributed decision making, whereas the whole distributed repository crap ends up with just one source of releases.
  • by Anonymous Coward on Sunday July 29, 2007 @12:09AM (#20029111)
    Here is the full quote of what Linus said on LKML:
    > People who think SD was "perfect" were simply ignoring reality. Sadly,
    > that seemed to include Con too, which was one of the main reasons that I
    > never ended entertaining the notion of merging SD for very long at all:
    > Con ended up arguing against people who reported problems, rather than
    > trying to work with them.
    (from Linus's post to lkml [lkml.org])

    The other misrepresentation is in this quote
    > As a long-term maintainer, trust me, I know what matters. And a person who
    > can actually be bothered to follow up on problem reports is a *hell* of a
    > lot more important than one who just argues with reporters.
    Because Kolivas was quite good with feedback--arguably better than Ingo or Linus (for example with Linus vs. SCSI-emulation cdrecord)--and recently had the single instance to which Linus refers. Ironically, Kolivas rejected a request to have code renice X processes for the same reason Linus rejected a request to keep SCSI emulation as de-facto: the design and code is cleaner and more correct. In fact, while Con would argue about small design issues and change his views, this renicing instance (some people called out as trolling because of the insistence and seeming insincerity) is the only time I've seen Con (and I have followed his work since he started) flatly and repeatedly reject a request.

    There is merit in Linus's argument that a comparison of CFS and SD showed no "significant difference" in performance.

    My personal take is that several years of minor spats between Ingo and Con made a better -ck patchset and -mm patchset, but never brought -ck closer to mainline. It came down to the good-old-boy system where Linus knew and trusted Ingo better than Con, and if there was a major disagreement, Ingo's side would be favored. But, honestly, Linus could also be considering that Ingo's resume has always been a programmer's resume (even if Ingo was one of the youngest maintainers at his start) while Con is a self-taught programmer (just to improve kernel responsiveness nonetheless!) with a primary passion as a physician. If Con decides not to continue with the -ck patchset, I am sure he will refocus the extra dedication and time to new patients.
  • Re:Excuse me? (Score:3, Informative)

    by dbIII (701233) on Sunday July 29, 2007 @12:32AM (#20029249)
    The third bit SHOULD be obvious - if you run the thing all the time you will know how it behaves. If you run somebody else's work all the time you know how that behaves instead and seriously limit your exposure to the thing you are working on. A second low end box is cheap or even sitting there to be taken for free in a storeroom.
  • by 12357bd (686909) on Sunday July 29, 2007 @04:11AM (#20030241)

    I fully agree with the first part of your post, but I don't buy this 'Con's reaction justifies Linus's judgment' argument:

    1) It seems clear on the mailing lists that Con was really a good maintainer (only one 'problem' reported).

    2) Con's reaction seems quite understandable: after a very long time working on a project you see it not only refused by the maintainers (perfectly ok on that), but suddenly 'copied/inspired/whateveryouwant' by that very same maintainer (not quite right).

    EGO leads to sectarianism, that's all. The problem is that it seems that Con's scheduler was very good at gaming, and it's a shame that Linux dismissed a good piece of code in an especially sensitive area for personal motives.

  • by makomk (752139) on Sunday July 29, 2007 @07:45AM (#20031073) Journal
    Con's scheduler had been under development for years, and had been tested and tweaked for good performance. The scheduler that was merged into mainline was basically written from scratch by Ingo in a couple of days, but IIRC was in line for merging from the start, despite the lack of testing.
