Want to read Slashdot from your mobile device? Point it at m.slashdot.org and keep reading!


Forgot your password?
Software Linux

Linux Gets Completely Fair Scheduler 274

SchedFred writes "KernelTrap is reporting that CFS, Ingo Molnar's Completely Fair Scheduler, was just merged into the Linux kernel. The new CPU scheduler includes a pluggable framework that completely replaces Molnar's earlier O(1) scheduler, and is described to 'model an "ideal, precise multi-tasking CPU" on real hardware. CFS tries to run the task with the "gravest need" for more CPU time. So CFS always tries to split up CPU time between runnable tasks as close to "ideal multitasking hardware" as possible.' The new CPU scheduler should improve the desktop Linux experience, and will be part of the upcoming 2.6.23 kernel."
This discussion has been archived. No new comments can be posted.

Linux Gets Completely Fair Scheduler

Comments Filter:
  • crap (Score:5, Funny)

    by cachimaster ( 127194 ) on Tuesday July 10, 2007 @09:58PM (#19820885)
    just finished make xconfig;make from 2.6.22!
    • Re:crap (Score:4, Funny)

      by cachimaster ( 127194 ) on Tuesday July 10, 2007 @10:05PM (#19820937)
      Is not fair!
    • by EmbeddedJanitor ( 597831 ) on Tuesday July 10, 2007 @10:05PM (#19820943)
      Should take you less than 15 minutes to get there again.

      If you really want a rough time, see how long it takes to rebuild a different OS.

  • by EmbeddedJanitor ( 597831 ) on Tuesday July 10, 2007 @09:58PM (#19820887)
    For the really touchy-feely OS out there!
  • Neato (Score:4, Insightful)

    by friedman101 ( 618627 ) on Tuesday July 10, 2007 @09:59PM (#19820899)
    What sort of gain can the typical linux user expect because of this?
  • Process Neutrality? (Score:5, Interesting)

    by Speare ( 84249 ) on Tuesday July 10, 2007 @10:03PM (#19820917) Homepage Journal

    I know enough about process scheduling to fill a ketchup cup at the nearest burger joint, but it struck me that this sounds like the debate about "network neutrality" vs "tiered service." The O(1) was supposed to be a very generic decision-making system that made a decision in a very agnostic way (to simplify the work down to a predictable consistent order of work). This CFS strikes me as a system which will have a much higher level of complexity and context awareness, which sounds like some processes will get more than others. The intention is to make it fair in the real world but not necessarily balanced, since not all processes are alike in their needs or expectations of task switching.

    This is just rambling on, and admittedly it may be straining a metaphor too far, so don't go crazy biting my head off for not knowing all things about the kernel. See 'ketchup cup' above.

    • by Anonymous Coward on Tuesday July 10, 2007 @10:22PM (#19821035)
      Sort of. Scheduling algorithms are very important for routers too. So there is an analogy. But the analogy isn't with a tiered internet. It's with protocol based QoS. For instance, VoIP requires very low latency, but BitTorrent doesn't. So shaping traffic so that VoIP stuff gets handled by a router first (while minimally affecting BitTorrent) improves the quality of service. On the kernel scheduling side of the analogy, some software needs to have quick access to the processor, often, but for short periods of time. A GUI interface is an example. Real-time software is a more important example.

      A tiered internet is something else entirely.
      • Re: (Score:3, Interesting)

        by mc6809e ( 214243 )

        Scheduling algorithms are very important for routers too. So there is an analogy. But the analogy isn't with a tiered internet. It's with protocol based QoS.

        The analogy with a tiered internet is fine, provided you look back far enough in the history of computers.

        Before personal computers became common, people got a lot of their work done by renting time on mainframes. People that wanted cheap CPU cycles had their jobs wait for spare cycles. Those that needed immediate answers paid more and their jobs got a

    • by DreadSpoon ( 653424 ) on Tuesday July 10, 2007 @11:16PM (#19821377) Journal
      I think you have this TOTALLY backwards.

      The old scheduler was filled with huge chunks of complex code to try to guess at which processes were interactive and such, and would then specially treat those processes differently when scheduling.

      The CFS does none of that. It schedules all processes the same, in a completely fair manner, and doesn't have any special logic in it that tries to classify processes at all, other than nice levels.

      The part yet to be merged is the process grouping, which again isn't anything like the interactivity guessing code. It's just a simple way to say "these processes belong together, so when you do the CPU scheduling, treat them as a single group." It's basically just a weighting mechanism with a logical container.
    • Re: (Score:2, Insightful)

      by Anonymous Coward
      The crucial difference is that the CFS gives/takes processor time to/from processes purely based on improving the overall processor throughput and "responsiveness" of the system. CFS makes its decisions exclusively from a technical standpoint.

      This is not comparable to tiered network service because tiered network service makes decisions on what data packets to carry/drop based on political, legal, and business policies.

      As an example, CFS will probably place a low priority on background backup tasks and a hi
  • Prediction ... (Score:3, Informative)

    by Gopal.V ( 532678 ) on Tuesday July 10, 2007 @10:04PM (#19820929) Homepage Journal

    I've sort of gazed for a few seconds at the CFS articles and the following phrase caught my attention the most

    it uses a time-ordered rbtree to build a 'timeline' of future task execution

    But more importantly, I think the factor which'll probably sway me the most is /proc/sys/kernel/sched_granularity_ns. Except I've been salting my config options with one true test [slashdot.org] ... that kind of thing makes you paranoid about random tune-ups :)

  • by SoVeryTired ( 967875 ) on Tuesday July 10, 2007 @10:11PM (#19820983)
    Karma Whores:

    Steal your insightful comments from http://linux.slashdot.org/article.pl?sid=07/04/22/ 1335255 [slashdot.org]

  • Why... (Score:5, Funny)

    by lawpoop ( 604919 ) on Tuesday July 10, 2007 @10:14PM (#19821005) Homepage Journal
    Why does this sound like the title of a Monty Python Skit?

    "Why isn't my process getting more CPU time?"

    "Well, Sir, it's a Completely Fair Scheduler."
  • how it's possible? (Score:2, Informative)

    by Saija ( 1114681 )

    "The new CPU scheduler should improve the desktop Linux experience"

    really? and how it's suppose to do that wonderful thing?
    ps: i'm just curious and noob, so please don't smash me... ;)
    • Re: (Score:3, Informative)

      by Anonymous Coward
      The CPU scheduler affects the latency of processes. Interactive applications are very latency sensitive - if they do not get scheduled frequently enough the system will feel sluggish. A good desktop scheduler will therefore schedule all of your interactive tasks frequently. I don't understand the details of the CFS, but if it claims to improve the desktop Linux experience then it must do this.

      The tradeoff with short timeslices is that there's more overhead due to context switches and so the overall time s
  • by Desmoden ( 221564 ) on Tuesday July 10, 2007 @10:23PM (#19821039) Homepage

    We saw crazy performance improvements implementing kqueue in bsd, would love to see something that great at handling many sockets standard linux.
  • Questions (Score:2, Interesting)

    by flar2 ( 938689 )
    Is there a default scheduler in the linux kernel? If so, which is it? Are there several schedulers to choose from? If so, which one(s) do the major distros use? Will the new CFS be the new default or just another choice?
  • --mm line (Score:5, Informative)

    by Enderandrew ( 866215 ) <.moc.liamg. .ta. .werdnaredne.> on Tuesday July 10, 2007 @10:34PM (#19821105) Homepage Journal
    CFS has been available for some time in Andrew Morton's -mm branch of the kernel. If you really want it now, just download his latest patch and there you go.

    I've reen running with it for some time, and I really like it. I'm still not sure if it is better than Con Kolivas' SD scheduler in his patchset, but we'll see.
  • I thought the design was pretty elegant. although in practice we've been having huge problems getting Linux to scale past 32 or so CPUs. partially the locking and partially because the scheduler is not smart enough to schedule complex workloads effectively that you see on huge SMP systems.
  • by s_p_oneil ( 795792 ) on Tuesday July 10, 2007 @10:37PM (#19821133) Homepage
    The only way to make it completely fair is to let one thread slice the time up, and let the other thread choose which slice it wants. ;-)
  • by smartdreamer ( 666870 ) on Tuesday July 10, 2007 @10:47PM (#19821209)
    So does Linux reached the computer's communist's holy grail?
  • Improve how? (Score:3, Interesting)

    by suv4x4 ( 956391 ) on Tuesday July 10, 2007 @10:51PM (#19821231)
    The new CPU scheduler should improve the desktop Linux experience, and will be part of the upcoming 2.6.23 kernel.

    Could someone outline concrete problem the Linux desktop scheduling had right now that are visible resolved by CFS.

    I'm not a heavy user of Linux desktop (just servers on the shell), but it was always my experience that Linux handles simultaneous multimedia tasks (for example) better on the same hardware than Windows.

    While I contribute this more to architectural problem on the Windows side (such as.. it's quite easy an app to stall Explorer.exe or vice versa, no amount of scheduling helps there), I'm curious to see if there's tangible difference someone could describe with CFS running desktop software in Linux.
    • Re: (Score:3, Informative)

      by detain ( 687995 )
      Its been already said, but ill repeat just for completion.

      Basically right now the scheduler is unbiased, giving ticks to all applications regardless of their need for processing time. An example of this would be in X windows when you have little taskbar icons that rarely do anything, vs having cd burning software running.

      The scheduler will quickly learn that most of the time it asks the taskbar application if it needs to do anything, it doesnt, and that most of the time it asks the cd writing software to d
      • Re: (Score:3, Informative)

        by ultranova ( 717540 )

        The scheduler will quickly learn that most of the time it asks the taskbar application if it needs to do anything, it doesnt, and that most of the time it asks the cd writing software to do anything, it neeeds cpu. So very quickly it will start checking on the cd writing process more frequently than the taskbar process. This will give you a very noticable performance increase in your system.

        How did this rubbish get modded informative ? Is it someone's idea of a joke ? Or do people simply apply the "info

    • Re:Improve how? (Score:4, Informative)

      by the_greywolf ( 311406 ) on Tuesday July 10, 2007 @11:52PM (#19821605) Homepage

      CFS and Con Kolivas' SD both aim to improve interactivity of processes under high load - in particular, the goal was to reduce scheduling latency for applications which have realtime needs - like audio players. Con Kolivas has been maintaining variations no his low-latency Staircase design for several years with precisely that goal in mind.

      On the desktop, it improves latencies for (for example) music players and 3D games, improving performance and elimingating jitter, lag, and general choppiness. Both SD and CFS achieved this under loads as high as 50.

      On the server, it can have several benefits, including improved time-to-network latencies. They both want and need test cases for servers that show no detrimental effects. If you want to help, you can try out CFS on a server and report to Ingo if there are performance or latency issues.

      • Re: (Score:3, Interesting)

        I don't see how this improves the situation for realtime applications. A realtime application's need can be summarized thus: "I need X amount of resources (CPU time, I/O bandwidth) by timestamp Y." As I understand it, CFS distributes CPU time completely equally among runnable processes, so if the realtime application needs more than an equal share in order to maintain its realtime processing, it's sorry out of luck.

        If I run 50 processes that are spinning, each one of them will get just as much CPU time as
  • Poor attribution (Score:4, Insightful)

    by the_greywolf ( 311406 ) on Tuesday July 10, 2007 @11:48PM (#19821579) Homepage

    So little credit is given to Con Kolivas, whose Staircase Deadline scheduler (a more mature and refined design than CFS) spurred Ingo to finally improve his scheduler (which he wrote on the spot because, apparently, Con's scheduler wasn't good enough for him).

    And all Con gets is a minor footnote.

    • Re:Poor attribution (Score:5, Informative)

      by Anonymous Coward on Wednesday July 11, 2007 @12:06AM (#19821683)
      Not only does he get very little credit, but the whole experience of trying to get his patches merged into the mainline have soured him on kernel development altogether:
      [ck] It is the end of -ck [bhhdoa.org.au]

      It is clear that I cannot develop code for the linux kernel intended only to
      be used out of mainline and not have mainline get involved somewhere along
      the line. Whether it be the users or even other developers repeatedly
      asking "when will this be merged". This forever gets me into a cycle of
      actually trying to merge the stuff and ... well you all know what happens at
      that point (again I had nastier words but decided not to use them.)

      This is pretty sad for linux kernel development.
      • Re:Poor attribution (Score:4, Informative)

        by Anonymous Coward on Wednesday July 11, 2007 @03:29AM (#19822645)
        here is Linus' take on this issue:

        A big issue for me is also the difference in how Con and Ingo took criticism. Both SD and CFS were criticized by various people over different things, and quite frankly, Ingo ended up trying to figure out why something didn't work, while Con ended up getting involved more in flame-wars with the people who criticised SD. Is that important too? Yes it is.
        The full posting of Linus is at: http://bhhdoa.org.au/pipermail/ck/2007-June/007856 .html
    • Re:Poor attribution (Score:5, Informative)

      by tglx ( 664015 ) on Wednesday July 11, 2007 @02:50AM (#19822475)
      > So little credit is given to Con Kolivas ...
      > And all Con gets is a minor footnote.

      I'm a kernel developer myself and quite surprised you see it that way.
      Let's take a look at the kernel code:

      1) Ingo credited Con for the "fair scheduling" approach right on the first page of kernel/sched.c. That's the
      most prominent place you can get credited for working on the Linux scheduler

          * 2007-04-15 Work begun on replacing all interactivity tuning with a
          * fair scheduling design by Con Kolivas.

      2) He credited Con for a line of code that he added to CFS from SD, in kernel/sched.c

          * This idea comes from the SD scheduler of Con Kolivas:

          This is the only SD code in CFS - the two designs and approaches are quite different.

      3) He credited Con in Documentation/sched-design-CFS.txt

            I'd like to give credit to Con Kolivas for the general approach here:
            he has proven via RSDL/SD that 'fair scheduling' is possible and that
            it results in better desktop scheduling. Kudos Con!

      4) Finally he credited Con in the CFS commit log as well:

        commit c31f2e8a42c41efa46397732656ddf48cc77593e
        Author: Ingo Molnar
        Date: Mon Jul 9 18:52:01 2007 +0200

                sched: add CFS credits

                add credits for recent major scheduler contributions:

                    Con Kolivas, for pioneering the fair-scheduling approach
                    Peter Williams, for smpnice
                    Mike Galbraith, for interactivity tuning of CFS
                    Srivatsa Vaddagiri, for group scheduling enhancements

                Signed-off-by: Ingo Molnar

      I don't see much more places, where credit could be documented.

      • Re: (Score:3, Informative)

        by Anonymous Coward
        I'm not a kernel developer but happened to be reading the mailing lists when the "CFS" originally hit the scene a few months ago.

        Basically Ingo Molnar, the author of CFS, who is also the maintainer of the scheduler in the kernel, opposed the inclusion of the competing SD scheduler from Con Kolivas for years. Then he claimed that he was just suddenly inspired to whip up a new scheduler that addresses the exact same problems. He then did so in "62 hours".

        If you start at this point and read the next 20 or so
  • This update seems to have come relatively soon after the O(1) scheduler (about a year?) which is relatively quick for changes to really important low-level parts of an operating system. Does this mean that the Linux community was relatively unhappy with the O(1) scheduler which was touted as a very good thing at the time?
    • Re: (Score:3, Informative)

      "This update seems to have come relatively soon after the O(1) scheduler (about a year?) which is relatively quick for changes to really important low-level parts of an operating system. Does this mean that the Linux community was relatively unhappy with the O(1) scheduler which was touted as a very good thing at the time"

      The Linux O(1) scheduler has been around since 2002.

      It's pretty good, but there are corner cases where you can fool it. For example, if a process classified as interactive goes CPU-bound,
  • CFS vs. O(1) (Score:5, Informative)

    by Ingo Molnar ( 206899 ) on Wednesday July 11, 2007 @12:09AM (#19821707) Homepage

    (disclaimer, i'm the main author of CFS.)

    I'd like to point out that CFS is O(1) too.

    With current PID limits the worst-case depth of the rbtree is ~15 [and O(15) == O(1), so execution time has a clear upper bound]. Even with a theoretical system that can have 4 million tasks running at once (!), the rbtree depth would have a maximum of ~20-21.

    The "O(1) scheduler" that CFS replaces is O(140) [== O(1)] in theory. (in practice the "number of steps" it takes to schedule is much lower than that, on most platforms.)

    So the new scheduler is O(1) too (with a worst-case "number of steps" of 15, if you happen to have 32 thousand tasks running at once(!)), and the main difference is not in the O(1)-ness but in the behavior of the scheduler.

    • Re: (Score:2, Interesting)

      by Anonymous Coward
      I thought the depth of rbtrees were C*ln(n) where 1C2, so that the number of "steps" in going down the tree would be bounded to 16*2 in this case?

      Maybe I'm remembering the worst case behaviour of rbtree wrong.
    • Re:CFS vs. O(1) (Score:5, Informative)

      by eggnet ( 75425 ) on Wednesday July 11, 2007 @01:12AM (#19822027)
      Big O notation describes performance as "n" approaches infinity. If you cap n, then of course you cap the execution time, that's the case for most any algorithm. What you're describing still remains O(ln(n)).

      Frankly big O notation isn't a very good way to describe scheduler performance. Execution time under common loads, and maybe an extreme case would be better. Who cares about an O(1) scheduler that always takes 1 second to schedule the next task :)
    • Re:CFS vs. O(1) (Score:4, Informative)

      by Elladan ( 17598 ) on Wednesday July 11, 2007 @01:16AM (#19822043)

      This kind of a silly thing to say. I mean, all terminating algorithms on a finite machine are O(1) ultimately.

      For example, your 1 gig machine only has 2^(1024*1024*1024*8) states it can go through to reach an answer, not including disk IO... and as we all know, O(2^[1024*1024*1024*8]) =~ O(10^2585827972) = O(1). :-)
      • Re: (Score:3, Informative)

        by Champion3 ( 599877 )
        Remember that the formal definition of big-O notation says that it holds _for_all_ n greater than some n0. But in this case n has a clearly defined upper bound. This argument is not new; it's used in realtime systems as well.
    • Re:CFS vs. O(1) (Score:4, Insightful)

      by SillyPerson ( 920121 ) on Wednesday July 11, 2007 @01:52AM (#19822195)
      Am I the only person worried that the main author of CFS does not seem to understand big O notation and red-black trees?
    • Re:CFS vs. O(1) (Score:4, Interesting)

      by John Nowak ( 872479 ) on Wednesday July 11, 2007 @02:17AM (#19822325)
      If it's a red-black tree, it's O(2 ln(N)). The fact that N never gets too big has nothing to do with it. It is... disturbing... to hear this from someone like Ingo. I also have no idea where ~20-21 came from, as the maximum depth at four million is higher than that.
      • Re:CFS vs. O(1) (Score:4, Insightful)

        by Ingo Molnar ( 206899 ) on Wednesday July 11, 2007 @04:12PM (#19829567) Homepage

        (To answer your question: the 20-21 comes from other limits to the task space - right now we are still limited to 32k pids.)

        Yes, you are right, operations on an rbtree of an arbitrary data structure are of course an O(log2(N)) algorithm, no argument about that.

        I know what the mathematical meaning and definition of the big/little ordo/theta notations is (probably better than i should ;), I only wanted to point out the fact that an O(log2(N)) algorithm for most data structures in the kernel (or elsewhere on today's computers) is equivalent to O(1) in practice, especially if N is fundamentally limited to 15 bits like in this case!

        The main purpose of the ordo/theta notations is to be able to talk about and compare the performance (worst-case/best-case/average-case) qualities of algorithms. Sticking to their strict mathematical definition in cases where it departs from their original purpose results in worse software :)

        And talking about big ordo differences between algorithms operating in finite machines still makes sense (naturally): for example, O(sqrt(N)) is not equivalent to O(1) in practice - it can still be very large, even with a pretty limited N. O(N) is also obviously very relevant in practice, even on very limited machines. But the difference between O(log2(N)) and O(1) is insignificant in most cases, and in fact it is deceiving in this case. (as i pointed it out with the O(140) example.)

  • by ZeekWatson ( 188017 ) on Wednesday July 11, 2007 @02:03AM (#19822243)
    Ingo Molnar is the worst kind of loser: an attention whore. His O(1) scheduler turned out to be a piece of crap and Con Kolivas spent a considerable amount of time implementing a better solution: Staircase Deadline (SD). The SD scheduler is a well tested, good performing scheduler and just when you think its going to be merged into Linus' kernel and replace Ingo's O(1) turd (and remove Ingo's name from some "important" files), Igno spends a couple of days reimplementing SD. I guess he wont be getting his name deleted after all!

    This shows the black side of open source. Con developed SD in the open and Igno stole his ideas. It was only after people started pointing out that CFS looked _very_ similar to SD that Igno even admitted that the design was based on Con's SD work.

    The only reason CFS is in the kernel and not SD is politics.
  • Of course.. (Score:5, Funny)

    by Ztream ( 584474 ) on Wednesday July 11, 2007 @12:17PM (#19826459)
    "Hello. My name is Ingo Molnar. You killed my task. Prepare to die."

When you make your mark in the world, watch out for guys with erasers. -- The Wall Street Journal