Stories
Slash Boxes
Comments

News for nerds, stuff that matters

Slashdot Log In

Log In

Create Account  |  Retrieve Password

The Completely Fair Scheduler

Posted by CmdrTaco on Sun Apr 22, 2007 11:31 AM
from the i'm-always-late dept.
hichetu writes "Kernel trap has a nice summary of what is going on behind the scenes to change the Linux Scheduler. The O(1) Linux scheduler is going to be changed so that it is fair to interactive tasks. You will be surprised to know that O(1) is really too good not to have any side-effects on fairness to all tasks."
+ -
story

Related Stories

[+] Linux Gets Completely Fair Scheduler 274 comments
SchedFred writes "KernelTrap is reporting that CFS, Ingo Molnar's Completely Fair Scheduler, was just merged into the Linux kernel. The new CPU scheduler includes a pluggable framework that completely replaces Molnar's earlier O(1) scheduler, and is described to 'model an "ideal, precise multi-tasking CPU" on real hardware. CFS tries to run the task with the "gravest need" for more CPU time. So CFS always tries to split up CPU time between runnable tasks as close to "ideal multitasking hardware" as possible.' The new CPU scheduler should improve the desktop Linux experience, and will be part of the upcoming 2.6.23 kernel."
This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
 Full
 Abbreviated
 Hidden
More
Loading... please wait.
  • by Anonymous Coward on Sunday April 22 2007, @11:33AM (#18831995)

    I thought Linux used Cron as a scheduler ?

    • by ozamosi (615254) on Sunday April 22 2007, @11:37AM (#18832027) Homepage
      That is for scheduling background tasks that run once a day (or whatever you set it to)

      This is for scheduling CPU resouces in real time. To decide if Firefox or Apache is going to be executed the following split second.
      • by LiquidCoooled (634315) on Sunday April 22 2007, @11:51AM (#18832103) Homepage Journal
        Can't we just give the processes weapons and let them decide which follows?
        • by sphealey (2855) on Sunday April 22 2007, @12:02PM (#18832197)
          > Can't we just give the processes weapons and
          > let them decide which follows?

          That is actually the kind of question that my Operations Research professor (who also did a lot of work in CPU simulation and performance estimating) used to throw onto final exams as the "separate the B+ from the A" question. If your answer was interesting enough he would send you over to one of his Masters candidates to see if it could be taken any further. So I wouldn't count your suggestion out from the start!

          sPh
          • by mosel-saar-ruwer (732341) on Sunday April 22 2007, @12:32PM (#18832419)

            GP: Can't we just give the processes weapons and let them decide which follows?

            P: That is actually the kind of question that my Operations Research professor (who also did a lot of work in CPU simulation and performance estimating) used to throw onto final exams as the "separate the B+ from the A" question. If your answer was interesting enough he would send you over to one of his Masters candidates to see if it could be taken any further. So I wouldn't count your suggestion out from the start!

            Behold: The Mother of All Possible Comp Sci Flame Wars: The Darwinistically Selected Genetic Algorithms -vs- the Intelligently Designed Algorithms.

            Bumper Stickers $4.95; T-Shirts $19.95:

            $DEITY does not play dice with the Turing Machine.
            • by Anonymous Coward on Sunday April 22 2007, @01:34PM (#18832801)
              Coming up with an idea (even if totally made up) and the backing it up with arguments is much harder than memorization and regurgitation and actually backing it up with things having to do with that class shows you have learned something, or at least know about the concepts discussed in the class.
        • by HerrEkberg (971000) on Sunday April 22 2007, @12:05PM (#18832227) Homepage
          Just throw this [unm.edu] into the kernel and we are good to go.
        • by Progman3K (515744) on Sunday April 22 2007, @12:41PM (#18832489)
          Yes, that's called psDoom
          http://psdoom.sourceforge.net/ [sourceforge.net]
  • Fair? (Score:5, Funny)

    by alienmole (15522) on Sunday April 22 2007, @11:37AM (#18832029)
    If scheduling was completely fair, this would have been a frist ps0t.
  • by 0xdeadbeef (28836) on Sunday April 22 2007, @11:40AM (#18832039) Homepage Journal
    Free software: because some processes are more equal than others.
  • Surprised? (Score:5, Funny)

    by WombatDeath (681651) on Sunday April 22 2007, @11:43AM (#18832055)
    You will be surprised to know that O(1) is really too good not to have any side-effects on fairness to all tasks.

    No I won't, because I don't know what the hell it means.

    Hah! In your face, Taco!
    • Re:Surprised? (Score:5, Informative)

      by Eevee (535658) on Sunday April 22 2007, @11:50AM (#18832101)
      In computer science, Big O notation [wikipedia.org] is used for the complexity of a task.
    • Re:Surprised? (Score:5, Informative)

      by jrschulz (684749) on Sunday April 22 2007, @12:06PM (#18832235) Homepage
      A scheduler is the piece of software that brings you the illusion of multi-tasking. Because a single processor (with a single core) can only run one process at the same time, the operating system switches the process currently running. And it does this very fast (IIRC up to 1000 times a second in the case of linux).

      The scheduler decides which process runs when and has to make sure that no process has to wait in the queue forever without getting his share of CPU time (this is what is called "starving").

      Since the scheduler is a program by itself, it has a specific runtime characteristic, usually dependent of the number of programs waiting for their CPU share. The special property of the current scheduler in linux is that its runtime is in fact independent of this number. That's expressed in CS by O(1).
  • I/O prioritisation (Score:5, Interesting)

    by Doug Neal (195160) on Sunday April 22 2007, @11:47AM (#18832069) Journal
    Linux really doesn't need a new process scheduler. What it could really do with is I/O prioritisation. Windows now has it, so there's no excuse. CPU power is fairly abundant these days so managing its usage is less of an issue than it used to be, but I/O bandwidth is often in short supply and I/O-bound applications can choke a system and make interactive processes a pain in the ass to use. I'd like to see some way of reserving and limiting bandwidth to particular devices for particular processes. And an equivalent of "top" for monitoring processes' I/O activity would also be extremely handy... as far as I know, the system calls don't even exist in the kernel to do this yet.
    • And an equivalent of "top" for monitoring processes' I/O activity would also be extremely handy... as far as I know, the system calls don't even exist in the kernel to do this yet.
      Windows has this to an extent. I press Ctrl+Shift+Esc to open Windows Task Manager, then View > Select Columns... > turn on Page Faults Delta, and I see the penalty for sticking with the 6-year-old paid-for PC that I still use.
    • by Anonymous Coward on Sunday April 22 2007, @11:57AM (#18832149)
      > What it could really do with is I/O prioritisation.
      ionice [die.net]. Available since 2005-08-28.

      > And an equivalent of "top" for monitoring processes' I/O activity would also be extremely handy

      I agree, that would be nice.
    • by Stephen Williams (23750) on Sunday April 22 2007, @12:28PM (#18832379) Journal
      And an equivalent of "top" for monitoring processes' I/O activity would also be extremely handy

      I'd love something like that.

      There's a way of logging I/O; it's pretty rough-and-ready, not really suitable for permanent use, but can be handy for figuring out what keeps causing a laptop HDD to spin up, for example. As root, do:

      echo 1 >/proc/sys/vm/block_dump
      I/O is then logged to the kernel ring buffer, and can be retrieved with dmesg. The entries look like:

      pdflush(138): WRITE block 1161864 on dm-4
      pdflush(138): WRITE block 0 on dm-3
      pdflush(138): WRITE block 524328 on dm-3
      pdflush(138): WRITE block 786952 on dm-3
      pdflush(138): WRITE block 786960 on dm-3
      When you've finished, do

      echo 0 >/proc/sys/vm/block_dump
      as root to turn it off again.

      Like I said, very rough-and-ready, nowhere near as nice as a proper I/O top would be, but there it is.

      -Stephen
    • by Animats (122034) on Sunday April 22 2007, @01:24PM (#18832747) Homepage

      Linux really doesn't need a new process scheduler. What it could really do with is I/O prioritisation.

      QNX has that, which is essential for real-time work.

      QNX has the advantage that I/O, like almost everything else in QNX, is done via inter-process message passing operations. The message passing system uses priority queues, and so requests to file systems and devices get handled in priority order. So resource managers (file systems, device drivers, etc.) don't have to explicitly handle priorities; it's done for them. Some resource managers, like disk handlers, process multiple requests at a time so they can reorder them to optimize access, but network devices and such are FIFO at the resource manager level and priority ordered at the message level.

      The end result is that you can compile or surf the web on a system that's also doing real time work without interfering with the real time tasks.

  • by tji (74570) on Sunday April 22 2007, @12:36PM (#18832437)
    No, I think I'll wait for the unbelievably fair scheduler, or perhaps the ridiculously fair scheduler.
    • You are incorrect because the act of scheduling the next process to run requires constant time regardless of how many processes there are. The O(1) refers to this, correctly.

      The reason this is important, and the reason they are worried about the act of scheduling the next process rather than time complexity over all N processes, is that if scheduling the next process were not constant time, the percentage of time spent scheduling the next process would grow larger as you added more processes. That's fine if you're on a desktop system and you go from 100 to 200, but as the number starts getting large on (say) big servers, you start running into a situation where all your CPUs are perpetually tied up trying to figure out what process to run next.

      O(n) time over N processes is not a problem; you've either got the CPUs or you don't. If you don't, then your performance will suck for reasons that are your own fault. If you do have enough CPUs, then the time spent scheduling will remain in step with the time spent running the processes, and this is fine. However, if the time spent scheduling grew, every time a process was scheduled, across all CPUs, then your $50000 server would be worthless because it wouldn't be able to handle the workloads you intended for it. All those expensive CPUs would sit there figuring out what process to run rather than running it.

      So, it is NOT a misnomer. It accurately describes the portion of the problem that the developers are concerned with. It's O(n) over n processes, and that's great because it means you can get to n without breaking down.
        • by ASBands (1087159) on Sunday April 22 2007, @12:49PM (#18832559) Homepage

          Here's how Linux 2.6.x task scheduler works:

          • A "runqueue" keeps track of all essential tasks assigned to a given CPU [Queue is O(1) efficient]
          • Each runqueue contains two priority arrays, the active and the expired. As a task completes, it is moved (during the move, the new timeslice is calculated - the point of debate and largest fundamental change to the task scheduler) [Moving from static array to static array is O(1) efficient]
          • When the active priority array is finished, the expired array becomes the active array [Swapping 2 pointers is O(1) efficient]
          • The priority arrays are of fixed length 140, as that is the amount of priority levels Linux has [O(140) = O(1)]
          The point of debate comes if two threads have the same priority, in which case they are put on a round robin in the priority array. However, the confusion comes in that the execution is O(n) (which makes sense if you think about it), but the scheduler itself handles these at O(1) efficiency.
          • by xenocide2 (231786) on Sunday April 22 2007, @01:32PM (#18832791) Homepage
            It seems like the difference is that people call it an O(1) scheduler to reflect the fact that all the work required to be done at the end of a timeslice (record keeping, picking a new process etc) is done in constant time, and people arguing for O(n) are referring to the cost to schedule all processes. Nobody's saying they can find a schedule for n processes without looking at all of them.
    • by jb.cancer (905806) on Sunday April 22 2007, @01:20PM (#18832723)
      Well, yes & no. It really was Con's RSDL that got ppl looking seriously into changing the mainline scheduler (there already being several out-of-mainline alternatives like nicksched, etc.)

      Con's scheduler seemed to work better at higher workloads than the mainline, by just trying to distribute load evenly and not trying pretty interactivity tricks. But several ppl did say it didn't perform well for certain X client workloads. That's when Ingo's CFS was posted.

      There really is 2 alternatives Ingo's CFS & Con's SDL that's being simultaneously tested by the kernel developers now, and none is accepted into mainline.

      So it wouldn't be fair to say that CFS is *the* next Linux scheduler. It could be SDL as well.