Debating the Linux Process Scheduler

An anonymous reader writes "The Linux 2.6.23 kernel is expected around the end of the month, and will be the first to include Ingo Molnar's much-debated rewrite of the process scheduler, called the Completely Fair Scheduler. In another Linux kernel mailing list thread one more developer is complaining about Molnar and his new code. However, according to KernelTrap a number of other Linux developers have stood up to defend Molnar and call into question the motives of the complaints. It will be interesting to see how the new scheduler really performs when the 2.6.23 kernel is released."
  • by Opportunist ( 166417 ) on Friday September 14, 2007 @11:27AM (#20604637)
    Is someone who does understand the differences able to explain, in non-kernel-developer terms, what the big differences will be for the average user, developer or administrator? I mean, I'd love to discuss it, but first of all I'd want to know what we're discussing.
    • by Jeff DeMaagd ( 2015 ) on Friday September 14, 2007 @11:37AM (#20604811) Homepage Journal
      I'd love to discuss it, but first of all I'd want to know what we're discussing.

      That's not the Slashdot way. We're supposed to have an unfounded opinion based on insufficient facts and preexisting prejudices.
      • Re: (Score:3, Insightful)

        Indeed. There are known discussions. These are discussions we know that we know. There are known undiscussions. That is to say, there are discussions that we know that we don't know. But there are also unknown undiscussions. These are discussions we don't know we don't know. They occur a lot on slashdot.
    • by Aladrin ( 926209 ) on Friday September 14, 2007 @11:40AM (#20604865)
      Average? Probably nothing. But for devs/admins that are worried about certain processes taking more time than others, it -should- be more fair and keep things running smoother.

      It's possible for programs right now to exploit how the current scheduler dishes out time. As far as I know, they currently only do so out of ignorance, rather than malice. The new scheduler just corrects the problem.

      It's not something a user can really see unless they know exactly what they are looking for, and unless a dev/admin has a program that's behaving unfairly, it's not really going to matter to them, either.

      There is another invisible effect as well... Kolivas apparently publicly announced his decision to stop working on the kernel, which would include the current scheduler. That means finding another maintainer for his code, should any problems surface. If you've got 2 pieces of code that test the same in speed (as they do according to some), and 1 has a dev that's willing to keep working on it, and the other doesn't... Which would you pick?

      The new code also has the added advantage of being a really really neat idea, which encourages people to work on it as well.
      • by Opportunist ( 166417 ) on Friday September 14, 2007 @12:07PM (#20605259)
        That sounds sensible. With the increase of Linux boxes run by "average" people (who will not directly notice a difference), the threat of malware for Linux is going to be on the rise, too. And those people usually know how to exploit even the smallest flaws in a system.
        • Re: (Score:3, Insightful)

          by rbanffy ( 584143 )
          The scheduler has nothing to do with security exploits. This will only distribute cycles more evenly. Additionally the ideal malware should take steps to prevent it from getting too much CPU time in order to avoid detection.
      • by ciroknight ( 601098 ) on Friday September 14, 2007 @12:34PM (#20605629)
        Kolivas apparently publicly announced his decision to stop working on the kernel, which would include the current scheduler. That means finding another maintainer for his code, should any problems surface. If you've got 2 pieces of code that test the same in speed (as they do according to some), and 1 has a dev that's willing to keep working on it, and the other doesn't... Which would you pick?

        Wow, not even a full year has passed and we're already getting revisionist historians trying to change the situation.

        Kolivas quit because of the scheduler debacle, because nobody would listen to Kolivas, yet they were apt to follow Linus and his crony Ingo around when they drummed up more or less the exact same thing. Instead of critically listening to Kolivas' points, Linus and Ingo attacked Kolivas' merits. Under that kind of personal attack, I couldn't say I wouldn't have quit just to shut them up. Not all of us are stubborn mules and jackasses.
        • by peragrin ( 659227 ) on Friday September 14, 2007 @12:56PM (#20605933)
          What goes around comes around.

          Revisionist history is working both ways, I see. Whenever Linus or another kernel developer brought up a point of failure in Kolivas's scheduler, instead of fixing the problem Kolivas would lash out and say it wasn't broken.

          CFS won not because it was a better scheduler at the time, but because Ingo worked with the developers to make it better, instead of fighting everyone who questioned anything about it. FOSS projects are about helping everyone and listening to new ideas, something Kolivas was having a hard time doing.

          That is at least how I read the whole debate.
          • by SEAL ( 88488 ) on Friday September 14, 2007 @01:07PM (#20606079)
            Whenever Linus or another kernel developer brought up a point of failure in Kolivas's scheduler, instead of fixing the problem Kolivas would lash out and say it wasn't broken.

            CFS won not because it was a better scheduler at the time, but because Ingo worked with the developers to make it better, instead of fighting everyone who questioned anything about it.

            I believe Linus was the one claiming Ingo worked better with the developers (or at least I saw him write to that effect on the LKML). By contrast, Kolivas had many individuals helping out with his branch who were quite pleased with his progress.

            The problem is that Kolivas was working to help desktop, and particularly 3D game users. He'd say it wasn't broken if it was optimizing for those particular platforms at a cost of a fraction of a percent Oracle performance. Most of the kernel devs don't care 2 cents about 3D or desktop users, and many are employed by large enterprise businesses, so you can see who won.

            Personally I think Kolivas should've been given full access to merge his code. Many of us found his work very useful and it's sad to see him driven away by a few people who are oblivious to what everyday users need. Even if his scheduler was not the default one, it still would've been nice to have as a mainline kernel option.

            - SEAL
            • by rbanffy ( 584143 ) on Friday September 14, 2007 @03:21PM (#20608221) Homepage Journal
              "The problem is that Kolivas was working to help desktop, and particularly 3D game users."

              I don't know for sure, but I think it should be trivial to have a kernel switch activated on boot to set the preferred scheduler. This way, 3D gamers would be happy to set the -ks (I fail to remember its correct name) while the rest of us would be happy to leave it alone and get the CFS. Maybe a modular scheduler architecture would allow kernels with the -openvz or -oraclesbest or whatever other schedulers one may want (and could get support from the kernel developers).

              A slightly more clever approach would be to let one switch schedulers on the fly, but I think it's asking too much.
              • by AnyoneEB ( 574727 ) on Friday September 14, 2007 @04:07PM (#20609127) Homepage
                You are right. It would be easy. In fact, someone wrote a patch for it called plugsched [google.com]. It was not accepted into the kernel due to the fact that it would supposedly discourage the idea of simply making a scheduler which worked well for everyone.
                • by ciroknight ( 601098 ) on Friday September 14, 2007 @04:49PM (#20609719)
                  You shouldn't say "someone did", it was Kolivas himself that first offered the pluggable scheduler patch so that his patch could be used along side any new future schedulers and offer a concrete way to benchmark the changes caused by scheduling. And this was done years ago, circa 2004: http://ck.kolivas.org/patches/plugsched/ [kolivas.org]

                  Of course, Linus and Ingo rejected those patches as well.
              • Re: (Score:3, Insightful)

                I think it should be trivial to have a kernel switch activated on boot to set the preferred scheduler. This way, 3D gamers would be happy to set the -ks (I fail to remember its correct name) while the rest of us would be happy to leave it alone and get the CFS. Maybe a modular scheduler architecture would allow kernels with the -openvz or -oraclesbest or whatever other schedulers one may want (and could get support from the kernel developers).

                I don't know one thing about how PlugSched was implemented, but other unices like HPUX have the ability to designate that certain cpus will run under specific schedulers. So you can run straight FIFO scheduler on one pair of cpus, a real-time scheduler on another set of cpus and the normal unix scheduler on the rest. You need to combine that with a way to bind processes to those cpus to get the benefit of the different schedulers, but that almost goes without saying.

                I'm kind of surprised that linux does

          • by recoiledsnake ( 879048 ) on Friday September 14, 2007 @01:08PM (#20606089)

            CFS won not because it was a better scheduler at the time, but because Ingo worked with the developers to make it better, instead of fighting everyone who questioned anything about it.

            That was certainly not how I read it when it happened. I came away with the impression that Kolivas had his ideas ripped off by Ingo and that Linus was wrong in brushing Kolivas off and going with Ingo.

            The LKML email linked reinforces my suspicion. Habits and attitudes don't die easily. We see ANOTHER guy complaining that his patch was half-bakedly ripped off with no explanation whatsoever and a tangential acknowledgement.

            FOSS projects are about helping everyone and listening to new ideas, something Kolivas was having a hard time doing.

            Did you even read the linked email? Ingo seems to be talking in patches and with no discussion. If something were to happen to him, or if he quit for any reason, there would be no one who understood the code and the features behind it well enough to maintain it.

          • Re: (Score:3, Insightful)

            by Opportunist ( 166417 )
            Folks? Could we concentrate on creating a good system instead of egos?

            You know, this really has a lot of the Life of Brian. "This is his gourd", "No, we'll follow his sandal".

            The nice thing about Linux is that we can actually choose which one we prefer. Put both into the distributions and let the people sort it out. The better one will prevail.
            • Re: (Score:3, Insightful)

              by ciroknight ( 601098 )
              PlugSched would have let us do this in a big way, letting us change at boot or even while the system is running what cpu scheduler is working. But, most distributions pull most of their code from the Mainline and Linus' blessed code, so it's unlikely any distribution will ever see Kolivas' scheduler in action to know whether or not it could do better than Ingo's bastard-clone.

              Now we get even more people questioning Ingo's code, and instead of examining the code, we get people starting to question Roman's
      • Re: (Score:3, Interesting)

        by Wdomburg ( 141264 )
        Kolivas maintained the SD scheduler, which never made it into mainline.
      • by GooberToo ( 74388 ) on Friday September 14, 2007 @01:19PM (#20606253)
        Average? Probably nothing.

        I don't think that's really fair. Average these days actually has a large spread. Average web user? Average game player? Average video watcher? Average musician? Averaging what? And that's the point of the CFS. Each of those users has different expectations and each reflects a different type of scheduler workload. The old scheduler has lots and lots of code to handle various performance problems for corner cases for each type of "average user" depicted above. This in turn means the code is hard to read, hard to understand, and even harder to maintain. Worse, some code which helps some corner cases actually makes things worse for others. This in turn means more special-case coding and even more complex testing.

        The CFS is designed to simplify most everything the O(1) scheduler does, without tons of complex heuristics and special code to address various corner cases. In other words, instead of trying to guess what you really want, the CFS simply tries to be as fair as possible to everyone, thus ensuring many categories of corner cases are simply no longer an issue with the CFS.

        This does not mean CFS is always better than O(1), but early results indicate they are traveling down a good road. In fact, the last round of patches I read about actually establishes CFS as ~12% faster than the old O(1) scheduler. Of course, it's yet to be seen how well it will be received and how quickly it will prove itself with a large user base. The devil is in the details with schedulers, and sometimes real-world user workloads don't reflect anything which has been tested/validated during development.

        Nonetheless, development on CFS continues and it certainly is providing hope it will be smaller, faster, provide lower latency, be easier to maintain, and address a wider range of workloads, including many corner cases, without requiring complex code to track and analyze heuristics.

        Long story short, yes, if all goes well, even the average user may notice an improvement depending on the particulars of their workload. Wouldn't you notice a more responsive desktop? ;)

    • by Zephiris ( 788562 ) on Friday September 14, 2007 @11:50AM (#20605013)
      Essentially, the difference is how well processor resources are divided up, how evenly, and how big the pieces are that each process or task gets. Most anyone who has used Linux has had the dreaded moment where you're trying to multitask a bit, and are compiling a program while listening to music, or watching a video, and then...that's terrific, video frames are dropped, or the audio skips. Even if intermittent, it's quite annoying, at the very least. The 'Completely Fair Scheduler' is an attempt to have more fair, sane, and generally less complex scheduling. This also happens to reduce the worst case latencies, averaging from (at least on the tests with my computer) 120+ms on the vanilla 2.6.22 scheduler, to ~2.6ms with CFS.

      It's largely a drastic improvement over the old scheduling mechanisms that Linux has relied on, although other OSes have largely worked through such problems some time ago.

      While it's not exactly THE most scientific, I ran a few rounds of testing to see which did better under load while things still behaved exactly the way they should. I ran all of them with audio playing through KDE artsd, a video player, glxgears, etc. loaded, plus a CPU load induced via 'stress'. With Linux, even with CFS, it's still fairly easy to 'upset' it by just producing a fairly large (2-4) amount of load. Solaris did notably better. While it seemed to have a few quirks with scheduling in general, it could sustain a load of around 8-12 without producing video/audio frame drops. FreeBSD, with the experimental SCHED_ULE 2.0 scheduler (as of March 2007), could sustain a load of over 80 with no problems, frame drops, or even glxgears slowing down to a complete crawl (although you wouldn't especially want to use OpenGL at such a load, it was still getting the speed of software glxgears), and even at 120+ load, the mouse wouldn't respond, while everything else kept going fine. This seems purely useless, but it really comes in handy if you're trying to do one or more KDE compiles while watching video, which tends not to work on Linux.

      For the uninitiated, load averages like that are basically a multiplier vs. how much actual work your computer can do in real time. E.g., a 0.5 load would mean you're doing 50% of what you could in realtime. A 2.0 load means you're trying to handle twice what you can do in realtime. It is weighted against how many processors you have (I have one), but other things like disk access can also contribute to the load average, depending on OS.
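The load-average arithmetic described above can be sketched in a few lines of Python. This is only an illustration, assuming a Unix-like system where `os.getloadavg()` is available; the helper name `load_per_cpu` is made up for this example, and, as the comment notes, what counts toward the load average varies by OS.

```python
import os

def load_per_cpu(load_average, cpu_count):
    """Divide a load average by the processor count.

    A result of 0.5 means roughly half the available CPU capacity is in
    demand; 2.0 means twice as much work is queued as the CPUs can run.
    """
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return load_average / cpu_count

if __name__ == "__main__":
    # os.getloadavg() returns the 1, 5 and 15 minute load averages.
    one_minute, _five_minute, _fifteen_minute = os.getloadavg()
    cpus = os.cpu_count() or 1
    print(f"1-minute load: {one_minute:.2f} on {cpus} CPU(s)")
    print(f"load per CPU: {load_per_cpu(one_minute, cpus):.2f}")
```

On a single-processor machine like the one in the comment, the per-CPU figure is the same as the raw load average.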

      So, longer story short, a superior CPU scheduler can make a world of difference in how things behave when your system's doing something else with the CPU(s) at the same time.
      • by Hatta ( 162192 )

        Linux, even with CFS, it's still fairly easy to 'upset' it by just producing a fairly large (2-4) amount of load. Solaris did notably better. While it seemed to have a few quirks with scheduling in general, it could sustain a load of around 8-12 without producing video/audio frame drops. FreeBSD, with the experimental SCHED_ULE 2.0 scheduler (as of March 2007) could sustain a load of over 80 with no problems, frame drops, or even glxgears slowing down to a complete crawl (although you wouldn't want to espec

        • by TheRaven64 ( 641858 ) on Friday September 14, 2007 @12:52PM (#20605897) Journal

          Well damn. That's just... embarrassing
          It's more embarrassing than the grandparent states, actually. ULE 2 has been improved a lot in the last few months, and the newer ULE 3 performs a lot better, particularly on SMP systems. From what I've seen, the new Linux scheduler has roughly the same shape (and size) performance curves on 1-8 CPU systems as the old 4BSD scheduler that ULE replaces.

          Why's everyone using linux if it sucks so much?
          Because Linux sounds cool, while BSD sounds geeky. I was recently reminded of how badly Linux sucks when I went over some old code I'd written to get the CPU name and speed. The FreeBSD and OpenBSD implementations of this code each called a single sysctl for each result. The Linux version had to read /proc/cpuinfo and parse it. Actually, it had to parse it in two different ways, because it turns out the format of cpuinfo is different on x86 and all other platforms. Reading the battery life was even worse. On FreeBSD it's just a matter of reading the hw.acpi.battery.life sysctl (one line of code). For Linux, it involves parsing some messy procfs stuff with a format that has a habit of changing between releases. I don't understand how a developer could prefer Linux to any other UNIX.
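For readers who have not had to do this, parsing `/proc/cpuinfo` looks roughly like the sketch below. It assumes the common "key : value" layout with a blank line between processors; the sample text and the field names `model name` and `cpu MHz` are illustrative values of the kind seen on x86, and, as the comment points out, other architectures use different fields, so this is not a portable solution.

```python
def parse_cpuinfo(text):
    """Parse 'key : value' lines into a list of dicts.

    Each blank-line-separated block becomes one dict.
    """
    blocks = []
    current = {}
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current processor block.
            if current:
                blocks.append(current)
                current = {}
            continue
        key, sep, value = line.partition(":")
        if sep:
            current[key.strip()] = value.strip()
    if current:
        blocks.append(current)
    return blocks

# Made-up sample in the x86 style; real files contain many more fields.
SAMPLE = """processor\t: 0
model name\t: Example CPU
cpu MHz\t\t: 1800.000

processor\t: 1
model name\t: Example CPU
cpu MHz\t\t: 1800.000
"""

if __name__ == "__main__":
    for cpu in parse_cpuinfo(SAMPLE):
        print(cpu["processor"], cpu["model name"], cpu["cpu MHz"])
```

On a real Linux system the same function could be fed the contents of `/proc/cpuinfo`, with the caveat that the keys it finds depend on the platform.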
          • Re: (Score:3, Insightful)

            by cartman ( 18204 )

            I was recently reminded of how badly Linux sucks when I went over some old code I'd written to get the CPU name and speed. The FreeBSD and OpenBSD implementations of this code each called a single sysctl for each result.

            There are good reasons for exposing cpu info (and other kinds of OS data) as files, rather than gathering that data through system calls. Exposing the data as files makes that data usable to the range of tools people use on an OS (Unix) that supposedly treats everything as files. The data

            • Re: (Score:3, Insightful)

              by Cajal ( 154122 )
              Of course, the parent poster wasn't just complaining that Linux exposes this information as files rather than via sysctl. The real problem is that the format of the information changes frequently, and varies between platforms. /proc/cpuinfo, for example, varies widely between x86, ppc and sparc.

              It's exactly this sort of inattention to detail that makes me not enjoy developing on Linux.
          • by Dolda2000 ( 759023 ) <`fredrik' `at' `dolda2000.com'> on Friday September 14, 2007 @03:52PM (#20608861) Homepage

            I was recently reminded of how badly Linux sucks when I went over some old code I'd written to get the CPU name and speed. The FreeBSD and OpenBSD implementations of this code each called a single sysctl for each result. The Linux version had to read /proc/cpuinfo and parse it.
            While I agree with you that Linux seems almost pathetic compared to BSD when under heavy load, I have to object to that reason for calling the *BSDs superior to Linux. In fact, the sysctl interface is the single most bothersome thing about the *BSDs to me. Isn't the treatment of everything as a file the very core of the Unix philosophy? It may be true that files like /proc/cpuinfo could very well be made easier to parse, but I find nothing to complain about with e.g. the /proc/sys/ tree, or, for that matter, sysfs. Requiring a special syscall just to fetch kernel data that could just as well be made available through the file system is, in my mind, just ugly for a Unix system, and I can only imagine that it is done that way for historical reasons.

            In almost any other way, however, I would prefer the *BSDs over Linux. I still run Linux, though, but that's mainly because I want to keep running Kerberized NFS. :) When FreeBSD implements that, I may well switch.

        • Well damn. That's just... embarrassing. Why's everyone using linux if it sucks so much?

          Take his comments with a grain of salt. You can safely and completely ignore his comments about worst case latency because they are meaningless. In the Linux kernel, worst case latency almost always is caused by device drivers. Besides, most people have no idea how to properly measure worst case latency...and the measurement method may not reflect a real world workload in any meaningful way.

          Depending on the devices you
        • Re: (Score:3, Funny)

          by m50d ( 797211 )
          Well damn. That's just... embarrassing. Why's everyone using linux if it sucks so much?

          The same reason as windows: hardware support.

      • by Ant P. ( 974313 )
        I haven't bothered testing it with numbers, but I've definitely noticed the improvements since 2.6.0.
        Back then, skipping sound happened just by dragging windows around. In 2.6.22, the only time I've had that happen is when something else in my system went berserk and ate all my ram+swap up, or a full-on crash.
      • Re: (Score:3, Interesting)

        by rg3 ( 858575 )
        I'm not trying to debunk your claims or anything, but I tried to reproduce that behaviour in my 2.6.22.6 kernel (no CFS, that will be in 2.6.23), and I couldn't make mplayer drop any frames. I launched mplayer to play an MPEG4 movie with MP3 audio, launched 8 infinite loops and ran updatedb so as to keep a good amount of IO too. mplayer didn't drop any frames and the audio didn't skip at all. The load average was 10 point something. And I set the scheduler governor to powersave, so I was effectively running
    • Re: (Score:3, Informative)

      by FauxPasIII ( 75900 )
      Each process running on Linux has a "niceness" value which you as the user can set. The value indicates which ones you want to have more access to CPU power. The numbers range from 19, meaning roughly "only use the CPU when no one else needs it", to -20, meaning "all your CPU are belong to it".

      The new scheduler will make those values behave more like they're supposed to relative to one another, and hopefully use fewer resources for itself in doing so.
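To make the niceness range concrete, here is a small Python sketch for a Unix-like system. `os.nice()` is a real standard-library call; the helper names `clamp_nice` and `current_niceness` and the two constants are just for this example.

```python
import os

NICE_MIN = -20  # most favourable scheduling
NICE_MAX = 19   # least favourable scheduling

def clamp_nice(value):
    """Clamp a requested niceness into the valid -20..19 range."""
    return max(NICE_MIN, min(NICE_MAX, value))

def current_niceness():
    """Return this process's niceness without changing it.

    os.nice(increment) adds the increment and returns the new value,
    so an increment of 0 simply reads the current value.
    """
    return os.nice(0)

if __name__ == "__main__":
    print("niceness now:", current_niceness())
    # Raising niceness (being "nicer") needs no privileges;
    # lowering it normally requires root.
    print("after os.nice(1):", os.nice(1))
```

From a shell, the equivalent is to start a program with `nice` or adjust a running one with `renice`.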
      • by mamer ( 536310 ) on Friday September 14, 2007 @12:36PM (#20605669)
        Another question: I've been waiting for OpenMP support in gcc, and it seems to be coming soon. In the meantime, I tried to parallelize my code by hard coding my threads using pthreads. I tested it by running several matrix-matrix multiplications "in parallel" on a multi-core CPU, and what I got was all threads running on the same processor. Only after increasing my number of threads to a couple of dozen did I get some of them to run on the second processor. So basically, I am not getting any performance gain. I asked a number of people and they tell me "this is an old problem, basically the Linux kernel scheduler is stupid and nobody has bothered to fix it". Now, is Ingo's new scheduler fixing this? If gcc-openmp relies on the kernel scheduler, should we expect that OpenMP will basically work-but-not-really-work on shared memory multi-processor machines? I think this is an important issue to address, especially in an era where high-performance computing has become the driving force behind the hardware. BTW, how are other commercial compilers overcoming this scheduling problem?
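One thing worth ruling out in a situation like this is whether the process has been restricted to a subset of CPUs. The Python sketch below only reports the CPUs the current process is allowed to run on; it does not diagnose the poster's problem. `os.sched_getaffinity` is a real call but exists only on some platforms (Linux among them), and the fallback branch is an assumption for this example.

```python
import os

def allowed_cpus():
    """Return the set of CPU indices this process may be scheduled on."""
    if hasattr(os, "sched_getaffinity"):
        # pid 0 means "the calling process".
        return set(os.sched_getaffinity(0))
    # Assumption for platforms without the call: every CPU is allowed.
    return set(range(os.cpu_count() or 1))

if __name__ == "__main__":
    cpus = allowed_cpus()
    print(f"this process may run on {len(cpus)} CPU(s): {sorted(cpus)}")
```

If the set contains only one CPU on a multi-core machine, the threads cannot spread out regardless of which scheduler is in use.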
      • You know, it's stuff like this that shows Linux is NOT meant for consumer use, and why attempts at making user-friendly desktop distros still feel rather patched together. A -20 to 19 scale? What the hell? I'm sure there's some valid reason why that particular scale is used, but I'm also sure that reason has very little to do with ease of use. To someone without arcane knowledge, that is a completely arbitrary range, as so much about Linux systems seems to be arbitrary. And when something appears arbitrary
    • The big user-visible differences are improved fairness and interactivity. If you have multiple tasks sharing a cpu, the amount of cpu time allocated to each task is better regulated.

      Also, nice levels are more predictable. In general, decreasing the nice level by one means that the task gets 1.25 times as much cpu. This means that a nice -19 task gets approximately 50x the cpu time as a nice 20 task.
    • by dpilot ( 134227 ) on Friday September 14, 2007 @12:30PM (#20605585) Homepage Journal
      Do you want your media to play without skips or drops while you're compiling your new kernel?

      There have long been tricks like "interactive priority boost" or "nice -10 X" that attempt to make the desktop more responsive, and media play smoothly. But others believe those are just tricks, bound to misbehave in corner cases, and that a good scheduler and well implemented priority scheme will do just as well without the drawbacks. That's where CFS is trying to be. In particular most desktop responsiveness is of the sort, "I need a little CPU, and I need it NOW!" while compiles and such are "I need lots of CPU, and I'll take it whenever I can get it." The CFS keeps track of not just who's using a timeslice, but how much time they're using. That way, those short bursts of CPU keep their priority intact, while more CPU-intensive processes tend to get some priority degradation.
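The idea of tracking how much CPU time each task has actually used can be shown with a toy simulation. This is a deliberately simplified model and not the kernel's implementation: the task names, the tick counts, and the `simulate` function are invented for the example, and both tasks are assumed to be runnable at all times. The rule is simply "always run whichever task has accumulated the least CPU time".

```python
import heapq

def simulate(bursts, total_ticks):
    """Run a toy 'least runtime first' scheduler.

    bursts maps a task name to the number of ticks it uses per turn.
    Returns (runtime, turns): total ticks used and turns taken per task.
    """
    runtime = {name: 0 for name in bursts}
    turns = {name: 0 for name in bursts}
    # Heap of (accumulated runtime, name): the least-served task is on top.
    heap = [(0, name) for name in sorted(bursts)]
    heapq.heapify(heap)
    elapsed = 0
    while elapsed < total_ticks:
        used, name = heapq.heappop(heap)
        step = min(bursts[name], total_ticks - elapsed)
        runtime[name] = used + step
        turns[name] += 1
        elapsed += step
        heapq.heappush(heap, (runtime[name], name))
    return runtime, turns

if __name__ == "__main__":
    # "editor" wants 1 tick at a time; "compile" takes 10 ticks per turn.
    runtime, turns = simulate({"editor": 1, "compile": 10}, 100)
    print("ticks used:", runtime)
    print("turns taken:", turns)
```

In this model both tasks end up with the same total CPU time, but the short-burst task is picked ten times as often, so it rarely waits long for its next turn. That is the intuition behind "I need a little CPU, and I need it NOW!" being served well without special-case priority boosts.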

      This goes back a little farther than Ingo Molnar's current involvement. A while back, Con Kolivas began putting in a bunch of work on the scheduler trying to get desktop response to work right; essentially, he wanted his media, and his compiles, too. He did a lot of work and attracted a lot of users and fanbois along the way. More recently, Ingo Molnar got interested too, and came up with the "Completely Fair Scheduler." When it came time to pick one, Linus saw the CFS doing pretty well, still under heavy and active development. CK's scheduler was also pretty good, but the fanbois poisoned the waters, insisting that it was perfect as it was, and didn't need fixing. Linus chose an active development model over "perfection." Unfortunately Con Kolivas felt slighted in the process, and left. IIRC, he may have been absent during the decision window, and his fanbois did him in.

      Add to all of this the fact that the kernel can now run tickless, so that laptops can really scale back their power in between keystrokes or while you're reading the screen. There has been quite a bit of interesting work on scheduling, lately.
      • by SL Baur ( 19540 )

        When it came time to pick one, Linus saw the CFS doing pretty well, still under heavy and active development.

        No, that's not really what happened. Based on my reading of the list at the time, Linus had already decided to use CFS from the time it was first published.

        Two things happened that were very misleading. The first was the initial CFS announcement. It appeared at the time, that at the same time that Ingo was being highly critical of SD, he was implementing a variant behind everyone's back. This wasn't true; I have no reason to doubt his words that he did throw together the first version in a day or so of ha

    • Re: (Score:3, Informative)

      Average developer or administrator? Your system will be more "stable" under heavy loads, with fewer/no processes starved for CPU cycles. The new scheduler (building on Kolivas' work in an unfriendly fashion) better divides up processor time among multiple tasks.

      Average user? Multimedia tasks will not skip or stutter while the system is under load. The opposite of Vista's network performance taking a nose dive while playing MP3s, Linux systems with the new scheduler will see little/no impact from background/n
  • by budword ( 680846 ) on Friday September 14, 2007 @11:29AM (#20604673)
    I doubt any of us could tell the difference. Storm in a tea cup.
    • by MyLongNickName ( 822545 ) on Friday September 14, 2007 @11:40AM (#20604867) Journal
      As a windows user who has very little experience with Windows: This is one of the strengths of open source. if you have a large enough base of contributors, these "little" details are brought out into the open, and you can really understand how things work. I've read a bit on the subject, and it is interesting to see the different approaches that can be taken to something that most of us do not even think about.

      With Windows, how does this work? I will never know for sure. If MS doesn't choose to make it known, it isn't known. If they choose to make it known, then I just have to trust they are telling the truth (Windows Update, anyone?).

      With a project like this, you are much more likely to get the best approach to the situation.
      • As a windows user who has very little experience with Windows

        Should be "very little experience with Linux". You would think Slashdot would have a 'preview' to help with this :) Sheesh :)
      • by samkass ( 174571 )
        I see your argument, but the facts don't seem to actually back it up. By many benchmarks, Linux is currently worse than almost all the closed-source schedulers at handling loads greater than 2.0 on an interactive system and has been for years. The CFS patches seem to mostly close the gap, but I haven't seen many argue it will make Linux better than the alternatives, just let it at least play in the big league ballpark.
        • Indeed, and on top of that I have the feeling that instead of resolving the scheduling issues many other "fixes" were proposed and/or implemented like preemptive locks and 1000Hz ticks, etc. We'll see how all those together will affect performance.

          On the other hand, I have always liked the scheduler when it came to high-load and not too latency critical situations, even though IMHO the FreeBSD scheduler still scales better.

    • by Archangel Michael ( 180766 ) on Friday September 14, 2007 @11:45AM (#20604939) Journal
      More importantly, if there is more than one scheduler, and if someone could tell the difference, why isn't s/he using the ALTERNATE scheduler and compiling their own custom, tweaked and totally tuned kernel?

      Seriously, most people aren't going to notice, and those that do notice, ought to be able to compile their own kernel, and ought to do exactly that. This is nothing short of an esoteric discussion and shouldn't extend beyond kernel developers. Most people don't know, and don't care which scheduler is implemented.

      I'm one of those somewhere between caring and not. I only care about the supposed differences in approach to scheduling, and quite frankly, from what little I understand, the various schemes to scheduling have their advantages and disadvantages. I seriously doubt that ONE is better in all circumstances compared to all the others.

      • Re: (Score:3, Informative)

        by LizardKing ( 5245 )

        if there are more than one Scheduler, and if someone could tell the difference, why isn't s/he using the ALTERNATE Scheduler and compiling their own custom, tweaked and totally tuned kernel?

        Someone (Con Kolivas?) suggested a "pluggable" scheduler API. I think this was even backed up by patches to provide this functionality. Linus Torvalds rejected the proposal - I think he said that the benefits would be outweighed by the need to maintain multiple schedulers. My opinion is that the kernel could have inc

        • Someone (Con Kolivas?) suggested a "pluggable" scheduler API. [snip] Linus Torvalds rejected the proposal - I think he said that the benefits would be outweighed by the need to maintain multiple schedulers.

          ISTR, the actual issue was the overhead needed to put the scheduler in an API instead of in the kernel was unacceptably high. That said, there are modules for compiling in the different schedulers.

          It's been stated a couple of places that the different schedulers have different strengths & weaknesses

          • Re: (Score:3, Interesting)

            by larien ( 5608 )
            Hrm, wonder how Solaris does it? You can have multiple schedulers running under a single OS instance, although using different ones in the same processor set means you might not get the results you're after. Source is in OpenSolaris, so there may be scope to re-use it elsewhere, dependent on licensing restrictions.

            For those who wonder, default schedulers under Solaris 10 include TS (Timeshare, the default), IA (Interactive), FSS (Fair share scheduler, workload management), RT (Real Time), FX (Fixed) an

        • Well, those same people have admitted that pluggable schedulers are not a good idea outside of scheduler development and research. Since maybe a dozen people need/want to do that, the overhead imposed simply didn't make sense.
      • "most people aren't going to notice"

        I think most people notice when their computer starts to lag, skip frames and audio tracks, etc. They may not be able to identify this as a scheduling problem, but they do notice.
      • by zojas ( 530814 )
        The big concern is that the current scheduler is great for big servers, but totally sucks for desktop systems. Using the old scheduler on my Core 2 Duo system with 3 GB of RAM, I get crappy response from GUI apps if heavy disk I/O is running in the background. That to me seems completely ridiculous, and in fact the kernel developers should be downright embarrassed by that situation!! I'm hoping the new scheduler will be better. (To be fair, part of the blame lies with the developers of the bloated GUI apps, but
    • Re: (Score:2, Informative)

      I doubt any of us could tell the difference. Storm in a tea cup.

      I probably could. But then again, I do a lot of realtime audio recording and editing and therefore I make use of Ingo Molnar's REALTIME and PREEMPT kernel scheduler patches.

      The standard scheduler, without those patches, is just about completely useless for realtime audio recording and editing, even with nothing more than the necessary apps, JACK, X, a lightweight window manager (openbox), HAL, syslogd, anacron, and 6 gettys running. Even taking anacron out of the situation didn't help.

  • by RightSaidFred99 ( 874576 ) on Friday September 14, 2007 @11:30AM (#20604683)
    When Nerds Attack 3: The Nerdening.
    • by Azarael ( 896715 )
      What I wonder, is which scheduler they will use to pick a programming slot for the soap opera that someone is apparently trying to produce..

      On next week's show, Linus' returned-from-the-dead evil twin hatches a devious plot to turn kernel design discussions into a nerd-culture flame war. Tune in to see what happens next!
  • That was a very entertaining read. I love it when strong personalities squabble, and egos collide. Open Source is Fun!
  • by Anonymous Coward
    Another CFS flamewar within 2 weeks of the last slashdot article on it.
    http://linux.slashdot.org/article.pl?sid=07/09/01/1853228 [slashdot.org]

    And yet the most important performance bugs in the kernel haven't had any updates.
    http://bugzilla.kernel.org/show_bug.cgi?id=7372 [kernel.org]
    http://bugzilla.kernel.org/show_bug.cgi?id=8636 [kernel.org]

    I do not understand the fixation on CPU scheduling when there are so many other things that need attention. [Heck, if disk IO performance is so broken, I certainly don't have the guts to try out the new fire
    • by Aladrin ( 926209 )
      Because they care. What else do they need? An individual developer doesn't care what you, or anyone else wants. They work on what they think is

      A) Important.
      and
      B) Possible for them to fix.

      Why you'd want a scheduler guy to work on IO performance is completely beyond me. Might as well have a janitor cook dinner for you.
    • The flamewar is based on drama between two warring Linux developers. One is accusing the other of being Linus's favorite and of stealing algorithms and ideas for the different schedulers. I think the other developer quit Linux altogether as a result.

      That is why it's a big deal. Many developers have loyalties to one of the process scheduler developers.

      So in other news, it's drama
      • by Ant P. ( 974313 )
        No, that was Con Kolivas. The new guy this time around seems to have appeared out of nowhere claiming to have some far better scheduler implementation than the CFS one that's been in development for months, but is being a complete dick to everyone else.
        • Re: (Score:3, Informative)

          by phantomlord ( 38815 )
          Roman is a long time kernel dev and is the maintainer of AFFS, HFS, M68K and kconfig. He's hardly new to the scene. New to the scheduler code, perhaps.

          All of this started with Roman doing a code review of CFS a month or two ago. Roman asked some questions to clarify what certain parts of the code were doing, and Ingo asked Roman to provide more info so he could see where CFS was falling short on Roman's test cases. Both sides kept trying to talk past each other. Eventually, Roman got frustrated and provide
    • by gmack ( 197796 )
      For 7372 the fix is to use a better designed filesystem such as XFS. As far as I know there are people dedicated to trying to fix ext3's inefficiencies, but for the most part people who want better performance are just switching to other filesystems.
  • And find out which one's better.
  • by maelstrom ( 638 ) on Friday September 14, 2007 @12:03PM (#20605197) Homepage Journal
    I know they've changed the model of development for the kernel, but how many new schedulers have we gone through between 2.4 and 2.6 now? Maybe it is just me, but the scheduler seems like a pretty important piece of the kernel.... Ripping it out every 6 months and calling it "stable" seems a bit off to me.

    Oh well. I guess I'm just getting cranky in my "old" age.
    • The reason behind this according to an interview with Linus is that he wanted to stabilize a common codebase.

      Many commercial developers have steered clear of Linux and supported Windows and Solaris for this reason. Oracle even has a script that will make their RDBMS refuse to run if any modifications are made to a stock RHEL ES installation. Binary and ABI compatibility is important, as Microsoft knows.

      Yes, there are changes like this, but it won't break apps or hugely change the way Linux works. 2.6 is likely to be permanen
    • by luciofm ( 844395 ) on Friday September 14, 2007 @12:30PM (#20605575)
      There was the 2.4 scheduler, the old O(1) 2.6 scheduler and now the new 2.6 CFS scheduler...
      That doesn't look to me like it gets ripped out every 6 months, unless the 2.6 tree is only about 6 months old...
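      For anyone wondering what the "fair" in CFS is getting at, here is a toy sketch of the general idea as I understand it: always run whichever task has had the least weighted CPU time so far. This is not the kernel's code. The function name, the weight constant and the 1 ms slice are all made up for illustration, and the real thing has to handle sleeping tasks, multiple CPUs and plenty more.

```python
import heapq

# Toy model of the "fair" idea: always run the task that has received the
# least weighted CPU time so far. Names and numbers here are illustrative.
NICE_0_WEIGHT = 1024  # assumed baseline weight for an ordinary task


def simulate(weights, ticks, slice_ms=1):
    """Return how many ms of CPU each task received after `ticks` slices.

    weights: dict of task name -> weight (a larger weight earns more CPU).
    """
    # Min-heap of (virtual runtime, name); ties break alphabetically by name.
    queue = [(0.0, name) for name in sorted(weights)]
    heapq.heapify(queue)
    cpu_ms = {name: 0 for name in weights}

    for _ in range(ticks):
        vruntime, name = heapq.heappop(queue)  # task that is furthest behind
        cpu_ms[name] += slice_ms               # "run" it for one slice
        # Heavier tasks accumulate virtual runtime more slowly, so they are
        # picked more often and end up with a larger share of the CPU.
        vruntime += slice_ms * NICE_0_WEIGHT / weights[name]
        heapq.heappush(queue, (vruntime, name))
    return cpu_ms


if __name__ == "__main__":
    print(simulate({"editor": 1024, "compile": 1024, "backup": 512}, 1000))
```

      Two tasks with equal weight end up splitting the CPU evenly, and a task with double the weight gets about double the share. The arguments on LKML are about how that bookkeeping is done, not about this basic picture.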
    • by Spoke ( 6112 )

      Ripping it out [the scheduler] every 6 months and calling it "stable" seems a bit off to me.

      If you want a stable kernel, don't use the vanilla kernel. Linus has repeatedly stated that a perfect kernel is not the goal with his tree. His goal is to promote the development of the kernel which maintains some sense of stability but he pretty much guarantees that it will not be perfect.

      The process of creating a bugless kernel requires that all new development stop for an extended period of time. Which is what happe

  • Why not have the possibility to choose the scheduler? What about a modular kernel scheduler, so everyone will be happy?
    • by Chirs ( 87576 )
      Because then every function call becomes an indirect call, causing performance penalties.

      More importantly, however, this requires a stable "scheduler API". Not all schedulers want to hook the same things, which means that you need to add (and maintain) a superset of the hooks required by all the various schedulers.

      Finally, anyone who cares enough to be replacing their scheduler should be technically advanced enough to apply a patch and recompile the kernel.
      • Depends on how it's done. FreeBSD supports two schedulers, but you have to make the choice at compile time. There is no overhead, because it's just a matter of compiling one set of scheduler functions instead of the other. The linker then sorts it all out for you. Xen, in contrast, uses an extra layer of indirection, since it allows schedulers to be chosen at boot time. I believe Solaris uses a mechanism similar to Xen, although it might do some runtime patching.
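        To make the two approaches in this subthread concrete, here is a rough structural sketch. It is in Python, so it cannot show the actual cost of an indirect call, and every name in it (`RoundRobin`, `ShortestFirst`, `boot`, `BUILD_TIME_POLICY`) is hypothetical. It only illustrates the shape: a common set of hooks that every policy must implement, chosen either once at build time or through a lookup at boot.

```python
from collections import deque


class RoundRobin:
    """Policy 1: run tasks in arrival order."""

    def __init__(self):
        self.queue = deque()

    def enqueue(self, task):
        self.queue.append(task)

    def pick_next(self):
        return self.queue.popleft() if self.queue else None


class ShortestFirst:
    """Policy 2: run the task with the smallest expected burst first."""

    def __init__(self):
        self.tasks = []

    def enqueue(self, task):
        self.tasks.append(task)

    def pick_next(self):
        if not self.tasks:
            return None
        task = min(self.tasks, key=lambda t: t[1])  # t = (name, burst_ms)
        self.tasks.remove(task)
        return task


# The shared "scheduler API": every policy must offer enqueue/pick_next.
# A policy that wants a hook the others lack forces the API to grow.
SCHEDULERS = {"rr": RoundRobin, "sjf": ShortestFirst}


def boot(policy):
    """Boot-time choice: look the policy up, then call through the object."""
    return SCHEDULERS[policy]()


# Build-time choice: one policy is bound once and the rest of the code
# refers to it directly by a fixed name.
BUILD_TIME_POLICY = "rr"
Scheduler = SCHEDULERS[BUILD_TIME_POLICY]
```

        In C the build-time variant would link the chosen functions in directly, while the boot-time variant would call through function pointers, which is where the extra layer of indirection comes from.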
  • For example, ReactOS had a flamewar regarding the "stolen code from Windows", and it was nearly identical. There was this obsessive guy that got fed up over nothing just because his pride as a person was hurt. In the end he was just misinterpreting stuff. The other guy tried to be calm and understanding, but it didn't work.

    In the end, it's just about one thing: Some developers, no matter how high their IQ is, are too full of themselves because they have a stupid complex and a low self-esteem.
  • It will be interesting to see how the new processor really performs when the 2.6.23 kernel is released."
    Why would you have to wait? Couldn't someone just grab the source, compile it, and benchmark it? Yes, it may have bugs, but it should at least give an indication of overall speed.
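    If anyone wants a quick and dirty number to compare across kernels, something like the sketch below measures how late a process wakes up from a short sleep. Run it on each kernel while the machine is under load. It is only a rough userspace probe, not a proper scheduler benchmark: the function names are my own, and Python's overhead and the timer resolution are mixed into the result.

```python
import time


def wakeup_overshoot_ms(samples=200, sleep_ms=1.0):
    """Sleep repeatedly and record how late each wakeup was, in ms."""
    overshoots = []
    for _ in range(samples):
        start = time.perf_counter()
        time.sleep(sleep_ms / 1000.0)
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        overshoots.append(elapsed_ms - sleep_ms)
    return overshoots


def summarize(values):
    """Return the median, an approximate 99th percentile and the maximum."""
    ordered = sorted(values)
    last = len(ordered) - 1
    return {
        "median": ordered[len(ordered) // 2],
        "p99": ordered[min(last, int(len(ordered) * 0.99))],
        "max": ordered[-1],
    }


if __name__ == "__main__":
    print(summarize(wakeup_overshoot_ms()))
```

    The tail values (p99 and max) are the ones that matter for the audio skips and GUI lag people describe here; the median will usually look fine on any kernel.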
  • BEOS scheduler? (Score:3, Interesting)

    by Danathar ( 267989 ) on Friday September 14, 2007 @12:19PM (#20605451) Journal
    Does anybody know what kind of scheduler BeOS used before its demise? I seem to recall it ran circles around other OSes at the time when it came to multitasking multimedia.

  • -Emacs vs. vi
    -ReiserFS vs. ext3 (obsolete)
    -GPLv3 vs GPLv2
    -GPL vs BSD ....

    and now:

    Scheduler wars!
  • It may be time for Linux development to split. One fork will focus on stable code that works like a UNIX, and the other on forging new boundaries. I think the FreeBSD developers did something similar.

    There is a good reason for this. When you want to make something stable, you want to take proven ideas and refine them so you can make guarantees.

    But for our hacker souls, and our inner adventurers, we also need something that is determined to break new ground and make no guarantees. The CFS is being justified b
  • by Cassini2 ( 956052 ) on Friday September 14, 2007 @01:52PM (#20606633)

    One way to cause dramatic performance problems on a Windows machine is to simply write a program that accesses lots of files. Performing a network backup with the Windows Networking API is a good example of this. Windows responds by fetching the files from disk and using system memory as a cache. In the process, the working set of programs running on the computer is paged out. The result is that low-priority activities can dramatically slow down potentially important activities on the computer. A good example of this is doing a network backup or a background virus scan on a Windows computer while trying to do any foreground activity (like browsing the web or using Microsoft Word).

    So far, in my experience, Linux seems pretty immune to these priority inversions. Will the new scheduling algorithms allow low-priority processes to cause priority inversions by abusing non-processor resources like the network or disk drives?

  • by tota ( 139982 ) on Friday September 14, 2007 @02:31PM (#20607245) Homepage
    The discussion is not as bad as it sounds (almost normal for LKML!), it's just that Roman wants to talk about the maths and Ingo works with patches... as Willy Tarreau pointed out "I know for sure that the common language here on LKML is patches".

    Beyond the heated discussion with Roman Zippel, there are still a few workloads which can trigger regressions, one of which I found running some unit tests.

    This is covered in this thread [kerneltrap.org], and although there is now a version of CFS which does not exhibit the problem (see graph of combo3-yield patch [devloop.org.uk]) it is not the one that is meant to be merged in 2.6.23 (these patches are 2.6.24 material) so Ingo is getting me to test patches until this regression can be solved.

    One slightly annoying thing is that the current fix involves using sysctl to switch back (at least partially) to the old scheduler mechanism!
  • that Ingo takes the work/ideas of others about the scheduler and presents his own implementation without proper recognition or collaboration.

    As the person mainly responsible for the kernel, don't you think that something has to be done? In the current state I (as a developer) would not try to contribute to such a personally 'monopolized' project, and that's sad.

  • by l4m3z0r ( 799504 ) <kevin@ubeEINSTEI ... minus physicist> on Friday September 14, 2007 @03:46PM (#20608739)

    As I started reading the comments on here I noticed that many were quick to put down Ingo for his transgressions, and it's quite obvious from the comments that no one has bothered to read the exchange on LKML in order to become familiar with what is going on. I have read it, I have 0 bias for either Zippel or Molnar and I can say without any reservation that Zippel is a wank and Molnar is borderline saintly.

    A recap of what I have read and understood about the entire situation:

    • Zippel shows up with some patches out of the blue and makes some claims about them.
    • Molnar takes time out of his day to look at and understand the patches and points out a few things such as lack of comments and missing functionality, also points out that some of the things Zippel wants to do have been done since the last time Zippel looked at the code and that he should look into the latest they've done.
    • At this point Zippel decides to try to get into a pissing contest with Molnar and makes more claims about the superiority of his math and how the patches aren't the point.
    • They argue back and forth, all the while Molnar very seriously considering and understanding Zippel's concerns. Molnar asks Zippel to split up his giant patch into smaller pieces that are commented and cleaned in order to bring the useful ones into the kernel more easily.
    • Zippel essentially whines and repeats the same arguments over and over for a bunch of threads.
    • Molnar basically pleads for him to cooperate so CFS can get the benefits Zippel has to offer.
    • Zippel refuses, acts like a child.
    • Molnar reworks a few of Zippel's ideas into smaller patches, incorporates them into CFS and attributes this to Zippel.
    • Zippel is very rude about this and accuses Molnar of stealing from Zippel.
    • Molnar again asks him to cooperate.
    • Other people get annoyed and start criticizing Zippel.

    Ultimately I think Zippel is purposefully trying to provoke Molnar throughout all of this. His wild accusations are nothing more than games that he is playing, the guy has a chip on his shoulder and if Linux was my toy, I would have blocked him from the mailing lists.

"Can you program?" "Well, I'm literate, if that's what you mean!"
