Con Kolivas Returns, With a Desktop-Oriented Linux Scheduler 333
myvirtualid writes "Con Kolivas has done what he swore never to do: returned to the Linux kernel and written a new, and according to him waaay better, scheduler for the desktop environment. In fact, BFS appears to outperform existing schedulers right up until one hits a 16-CPU machine, at which point he guesses performance would degrade somewhat. According to Kolivas, BFS 'was designed to be forward looking only, make the most of lower spec machines, and not scale to massive hardware. i.e. [sic] it is a desktop orientated scheduler, with extremely low latencies for excellent interactivity by design rather than 'calculated,' with rigid fairness, nice priority distribution and extreme scalability within normal load levels.'"
Linux on the Desktop/Linux on the Server (Score:3, Insightful)
Clearly, Desktop Linux and Server Linux have some things in common, but they also have different needs. I'm not intimately familiar with kernel programming, but I do have some basic understanding of how it all works, and even I find it relatively easy to see that the needs of a good, snappy desktop and those of a reliable server are going to differ.
I think it's past time that kernel operating-mode optimizations like this scheduler were available for the desktop, even if the defaults remain server-oriented.
Re:Glory! (Score:5, Insightful)
No. They would be crazy to use this scheduler anyway since it won't scale to
their 4096 cpu machines. The only way is to rewrite it to work that way, or
to have more than one scheduler in the kernel. I don't want to do the former,
and mainline doesn't want to do the latter. Besides, apparently I'm a bad
maintainer, which makes sense since for some reason I seem to want to have
a career, a life, raise a family with kids and have hobbies, all of which
have nothing to do with linux.
Which is not to say that it might not find its way into the Ubuntu Desktop mainline patchset, for example. Sure, it might not make sense for the mainline kernel, but it surely makes sense for a user-focused distro like Ubuntu: they already ship patched base and server kernels, so why not a genuine desktop-targeted kernel?
Re:Glory! (Score:5, Insightful)
I wonder what BeOS had, that was so good. I mean, was it a scheduler thing? Or was it the pervasive multithreading that the OS almost forced upon the developers? Whatever it was, it worked like black magic: BeOS would always listen to the user input, no matter what the heck it was doing in the background, no matter what insane load was on the CPU. Your mouseclicks were always acted upon immediately, your drags were always acted upon immediately, your typing, resizing, brushstrokes, MIDI signals, whatever, always, under any circumstance, were immediately and smoothly followed by the correct response.
I was hoping Windows 2000 would achieve that, then I was hoping Windows XP would achieve that, then I was hoping some of the newer 2.6 kernels in Linux coupled with innovations in X would achieve that - but I was always deeply, utterly disappointed. Then I kinda hoped Vista would get somewhat close to what BeOS did. Oh yeah, now that was a hope decisively smashed.
Re:great news (Score:5, Insightful)
I think that's only going to be a good thing, because IMO the arguments against pluggable schedulers are weak. "We need the few people working on this to just make the core better for ALL CASES" is about the most valid one I've heard, but Linux is too broadly applied to force one scheduler to meet all cases. Realtime, embedded, servers, desktop: I just don't think one scheduler can be shoehorned into maximizing performance for all of those. You wind up with a crippled scheduler that really achieves maximum performance in at most one of those four domains. And the question of whether there are enough developer minds working on it? You can bet that more commercial enterprises will start throwing money at it once they can customize it for their domain.
It's like the dynamic syscall argument, in a way. Without dynamic syscalls, the argument goes, all the "fringe functionality" people have to think harder and integrate their stuff into the current syscalls/drivers/subsystems (apologies, Ingo). However, without dynamic syscalls, the "middle of the road" functionality people, like hardware manufacturers, are unwilling to release drivers that they essentially have to ask customers to compile as a supported option.
Both, IMO, are cases of cutting off your leg to spite your foot.
He ain't kidding. (Score:5, Insightful)
CFS can't even cope with a CPU-bound application [foldingforum.org].
Who here runs Linux on anything with more than 16 cores? Why should everyone else get the shitty end of the stick just because of maybe a dozen institutes with deep pockets?
Re:great news (Score:5, Insightful)
I think anyone who cares and knows anything about this debate is hoping Linus sees the light and allows work to begin on pluggable schedulers. There are no definitive arguments against having pluggable schedulers, and plenty of formidable ones for them. I never really understood Linus' handling of Con in the past, I really hope that, this time round, the new BFS is given a fair assessment, and if it's found to be better under desktop use patterns, adopted for use in desktop distros.
The idea that the Nokia N900 smartphone uses the same process scheduler as my now-dated laptop as well as my 8 core server is just silly.
Re:Glory! (Score:5, Insightful)
The whole point is moot. Relying on a single maintainer is just plain stupid. "All things being equal," they should choose the code which OTHER people can maintain more easily.
Re:He ain't kidding. (Score:3, Insightful)
>Who here runs Linux on anything with more than 16 cores?
Along the same lines... Who here runs their Linux *servers* with 16 or *fewer* cores? Probably 99.9%?
And "server" doesn't really mean anything. At work, we use Linux thin clients, so the Linux "server" is really dealing with 150 desktops, except not managing X/kb/mouse. So should it be treated like a "server" or a "desktop" for scheduling?
Re:He ain't kidding. (Score:5, Insightful)
I think what you want is not a single scheduler designed for the desktop, but one designed for server processes. That's probably the whole argument here - there isn't a single scheduler that can work efficiently for the 2 wildly different types of work a user put a machine to, but currently you don't have a choice. This is all about giving users choice of what kind of scheduler they'd like to run. You might even find that a scheduler designed for lots of CPUs (at the expense of interactivity probably) would suit you much more than the current system, especially when you buy more cores.
Re:Glory! (Score:2, Insightful)
The pervasive threading made it somewhat more difficult to actually write applications for, and considerably more difficult to write cross-platform applications that worked well on BeOS and other systems (Windows, Mac, Unix and so on). That didn't help with the fairly small number of applications available for BeOS. By all accounts, the rest of the OS provided a pretty decent API though.
Using a multi-threaded UI isn't unique to BeOS though. It just happens to be the only platform that required a multi-threaded UI to do anything at all. At least two platforms come to mind where a multi-threaded UI is required, because the framework is just too slow and unresponsive if you don't.
In Java, Swing UIs tend to perform abysmally if you do any non-trivial work on the UI thread. The UI code isn't all that fast, and its design lends itself to doing lots of work on the UI thread, which causes the UI to hang. Most Swing applications have terrible responsiveness as a result. However, you can use worker threads to actually do the work, and use the UI thread only for event handling; if you do that, a Swing application can be extremely responsive. It's slightly trickier to do, but once you get the hang of it, it's not too hard.
The same is pretty much true of .NET's Windows.Forms. It's a bit faster than Swing, although not by much (some parts are actually slower: System.Drawing vs. Java2D, for example), so it's a little more forgiving of work done on the UI thread. It will still bite you in a non-trivial application. Of course, the framework provides absolutely no help in writing a multithreaded application, and all of the tools, examples and documentation make writing one far more difficult than it should be. You *can* write a multithreaded Windows.Forms application, but almost nobody does. Which is a shame because, as with Swing, getting all the work off the UI thread makes a huge difference to the application's responsiveness.
Most other frameworks are fast enough that most application developers don't feel the need to multi-thread the UI, because the UI isn't noticeably slow. While it might not actually be slow, it surely could be much faster.
I kind of like Qt 4's approach. It's still optional, but it makes it pretty easy to create worker threads. The worker threads communicate using signals and slots, and Qt automatically handles dispatch between threads by mapping a cross-thread signal to an event on the target thread. It's pretty much the simplest approach I've ever seen - it works the same way as .Net's cross-thread delegate invocation, but it's completely transparent, and doesn't require anywhere near as much pointless boilerplate code.
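The worker-thread pattern these posts describe can be sketched in a few lines of Python. This is a hypothetical illustration, not BeOS or Qt API: the heavy work runs on a worker thread, and the "UI" thread only dispatches events posted to a queue, loosely analogous to Qt mapping a cross-thread signal to an event on the target thread.

```python
import queue
import threading

def worker(n, events):
    """Does the slow computation off the UI thread, then posts the
    result back as an event (the cross-thread 'signal')."""
    result = sum(i * i for i in range(n))  # stand-in for real work
    events.put(("work_done", result))

def ui_loop(events):
    """Stand-in for a UI event loop: it stays responsive because it
    only dispatches events; it never computes anything slow itself."""
    results = []
    while True:
        kind, payload = events.get()
        if kind == "work_done":
            results.append(payload)
        elif kind == "quit":
            return results

events = queue.Queue()
t = threading.Thread(target=worker, args=(1000, events))
t.start()
t.join()                     # a real UI would keep looping, not join
events.put(("quit", None))   # sent when the window closes, say
results = ui_loop(events)    # collects the worker's posted result
```

The design point is the same one made above about Swing and Windows.Forms: the event thread never blocks on real work, so input stays responsive regardless of background load.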
Re:Who cares? (Score:3, Insightful)
During testing (on the Windows platform!) I guess it's safe to assume that everything was served from the filesystem cache.
The comparison with compiling the kernel on a Linux machine without much RAM doesn't hold.
Re:Glory! (Score:1, Insightful)
I would say that Con's FAQ entries demonstrate exactly that Linus was right. That is not the attitude of a reliable maintainer. ...
My observation is that Con tried to "work within the system" and got nowhere. He ran up against the politics and personalities at the core of Linux, and once you've done that, there's nowhere to go. At some point, I can't blame the man for throwing up his hands and adopting a "Who The Fuck Cares?" attitude. I don't think it says much about his "reliability". I think it says more about the limits of the "benevolent dictatorship" Linux governance model. I'm not saying that a better model exists -- I'm just saying the current model isn't perfect. One consequence of Linus' dictatorship is that a certain number of talented people will be driven away from Linux.
Re:16... okay for the desktop for 12 months (Score:3, Insightful)
16 probably isn't very far off. The ARM Cortex A9, which is starting to ship into handhelds and mobile phones, scales to 4 cores. The A10 will probably handle 16, so expect to see handheld computers with 16 cores in the next couple of years. Of course, when you're on battery power, you'll probably want to turn a few of these off, so the scheduler has to decide not just which jobs to run, but how many cores to enable at any given time. This is a really difficult problem (you can read some interesting papers on the subject, quite a few funded by Intel research grants, if you look) because running two cores at 500MHz can use less power than running one at 1GHz, but only if they are both loaded. Once you add in the ability to scale the clocks on each core independently it becomes even more tricky. Then you need to add in the requirements of asymmetric multiprocessing environments; deciding if it is worth turning the GPU core on to run this OpenCL kernel, or should you schedule it on the CPU, for example.
Any scheduler created today is likely to look horribly antiquated in five years. There are so many open research problems in the domain, before you even get down to implementing the algorithms.
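The power claim in the post above follows from the classic CMOS dynamic-power approximation, P = C_eff * V^2 * f, plus the fact that lower frequency usually permits lower voltage. The numbers below are made up for illustration (not real silicon), but they show why two half-speed cores can beat one full-speed core, and only when both are kept busy:

```python
def dynamic_power(c_eff, volts, freq_hz):
    """Classic dynamic-power approximation: P = C_eff * V^2 * f."""
    return c_eff * volts ** 2 * freq_hz

C_EFF = 1e-9  # hypothetical effective switched capacitance, farads

# Assumed (made-up) voltage/frequency operating points:
one_core_1ghz = dynamic_power(C_EFF, 1.2, 1.0e9)         # 1 GHz needs 1.2 V
two_cores_500mhz = 2 * dynamic_power(C_EFF, 0.9, 0.5e9)  # 500 MHz runs at 0.9 V

# With these toy numbers: ~0.81 W for two slow cores vs ~1.44 W for one
# fast core -- a win only if the workload parallelizes across both.
```

This is why the scheduler's choice of how many cores to enable interacts with voltage/frequency scaling: the V^2 term means spreading load across slower, lower-voltage cores can save power, but an idle-but-powered extra core is pure loss.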
Re:great news (Score:3, Insightful)
The [sic] is in the wrong spot.
The quote reads "it is a desktop orientated scheduler", while the story title says "desktop oriented scheduler".
It should read: "it is a desktop orientated [sic] scheduler".
Re:16... okay for the desktop for 12 months (Score:4, Insightful)
I guess you didn't read TFA:
In the meantime if you care about CPU utilization and latency then use this. Tomorrow will take care of itself. It's not like if you buy one computer or graphics card, or build one kernel, that you're tied to it for the rest of your life. You use this year what's available and update when the situation warrants it.
Re:Glory! (Score:5, Insightful)
No normal user cares about their video encoding being 2 seconds slower (over a 3 hour process) because they wanted to answer their email. If that's really important to you, you are probably doing your video encode overnight or during some time when nobody's using the computer, anyway, and then it doesn't matter.
Instant response is *always*, *always* more important than all other tasks. Always. One of the many, many things BeOS got right.
Re:great news (Score:2, Insightful)
All people are occasionally idiots; this is forgivable. Being an asshole is not.
You would do well to learn some humility and respect. People here are most likely to be bright and honestly mistaken than they are to be stupid or lying. The appropriate response is not to flame them, but to educate them. Consider at all times the impact of your words.
If you really must rant, post anonymously.
Re:Glory! (Score:3, Insightful)
I haven't tested this scheduler. However, during Con's previous scheduler effort, sound skipped in ZSNES under the mainline kernel and didn't skip under Con's scheduler, on identically loaded machines. That said, idle priority was not really idle-only, since a CPU-burning task running at idle priority could cause similar skipping (despite doing nothing but a simple "while(1);").
The relevant numbers here are the average, maximum and minimum latency, where latency is the time between a task becoming eligible to run (either waking from sleep or having previously exhausted its timeslice) and it actually starting to execute, measured on idle, lightly loaded and heavily loaded machines.
The argument for a purely forward-looking scheduler that implements no heuristics is that the maximum latency is a function of the number of runnable tasks, their priorities, the priority of the current task, and whether the task is waking from sleep or was descheduled for using up its timeslice. This means maximum latency is bounded (and usually low), resulting in execution that feels snappy (low latency) and smooth (no great variation in latency). A heuristic scheduler, on the other hand, can easily end up scheduling a task a lot later than a non-heuristic scheduler would; in other words, while the average latency can be low, the maximum is unbounded (or at least the bound is very high). These seemingly random huge latencies (latency variation, to be exact) are what's perceived as "jerky" behaviour, or so the theory goes, anyway.
I agree that we need actual objective data to base decisions on. Does the kernel currently have the capability to measure these things (the time when a task starts executing, the time when it stops executing and why, and the time when it becomes eligible for execution and why), and if not, could one be added?
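Short of the kernel-side instrumentation asked about here, a crude user-space proxy can at least show the min/avg/max shape of wakeup latency. The sketch below is an assumption-laden approximation: it requests a short sleep and measures how much later than requested the task was actually scheduled again, which lumps timer slack together with scheduling delay.

```python
import time

def sample_wakeup_latency(n=200, sleep_s=0.001):
    """Request n short sleeps; record how much *extra* time elapsed
    beyond each request. The overshoot approximates wakeup latency
    (plus timer granularity, which this method can't separate out)."""
    lat = []
    for _ in range(n):
        t0 = time.monotonic()
        time.sleep(sleep_s)
        lat.append(time.monotonic() - t0 - sleep_s)
    return min(lat), sum(lat) / len(lat), max(lat)

lo, avg, hi = sample_wakeup_latency()
```

Running this on an idle versus a heavily loaded machine is exactly the kind of comparison the post asks for: a "smooth" scheduler should keep hi close to avg, while a jerky one shows occasional large hi outliers.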
Re:great news (Score:3, Insightful)
Linus Torvalds has, for once, made pretty clear arguments against it: various philosophical ones, but also several solid technical ones.
See this email [lkml.org] and this one [lkml.org].
The grandparent's statement that "there are no definitive arguments against having pluggable schedulers" glosses over the fact that Linus' arguments have to be proven wrong. I can believe that in this, like most things, Linus is wrong; however, it's experimental science, not philosophy. Someone has to write the code.
The scheduler is probably the most frequently invoked piece of kernel code that actually does something (it's called many times a second even when only user activity is going on). A level of inefficiency that would be acceptable in an I/O scheduler, which normally has to wait on slow disk access anyway, just can't be accepted in a process scheduler, and even a single level of indirection might really be a killer. Possibly it would be better to have separate kernel builds for small and large installs than to have pluggability, in which case even CK's new scheduler may not prove the need for pluggable schedulers. Alternatively, maybe pluggability would have to be done with self-modifying code that leaves no indirection in place?
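Python can't demonstrate the cost of that indirect call, but it can sketch what "pluggable" structurally means: the hot path reaches pick_next() through a table (a function pointer in C), and that lookup-plus-indirect-call is exactly the overhead being debated. All names below are made up for illustration:

```python
def pick_next_fifo(runqueue):
    """Run tasks strictly in arrival order."""
    return runqueue[0]

def pick_next_priority(runqueue):
    """Run the task with the lowest 'nice' value first."""
    return min(runqueue, key=lambda task: task["nice"])

# The pluggability point: which policy runs is decided by a table,
# not hardcoded at the call site.
SCHEDULERS = {"fifo": pick_next_fifo, "priority": pick_next_priority}

def schedule(policy, runqueue):
    # One table lookup plus one indirect call per scheduling decision;
    # in a kernel hot path even that can be measurable.
    return SCHEDULERS[policy](runqueue)

tasks = [{"pid": 1, "nice": 10}, {"pid": 2, "nice": -5}]
```

The "self-modifying code" alternative mentioned above would, in effect, patch the chosen policy's address directly into the call site at boot so the hot path pays no lookup at all.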
Re:great news (Score:3, Insightful)
Well, he has a point.
So what? If you have a point, but you're being a dick about it, people are far less likely to notice. And given Ingo at least *tried* to be civil, the least Con could do is return the favour, rather than immediately becoming an offensive asshole. For example, he could've responded with:
"Well, recall, the purpose of the scheduler is to enhance desktop performance. Thus, I've designed it to favour low latency over high throughput, and as a result, it's not really surprising that, in throughput-related tests, which I consider more of a server-style workload, BFS performs less well as compared to other schedulers."
No no. He opted for the far more dickish:
"Do you know what a normal desktop PC looks like? No, a more realistic question based on what you chose to benchmark to prove your point would be: Do you know what normal people actually do on them?"
Honestly, WTF? And he's surprised when he attracts hostility? Please.
Con: Step 1 to becoming a decent human being: try not being an asshole. No, really.