Preemptible Kernel Patch Accepted 374
An Anonymous Coward writes: "The preemptible Linux kernel patch that was originally introduced by MontaVista Software and more recently championed by Robert Love has been merged by Linus Torvalds into the main Linux development kernel tree, beginning with version 2.5.4-pre6. This adds a far greater degree of real-time responsiveness to the standard Linux kernel by reducing
scheduling latencies while kernel code is executing. The story at LinuxDevices.com includes comments by Robert Love, and there is also a recent interview with Robert Love about the preemptible kernel here and a whitepaper about the technology by MontaVista here."
Wow. (Score:5, Insightful)
Re:Wow. (Score:3, Interesting)
This will be a good first step in reducing latency and increasing response time in X and other programs.
Re:Wow. (Score:3, Informative)
The even numbered (2.2, 2.4, etc) builds are API stable.
API stable means that a program you wrote for 2.4.0 would run without change on 2.4.99 because the libraries and system APIs are identical.
Now, ideally they'd be stable as in not crashing, but that's never going to happen. New testing releases often have problems.
"But, it's not testing, they released it," you say.
Did you get a buggy kernel from Red Hat, or Debian, or any other distro? Or did you go download it separately and install it? If so, it's not released for end users.
This doesn't mean that those kernels weren't buggy, just that they weren't guaranteed not to be.
Stable is like free, a word with many connotations. In this case it means unchanging, not crash-free. If you want crash-free, simply wait a week and see what people have to say. (You never need a new kernel so badly you can't wait.)
Anyway, the new VM was a big change, but if the original VM wasn't cutting it, I think Linus did the only thing he could. He wanted 2.4 to be usable (if not perfect), so he swapped out a potentially better VM for a simpler VM that would work now. Otherwise 2.4 would still be unusable for many applications, and people who needed it would have to use 2.2.x or wait for 2.6, which is likely quite a ways off.
Don't forget that changing the VM doesn't change the APIs. A program written for Rik's VM works fine with the AA VM and vice versa.
Re:Wow. (Score:2)
For those who don't read at 0, I'll sum up the whole discussion right now.
Person 1: OSS sucks.
Person 2(re: Person 1): Microsoft sucks. OSS is better.
Person 3(re: Person 2): Linux sucks. BSD is better.
So to sum it all up,
Everything sucks!
Re:Wow. (Score:3, Funny)
And what about 2.4? (Score:3, Insightful)
I know that I shouldn't ask this because there have already been enough changes and troubles in 2.4 - but I've got some Karma to burn:
Hasn't this patch been available for 2.4 long enough that it should be stable by now?
Re:And what about 2.4? (Score:4, Informative)
On the other hand, an awful lot of users have been testing and reporting back to lkml, and Robert Love has been pursuing the bugs with the dedication of a first love. I'm sure that scores points with the power(s) that be on LK.
Re:Wow. (Score:2)
Pre-empt (Score:2, Interesting)
Nice work (Score:5, Informative)
[] Smoother video
[] Smoother user interface
[] A seemingly more responsive computer
[] Overall smoothness in operation
(reply to this if you'd like to add to my list)
Congrats to Linus for getting this ready so soon, and to those who helped develop it.
EricKrout.com
Disadvantages (Score:2, Informative)
Love: I'll start with a quick explanation of how the patch works. Right now, the kernel is not preemptible. This means that code running in the kernel runs until completion, which is the source of our latency. Although kernel code is well written and regulated, the net result is that we effectively have no bound on how long we spend in the kernel. Time spent in kernel mode can grow to many hundreds of milliseconds. With some tasks demanding sub-5ms latencies, this non-preemptibility is a problem.
The preemptible kernel patch changes all this. It makes the kernel preemptible, just like userspace. If a higher priority task becomes runnable, the preempt patch will allow it to run. Wherever it is. We can preempt anywhere, subject to SMP (symmetric multi-processing) locking constraints. That is, we use spinlocks as markers for regions of preemptibility. Of course, on UP (uni-processing) they aren't actually spinlocks, just markers.
The improvement to response is clear: a high priority task can run as soon as it needs to. This is a requisite of real-time computing, where you need your RT task to run the moment it becomes runnable. But the same effect applies to normal interactive tasks: as soon as an event occurs (such as the user clicking the mouse) that marks it runnable, it can run (subject to the non-preemptible regions, of course).
There are some counterarguments. The first is that the preemptible kernel lowers throughput since it introduces complexity. Testing has shown, however, that it improves throughput in nearly all situations. My hypothesis is that the same quicker response to events that helps interactivity helps throughput. When I/O data becomes available and a task can be removed from a wait queue and continue doing I/O, the preemptible kernel allows it to happen immediately -- as soon as the interrupt that set need_resched returns, in fact. This means better multitasking.
There are other issues, too. We have to take care of per-CPU variables now. In an SMP kernel, per-CPU variables are "implicitly locked" -- they don't have explicit locks, but since they are unique to each CPU, a task on another CPU can't touch them. Preemption makes this an issue, since a preempted task can trample on the variables without locks.
Overall I think the issues can be addressed and we can have a preemptible kernel as a proper solution to latency in the kernel.
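Love's description of spinlocks doubling as preemption markers can be sketched as a toy model. This is plain Python and purely illustrative, not kernel code; only the name `preempt_count` is borrowed from the actual patch:

```python
# Toy model of the preempt patch's counting scheme: every spinlock
# acquisition bumps a per-task counter, and preemption is only legal
# when the counter is zero (i.e. no critical region is held).

class ToyKernel:
    def __init__(self):
        self.preempt_count = 0   # 0 => preemption allowed
        self.preemptions = 0     # how many times we actually switched

    def spin_lock(self):
        # On UP this is just the marker Love describes: no real lock,
        # only a note that we are inside a critical region.
        self.preempt_count += 1

    def spin_unlock(self):
        self.preempt_count -= 1
        self.maybe_preempt()     # re-check on the way out of the region

    def maybe_preempt(self):
        # A higher-priority task became runnable; switch only if legal.
        if self.preempt_count == 0:
            self.preemptions += 1

k = ToyKernel()
k.spin_lock()
k.maybe_preempt()   # inside a critical region: must NOT preempt
k.spin_unlock()     # region ends: the pending preemption happens here
print(k.preemptions)   # one preemption, deferred to the unlock
```

The key property is visible in the trace: the wakeup that arrives inside the locked region is deferred, not lost, and fires the instant the region ends.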
Re:Disadvantages (Score:3, Informative)
Re:Disadvantages (Score:3, Informative)
It crashes my machine occasionally. Dual p3/700 (so it's SMP -- which complicates matters.) Without the preempt patch, the box stays up for months at a time. With it, it seems to lock up hard after a few days.
So far, at least two crashes happened while burning a CD. I wonder if that's a coincidence ..
Re:Disadvantages (Score:2)
Unfortunately, there's no info to get -- the box is locked solid.
Maybe there was some dump info sent to a vc somewhere -- but with the screen displaying X, I can't see it.
This is my work box, so I can't really have it crashing. It's easier to just go back to the stock 2.4.17 kernel and leave it at that. And the preempt stuff doesn't help it much anyways.
Re:Nice work (Score:5, Informative)
Other archs like alpha use values higher than 100, e.g.:
include/asm-alpha/param.h:# define HZ 1024
include/asm-ia64/param.h:# define HZ 1024
You may try it if you aren't going to go to 2.5.x in the near future, but hell, if you don't mind twisting and breaking the kernel by altering the HZ value, why not go to 2.5.x!
Note: I notice that whenever I talk about changing kernel values the post will be modded redundant. I know a lot of you guys know the kernel inside out, so this might bore you - but come on! I'm sure many others would be interested in it.
Re:Nice work (Score:5, Informative)
First, for those that didn't get it from the parent post, HZ is a system wide timing value. It has nothing directly to do with the mouse.
What it does deal with is how many times a second the system's interrupt timer fires. The problem with increasing the interrupt timer frequency is that you waste more time servicing interrupts than doing real work. It may improve interactive "feel" because the timer interrupt will trigger higher priority tasks to be rescheduled more often, but at the price of higher system time and lower "throughput".
Compared to the preemptible kernel patch, increasing HZ is actually harder on throughput, especially on slower systems. Much work has been done on finding and killing long held locks [zip.com.au] not covered by the preempt patch (thanks to Andrew Morton and RML), an approach which has been shown to be quite effective. Increasing timer interrupt frequency means you're creating more pointless interrupt load, which goes against the approach and advances of the other low-latency patches.
There is an interesting discussion of the HZ value and how it affects Linux in a VM at Linux Weekly News [lwn.net], and for more arcana check out the high resolution timers [sourceforge.net] project.
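To put rough numbers on the HZ tradeoff described above: worst-case timer-driven scheduling delay is one tick (1/HZ seconds), while timer-interrupt overhead grows linearly with HZ. The 5 µs per-tick cost below is an assumed figure for illustration, not a measured one:

```python
# Tick period shrinks as HZ grows (better latency), but the CPU time
# spent servicing timer interrupts grows in direct proportion.
TICK_COST_US = 5  # assumed cost of one timer interrupt, in microseconds

def tick_ms(hz):
    """Timer tick period in milliseconds: the worst-case wait for a
    reschedule that is driven purely by the timer."""
    return 1000.0 / hz

def overhead_pct(hz, cost_us=TICK_COST_US):
    """Percent of CPU burned just fielding timer interrupts."""
    return hz * cost_us / 1e6 * 100

print(tick_ms(100))                         # i386 default: 10 ms ticks
print(tick_ms(1024))                        # alpha/ia64: ~0.98 ms ticks
print(overhead_pct(100), overhead_pct(1024))  # overhead scales with HZ
```

So raising HZ from 100 to 1024 buys a ten-fold latency improvement but roughly ten times the tick overhead, which is exactly the "pointless interrupt load" objection: the preempt patch gets the latency without paying per-tick.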
Regards
Re:Nice work (Score:2)
Yeah, it only took 2 years for it to get integrated. If MS was that lazy, the slash-shit would hit the fan.
I find that funny. It took MS 5 years to hop on the internet bandwagon. Same with a fully 32-bit OS. 5 years. 2 years looks downright speedy -- and good luck getting MS to integrate somebody else's code into theirs. They *still* haven't fixed the bug in Direct Cable Connection which disallows the use of an ECP cable (which is hella fast), though a patch for that has been on the internet for almost 4 years now.
Will this apply to X Windows? (Score:3, Interesting)
Example: running XMMS and pushing play on an MP3, the video display and the sound are not synched. I am running reasonable video and sound cards (a GeForce 256 and an SB Live!) and I expect the video to work on the same scale and rate as the audio, like MS Windows.
BTW, this has been one of the biggest complaints I have had against XFree86 and why I haven't completely made the transition to Linux yet. If this patch does in fact improve the response time of XFree86, then I would be more likely to use it more often than I use XP.
Re:Will this apply to X Windows? (Score:2, Informative)
However, I doubt that it's XFree86's fault, as the port of X-Chat (which was built with GTK) to Windows shows the same menu behavior as its Linux counterpart. On Linux, however, IceWM exhibits no menu delay whatsoever.
Then, of course, you have to take into account if you're running a theme that uses pixmaps. If you're running bubbles-gradient, for example, you're more than likely wasting a horrendous amount of CPU cycles just to highlight a button. Even with fast themes like thinice, the delay is still there.
It's this kind of clunkiness that makes me wonder how people can use themes like this [themes.org]
Re:Will this apply to X Windows? (Score:2, Informative)
"and how X runs non natively" huh? you mean in user space. The graphics in windows 2k/xp run in the kernel which is what you actually mean by all this.
"pushing play on an MP3 the video display and the sound are not synched" This has to do with the sound being buffered in xmms, the video is rendered from samples as they are placed into the sound buffer instead of when you actually hear them. This has nothing to do with anything but xmms.
Re:Will this apply to X Windows? (Score:3, Funny)
Tradeoffs? (Score:4, Interesting)
Re:Tradeoffs? (Score:2, Funny)
Re:Tradeoffs? (Score:3, Informative)
Re:Tradeoffs? (Score:5, Informative)
Net effect - expensive operations can be suspended for user interactiveness. Can this impact performance? Yes. Noticeably? No.
If you're running a big-ass server, it's probably headless anyway - and you won't have any large, interactive processes preempting the kernel for smoothness.
If you're running a workstation, this means that X won't bog down as much when you're running those huge simulations, compiles, etc.
If you're on an embedded device, you can use this to try and get real-time responsiveness. (perhaps not ideal, but, in an embedded situation you have enough control that if you need a better real-time guarantee, you have other options (e.g. rtlinux).)
If you're on a modest, consumer PC - X won't suck as much.
All in all, this is a good idea. In theory, you lose some efficiency making several thousand context switches/second, but that's the price you pay for multi-tasking. Yeah, certain kernel operations may take longer, but you get better responsiveness, which - for most people - is a good thing. Most interactive users are seldom pegging their processor at 100% utilization for any worthwhile period of time. (Games are an exception.)
This is good stuff.
Re:Tradeoffs? (Score:3, Informative)
Re:Tradeoffs? (Score:5, Informative)
But you will have IO-bound processes coming alive faster once their data is available, often improving throughput. There have been benchmarks floating around that indicate that a lot of typical server workloads benefit from this patch too.
It appears that this is generally a good thing. The only downside is the added complexity.
Re:Tradeoffs? (Score:5, Insightful)
The cost doesn't have to be borne by the end user. The cost can be developer time / clues. In the Free Software world, you do get that for free.
Re:Tradeoffs? (Score:2, Troll)
despite appearing mature and clueful, way off mark.
No, despite trying to sound like the wiser one, you are way off the mark. If this was just a patch that unplugged a logjam it would have been applied a very long time ago. No, it took time because there were tradeoffs. Yes, those tradeoffs may not be entirely tangible or even noticeable by the end user, however there *are* tradeoffs.
For more proof, I'll direct you to the large number of clueful responses to my original question.
Re:Tradeoffs? (Score:2)
Did you read them?
Re:Tradeoffs? (Score:2)
Re:Tradeoffs? (Score:3, Troll)
False. There *is* a tradeoff. And you probably want to take an Operating Systems course before spewing "there is no tradeoff with Free Software" nonsense. (BTW, I wanted to ask the same question).
Anyway, here is how it works: a ready, higher-priority process can kick a running, lower-priority process off the CPU before the running process's time slice expires. This does indeed improve responsiveness so that your machine "feels" (*) faster, but in reality it actually runs slower. The cost of a pre-emptible kernel is that it does more process switching than a non-preemptible one (see above: it can (and does) interrupt a process before its time slice is finished). More process switching requires more CPU time; consequently, less CPU time is spent on actually doing work. So yes, the good thing is that it decreases latency (hence better responsiveness). But the bad thing is that it decreases throughput (the amount of work actually done) because of the increased process-switching overhead.
(*) The reason your machine "feels" faster is that the GUI becomes more responsive. But that is pure illusion! Your machine actually does less work. Thus, the pre-emptible kernel patch would probably be useful for workstations, but you definitely don't want to use it on a server.
So the question becomes: what is the throughput/latency tradeoff with the current implementation of the preemptible kernel?
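The overhead half of that question can at least be bounded with simple arithmetic. The switch cost and rate below are assumed ballpark figures, not benchmarks:

```python
# CPU fraction lost to extra preemptions:
#   lost % = (extra switches per second) * (cost per switch) / 1 second
def switch_overhead_pct(switches_per_sec, cost_us):
    return switches_per_sec * cost_us / 1e6 * 100

# A (pessimistic) thousand extra preemptions per second, at an
# assumed 5 microseconds per context switch:
print(switch_overhead_pct(1000, 5))   # about half a percent of the CPU
```

Note this counts only the direct switch cost; cache and TLB pollution from switching are not captured, which is exactly why the real benchmarks people keep citing matter more than the arithmetic.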
Re:Tradeoffs? (Score:3, Insightful)
This is not true in the case of workstations, whose primary purpose is to provide a smooth and fast environment for people to work in, not to crunch numbers.
Neither does your assumption hold for embedded systems - their function is often to provide a fast response to external signals, which they can do much better now. Most embedded systems don't utilize 100% of the processor power either.
It is only in the case of servers with heavy I/O that your reasoning makes sense. But the solution is in the hardware - use bigger blocks of data, and the processor won't be interrupted too much.
Re:Tradeoffs? (Score:5, Insightful)
Not quite true - some Linux patches give unilateral improvement. But I do see your point.
None of the other responses to this thread (that I've noticed) addressed one tradeoff: complexity and bugs. Ever since Linux started to support SMP systems, SMP kernels have been somewhat buggier than UP kernels. This is because there are a lot of potential mistakes to be made - getting and releasing spinlocks and semaphores at the right places is not trivial stuff. Of course bugs have been fixed over the past several years, and SMP is now considered a standard Linux feature (in 2.0 it was "experimental"), but there are still no doubt lots of SMP bugs in some of the more obscure device drivers.
The problem with the preempt patch is that it introduces all these potential bugs into a standard non-SMP kernel, since preempt uses much the same basic mechanism as SMP. Most people only have one CPU, but now these people will be exposed to the same "increased level of risk" as SMP systems.
In a way, that could be considered a benefit - this may serve to flush out some of those last remaining SMP bugs. The SMP code paths will be exercised by a lot of people now.
Re:Tradeoffs? (Score:3, Informative)
If a low-priority task locks a resource, it can still be preempted by a high-priority task... but if the high-priority task also wants that resource, it's going to have to get in line just like everyone else. This is also what leads to the possibility of priority inversion. [ncl.ac.uk]
Nothing changes if you substitute "kernel task" for "task" in the preceding paragraph.
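The inversion scenario can be shown in a few lines. This is a toy three-task model of a strict priority scheduler, not kernel code:

```python
# Classic priority inversion: LOW holds a lock HIGH needs, and MID,
# which needs no lock, outranks LOW -- so HIGH effectively waits on MID.
def next_to_run(tasks, lock_holder):
    # Run the highest-priority task that isn't blocked on the lock.
    runnable = [t for t in tasks
                if not (t["wants_lock"]
                        and lock_holder not in (None, t["name"]))]
    return max(runnable, key=lambda t: t["prio"])["name"]

tasks = [
    {"name": "LOW",  "prio": 1, "wants_lock": True},
    {"name": "MID",  "prio": 2, "wants_lock": False},
    {"name": "HIGH", "prio": 3, "wants_lock": True},
]

# LOW already holds the lock, so HIGH is blocked and MID wins:
print(next_to_run(tasks, lock_holder="LOW"))   # MID runs, not HIGH
# With the lock free, priority works as expected:
print(next_to_run(tasks, lock_holder=None))    # HIGH runs
```

The usual fix, priority inheritance, would temporarily boost LOW to HIGH's priority while it holds the lock, so MID can no longer starve the chain.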
--
I like canned peaches.
Scalability of a pre-emptible kernel? (Score:4, Interesting)
As far as I can tell, a particularly responsive kernel wouldn't scale very well, since there wouldn't be any guarantees as to how much time is being spent on a thread/process by the CPU.
Think of a large, multi-user environment based on Linux. Do you really want any one user to be able to preempt processing in the kernel to the detriment of other users? A more logical answer to this is to have set guarantees as to how much processing time is given to each user. It shouldn't matter if it's one user or 2,000 users; the speed of applications for each user should stay the same as much as possible.
Maybe I'm describing Solaris, or some other operating system like this.
Re:Scalability of a pre-emptible kernel? (Score:4, Insightful)
The fact that this is built into the kernel means that we don't always have to go out and download patches to change this. I would assume that vendors using the Linux kernel would make decisions on how to compile the kernel to suit their environment.
There are lots of people that are frustrated by the current need to go and get patches to change this. Incorporating it into the main kernel should be very positive, IMHO.
Agree, here is the excerpt from the config file (Score:3, Informative)
--- 1.3/arch/i386/Config.help Tue Jan 29 06:32:09 2002
+++ 1.4/arch/i386/Config.help Sat Feb 9 11:11:32 2002
@@ -25,6 +25,16 @@
If you don't know what to do here, say N.
+CONFIG_PREEMPT
+ This option reduces the latency of the kernel when reacting to
+ real-time or interactive events by allowing a low priority process to
+ be preempted even if it is in kernel mode executing a system call.
+ This allows applications to run more reliably even when the system is
+ under load.
+
+ Say Y here if you are building a kernel for a desktop, embedded
+ or real-time system. Say N if you are unsure.
+
CONFIG_X86
This is Linux's home port. Linux was originally native to the Intel
386, and runs on all the later x86 processors including the Intel
Re:Scalability of a pre-emptible kernel? (Score:3, Informative)
*Not technically, but that's the most common usage.
Re:Scalability of a pre-emptible kernel? (Score:5, Insightful)
Actually, I would think it would be the opposite. Being able to preempt within the kernel can protect you against a DoS attack where a process repeatedly makes long-running kernel calls. That would give that process more than its fair share of time, and other processes couldn't respond to interrupts as well. Without a fully preemptible kernel, you can't guarantee how long a process can run, because it is impossible to preempt it while it is in a kernel call.
Re:Scalability of a pre-emptible kernel? (Score:5, Insightful)
The issues you raise are a matter of scheduling policy, not of whether the kernel is preemptible. Furthermore, for most interactive tasks, the correct behavior is to react quickly, because those tasks haven't used up their timeslices, since they blocked waiting for input. In this case, interactive processes give up the CPU to wait for input, and then get their time back as soon as they have a use for it.
Of course, this all also applies to tasks which "interact" with the network or the hard drives, which is any task when you have swap space. Processes which are waiting on input want to run as soon as their input is ready, and don't care about time before that. Processes which are not waiting on input want to run as much as possible, and don't care exactly when. Having the scheduler's instructions followed as closely as possible benefits both kinds.
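The parent's point, that blocked interactive tasks keep their claim and run the instant their input arrives, can be sketched as a minimal priority run queue. This is a toy model in Python, nothing like Linux's actual scheduler:

```python
# Toy run queue: pick the highest-priority runnable task. Waking a
# blocked task makes it instantly eligible, modeling the "react as
# soon as input is ready" behavior described above.
class ToyScheduler:
    def __init__(self):
        self.tasks = {}   # name -> [priority, runnable?]

    def add(self, name, prio, runnable=True):
        self.tasks[name] = [prio, runnable]

    def wake(self, name):
        self.tasks[name][1] = True    # input arrived; eligible again

    def pick(self):
        runnable = [(p, n) for n, (p, r) in self.tasks.items() if r]
        return max(runnable)[1] if runnable else None

s = ToyScheduler()
s.add("number-cruncher", prio=1)
s.add("editor", prio=5, runnable=False)  # blocked waiting for a keypress
print(s.pick())     # number-cruncher runs while the editor sleeps
s.wake("editor")    # keypress arrives
print(s.pick())     # the editor runs immediately, the cruncher waits
```

While the editor is blocked it costs nothing, so the CPU-bound task gets the whole machine; preemption just means the handoff happens at the wakeup rather than at the next scheduling point.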
won't help X too much (Score:2, Interesting)
Anybody who uses X and Windows regularly knows the difference in responsiveness. X Windows does what it was designed for extremely well -- a client/server display system. However, due to the marshalling and de-marshalling of X calls, even if completely local, it will always be less responsive than other methods (winblows).
But I have an idea. Develop a system that implements the exact same interface as X but does no marshalling/demarshalling. Pixels can be written directly to the framebuffer. So you are thinking, "Yeah, but I want to use X apps without recompiling". OK, use library interposition. This also allows you to use a "local" and "global" X library to maintain client/server capabilities. For those who aren't familiar with library interpositioning, it essentially takes advantage of dynamic linking (set LD_LIBRARY_PATH on Unix). If you want to run an X program that directly writes to the framebuffer, then switch your LD_LIBRARY_PATH to a different directory before the program is executed. This could get annoying, but a desktop environment like GNOME could take care of it automatically.
Granted, our existing X server would have to be retrofitted to allow two different types of X libraries to update the same display so that we can run standard client/server X apps alongside the new "directXfree86" (no pun intended) apps.
However library interpositioning can be used to make X programs more responsive without sacrificing client/server capabilities and compatibility with existing applications (except those statically linked of course).
Can you say "Re-inventing the wheel?" (Score:5, Interesting)
Writing pixels directly to a frame-buffer is slow. You lose all of the acceleration features of your video card. Keeping as much of the protocol at a high level as possible is good. The only things that benefit from direct frame-buffer access are programs that do all their own rendering. (Think video decoders.)
Still, if you think about it, the basic gist of your idea is to get rid of the network channel from the communication protocol, and instead have the app talk directly to the X server, say, in shared memory. If so, then how does your idea compare to MITSHM [reptiles.org] and Shared-Memory Transport [precisioninsight.com]? Or the Direct Rendering Interface [sourceforge.net] for that matter? And for 2-D stuff, let's not forget the Direct Graphics Architecture extension [xfree86.org]. Nothing stops GTK, Qt and friends from using any one of these technologies if they'd improve performance and latency.
--JoeRe:Can you say "Re-inventing the wheel?" (Score:5, Informative)
Of course, I think the poster didn't really mean direct framebuffer access, but rather trimming Xlib where possible to not do things that increase latency locally -- which, as many have pointed out, is exactly what XShm does.
Re:Can you say "Re-inventing the wheel?" (Score:3, Informative)
Not that I'm an expert on Video hardware or anything, but that is paraphrased from an explanation given by X-inside for their X server.
Maybe you could call it Direct Rendering Interface (Score:5, Informative)
Man, people piss me off sometimes... I wish people would actually read something about X before bitching about it on
I don't know why people think X is so horrible. X just destroys Windows as a windowing system. The only plus Windows has is that it has better hardware support. Other than that, X blows Windows away.
And this got mod'd up to 4... Sheeesh
Re:Maybe you could call it Direct Rendering Interf (Score:2)
One of the sad, unintended consequences of Linux's popularity is that there is a young generation of geeks out there who think that X-windows is something other than a comedy of errors.
The toolkit, the inefficiencies in communication, the lack of intelligent control at the terminal side, and the list goes on and on...
Re:won't help X too much (Score:3, Informative)
Impossible. The X11 protocol is incompatible with this idea.
This has been tried. See the D11 paper by Kilgard.
The idea is called Direct Rendering and it is not a significant performance win for most graphics ops. The obvious exception is high bandwidth graphics such as OpenGL and streaming video. You'll notice that XFree86 already has direct rendering for OpenGL and streaming video.
Summary: X11 is not the bottleneck on your desktop.
Re:won't help X too much (Score:2)
1) X server event loop should be driven by the monitor retrace, not by mouse/keyboard events
2) Resizing windows should be synchronous, not asynchronous (this causes the "lag" effect when resizing a window on X)
3) Toolkits should be double-buffering everything, using client-side layout code (rather than X window objects), and holding all drawing until input events have been handled.
4) (controversial) - get rid of the window manager; incorporate it into the X server.
Re:won't help X too much (Score:2)
e.g. window resize events are handled asynchronously because back when X was being designed, it was unthinkable for the client and graphics hardware to be able to respond interactively. But today's hardware can handle this with ease - and that's why window resizing feels so much better on MS Windows - because GDI was designed with assumptions that fit modern machines (i.e. fast video hardware, and no network, so synchronous resizing is not a problem).
It's sort of a self-fulfilling prophecy - X thinks of itself as a slow graphics system implemented over a high-latency network, and so that's what it becomes, even though the hardware is capable of so much more...
Re:won't help X too much (Score:2)
1) although network traffic and context switches are not the main problem with X, cutting the window manager out of the loop would probably almost double the speed of handling window move/resize events. (do you *really* want to wake up the WM on every mouse interrupt, when X is perfectly capable of moving windows itself?)
2) it is *hard* to write a good WM. There are lots of little details in event handling/ordering, client communication, etc, that are easy to get wrong or omit. One common codebase, approved by the X developers, could reduce the proliferation of semi-correct WMs.
3) are WMs *really* that different? I've used many of them over the past couple years (sawmill/sawfish, icewm, kwin, enlightenment 16, 4Dwm, the GNUstep one...). Honestly the only reason I've switched WMs is because I saw a neat theme that wasn't available for the one I was using =). Aside from that, I've set them all up to operate almost the same way... So I don't really see the benefit in having 10-20 mediocre WMs rather than 1-2 really good ones... Minor customizations like theming and different window behaviors should be added through small DLL plugins rather than writing a whole new WM...
This is bunk. (Score:2)
If you want more responsiveness, fix your toolkits. This is happening in GTK+ v2. Look at the changelogs and code. If you treat a video card like a framebuffer, you lose out bigtime. If you do everything as a vector op, you save bigtime! This is (one of the) reason(s) why OpenGL is popular -- it's a vector API for 3D graphics.
Re:This is bunk. (Score:2)
It is? Great. I've been developing a GTK+ app [obsession.se] for three and a half years, using GTK+ 1.0 and lately 1.2. Since I'm looking forward to GTK+ 2.0, I recently downloaded a snapshot of the development series (1.3.10) and built it, to try it out. Geeez, was it slow! Now, I don't have any numbers or anything, but based on my experience, the simple list test program I wrote feels 3-5x less responsive than it would be under GTK+ 1.2. Clicking a list item has a noticeable delay before it gets rendered in the selected state. Now, my machine (a K6-233/128) is obviously not a modern day monster, but still. If there are initiatives to make GTK+ 2.0 faster than its predecessors, they sure seem to start by going quite a bit in the opposite direction.
This and O(1) scheduling by Ingo Molnar _rock_ (Score:5, Informative)
Re:This and O(1) scheduling by Ingo Molnar _rock_ (Score:2, Informative)
Re:This and O(1) scheduling by Ingo Molnar _rock_ (Score:3, Funny)
Quake 3 has never been smoother on my machine. 2.4.18-pre7 with Robert Love's Pre-emptible Kernel patch and Ingo's O(1) patch.
Sounds awesome. Quick question: does Ingo's patch make all of userland O(1), or just the kernel? I'm curious.
(1/5 wink)
Re:Ingo? (Score:3, Informative)
O(1) is constant time regardless of input size. O(n) means time grows linearly with input size. There's others but I don't know that much about it...
Re:Ingo? (Score:4, Informative)
Classifying algorithms this way is *extremely* useful for working out what will give good real-world performance.
In general, you want to stick to O(1) or O(log n) to have performance that scales in a reasonably effective way.
Quite a lot of algorithms are O(n), which is OK for small values of n but can get nasty when n becomes large. Inserting a value into a linked list is a typical O(n) algorithm - OK for small lists, bad for long ones as you have to run down the list to find the correct insertion point. (Technically you only have to go halfway down the list on average, which would make it O(0.5n), but by convention and for practicality the notation drops any constant factors.)
A large proportion of processor bottlenecks are due to getting stuck in O(n^2) or O(n log n) tasks. Sorting algorithms tend to fall into this category, which explains why they are often slow and/or processor-hungry.
Higher polynomial orders such as O(n^4) etc. are possible, but generally less common. Sometimes writing a sort algorithm really badly will get you into this territory, however.
Then there are the *really evil* algorithms that behave like O(2^n) or O(n!). These rapidly become intractable as n grows. Good examples would be the exhaustive search of solutions for the travelling salesman problem, or an exhaustive search of the tree of moves for a game like chess. When faced with this kind of problem, you are basically forced to either limit yourself to small values of n or choose an "approximate" algorithm, such as accepting the best solution found after a timeout period.
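The difference is easy to see by counting operations instead of timing them. A small Python illustration:

```python
# Count the comparisons a linear (O(n)) search performs, versus the
# single step a direct O(1) indexed lookup always takes.
def linear_search_steps(items, target):
    steps = 0
    for x in items:
        steps += 1            # one comparison per element visited
        if x == target:
            break
    return steps

for n in (10, 1000, 100000):
    data = list(range(n))
    worst = linear_search_steps(data, n - 1)  # target at the end: n steps
    # O(n) search work grows with n; the O(1) lookup is always one step.
    print(n, worst, 1)
```

This is exactly why the O(1) scheduler matters: the old scheduler scanned every runnable task on each scheduling decision (the linear column), while the new one does a fixed amount of work no matter how many tasks are runnable.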
Re:Ingo? (Score:3, Informative)
Re:This and O(1) scheduling by Ingo Molnar _rock_ (Score:4, Informative)
Here is the bitkeeper log (Score:3, Informative)
It's just 3 hours old.
A very nice way to follow the freshest kernel!
didn't work for me with 2.4.17 (Score:2, Interesting)
I'm looking forward to trying this patch out again when 2.4.18 comes out, and I hope it works better.
-phinn
and it won't (Score:3, Informative)
NVIDIA drivers have to be rebuilt when you build a new kernel. As for PPP, you were probably just missing a driver when you configured.
Recompile the drivers, they will work. (Score:3, Informative)
I don't know what problems others have or have not had, but I've never had a bit of trouble with the preempt patch.
Whatchew talkin' bout? (Score:3, Informative)
Hrm... I am running that exact setup, and due to ISP/CLEC madness, I am also using PPP for connectivity. In fact, I am writing this dialed in with a 2.4.17-preempt kernel. No issues with all of the above plus a GeForce3 with the newest NVidia drivers.
So far, I have to say I am very impressed with the performance. I do notice a difference because I have taken to creating Divx;-) movies, which proves to be a laborious task. I can rip a DVD and preview the
Other arches? (Score:4, Interesting)
I've got a Powermac 7200 I'm playing with YDL on right now...
(Note: I am not a programmer. Should this be something patently obvious to anyone with the most casual knowledge of OS programming, I still don't know. So don't flame me.)
--saint
Re:Other arches? (Score:5, Informative)
I wondered too (I also have a 7200), and found this answer in the changelog [kernel.org]:
Re:Other arches? (Score:2)
Great News! (Score:4, Informative)
I have 2.4.18-pre8 with the O(1) scheduler and it is great. It was easy to untar the 2.4.17 kernel, add the 18-pre8 patch, and then add the O(1) patch. I then copied over my old
Rolling your own kernel branch is actually a lot of fun and I am learning a lot. Anyone can do it, there are dozens of kernel patches that I would love to play with.
When you compile linux you will have a choice to add this preempt capability to Linux or not. You can turn it on or off with the click of a mouse button and a recompile. So, if preempt makes throughput too slow for your application, turn it off.
For the great majority of users this will make the user interface very smooth and rock solid. And anyone who uses their computer to record and play back audio and video will love this patch. Never a glitch or dropped frame again. It'll be awesome.
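The on/off choice described above is an ordinary kernel configuration option. As a rough illustration (the exact option name and menu location are assumptions that can vary between patch versions), a .config excerpt with preemption enabled might look like:

```
#
# Processor type and features
#
CONFIG_PREEMPT=y    # set to n (or leave unset) to build without kernel preemption
```

Flip the option, rebuild, and reboot; nothing else in userspace needs to change.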
Re:Great News! (Score:2)
Unfortunately, 99% of home computer users (the userbase that Linux has been salivating over for a while now) would have lost you at "untar" ;)
It's good news nonetheless (OK, so a bunch of people - notably BSD users - will whine that it's not actually "news" and that we should specify "Linux kernel" rather than just "kernel", but anyway). I've been using the patch myself for a while, and with the added convenience of not having to apply it, I might give the 2.5 series a try, sometime around 2.5.6 or 2.5.7...
Now for the entropy pool. (Score:5, Informative)
http://www.kernel.org/pub/linux/kernel/people/rml
describes what it is all about
Re:Now for the entropy pool. (Score:2, Informative)
Re:Now for the entropy pool. (Score:2)
god I hate when people talk about something in the server room and I cannot decrypt it. Should have used high bits.
[PATCH] To fix compile error on UP systems (Score:5, Informative)
It should be noted that this will lead to a compile error if you enable preemption but disable SMP. To make this build, you need to add this patch:
diff -urN linux-2.5.4-pre6/include/asm-i386/smplock.h linux/include/asm-i386/smplock.h
--- linux-2.5.4-pre6/include/asm-i386/smplock.h Sun Feb 10 15:35:55 2002
+++ linux/include/asm-i386/smplock.h Sun Feb 10 18:15:55 2002
@@ -15,6 +15,7 @@
#else
#ifdef CONFIG_PREEMPT
#define kernel_locked() preempt_get_count()
+#define global_irq_holder 0
#else
#define kernel_locked() 1
#endif
This is good. (Score:2, Informative)
The next thing to have is predictability in kernel space; then we can calculate the exact maximum latency to expect between an important event and the system's response to it... believe it or not. Check with MontaVista for this feature; I am sure they are thinking about it.
Really "Preemptible"? (Score:2)
Re:Really "Preemptible"? (Score:2)
Andrew Morton's patch is better (Score:2)
burris
QNX has far lower latency (Score:2, Offtopic)
QNX stands as a rebuke to those who say a microkernel OS has to be slow.
Re:QNX has far lower latency (Score:2)
Linker slow (Score:2, Insightful)
Redhat 7.2 has a prelinker utility on the CD-ROM, although it is unsupported. I tried it out: installed it and ran the prelinker on all binaries in the default path (which appears to include most libraries and binaries). The improvement was negligible, if present at all.
Any ideas on how this could be improved in the future? I have two ideas for improving linking performance, or at least improving the feel of linking.
1. Memory pages that are linked but not dirty (haven't been written to since) could be marked as part of a link cache. The same program starting up again could just adjust its page table to point to the already-linked page and bump the reference count; the page would then be copy-on-write. These pages would remain usable until the reference count hits zero and the system needs the page for other purposes. This would improve load speed as long as the program had been run before and its pages hadn't been reclaimed. It would be great for multi-user systems like a terminal server. I don't know if this is possible, or has already been done and I'm behind the times.
2. Simple start-up tricks. For instance, the window manager opens a frame where the program is going to start up. The frame would contain a throbber, status bar, etc., and would resize to surround the first window of the application once the program connects to the X server.
I hope these posts aren't too off topic.
Thanks.
Adam
Re:Doesn't everyone else already do this? (Score:5, Informative)
Re:Doesn't everyone else already do this? (Score:2)
As I understand it, the pre-emptible kernel patch allows user processes to pre-empt the kernel itself. Apparently the NT I/O subsystem is pre-emptible.
Re:Doesn't everyone else already do this? (Score:5, Informative)
This is a refinement on preemptive multitasking, which linux had before. Before having a preemptive kernel, the kernel could only preempt the process if it wasn't in a kernel call (okay, there are some kernel calls like writes to disk that it can preempt but most it can't). So, if an interrupt happens while my process is in the middle of a kernel call, the process that handles the interrupt will just have to wait until the call is completed.
With this patch, my process will be preempted for the handling process, allowing it to respond in a very timely fashion. Thus, this is considered to be a prerequisite for real time operating systems.
According to this [google.com], Windows NT does have a preemptive kernel, but I doubt 9x/ME do. I'm not even sure that page is right, since I couldn't find any primary sources, and other pages [garply.com] imply it doesn't (by listing a fully preemptive kernel as a feature under one operating system, but not listing it under Windows NT).
Windows CE definitely has a fully preemptive kernel.
Re:Doesn't everyone else already do this? (Score:2)
This is due to a restrictive locking system (Score:3, Informative)
Re:Doesn't everyone else already do this? (Score:3, Informative)
Cooperative multitasking - each process has to willingly give up the CPU, thus one program can bog down the whole machine. Older MacOS incarnations are like this
Preemptive multitasking - the kernel and high-priority user tasks can preempt userspace tasks, and force them to give up control of the CPU. Linux < 2.5.3 is like this (I believe Win9x and MacOSX are too)
Preemptable kernel - High priority user tasks can preempt the kernel as well as each other. Net result - lower latency I/O, possible reduced throughput due to more CPU overhead. QNX, some other commercial Unices, and WinNT/2k are here
Re:Doesn't everyone else already do this? (Score:2)
FYI, MacOSX also has a preemptible kernel (Mach 3.0)
Technical Overview (Score:5, Informative)
1) Pre-emptive
The operating system can interrupt the currently running process to allow another process to run
2) Co-operative multi-tasking
A task gives control back to the operating system in order to let more programs run.
3) User Mode
On most platforms, an execution state with limited hardware and memory access.
4) Kernel Mode
On most platforms, an execution state with direct access to all system resources including page tables and hardware.
Win3.1 runs entirely in Kernel Mode and uses co-operative multi-tasking.
Win9x runs entirely in Kernel Mode and uses pre-emptive multi-tasking.
WinNT based systems (including Win2k) uses pre-emptive multi-tasking and supports both user mode and kernel mode.
Linux uses pre-emptive multi-tasking and supports both user mode and kernel mode.
Now, a system that has pre-emptive multi-tasking can either only allow pre-emption to occur in user mode, or in both user mode and kernel mode.
Theoretically, nothing should stay in kernel mode for a very long period of time, and what's being done in kernel mode tends to be very important.
So Linus never was very concerned about kernel-mode pre-emption, because it's not terribly useful unless you have a horribly inefficient kernel or you require absolute real-time operation. Instead, he wanted to focus on making the kernel as efficient as possible.
This patch allows one to enable kernel pre-emption, but be forewarned: it will only increase the total time spent in kernel mode (doing the necessary checks), and it will not have a noticeable effect unless you are running very real-time applications. That is why it's disabled by default.
It's a good thing to have for a kernel, but it's not very useful for the average user. That's why it's a configuration option. The big performance increase people are referring to is because of the new scheduler... That's a different thread though.
The fact that WinNT has a pre-emptive kernel is not necessarily a good thing. They are undoubtedly taking a performance hit for it, and since one can't disable it, there is no way to avoid it if one doesn't need it.
I think Linus made a good decision in letting it into the kernel mainline, but I think he also made a good decision in keeping it a configuration option and not enabling it by default.
Re:Technical Overview (Score:2)
According to Robert Love's benchmarks, the pre-emptive kernel actually performs better even for computation-intensive tasks. Having a pre-emptive kernel just gives the scheduler better control over the system. I suspect that because most syscalls are I/O-related, everything comes out better: either a process didn't make many syscalls, in which case it used to get cheated by processes whose timeslices expired during syscalls, or it did I/O and now gets its turn back sooner than it otherwise would.
Re:Technical Overview (Score:2)
It has pre-emptive multi-tasking, but it doesn't feel like a pre-emptive OS because, for compatibility reasons, it also allocates system resources to run DOS and Win3.1-based programs. That is to say, your system is still (partly) running in cooperative mode, and suffers from the pitfalls of a cooperative OS, until ALL 16-bit Windows programs release their system resources.
The worst part is that a total release of system resources by 16-bit programs will never happen, because besides apps there are also 16-bit Windows modules running. Use the WinTop utility, available as part of the Microsoft Kernel Toys, and you may be surprised at just which Windows components rely on 16-bit code! For example, MSGSRV32.EXE, despite the 32 in its name, is a 16-bit module. So is RUNDLL.EXE, in contrast to RUNDLL32.EXE. Another common 16-bit component is MMTASK.EXE, which supports multimedia background tasks.
This is why I told my students not to answer 'Windows 95/98/ME', as the textbook states, when asked to give examples of pre-emptive OSes. The textbook just repeats whatever Microsoft advertises.
Not correct. (Score:2)
The NT kernel is partly pre-emptive because it's built from a very small core that only does scheduling and very low-level stuff. All other kernel processes are tasks on top of that small core, so the scheduling between those parts can be implemented as a pre-emptive scheduler. BECAUSE it's pre-emptive, it's fast. The NT kernel isn't eating more performance than needed; it's faster than comparable kernels. This has been tuned in Win2k and XP. In XP, for example, the locking mechanism that was holding back NT's kernel has been greatly enhanced. Locks are the nail in the pre-emptive-kernel coffin.
Linux has a big kernel; it's not implemented as a small core that does only very low-level stuff and scheduling. It does a lot more.
Re:Great Feature (Score:3)
Re:Star Office 5.2 (Score:3, Funny)
What happened to OpenBSD? (Score:2, Interesting)
People keep saying stuff like "OpenBSD hasn't had a remote exploit in the default installation for 4+ years", or something to that effect, but when I ask them about Linux, they tell me to do such and such, read these ten sites, install these patches, etc., if I want to use it as a server. That doesn't exactly spell "most secure OS" to me. But then again, I'm no Unix expert or security expert - hell, I'm not any kind of expert - but I'm pretty sure that OpenBSD is more secure than Linux, and OpenBSD isn't exactly a proprietary Unix.
Re:Just In Time I Think... (Score:2)
The Linux Counter [li.org] reports that out of 90,670 machines where the purpose has been stated, 55,552 (61%) are marked as "workstation".
32% are WWW servers, 22% are firewalls, and 23% are file servers; one conclusion must be that a lot of people use their workstation as a server.... lots of bias in those numbers. But desktop Linux is relevant to most Linux enthusiasts.
Re:Just In Time I Think... (Score:2)
this is
You can't go round saying something so blatantly incorrect and think people will let it slide by. False statements need questioning. This is a harsh playground but luckily only feelings get hurt.
AIX - security focus lists 263 Advisories for AIX dating back to 1991. That's 2 a month, not exactly ignored!
The latest [securityfocus.com] is a remote root exploit!
You are correct that in risk analysis terms then security through obscurity has minor value but most people round here will quote to you that "security through obscurity is no security at all".
But your point was that Linux was MORE secure because of its market penetration - which would make Windows 9x the most secure OS ever!
Please... next time, think before calling some one a troll.
I thought very carefully and restrained myself from calling you a lot worse (just like code, always throw the first one away).
Next time, think before calling Linux secure!
Re:Just In Time I Think... (Score:2)
The big two?
Er, if it smells like Drano, tastes like Drano, and comes in a tin marked "Drano," guess what? That's not crack you're smoking. It's Drano. Put down the pipe, bucko.
If there's a "big two" hooked up on cable modems (your original premise), it's Win9x and the classic Mac OS. Though Slashdot may give one a skewed perspective on such things, Linux is running neck and neck with the BSDs for bottom of the home market.
--saint
Re:Precompiled? (Score:2, Informative)
no it's a little different (Score:2)
Now, what about tasks that are part of the kernel? Are these run in a pre-emptive multitasking environment? Some OSes do this and some don't. Most OSes schedule tasks within the kernel differently, so the kernel can predict when a kernel task will finish and prevent kernel tasks from stalling overall system performance. What this patch does is give Linux a pre-emptive scheduler for kernel tasks, so they are scheduled differently than before, resulting in better overall responsiveness.
Your gzip-X problem has nothing to do with it: if one program hogs the CPU, another can't fully use it.