Linux Software

Preemptible Kernel Patch Accepted 374

An Anonymous Coward writes: "The preemptible Linux kernel patch that was originally introduced by MontaVista Software and more recently championed by Robert Love has been merged by Linus Torvalds into the main Linux development kernel tree, beginning with version 2.5.4-pre6. This adds a far greater degree of real-time responsiveness to the standard Linux kernel by reducing interrupt latencies while kernel functions are executing. The story at LinuxDevices.com includes comments by Robert Love, and there is also a recent interview with Robert Love about the preemptible kernel here and a whitepaper about the technology by MontaVista here."
This discussion has been archived. No new comments can be posted.

  • Wow. (Score:5, Insightful)

    by 1010011010 ( 53039 ) on Sunday February 10, 2002 @06:59PM (#2983972) Homepage
    I thought that the preempt patch was quite a way from being part of the Linus tree. On the other hand, early in a development kernel is probably the right place to integrate it, so that all those device drivers with problems with the preempt stuff (like NE2000, I think) can get fixed.

    • Re:Wow. (Score:3, Interesting)

      by digitalunity ( 19107 )
      It's really good that it was put in now instead of later. In fact, I really think the new VM should have waited for 2.5 as well. I just couldn't figure out why you'd change such a fundamental piece in the middle of a stable tree! But hey, I don't wear the Linus nametag; not my job.

      This will be a good first step in reducing latency and increasing response time in X and other programs.
      • Re:Wow. (Score:3, Informative)

        by WNight ( 23683 )
        This is actually a problem with the word "stable".

        The even numbered (2.2, 2.4, etc) builds are API stable.

        API stable means that a program you wrote for 2.4.0 would run without change on 2.4.99 because the libraries and system APIs are identical.

        Now, ideally they'd be stable as in not crashing, but that's never going to happen. New testing releases often have problems.

        "But, it's not testing, they released it," you say.

        Did you get a buggy kernel from Red Hat, or Debian, or any other distro? Or did you go download it separately and install it? If so, it's not released for end users.

        This doesn't mean that those kernels weren't buggy, just that they weren't guaranteed not to be.

        Stable is like free, a word with many connotations. In this case it means unchanging, not crash-free. If you want crash-free, simply wait a week and see what people have to say. (You never need a new kernel so badly you can't wait.)

        Anyways, the new VM was a big change, but if the original VM wasn't cutting it, I think Linus did the only thing he could. He wanted 2.4 to be usable (if not perfect) so he swapped out a potentially better VM for a simpler VM that would work now. Otherwise 2.4 would still be unusable for many applications and people who needed it would have to use 2.2.x or wait for 2.6 which is likely quite a ways off.

        Don't forget that changing the VM doesn't change the APIs. A program written for Rik's VM works fine with the AA VM and vice versa.
    • by rseuhs ( 322520 )
      I thought that the preempt patch was quite a way from being part of the Linus tree.

      I know that I shouldn't ask this because there have already been enough changes and troubles in 2.4 - but I've got some Karma to burn:

      Wasn't this patch available on 2.4 long enough that it should be stable by now?

      • by evil_one ( 142582 ) on Sunday February 10, 2002 @10:06PM (#2984585) Homepage
        On one hand, the preempt patch makes heavy use of SMP spinlocks, and the stability of preempt in parts of the kernel that aren't SMP capable (which are few and far between at this point) and on SMP systems is questionable.
        On the other hand, an awful lot of users have been testing and reporting back to lkml, and Robert Love has been pursuing the bugs with the dedication of a first love. I'm sure that scores points with the power(s) that be on LK.
    • Not that surprising, if I correctly recall Linus's comments at ALS. Although Linus mentioned various caveats with the preemptive kernel patch (some of which are improved in more recent versions of the patch), they didn't really add up to "the whole concept is broken". More like "not for 2.4".
  • Pre-empt (Score:2, Interesting)

    by aeil ( 183600 )
    After watching traffic about this almost every day for several months, I can say that I agree with this inclusion, and hopefully some of the low-latency patches will make it in as well.
  • Nice work (Score:5, Informative)

    by ekrout ( 139379 ) on Sunday February 10, 2002 @07:05PM (#2983994) Journal
    But many folks may have no idea what effect preemptability actually has upon a user who uses GNU/Linux. Here's the good news:

    [] Smoother video
    [] Smoother user interface
    [] A seemingly more responsive computer
    [] Overall smoothness in operation
    (reply to this if you'd like to add to my list)

    Congrats to Linus for getting this ready so soon, and to those who helped develop it.

    EricKrout.com :: A Weblog On Crack [erickrout.com]
    • Disadvantages (Score:2, Informative)

      by Metrollica ( 552191 )
      RL: Please summarize the advantages in general, not just for embedded real-time apps, of having the preemptible kernel enhancement included in the kernel. What about any disadvantages?

      Love: I'll start with a quick explanation of how the patch works. Right now, the kernel is not preemptible. This means that code running in the kernel runs until completion, which is the source of our latency. Although kernel code is well written and regulated, the net result is that we effectively have an unbounded limit on how long we spend in the kernel. Time spent in kernel mode can grow to many hundreds of milliseconds. With some tasks demanding sub-5ms latencies, this non-preemptibility is a problem.

      The preemptible kernel patch changes all this. It makes the kernel preemptible, just like userspace. If a higher priority task becomes runnable, the preempt patch will allow it to run. Wherever it is. We can preempt anywhere, subject to SMP (symmetric multi-processing) locking constraints. That is, we use spinlocks as markers for regions of preemptibility. Of course, on UP (uni-processing) they aren't actually spinlocks, just markers.

      The improvement to response is clear: a high priority task can run as soon as it needs to. This is a requisite of real-time computing, where you need your RT task to run the moment it becomes runnable. But the same effect applies to normal interactive tasks: as soon as an event occurs (such as the user clicking the mouse) that marks it runnable, it can run (subject to the non-preemptible regions, of course).

      There are some counterarguments. The first is that the preemptible kernel lowers throughput since it introduces complexity. Testing has shown, however, that it improves throughput in nearly all situations. My hypothesis is that the same quicker response to events that helps interactivity helps throughput. When I/O data becomes available and a task can be removed from a wait queue and continue doing I/O, the preemptible kernel allows it to happen immediately -- as soon as the interrupt that set need_resched returns, in fact. This means better multitasking.

      There are other issues, too. We have to take care of per-CPU variables, now. In an SMP kernel, per-CPU variables are "implicitly locked" -- they don't have explicit locks but since they are unique to each CPU, a task on another CPU can't touch them. Preemption makes it an issue since a preempted task can trample on the variables without locks.

      Overall I think the issues can be addressed and we can have a preemptible kernel as a proper solution to latency in the kernel.
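Editor's sketch: the "spinlocks as markers" idea from the interview above can be modelled in a few lines. This is a toy illustration in Python, not kernel code. The class name `ToyCPU` and the `lock_depth` counter are hypothetical names chosen for the example; only `need_resched` is taken from the interview text. The model assumes that a pending reschedule is deferred while any lock is held and acted on when the last lock is released.

```python
class ToyCPU:
    """Toy model: held "spinlocks" mark regions where preemption is deferred."""

    def __init__(self):
        self.lock_depth = 0        # hypothetical counter of locks currently held
        self.need_resched = False  # set when a higher-priority task becomes runnable
        self.log = []              # records each preemption that actually happens

    def spin_lock(self):
        # Entering a locked region: preemption is not allowed from here on.
        self.lock_depth += 1

    def spin_unlock(self):
        # Leaving a locked region: a deferred preemption may now take place.
        self.lock_depth -= 1
        self._maybe_preempt()

    def wake_higher_priority_task(self):
        # Models an interrupt that makes a higher-priority task runnable.
        self.need_resched = True
        self._maybe_preempt()

    def _maybe_preempt(self):
        # Preempt only if a reschedule is pending and no lock is held.
        if self.need_resched and self.lock_depth == 0:
            self.need_resched = False
            self.log.append("preempted")


cpu = ToyCPU()
cpu.spin_lock()
cpu.wake_higher_priority_task()  # deferred, because a lock is held
deferred = list(cpu.log)         # snapshot taken while still inside the locked region
cpu.spin_unlock()                # the pending preemption happens here
print(deferred, cpu.log)
```

In this model the worst-case delay for the high-priority task is bounded by the longest locked region rather than by the longest kernel call, which is the point Love makes about latency.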
      • Re:Disadvantages (Score:3, Informative)

        by Metrollica ( 552191 )
        Also look here [linuxdevices.com]
      • Re:Disadvantages (Score:3, Informative)

        by dougmc ( 70836 )
        Another disadvantage ...

        It crashes my machine occasionally. Dual p3/700 (so it's SMP -- which complicates matters.) Without the preempt patch, the box stays up for months at a time. With it, it seems to lock up hard after a few days.

        So far, at least two crashes happened while burning a CD. I wonder if that's a coincidence ..

    • Re:Nice work (Score:5, Informative)

      by jsse ( 254124 ) on Sunday February 10, 2002 @10:16PM (#2984609) Homepage Journal
      Aye, it's sure great news for *cough* gamers. In the past, when we wanted a smoother 'mouse', we had to change the HZ value in include/asm/param.h from 100 to a higher value (progressively increase the value and recompile the kernel until it breaks. ^_^)

      Other archs like alpha use values higher than 100 e.g.

      include/asm-alpha/param.h:# define HZ 1024

      include/asm-ia64/param.h:# define HZ 1024

      You may try it if you aren't going to go to 2.5.x in the near future, but hell, if you don't mind twisting and breaking the kernel by altering the HZ value, why not 2.5.x! :D

      Note: I notice that whenever I talk about changing kernel values the post gets modded redundant. I know a lot of you guys know the kernel inside out so this might bore you - but come on! I'm sure many others would be interested in it.
      • Re:Nice work (Score:5, Informative)

        by rhekman ( 231312 ) <hekman AT acm DOT org> on Monday February 11, 2002 @01:36AM (#2985333) Homepage
        Just to avoid confusion... a few notes about this approach.

        First, for those that didn't get it from the parent post, HZ is a system wide timing value. It has nothing directly to do with the mouse.

        What it does deal with is how many times a second the system's interrupt timer fires. The problem with increasing the interrupt timer frequency is that you waste more time servicing interrupts than doing real work. It may improve interactive "feel" because the timer interrupt will trigger higher priority tasks to be rescheduled more often, but at the price of higher system time and lower "throughput".

        Compared to the preemptible kernel patch, increasing HZ is actually harder on throughput, especially on slower systems. Much work has been done on finding and killing long held locks [zip.com.au] not covered by the preempt patch (thanks to Andrew Morton and RML), an approach which has been shown to be quite effective. Increasing timer interrupt frequency means you're creating more pointless interrupt load, which goes against the approach and advances of the other low-latency patches.

        There is an interesting discussion of the HZ value and how it affects Linux in a VM at Linux Weekly News [lwn.net] and for more arcana check out the high resolution timers [sourceforge.net] project.

        Regards
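Editor's sketch: the arithmetic behind the HZ discussion above, in Python. The tick period follows directly from the definition of HZ (ticks per second). The per-interrupt cost of 5 microseconds is a made-up figure used only to show how the overhead scales; real costs depend on the hardware and kernel and are not given in the thread.

```python
def tick_period_ms(hz):
    """Time between timer interrupts, in milliseconds."""
    return 1000.0 / hz


def timer_overhead_fraction(hz, cost_us):
    """Fraction of CPU time spent servicing timer interrupts.

    cost_us is an ASSUMED cost per interrupt in microseconds.
    """
    return hz * cost_us / 1_000_000.0


ASSUMED_COST_US = 5  # hypothetical value, for illustration only

for hz in (100, 1024):
    period = tick_period_ms(hz)
    overhead = timer_overhead_fraction(hz, ASSUMED_COST_US)
    print(f"HZ={hz}: tick every {period:.3f} ms, overhead {overhead:.3%}")
```

Under this assumption the overhead grows linearly with HZ whether or not any task needs to be rescheduled, which is why the comment above calls the extra interrupt load "pointless" compared with preempting only when a higher-priority task is actually runnable.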

  • by BlueJay465 ( 216717 ) on Sunday February 10, 2002 @07:08PM (#2984008)
    I am interested to know if this will make the response time on XFree86 faster. Comparing the way MS-Windows works, where the GUI is running within the kernel, and how X runs non natively, I have seen significant lag between mouse clicks and on-screen response.

    Example. Running XMMS and pushing play on an MP3 the video display and the sound are not synched. I am running a reasonable video card and sound card (Geforce 256 and a SB-Live) and I expect the video to work on the same scale and rate as the audio, like MS-Windows.

    BTW, this has been one of the biggest complaints I have had against XFree86 and why I haven't completely made the transition to Linux yet. If this patch does in fact improve the response time of XFree86, then I would be more likely to use it more often than I use XP.
    • I completely agree. This is one of the biggest annoyances I have with Linux: that there is a perceivable delay between clicking, say, the "File" menu in a GTK-based app and the contents of the File menu showing up.

      However, I doubt that it's XFree86's fault, as the port of X-Chat (which was built with GTK) to Windows shows the same menu behavior as its Linux counterpart. On Linux, however, IceWM exhibits no menu delay whatsoever.

      Then, of course, you have to take into account if you're running a theme that uses pixmaps. If you're running bubbles-gradient, for example, you're more than likely wasting a horrendous amount of CPU cycles just to highlight a button. Even with fast themes like thinice, the delay is still there.

      It's this kind of clunkiness that makes me wonder how people can use themes like this [themes.org]
    • "where the GUI is running within the kernel" The GUI is Explorer and it doesn't run in the kernel.

      "and how X runs non natively" Huh? You mean in user space. The graphics in Windows 2k/XP run in the kernel, which is what you actually mean by all this.

      "pushing play on an MP3 the video display and the sound are not synched" This has to do with the sound being buffered in XMMS: the video is rendered from samples as they are placed into the sound buffer instead of when you actually hear them. This has nothing to do with anything but XMMS.
  • Tradeoffs? (Score:4, Interesting)

    by chuckw ( 15728 ) on Sunday February 10, 2002 @07:12PM (#2984024) Homepage Journal
    You don't get anything for free. What is the tradeoff that occurs when you integrate this patch?
    • Just so we're sure. Is that "free as in beer" or "free as in free speech"?
    • Re:Tradeoffs? (Score:3, Informative)

      by keeg ( 541057 )
      The average throughput drops. In other words, it's not something you use on a server, but it's very useful for embedded devices, where latency is important. It's also very nice for desktops, been using it for ~2 months now. YMMV, but my desktop is a _lot_ smoother.
    • Re:Tradeoffs? (Score:5, Informative)

      by megabeck42 ( 45659 ) on Sunday February 10, 2002 @07:27PM (#2984085)
      With this patch the kernel becomes preemptible - meaning, other kernel tasks can stop the current one from executing, execute, finish, and allow the stopped tasks to finish.

      Net effect - expensive operations can be suspended for user interactiveness. Can this impact performance, Yes. Noticeably? No.

      If you're running a big-ass server, it's probably head-less, anyways - and you won't have any large, interactive processes preempting the kernel for smoothness.

      If you're running a workstation, this means that X won't bog down as much when you're running those huge simulations, compiles, etc.

      If you're on an embedded device, you can use this to try and get real-time responsiveness. (perhaps not ideal, but, in an embedded situation you have enough control that if you need a better real-time guarantee, you have other options (e.g. rtlinux).)

      If you're on a modest, consumer PC - X won't suck as much.

      All in all, this is a good idea. In theory, you lose some efficiency making several thousand context switches/second, but that's the price you pay for multi-tasking. Yeah, certain kernel operations may take longer, but you get better responsiveness, which, for most people, is a good thing. Most interactive individuals are seldom pegging their processor at 100% utilization for any worthwhile period of time. (Games are an exception.)

      This is good stuff.
      • Re:Tradeoffs? (Score:3, Informative)

        by B1ood ( 89212 )
        This is an option in the kernel. If you aren't compiling a kernel for a desktop box, chances are you won't want to enable this in the first place. Therefore your net loss is zero.
      • Re:Tradeoffs? (Score:5, Informative)

        by Ami Ganguli ( 921 ) on Monday February 11, 2002 @02:31AM (#2985486) Homepage
        If you're running a big-ass server, it's probably head-less, anyways - and you won't have any large, interactive processes preempting the kernel for smoothness.

        But you will have IO-bound processes coming alive faster once their data is available, often improving throughput. There have been benchmarks floating around that indicate that a lot of typical server workloads benefit from this patch too.

        It appears that this is generally a good thing. The only downside is the added complexity.

    • Re:Tradeoffs? (Score:5, Insightful)

      by sydb ( 176695 ) <[michael] [at] [wd21.co.uk]> on Sunday February 10, 2002 @07:30PM (#2984099)
      Aside from the technicalities of this particular patch, your assumption that you're going to lose something when you apply a beneficial patch because 'you don't get anything for free' is, despite appearing mature and clueful, way off mark.

      The cost doesn't have to be borne by the end user. The cost can be developer time / clues. In the Free Software world, you do get that for free.
      • by chuckw ( 15728 )


        despite appearing mature and clueful, way off mark.


        No, despite trying to sound like the wiser one, you are way off the mark. If this was just a patch that unplugged a logjam it would have been applied a very long time ago. No, it took time because there were tradeoffs. Yes, those tradeoffs may not be entirely tangible or even noticeable by the end user, however there *are* tradeoffs.

        For more proof, I'll direct you to the large number of clueful responses to my original question.

      • by RelliK ( 4466 )
        Aside from the technicalities of this particular patch, your assumption that you're going to lose something when you apply a beneficial patch because 'you don't get anything for free' is, despite appearing mature and clueful, way off mark.

        False. There *is* a tradeoff. And you probably want to take an Operating Systems course before spewing "there is no tradeoff with Free Software" nonsense. (BTW, I wanted to ask the same question).

        Anyway, here is how it works: a ready, higher-priority process can kick off a running, lower priority process before the running process's time slice expires. This does indeed improve responsiveness so that your machine "feels" (*)faster, but in reality it actually runs slower. The cost of a pre-emptible kernel is that it does more process switching than a non-preemptible one (see above, it can (and does) interrupt a process before its time slice is finished). More process switching requires more CPU time; consequently, less CPU time is spent on actually doing work. So yes, the good thing is that it decreases latency (hence better responsiveness). But the bad thing is that it decreases throughput (the amount of work actually done) because of the increased process switching overhead.

        (*) The reason your machine "feels" faster is that the GUI becomes more responsive. But that is pure illusion! Your machine actually does less work. Thus, the pre-emptible kernel patch would probably be useful for workstations, but you definitely don't want to use it on a server.

        So the question becomes: what is the throughput/latency tradeoff with the current implementation of the preemptible kernel?
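Editor's sketch: the latency/throughput argument in the comment above, reduced to a two-task toy model in Python. The function name `run` and all the numbers are hypothetical; time is in arbitrary units. The model only counts context-switch cost and deliberately ignores cache effects and the I/O wake-up benefit that Robert Love describes elsewhere in this thread, so it shows the shape of the tradeoff rather than measuring it.

```python
def run(low_work, high_work, wake_at, switch_cost, preemptible):
    """Return (latency of the high-priority task, total elapsed time).

    A low-priority task starts at time 0 and needs low_work units of CPU.
    A high-priority task becomes runnable at wake_at and needs high_work units.
    Every context switch costs switch_cost units (an assumed constant).
    """
    if preemptible:
        # Switch to the high-priority task as soon as it wakes, then switch back.
        start_high = wake_at + switch_cost
        total = low_work + high_work + 2 * switch_cost
    else:
        # The high-priority task waits until the low-priority task has finished.
        start_high = low_work + switch_cost
        total = low_work + high_work + switch_cost
    return start_high - wake_at, total


non_preempt = run(low_work=100, high_work=5, wake_at=10, switch_cost=1, preemptible=False)
preempt = run(low_work=100, high_work=5, wake_at=10, switch_cost=1, preemptible=True)
print("non-preemptible (latency, total):", non_preempt)
print("preemptible     (latency, total):", preempt)
```

With these made-up numbers the preemptible case pays for one extra context switch in total elapsed time, while the high-priority task's wait drops from most of the low-priority task's run to a single switch. Whether that price is worth paying depends on the workload, which is what the replies below argue about.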

        • Re:Tradeoffs? (Score:3, Insightful)

          by BlowCat ( 216402 )
          In fact, you just demonstrated that the processor does more work with the preemptible kernel. Perhaps you were assuming that all 100% of CPU time is utilized and that faster GUI is not really useful, so the processor does less "useful work".

          This is not true in case of workstations, whose primary purpose is to provide smooth and fast environment for people to work, not to crunch numbers.

          Neither does your assumption hold for embedded systems - their function is often to provide fast response to external signals, which they can do much better now. Most embedded systems don't utilize 100% of the processor power either.

          It is only in the case of servers with heavy I/O that your reasoning makes sense. But the solution is in the hardware - use bigger blocks of data, and the processor won't be interrupted too much.

    • Re:Tradeoffs? (Score:5, Insightful)

      by psamuels ( 64397 ) on Sunday February 10, 2002 @11:32PM (#2984821) Homepage
      You don't get anything for free.

      Not quite true - some Linux patches give unilateral improvement. But I do see your point.

      What is the tradeoff that occurs when you integrate this patch?

      None of the other responses to this thread (that I've noticed) addressed one tradeoff: complexity and bugs. Ever since Linux started to support SMP systems, SMP kernels have been somewhat buggier than UP kernels. This is because there are a lot of potential mistakes to be made - getting and releasing spinlocks and semaphores at the right places is not trivial stuff. Of course bugs have been fixed over the past several years, and SMP is now considered a standard Linux feature (in 2.0 it was "experimental"), but there are still no doubt lots of SMP bugs in some of the more obscure device drivers.

      The problem with the preempt patch is that it introduces all these potential bugs into a standard non-SMP kernel, since preempt uses much the same basic mechanism as SMP. Most people only have one CPU, but now these people will be exposed to the same "increased level of risk" as SMP systems.

      In a way, that could be considered a benefit - this may serve to flush out some of those last remaining SMP bugs. The SMP code paths will be exercised by a lot of people now.

  • by Maddog_Delphi97 ( 173780 ) on Sunday February 10, 2002 @07:14PM (#2984028)
    What effect would a pre-emptible kernel have on the scalability of Linux?

    As far as I can tell, a particularly responsive kernel wouldn't scale very well, since there wouldn't be any guarantees as to how much "time" is being spent on a thread/process by the CPU.

    Think of a large, multi-user environment based on Linux. Do you really want any user to pre-empt the processing in the kernel by CPU to the detriment of other users? A more logical answer to this is to have set guarantees as to how much processing time is given to each user. It shouldn't matter if it's one user or 2,000 users, the speed of applications for each user should stay the same as much as possible.

    Maybe I'm describing Solaris, or some other operating system like this.
    • by restive ( 542491 ) on Sunday February 10, 2002 @07:20PM (#2984058)
      If you don't like it, treat it the same way as SMP...turn it off!

      The fact that this is built into the kernel means that we don't always have to go out and download patches to change this. I would assume that vendors using the Linux kernel would make decisions on how to compile the kernel to suit their environment.

      There are lots of people that are frustrated by the current need to go and get patches to change this. Incorporating it into the main kernel should be very positive, IMHO.
      • Taken from the Bitkeeper diff

        --- 1.3/arch/i386/Config.help Tue Jan 29 06:32:09 2002
        +++ 1.4/arch/i386/Config.help Sat Feb 9 11:11:32 2002
        @@ -25,6 +25,16 @@

        If you don't know what to do here, say N.

        +CONFIG_PREEMPT
        + This option reduces the latency of the kernel when reacting to
        + real-time or interactive events by allowing a low priority process to
        + be preempted even if it is in kernel mode executing a system call.
        + This allows applications to run more reliably even when the system is
        + under load.
        +
        + Say Y here if you are building a kernel for a desktop, embedded
        + or real-time system. Say N if you are unsure.
        +
        CONFIG_X86
        This is Linux's home port. Linux was originally native to the Intel
        386, and runs on all the later x86 processors including the Intel
    • It shouldn't have ANY effect on scalability. First, scalability really refers to how well the kernel handles multiple processors*, which isn't what you're talking about. Second, a process doesn't preempt another process unless it has a higher priority. As long as each of the 2000 users' apps have the same priority level, they'll all get the same response times. The only time preemptibility comes into effect is when the priorities are different. In that case, a higher priority process can preempt a lower priority one, even if the lower priority process is running in the kernel, just like it should be. Big name UNIXes like Solaris are fully preemptible, and there is little question about how well they scale to thousands of users.

      *Not technically, but that's the most common usage.
    • by Fjord ( 99230 ) on Sunday February 10, 2002 @08:18PM (#2984266) Homepage Journal
      "Do you really want any user to pre-empt the processing in the kernel by CPU to the detriment of other users? A more logical answer to this is to have set guarantees as to how much processing time is given to each user."

      Actually, I would think it would be the opposite. Being able to preempt within the kernel can protect you against a DoS attack where a process repeatedly makes long running kernel calls. That would give that process more than its fair share of time, and other processes couldn't respond to interrupts as well. Without a fully preemptible kernel, you can't guarantee how long a process can run, because it is impossible to preempt it while it is in a kernel call.
    • by iabervon ( 1971 ) on Sunday February 10, 2002 @09:10PM (#2984465) Homepage Journal
      Consider, however, the case where the reason a task is preempted in the kernel is that it has used all of its timeslice. Without the preemptable kernel patch, the task cannot be interrupted to schedule another task. In order to make guarantees about how much time will be given to each process, it needs to be possible to stop a process when its time is up essentially no matter what the process is doing.

      The issues you raise are a matter of scheduling policy, not of whether the kernel is preemptible. Furthermore, for most interactive tasks, the correct behavior is to react quickly, because those tasks haven't used up their timeslices, since they blocked waiting for input. In this case, interactive processes give up the CPU to wait for input, and then get their time back as soon as they have a use for it.

      Of course, this all also applies to tasks which "interact" with the network or the hard drives, which is any task when you have swap space. Processes which are waiting on input want to run as soon as their input is ready, and don't care about time before that. Processes which are not waiting on input want to run as much as possible, and don't care exactly when. Having the scheduler's instructions followed as closely as possible benefits both kinds.
  • by mrm677 ( 456727 )
    I know this is somewhat offtopic, however to make Linux more responsive, we need to improve X somehow. I am not saying that X sucks...I think it is a fantastic system.

    Anybody who uses X and Windows regularly knows the difference in responsiveness. X Windows does what it was designed for extremely well -- a client/server display system. However, due to the marshalling and de-marshalling of X calls, even if completely local, it will always be less responsive than other methods (winblows).

    But I have an idea. Develop a system that implements the exact same interface as X but does no marshalling/demarshalling. Pixels can be written directly to the framebuffer. So you are thinking, "Yeah, but I want to use X apps without recompiling". Ok, use library interposition. This also allows you to use a "local" and "global" X library to maintain client/server capabilities. For those who aren't familiar with library interpositioning, it essentially takes advantage of dynamic linking (set LD_LIBRARY_PATH on Unix). If you want to run an X program that directly writes to the framebuffer, then switch your LD_LIBRARY_PATH to a different directory before the program is executed. This could get annoying, but a Window Manager like Gnome could take care of this automatically.

    Granted, our existing X server would have to be retrofitted to allow 2 different types of X libraries to update the same display so that we can run standard client/server X apps with the new "directXfree86" (no pun intended) apps.

    However library interpositioning can be used to make X programs more responsive without sacrificing client/server capabilities and compatibility with existing applications (except those statically linked of course).
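Editor's sketch: the LD_LIBRARY_PATH switch described in the comment above, expressed as a small Python launcher helper. The function name `env_with_library_path` and the directory `/opt/direct-x11/lib` are hypothetical; no such replacement X library is known to exist at that path. The only real mechanism relied on is that the dynamic linker searches the directories listed in LD_LIBRARY_PATH before the default locations.

```python
import os


def env_with_library_path(lib_dir, base_env=None):
    """Return a copy of the environment with lib_dir searched first by the dynamic linker."""
    env = dict(os.environ if base_env is None else base_env)
    old = env.get("LD_LIBRARY_PATH")
    # Prepend so the interposed library shadows the system one of the same name.
    env["LD_LIBRARY_PATH"] = lib_dir if not old else lib_dir + os.pathsep + old
    return env


# Hypothetical directory holding an alternative libX11 build.
env = env_with_library_path("/opt/direct-x11/lib", {"LD_LIBRARY_PATH": "/usr/lib"})
print(env["LD_LIBRARY_PATH"])

# A launcher could then start an unmodified, dynamically linked X client with
# subprocess.run(["some-x-client"], env=env); the client name is a placeholder.
```

As the comment notes, this only works for dynamically linked programs, since a statically linked binary never consults the library search path.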
    • by Mr Z ( 6791 ) on Sunday February 10, 2002 @07:45PM (#2984159) Homepage Journal

      Writing pixels directly to a frame-buffer is slow. You lose all of the acceleration features of your video card. Keeping as much of the protocol at a high level as possible is good. The only things that benefit from direct frame-buffer access are programs that do all their own rendering. (Think video decoders.)

      Still, if you think about it, the basic gist of your idea is to get rid of the network channel from the communication protocol, and instead have the app talk directly to the X server, say, in shared memory. If so, then how does your idea compare to MITSHM [reptiles.org] and Shared-Memory Transport [precisioninsight.com]? Or the Direct Rendering Interface [sourceforge.net] for that matter? And for 2-D stuff, let's not forget the Direct Graphics Architecture extension [xfree86.org]. Nothing stops GTK, Qt and friends from using any one of these technologies if they'd improve performance and latency.

      --Joe
      • by Junta ( 36770 ) on Sunday February 10, 2002 @08:13PM (#2984244)
        Pretty good response, though I would note that even for video decoders writing to a raw framebuffer isn't desired... Writing directly to an allocated overlay in a colorspace natural to the decoding is better (that way, X provides a surface to write to that takes care of both colorspace conversion and scaling in hardware, two *Very* expensive video rendering tasks.). There are very few applications in which direct, unmediated framebuffer access is that beneficial... For example some apps support all sorts of targets from standard Xlib, to XShm, to DGA, to GL. The DGA is probably the closest to direct access, and, no surprise, it isn't that impressive....
        Of course, I think the poster didn't really mean direct framebuffer access, but rather trimming Xlib where possible to not do things that increase latency locally, which, as many have pointed out, is the very thing XShm does..
      • The main slowdown for X doesn't come from Marshalling data. The marshallers are quite fast, and run at CPU speed. The latency mostly grows in the interaction between the X server and the display hardware. Since the X server is running in user mode it doesn't have access to Hardware interrupts. As a result the server ends up polling the hardware registers until the operation is complete, which wastes huge amounts of time. Still a lot faster than blitting video memory in the CPU, but not nearly as fast as Windows can manage.

        Not that I'm an expert on Video hardware or anything, but that is paraphrased from an explanation given by X-inside for their X server.

    • Oh wait, that name's already taken as it's been a part of XFree86 by default since the 4.0 release!

      Man, people piss me off sometimes... I wish people would actually read something about X before bitching about it on /.

      I don't know why people think X is so horrible. X just destroys Windows as a windowing system. The only plus Windows has is that it has better hardware support. Other than that, X blows Windows away.

      And this got mod'd up to 4... Sheeesh
      • X just destroys Windows as a windowing system.

        One of the sad, unintended consequences of Linux's popularity is that there is a young generation of geeks out there who think that X-windows is something other than a comedy of errors.

        The toolkit, the inefficiencies in communication, the lack of intelligent control at the terminal side, and the list goes on and on...

    • by nathanh ( 1214 )
      But I have an idea. Develop a system that implements the exact same interface as X but does no marshalling/demarshalling.

      Impossible. The X11 protocol is incompatible with this idea.

      Pixels can be written directly to the framebuffer. So you are thinking, "Yeah, but I want to use X apps without recompiling". Ok, use library interposition. This also allows you to use a "local" and "global" X library to maintain client/server capabilities. For those who aren't familiar with library interposition, it essentially takes advantage of dynamic linking (set LD_LIBRARY_PATH on Unix). If you want to run an X program that writes directly to the framebuffer, then switch your LD_LIBRARY_PATH to a different directory before the program is executed. This could get annoying, but a window manager like Gnome could take care of this automatically.

      This has been tried. See the D11 paper by Kilgard.

      http://citeseer.nj.nec.com/125132.html

      The idea is called Direct Rendering and it is not a significant performance win for most graphics ops. The obvious exception is high bandwidth graphics such as OpenGL and streaming video. You'll notice that XFree86 already has direct rendering for OpenGL and streaming video.

      Summary: X11 is not the bottleneck on your desktop.
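
      For anyone who hasn't seen the interposition trick quoted above in practice, here is a minimal sketch in Python. The helper name `interposed_env`, the directory, and the program name are all made up for illustration; the only real mechanism is that the dynamic linker searches LD_LIBRARY_PATH first.

```python
import os

def interposed_env(lib_dir, base_env=None):
    """Return a copy of the environment in which lib_dir is searched first
    by the dynamic linker (hypothetical helper, for illustration only)."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_LIBRARY_PATH", "")
    # Prepend so a replacement library in lib_dir shadows the system copy.
    env["LD_LIBRARY_PATH"] = lib_dir if not existing else lib_dir + os.pathsep + existing
    return env

# Hypothetical usage; the path and program are placeholders:
# import subprocess
# subprocess.run(["some-x-app"], env=interposed_env("/opt/local-xlib/lib"))
```

      The program itself is untouched; only the environment it starts in changes, which is why no recompile is needed.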

    • I've looked at this extensively with kernel tracing tools - the X wire protocol is not the bottleneck. The problems are:

      1) X server event loop should be driven by the monitor retrace, not by mouse/keyboard events

      2) Resizing windows should be synchronous, not asynchronous (this causes the "lag" effect when resizing a window on X)

      3) Toolkits should be double-buffering everything, using client-side layout code (rather than X window objects), and holding all drawing until input events have been handled.

      4) (controversial) - get rid of the window manager; incorporate it into the X server.
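
      To make point 3 concrete, here is a rough Python sketch of "hold all drawing until input events have been handled". It is not taken from any real toolkit; `run_frame`, `handle_event` and `redraw` are hypothetical names.

```python
from collections import deque

def run_frame(pending_events, handle_event, redraw):
    """Handle every queued input event, then redraw at most once.

    handle_event(event) should return True if the event changed what
    needs to be on screen. Returns True if a redraw happened.
    """
    dirty = False
    while pending_events:
        event = pending_events.popleft()
        dirty = bool(handle_event(event)) or dirty
    if dirty:
        redraw()  # one repaint for the whole batch, not one per event
    return dirty
```

      With three resize events queued, this paints once instead of three times, which is the whole point: the expensive drawing happens after the input queue is empty.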
      • Just for completeness, I should add that I believe most of X's problems stem from limitations in the hardware it was originally intended to run on (i.e. achingly slow framebuffers on a slow network).

        e.g. window resize events are handled asynchronously because back when X was being designed, it was unthinkable for the client and graphics hardware to be able to respond interactively. But today's hardware can handle this with ease - and that's why window resizing feels so much better on MS Windows - because GDI was designed with assumptions that fit modern machines (i.e. fast video hardware, and no network, so synchronous resizing is not a problem).

        It's sort of a self-fulfilling prophecy - X thinks of itself as a slow graphics system implemented over a high-latency network, and so that's what it becomes, even though the hardware is capable of so much more...
    • As others have pointed out, DRI, DGA, etc. all exist. Another thing to point out is the performance of Windows in VMWare. It feels responsive. Why? Simply because the VMWare video driver is smart. It knows how to take Win32 calls, boil them down into vectors, and send them off to the X11 video driver very quickly. This is why DGA fullscreen Win98 is as fast on my machine as it is native for video updates (but I've not run Windows natively on my workstation for over a year).

      If you want more responsiveness, fix your toolkits. This is happening in GTK+ v2. Look at the changelogs and code. If you treat a video card like a framebuffer, you lose out bigtime. If you do everything as a vector op, you save bigtime! This is (one of the) reason(s) why OpenGL is popular -- it's a vector API for 3D graphics.
      • If you want more responsiveness, fix your toolkits. This is happening in GTK+ v2.
        It is? Great. I've been developing a GTK+ app [obsession.se] for three and a half years, using GTK+ 1.0 and lately 1.2. Since I'm looking forward to GTK+ 2.0, I recently downloaded a snapshot of the development series (1.3.10) and built it, to try it out. Geeez, was it slow! Now, I don't have any numbers or anything, but based on my experience, the simple list test program I wrote feels 3-5x less responsive than it would be under GTK+ 1.2. Clicking a list item has a noticeable delay before it gets rendered in the selected state. Now, my machine (a K6-233/128) is obviously not a modern day monster, but still. If there are initiatives to make GTK+ 2.0 faster than its predecessors, they sure seem to start by going quite a bit in the opposite direction.
  • by sydb ( 176695 ) <[michael] [at] [wd21.co.uk]> on Sunday February 10, 2002 @07:34PM (#2984113)
    Quake 3 has never been smoother on my machine. 2.4.18-pre7 with Robert Love's Pre-emptible Kernel patch and Ingo's O(1) patch. Get it.
    • I'm not sure of the primary source for the O(1) patch, but Red Hat has a download site for it [redhat.com]. It would be helpful if when people recommend patches and such, they would provide a link. ;)
    • Quake 3 has never been smoother on my machine. 2.4.18-pre7 with Robert Love's Pre-emptible Kernel patch and Ingo's O(1) patch.

      Sounds awesome. Quick question: does Ingo's patch make all of userland O(1), or just the kernel? I'm curious.

      (1/5 wink)

  • by Khalid ( 31037 ) on Sunday February 10, 2002 @07:35PM (#2984118) Homepage
    http://linux.bkbits.net:8088/linux-2.5/ChangeSet@-1d?nav=index.html

    It's just 3 hours old :)

    A very nice way to follow the freshest kernel!
  • I compiled this into 2.4.17 with the preempt-kernel-rml-2.4.17-1 patch. When I booted I got PPP module errors, and when I tried to install the NVIDIA (2314/2313) drivers it gave me more errors. So I went back and disabled it...

    I'm looking forward to trying this patch out again when 2.4.18 comes out, and I hope it works better.

    -phinn
    • and it won't (Score:3, Informative)

      by Ranger Rick ( 197 )
      I expect it won't be any better.

      NVIDIA drivers have to be rebuilt when you build a new kernel. As for PPP, you were probably just missing a driver when you configured.
    • The GLX portion of the NVIDIA drivers doesn't seem to care what kernel revision you're running on. The kernel module portion does, however. I've been running the preempt patch for some time now with several revisions of NVIDIA's drivers. Just get the SRPMS and recompile them. Or get the TGZ versions if you're running a non-RPM distribution (Slackware, Debian, etc.).

      I don't know what problems others have or have not had, but I've never had a bit of trouble with the preempt patch.

    • by clump ( 60191 )
      I compiled this into 2.4.17 with the preempt-kernel-rml-2.4.17-1 patch. When I booted I got PPP module errors, and when I tried to install the NVIDIA (2314/2313) drivers it gave me more errors. So I went back and disabled it...

      Hrm... I am running that exact setup, and due to ISP/CLEC madness, I am also using PPP for connectivity. In fact, I am writing this dialed in with a 2.4.17-preempt kernel. No issues with all of the above plus a GeForce3 with the newest NVidia drivers.

      So far, I have to say I am very impressed with the performance. I do notice a difference because I have taken to creating Divx;-) movies, which proves to be a laborious task. I can rip a DVD and preview the .avi being created with no apparent latency with the preempt patch. Without the patch, previewing the avi is not at all realistic. Hats off to RML and Linus.
  • Other arches? (Score:4, Interesting)

    by saintlupus ( 227599 ) on Sunday February 10, 2002 @07:40PM (#2984134)
    Has anyone tried this patch on non-x86 hardware?

    I've got a Powermac 7200 I'm playing with YDL on right now...

    (Note: I am not a programmer. Should this be something patently obvious to anyone with the most casual knowledge of OS programming, I still don't know. So don't flame me.)

    --saint
    • Re:Other arches? (Score:5, Informative)

      by hysterion ( 231229 ) on Sunday February 10, 2002 @10:29PM (#2984646) Homepage

      I wondered too (I also have a 7200), and found this answer in the changelog [kernel.org]:

      <rml@tech9.net>:
      [PATCH] Re: [PATCH] Preemptible Kernel for 2.5

      On Sat, 2002-02-09...<snip>

      Again, this is a minimal i386-only patch. I have other arches, documentation, etc. Patch against 2.5.4-pre5. Enjoy,

      Robert Love

  • Great News! (Score:4, Informative)

    by buckrogers ( 136562 ) on Sunday February 10, 2002 @07:42PM (#2984143) Homepage
    This has obvious applications for home and even professional audio and video.

    I have the 2.4.18-pre8 with the O(1) scheduler and it is great. It was easy to untar the 2.4.17 kernel, add the 18-pre8 patch, and then add the O(1) patch. I then copied over my old .config and .config.old to the current linux tree, ran make oldconfig, re-enabled the SMP options, then compiled and installed the kernel. No pain. The next patch I'm going to try is the real 2.4.18 with both the O(1) sched and preempt patches.

    Rolling your own kernel branch is actually a lot of fun and I am learning a lot. Anyone can do it, there are dozens of kernel patches that I would love to play with.

    When you compile linux you will have a choice to add this preempt capability to Linux or not. You can turn it on or off with the click of a mouse button and a recompile. So, if preempt makes throughput too slow for your application, turn it off.

    For the great majority of users this will make the user interface very smooth and rock solid. And anyone who uses their computer to record and play back audio and video will love this patch. Never a glitch or dropped frame again. It'll be awesome. :)
    • Anyone can do it

      Unfortunately, 99% of home computer users (the userbase that Linux has been salivating over for a while now) would have lost you at "untar" ;)

      It's good news nonetheless (ok, so a bunch of people - notably BSD users - will whine that it's not actually "news" and we should specify "Linux Kernel" and not just "Kernel", but anyway). I've been using the patch myself for a while, and with the added convenience of not having to apply it, I might give the 2.5 series a try, sometime around 2.5.6 or 2.5.7...

  • by augustz ( 18082 ) on Sunday February 10, 2002 @07:52PM (#2984182)
    Robert Love has another patch that I'm hoping to see make it into the kernel. For systems in headless situations with large entropy reqs, this is pretty much make or break.

    http://www.kernel.org/pub/linux/kernel/people/rml/netdev-random/README-netdev-random [kernel.org]

    describes what it is all about
  • by worldwideweber ( 116531 ) on Sunday February 10, 2002 @08:01PM (#2984210) Homepage Journal
    Folks:

    It should be noted that this will lead to a compile error if you enable preemption but disable SMP. To make this build, you need to add this patch:

    diff -urN linux-2.5.4-pre6/include/asm-i386/smplock.h linux/include/asm-i386/smplock.h
    --- linux-2.5.4-pre6/include/asm-i386/smplock.h Sun Feb 10 15:35:55 2002
    +++ linux/include/asm-i386/smplock.h Sun Feb 10 18:15:55 2002
    @@ -15,6 +15,7 @@
     #else
     #ifdef CONFIG_PREEMPT
     #define kernel_locked() preempt_get_count()
    +#define global_irq_holder 0
     #else
     #define kernel_locked() 1
     #endif

  • This is good. (Score:2, Informative)

    by StarBar ( 549337 )
    Preemption gives the kernel the possibility to change direction in the middle of a leap, and later get back to that point to finish the leap, whatever system call that is. It will of course not do this for no reason, only if an important event has happened that has a higher priority than the currently running one. A little like 'nice' but much more powerful. Can't be bad, can it?

    The next thing to have is predictability in kernel space; then we can calculate the exact max latency to expect between the important event and the system's response to it... believe it or not. Check with MontaVista for this feature, I am sure they are thinking about it.
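
    A toy illustration of the "change direction mid-leap" idea, in Python. This is only a sketch of priority preemption in general and has nothing to do with how the actual patch is implemented; the function name, the task names and the one-unit time slices are all invented for the example.

```python
import heapq

def simulate(tasks):
    """tasks: list of (arrival_time, priority, name, work_units).
    A lower priority number means more urgent. Returns the sequence of
    task names as they take over the CPU."""
    pending = sorted(tasks)           # ordered by arrival time
    ready = []                        # heap of (priority, arrival, name, remaining)
    timeline = []
    clock = 0
    i = 0
    while i < len(pending) or ready:
        # Admit everything that has arrived by now.
        while i < len(pending) and pending[i][0] <= clock:
            arrival, prio, name, work = pending[i]
            heapq.heappush(ready, (prio, arrival, name, work))
            i += 1
        if not ready:
            clock = pending[i][0]     # idle until the next arrival
            continue
        prio, arrival, name, remaining = heapq.heappop(ready)
        if not timeline or timeline[-1] != name:
            timeline.append(name)
        clock += 1                    # run the most urgent task for one unit
        if remaining > 1:
            heapq.heappush(ready, (prio, arrival, name, remaining - 1))
    return timeline
```

    Feeding it a long low-priority "syscall" that starts at time 0 and a short urgent task arriving at time 2 gives syscall, urgent task, syscall: the long job is interrupted and later resumed where it left off.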
  • Doesn't this patch just add a bunch of extra scheduling points in strategic places? That's not technically "preemptible". Or perhaps I'm thinking of one of the other "preemptible" kernel patches :~)
  • While the preemptible kernel is a more elegant solution to scheduling latency than peppering the kernel with rescheduling checks, Andrew Morton's "Low Latency" patches give better performance. I'm doing 24-bit/96-kHz audio and with the LL kernel I get vastly more stable performance than the PE kernel. Note that you aren't going to see a spit of difference with either kernel unless the process is running at realtime priority (i.e. SCHED_FIFO or SCHED_RR).
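
    For reference, a minimal sketch of asking for realtime priority from Python, assuming a Linux box where the `os.sched_*` calls are available. The helper name `try_realtime` and the default priority of 10 are arbitrary choices for the example; setting SCHED_FIFO normally needs root or an equivalent privilege, so the sketch just reports failure instead of raising.

```python
import os

def try_realtime(priority=10):
    """Try to switch the calling process to SCHED_FIFO.
    Returns True on success, False if unsupported or not permitted."""
    if not hasattr(os, "sched_setscheduler"):
        return False  # platform without the POSIX scheduling calls
    low = os.sched_get_priority_min(os.SCHED_FIFO)
    high = os.sched_get_priority_max(os.SCHED_FIFO)
    priority = max(low, min(high, priority))  # clamp to the valid range
    try:
        # pid 0 means "the calling process"
        os.sched_setscheduler(0, os.SCHED_FIFO, os.sched_param(priority))
    except OSError:
        return False  # typically a permissions error for ordinary users
    return True
```

    An audio app would call something like this once at startup and fall back to normal scheduling if it returns False.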

    burris
  • The right way to do this is in QNX [qnx.com], which only prevents interrupts for a few instructions at a time, typically while updating queues. QNX has a real microkernel; all the kernel does is schedule the CPU and handle interprocess communication. Everything else (drivers, paging, file systems, networking, graphics, etc.) is in protected-mode user processes, all of which are preemptable. This allows QNX to deliver sub-microsecond latency to high-priority processes.

    QNX stands as a rebuke to those who say a microkernel OS has to be slow.

  • Linker slow (Score:2, Insightful)

    by aashenfe ( 558026 )
    The linker definitely needs some work on Linux. Program startup can be painfully slow, especially when using KDE (C++). This really gives the feeling of a slow system, even though once the programs are finally started, they run rather snappily.
    Red Hat 7.2 has a prelinker utility on the CD-ROM, although it is unsupported. I tried it out: installed it, and ran the prelinker on all binaries in the default path (it appears to include most libraries and binaries). The improvement was negligible, if there was any at all.

    Any ideas on how this could be improved in the future? I have two ideas that I can think of to improve the linking performance, or at least improve the feel of the linking.

    1. Memory pages that are linked but not dirty (haven't been updated since) could be marked as part of a link cache. For instance, the same program starting up could just adjust its page table to point to the already linked page, and update the page count. The page would then be copy-on-write. These pages would be usable until the reference count is zero and the system needs the page for other purposes. This would improve load speed as long as the program was previously used and its pages haven't been reused for other purposes. This would be great for multi-user systems like a terminal server. I don't know if this is possible, or has already been done and I'm behind the times.

    2. Simple start up tricks. For instance the window manager opens a frame where the program is going to start up. The frame would contain a throbber, status bar, etc. The frame would resize once the program connects to the Xserver to surround the first window of the application.

    I hope these posts aren't too off topic.

    Thanks.
    Adam
