Alternative To the 200-Line Linux Kernel Patch 402
climenole writes "Phoronix recently published an article regarding a ~200 line Linux Kernel patch that improves responsiveness under system strain. Well, Lennart Poettering, a Red Hat developer, replied to Linus Torvalds on a mailing list with an alternative to this patch that does the same thing yet all you have to do is run 2 commands and paste 4 lines in your ~/.bashrc file."
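For the curious, the hack in question boils down to roughly the following sketch (based on the thread's discussion; it assumes a cgroup hierarchy mounted at /dev/cgroup and the setup commands require root):

```shell
# One-time setup (as root): mount the cpu cgroup controller and create
# a world-writable parent group to hold per-shell subgroups.
mkdir -p /dev/cgroup/cpu
mount -t cgroup cgroup /dev/cgroup/cpu -o cpu
mkdir -m 0777 /dev/cgroup/cpu/user

# Then, in ~/.bashrc: every interactive shell puts itself (and thus all
# of its children) into its own scheduling group, named after its PID.
if [ "$PS1" ] ; then
    mkdir -m 0700 /dev/cgroup/cpu/user/$$
    echo $$ > /dev/cgroup/cpu/user/$$/tasks
    echo "1" > /dev/cgroup/cpu/user/$$/notify_on_release
fi
```

The notify_on_release line just asks the kernel to clean up the group once its last task exits, so dead shells don't leave empty directories behind.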
Re:While the bashrc approach may seem attractive (Score:4, Informative)
True, though it could be done at the distro level, which appears to be the author's plans (the person who wrote this script works for Red Hat, and discussed elsewhere in the thread what Red Hat's plans are for rolling out systemd [freedesktop.org], which will handle this). Then things would be appropriately updated by the maintainers rather than relying on users to keep their .bashrc synced with infrastructure changes.
Re:Ah the beauty of the interconnected world... (Score:1, Informative)
Not really, no. The solution Lennart proposes works in tandem with systemd, which is yet another init replacement that's not being used by anyone at the moment. It is speculated that Fedora 15 will start using it, being that Lennart works for RedHat and all, but I frankly don't see it gaining traction anytime soon outside of RH.
So, a userspace solution that relies on a piece of software that no one uses (and, the cherry on top, systemd will render a system unbootable if the kernel isn't compiled with CGROUPS - that ought to end well; really, what could possibly go wrong?), pitted against an in-kernel solution that is completely transparent to the user. Guess which is going to win.
Seems like a good plan (Score:5, Informative)
My understanding of the original kernel patch is that it just puts stuff from different ttys into different groups for scheduling purposes so that they're less able to hog each other's resources. This alternative just makes your shell sort it out itself when it starts i.e. when you're running a new terminal. So this should basically be equivalent.
See this comment from the latest article for Linus' take on putting this stuff in-kernel:
http://www.webupd8.org/2010/11/alternative-to-200-lines-kernel-patch.html#comment-98834842 [webupd8.org]
The comment here is very important to remember though:
http://linux.slashdot.org/comments.pl?sid=1870628&cid=34241622 [slashdot.org]
another comment on that article (which I can't now find - anybody know where it is?) basically said that the patch suits Linus's own use of compiling kernels whilst surfing the web. Sounds like a reasonably accurate assessment really so for now it's far from the magical boost to general interactivity some may have hoped for. In some sense there's no such thing anyhow.
Nonetheless the comment linked above also has Linus talking about increasing the scope of the automatic grouping heuristics in the future so hopefully the "just works" nature of this should become available to more people eventually.
The original kernel patch (and this alternative) aren't magically making everything respond better, they just improve certain usecases.
Re:Any linux kernel? (Score:2, Informative)
Yes, because it has to do with the way the kernel handles multitasking.
Also from the article (Score:5, Informative)
Two things:
1) There isn't a difference between the kernel patch and the command line hack. They are equivalent. The command line bit was known beforehand because that was the method used to figure out if this kernel hack would be a good idea. The kernel hack just makes the process transparent.
Linus says: Right. And that's basically how this "patch" was actually tested originally - by doing this by hand, without actually having a patch in hand. I told people: this seems to work really well.
2) Linus recommends the kernel patch:
Linus also says: Put another way: if we find a better way to do something, we should _not_ say "well, if users want it, they can do this *technical thing here*". If it really is a better way to do something, we should just do it. Requiring user setup is _not_ a feature.
Source. [lkml.org]
Re:How does this work? (Score:5, Informative)
It makes every process spawned by the user that passes through the bash shell add its process ID to a per-user task control group. See the documentation on control groups [mjmwired.net] for more information about exactly what that means, and what some of the commands involved aim to do. I'm not sure if this has exactly the same impact as the kernel-level patch, which aimed at per-TTY control groups. That might include some processes that don't pass through something that executes the .bashrc file along the way.
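An easy way to check the effect on a running system (any Linux box with cgroups enabled) is to ask the kernel which groups the current shell belongs to:

```shell
# /proc/<pid>/cgroup lists one line per cgroup hierarchy the process is
# attached to. With the .bashrc hack active, an interactive shell should
# show a cpu hierarchy path like /user/<shell-pid>; without it, just /.
cat /proc/self/cgroup
```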
Ubuntu instructions incorrect (Score:5, Informative)
Following the instructions for Ubuntu as detailed in the post will give you an error message every time you open gnome-terminal.
One of the comments left by Ricardo Ferreira on that page solved my problem (after rebooting again):
Edit your rc.local file with sudo gedit /etc/rc.local and delete the following line:
echo "1" > /dev/cgroup/cpu/user/notify_on_release
Save and exit gedit. Then, run gedit ~/.bashrc and add the following inside your if statement:
echo "1" > /dev/cgroup/cpu/user/$$/notify_on_release
So it should look like this:
if [ "$PS1" ] ; then
    mkdir -m 0700 /dev/cgroup/cpu/user/$$
    echo $$ > /dev/cgroup/cpu/user/$$/tasks
    echo "1" > /dev/cgroup/cpu/user/$$/notify_on_release
fi
Re:What does this do? (Score:5, Informative)
Imagine you have an app that launches just one process, like a music player, and an app that launches 3 (for example, Firefox, which launches a new one for each plugin).
Since each process has the same priority, the second app - firefox - will effectively have 3x more CPU time than the media player, and possibly stutter the music.
The kernel has something called cgroups, which enables multiple processes to be grouped together, with each group getting the same CPU time. So the group (Firefox+plugins) would have the same CPU time as the media player.
This kernel patch and terminal code enables each terminal you launch to have a different group, so if you launch Firefox from one terminal and the music player from another, they'll have different groups.
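As a rough sketch of what that grouping looks like by hand (needs root and the cpu controller mounted at /dev/cgroup/cpu; FIREFOX_PID and PLAYER_PID are hypothetical placeholders for the apps' PIDs):

```shell
# One group per app, so the scheduler splits CPU between the groups,
# not between the individual processes.
mkdir /dev/cgroup/cpu/firefox /dev/cgroup/cpu/player
echo "$FIREFOX_PID" > /dev/cgroup/cpu/firefox/tasks  # main process; its
echo "$PLAYER_PID"  > /dev/cgroup/cpu/player/tasks   # children follow
# With equal cpu.shares (the default), each group gets ~50% under load,
# no matter how many plugin processes end up in Firefox's group.
```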
Re:Also from the article (Score:5, Informative)
The difference is the kernel patch is 200 lines of C code, which compiles to several kilobytes of machine code. The shell code needs to spawn a bash process upon startup of every other process, that's several megabytes of RAM and interpreting contents of text scripts that perform the operations.
The final effect may be the same but the overhead of performing the operation is much smaller with the kernel patch.
Re:4 Lines Is Not All. Let's Not Forget... (Score:4, Informative)
/etc/bash.bashrc
Global bashrc file.
Re:What does this do? (Score:5, Informative)
Sure.
Hopefully you know what a TTY is, but in case you don't: it is a virtual or real terminal. When you open up an xterm you create one. If you don't have X installed, the console you log in at is one, etc.
Well Linus had an idea about using a grouping functionality that was already in the Kernel to allow all the processes (technically actually all the kernel threads) running from one TTY to be grouped together for scheduling.
The result of that is that if you are running 99 processes in one xterm that could consume all of your CPU, and you open another xterm with just one process that wants 100% CPU, each xterm's processes get 50% of the CPU, rather than one getting 99% and the other getting only 1%.
But let's say you only had that first xterm. Since each of those processes is not getting nearly the processor time it wants, normally the scheduler sees them as nearly starved, and the next process that only wants 5% of the CPU does not get much preferential treatment for giving up most of its time. However, with the grouping, the scheduler can see that those 99 processes are related, and that they are not really starved, since as a group they are getting 100%. So now when this other app that wants only 5% comes along, the scheduler might give it pretty much all of that 5% rather than the mere 1% it would have been getting before, and so that app (probably a web browser or something) remains nice and responsive.
That is not 100% accurate, since I've simplified some things a little, especially with regard to the working of the scheduler, but it should give you the idea.
Eventually, more heuristics might be added, so that a GUI application that launches a bunch of threads and hogs the CPU might have all its threads grouped, so they don't hurt the responsiveness of interactive apps either.
Re:What does this do? (Score:5, Informative)
In theory you could alter the 'launch' process for running software & check a database for 'nice' priorities so that they automatically launch with a preset 'nice' rating.
Currently, the kernel is very egalitarian - everything runs at 'nice 0' unless the user wants something different. If YOU think that extinguish_fire should have more of a priority than watch_tv, then YOU should handle the issue.
However, that isn't the issue addressed by either the patch or the userspace scripts. While adjusting niceness may help in a gross sense, it's not going to handle proper timeslicing of software that's spawning a huge number of threads and lagging other applications.
As an example, we need to run extinguish_fire and evacuate_building at the same time. extinguish_fire spawns a thread for each bucket in the brigade, while evacuate_building only spawns a thread for each escape route. Now, if there are 96 buckets & 4 escape routes, extinguish_fire will consume 96% of the CPU & choke out the evacuate_building threads.
You could try to guess the appropriate level of 'nice' for each program when you launch it, but it's not going to be pretty. To get even timing, you would be pushing evacuate_building to nice -19 - an act that would make it next to impossible to establish any control over the bucket brigades.
By grouping all of the threads from a program, extinguish_fire and evacuate_building get equal footing regardless of the number of threads they spawn. Both of them remain responsive to commands without taking the huge hits you get from drastic nice levels. If both processes aren't running smoothly, you can renice the group rather than take the nice hit 'threadcount' times.
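As a rough illustration of the difference (extinguish_fire is of course a hypothetical program, and the cgroup path assumes the /dev/cgroup mount used elsewhere in the thread):

```shell
# Without grouping: chase down and renice every thread individually,
# and do it again for any thread spawned later.
for pid in $(pgrep extinguish_fire); do
    renice -n 10 -p "$pid"
done

# With a cgroup: one knob weights the whole program at once
# (needs root; 1024 is the default cpu.shares value, so 512 halves it).
echo 512 > /dev/cgroup/cpu/extinguish_fire/cpu.shares
```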
Re:Whee... (Score:4, Informative)
I believe he meant they were calloused from having to deal with new users not the OS.
Re:Also from the article (Score:5, Informative)
No, incorrect. This is a modification to your .bashrc, which is (already) run every time you start a bash process, within that process (i.e., not a new process). Nothing needs to be spawned on every single process.
Admittedly the bash script does spawn some processes, but a) that's the way .bashrc works, and you have dozens of those in there, and b) it's only one process, a mkdir. The echo and the conditional run within bash itself.
The way that the configuration works, whether done in the kernel or in your .bashrc, is to associate all processes spawned from a single bash shell with a single new scheduling group. This gets you better performance when you're running processes from terminals, by associating logically-similar groups of processes in the kernel instead of letting it see all the processes as a giant pile.
The intended use case, which is pretty clear from the LKML discussion, is to make performance between something intensive (like a compilation) in a terminal and something non-terminal-associated (like watching a movie) better-balanced.
Re:Poettering is pimping systemd (Score:4, Informative)
You don't need Pulseaudio if your machine has a single set of speakers and a single input device or maybe a couple of devices that never change.
As soon as you add things like bluetooth or USB headsets into the mix and want to do things like move audio streams between output devices without stopping them (play the sound from the DVD I am watching on the main speakers, unless I turn on the bluetooth headset) you either need to modify each and every application to understand all these devices or else you need some kind of sound server.
Re:4 Lines Is Not All. Let's Not Forget... (Score:3, Informative)
I was busy - working.
Putting the code in the wrong place (Score:5, Informative)
An early comment on LWN [lwn.net] captured the technical argument best, I think, which I guess illustrates both the quality of the articles and posters on LWN. The background to this is we are discussing CPU scheduling. If you don't know what CPU scheduling is, think of it as form of mind reading. I'll illustrate.
Let's say you have asked your computer to do several things, in fact so many that if it follows the usual method of simply dividing its time equally between them it is going to annoy you. The video you're watching might start flickering, or the music you are listening to will drop out. So obviously the computer must now give more CPU time to playing your movie and less to whatever background task you started, such as that MP3 transcode of your 20,000 song library. Except how is the computer supposed to know this? This is how we get to mind reading.
The hack we are discussing is essentially the discovery of a way to read the minds of one particular type of computer user - the Linux Kernel developer. The Linux Kernel developer is in the habit of starting huge background jobs called kernel compiles. These kernel compiles take a looong while, so the kernel developers, being very clever people, have invented all sorts of ways of speeding them up. One of those ways is to divide the task into lots of little bits, and then fire off separate tasks to do each. This takes maximum advantage of available CPU cores, soaking up every skerrick of available CPU time. This naturally enough leaves none left over for other important tasks like watching a movie while waiting for your kernel compile. In this particular case the default CPU scheduling strategy of giving each task an equal share of CPU is woefully poor, because there might be 20 kernel compile tasks and just one movie watching task, so the movie player ends up with 1/20 of the available CPU time. This isn't enough to play a movie.
The mind reading trick discovered boils down to this: Linux Kernel developers use the linux command line interface to fire off the kernel compile. And it turns out that for years now the kernel has been able to group the tasks started from a command line and give that group a single portion of CPU time, as opposed to an equal portion to each task in the group. Thus you only have to split up the CPU time into 2, one portion going to the kernel compiler group and the other going to the movie player. Naturally enough the movie player works real well with a 50% allocation of CPU, and so we have a happy kernel developer.
Now we come to the merits of the two hacks. They both do the job I just described equally well. The difference between them is that one, the kernel patch, is automagic, meaning it happens automatically without anybody having to lift a finger. But it comes at the expense of bloating the Linux kernel a tiny bit, even for users who won't benefit from it. The other currently has to be applied manually, using a process the vast majority of Linux users will at best find difficult, tedious and error prone.
Seems like a simple decision, eh - let's take the tiny bloat hit and not inflict yet another user-unfriendly idiosyncrasy on our long-suffering desktop users. But here is the rub: it doesn't help them. In fact, for some it might have a negative impact (a gstreamer pipeline started from the command line springs to mind). The people who will benefit from this are the ones that use the command line heavily and regularly. People like Linus. Which is why he liked it so much, I guess. But these are precisely the people who will have absolutely no trouble doing it the manual way.
Re:But But But But Buzt Buut (Score:1, Informative)
hdparm, but it requires root.
Re:Also from the article (Score:5, Informative)
One never sets PS1 for non-interactive shells, and it's the primary way the shell tells the user's startup scripts whether they're interactive. There's a good chance the PS1 method spares a system call, too :-) It's also what the documentation says to do.
Your [ -t 0 ] approach also fails in cases where an interactive shell is being run on a non-tty. Although almost any shell since about 1990 tends to complain in such cases, at least the PS1 method will still run the right .profile code, and the -t method will not.
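A minimal way to see the two checks side by side (run it both interactively and via a pipe to watch them diverge):

```shell
# PS1 is only set by bash itself for interactive shells (unless it was
# explicitly exported into the environment), so startup scripts key off it.
if [ "$PS1" ]; then
    echo "PS1 check: interactive"
else
    echo "PS1 check: not interactive"
fi

# -t 0 asks whether stdin is a terminal - a different question, which
# gives the wrong answer for an interactive shell running on a non-tty.
if [ -t 0 ]; then
    echo "tty check: stdin is a terminal"
else
    echo "tty check: stdin is not a terminal"
fi
```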
Re:Seems like a good plan (Score:3, Informative)
except some people (*cough*unbuntufolks*cough*) don't like to use the terminal... so the kernel patch might be better, although wouldn't all gui apps have the same [p]tty?
Exactly. Gui apps (usually) don't have a controlling terminal, so would all end up in the same scheduling group, making the patch ineffective.
However, with user-space managed cgroups, the window manager (or whatever starts up the GUI apps) could do its own thing (the .bashrc hack doesn't work as is either, because the window manager doesn't usually invoke apps via the shell)
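One hypothetical way to cover GUI apps would be a tiny launcher wrapper, with the window manager configured to start apps through it (same /dev/cgroup layout as the shell hack; this is an illustrative sketch, not a shipped tool):

```shell
#!/bin/sh
# Illustrative wrapper: move ourselves into a fresh per-app group, then
# exec the real app - the exec'd process keeps its cgroup membership.
app="$1"; shift
group="/dev/cgroup/cpu/user/gui-$$"
mkdir -m 0700 "$group"
echo $$ > "$group/tasks"
exec "$app" "$@"
```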
Re:Ah the beauty of the interconnected world... (Score:4, Informative)
the steaming, broken pile known as PulseAudio
That was the case a couple of years ago, but have you tried it recently? I haven't actually had a single audio problem since switching from debian/alsa to ubuntu 10.04/PA, and I now have a ton of useful features on top :-) (per-app volume, per-app output devices, network streaming, seamless switching between headphones and HDMI, etc)
Re:which one is 'right'? (Score:3, Informative)
So, which is a better approach?