Slashdot Log In
Tuning The Kernel With A Genetic Algorithm
Posted by
michael
on Sat Jan 08, 2005 08:01 AM
from the self-modifying-code dept.
from the self-modifying-code dept.
fsck! writes "Jake Moilanen provided a series of four patches against the 2.6.9 Linux kernel that introduce a simple genetic algorithm used for automatic tuning. The patches update the anticipatory IO scheduler and the zaphod CPU scheduler to both use the new in-kernel library, theoretically allowing them to automatically tune themselves for the best possible performance for any given workload. Jake says, 'using these patches, there are small gains (1-3%) in Unixbench & SpecJBB. I am hoping a scheduler guru will able to rework them to give higher gains.'"
This discussion has been archived.
No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
Full
Abbreviated
Hidden
Loading... please wait.
Innovation and open source (Score:4, Insightful)
I mean, common. Just look at this. Amazing.
And even if it turns out to be not that good, it was still a good read
Re:Innovation and open source (Score:2)
Innovative != invention
This is very, very, innovative. It's easily the most innovative thing to hit a kernel since the 70's.
Re:Innovation and open source (Score:4, Insightful)
Parent
Re:Innovation and open source (Score:3, Informative)
Oh, and X supports many formats on the clipboard [jwz.org]
Complexity? (Score:5, Insightful)
Re:Complexity? (Score:3, Insightful)
Re:Complexity? (Score:3, Interesting)
Re:Complexity? (Score:3, Interesting)
Re:Complexity? (Score:2)
While I have to admit that none of them is anything as scary as some of the stuff that happened to 2.4 around 2.4.7-2.4.13 when the VM got changed, it is not right.
The supposedly ready and released kernel 2.6 by now from min
Re:Complexity? (Score:3, Insightful)
Re:Complexity? (Score:2, Insightful)
IMHO, if GA implementation can be made really reliable it could maybe replace other code which may be (I don't know) even more complicated.
Re:Complexity? (Score:5, Informative)
I'm a machine learning researcher, and I'll second this. Also, the code for it will be quite self-contained (if done right), and shouldn't affect any parts of the kernel except where it's called to run an iteration.
They also work quite well. That's why they are so popular.
Sure they do. For a lot of problems, though, they're not so hot compared to other optimization methods like hill climbing and empirical gradient descent - they tend to run slowly - and I would like to ask Mr. Moilanen why he didn't use one. GAs are especially good with nominal parameters (discrete, unordered). But I would expect tuning parameters to be either discrete or continuous.
GAs are theoretically capable of finding global optima, except that's kind of a red herring. The only reason you can prove that is that, theoretically, no matter how small the probability, you'll eventually get exactly the right sequence of mutations to produce a global optimum. In practice, they tend to produce local optima like most other optimization algorithms - especially as Moilanen describes it:
All of the tunings are then ordered from best performance to worst, and the worst half of the performers are replaced with new possible tunings derived from the best half of the performers.
You generally have to be a little less selective (more random) than this.
Parent
Re:Complexity? (Score:2)
good luck with that (Score:5, Insightful)
In other words, non-competitive genetic algorithms are only as smart as the fitness function you give them. If your fitness criteria aren't smart enough to cover all the bases, your solutions will have unpredictable deficiencies.
GA + Hill Climbing... (Score:4, Interesting)
As an alternative, perhaps using some form of pseduo-GA that tries to find pre-tuned parameters that most closely match your operating environment and then letting a Hill-Climbing algorithm hit it would be a better solution.
Hill climbing can also be used in a GA type manner by letting the GA determine witch parameters to climb and in what order. The climbing itself is pretty straightforward, allow vectors to interact with individual parameters. If the result is worse, reverse the vectors or switch to new parameters. Rinse, repeat.
Yes, GA can produce odd bugs and potholes. Yes, it is the fitness test that determines if that will be true. But a good GA will generally find solutions that are as good or better than hand tuning for search spaces that are very complex. Overall, this is a good idea but is probably more complex than advertised.
Parent
Re:GA + Hill Climbing... (Score:5, Insightful)
If you have the resources to exhastively search the space... you don't need a GA.
A GA is generally used when the search space is hopelessly huge and you need to chart a course from A to(wards) B but you don't know the way.
And in finding this solution, which is "grown", not engineered, it's much easier for unintended wierdnesses to creep in. A GA might solve problem X by ruining performance on problem Y, something that you, as a software engineer, would never even consider viable, and hence you forgot to force the fitness function to check problem Y along with problem X.
Parent
Re:GA + Hill Climbing... (Score:2)
I agree. However, we have a problem: Running a GA on an operational system will (MUST) test more highly sub-optimal solutions than it will near-optimal solutions. The performance penalty for searching the space will overwhelm the performance gains f
Re:GA + Hill Climbing... (Score:3, Insightful)
Mutation in Genetic Algorithms are supposed to act as hill-climbers *most of the time*. Mutations are not supposed to make drastic changes - that is the job of random individulas inserted into populations and recombination (mating). Mutation (of the bit flipping variety) is mostly there to provi
not a panacea (Score:3, Interesting)
They might converge on a point of attraction that is not the highest possible.
Sure the only way is to exhaustively search the "chromosome" space for every possibile combination, and computers are good at brute force!
Simulated Annealing (Score:3, Interesting)
Other kernel parameters? (Score:5, Interesting)
If this package could be extended to the other parameters, it would save my customers a *lot* of time and money.
If nothing else, this could be a deciding factor for some of our clients to use linux instead of windows.
Re:Other kernel parameters? (Score:3, Interesting)
Re:Other kernel parameters? (Score:3, Interesting)
What about adding hooks for applications to to send/recieve performance changes after tweaks? Services, daemons, etc, need to communicate how the GA's latest tweak adjusted performance, right?
So.... (Score:2, Funny)
Not worth it... (Score:2, Insightful)
Re:Not worth it... (Score:5, Insightful)
What is "on the borders of statistical error" depends on how many times the test was run, and how much variation there had been in his results before. I think it's pretty safe to assume that if he knows how to implement a genetic algorithm into the linux kernel, he knows how to handle statistics properly.
Parent
Re:Not worth it... (Score:2)
Re:Not worth it... (Score:2)
being mostly just a proof of concept at this stage.
Re:Not worth it... (Score:3, Insightful)
Re:Not worth it... (Score:3, Interesting)
Pragmatism and statistics are _not_ a good mix.
Note that, for example, many hosting providers host hundreds of web sites per system. Adding a couple of percent in performance then adds a couple of percent to the bottom line of the cost picture for those companies. The same is true for supercomputer clusters used by many c
This has been done before (Score:2, Funny)
"Daystrom felt that such an act was an offense against the laws of God and man, and the computer that carried his engrams also believed it."
--Kirk
One question remains: (Score:3, Funny)
Genetic packet scheduler (Score:3, Interesting)
The problem: Determining Performance (Score:4, Insightful)
What is the system using to decide if a new parameter set is better than a previous? What is the fitness test?
Some applications are disk-bound, others are CPU-bound, others are network bound. The performance dance is often non-obvious (e.g. some applications will achieve generally higher performance by allowing I/O higher priority than context switching, while others that appear to perform in a similar manner will achive higher performance by reversing that order).
The kernel does not have any mechanism to determine if a particular application is performing better or worse, it can only really get a guage of throughput and load. While this MAY be enough to get small systemwide performance gains, in order to really acheive significant application-specific performance gains, I think that applications would need to explicitly add support for adaptive tuning by logging relevant performance metrics for the kernel to tune around.
Thoughts?
Oh no! (Score:2, Funny)
Monte Carlo w Bayesian Stats (Score:2)
GAs aren't rocket science (Score:5, Insightful)
In reality, the basic GA framework is "just" another efficient search algorithm, no cooler or more complex than, say, a hash table or a binary search tree; at its simplest, a GA is a way to find an optimal configuration of components without looking at all possible (potentially explosively exponential) combinations; instead, you look at just some permutations, and as you iterate through generations, applying breeding and mutation, you arrive at a generation which is statistically close to optimal.
GAs are also in no way new or unproven technology; a nice example of mainstream use is PostgreSQL [postgresql.org]'s query planner, which uses GAs to optimize query plans.
practical applications (Score:2, Interesting)
this is a really interesting idea.
Moving the genetic algorithm processing to another machine may be warranted. If you had a good idea of what you were going to be doing (heavy database work for instance), a dedicated machine could be used to find an optimal scheduling solution and then that could be implemented on the production machine.
or maintain a list
Re:practical applications (Score:2)
Ahh.. interesting idea..
If I am running in a cluster environment, I could dedicate one or more machines in the cluster to evolve tuning parameters. That machine could publish "discoveries" to the oth
Shouldn't this be done in userspace? (Score:2)
Re:Futurology (Score:2, Interesting)
It's pretty funny considering he is talking in his grave and totally serious robo-voice...
Re:Futurology (Score:4, Funny)
It's pretty funny considering he is talking in his grave and totally serious robo-voice...
You mean there is another voice?
Parent
Re:Futurology (Score:2)
If one of Linus' kids takes over from his father then that could be considered a kind of tuning of the kernel using genetic algorithms!
EricWhy the Vioxx recall reduced spam (well, maybe temporarily) [ericgiguere.com] (see also my William Shatner All-Bran humor [ericgiguere.com]
Re:Dear Kernel Coders (Score:3, Insightful)
In other words: if you have someone who's good at writing Genetic Algorithms, but not so good at searching
Re:Dear Kernel Coders (Score:3, Informative)
Re:Dear Kernel Coders (Score:5, Informative)
Patches for 2.4 can be found in this changeset [bkbits.net].
Patches for 2.6 can be found in this changeset [bkbits.net].
Click on the little "diff -Nur style" link for a an actual usable patch.
In the course of a few hours, you have the fixes already. Yay for open source.
Btw, nice troll
Parent
Re:Dear Kernel Coders (Score:2)
Re:Oh, Oh (Score:2)
kudos to SpanKY, who found the right words for this poor soul
Re:Oh, Oh (Score:2)
I think you are allowed to just say USE="*"
If you want the binaries to be slower than RPMs...
Re:Oh, Oh (Score:2)