Tuning The Kernel With A Genetic Algorithm
fsck! writes "Jake Moilanen provided a series of four patches against the 2.6.9 Linux kernel that introduce a simple genetic algorithm used for automatic tuning. The patches update the anticipatory IO scheduler and the zaphod CPU scheduler to both use the new in-kernel library, theoretically allowing them to automatically tune themselves for the best possible performance for any given workload. Jake says, 'using these patches, there are small gains (1-3%) in Unixbench & SpecJBB. I am hoping a scheduler guru will able to rework them to give higher gains.'"
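For readers unfamiliar with the technique, here is a minimal sketch of a genetic algorithm tuning loop. It is not the in-kernel library from the patches. The tunables `knob_a` and `knob_b`, their ranges, and the toy `fitness` function are all hypothetical, chosen only to show selection, crossover, and mutation.

```python
import random

# Hypothetical tunables and their allowed integer ranges (not real kernel knobs).
RANGES = {"knob_a": (0, 100), "knob_b": (0, 100)}

def fitness(genes):
    """Toy stand-in for a benchmark score; higher is better, peak at (70, 30)."""
    return -((genes["knob_a"] - 70) ** 2 + (genes["knob_b"] - 30) ** 2)

def random_genes(rng):
    return {k: rng.randint(lo, hi) for k, (lo, hi) in RANGES.items()}

def crossover(a, b, rng):
    # Each gene is inherited from one of the two parents, chosen at random.
    return {k: rng.choice((a[k], b[k])) for k in RANGES}

def mutate(genes, rng, rate=0.2, step=5):
    # With probability `rate`, nudge a gene by up to `step`, clamped to its range.
    out = dict(genes)
    for k, (lo, hi) in RANGES.items():
        if rng.random() < rate:
            out[k] = min(hi, max(lo, out[k] + rng.randint(-step, step)))
    return out

def evolve(generations=60, pop_size=20, seed=0):
    rng = random.Random(seed)
    pop = [random_genes(rng) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]  # the fitter half survives unchanged
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        pop = parents + children
    return max(pop, key=fitness)

best = evolve()
```

Because the fitter half is carried over unchanged each generation, the best score never gets worse. In a real tuner the `fitness` call would be replaced by a measurement of the running workload, which is where most of the difficulty lies.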
Re:Futurology (Score:2, Interesting)
It's pretty funny considering he is talking in his grave and totally serious robo-voice...
not a panacea (Score:3, Interesting)
They might converge on a point of attraction that is not the highest possible.
Sure, the only way to be certain is to exhaustively search the "chromosome" space for every possible combination, and computers are good at brute force!
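To put a number on the brute-force option, here is a rough sketch of how fast the "chromosome" space grows. The figures (10 tunables, 100 values each, one second per benchmark run) are assumptions picked for illustration, not anything taken from the patches.

```python
import math

def search_space_size(values_per_param):
    """Number of distinct settings when each parameter varies independently."""
    return math.prod(values_per_param)

def brute_force_years(values_per_param, seconds_per_trial):
    """Wall-clock years needed to benchmark every combination once."""
    seconds = search_space_size(values_per_param) * seconds_per_trial
    return seconds / (365 * 24 * 3600)

# Hypothetical case: 10 tunables, 100 possible values each.
size = search_space_size([100] * 10)
years = brute_force_years([100] * 10, seconds_per_trial=1)
```

Under those assumptions `size` is 10**20 combinations, which is why heuristic searches such as a GA get used despite the risk of settling on a local optimum.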
Other kernel parameters? (Score:5, Interesting)
If this package could be extended to the other parameters, it would save my customers a *lot* of time and money.
If nothing else, this could be a deciding factor for some of our clients to use Linux instead of Windows.
Simulated Annealing (Score:3, Interesting)
GA + Hill Climbing... (Score:4, Interesting)
As an alternative, perhaps using some form of pseudo-GA that tries to find pre-tuned parameters that most closely match your operating environment and then letting a Hill-Climbing algorithm hit it would be a better solution.
Hill climbing can also be used in a GA-type manner by letting the GA determine which parameters to climb and in what order. The climbing itself is pretty straightforward: allow vectors to interact with individual parameters. If the result is worse, reverse the vectors or switch to new parameters. Rinse, repeat.
Yes, GA can produce odd bugs and potholes. Yes, it is the fitness test that determines if that will be true. But a good GA will generally find solutions that are as good as or better than hand tuning for search spaces that are very complex. Overall, this is a good idea but is probably more complex than advertised.
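The climbing step described above can be sketched roughly as follows. This is one plausible reading of it (per-parameter step "vectors" that get reversed when a move does not help), and the `score` function and parameter names are hypothetical.

```python
def hill_climb(score, start, step=1, max_iters=1000):
    """Coordinate hill climbing: nudge one parameter at a time."""
    keys = list(start)
    current = dict(start)
    best = score(current)
    direction = {k: step for k in keys}  # per-parameter "vector"
    for _ in range(max_iters):
        improved = False
        for k in keys:
            for _attempt in range(2):  # try the current direction, then the reverse
                trial = dict(current)
                trial[k] += direction[k]
                s = score(trial)
                if s > best:
                    current, best = trial, s
                    improved = True
                    break
                direction[k] = -direction[k]  # not better: reverse the vector
        if not improved:
            break  # no single-step move helps: a (possibly local) optimum
    return current, best

# Toy score with a single peak at x=7, y=-3 (an assumption for illustration).
def toy_score(p):
    return -((p["x"] - 7) ** 2 + (p["y"] + 3) ** 2)

result = hill_climb(toy_score, {"x": 0, "y": 0})
```

On a single-peaked function like `toy_score` this walks straight to the top. On a bumpy one it stops at whichever peak is nearest the starting point, which is the argument for seeding it from a GA or from pre-tuned profiles.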
Re:Other kernel parameters? (Score:3, Interesting)
Genetic packet scheduler (Score:3, Interesting)
It might be useful to find good default settings (Score:1, Interesting)
practical applications (Score:2, Interesting)
This is a really interesting idea.
Moving the genetic algorithm processing to another machine may be warranted. If you had a good idea of what you were going to be doing (heavy database work for instance), a dedicated machine could be used to find an optimal scheduling solution and then that could be implemented on the production machine.
Or maintain a list of optimal solutions on the production machine and change based on context.
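A rough sketch of the "list of optimal solutions" idea: keep settings that were tuned offline, and pick the entry whose workload signature is closest to what the machine is currently seeing. The profile names, the two-number signature, and the settings are all hypothetical.

```python
import math

# Hypothetical pre-tuned profiles: workload signature -> settings found offline.
# The signature here is an assumed pair (disk-bound fraction, network-bound fraction).
PROFILES = [
    {"name": "database", "signature": (0.8, 0.2),
     "settings": {"knob_a": 70, "knob_b": 30}},
    {"name": "webserver", "signature": (0.3, 0.9),
     "settings": {"knob_a": 20, "knob_b": 60}},
]

def closest_profile(observed):
    """Return the stored profile nearest to the observed signature."""
    return min(PROFILES, key=lambda p: math.dist(p["signature"], observed))

chosen = closest_profile((0.75, 0.25))
```

How to measure a meaningful signature cheaply on a production box is the open question; the lookup itself is trivial.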
This implementation might not be all that useful but I hope it spurs interest and a lot more development in the area.
The author did an awesome job coming up with the idea.
Re:Complexity? (Score:3, Interesting)
Re:GA + Hill Climbing... (Score:2, Interesting)
Granted it was a user level app and stored its data in an SQL db, but roughly half that data is from running processes.
The collection of timely data will overwhelm anything else. I have no idea of the details of what has been done except from the article. I think the only way it could be useful would be as a tuning tool for servers (most heavy duty servers have regular patterns of usage).
You run the tool on your server to collect data. You then use the data and a separate machine to "evolve" a solution. The solution may involve many changes at various times, whatever, but these could be scheduled into the server. In most places you would want to collect data for at least a month, preferably for 13 months.
Note that the 3% was against particular benchmarks. If you based your benchmarks on data collected from your servers the "evolution" stage could be a snap. A lot of large companies won't even spend money to collect and analyse basic data because they run on a "fix on fail" basis and just throw hardware at it because memory/disk space is relatively cheap and they think it might help.
Very interesting stuff and a great thread but I can't see the GA-kernel as a significant improvement.
Does anyone use this one? (Score:2, Interesting)
But it is getting much better now. I don't know how many generations will be needed to get things right. It feels pretty much the same as with the vanilla kernels; let's see where this leads.
Anyone else with experiences? AFAIK this thingy can only be tweaked by editing the code and recompiling; there are a few hardcoded parameters.
Re:Complexity? (Score:3, Interesting)
It's not really that complex, but reasoning about how well it could perform is VERY difficult, comparatively speaking. Furthermore, it introduces a much fuzzier notion of fine tuning. Rather than playing with variables like "swappiness" and CPU affinity, you're messing with a fitness function, where minute changes can move you from a percentage gained over a stock kernel to a percentage (or worse) loss. Certainly, I'm impressed that he's managed to make an improvement over stock with it, which puts interested kernel developers on a good first step. Nobody wants to chase technology that doesn't even show a hint of unlocking performance potential.
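A toy illustration of how sensitive a fitness function can be. The two candidate configurations and their benchmark numbers are made up; the point is only that a small change in the weighting flips which one the GA would favour.

```python
# Hypothetical benchmark results for two candidate configurations.
CANDIDATES = {
    "A": {"throughput": 100.0, "latency": 100.0},
    "B": {"throughput": 110.0, "latency": 111.0},
}

def fitness(result, weight):
    """Reward throughput, penalise latency; `weight` sets the balance (0..1)."""
    return weight * result["throughput"] - (1 - weight) * result["latency"]

def winner(weight):
    """Name of the candidate with the highest fitness at this weighting."""
    return max(CANDIDATES, key=lambda name: fitness(CANDIDATES[name], weight))

at_052 = winner(0.52)
at_053 = winner(0.53)
```

With these numbers, moving the weight from 0.52 to 0.53 switches the preferred configuration from A to B, so the population would start evolving toward a different trade-off.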
Re:Other kernel parameters? (Score:3, Interesting)
What about adding hooks for applications to send/receive performance changes after tweaks? Services, daemons, etc., need to communicate how the GA's latest tweak adjusted performance, right?
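One way such a hook could look, purely as a sketch: applications report a metric measured before and after a tweak, and the tuner averages the relative changes into a fitness value. The `FeedbackCollector` class, its methods, and the application names are hypothetical and do not correspond to any existing kernel or userspace interface.

```python
from collections import defaultdict
from statistics import mean

class FeedbackCollector:
    """Hypothetical collector for per-application feedback on a tweak."""

    def __init__(self):
        self._reports = defaultdict(list)

    def report(self, tweak_id, app, before, after):
        # Store the relative change in the app's own metric (higher is better).
        self._reports[tweak_id].append((app, (after - before) / before))

    def fitness(self, tweak_id):
        # Average relative change across all apps that reported; 0.0 if none did.
        changes = [change for _, change in self._reports[tweak_id]]
        return mean(changes) if changes else 0.0

collector = FeedbackCollector()
collector.report(1, "httpd", before=100.0, after=103.0)   # 3% better
collector.report(1, "mysqld", before=200.0, after=198.0)  # 1% worse
score = collector.fitness(1)
```

A plain average treats every application as equally important; a real design would need some way to weight them.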
Re:Not worth it... (Score:3, Interesting)
Pragmatism and statistics are _not_ a good mix.
Note that, for example, many hosting providers host hundreds of web sites per system. Adding a couple of percent in performance then adds a couple of percent to the bottom line of the cost picture for those companies. The same is true for supercomputer clusters used by many companies and universities with hundreds of nodes.
1-2% may sound like 'next to nothing', but that's not how you should look at it. If you ignore 2% only five times, you've really already ignored about 10.4%...
There is a Dutch saying that, translated into English, goes like this: "If you don't honour the small things, you're not worth the big ones".
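The compounding argument above is easy to check. A minimal sketch, where `compounded_gain` is just a helper name made up for the example:

```python
def compounded_gain(rate, times):
    """Total relative gain from applying the same relative gain `times` in a row."""
    return (1 + rate) ** times - 1

five_small_wins = compounded_gain(0.02, 5)
```

Five successive 2% improvements multiply out to a bit over 10%, slightly more than the 10% you get from simply adding them up.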
Or any startup routine. (Score:2, Interesting)
In general any software has initialization routines. My work has been in robotics, automation, and telecom switching equipment written in C and C++. We spent a lot of time worrying about efficiency, but not for startup routines unless the start lag became annoying. But for the most part people expect things to take a little time to start up. There are a lot of things to do, especially if you have no way of knowing how the thing shut down. I am digressing.
I think that a genetic algorithm to tune sounds like a great idea. I would just want a way to say check this code and not this other code.
I hope that there is much success with this GA.