Tuning The Kernel With A Genetic Algorithm

fsck! writes "Jake Moilanen provided a series of four patches against the 2.6.9 Linux kernel that introduce a simple genetic algorithm used for automatic tuning. The patches update the anticipatory IO scheduler and the zaphod CPU scheduler to both use the new in-kernel library, theoretically allowing them to automatically tune themselves for the best possible performance for any given workload. Jake says, 'using these patches, there are small gains (1-3%) in Unixbench & SpecJBB. I am hoping a scheduler guru will be able to rework them to give higher gains.'"
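For readers unfamiliar with the technique, the overall shape of such a self-tuner looks roughly like the following toy sketch (Python, not the actual in-kernel C; the parameter names, ranges, and fitness stub are invented for illustration and are not taken from Moilanen's patches):

```python
import random

# Hypothetical tunables -- names and ranges are invented for illustration.
PARAM_RANGES = {
    "read_expire_ms":  (10, 500),
    "write_expire_ms": (10, 1000),
    "time_slice_ms":   (1, 200),
}

def random_individual():
    """One candidate tuning: a value chosen uniformly from each range."""
    return {k: random.randint(lo, hi) for k, (lo, hi) in PARAM_RANGES.items()}

def measure_fitness(params):
    """Stand-in for running the workload under these settings and recording
    a throughput number (higher is better). In the real patches this would
    be an in-kernel performance metric; here it is faked so the sketch runs."""
    return sum(params.values()) % 997

def breed(parent_a, parent_b, mutation_rate=0.1):
    """Uniform crossover of two parents plus occasional mutation."""
    child = {}
    for key, (lo, hi) in PARAM_RANGES.items():
        child[key] = random.choice((parent_a[key], parent_b[key]))
        if random.random() < mutation_rate:
            child[key] = random.randint(lo, hi)
    return child

def run(generations=50, pop_size=8):
    population = [random_individual() for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(population, key=measure_fitness, reverse=True)
        best_half = scored[: pop_size // 2]
        # Replace the worst half with children of the best half,
        # as described in the patch announcement.
        children = [breed(random.choice(best_half), random.choice(best_half))
                    for _ in range(pop_size - len(best_half))]
        population = best_half + children
    return max(population, key=measure_fitness)

if __name__ == "__main__":
    print(run())
```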
  • by Anonymous Coward on Saturday January 08, 2005 @08:09AM (#11296207)
    A common criticism of Open Source is the accusation that it lacks innovation.

    I mean, come on. Just look at this. Amazing.

    And even if it turns out to be not that good, it was still a good read :-)
    • have been doing this kind of logic for a while. Itanium is almost built entirely on this type of logic with a combination of Intel compilers and code technology.

      I think a hardware design that does more logic controls would be best...
    • >> [...]theoretically allowing them to automatically tune themselves for the best possible performance for any given workload.

      > A common criticism of Open Source is the accusation that it lacks innovation.

      > I mean, common. Just look at this. Amazing.


      Well, not to denigrate the efforts of the people working on the Linux kernel, but Java has had HotSpot and other just-in-time compilation tricks since 2001, and I'm sure there were others before it.

      Java Hotspot does not use the same technology of c
  • Complexity? (Score:5, Insightful)

    by BurntNickel ( 841511 ) on Saturday January 08, 2005 @08:20AM (#11296244)
    So how much additional complexity is added for a 1-3% performance improvement? I'm all for more speed, but keeping things simple can often be more important when it comes to maintainability and adding additional features.
    • Re:Complexity? (Score:3, Insightful)

      by bersl2 ( 689221 )
      It's only one guy (I think; didn't RTFA), and it's nowhere near being included in the mainline kernel. Your observation may be correct in general, but what's the problem here? If an experiment pans out, the internals will be changed down the line to incorporate the new idea; if this idea were to have yielded a 5-10% increase in performance, would you have made that comment?
    • I hate to sound like Victor Meldrew, but I seriously dislike the way 2.6 is going. It should have settled by now, but instead you see major changes to the scheduler and network drivers in every changelog. The only thing being left alone for now is the VM (thank god for that).

      While I have to admit that none of them is anywhere near as scary as some of the stuff that happened to 2.4 around 2.4.7-2.4.13 when the VM got changed, it is still not right.

      The supposedly ready and released kernel 2.6 by now from min
      • Re:Complexity? (Score:3, Insightful)

        by m50d ( 797211 )
        Like I've said before, kernel 2.6 is simply not stable yet. Wait until they fork off 2.7, then with luck it will settle down.
        • It is stable. And if you check out the new development model, you can see there will not be a 2.7 anytime soon.
          • Then why does Debian still include kernel 2.4?

            2.6 is still beta.

            BSD looks a lot better.
          • It's not stable. When it runs on my system, and everyone's system, without crashing more than, say, hourly, then it's stable. 2.6 is nowhere near that.
            • It's not stable.
              What you meant to say is "it's not stable for me". I'm having no problems with it.
              • No. That's why I mentioned "everyone's system". A program which is stable on some of the systems it runs on is not a stable program. Doubly so for something as critical as a kernel.
                • No. That's why I mentioned "everyone's system". A program which is stable on some of the systems it runs on is not a stable program. Doubly so for something as critical as a kernel.
                  Unless the problem is your system: i.e. hardware.
    • Re:Complexity? (Score:2, Insightful)

      by devillion ( 831115 )
      Understanding (the basics of) GAs is easy and so is implementing them. They also work quite well. That's why they are so popular.

      IMHO, if the GA implementation can be made really reliable, it could maybe replace other code which may be (I don't know) even more complicated.

      • Re:Complexity? (Score:5, Informative)

        by grammar fascist ( 239789 ) on Saturday January 08, 2005 @02:58PM (#11298652) Homepage
        Understanding (the basics of) GAs is easy and so is implementing them.

        I'm a machine learning researcher, and I'll second this. Also, the code for it will be quite self-contained (if done right), and shouldn't affect any parts of the kernel except where it's called to run an iteration.

        They also work quite well. That's why they are so popular.

        Sure they do. For a lot of problems, though, they're not so hot compared to other optimization methods like hill climbing and empirical gradient descent - GAs tend to run slowly - and I would like to ask Mr. Moilanen why he didn't use one of those. GAs are especially good with nominal parameters (discrete, unordered), but I would expect tuning parameters to be ordered - either discrete or continuous.

        GAs are theoretically capable of finding global optima, except that's kind of a red herring. The only reason you can prove that is that, theoretically, no matter how small the probability, you'll eventually get exactly the right sequence of mutations to produce a global optimum. In practice, they tend to produce local optima like most other optimization algorithms - especially as Moilanen describes it:

        All of the tunings are then ordered from best performance to worst, and the worst half of the performers are replaced with new possible tunings derived from the best half of the performers.

        You generally have to be a little less selective (more random) than this.
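        For the curious, the difference between the strict keep-the-best-half rule and a less selective scheme such as tournament selection looks roughly like this (a toy sketch; the data and parameter names are made up, nothing here is from the patches):

```python
import random

def truncation_select(scored, k):
    """Strict 'keep the best half' selection, as the patches describe:
    deterministic, and prone to premature convergence."""
    return [ind for ind, _ in sorted(scored, key=lambda p: p[1], reverse=True)[:k]]

def tournament_select(scored, k, tournament_size=3):
    """Softer selection: each survivor is the winner of a small random
    tournament, so weaker individuals occasionally slip through and
    keep the population diverse."""
    survivors = []
    for _ in range(k):
        contenders = random.sample(scored, tournament_size)
        survivors.append(max(contenders, key=lambda p: p[1])[0])
    return survivors

# scored is a list of (individual, fitness) pairs, e.g.:
scored = [({"slice": s}, random.random()) for s in range(10)]
print(truncation_select(scored, 5))
print(tournament_select(scored, 5))
```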
    • don't worry, the new Kernel also comes on DVD.
    • Re:Complexity? (Score:2, Insightful)

      by gnuLNX ( 410742 )
      GAs are pretty trivial to code, so very little complexity is added.
  • by Illserve ( 56215 ) on Saturday January 08, 2005 @08:21AM (#11296248)
    If a parameter space is complex enough that you can use a genetic algorithm to tune it, the solutions it finds may have all sorts of unexpected potholes, bugs, etc.

    In other words, non-competitive genetic algorithms are only as smart as the fitness function you give them. If your fitness criteria aren't smart enough to cover all the bases, your solutions will have unpredictable deficiencies.
    • by Corpus_Callosum ( 617295 ) on Saturday January 08, 2005 @08:56AM (#11296365) Homepage
      First thing: A GA is only truly effective if you let it exhaustively search the search space - which is why GAs are run against simulations rather than in operational systems. Imagine trying to tune a kernel at runtime by occasionally switching to random tuning parameters. I think this is extremely non-optimal. Of course, if most of the heavy lifting is done beforehand and the GA is simply examining pre-defined parameter sets on your machine, it could work. But it's not really much of a GA anymore.

      As an alternative, perhaps using some form of pseudo-GA that tries to find pre-tuned parameters that most closely match your operating environment, and then letting a hill-climbing algorithm refine them, would be a better solution.

      Hill climbing can also be used in a GA-type manner by letting the GA determine which parameters to climb and in what order. The climbing itself is pretty straightforward: apply step vectors to individual parameters; if the result is worse, reverse the vectors or switch to new parameters. Rinse, repeat.

      Yes, GA can produce odd bugs and potholes. Yes, it is the fitness test that determines if that will be true. But a good GA will generally find solutions that are as good or better than hand tuning for search spaces that are very complex. Overall, this is a good idea but is probably more complex than advertised.
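      Roughly, the climb-with-reversal described above looks like this (a toy Python sketch; the parameter names and the evaluate() stub are invented, not taken from the patches):

```python
import random

def hill_climb(params, evaluate, steps=200):
    """Greedy climb: nudge one parameter at a time by a small step vector;
    keep the change if the score improves, otherwise reverse direction."""
    best_score = evaluate(params)
    directions = {k: random.choice((-1, 1)) for k in params}
    for _ in range(steps):
        key = random.choice(list(params))
        candidate = dict(params)
        candidate[key] += directions[key]
        score = evaluate(candidate)
        if score > best_score:
            params, best_score = candidate, score
        else:
            # Worse result: reverse the step vector for this parameter.
            directions[key] = -directions[key]
    return params, best_score

# Placeholder fitness: peaks at read_expire=120, slice=40 (made-up numbers).
evaluate = lambda p: -((p["read_expire"] - 120) ** 2 + (p["slice"] - 40) ** 2)
print(hill_climb({"read_expire": 50, "slice": 10}, evaluate))
```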
      • by Illserve ( 56215 ) on Saturday January 08, 2005 @12:53PM (#11297695)
        First thing: A GA is only truly effective if you let it exhaustively search the search space

        If you have the resources to exhaustively search the space... you don't need a GA.

        A GA is generally used when the search space is hopelessly huge and you need to chart a course from A to(wards) B but you don't know the way.

        And in finding this solution, which is "grown", not engineered, it's much easier for unintended weirdnesses to creep in. A GA might solve problem X by ruining performance on problem Y, something that you, as a software engineer, would never even consider viable, and hence you forgot to force the fitness function to check problem Y along with problem X.

        • If you have the resources to exhaustively search the space... you don't need a GA.

          Ahh, semantics. "Allow the GA to exhaustively search vs. The GA searches exhaustively"

          Of course the GA will not do an exhaustive search, but it must have the ability to do so (e.g. explore all the really bad solutions as well as the good ones - the solution space should not be artificially constrained by eliminating random solutions from testing).

          It is pretty obvious that I was trying to say this if you actually look
          • Nothing about properly implemented GA's can be considered "exhaustive". They are, by definition, a means of effectively navigating a hopeless search space. For example, it is impossible for a GA to "explore all the really bad solutions", as there are far too many in any interesting problem space.

    • Mr. Illserve, you imply that problems best solved by genetic algorithms may well be full of potholes and bugs. Your comment is overrated FUD; you are trying to scare unwary slashdotters away from genetic algorithms!

      First of all, the possible parameters that the genetic scheduler can affect can all be changed safely. The genetic scheduler does not try to invent new scheduling algorithms; rather, it tests different parameter sets and existing algorithms to find which works best with the current workload.

      Yo
      • Admittedly, I don't understand this problem space very well. My point was just that I am wary of GA's in any situation in which reliability is a concern. But it could be that in this case the worst a GA can do is come up with a solution that is 10% slower for some unexpected type of problem.

        There's just this sense among lay people though that GA's are some kind of magical cure-all, and I think it's because this type of search is a bit of a black box. Put something in, turn the crank, and wait for a solu
  • not a panacea (Score:3, Interesting)

    by DrSkwid ( 118965 ) on Saturday January 08, 2005 @08:22AM (#11296255) Journal

    They might converge on a point of attraction that is not the highest possible.

    Sure, the only way is to exhaustively search the "chromosome" space for every possible combination, and computers are good at brute force!

    • Simulated Annealing (Score:3, Interesting)

      by Mark_MF-WN ( 678030 )
      That's where 'Simulated Annealing' comes in. It can often escape local maxima that aren't the global optimum.
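      The key trick is the acceptance rule: a worse move is accepted with a probability that shrinks as the "temperature" cools. A toy sketch (the one-dimensional score function here is made up):

```python
import math
import random

def simulated_annealing(score, x=0.0, temp=10.0, cooling=0.99, steps=2000):
    """Accept worse moves with probability exp(delta / temperature), so the
    search can step out of a local maximum while the temperature is high."""
    best = x
    for _ in range(steps):
        candidate = x + random.uniform(-1.0, 1.0)
        delta = score(candidate) - score(x)
        if delta > 0 or random.random() < math.exp(delta / temp):
            x = candidate
            if score(x) > score(best):
                best = x
        temp *= cooling   # cool down: become greedier over time
    return best

# Made-up bumpy objective with several local maxima.
score = lambda v: math.sin(v) + 0.1 * math.sin(10 * v) - 0.01 * v * v
print(simulated_annealing(score))
```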
    • They might converge on a point of attraction that is not the highest possible.

      Mutation helps with convergence issues.

      Sure, the only way is to exhaustively search the "chromosome" space for every possible combination, and computers are good at brute force!

      Um, I don't even know how many adjustable parameters are involved, but it does not take that many to make exhaustive search useless, even for a computer.

    • > Sure, the only way is to exhaustively search the "chromosome" space for every possible combination, and computers are good at brute force!

      Problem is, the search space grows exponentially with the size of the chromosome. No problem if you have a short chromosome, but brute force is intractable for long chromosomes.

      For example, if you are trying to optimize a set of 100 binary parameters for some problem, brute force requires you to evaluate 2^100 combinations.
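      Back-of-envelope, assuming a (made-up but typical) GA budget of a population of 200 evolved for 1000 generations:

```python
# Brute force over 100 binary parameters vs. a typical GA evaluation budget.
brute_force = 2 ** 100            # every combination
ga_budget = 200 * 1000            # e.g. population of 200 for 1000 generations
print(f"{brute_force:.3e} evaluations vs {ga_budget:,}")
# ~1.268e+30 evaluations vs 200,000
```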

  • by Feint ( 135854 ) on Saturday January 08, 2005 @08:24AM (#11296258) Homepage
    Could this be extended to include other kernel parameters as well? Depending on your app, things like TCP timeouts and other muck can have a large impact. Tuning this stuff is currently somewhat of a black art. Then as the user community of the app becomes familiar after rollout, a lot of the usage patterns change. In a few cases, this means we end up having to re-tune the kernel.

    If this package could be extended to the other parameters, it would save my customers a *lot* of time and money.

    If nothing else, this could be a deciding factor for some of our clients to use linux instead of windows.
    • Now you're talking. Adaptive tuning is definitely the future. While I disagree that a general-purpose GA is useful here (you should not let a GA hit an operational system; you need to let it hit a simulation first to build up a certain amount of fitness in its solution space), many adaptive techniques would be useful and could eliminate the need to hand-tune in many environments.
    • That is a great idea. Now here is a dumb one:

      What about adding hooks for applications to send/receive performance changes after tweaks? Services, daemons, etc., need to communicate how the GA's latest tweak adjusted performance, right?
  • So.... (Score:2, Funny)

    by mstefanus ( 705346 )
    So does this mean that if I install this kernel on my computer, I will have baby Pentiums or baby Athlons soon?
    • If you nurture your computer, and occasionally sit it next to another computer then maybe, just maybe, when you wake up one morning, you will have little PDAs running around the place :)
  • Not worth it... (Score:2, Insightful)

    by mikelang ( 674146 )
    A 1-2% gain is within the margin of statistical error. Definitely not worth the increased complexity of the solution.
    • Re:Not worth it... (Score:5, Insightful)

      by Corfe ( 246685 ) on Saturday January 08, 2005 @09:03AM (#11296386)
      It's a unique idea - what's wrong with running it for a while with your typical load (say, for a fileserver), finding some better-than-average parameters for the kernel, then running an unpatched kernel with those parameters manually entered?

      What is "on the borders of statistical error" depends on how many times the test was run, and how much variation there had been in his results before. I think it's pretty safe to assume that if he knows how to implement a genetic algorithm into the linux kernel, he knows how to handle statistics properly.
    • Slightly disagree - I think it is worthy of evaluation... probably a better place for this is the 2.7 branch.
    • ..which would be why he's asking for scheduler gurus to work on it, no?

      being mostly just a proof of concept at this stage.
    • Re:Not worth it... (Score:3, Insightful)

      by xenocide2 ( 231786 )
      Tom's Hardware consistently favors computer hardware that only pushes above the competition by 1 percent or less. People spend an extra 40 dollars for this performance, and you're not willing to consider that people might like a FREE performance boost of a percent?
    • Re:Not worth it... (Score:3, Interesting)

      by jelle ( 14827 )
      How do you know the margin of error? I've seen systems/measurements where 50% difference is a statistical error, and systems where it needs to be less than 0.2% to be a statistical error.

      Pragmatism and statistics are _not_ a good mix.

      Note that, for example, many hosting providers host hundreds of web sites per system. Adding a couple of percent in performance then adds a couple of percent to the bottom line of the cost picture for those companies. The same is true for supercomputer clusters used by many c
  • http://tinyurl.com/6pkzc
    "Daystrom felt that such an act was an offense against the laws of God and man, and the computer that carried his engrams also believed it."
    --Kirk
  • by Qbertino ( 265505 ) <moiraNO@SPAMmodparlor.com> on Saturday January 08, 2005 @08:52AM (#11296350)
    Did SCO allow him to modify their kernel?
  • by City Jim 3000 ( 726294 ) on Saturday January 08, 2005 @09:07AM (#11296402)
    Would it be possible to apply a genetic algorithm to a packet scheduler? IMO the packet schedulers available today need too much manual tweaking.
  • by Corpus_Callosum ( 617295 ) on Saturday January 08, 2005 @09:23AM (#11296448) Homepage
    The main problem with this or any other adaptive tuning mechanism is actually acquiring performance metrics.

    What is the system using to decide if a new parameter set is better than a previous? What is the fitness test?

    Some applications are disk-bound, others are CPU-bound, others are network-bound. The performance dance is often non-obvious (e.g. some applications will achieve generally higher performance by allowing I/O higher priority than context switching, while others that appear to perform in a similar manner will achieve higher performance by reversing that order).

    The kernel does not have any mechanism to determine whether a particular application is performing better or worse; it can only really get a gauge of throughput and load. While this MAY be enough to get small systemwide performance gains, in order to really achieve significant application-specific performance gains, I think that applications would need to explicitly add support for adaptive tuning by logging relevant performance metrics for the kernel to tune around.

    Thoughts?
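    For illustration, the kind of composite score a tuner might fall back on when it only has system-wide signals could look like this (the metric names and weights are entirely hypothetical, not anything from the patches):

```python
def composite_fitness(metrics, weights=None):
    """Combine coarse system-wide signals into one score when no
    application-level metric is available. Names and weights are
    illustrative only."""
    weights = weights or {"io_throughput": 0.4,
                          "context_switch_rate": -0.2,
                          "load_average": -0.4}
    return sum(weights.get(name, 0.0) * value for name, value in metrics.items())

# Example: two candidate tunings measured over the same interval.
print(composite_fitness({"io_throughput": 180.0, "context_switch_rate": 900.0,
                         "load_average": 2.1}))
print(composite_fitness({"io_throughput": 150.0, "context_switch_rate": 400.0,
                         "load_average": 1.4}))
```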
    • You are very right with this: The kernel does not have any mechanism to determine whether a particular application is performing better or worse; it can only really get a gauge of throughput and load.

      I thought long about how the "fitness" part of that GA might work. Ultimately, the only "fitness" the kernel can measure with high precision is "load", isn't it?

      Any better ideas?

      angel'o'sphere
        I thought long about how the "fitness" part of that GA might work. Ultimately, the only "fitness" the kernel can measure with high precision is "load", isn't it?

        Any better ideas?
        I suppose the only way would be to have some "tuning" hooks that applications could use to announce their own performance metrics. The GA could optimize around load when these metrics are not available and take them into account if they are.
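          One possible shape for such a hook, purely as a hypothetical sketch (the /proc path and file format are invented; nothing like this exists in the patches):

```python
import os

# Hypothetical interface: applications write their own performance metric
# (requests/sec, frames/sec, ...) to a per-process file, and the tuner
# prefers that over raw load when it is present. The path is invented.
METRIC_PATH = "/proc/{pid}/ga_metric"

def read_app_metric(pid):
    """Return the application's self-reported metric, or None to fall
    back on system load."""
    try:
        with open(METRIC_PATH.format(pid=pid)) as fh:
            return float(fh.read().strip())
    except (OSError, ValueError):
        return None

def fitness(pid):
    metric = read_app_metric(pid)
    if metric is not None:
        return metric
    # Fallback: one-minute load average, negated so lower load scores higher.
    return -os.getloadavg()[0]
```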
  • Oh no! (Score:2, Funny)

    by brainnolo ( 688900 )
    Does this mean Linux will be affected by genetic diseases sooner or later?
  • Monte Carlo simulations with Bayesian statistics can explore very large, otherwise intractable parameter spaces. Perhaps an alternative path?
  • by Earlybird ( 56426 ) <slashdotNO@SPAMpurefiction.net> on Saturday January 08, 2005 @09:33AM (#11296486) Homepage
    Because most people aren't intimately familiar with genetic algorithms, and because GAs are associated with machine learning/artificial intelligence, GAs are seen as somewhat mysterious and magical, and are therefore either accepted with "whoa, cool!" or rejected with "whoa, complex!" While GAs are indeed novel compared to many long-established algorithms, both mentalities are overreactions.

    In reality, the basic GA framework is "just" another efficient search algorithm, no cooler or more complex than, say, a hash table or a binary search tree; at its simplest, a GA is a way to find an optimal configuration of components without looking at all possible (potentially explosively exponential) combinations; instead, you look at just some permutations, and as you iterate through generations, applying breeding and mutation, you arrive at a generation which is statistically close to optimal.

    GAs are also in no way new or unproven technology; a nice example of mainstream use is PostgreSQL [postgresql.org]'s query planner, which uses GAs to optimize query plans.
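    At their simplest, the "breeding and mutation" operators mentioned above are just the textbook single-point crossover and bit-flip mutation (a generic sketch; nothing here is specific to the kernel patches):

```python
import random

def crossover(a, b):
    """Single-point crossover: splice two parent bitstrings at a random cut."""
    cut = random.randint(1, len(a) - 1)
    return a[:cut] + b[cut:]

def mutate(chromosome, rate=0.01):
    """Flip each bit with small probability to keep exploring."""
    return [bit ^ 1 if random.random() < rate else bit for bit in chromosome]

parents = [[random.randint(0, 1) for _ in range(16)] for _ in range(2)]
child = mutate(crossover(parents[0], parents[1]))
print(child)
```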

    • GAs are not typically efficient on their own... and while their solutions are usually better than the nominal case (with no optimization), those solutions don't always qualify as "optimal".

      quote:

      When to Use GAs

      GAs treat the optimization problem as a black box, and are therefore very flexible. Because GAs use very little problem structure, they will inevitably perform poorly relative to an algorithm which is designed for a specific problem type. GAs are therefore best suited to messy problems which do n
    • Actually, some flavors of GAs are fundamentally different from other types of search, particularly those with an open-ended problem space.

      But in the case of simple parameter-fitting problems (i.e. this one), you're right.

  • While performance gains of 1-3% in a well-defined set of tasks (in this case the benchmarks) are a marginal improvement well inside statistical error...

    this is a really interesting idea.

    Moving the genetic algorithm processing to another machine may be warranted. If you had a good idea of what you were going to be doing (heavy database work for instance), a dedicated machine could be used to find an optimal scheduling solution and then that could be implemented on the production machine.

    or maintain a list
    • Moving the genetic algorithm processing to another machine may be warranted. If you had a good idea of what you were going to be doing (heavy database work for instance), a dedicated machine could be used to find an optimal scheduling solution and then that could be implemented on the production machine.

      Ahh.. interesting idea..

      If I am running in a cluster environment, I could dedicate one or more machines in the cluster to evolve tuning parameters. That machine could publish "discoveries" to the oth
    • The genetic algorithm is supposed to create a situation where you no longer have to select a scheduler. What you are proposing is using it to find the best scheduler for the task. That can already be done: a scheduler can be designed to take advantage of the environment if the environment is known. This approach is best thrown at a mixed environment where you do not know the types of processes in advance.

      The implementation is hardly useful as it stands and needs improvement on the scheduler end, as is said. The schedulers that it can u
    • There's a severe flaw with this methodology though.

      The two machines are different (even if their specs are theoretically the same). Minor differences in characteristics can result in highly sub-optimal results if you move the determined parameters from one machine to another.

      To obtain *useful* parameter sets, you will need to do it on the machine in question. Your second option points towards this.
  • I can see a kernel patch to export some extra information and/or extra tuning hooks via proc or sysfs, but IMHO the algorithms themselves should be outside the kernel, running in a daemon.
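    A deliberately dumb skeleton of such a daemon, using random search rather than a full GA just to show the structure (the sysctl knobs shown are real, but everything else here is a hypothetical sketch, and writing them requires root):

```python
import random
import time

# Real tunables used only as examples of knobs a userspace tuner could turn;
# the daemon itself is a hypothetical sketch, not part of Moilanen's patches.
TUNABLES = {
    "/proc/sys/vm/swappiness": (0, 100),
    "/proc/sys/net/ipv4/tcp_fin_timeout": (5, 120),
}

def write_tunable(path, value):
    with open(path, "w") as fh:
        fh.write(str(value))

def measure():
    """Crude fitness: negated one-minute load average read from /proc."""
    with open("/proc/loadavg") as fh:
        return -float(fh.read().split()[0])

def tuning_daemon(interval=60):
    best = {p: random.randint(lo, hi) for p, (lo, hi) in TUNABLES.items()}
    best_score = float("-inf")
    while True:
        candidate = {p: random.randint(lo, hi) for p, (lo, hi) in TUNABLES.items()}
        for path, value in candidate.items():
            write_tunable(path, value)
        time.sleep(interval)      # let the workload run under the new settings
        score = measure()
        if score > best_score:
            best, best_score = candidate, score
        else:
            for path, value in best.items():   # revert to the best known settings
                write_tunable(path, value)

# tuning_daemon()  # loops forever; needs root to write /proc/sys
```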
  • Having worked with a variety of optimization algorithms, I wonder: would it make sense to consider the optimization itself as the 'innovation', with the actual algorithm used being secondary? Perhaps this can be optimized further with other algorithms, or by using the most appropriate one for different cases...?
  • Just compiled this stuff on an old test box; it has now been running for about 100 generations. At first it felt very slow - OK, it's a Pentium 2 ;-) - but it was much slower than running vanilla 2.6.9 or 2.6.10, for example.

    But it is getting much better now; I don't know how many generations will be needed to get things right. It feels pretty much the same as with the vanilla kernels - let's see where this leads...

    Anyone else with experience? AFAIK this thingy can only be tweaked by editing the code and
