Linux Software

Rik van Riel on Kernels, VMs, and Linux

Andrea Scrimieri writes "An interesting interview with Rik van Riel, the kernel developer, in which he talks about the Linux VM, particularly about his own implementation (which was recently adopted in Alan Cox's tree), with some controversy towards Linus Torvalds."

  • Minor nit... (Score:3, Informative)

    by FauxPasIII ( 75900 ) on Tuesday January 15, 2002 @10:30AM (#2842100)
    >> (which was recently adopted in Alan Cox's tree).

    As I understand it, the Rik VM is what we started the 2.4 series with.
    The Andrea VM was adopted in 2.4.10 amidst much controversy, and Alan has kept
    the Rik VM in the -ac kernels.
    • Re:Minor nit... (Score:2, Informative)

      by FauxPasIII ( 75900 )
      Hate to follow up to myself, but I went and reread some old stuff; looks like Alan has actually moved -from- the Rik VM now, and is using the same Andrea VM that's in the Linus kernel.
      • Re:Minor nit... (Score:5, Informative)

        by Rik van Riel ( 4968 ) on Tuesday January 15, 2002 @10:41AM (#2842162) Homepage
        Both Alan's and Michael's kernels include my -rmap VM [surriel.com] now.

        This is quite interesting since I haven't begun tuning -rmap for speed yet ;)

        • Re:Minor nit... (Score:3, Redundant)

          by FauxPasIII ( 75900 )
          Hrm... how impossible/practical would it be to have
          multiple VMs in the same source tree and select one at
          menuconfig time? It would probably lead to a lot
          more testing of all the non-Linus-kernel VMs, but
          I have a hunch that the VM is a pretty pervasive patch;
          could it really be localized down to just an option in
          menuconfig?
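          A rough sketch of what such a compile-time switch could look like, purely for illustration (CONFIG_VM_RMAP, CONFIG_VM_AA and the header names are invented here, not real kernel symbols):

          /* Hypothetical: pick one VM implementation at menuconfig time. */
          #if defined(CONFIG_VM_RMAP)
          #include "vm_rmap.h"     /* Rik's reverse-mapping VM */
          #elif defined(CONFIG_VM_AA)
          #include "vm_aa.h"       /* Andrea's VM */
          #else
          #error "No VM implementation selected"
          #endif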
  • Wondering... (Score:5, Interesting)

    by prisoner-of-enigma ( 535770 ) on Tuesday January 15, 2002 @10:32AM (#2842105) Homepage
    I'm wondering why both VMs can't be included in a distro, allowing the end user to select the one he/she wishes to compile into the kernel. Are the two implementations THAT mutually exclusive?

    BTW, this kind of bashing between the high priests of Linux is not good. You can bet your bottom dollar that MS is going to use this conflict to fuel their propaganda machine, saying Linux is a fractious OS run by a bunch of young upstarts who can't agree on anything.
    • Re:Wondering... (Score:5, Insightful)

      by reaper20 ( 23396 ) on Tuesday January 15, 2002 @10:43AM (#2842174) Homepage
      I think this kind of infighting is great, as long as it doesn't get out of control.

      We get to see the arguments and competing implementations of kernel subsystems - this is one of the most underrated benefits of open source. Users of some other OSes don't have this benefit. I am not a programmer, so I don't really understand or care about the benefits of different VM systems, but I know that some other, smarter people do, and they're all trying to figure out the best way to do it, and that's good enough for me.

      I say let them go at it, let the best code win - it can only help us. And who cares how MS construes this, I'd like to see them open up their development model and see what kind of conflicts they are having.

      OT - but kerneltrap.org has a good interview with Alan Cox today....
      • That's all well and good, assuming that these people don't let their egos get in the way of their good judgement.

        And let's be honest - while they may be coding gods, and extremely intelligent, they are only human. And humans don't like being wrong.

        There is also the fact that in wanting to argue to support their own code, these *very* intelligent people may become blind to their own faults, or the faults of their work.

        Infighting isn't necessarily good. It's certainly not good when it gets personal. Active discussion and intelligent, coherent arguments are fine; infighting that involves calling each other names, rather than focusing on the problems/issues at hand, is not.

        Also, when you look at non-open OS development, you have to remember that just because we, as users, don't see arguments/discussions about the best/easiest/most effective way to do something doesn't mean that those arguments/discussions don't occur. Or that non-open OS developers aren't also intelligent people who are also trying to find the best way to tackle a problem/issue...
    • Re:Wondering... (Score:5, Insightful)

      by kaisyain ( 15013 ) on Tuesday January 15, 2002 @10:44AM (#2842177)
      this kind of bashing between the high priests of Linux is not good

      Sure it is. How else are we going to find out where our disagreements are and work through them? Or, at the very least, learn not to make the same mistakes in future projects. The problem of the Linus bottleneck has been known for a long time. This "bashing" is not new, it's just current.

      Having Linus Torvalds around helps ensure that, for the average user, there is no splintering of development effort -- just use the Linus kernel. But it also severely hinders improvement because you are limited to what Linus likes or dislikes.

      And despite what may be the common conception on /., Linus is not an all-knowing genius. He makes mistakes. Perhaps this is one of those mistakes. The real question is whether the benefits of the stewardship he provides compensate for the hindrances his authoritarianism creates.
      • I was referring to the extremely public and somewhat "pointed" nature of how the debate is being carried on as being detrimental to the Linux effort.

        I'm all for debate and public discussion, but some of the comments I've seen flying around sound more like teenage namecalling than professional developers with a disagreement. Linus isn't infallible, and neither is Alan or Rik.

        It sounds like these two VM's are aimed at two different problems, each addressing its own problem to the detriment of the other. I'm going to go back to my original statement of "let the user choose". Maybe I want the extra speed in a uniprocessor environment. Maybe I need the extra scalability in a large multiprocessor environment. I should be able to choose based upon my needs, not upon what Alan, Linus, or Rik thinks is best. After all, choice is what Linux is all about.
      • Re:Wondering... (Score:2, Interesting)

        by sboss ( 13167 )
        Arguing and debating over the VM or other technical issues is good. But when the parties begin bashing each other (not the technical issues but the people), then we have problems. I do consulting on very large projects with many different players (aka companies) involved. Whenever the discussions/debates/arguments move from issues/technical problems to people/companies, that is when the whole process breaks down. I am glad that we (as a community) have the ability to debate/discuss the technical issues. I love that. It works great. But let's not lose perspective of what we are here to do (write good software that beats the commercial stuff) and start trashing each other.

        Just my point of view.
    • Re:Wondering... (Score:2, Insightful)

      by Fly ( 18255 )
      As I hear it, there's plenty of fighting over ideas and direction within Microsoft as well. The open software environment just allows the discussions to be public. Microsoft can try to use this as "fuel for their propaganda machine," but I doubt they'd get very far.

      Rarely does one person have a monopoly on good ideas, so it's inevitable that finding the right solution will come from competition between more than one good idea. So long as progress continues, as it has, this is healthy. Both Mr. Torvalds and Mr. van Riel recognize that they are "stubborn," and both seem to deal with the heat fairly well.

    • Re:Wondering... (Score:1, Flamebait)

      by garcia ( 6573 )
      Yup, but at least we don't have a single moron running the show telling everyone else what to do. This way at least there is some internal competition. I suppose in the short term dictatorship works best, but in the long run free ideas work best (Hitler/Stalin rapid industrialization vs. the Free World).

      I don't particularly like the comments made by this kid but I don't think he is entirely wrong for doing so.
    • Apparently the two implementations are or at least were that mutually exclusive.
      I don't have the link (sorry) but there was talk on the kernel mailing list that it would be a big ball of spaghetti to include them both.
      So Linus made his choice.

      I disagree about the arguing. It would be nice if they were a little more civil about it. But hey, these guys are some of the best in the world at what they do. You know they've all got egos too. And those egos will sometimes cause people to work extra hard to prove the other wrong. That is good for the user. And the fact that a lot of great work happens at a fast pace in spite of the arguing and ego plays is a tribute to the Open Source development strategy.
    • I'm wondering why both VMs can't be included in a distro, allowing the end user to select the one he/she wishes to compile into the kernel. Are the two implementations THAT mutually exclusive?

      The issue appears to come down to simplicity of installation. Yes, we could ship both, but one would need to be the default, or we'd make all users recompile the kernel. That would sound really bad from a PR perspective.

      BTW, this kind of bashing between the high priests of Linux is not good. You can bet your bottom dollar that MS is going to use this conflict to fuel their propaganda machine, saying Linux is a fractious OS run by a bunch of young upstarts who can't agree on anything.

      This issue has to do with the public nature of open source. I am certain MS has people on the mailing lists (as does every competitor), so the message would get out. With billions to spend on advertising, even a non-issue can be made into one.

      I hope people are able to understand that disagreement, consensus and resolution are part of what keeps OSS healthy. We do not have a single point of failure in leadership and that is a very good thing. If we all blindly followed Linus, we would sound like zealots and the competition would focus on that.
      • Yes, we could ship both, but one would need to be the default, or we'd make all users recompile the kernel. That would sound really bad from a PR perspective.

        Eh? I fail to see why you couldn't have two different precompiled kernels.

        C//
    • Re:Wondering... (Score:4, Insightful)

      by Nelson ( 1275 ) on Tuesday January 15, 2002 @11:20AM (#2842383)
      I'm wondering why both VMs can't be included in a distro, allowing the end user to select the one he/she wishes to compile into the kernel. Are the two implementations THAT mutually exclusive?



      I've been wondering that myself. It's just work to do, but it shouldn't be rocket science. Of course, a number of people are concerned about the code size as it is; you can't just add a new branch of code every time there is a conflict and pack them all together. This does seem like a reasonably good case for it, though.


      BTW, this kind of bashing between the high priests of Linux is not good. You can bet your bottom dollar that MS is going to use this conflict to fuel their propaganda machine, saying Linux is a fractious OS run by a bunch of young upstarts who can't agree on anything.


      No, this is a good thing. Most of the Linux kernel hackers have egos the size of small countries, and that's a good thing because they take pride in their work. Most of them also work as professionally and egolessly as I have ever seen. They can get in fights and then deal with it and keep working. This is far better than the closed world, where people get in fights, take it personally, and then try to react in some way. I've seen projects where people were trying to make the project fail to get revenge on someone on the team, or on management, for something stupid they did in the past. In the Linux world people get in fights and everyone can see it, and they react accordingly: sometimes being told that you're being an ass is a good thing when you're being an ass, sometimes having people stop talking to you for a few days is a good thing, and sometimes people apologize and that's a good thing. It's not behind closed doors, though, and it's hard to undermine things when it's all in the open. There is also some insanely good discussion on certain things at times. And there is something to be said for ideas, and the strength they have, when you can defend them in public.

    • BTW, this kind of bashing between the high priests of Linux is not good.



      OTOH, it's had positive results. On one side, we have a VM that's being optimized for server use. On the other, a VM that's optimized more for desktop performance.



      I'd rather have two VMs than some big, complex piece of code that tries to allow the user to tune the kernel while running...nothing like making a more complex piece of software. As far as I'm concerned, more code complexity == more chances for bugs. ;-)

    • I'm wondering why both VMs can't be included in a distro, allowing the end user to select the one he/she wishes to compile into the kernel. Are the two implementations THAT mutually exclusive?

      It's not so much "mutually exclusive," it's more like "they both rewrite the same chunks of code." Maybe I'm splitting hairs there. AFAICT, the amount of common code between the two isn't enough to make this worth it.

      The thing is, the kernel hackers aren't concerned about, say, code size, as much as they're worried about readability and maintainability. The number of #ifdefs scattered throughout the VM code would be incredible, the resulting code would look like Your Favorite Form of Pasta[tm], and fixing bugs would be difficult.

      There are other ways to do it besides #ifdef, of course, but they all detract from maintainability. And it all becomes vastly difficult to scale as soon as a third VM implementation comes along...
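      One of those "other ways" would be an operations table chosen at compile or boot time; a rough, purely illustrative sketch (none of these struct or function names exist in the real kernel):

      #include <string.h>   /* strcmp; a real kernel would use <linux/string.h> */

      struct page;          /* opaque here; the kernel's page descriptor */

      /* Illustrative only: a function-pointer table so two VMs could
       * coexist in one tree without #ifdefs in every file. */
      struct vm_ops {
              const char *name;
              void (*init)(void);
              struct page *(*alloc_page)(unsigned int gfp_mask);
              void (*free_page)(struct page *page);
              int (*try_to_free_pages)(unsigned int gfp_mask);
      };

      extern struct vm_ops rmap_vm_ops;  /* Rik's VM, defined elsewhere */
      extern struct vm_ops aa_vm_ops;    /* Andrea's VM, defined elsewhere */

      static struct vm_ops *current_vm = &aa_vm_ops;  /* the default */

      void select_vm(const char *name)   /* e.g. from a boot parameter */
      {
              if (strcmp(name, "rmap") == 0)
                      current_vm = &rmap_vm_ops;
              current_vm->init();
      }

      The cost, as the parent says, is one more indirection on every VM entry point, and two implementations that both still have to track the rest of the kernel.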

  • by MattRog ( 527508 ) on Tuesday January 15, 2002 @10:32AM (#2842108)
    I believe that the trend is to optimize Linux for the very powerful machines (multiprocessor and with a lot of RAM). Do you agree with me?

    No, not at all. The embedded Linux market seems to be much more active than development of Linux on high end servers. On the other hand, the high end server improvements tend to touch more code in the core kernel while a lot of the embedded systems people just keep separate patches and have no plans of merging their code with Linus.


    I'm not so sure I agree with him -- if you want to make a dent in the market share of Solaris and NT/2000/XP you have to keep up with their innovations (async I/O, better SMP, etc.). Since Linux is our OS of choice for our database and web servers, I am feeling a lot of pressure to switch to Solaris because of its better handling of higher-load environments (OLTP databases, web servers, etc.). If Solaris weren't so damn expensive we'd probably be using SunFire 280s. So I'm pleading for Linux to keep up with the big dogs, so that I can be reassured it has what it takes (it's handling things fine now, but as he said in the article, everyone needs more RAM, CPU, etc.).
    • by Rik van Riel ( 4968 ) on Tuesday January 15, 2002 @10:54AM (#2842231) Homepage
      Indeed, it is important to optimise the VM to work right on such large machines. I guess what I wanted to say is that the VM isn't just optimised for high-end machines, but also for machines on the low end.

      To be honest though, optimising for machines of different sizes really is a no-brainer compared to having to make the VM work with really diverse workloads ;)

        I appreciate your reply, Rik. Would it make more sense, then, to perhaps have a 'database'-oriented VM (or kernel), a 'web server' one, an 'embedded device' one, etc.? Granted, such levels of specialization aren't always that clear cut, but I know of many a situation in which each machine has a well-defined primary role (obviously your web box can also be your POP3 box). I know I try to keep the web boxes separate from the DB machines (since they are two wildly different methods of spitting out data to the user) and so on and so forth.
  • I Smell a Fork (Score:1, Interesting)

    by TRoLLaXoR ( 181585 )
    This talk of "Alan's tree" and "Linus's tree" is kind of foreboding. A de facto fork has already taken place.

    What would Alan call his version of the kernel? His last name already ends with an "x", so... I dunno where that would leave us.

    Yeah, better off to just keep referring to them as "Alan's tree" and "Linus's tree."
  • by PastaAnta ( 513349 ) on Tuesday January 15, 2002 @10:41AM (#2842164)
    I think it was an excellent decision of Linus to remove Rik's VM from the mainline kernel. If not for technical reasons then for political reasons.

    Rik's VM obviously needed to be fixed and/or tuned, but apparently lacked the necessary attention from Rik. If Linus had not removed the VM, that would probably have remained the situation for a while. Instead we now have TWO VMs which are rather stable, and Rik working at full speed to make his VM the best.

    Competition is good! Which VM will be the best for the future will be determined by Survival Of The Fittest(tm)

    It can be argued, though, that it was not the right time to do it during 2.4, but Andrea's VM seemed to stabilise rather quickly with the high level of attention to the problem. Sometimes it takes drastic measures to get results...
    • By Rik's comments in the interview one would think that Rik's patches lacked the necessary attention from Linus. I tend to believe Rik; a one-man revision control system on a project of that size has got to be damn lossy...

      • by PastaAnta ( 513349 ) on Tuesday January 15, 2002 @11:40AM (#2842554)
        I think you are right that Linus could probably have taken more patches from Rik than he did.

        On the other hand you could argue that too many patches are a sign of the VM not being stable enough. Therefore the VM should probably be matured in a separate tree (as Rik himself suggests) instead of flooding Linus with bugfixes and tweaking. Then, when the VM is stable and can be proven to perform better, I am sure even Linus can be persuaded.

        And yes, a one-man control system IS lossy, but that is not a bug but a feature - because it ensures consistency. In every project of this scale coordination is essential, and the individual developers MUST be more thorough with their work before committing it!!!
      • Rik's a really smart guy, but he isn't (or, rather, wasn't) so good at keeping the mainline kernel moving forward, despite his comment about Linus dropping patches (which is true). What he didn't mention is that he never resubmitted the patches. He tried once and then dropped them.

        I think Rik's VM will become really, really good as he maintains a branch for himself. When it's 95% of the way done, he can then work on merging it into the mainstream (i.e., the Linus kernel). Then we'll have a really kick-ass VM.

        But Rik wasn't working well with the established method for dealing with the Linus kernel. Linus then made the choice to go with a VM from someone who *did* know how to work with the Linus Kernel.

        It's not a technical issue, it's a maintenance issue.

        Read up on the kernel cousin stuff with Rik and Linus talking about this.

        Ciao!
      • one man revision control system on a project of that size has got to be damn lossy...

        And Rik's solution to this is to create a patchbot to keep submitting the patches to Linus. I hope he does not apply this concept to his OOM killer, or instead of killing processes when you run out of memory it will spawn several new processes. Linus is OOM, he is overloaded, so Rik decides he will put an even greater load on him.

    • After reading the interview, I agree with this. Rik says:

      You want me to answer that question in how many books ? ;) Well, lets make a short answer. Andrea's VM is an attempt to improve the performance of the Linux VM without modifying the structure of the VM. He seems to succeed at it very well, but due to the fact that he doesn't modify the structure of the VM his VM still has the same fundamental problems as the Linux VM. My VM is an attempt to attack some of the fundamental problems the Linux VM has, at the moment still without too much performance tuning.


      In other words, he's trying experimental ideas, while AA is improving on a stable system. Experimental development should not be done in the main kernel tree! I think once he has implemented his ideas and stabilized the development of his VM it might have better chances of getting back in. I think in the long run he actually has a better chance, because once he has something to show for all this, if his ideas are right, he should have a much better VM. Until then I agree with Linus' decision.
  • Interesting (Score:5, Insightful)

    by 4of12 ( 97621 ) on Tuesday January 15, 2002 @10:48AM (#2842197) Homepage Journal

    I have a lot of respect for Rik van Riel, but I think that Linus made a good decision to "cut bait" on his VM implementation for 2.4.

    It was not that Rik's ideas were bad, it was just that their complexity and implementation were going to take too long - they should have been hashed out in 2.3 instead of 2.4.10.

    I'm looking forward to having Rik prove his reverse mapping technology implementation in 2.5.

    May the best ideas ultimately win, and may the giants of the kernel not take offense at each other. It would be a real shame if something stupid like Linus' lossy source code control system put Rik off so much that the Linux community at large lost his wonderful contributions.

    Here's to hoping that Linus gets more sensitive in some cases, and that Rik gets less sensitive in some cases.

  • I saw a post on the Linux kernel newsgroups (you can search for it) about 2-3 weeks ago where Linus says something like "that's why I don't consider you a kernel developer" - he always seems to be whining about something. But hey, what do I know? I'm still trying to get xscreensaver to do a mozilla -remote openurl (some url) for a kiosk :)
    • See this link [zork.net] and scroll down a bit for this quote from Linus:

      "Which, btw, explains why I don't consider you a kernel maintainer, Rik, and I don't tend to apply any patches at all from you. It's just not worth my time to worry about people who aren't willing to sustain their patches."

      --xPhase
      • I don't want to get marked down or flamed for trolling or anything, because I am not:

        Thanks for posting this quote. The more I read into this stuff, the more it seems as though Linus has an immature attitude and often acts like a baby. Ignoring patches because you have a disagreement with someone is just plain immature. I can see how frustrated I would get working with Linus. He still acts like it is his little baby project, and that is just not the case anymore. Thousands of developers are working on it, and this kind of attitude by Linus would just turn people off from the project.

        These kinds of snide remarks by Linus are not needed, and if I were Rik, I would tell Linus to fuck off and put my talent to use somewhere else. Linus, act a little more mature.
        • Yes you are trolling! Especially since the original poster could not get the quote right.

          Linus wrote: "...Which, btw, explains why I don't consider you a kernel maintainer, Rik, ..."
          See for yourself! [theaimsgroup.com]

          The reason was that Rik didn't care about everybody else if his bugfixes were not applied. Would YOU like a maintainer that didn't care about the rest of the world?

          "I would tell Linus to fuck off" ... who is being immature now?
  • Need Help ? (Score:1, Insightful)

    by Anonymous Coward
    Maybe we need some help from *BSD VM hackers to solve this VM thing.

    It's been a year and it still has big holes.
  • Rik seems pretty hot on this idea, but I don't see how it could help much. I mean, won't repeated emails be ignored nearly as much as the initial submission? I recall from the interview with Marcelo that he did not plan to use either public CVS or a tracking system, but rather planned to keep things close to the chest as in the past. Perhaps this is the persona of a kernel-master, or perhaps openness and publicity lead to more interruptions, I dunno.

    Anyways, an enlightening, no-holds-barred interview. Enjoyable.

    • Yes, such a thing would be appreciated. It's all very well developing linux in order to "improve itself" but one can take the "stuff the users" approach too far.

      The problem is that there isn't a decent multi-patch versioning system out there: how would you tell CVS you wanted to store versions of files pertaining to 2.5.2-mjc and 2.4.13-ac1 and 2.4.18 and then a set of files for Rik's VM? Then how on earth would you pull out the set of files that constitutes a `linus+mjc' tree, or a `linus+ac' tree, from what you've stored?
        cvs co -r 2.5.2
        cvs tag -b 2.5.2-mjc
        cvs update -r 2.5.2-mjc   # move the checkout onto the new branch
        # apply patch mjc-1
        cvs commit
        cvs tag mjc-1
        # elsewhere/when
        cvs co -r 2.5.2-mjc
        # apply patch mjc-2
        cvs commit
        cvs tag mjc-2

        cvs co -r 2.4.13
        cvs tag -b 2.4.13-ac
        cvs update -r 2.4.13-ac
        # apply patch ac1
        cvs commit
        cvs tag ac-1
        # elsewhere/when
        cvs co -r 2.4.13-ac
        # apply patch 2.4.13-ac2
        cvs commit
        cvs tag ac-2

        # assuming that Rik's VM patches are independent
        cvs co -r 2.5.2
        cvs tag -b 2.5.2-rvr-VM
        cvs update -r 2.5.2-rvr-VM
        # apply Rik's VM patches
        # (or maintain Rik's VM patches as their own files
        #  and let "cvs update" merge them in)
        cvs commit
        cvs tag rvr-VM-1
        # elsewhere
        cvs co -r 2.5.2-rvr-VM

        Why wouldn't something like this work? You could even wrap everything up in a nice GUI if you wanted to. :)

        -_Quinn
      • CVS isn't decent (Score:4, Flamebait)

        by kaisyain ( 15013 ) on Tuesday January 15, 2002 @12:11PM (#2842745)
        The problem is that there isn't a decent multi-patch versioning system out there

        Uh, yes there are. Perforce, Aide-de-Camp, BitKeeper, and others all do this just fine. I haven't used Squeak much, but I think the built-in version control in their Smalltalk image works this way as well. Every change management system that uses changesets works pretty much exactly this way.

        CVS basically sucks, which is why some people [tigris.org] are trying to replace it. It only gets used because it is popular and free, not because it is technically superior. The only thing it is better than is RCS/SCCS. Every other possible solution is no worse, and usually much better, than CVS.
        • Re:CVS isn't decent (Score:2, Informative)

          by jslag ( 21657 )
          Every other possible solution is no worse, and usually much better, than CVS.


          That's a little strong. True, I haven't used anything but CVS for the last couple of years, but the last time I tried common alternatives (namely MS VSS and PVCS), they were major PITAs - slow, unreliable, and not helpful when more than one developer was working on the same file. Not to say that CVS doesn't have its problems, of course, but for a number of years it has been the logical choice for anyone who doesn't want to plunk down hundreds of dollars per seat for a closed-source tool.

          • Instead of flaming the original poster for his obvious bias and lack of knowledge, I'll add another PITA revision control system... MKS. I cannot stand this piece of crap.
    • by Rik van Riel ( 4968 ) on Tuesday January 15, 2002 @11:27AM (#2842442) Homepage
      The problem is simple: maintainers of any part of the kernel get flooded with email, and maintainers of the whole kernel (Linus, Alan, Marcelo) get flooded even worse.

      You really cannot expect these people to read all their email all the time, so patches and bugfixes get lost and may need to be resent various times before they get noticed.

      Add to that the fact that many of the people writing these patches are also extremely busy and may not get around to resending the patch all the time (I know I don't).

      The solution here would be to have the patch re-sent automatically as long as it still works ok with the latest kernel version ... this can all be checked automatically.
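      Conceptually, such a resender could be as dumb as the sketch below: try the patch against the newest tree, and only mail it again if it still applies. (Illustrative only; the paths, address and mail command are placeholders, not a real tool.)

      #include <stdio.h>
      #include <stdlib.h>

      /* Sketch of an automatic patch resender: the patch is only sent
       * again while it still applies cleanly to the latest tree. */
      int main(void)
      {
              /* --dry-run: test whether the patch applies, change nothing */
              int rc = system("cd /usr/src/linux-latest && "
                              "patch -p1 --dry-run --force < /tmp/vm-fix.diff");

              if (rc == 0)
                      return system("mail -s '[PATCH] vm fix (resend)' "
                                    "maintainer@example.org < /tmp/vm-fix.diff");

              fprintf(stderr, "patch no longer applies cleanly; not resending\n");
              return 1;
      }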

      • I submit that a patch-holding website would be much more valuable. The patches are sent to a common site with destination E-mail specified (Linus, Alan, Rik, etc.) and are visible or private (visible allowing others to come along and look through the patch and post comments to it; like Slashdot).

        This is similar to the submission already made on the LKML that patches be auto-resent, but I don't believe the resending of them is necessary; I think sending a daily E-mail with a list of the patches and an URL to retrieve them, as well as basics about who submitted it and when would be much more valuable to all.
  • Big split? (Score:2, Insightful)

    by grub ( 11606 )

    Recall that "Linux" is owned by Linus. It's not inconceivable to envision a pissing match of the egos ending in "Cox' "rogue" kernel isn't true Linux. Rename it." one day.
    • by Anonymous Coward
      Recall that "Linux" is owned by Linus. It's not inconceivable to envision a pissing match of the egos ending in "Cox' "rogue" kernel isn't true Linux. Rename it." one day.

      Yea, but then they'd have to call it "Coxux" and they'd never get that past the censors.


      ---

      Darn right I'm not signing my name to this post.

    • Uhh GPL, heard of it. Cannot happen, Alan would never have to rename his branch. Linux is owned by no one, Linus gave it away.
  • by Anonymous Coward
    It's about time the rmap VM was developed and integrated
    into the kernel. Together with the O(1) scheduler and the low-latency patches it will be a great advance for the 2.5 kernel.

    But sooner or later OOM due to memory overcommit will have to be solved properly (by not overcommitting). The OOM killer is just a hacky solution (even Windows doesn't have such a hack).

    CaptnMArk (forgot my password right now :(
  • Black box VM (Score:2, Insightful)

    by king_ramen ( 537239 )
    Designing the same VM for a microwave oven, gaming PC, database server, and massively parallel cluster will make nobody happy. I think that the choice in VMs will give Linux very good flexibility to run well in different environments. A fully abstracted VM that wrote to fully abstracted IO would allow for "snap ins" that would be optimized for the task at hand. I think forcing 2 VM trees to compete and coexist is good.

    Then again, how much is RAM? I just bought 512MB for less than $40. I NEVER swap (running ~800MB RAM / server) and never want to. I ran my first e-commerce web site w/ SSL and mSQL on a 486SLC-20MHz with 5MB of 110ns DRAM. Yeah, it swapped. But these days, most machines are way overkill for serving web pages, files, and queries.

  • I noticed that Rik was born in '78. That puts him just two years older than me. How could he possibly know so much? I have been involved with computers since I was around six, but I have nowhere near the knowledge that this fellow has. It must be all the gaming I do; not to mention he said he's been involved with Linux since '94. Anyway, what a smart guy.
    • A VM is basically a small thing: a list of pages, where every page has a set of properties, plus an interface on top of that to get things done with the pages (claim/free/mark dirty etc.). I wrote one on an MSX2 in 1986 for having 256KB ROMs in 128K RAM + 128K video RAM (and 32K disk :)). Of course, modern OSes need a VM that can take decisions, is scalable on different hardware, and can handle requests fast.

      A lot of research has been done on virtual memory and the manager code for this type of memory. Also, a lot of different types of VMs are implemented in different OSes, all with pros and cons in different situations. It's therefore not hard to dig in and get the knowledge you need.

      For example: the rmap stuff is a no-brainer. If you let the VM handle every request to share/allocate a memory page, that VM can keep a set of pids per page. IIRC NT's VM (VMM32) does this. That the current VM in Linux doesn't already have this feature is beyond me.
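      As a toy illustration of that "list of pages plus a reverse map" idea (simplified; the struct and field names are invented, and the real -rmap patch tracks pte chains rather than pids):

      struct rmap_entry {                  /* one mapping of this page */
              int pid;                     /* the real patch stores a pte pointer */
              struct rmap_entry *next;
      };

      struct toy_page {
              unsigned long flags;         /* dirty, referenced, locked, ... */
              int use_count;               /* how many users share the page */
              struct rmap_entry *mappings; /* reverse map: who maps this page */
              struct toy_page *next_lru;   /* position on the LRU list */
      };

      /* With the reverse map, evicting a cold page means walking
       * page->mappings and unmapping each user directly, instead of
       * scanning every process's page tables to find out who maps it. */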
  • by ajs ( 35943 ) <ajs.ajs@com> on Tuesday January 15, 2002 @11:09AM (#2842334) Homepage Journal
    Open Source's biggest PR dilemma is this sort of argument.

    Make no mistake, every company has developers who do this. There are two differences in the Open Source world: 1) you can't just fire an Open Source developer who won't "play ball" with management's edict, and 2) it's usually public.

    These are actually both really good things. The fact that you can't silence someone leads to repeated analysis of a problem. OSS' biggest benefit is that it brings massive peer review to bear not just on the code, but on the process.

    The fact that it's public feeds into that, and is equally good.

    The problem is PR. The Linux kernel is starting to look like anarchy to non-developers. I suggest that the process works, so we should all take a deep breath and leave it be. However, we all need to take the front lines on PR. Spin is all-important. This is not a "spat" or a "fight", this is "parallel development" and "peer review". The joy of this kind of spin is that, unlike most spin, it's TRUE! This guy is pissed at Linus. Linus has dumped his code. Yet, the two of them keep working hard to meet their customers' demands and producing what they feel is the best possible product.

    Please, don't foster the idea that we're a bunch of anarchists producing code that's any less functional than the rest of industry, because quite the opposite is true.
    • >> The Linux kernel is starting to look like anarchy to non-developers.

      What data points do you base this observation on? You believe that there is an appearance of Linux kernel development becoming increasingly chaotic?

      You are very wrong. A true analysis of the progress of linux kernel development would show a measurable decrease in the level of chaos in the system.

      The past few years have seen an increase in paid full time linux kernel hackers. These hackers are generally tasked with working on a specific sub-system and/or providing source control management. The Marcelos and Coxes of the world are increasing in number and their work is really paying off in a more ordered kernel development process.

      Even the volunteers like the kernel janitors work in a more structured and orderly manner.

      Perhaps someone should write a white paper detailing the main kernel hackers and the evolution of the kernel development process. THAT would make for good PR.
      • You believe that there is an appearance [...] You are very wrong. A true analysis of the progress

        OK, so your argument is that the perception is not one of chaos because the perception does not match the reality of the situation?

        I'm just trying to understand, since that would seem to be a self-contradicting statement.

        Yes, Linux kernel development is moving along just fine. Yes, it's very well structured. But I work in a Linux shop, and let me tell you what the perception of the folks we deal with is: "Oh, Linux. Isn't that the bunch of kids that are always arguing and forking projects?" I know that that's bunk, but most folks don't.

        Which point of view do you think Microsoft is working hard to portray?
    • Hmmm ... I remember a certain Rasterman [enlightenment.org] disappearing from RedHat ... and then E was no longer part of Gnome (although I use it because it's dramatically better than the alternatives, in my opinion).

      It happens.
  • by f00zbll ( 526151 ) on Tuesday January 15, 2002 @11:09AM (#2842337)
    Let's get real for a second. Linux isn't the only OS with politics behind it. God knows there's probably more politics behind IRIX, Windows and AIX.

    I strongly feel that honesty wins in the end, because people aren't stupid. No one believes that IBM or Microsoft is one happy camp singing "we are the world."

    It's great there is a lot of attention on the VM and an intense effort to make it better. I have no doubt Linus and Rik are professionals and have no problem putting politics aside to get the job done. That is, after all, part of being a professional. Rik makes some good arguments, and given enough time and money he'll build the VM of his design. Will it matter 10 years from now? Most likely not. Development will continue and Linux will get better. Butting heads is part of the fun, because without conflict people tend to stagnate.

    • "Will it matter 10 years from now? Most likely not."

      And I really hope that the reason it doesn't matter is because our systems will have nearly unlimited memory. You don't really need a VM if you don't have limitations on memory.

      Think I am crazy? Think back to what kind of system you were using 10 years ago, and how quickly we have gotten to where we are. What do you think will happen in another 10 years?

      Hopefully, we won't even need a VM in 10 years.
      • Think back to what kind of system you were using 10 years ago, and how quickly we have gotten to where we are.

        Ten years ago I was running a 486/66 with 16M of RAM, and a 340M hard drive that eventually filled up. Right now I'm on an Athlon 950 with half a gig of ram, and ~50 gigs of hard drive that's just about full. The programs you run are always going to drive development of the equipment you run it on. It doesn't matter how much space you have- you'll fill it.
  • I know that it has nothing to do with the topic...
    But while waiting for the page to load, I noticed that the extension was "htm", which led me to look up "linux.html.it" using Netcraft and discover it running IIS. Go figure.
  • Label this a flame if you want but I was absolutely disgusted by the tone and tenor of Rik's responses in that article. Regardless of the technical merits of his code or algorithms, Rik's repeated attacks on Linus will certainly not move the operating system forward.

    When the author of the article ended with "Thank you for your kindness and the opportunity to get to know you better", I almost fell out of my chair laughing. The only thing that stopped me was that Rik's behavior really isn't funny. It isn't professional and it has no place in the open source or any other community. It speaks volumes about Rik's emotional maturity or more accurately his lack thereof.
    • by Erik Hensema ( 12898 ) on Tuesday January 15, 2002 @11:57AM (#2842647) Homepage

      Somebody has to speak up against Linus. Linus is not a god. The man makes mistakes. And over the last few years it has increasingly become a problem that "Linus doesn't scale".

      Linus however continues to develop the kernel pretty much the same way he started doing it ten years ago. And not many people think that's a problem. Rik does (AFAIK). And I tend to agree with Rik: the current system just isn't working very well. It's not very bad, but it certainly isn't optimal, IMHO.

      However, remaining silent doesn't solve the problem. Somebody has to speak up.

    • by Zog ( 12506 ) <israelshirk.gmail@com> on Tuesday January 15, 2002 @12:28PM (#2842924) Homepage
      RTI - Read The Interview.

      ...Rik's repeated attacks on Linus will certainly not move the operating system forward.

      Rik was interviewed in order to get insight into how he thinks/sees things, no? So if he doesn't like the way Linus does things, is he not at liberty to say so? (also, see quote below about still having respect for Linus in spite of their disagreements/conflicts)

      Rik's behavior really isn't funny... It speaks volumes about Rik's emotional maturity or more accurately his lack thereof.

      Rik Say:
      With Linus out of the way, I can make a good VM. I no longer have to worry about what Linus likes or doesn't like. ... Yes, though I guess I have to add that I have a lot of respect for Linus. He is a very user unfriendly source management system, but he is also very honest about it.

      I don't quite think that qualifies as immature - granted, there is a lot of conflict going on, but they still have respect for each other, even if Rik doesn't like to work with him, and there's not really anything showstopping about it. The VM situation wasn't pretty, but it's being resolved.
  • Fear Factor (Score:5, Insightful)

    by ChaoticCoyote ( 195677 ) on Tuesday January 15, 2002 @11:34AM (#2842511) Homepage

    An honest environment -- such as fostered by "free" software -- is both good and bad. On one hand, I (as a programmer) am comforted to read the kernel mailing list and other resources that let me know exactly what is happening with my tools. I don't need to wonder what's happening with "free" software -- and this is more comforting to an engineer like myself than is the closed-door, silence-is-golden, hide-the-bugs policy of a Microsoft.

    On the other hand: show this interview to an MIS manager who needs 24/7/365 reliability, and she is going to be very nervous about deploying a Linux-based solution. You can talk until you're blue in the face about reliable distros and the open road to software quality -- what the MIS/corporate person sees is chaos, and what she feels is a lack of COMFORT.

    "Out of sight, out of mind" is a philosophy many people adhere to, especially when dealing with complex issues they can not or do not want to grasp. From waste storage in Nevada to the the war in Afghanistan, most people lack the time and initiative to understand what is really happening; they go on appearances and marketing, and ignore complex and disturbing facts.

    Technology is no different. The MIS manager doesn't want to hear about VM conflicts or file system bugs or different kernels -- such issues are beyond their capability and desire to understand. Buying Microsoft is (or was, until recently) comforting, because no one ever saw the internal debates and code battles and what-not that any development team expresses. Even recent security disclosures about WinXP are unlikely to shake the faithful -- but those same people will run in fear from the blunt honesty of Linux.

    Ignorance may be bliss, but it can also get you killed. I know people whose lives depend on cars, but they have no knowledge of how to check the oil. Most MIS managers simply want to drive software; if it looks good (like a Jeep Liberty), they don't pay attention to whether it is safe (the Liberty performs poorly on crash tests).

    I doubt, however, we're going to change human nature -- and I'd rather have spirited debate and even some nasty contention if it means that people are striving to make Linux the "best" it can be.

  • OOM Killer must die (Score:5, Informative)

    by Salamander ( 33735 ) <`jeff' `at' `pl.atyp.us'> on Tuesday January 15, 2002 @11:43AM (#2842561) Homepage Journal

    Rik is an extremely bright (and likeable) guy, but his adherence to the OOM killer concept is disappointing. I've seen a lot of dumb ideas gain currency in the computing community or some part of it; OOM killer is the dumbest. If your process was allowed to exist in the first place, it should not be killed by the VM system. The worst that should happen is that it gets suspended with all of its pages taken away. If that doesn't free up any memory then neither would killing it (modulo some metadata - read on). If there are other processes waiting for the one that's suspended, then eventually they'll go to sleep, their pages will be released, and the suspended process will wake up - which won't happen if you killed it. There are only two differences between the two approaches:

    • Suspension does not take irrevocable action; the suspended process can still be restarted.
    • Suspension bears the cost of retaining the metadata for the suspended process so it can be restarted.

    The usual whine from OOM-killer advocates is that you can still get into a situation where all of that retained metadata clogs up the system and essential system functions can't allocate pages. However, that's preventable too. All you need to do is preallocate a special pool of memory that's only available for use by those essential system processes - either individually or collectively. The size of that pool and the exact details of how it gets allocated (e.g. which processes are considered essential) could be treated as site-specific tuning parameters. The same idea can then be further generalized to allow definition of multiple private pools, creating a semi-hard barrier between different sets of tasks running on the system (if you want one; the default pool is still there otherwise). This actually fits in very nicely with other things like processor affinity and NUMA-friendly VM, which I know because I once worked on a kernel that had all of these features.

    In short, there's no need for the OOM killer. Plenty of systems, many of which handle extreme VM load much better than Linux, have been implemented without such a crock. Rik contends that a lot of people make suggestions without actually understanding the problem, and he's right, but I also submit that sometimes he also rejects suggestions from people who do know what they're talking about. This row has been hoed before, and Rik's smart enough that he should know to avoid the NIH syndrome that afflicts so many of the other Linux kernel heavyweights.
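    To make the reserved-pool idea concrete, here is a deliberately simplified, userspace-flavoured sketch (not kernel code; every name in it is made up):

    #include <stddef.h>
    #include <stdlib.h>

    #define EMERGENCY_POOL_SIZE (256 * 1024)

    static char   emergency_pool[EMERGENCY_POOL_SIZE]; /* set aside up front */
    static size_t emergency_used;

    /* Essential tasks may fall back on the reserved pool when the general
     * allocator is exhausted; everything else simply sees the failure. */
    void *alloc_mem(size_t size, int essential)
    {
            void *p = malloc(size);

            if (p != NULL)
                    return p;
            if (essential && emergency_used + size <= EMERGENCY_POOL_SIZE) {
                    p = emergency_pool + emergency_used;
                    emergency_used += size;         /* crude bump allocator */
                    return p;
            }
            return NULL;    /* caller copes: suspend, retry, shed load */
    }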

    • by _xeno_ ( 155264 ) on Tuesday January 15, 2002 @12:14PM (#2842769) Homepage Journal
      This may just be my misunderstanding, but the way I understood it was that the OOM killer takes effect when there is no more memory available at all. You say "The worst that should happen is that it gets suspended with all of its pages taken away." but I have to wonder what the process is going to do when it starts up and has no pages - it'll crash instantly, so why not kill it, right?

      Or did you mean that "the process's pages will be swapped?" Even if you did mean that, my understanding is that the OOM killer only takes effect when there is no memory space left - including swap. In this scenario, there isn't much to be done should the system need more memory to continue - you either kernel panic or you find some process to kill and kill it. In an extreme circumstance like OOM on a kernel alloc, I see nothing wrong with deciding to kill a process. I really don't see how "suspending" on a process solves memory issues since it still needs its pages somewhere...

      My understanding was that the idea behind the OOM killer was to prevent the kernel from panicking and instead leave a working system which needs to have its memory problems worked out. I could be wrong, since I haven't really looked into the OOM killer and when it's invoked.

      • I have to wonder what the process is going to do when it starts up and has no pages - it'll crash instantly, so why not kill it, right?

        If you can't start the process, fork() should fail. That way the caller gets an error code and has some hope of doing something genuinely useful.

        Even if you did mean that, my understanding is that the OOM killer only takes effect when there is no memory space left - including swap.

        What I'm saying is that you should never wait that long to detect that condition. Sure, you can disable overcommit entirely, but that's pretty inflexible. Overcommit is a good thing, if it's handled properly. Some people would rather not, but if you want you should be able to walk out on that narrow ledge, right up to the point where you can't go any further, and then stop. Gracefully. OOM-killer is like taking you right up to the edge of the cliff, then throwing you off and saying it's your own damn fault.

        the idea behind the OOM killer was to prevent the kernel from panicking...

        Fine, but there are other ways as well.

        ...and instead leave a working system

        If that's the goal, it fails, because it makes no guarantee that your system will be working in any intuitive or useful sense of the word.
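        For the fork() case above, "the caller gets an error code" just means ordinary error handling - roughly this, in plain POSIX C (nothing here is specific to either VM):

        #include <errno.h>
        #include <stdio.h>
        #include <sys/types.h>
        #include <unistd.h>

        /* If the kernel refuses to overcommit, fork() fails with ENOMEM or
         * EAGAIN and the caller can back off gracefully, instead of the
         * child being shot later by an OOM killer. */
        pid_t spawn_worker(void)
        {
                pid_t pid = fork();

                if (pid == -1) {
                        if (errno == ENOMEM || errno == EAGAIN)
                                fprintf(stderr, "no memory to fork; backing off\n");
                        return -1;      /* caller retries later or sheds load */
                }
                if (pid == 0) {
                        /* child: do the work, then exit */
                        _exit(0);
                }
                return pid;             /* parent */
        }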

        • I have to wonder what the process is going to do when it starts up and has no pages - it'll crash instantly, so why not kill it, right?
          If you can't start the process, fork() should fail...

          I meant "resume," not start - sorry about the confusion... (as in, resume the suspended process that you said should be suspended to recover memory). In other words, you suggested that a process should be suspended in an OOM situation. When suspended, however, the pages can't just disappear, because the application still needs them.

          I think you're talking about an over-committed application where the process is suspended because it tried to write to a page that hasn't been copied yet. (In other words, process A forks process B which does nothing for a while and then starts doing some massive changes to memory structures that it was reading off of process A's memory space which causes a page fault and causes a new page to need to be allocated, which fails due to OOM.)

          That sounds like a good idea as a method to potentially help avoid OOM, but I can still invent a scenario where it doesn't work. (For example, say the X server gets suspended, and therefore the various clients get suspended waiting for the local socket to send an event, leaving only login running on the other ttys - not enough memory to exec a new shell, and therefore a useless system.)

          Even with preventative OOM measures, it's still possible to run into an OOM situation, and when the situation arises, there needs to be a way to handle it. OOM killer, assuming it's sophisticated enough, is one way of ensuring that the box doesn't just grind to a halt and panic.

          If that's the goal [preventing the kernel from panicking], it fails, because it makes no guarantee that your system will be working in any intuitive or useful sense of the word.

          Beats the Black Screen Of Panic, Linux's version of the Blue Screen Of Death... (unless INIT dies (which panics anyway...) or X dies (which locks display/keyboard) or login dies or ...)

          Bottom line, OOM is a pretty drastic state which I have yet to reach (although I've come close with Unreal Memory Leak Tournament). If you hit OOM, something needs to be done, and the OOM killer is better than just panicking and causing everything to be lost.

    • by Anonymous Coward
      While I'm not entirely convinced that overcommit and OOM-killer are the right approach, I don't understand how your proposal solves anything.

      If your process was allowed to exist in the first place, it should not be killed by the VM system.

      So we know before a process gets to execute exactly what its memory usage profile is? A VM is called upon to predict the future all the time, but this is ridiculous. I can understand that we'd rather not start a process if we're going to kill it off later, but just how are we supposed to make that determination? I've got 3/4 of my memory free. Of course the system will let me start another process --we're nowhere near OOM. But the new process proceeds to allocate and allocate and allocate... And now we really are OOM. What now? What have we solved?

      The worst that should happen is that it gets suspended with all of its pages taken away.

      "taken away"? Where do they go? You obviously can't drop them on the floor if you're planning to unsuspend the process at some point. You can drop the executable pages, but if we're in an OOM condition we've already done that. And we're out of swap, so we're not putting the pages there, either.

      If there are other processes waiting for the one that's suspended, then eventually they'll go to sleep, their pages will be released, and the suspended process will wake up.

      Now *that* is a chain of events which makes absolutely no sense to me. I repeat: *We* *are* *OOM*. Where do their pages go and how, exactly, does this allow the VM to make some progress without deadlocking the system?

      Color me skeptical.

      • by Salamander ( 33735 ) <`jeff' `at' `pl.atyp.us'> on Tuesday January 15, 2002 @01:20PM (#2843410) Homepage Journal
        So we know before a process gets to execute exactly what its memory usage profile is?

        Please don't construct strawmen. Oh wait, that's not just a strawman, it's also circularity. You're assuming that the OOM killer exists, then using that to "prove" that an alternative approach is impossible to implement. Well yeah, an alternative system that both does and does not incorporate the OOM-killer concept is impossible. Congratulations. Well done.

        What I'm really saying is that the VM system should ensure that it has other means to deal with memory exhaustion. Disallowing overcommit altogether is one approach, and that has proven quite acceptable for many systems, but there are plenty of other approaches as well. I've briefly sketched out only one; look up the others yourself (the information is available in plenty of places including some OS textbooks).

        "taken away"? Where do they go?

        The phrase "suspended with all of their pages taken away" (which is what I said) includes the case where the pages have already been taken away. English 101.

        As for where they go, the obvious answer is not the general swap area, because that's already full. However, that doesn't preclude the existence of a secondary (actually tertiary) swap area that exists only for this purpose. It could also be a percentage of the general swap area, which starts to look very much like the memory-pressure code in the very highly regarded FreeBSD VM, or Solaris, etc. The point is that there's a middle ground between "no overcommit at all" and "if you allow overcommit we might shoot you in the head just because we feel like it".

        Color me skeptical.

        Skepticism is one thing; strawmen and circularity are another. I'm skeptical about the need for an OOM killer.

        • As for where they go, the obvious answer is not the general swap area, because that's already full. However, that doesn't preclude the existence of a secondary (actually tertiary) swap area that exists only for this purpose.

          If you have enough swap space (whether you call it "tertiary" or not) for all writable memory, you're not overcommitting. By definition.

          PS. It's funny that you accuse Rik of NIH, because his VM is strongly influenced by FreeBSD's, and receives praise from that camp. Indeed Rik is usually the one making NIH accusations.

          • by Salamander ( 33735 ) <`jeff' `at' `pl.atyp.us'> on Tuesday January 15, 2002 @02:26PM (#2843947) Homepage Journal
            If you have enough swap space (whether you call it "tertiary" or not) for all writable memory, you're not overcommitting. By definition.

            You might (or might not) be overcommitting. Up to you. However, even if you are, you're not waiting until the last second and then going postal instead of taking concrete steps sooner to avoid total memory exhaustion. For example you could say that, once you start dipping into the overcommit pool, fork() will start failing but existing processes can continue. You could say that only certain processes that are being allowed to run to completion will be able to allocate new swap space; anyone else will just get suspended the first time they try. Once you have set a high watermark somewhere short of total exhaustion, you can do any number of things, even if you're overcommitting. Some of those measures are pretty drastic, but still better than the OOM killer.

            To a certain extent, perhaps, these "softer" approaches just slow down what might be an inevitable march to OOM. In theory, you could still reach the total-exhaustion deadlock that the OOM killer is supposed to deal with, although it really doesn't, because it doesn't in any way guarantee that your system will really be any more usable than if the deadlock had occurred. In practice, though, you'd be hard pressed to find a system that (a) allows overcommit, which is only necessary with VM systems that are broken (wrt how much swap they allow) to start with, (b) takes such drastic measures before going OOM, (c) does in fact hit OOM anyway, and (d) would benefit from an OOM killer if it had one. Without such an existence proof, claims that an OOM killer is necessary are pretty bogus.

            As I've said, these aren't new ideas just off the top of my head. These are approaches that are proven to work. Ask yourself: how is it that so many systems get by just fine without an OOM killer? There are answers out there.

            It's funny that you accuse Rik of NIH

            Actually I didn't. I accused other Linux kernel hackers of NIH, and tried to warn Rik about becoming more like them. I know Rik's smarter than that, but sometimes even smart people submit to "common nonsense".

            • For example you could say that, once you start dipping into the overcommit pool, fork() will start failing but existing processes can continue.

              Why make this a special pool? I mean, isn't this effectively like saying "once 80% (or whatever) of swap is full, don't allow any more forks"? I'm not convinced that's any improvement over the OOM killer. (Or at least not over turning off overcommit, if that's the way you want to go.)
    • by puetzk ( 98046 )
      there is indeed a /proc entry (/proc/sys/vm/overcommit_memory) to disable VM overcommit. In that case, it's impossible to reach the scenario where the OOM killer is needed (some process gets a NULL from malloc instead).
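
      If you want to see what your kernel is set to, something like this reads the knob (the file exists on 2.4; exactly what each value means depends on the kernel version):

      #include <stdio.h>

      /* Print the current overcommit policy.  The file holds a small
       * integer; 0 is the default heuristic, and the detailed meaning
       * of non-zero values varies between kernel versions. */
      int main(void)
      {
          FILE *f = fopen("/proc/sys/vm/overcommit_memory", "r");
          int mode;

          if (f == NULL) {
              perror("/proc/sys/vm/overcommit_memory");
              return 1;
          }
          if (fscanf(f, "%d", &mode) == 1)
              printf("vm.overcommit_memory = %d\n", mode);
          fclose(f);
          return 0;
      }

      Writing a value back to the same file (as root) changes the policy on the fly, no reboot needed.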

      However, as it stands, Linux by default is willing to overcommit (via copy-on-write). This is a good and beneficial thing: when one forks, the pages don't need to be allocated and copied until they are changed (as most never are). This saves memory, saves time, and vastly improves the scalability of many tasks. Ditto for many, many other situations. But it means that when everything goes to hell in a handbasket, you have promised processes memory that you simply do not have, and you've already told them they can have it. So you have to produce something, and that means someone gets tossed.

      As far as reserving special memory goes, the mlock call does just that. It tells the VM that these pages can't be messed with; they need to be ready immediately.
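
      For instance, a minimal sketch (mlock(2) needs appropriate privileges, and locked pages can't be reclaimed by the VM):

      #include <stdio.h>
      #include <stdlib.h>
      #include <string.h>
      #include <sys/mman.h>

      int main(void)
      {
          size_t len = 1 << 20;              /* 1 MiB, arbitrary */
          char *buf = malloc(len);

          if (buf == NULL || mlock(buf, len) != 0) {
              perror("mlock");
              return 1;
          }
          memset(buf, 0, len);               /* these pages will never hit swap */
          munlock(buf, len);
          free(buf);
          return 0;
      }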

      You can't just suspend, because you already did that - OOM doesn't occur until you are also out of swap. OOM is a last-ditch, we have *nowhere* left to put this. If you ever see OOM, you need more swap. Simple as that.

      (Now, one thing that would be very nice would be dynamically resizing swapfiles, so that if you had disk space left that wasn't currently being used for swap, the swap area could grow. But even then, there is such a thing as out of disk as well. The only way to completely avoid OOM is to avoid overcommit/copy-on-write and allocate, every time, any pages that could potentially be used by a call (even when, as in fork/exec, they very rarely are). That way you could make the calls fail in this worst of worst cases, and the applications could respond.)
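
      As a rough sketch, manually adding swap at run time already works today. Assuming a swap file at /extra-swap has been prepared with mkswap(8), and you're root:

      #include <stdio.h>
      #include <sys/swap.h>

      int main(void)
      {
          /* /extra-swap must already exist and be formatted with mkswap(8);
           * this call just hands it to the kernel. */
          if (swapon("/extra-swap", 0) != 0) {
              perror("swapon");
              return 1;
          }
          printf("added /extra-swap to the swap pool\n");
          return 0;
      }
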
      • When process A tries to write to the memory it was given but that memory isn't actually available, suspend it. Suspend it _before_ actually allocating the memory that it thinks it has.

        Put into motion the required steps to make more memory available (suspend things as necessary) and allow process A to continue if and _only_ if that memory is actually allocatable.

        If desired, have a timeout period after which the process is killed. I don't see how that's worse than just killing the process.
        • I think you're forgetting one thing. We've already swapped out everything we can, physical RAM is full, the swap space is full, and something still wants more memory. So, we suspend something. OK, what do we do with it? We can't swap it out; there's no swap space to put it in. If we can't swap it out, how do we free up any of its pages? And if we can't free up any pages, how do we satisfy the process that wants more memory?

          I don't like the OOM killer, but when you're in that tight a bind there's not a lot of better options.

      • > there is indeed a /proc entry >(/proc/sys/vm/overcommit_memory)

        That setting doesn't work properly. Linux will just overcommit slightly less.
    • by Elladan ( 17598 )
      Your scheme won't work. Think about it.

      The OOM killer is triggered when the system is completely out of all memory, including swap, and a process (any process, not just some ram hog) tries to allocate more. That allocation request cannot complete, so the kernel needs to do something else. Note that it can't fail the request, because it already passed it due to overcommit.

      The OOM killer approach is to find a process that looks ripe and get rid of it. Thus, stability is restored.

      Your approach is to freeze the process that wanted a little bit more ram. What do you hope to gain by this? Well, presumably, you think that some other process is going to release some memory and allow the first one to complete. As should be obvious, this may not happen. In fact, it probably won't. What you'll end up with is a dead system with a lot of frozen processes. If you're careful, root might still be able to get in on the console to kill some stuff or link in more swap, but that's about it.

      For all practical purposes, the system is hosed.

      Your scheme has the advantage for a computation server that the administrator might be able to link in some more swap to complete a computation, but for normal uses, it's just a hang. The OOM killer approach is to attempt to blow away the memory hog and keep the system operational, without administrator intervention.

      The other approach, of course, is to get rid of overcommit entirely. People wouldn't like this too much, since they'd need a lot more swap space.

      • "That allocation request cannot complete, so the kernel needs to do something else. Note that it can't fail the request, because it already passed it due to overcommit."

        Gee, here I thought malloc was supposed to return NULL when it couldn't grab any pages.

        RETURN VALUE
        For calloc() and malloc(), the value returned is a pointer
        to the allocated memory, which is suitably aligned for any
        kind of variable, or NULL if the request fails.


        Shucks be darned, I was right. So why should the VM kill a legit app?

        This is a VM problem. You won't solve it by attacking user space. If you run programs that try to suck up all your memory, perhaps you should get well-behaved, properly designed programs.
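
        For what it's worth, here's the behaviour both sides are arguing about, in a minimal sketch (the 1 GiB figure is arbitrary):

        #include <stdio.h>
        #include <stdlib.h>
        #include <string.h>

        int main(void)
        {
            size_t len = 1UL << 30;         /* 1 GiB, an arbitrary big number */
            char *buf = malloc(len);

            if (buf == NULL) {              /* the polite failure path */
                fprintf(stderr, "out of memory, exiting cleanly\n");
                return 1;
            }

            /* With overcommit on, it can be this write, not the malloc,
             * that pushes the machine out of memory, long after malloc
             * reported success. */
            memset(buf, 1, len);

            free(buf);
            return 0;
        }
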
    • This topic is kind of like abortion for me; I just can't decide which way I feel. However, your statement:

      If that doesn't free up any memory then neither would killing it (modulo some metadata - read on).

      Is simply not true: any process may grow over time, without executing a single fork, and force the system completely out of memory; in that case, killing the process will free up memory. I'm not saying that an OOM killer is the right way to go, but you definitely need to address this part of your argument if you want to convince anyone that it's valid.

    • by Bitmanhome ( 254112 ) <bitman AT pobox DOT com> on Tuesday January 15, 2002 @10:28PM (#2846585)
      What the hell, I'll join the fray. You're spreading a huge amount of lies and FUD, and doing it VERY LOUDLY. Unfortunately, volume doesn't make up for sense.

      First off, you need to study your own subject line: OOM Killer. OOM means out of memory. It does not mean low memory; it does not mean "maybe the admin can link in some more swap;" and it does not mean "move pages into a protected buffer." It means out of memory: there is no memory left. Anywhere.

      You say:
      If your process was allowed to exist in the first place, it should not be killed by the VM system.
      then you say:
      So we know before a process gets to execute exactly what its memory usage profile is?
      Make up your mind dude -- which side are you on? Hint: Your second statement is correct, sarcastic as it was. We can't know the true memory behavior of a process ahead of time, so we can't block processes that are going to become too large.

      Next you say:
      The worst that should happen is that it gets suspended with all of its pages taken away.
      Those pages contain data; they cannot just be taken away. They must be moved somewhere... got any ideas?

      ... a secondary (actually tertiary) swap area that exists only for this purpose.
      Ah, your fingers are moving, but you don't understand the words. This is still swap space; the label is meaningless. And if we have swap available, then we're not out of memory, are we? This is a low memory condition, and is irrelevant to this discussion.

      Okay, so how about:
      ... a special pool of memory that's only available for use by those essential system processes - either individually or collectively.
      Once again, you use the term without understanding it. This is called memory (as you said,) and if there's some left, then we're not out of it, are we? Once again, this is a low memory condition, and is irrelevant to this discussion.

      This next statement needs to be pulled apart, as you try to make two points with it:
      Plenty of systems ... handle extreme VM load much better than Linux ...
      Agreed, but we're not talking about "extreme VM load", are we? We're talking about out of memory conditions. No memory left. Anywhere.
      Plenty of systems have been implemented without such a [patch].
      Sure, but these systems have been hand-tuned to avoid running out of memory in the first place. Is this a good thing? Let's see:

      In short, there's no need for the OOM killer.
      Oddly, you first said "must die," but now say "no need"; I'm willing to ignore that. Ideally, a system should never run out of memory. But how can you know your machine is safe? After all, you can never know the memory requirements of a program ahead of time. So you can either throw excessive amounts of resources at your machine, or you can add support for low-memory and out-of-memory conditions. Your ideas for low-memory problems may be good, but they're not relevant here.

      So the only question is this: Is an OOM killer worth the effort? The OOM killer performs a partial system shutdown, allowing relatively quick recovery. This might be valuable for servers, but for single-user computers, it's usually easier to just reboot, especially since the machine will be thrashing desperately by that point.

      Is an OOM killer worth writing? We're not paying Rik, so that's up to him. If you want low-memory conditions handled better, pay up, or write it yourself.

      Rik contends that a lot of people make suggestions without actually understanding the problem...
      Look in the mirror, dude -- that's you!

      -B
  • by iabervon ( 1971 ) on Tuesday January 15, 2002 @11:51AM (#2842611) Homepage Journal
    It started out with Rik's VM in the kernel, since it was a promising new development. However, once it was in Linus's kernel, the fact that Rik's development style was not compatible with Linus's source-control style became an issue, because the VM wasn't getting updated in Linus's tree.

    So Linus switched to the other VM, which is based more on the original. This means that Rik can do his development without dealing with Linus, and the Linus tree can have an up-to-date VM. When Rik's is to the point where he's really happy with it and doesn't think he'll have to make a lot of patches (and it does all the things he wants), it will probably go back in.

    Since then, Rik and Linus have figured out (hopefully) how their interaction failed to work, and what Rik has to say along with his patches to make Linus know they're worth looking at. It turns out that it is possible to automate this process, such that a script will send the patches when appropriate, with the right assurances of freshness (having actually tested them, of course).

    Linus wants to be able to ignore any patch that isn't for the part he's thinking about at the time (e.g., non-block-i/o patches around the beginning of 2.5). When it becomes interesting again, however, the original patch may not be right any more. Having not looked at the patch at the time when it was sent, Linus can't determine whether it is still good, since the author may have found bugs, and he doesn't know exactly what the patch was supposed to do. He wants the author to make any updates needed and resend it. It may be, of course, that the patch doesn't need to be changed, and the author doesn't have a new and better patch, but Linus can't tell unless the author sends it again with a note that it's still good.

    So Rik's patchbot will test whether the patch still applies and still works, and has not been replaced by a new version, and then will send it again until Linus actually looks at it. This seems to me like a good plan, since it doesn't require Linus to test everyone's old patches and have a complicated mail system. And Linus won't accidentally apply the wrong version of a patch or be unable to find a patch.
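
    Something like this pseudo-C sketch captures the logic (every function name here is hypothetical; it just restates what the patchbot is described as doing):

    #include <stdbool.h>
    #include <unistd.h>

    /* All of these helpers are hypothetical stand-ins for "apply the
     * patch", "run the tests", "check for a newer version", and
     * "mail it to the maintainer". */
    bool patch_still_applies(const char *patch);
    bool patch_passes_tests(const char *patch);
    bool superseded_by_newer_version(const char *patch);
    bool accepted_by_maintainer(const char *patch);
    void resend_with_freshness_note(const char *patch);

    void patchbot(const char *patch)
    {
        while (!accepted_by_maintainer(patch)) {
            if (superseded_by_newer_version(patch))
                return;                     /* the newer version takes over */
            if (patch_still_applies(patch) && patch_passes_tests(patch))
                resend_with_freshness_note(patch);   /* "still good" assurance */
            sleep(7 * 24 * 60 * 60);        /* wait a while, then try again */
        }
    }
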
  • ...how a lot of kernel developers seem to talk mainly about how well their patches help a server withstand slashdotting? ;)

    I mean... look at that bloke who posted a scheduler patch in Kernel Traffic, and now Rik... both mention a certain website. I dunno if this is a good or bad thing
  • OK, now that I've got people wondering if I've finally gone nuts, I'll give my 2 cents on the interview.


    First off, an honest account of a person's feelings is not a personal attack. In fact, it has nothing to do with the other person at all. Nobody can "make" another person feel a particular way. A feeling is simply what a person has. The question is what the person does with that feeling. Rik van Riel seems to be doing what any dedicated, driven, psycho-geek would do - he's making his VM the best he possibly can.


    Second, there are MANY possible resolutions to the purported conflict. The idea of having modular VMs is extremely sound, and likely to be implemented at some point. In the same way that the networking QoS code supports multiple methods in parallel, with rule-matching to determine the "probably best" solution, I could easily see a "meta-VM" engine using a similar system to drive multiple VMs that could "steal" memory off each other as needed, to run as many programs as possible, as close to optimally as possible.

  • Am I the only one who's spent more time reading the Linux Kernel Mailing List than slashdot recently *because* of the feuding and flaming that's going on? All the patches, bug reports, insults, ideas, and philosophical asides are like a soap opera (with diffs). Okay, I admit I'm addicted to reading through diffs that I have no idea what they're doing, but it makes me *feel* smarter.

    About the only thing I didn't like was Linus' rambling evolution thread. Personally, I'm on Andrea's side in the VM wars, but I think it's because he had a clever flame or two a while back. Plus, I've had to build kernels for two friends with 2.4.13 & 15 who were having problems with memory with older 2.4.x's (probably redhat's problems), but since Rik's siding with redhat, that's another strike against him. I don't run a data warehouse, and I hate xinetd, and am still bitter over the RPM incompatibilities between 3 & 4.
    • I forgot to mention my main point, which is that for 90% of us (running servers or desktops), 99% of the time, we'll never notice the difference between either VM, or even the old 2.2 (which I still use).
  • by pdqlamb ( 10952 ) on Wednesday January 16, 2002 @04:58PM (#2850892)
    There's a lot of back-and-forth discussion, not only on the VM, but on the feature (un)freeze of 2.4/2.5, and on how Linus is a lousy patch control system. But maybe that's not the most important thing here.

    Way back when, the purpose of a development kernel was to feed things into a stable kernel tree. Now, part of the problem has to be that Linus started 2.4 way before 2.3.X was ready for it, but it looks like history is repeating itself. 2.4 isn't all that stable, even now, but Linus is happily accepting lots of new goodies to play with in 2.5.

    Something is not working right here. Is Linus less demanding of quality now, since he's willing for somebody else to come in and fix up the allegedly stable kernel tree? Or is he accepting too many things to allow a development tree to stabilize?

    I suspect it's a combination of too much stuff and too big a kernel. Instead of the heady days of 2-3 kernels per week in the development tree, with the stable tree getting another kernel every week or two, we now have a development kernel every week or two and a stable patch every month or three. And the kernel is 10x bigger than in the 1.0 days.

    Look how long it took the USB stuff to filter through the development into the stable tree.

    It seems obvious that the Linus-led Linux development process is not scaling. I'm not sure what the answer will turn out to be, but it may be some combination of the following:

    (1) More "boutique" kernels like Alan Cox's ac series, feeding into the "stable development" kernels that Linus has been generating.

    (2) More formal check-in methods, a la CVS commit. This may take some developer training in how to use CVS -- does anyone want to offer Linus a course and set up a server for him? I bet he'd take a complimentary Geek Cruise!

    (3) Some kind of more rigorous control in the stable kernel tree. I suppose you could say Redhat and SuSE are doing this informally now; if they start coordinating their efforts, and get IBM involved, the kernel will be incredibly stable. And even more incredibly slow to update.

    (4) More beta testers to crack the newer kernels. This is going to get harder, as more of us need to get work done on our Linux boxes. It used to be a hassle when Linux crashed; now it's not acceptable any more!

    (5) Better ways for these users to track down problems and report bugs. This last week I heard myself say, "Try rebooting your Linux box and see if the problem goes away." I just don't have the time, energy, knowledge, and skills to deal with lusers' "I've got a problem" whines any more.

    (6) Is the quality of kernel patches too low? Do we need to develop some regression tests for the kernel, which a patch would have to pass before it would be accepted? (And how do you do a regression test program of this magnitude without Microsoft's beta testers, AKA customers?)

    Anybody want to contribute more ideas to the list? We can spam Linus with them until he agrees!
