
Anatomy of Linux Kernel Shared Memory

An anonymous reader sends in an IBM DeveloperWorks backgrounder on Kernel Shared Memory in the 2.6.32 Linux kernel. KSM allows the hypervisor to increase the number of concurrent virtual machines by consolidating identical memory pages. The article covers the ideas behind KSM (such as storage de-duplication), its implementation, and how you manage it.
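
As a rough illustration of the management side mentioned in the summary, here is a minimal sketch that reads the KSM counters the kernel exposes through sysfs. The `/sys/kernel/mm/ksm` directory and the file names listed are the ones documented for KSM; the function name `read_ksm_stats` and the choice to silently skip missing files are assumptions made for this example.

```python
from pathlib import Path

# Default sysfs location of the KSM controls and counters.
KSM_DIR = "/sys/kernel/mm/ksm"

# A subset of the documented KSM sysfs files; each holds one integer.
KSM_FILES = (
    "run",              # 0 = stopped, 1 = running, 2 = stop and unmerge
    "pages_shared",     # number of shared pages in use
    "pages_sharing",    # number of sites sharing those pages
    "pages_unshared",   # unique pages that are repeatedly checked
    "pages_to_scan",    # pages scanned per wake-up of the KSM daemon
    "sleep_millisecs",  # pause between scan batches
    "full_scans",       # completed passes over all mergeable areas
)


def read_ksm_stats(base=KSM_DIR):
    """Return {file name: integer value} for each KSM file that can be read."""
    stats = {}
    for name in KSM_FILES:
        path = Path(base) / name
        try:
            stats[name] = int(path.read_text().strip())
        except (OSError, ValueError):
            # File missing (kernel built without KSM), unreadable, or not numeric.
            continue
    return stats
```

On a kernel without KSM the function simply returns an empty dict. A ratio of `pages_sharing` to `pages_shared` well above 1 suggests that merging is paying off.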

  • by WrongSizeGlass ( 838941 ) on Saturday April 17, 2010 @05:09PM (#31883664)
    ... why OSS is the way things should be. You'll never see this type of documentation, and this type of detail, available to anyone and everyone from closed source software. I love my Mac, and supporting Windows pays my bills, but OSS is unlike any other animal out there.
  • Re:First Post (Score:1, Interesting)

    by BitZtream ( 692029 ) on Saturday April 17, 2010 @05:27PM (#31883768)

    I know for a fact that the concept isn't new to Windows, VMware or FreeBSD (though none of them works exactly the same way as this).

    I would have presumed this wasn't new to Linux either, just different from the existing implementation (I know it's blasphemy here, but I'm not a Linux person).

    It's certainly been done on mainframes for god knows how long.

    I doubt this is as groundbreaking as it's being promoted.

    If your OS isn't sharing duplicate memory blocks already, you're using a shitty OS. (Linux already shares duplicate read-only blocks for many things, like most modern OSes.)

    These are just extensions to help deal with virtual machines rather than actually fix the problems that create the need to run a virtual machine. Not unique to Linux in any way; all the big boys are riding the VM fad until people realize your OS is supposed to do that for you already. Prolly take a few years before all the young'ns realize that VMs aren't new and there's a reason people have been using them all the time for the last 3 decades (at least; my age is showing, but the 70s era of computing was just before I started :)

  • by Anonymous Coward on Saturday April 17, 2010 @06:08PM (#31883950)

    There are signs of life at KernelTrap.

    There have been a number of postings by Jeremy since the beginning of April.

  • Re:First Post (Score:4, Interesting)

    by Anpheus ( 908711 ) on Saturday April 17, 2010 @06:24PM (#31884040)

    For now, at least. VMware doesn't support combining pages >= 2MB because of the overhead (the hit rate on finding duplicates versus the cost of searching for them), and I suspect other hypervisors will do the same. Additionally, Intel and AMD are both moving to support 1GB pages. What are the odds that you'll start up two VMs and their first 1GB of memory will remain identical for very long?

    The only way I see page sharing working in the future is if the hypervisor inspects the nested pages down to the VM level, which will typically be the 4KB pages we know and love. Either that, or paravirtualization support needs to exist for loading common code and objects into a shared pool.

    Even so, there's a lot of overhead from inspecting (hashing and then comparing) pages which will only grow as memory sizes grow. If we increase page sizes, the hit rate decreases and the overhead of copy-on-write increases. It's not a good situation.

    Sources: Performance Best Practices for vSphere 4, which references Large Page Performance, which states:

    In ESX Server 3.5 and ESX Server 3i v3.5, large pages cannot be shared as copy-on-write pages. This means, the ESX Server page sharing technique might share less memory when large pages are used instead of small pages. In order to recover from non-sharable large pages, ESX Server uses a “share-before-swap” technique. When free machine memory is low and before swapping happens, the ESX Server kernel attempts to share identical small pages even if they are parts of large pages. As a result, the candidate large pages on the host machine are broken into small pages. In rare cases, you might experience performance issues with large pages. If this happens, you can disable large page support for the entire ESX Server host or for the individual virtual machine.

    That is, page sharing involves breaking up large pages, negating their performance benefit, and is only used as a last-ditch measure when you've overcommitted memory and you're nearly at the point of having to hit the disk. And VMware overcommit is great until you hit the disk; then it's a nightmare.
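
The "hashing and then comparing" inspection described in the comment above, and the way the hit rate falls as pages get larger, can be sketched roughly as follows. This is an illustrative toy under stated assumptions, not how ESX or KSM actually implement sharing: the function names, the use of SHA-1 as the page hash, and the 4KB `PAGE_SIZE` constant are all choices made for this example.

```python
import hashlib

PAGE_SIZE = 4096  # assumed small-page size (the 4KB pages from the comment)


def split_pages(memory, page_size):
    """Cut a bytes object into consecutive pages of page_size bytes."""
    return [memory[i:i + page_size] for i in range(0, len(memory), page_size)]


def dedup_pages(pages):
    """Map each page index to the index of the first page with identical content.

    Pages are bucketed by hash first; a full byte comparison then confirms
    the match, so a hash collision can never merge two different pages.
    """
    buckets = {}   # digest -> indices of canonical pages with that digest
    mapping = []   # mapping[i] = canonical page index for page i
    for i, page in enumerate(pages):
        digest = hashlib.sha1(page).digest()
        candidates = buckets.setdefault(digest, [])
        for c in candidates:
            if pages[c] == page:      # full comparison after the hash hit
                mapping.append(c)
                break
        else:
            candidates.append(i)      # first time this content is seen
            mapping.append(i)
    return mapping


def pages_saved(mapping):
    """Count the pages that could be dropped because they alias an earlier page."""
    return sum(1 for i, c in enumerate(mapping) if i != c)
```

For two hypothetical 8-page guests that differ in a single byte, scanning at `PAGE_SIZE` finds seven of the second guest's pages shareable, while scanning the same memory as two 32KB pages finds none, which is the page-size effect the comment describes.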

  • Re:First Post (Score:3, Interesting)

    by bzipitidoo ( 647217 ) on Saturday April 17, 2010 @10:09PM (#31884846)

    Personally, I find memory compression more interesting than just deduplication, which could be considered a subset of compression. The idea has been around for years. There used to be a commercial product for Windows 95 that claimed to compress the contents of RAM, but which had many serious problems, such as that the software didn't work, didn't even attempt compression, and was basically "fraudware". The main problem with the idea was it proved extremely difficult for a compression algorithm to beat the speed of just doing everything without compression. Gave the idea of memory compression a bad reputation.

    Now we have LZO, an algorithm that has relatively poor compression, but which is very, very fast, fast enough to beat the speed of a straight memory copy. 15 years ago, there wasn't any compression algorithm fast enough for this application. Also, I'm thinking memory is slower relative to processors than 15 years ago, as that would provide incentive to increase the size and sophistication of CPU caches, which has happened. (100 MHz, 32-bit Pentium with 33 MHz RAM then, vs 3 GHz, 64-bit multi-core CPUs with 800 MHz DDR2 RAM today.) Now CPU caches are plenty large enough to handle many 4K pages.

    Still, deduplication could have many other cool uses. Friend of mine once hacked up some disk deduplication software for the Apple II. All it did was crosslink identical sectors it found on a disk, and then mark the duplicates as free. There was no provision for counting the number of links or anything like that, in case you wanted to change the contents of the disk. Just had to be aware this had been done to the disk. Ultimately proved more trouble than it was worth, but it was a nice thought for teenagers desperate for more disk space.
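
The crosslinking trick in the comment above, plus the link counting it lacked, can be sketched like this. It is a toy in-memory model, not the Apple II tool itself: the `DedupDisk` class, its method names, and the 256-byte sectors used in the usage note are assumptions for illustration. The reference count is the added piece; it lets a write to a shared sector split off a private copy instead of silently changing every file linked to it.

```python
class DedupDisk:
    """Toy disk where identical sectors share one physical copy."""

    def __init__(self, sectors):
        self.store = {}   # physical id -> sector contents
        self.refs = {}    # physical id -> number of logical sectors linked to it
        self.table = []   # logical sector number -> physical id
        by_content = {}   # sector contents -> physical id (used only while loading)
        for data in sectors:
            pid = by_content.get(data)
            if pid is None:
                # First time this content is seen: allocate a physical sector.
                pid = len(self.store)
                by_content[data] = pid
                self.store[pid] = data
                self.refs[pid] = 0
            self.refs[pid] += 1       # crosslink this logical sector
            self.table.append(pid)
        self._next_id = len(self.store)

    def read(self, n):
        """Return the contents of logical sector n."""
        return self.store[self.table[n]]

    def write(self, n, data):
        """Write logical sector n, copying first if the sector is shared."""
        pid = self.table[n]
        if self.refs[pid] > 1:
            # Shared sector: drop our link and allocate a private copy.
            self.refs[pid] -= 1
            pid = self._next_id
            self._next_id += 1
            self.refs[pid] = 1
            self.table[n] = pid
        self.store[pid] = data

    def physical_sectors_used(self):
        """Number of physical sectors currently holding data."""
        return len(self.store)
```

Loading three sectors where the first and third are identical uses two physical sectors; writing new data to the third then raises that to three and leaves the first sector untouched. The sketch does not re-merge sectors that become identical after a write.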
