Open Source Operating Systems Upgrades IT Linux

Live Patching Now Available For Linux

New submitter cyranix writes "You may never have to reboot your Linux machine ever again, even for kernel patching," and excerpts from the long (and nicely human-readable) description of newly merged kernel code that does what Ksplice has for quite a while (namely, offer live updating for Linux systems, no downtime required), but without Oracle's control. It provides a basic infrastructure for function "live patching" (i.e. code redirection), including API for kernel modules containing the actual patches, and API/ABI for userspace to be able to operate on the patches (look up what patches are applied, enable/disable them, etc). It's relatively simple and minimalistic, as it's making use of existing kernel infrastructure (namely ftrace) as much as possible. It's also self-contained, in a sense that it doesn't hook itself in any other kernel subsystem (it doesn't even touch any other code). It's now implemented for x86 only as a reference architecture, but support for powerpc, s390 and arm is already in the works (adding arch-specific support basically boils down to teaching ftrace about regs-saving).
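
The userspace side mentioned in the summary (look up which patches are applied, enable or disable them) can be sketched roughly as follows. This is a minimal sketch, assuming the sysfs layout described in the kernel's livepatch documentation: one directory per loaded patch under /sys/kernel/livepatch, each containing an `enabled` file holding 0 or 1. The helper names `list_patches` and `set_enabled` are made up for illustration and are not part of any kernel or library API.

```python
from pathlib import Path

# Assumed location of the livepatch sysfs directory on a kernel with livepatch support.
LIVEPATCH_ROOT = Path("/sys/kernel/livepatch")


def list_patches(root: Path = LIVEPATCH_ROOT) -> dict:
    """Map each loaded patch name to True (enabled) or False (disabled)."""
    patches = {}
    if not root.is_dir():
        return patches  # no livepatch support, or the directory is absent
    for entry in sorted(root.iterdir()):
        flag = entry / "enabled"
        if flag.is_file():
            patches[entry.name] = flag.read_text().strip() == "1"
    return patches


def set_enabled(name: str, enabled: bool, root: Path = LIVEPATCH_ROOT) -> None:
    """Write 1 or 0 to the patch's `enabled` file (needs root on a real system)."""
    (root / name / "enabled").write_text("1" if enabled else "0")
```

Passing `root` explicitly makes it possible to try the helpers against a scratch directory instead of the live system; on a real machine, `list_patches()` with no arguments returns an empty dict when nothing is loaded.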

  • by rgbe ( 310525 ) on Thursday February 12, 2015 @03:28PM (#49040609)

    This, among other things, was discussed in the Kernel Report at the recent Linux Conf in Auckland, New Zealand:
      https://www.youtube.com/watch?... [youtube.com]

  • by ArcadeMan ( 2766669 ) on Thursday February 12, 2015 @03:30PM (#49040631)

    Which means you can keep it up forever!

    (PHRASING!)

    • Re: (Score:3, Informative)

      by fulldecent ( 598482 )

      Holy shitsnacks. There are more Archer seasons? I need to step up my piracy.

      We are talking about Archer, right?

    • Re: (Score:2, Insightful)

      by aliquis ( 678370 )

      Meanwhile Windows still seems to ask you to reboot about once a week (every second week?) to install the updates ..

      • Actually, once a month - second Tuesday of the month is patch day.

        I read the commit notes, but am still a bit fuzzy on how the system can patch code that's currently in use by running processes... Does anyone understand the mechanism used to do this? The mention of stop_machine() suggests that processes affected by the patch are being shut down and restarted. If that's the case, then although the machine isn't necessarily rebooted, processes may go down and come back up during this procedure, right? Or

        • Re:No more downtime (Score:5, Informative)

          by gavron ( 1300111 ) on Thursday February 12, 2015 @08:24PM (#49043493)

          Ok, so here's the simple answer. Note: I'm generalizing a lot to make this simple.

          All functions have a known entry point, which you can think of as a name that you can call, like
          print("hello world"); -- calls "print" so it knows where "print" is.

          Somewhere in the memory was loaded the function print(). There's also a symbol which allows everyone who wants to call print() to know where it is.

          The livepatch loads a new function into memory. Let's call it print2(). It then goes over and makes the symbol that used to let everyone know where print() is point to print2(). Anyone that comes after this patch will still think they are calling print() but in fact will be calling print2().

          The stop_machine() call is part of how ksplice (the proprietary-vendor method) does it. That is not part of kernel live patch (klp).

          What klp does is ensure that a process is at a "good point" to be messed with, and then changes its pointer for e.g. print().

          That way nothing changes for the process until that pointer to print() is changed, at which point any subsequent call to print() will run print2() instead.

          Ehud
          P.S. I have some code from the early 1990s where we used to do this on VMS/OpenVMS. We literally patched the running kernel (much as is done here) and allowed a system to run for years with newer kernel code.
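
The redirection described in this comment can be illustrated with a toy model. This is only a sketch of the idea, not kernel code: it assumes callers resolve functions by name through a table at call time, and the names `symbols`, `call`, `live_patch`, `print_v1` and `print_v2` are invented for the example. Per the story summary, the real mechanism builds on ftrace rather than on a Python-style lookup table.

```python
# Toy model: callers look functions up by name, much like going through a symbol.
def print_v1(msg):
    return f"old: {msg}"


def print_v2(msg):
    return f"new: {msg}"


# The "symbol table": the name print currently points at the old code.
symbols = {"print": print_v1}


def call(name, *args):
    """Resolve the name at call time, then run whatever it points to right now."""
    return symbols[name](*args)


def live_patch(name, replacement):
    """Redirect future calls to `replacement`; return the old target for reverting."""
    old = symbols[name]
    symbols[name] = replacement
    return old


before = call("print", "hello world")             # runs print_v1
original = live_patch("print", print_v2)          # apply the patch
after = call("print", "hello world")              # same call site, now runs print_v2
live_patch("print", original)                     # disable the patch again
reverted = call("print", "hello world")           # back to print_v1
```

The caller never changes: it keeps asking for "print", and only the target behind the name moves. Keeping the old target around is what makes disabling a patch possible.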

          • Ah, I see, so it's a real-time library function pointer fixup then... no need to shutdown and restart processes at all (which would make it much less useful, I guess). I guess by nature there wouldn't typically be any changes that should have extraneous side effects from C-style library functions, so this should work pretty well in practice.

            Thanks for the high-level explanation!

            • That should have been obvious from the summary, it said it was based on ftrace.
              ftrace allows to intercept function calls for the purpose of tracing or benchmarking.

              • I'm not very familiar with Linux programming yet, which is why I tried to ask nicely for an explanation. Sorry for not already being omniscient.

        • by aliquis ( 678370 )

          I feel like it's more than that. Mind you in this case it likely had wanted me to reboot for some time because I had some upgrades and when it was back up it kinda asked me to reboot again but I guess that may have been the second Tuesday in the month. As in maybe I rebooted on say Sunday and then again on Wednesday or something such.

          As for updates only coming once a month I don't know.

      • Ubuntu pesters me nearly daily to reboot....

        • by aliquis ( 678370 )

          Ubuntu pesters me nearly daily to reboot....

          I don't know whether you run some rolling release / unstable version of it or not.

          But my experience is that kernel upgrades are what require reboots, and even if you run openSUSE with the updates and Packman repositories enabled, of course you'll get new kernels, but it's not like you have to reboot because of that or that anyone is forcing you.

          I don't know what Ubuntu says because I don't use it.

          In Windows when you get updates it gives a generic "fixes some problems in product" and nothing more really and th

          • I *just* updated and it's pestering me to reboot again. Mostly libkrb stuff.

            Just because you ignore it doesn't mean it's not asking you to reboot.

            • by aliquis ( 678370 )

              I'm not sure openSUSE tells me to reboot / I don't know how it expresses it.

              Though I guess it would be interesting to get to know whether any of the updates were security related or whether it's just upgraded software in general.

              Also, Windows didn't just tell you that you should reboot or ask you whether you wanted to.

              It enforced the reboot regardless of what you picked.

              I understand I need to reboot for the kernel stuff to happen but in Linux I would worry less about being without those upgrades than I would i

      • Uptime is irrelevant for a desktop.
        • by aliquis ( 678370 )

          Uptime is irrelevant for a desktop.

          I'm not talking "uptime" as a number. I'm talking having to reboot to get upgrades installed.

          Regardless clearly it's not irrelevant for me.

          I don't really give much of a shit how you feel it is. That doesn't apply to me. I've got my reasons.

          • Well, let me see... Having to restart a desktop computer is not a problem; after all, you are supposedly not running a server. And if, despite it being a desktop, you keep it on 24/7, then you are wasting electricity, because you are not using your computer 24 hours a day non-stop, right? So it being off occasionally is not a problem. And when Windows asks to restart, it does not require you to do so immediately; you can finish what you were doing and then restart. What's the big problem with doing this?
            • by aliquis ( 678370 )

              It is a problem for me.

              Try to get that out of your tiny (or gaping, what do I know) asshole (maybe it's just too much to try to get into your.. uhm.. ass to understand for you.)

              Ohnoz, I'm wasting electricity. Thanks for telling me!

              I've done that lots of years. Now I have functional sleep at least so sometimes but not always I use that which is much better than having it on. Hibernate doesn't work so I don't use that.

              It is a problem. It may not be for you but it is for me. Get over it.

              Windows did require me

              • Well, it seems clear to me that you really should use a Linux server, if it's not what you already do. So why not stick with your server rather than offend others like a crybaby who was upset, crybaby?
    • Or, you could just have rolling updates and the clients wouldn't know that the servers were rebooted.
    • by Creepy ( 93888 )

      Lack of patching never stopped me from striving for high uptimes, but now I can have high uptimes, safely.

  • by Anonymous Coward on Thursday February 12, 2015 @03:32PM (#49040643)

    Yup. Exactly.

    But then I guess the quest for epic uptime is bogus, right? Who the heck would want their system running 24/7 all the time?

    *waits for Systemd flamewar to break out*

    • by Anonymous Coward

      Yes, why would I want my laptop running 24/7?

    • by Anonymous Coward

      I certainly never want to test production systems to see if they'll boot after a power failure. This is great only for machines that actually have an uptime requirement (gathering live data from an experiment, waiting for an external trigger like alarm systems, etc.) If you have a high availability requirement, then you really ought to use failover instead of relying on the uptime of a single machine, and make sure your machines will recover from soft and hard reboots.

    • That's odd! I just upgraded systemd and no reboot was needed!

    • by caseih ( 160668 )

      Well when you post flame bait what do you expect, especially when it's FUD pure and simple. If you've used systemd, or done even a little bit of research you would know that 99% of systemd's suite of services does not run in pid 1. I think at this point pid 1 is rather stable and unchanging. It's the ancillary (and I might add mostly optional) components that are getting the updates lately it seems. I've put in several updates for systemd since RHEL7 and CentOS 7 came out, and *none* of them have re

    • by koinu ( 472851 )

      *waits for Systemd flamewar to break out*

      Ok, I'll try...

      Of course, as long as the kernel is patching itself, systemd will take over the kernel's role and keep your system running. Until next year, then the kernel won't get patches anymore, because systemd will fork its own kernel service systemd-kerneld which will make Linux standalone kernel irrelevant.

  • by DigitAl56K ( 805623 ) on Thursday February 12, 2015 @03:34PM (#49040671)

    ... to a more extreme version:

    "I don't always test my code, but when I do it's via live patching the kernel on production"

  • by Anonymous Coward on Thursday February 12, 2015 @03:37PM (#49040709)

    Is this the anti-systemd?

  • by marciot ( 598356 ) on Thursday February 12, 2015 @03:52PM (#49040897)

    Maybe I’m old school, but this sort of bothers me. One of the nice things about rebooting is that it clears out old crud and gives you a reassurance that the system can bring itself up by its bootstraps. I can imagine live patching giving rise to a scenario where you have a machine that hasn’t been rebooted for years and when a power glitch finally brings it down, you find that what is on disk is different than what was in RAM and your kernel is corrupt or not bootable.

    I think live patching would make sense if we had non-volatile system RAM (i.e. universal memory), but until then, it seems like rebooting is a pretty good sanity check that things are alright.

    • In theory, at least, you patch or update the software image on disk and this allows the working copy in RAM to use those patches without being restarted. Thus, if and when you need to reboot, what you load is functionally identical to what you were running before. Of course, that's only in theory. In practice, there's always the possibility that what you get at reboot won't be quite the same as what you had before because of some sort of read/write glitch that slipped past the error checking and mucks th
    • So your use case allows for reboots - not every one does. And far more use cases allow for scheduled reboots, but not necessarily immediate reboots as soon as a security vulnerability is published.

      Fedora-derived (e.g. EL7) and Ubuntu distros can use Red Hat's kpatch [github.com] support (but no patches provided in EL7 -updates yet) whilst SUSE has kGraft which as of November has had a real update stream available. People don't run these things because it's easier than rebooting.

      Besides, fix your runtime problems, set up

    • by mmell ( 832646 )
      You're obviously not "old school" enough - we truly old ones remember working on mainframes and minicomputers - not x86 commodity-grade hardware. "Old ones" such as myself remember platforms which could withstand a disk, memory . . . even a processor failure without any service interruptions. I personally have worked on minis and mainframes with over ten year uptimes despite multiple hardware failures. The x86 stack can't even come close to that kind of reliability. This is a first step (possibly the la
      • Yeah, the x86 stack doesn't need that kind of reliability, because of the inexpensiveness of the hardware. If you need that kind of uptime, you buy 3 and put them behind a load balancing scheme. You end up with more capacity, the same reliability, and still less expense. Especially in the world of virtualized server instances.
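
The "buy 3 and put them behind a load balancer" argument can be checked with a little arithmetic. This is a sketch under loud assumptions that are not from the thread: each machine is up a given fraction of the time, failures are independent, and the load balancer itself never fails. The 99% figure used in the usage note is hypothetical, and the function names are invented for the example.

```python
HOURS_PER_YEAR = 24 * 365


def combined_availability(single: float, machines: int) -> float:
    """Probability that at least one of `machines` independent hosts is up."""
    return 1.0 - (1.0 - single) ** machines


def downtime_hours_per_year(availability: float) -> float:
    """Expected hours per (non-leap) year during which the service is down."""
    return (1.0 - availability) * HOURS_PER_YEAR
```

With a hypothetical 99% per-machine availability, `combined_availability(0.99, 3)` comes out at about 0.999999, because all three machines have to be down at once for the service to be down. Correlated failures (shared power, shared network, the same bad patch rolled out everywhere) break the independence assumption and make the real figure worse.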

      • x86 did gain reliability features years ago, with the Nehalem-EX series and successors.

        http://www.anandtech.com/show/... [anandtech.com]

        Not sure if that's close enough for you. A year ago there were some additional RAS features (lower quality article : ) http://semiaccurate.com/2014/0... [semiaccurate.com]
        Perhaps it doesn't go as far as the most paranoid mainframes but I wonder if such systems can be called a minicomputer.

    • by emil ( 695 ) on Thursday February 12, 2015 @05:19PM (#49041819)

      Ksplice and its derivatives won't help you if you need to purge bad glibc code from memory, as we did for the recent "ghost" vulnerability [theregister.co.uk].

      “Still, it could potentially be nasty if exploited so we strongly recommend immediate patching and rebooting. Without a reboot, services using the old library will not be restarted,” Moore concluded.

    • Maybe I’m old school, but this sort of bothers me. One of the nice things about rebooting is that it clears out old crud and gives you a reassurance that the system can bring itself up by its bootstraps. I can imagine live patching giving rise to a scenario where you have a machine that hasn’t been rebooted for years and when a power glitch finally brings it down, you find that what is on disk is different than what was in RAM and your kernel is corrupt or not bootable.

      I think live patching would make sense if we had non-volatile system RAM (i.e. universal memory), but until then, it seems like rebooting is a pretty good sanity check that things are alright.

      I agree, especially in this day and age where it is so easy to build scale out solutions that make restarting a server a non issue. I think worst of all I can imagine the nightmare of finding that one of the 30 or 40 patches you applied since last reboot caused a critical issue or incompatibility that is only discovered through a reload of drivers and now you have to try and track down which bitch did the damage.

  • i'm kinda partial to seeing the errors so that i can fix them...*shrugs*...
  • by Anonymous Coward

    AIX had this in 1990s.

    • I think mainframes had this long before the '90s.

      From the 1970s, Tandem Computers (now part of HP) specialized in high-availability computing. I'm pretty sure they've had the ability to patch their equivalent of a kernel for ages.

      1980s-era electronic/digital telephone switches (the kind at telco switching offices, NOT your run-of-the-mill PBX) had uptimes measured in DECADES. I don't know if these switches had "live 'kernel' update" capability or not but they did have a "half and half" mode with "live fa

      • According to a friend, who was a switchman for AT&T/Bell Atlantic/Verizon for 36 years, the machines in a single office were a dual machine architecture, running off of the primary, and in case of a problem, would switch over to the secondary machine which was already up & running & doing stuff.

        For patching, you moved to the secondary machine, installed the patch on the primary, rebooted the primary & prayed it would work. If not, then you spent lots of time on the phone to geek central to f

    • AIX had this in 1990s.

      and users too!

      LISP machines had this in the 70's, so there's that.

  • This is awesome. Now that live patching is part of the kernel I expect it to be implemented in systemd very soon and all my GNU/systemd servers will never need a three finger salute again!

  • by davidwr ( 791652 ) on Thursday February 12, 2015 @04:37PM (#49041411) Homepage Journal

    The OSes that ran on 8086-era computers and on very early Macs, as well as most consumer 8-bit OSes could in principle be patched or even completely overwritten without a reboot.

    I vaguely remember an early Mac implementation of Lisp which basically "took over" the machine and gave you a command-line environment (look Ma! No menus!). You "ran" it by running a standard Mac application which basically took over the machine.

    I seem to remember some DOS (if you can call that an OS) programs that worked basically the same way: They loaded themselves into memory, kicked the OS out, then when they quit, they asked you to insert a DOS disk and re-loaded DOS from disk without doing a hardware/BIOS-level reboot (or they knew how to read the hard disk boot tracks and loaded it from there).

    With the advent of chips that provided real privilege levels and OSes that actually took advantage of them, such "takeovers" without the cooperation of the already-loaded OS became impossible by design (but still possible using exploits of course).

    • That's how BootX (one of the Mac OS 8/9 Linux boot loaders for PowerPC Macs) worked. Mac OS would start loading, then a dialog would come up and you could select Mac OS or Linux. You could also run the application from Mac OS anytime after the OS was fully booted. In either case, when you selected Linux, it pushed Mac OS out of memory and Linux would start up.

    • by Myen ( 734499 )

      That sounds more like kexec, where the running kernel is replaced (which also means existing processes are all killed). This newfangled thing is for live patching, where everything (including userland) stays up.

      The DOS part you are talking about works because it isn't doing multitasking; effectively, each app is the kernel as it runs. For later examples of this, any 386 or higher version of Windows (3.11 WFW, 95, ...) did basically the same thing.

  • I know TFS mentions ksplice, but I thought this was the purpose of kexec as well? So really, it's not anything new.
  • .... what could possibly go wrong?

  • Although KSplice is nice (and functionality like this has been in Solaris/AIX/... for a really long time), last time I looked at it, it didn't support live-patching everything. You couldn't just bump an entire kernel version (as is possible on Solaris), only patch modules with a very specific patch as long as there are no processes using any part of it.

  • by fisted ( 2295862 ) on Thursday February 12, 2015 @08:30PM (#49043551)
    An ocean of new opportunities for rooted machines...
