Red Hat Software Upgrades Linux

No More Need To Reboot Fedora w/ Ksplice 262

An anonymous reader writes "Ksplice, the technology that allows Linux kernel updates without a reboot, is now free for users of the Fedora distribution. Using Ksplice is like 'replacing your car's engine while speeding down the highway,' and it can potentially save your Linux systems from a lot of downtime. Since Fedora users often live on the bleeding edge of Linux development, Ksplice makes it even easier to do so, and without reboots!"
This discussion has been archived. No new comments can be posted.

  • Awesome! (Score:5, Funny)

    by mark72005 ( 1233572 ) on Tuesday August 31, 2010 @03:14PM (#33429330)
    But do the windows "snap" to one side of the screen? See? Simple! ($100 please)
  • Hmm... (Score:3, Funny)

    by jgagnon ( 1663075 ) on Tuesday August 31, 2010 @03:15PM (#33429336)

    Changing your car's oil while driving down the highway could be tricky, too.

    • by Anonymous Coward on Tuesday August 31, 2010 @05:26PM (#33430736)

      The uptime obsession is crazy. Rebooting once in a while is useful, if only to see that you can still get everything running again from a complete stop. Kernel updates in particular can cause all kinds of problems at boot time. If you don't check the boot sequence, you'll almost certainly have forgotten what you changed that killed your cold boot ability when you need it for some other reason (moving servers, power failure, hardware upgrade, ...).

    • Re: (Score:3, Funny)

      by adisakp ( 705706 )

      I Reboot As Much As I Get Laid

      Man, that would be great for Windows users - not to mention they would *REALLY* look forward to Patch Tuesday.

  • Scary analogy (Score:5, Insightful)

    by Xest ( 935314 ) on Tuesday August 31, 2010 @03:22PM (#33429456)

    "Using Ksplice is like 'replacing your car's engine while speeding down the highway,'"

    So in other words it's something you'd never want to risk doing because it'd almost certainly cause a crash?

    I think they should've thought about a different analogy for this one...

    • Re:Scary analogy (Score:4, Insightful)

      by natehoy ( 1608657 ) on Tuesday August 31, 2010 @03:52PM (#33429786) Journal

      True. The in-place kernel upgrade is somewhat safer than their analogy might imply, but it does lead to an interesting point. Why would you want to do this?

      Personally, I'm OK with having to reboot my Linux machine when I change kernels, mostly because it's the only time Linux DOES ask me to reboot. To be fair, Microsoft and especially third party Windows software vendors have gotten a lot better about this in the last few years, so infrequent need to reboot is now a pretty solid feature on both Windows and Linux.

      In any case, when I get a new kernel, I can install the new kernel and continue running along on the old one as long as I wish to, then reboot to apply the new kernel at a convenient time. Rebooting Linux Mint takes less than a minute from powerdown to login, and I know I haven't run into any risky process locks or anything during the upgrade process. Plus, I like the fact that the "older" kernel is always available to me on the boot menu in case something goes horribly wrong with the new one.

      But I'm not all that uptight about "uptime". It's a home computer. If I have to reboot it once a month or so to apply the latest kernel, I'll reboot it. For my purposes, I don't see any added value for the extra risk (however slight) an "in-place upgrade" would introduce.

      If I were running a "must be up 24/7" machine, I could see this as a benefit, but chances are at that point I've load-balanced a couple of machines and the cluster can stand a "rolling reboot" of the machines far better than it could stand a botched upgrade.

      I still love the idea, and applaud the folks who managed it, but I don't think I see a real reason for it other than "wow, that's pretty nifty". It doesn't seem possible without introducing at least a little bit of risk, and it doesn't seem that the people who would really need it would be all that tolerant of the risk.
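      The "rolling reboot" of a load-balanced cluster mentioned above can be sketched roughly as follows. This is a toy model, not a real orchestration tool: `rolling_reboot`, `fake_reboot`, `min_in_service`, and the node names are all hypothetical, and the caller is assumed to supply a function that reboots one node and reports whether it came back healthy.

```python
def rolling_reboot(nodes, reboot, min_in_service=1):
    """Reboot nodes one at a time, never dropping below min_in_service.

    `reboot` is a caller-supplied callable that returns True if the node
    came back healthy. The rollout stops at the first failure so the
    remaining nodes keep serving on the old kernel.
    Returns (rebooted_nodes, failed_node_or_None).
    """
    in_service = set(nodes)
    rebooted = []
    for node in nodes:
        if len(in_service) - 1 < min_in_service:
            raise ValueError("not enough nodes to keep the service up")
        in_service.discard(node)   # drain: take the node out of rotation
        if not reboot(node):
            return rebooted, node  # leave it drained and stop the rollout
        in_service.add(node)       # healthy again: put it back in rotation
        rebooted.append(node)
    return rebooted, None


def fake_reboot(node):
    # Hypothetical stand-in for a real reboot: pretend "web3" fails to boot.
    return node != "web3"


result = rolling_reboot(["web1", "web2", "web3", "web4"], fake_reboot)
```

      The point of stopping at the first failure is the one made above: a botched upgrade on one node is survivable as long as the rest of the cluster is still running the old kernel.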

      • Re:Scary analogy (Score:5, Informative)

        by jimmyharris ( 605111 ) on Tuesday August 31, 2010 @04:25PM (#33430162) Homepage
        If your server only takes a few minutes to reboot, then I can see why you wouldn't be so concerned about having to reboot for kernel upgrades. We have Oracle and Sybase database servers that take over 90 minutes to start up all their services (these are 16 and 32 core machines) and not having to reboot them for kernel updates would be a huge win for us.
        • Re:Scary analogy (Score:5, Insightful)

          by Leebert ( 1694 ) * on Tuesday August 31, 2010 @04:43PM (#33430348)

          And how would you know for sure that it would actually boot correctly the next time you actually *need* to?

          There is nothing worse than having an actual unexpected reboot (UPS hiccup, whatever), and finding that the system that has been up for 3 years isn't booting, and not having ANY idea which patch put in place in the intervening time actually broke it.

          Not that, ahem, I speak from experience, or anything...

          Occasional rebooting is good, if for no other reason than making it happen in a controlled situation so you aren't surprised in an uncontrolled situation. If you really need the 100% uptime, then by all means, design a proper high availability system.

          • Re: (Score:3, Insightful)

            And how would you know for sure that it would actually boot correctly the next time you actually *need* to?

            Scheduled reboots.

            Now you are going to say - scheduled reboots is when you do your kernel upgrades.

            The problem with that approach is overloading due to bitrot. Kernel upgrades are not the only reason a system will fail to boot. By upgrading and rebooting you are combining the two goals of patching the kernel and verifying that the system is still bootable, which means potentially more effort troubleshooting if something goes wrong. In the past you didn't have the choice to separate out those two tasks.

        • Re: (Score:3, Insightful)

          Given that you mentioned Oracle and 32 core machines, I'm sure you're in a corporate environment that places severe restrictions on your ability to change existing solutions. With that said, your servers are taking more than 90 minutes to boot into a usable state. A huge win for you would be servers that can be rebooted, not servers that never need to be rebooted.

          Somehow, I imagined the following conversation.

          "Jimmy, are you ok?! There's blood everywhere!"
          "OK?! I just had a huge win! I stain-proofed the carpet, so I don't need to worry about the blood anymore."
          "Jimmy, are you hurt? Do you need a doctor?"
          "What do doctors have to do with cleaning this mess off the carpet?"

          (Hopefully my light-hearted tone is apparent. I have nothing but sympathy for fellow cogs st

      • The main reason is in cases where there's a bad vulnerability in the kernel which is sufficiently bad to warrant risking the live update rather than waiting for the next scheduled maintenance period. Personally I'm not sure how often that's going to happen as there's always the possibility of something going wrong when you do it, leaving you in the position of rebooting the machine anyways. I tend to think that most people that are that concerned with a few minutes of downtime are already using clustered se
      • Re:Scary analogy (Score:4, Insightful)

        by MoralHazard ( 447833 ) on Tuesday August 31, 2010 @06:25PM (#33431188)

        Get a little imagination, will ya? Here, I'll boil your objection down into two simple premises, for you:

        1) In-place kernel upgrades are inherently RISKY to stability, compared with normal reboot upgrades.
        2) Reboot upgrades are a LOW COST operation.

        You seem to assume that the risk of #1 (upgrade in-place) will always outweigh the cost of #2 (rebooting to upgrade). At the moment, you MAY be correct in that assumption, but we have no basis for any conclusions, yet.

        But Ksplice's current business plan is to get ahold of a massive, low-cost testing infrastructure by getting installed by default on as many popular Linux distros (Ubuntu, Fedora, etc.) as possible. Properly executed, a massive testing and development effort should improve Ksplice's quality (read: stability) over time.

        At some point, if Ksplice does it right, in-place kernel upgrades will become stable enough to no longer entail measurably more risk than traditional reboot upgrades. If/When that happens, you'd be a fool to continue reboot upgrading, right? If there's no practical added risk, why should you have to even put up with the inconvenience of a single minute's delay, or the hassle of closing and re-opening all your SSH sessions?

        Hell, it's reasonable to imagine that in-place upgrades could even become MORE stable than reboot upgrades (eventually). If that happens, you'd have to be more than a fool to continue rebooting--you'd have to be some kind of technical cargo-cultist, unwilling to offend the Machine Gods by departing from the correct rituals. (There will probably be at least a few of these people--I know some of them, I think.)

        For another perspective, consider these:

          - guns vs. bows
          - automobile vs. horse-and-buggy
          - pen/paper vs. typewriter
          - typewriter vs computer
          - multiprocessing vs. single CPU

        Reasonable people expect that the earliest incarnation of a new technology will be buggy, unstable, dirty, explosive, unreliable, or otherwise potentially hazardous. But given time to iron out the bugs, there's eventually a tipping point where the original technology no longer fulfills its basic purpose as well as the new-fangled competitor.

    • by Spad ( 470073 ) <slashdot@ s p a d . co.uk> on Tuesday August 31, 2010 @03:52PM (#33429792) Homepage

      They should have stuck with their original slogan: "Using Ksplice is like updating your kernel without rebooting"

      • sure it's accurate, but where's the love? how about "using Ksplice is like updating the kernel without rebooting and you have a nice car" or "using Ksplice is like updating the kernel in your car without rebooting your car"
    • by Kjella ( 173770 )

      Well let's be honest here, the risk/gain isn't exactly working out for stable enterprise uses. They want people that can show off all the crazy things you can do with a computer and are willing to risk that their machine could go down. If they get it working for enough people over time, then it'll spread as people like the convenience of reboot-less upgrades. But right now, I'd say their analogy is just right for the market, it's the nerd version of the teenage drivers who play chicken.

    • by Spad ( 470073 )

      Maybe the bomb will go off if you drop below 50mph...

    • Why would it cause a crash? Not having an engine only means you can't accelerate; you can still brake, turn, etc. It's really a lot like restarting your engine while driving, something that's quite simple and safe: clutch in, turn off engine, restart engine, clutch out.

      • Re: (Score:3, Interesting)

        You lose power steering and anti-lock brakes, which can still make things considerably more dangerous and control of the car more difficult. But that's not really the main issue. The main issue is it's a silly analogy.
  • how is linux with memory leaks? is the base os good? what about X? most of the apps? what about apps that get stuck in the background and need a reboot to unload?

    • Re: (Score:3, Insightful)

      by h4rr4r ( 612664 )

      WTF are you talking about? Kill -9 gets rid of apps if you really need to, rebooting is for windows users.

      • Re: (Score:3, Insightful)

        Unless they are in uninterruptible sleep.

      • And you can easily put in an icon that forces a process that owns a Window to quit (it looks like a broken window on Ubuntu/GNOME). The only times I have to reboot is when X.org takes out the desktop *and* the keyboard, or when ACPI fails. Those things still happen too often.

  • interesting (Score:4, Interesting)

    by idontgno ( 624372 ) on Tuesday August 31, 2010 @03:24PM (#33429476) Journal

    this may be based on Free software (residing in the machine needing its kernel patched), but it appears that patch preparation is based on a subscription service provided by the Ksplice Uptrack people. That's the part which is (selectively) free-as-in-beer. This isn't organic to the kernel or the normal methods of kernel updating.

    That means there's libre-free software and a service provided by a non-distro company which is, for selected distros, gratis-free. For now.

    The technical description sounds like the ancient OS patching techniques the old mainframes I used to work on used.

    And frankly, I'd still feel a little more comfortable with a reboot, since I'd worry a bit about state consistency of kernel and client processes. But I guess smarter people than me say it's OK, so what do I know?

    • Re:interesting (Score:5, Interesting)

      by bsDaemon ( 87307 ) on Tuesday August 31, 2010 @03:40PM (#33429676)

      When you have around 1500 production servers to patch, such as with the memmap 0 bug last year, doing them one-by-one, or even in small batches, remotely over IP KVM takes a long-ass time. This is nice for those types of situations.

      • Re: (Score:3, Funny)

        by vlm ( 69642 )

        When you have around 1500 production servers to patch, such as with the memmap 0 bug last year, doing them one-by-one, or even in small batches, remotely over IP KVM takes a long-ass time.

        One single line using pssh, dsh, dish, or no lines at all when using a very fancied up puppet configuration?

        Do you like toggle in boot code over the IP KVM like a PDP-8 or what?

        The ability to do something the hard way, does not prove the lack of existence of an easier way.

        • Re: (Score:3, Interesting)

          by bsDaemon ( 87307 )

          OK, you see how all of those things still lead to doing a reboot? Now, imagine automating the process AND using ksplice. And I agree that automating the process would have been super awesome, but unfortunately that's just the sort of design process and forethought which was shunned at the place I worked at that time. So I left.

          • by vlm ( 69642 )

            unfortunately that's just the sort of design process and forethought which was shunned at the place I worked at that time.

            Ouch man, ouch. In the networking world we call that situation trying to solve a simple layer-8 problem using a very complicated layer-1 solution (or various other combos of numbers). I'm guessing rebooting was unacceptable because you had no backups / load balancers / load levelers / checkpointers / heartbeat monitor / hot standby disaster recovery / replication systems. Most places, reboots sound like a great time to test that gear, assuming you have it...

      • Re:interesting (Score:5, Interesting)

        by TooMuchToDo ( 882796 ) on Tuesday August 31, 2010 @04:42PM (#33430326)

        Seriously? I patched 5500 linux servers in 24 hours *by myself*, all the while they were churning through collider data from the LHC. This would be, in my opinion, what I would call a production environment. Shortcuts are nice, but sometimes you don't need them if your environment is engineered properly.

        • Re:interesting (Score:5, Insightful)

          by scheme ( 19778 ) on Tuesday August 31, 2010 @05:00PM (#33430518)

          Seriously? I patched 5500 linux servers in 24 hours *by myself*, all the while they were churning through collider data from the LHC. This would be, in my opinion, what I would call a production environment. Shortcuts are nice, but sometimes you don't need them if your environment is engineered properly.

          That's slightly different. I assume you're at a CMS or ATLAS T2 center and frankly most of those systems were worker nodes that could be taken down for a minute or two for a reboot as jobs were drained off of them and they went idle. A quick reboot and they'll show up in condor or pbs a minute or two later and start processing jobs. The gatekeepers and gateways for the SE would be more complicated but if you got them up within a minute or two, most if not all of the running jobs wouldn't notice.

    • Re:interesting (Score:5, Informative)

      by phantomfive ( 622387 ) on Tuesday August 31, 2010 @03:46PM (#33429732) Journal

      And frankly, I'd still feel a little more comfortable with a reboot, since I'd worry a bit about state consistency of kernel and client processes.

      This in theory can be a problem, but each kernel update has to be prepared individually, so someone (once again, this is the theory) has looked at the kernel modifications and made sure it won't cause problems. This isn't an automatic thing that can work with any kernel (don't try to use it to go from a 2.4 kernel to a 2.6 kernel), and if there are major changes, say a new scheduler or something, then someone needs to write code that will move the data from the old scheduler to the new scheduler.

      Mainly it's used for security updates which are probably a line of code changed, or a function changed, and there is no difficulty with inconsistencies (unless maybe someone is in the middle of trying to exploit the buffer overflow, but they avoid that problem by making sure no threads are in the functions that are being patched). This is my understanding of how it works.
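      The "no threads are in the functions being patched" check described above can be modeled in a few lines. This is only a toy illustration of the idea, not Ksplice's actual implementation: the call stacks are plain lists of names, `safe_to_patch` and `try_apply` are hypothetical helpers, the function names in the stacks are made-up examples, and the retry-over-snapshots loop is an assumption of this sketch.

```python
def safe_to_patch(thread_stacks, patched_functions):
    """Return True if no thread has a to-be-patched function on its call stack."""
    patched = set(patched_functions)
    return all(patched.isdisjoint(stack) for stack in thread_stacks.values())


def try_apply(snapshots, patched_functions):
    """Return the index of the first stack snapshot where patching is safe, or None."""
    for attempt, stacks in enumerate(snapshots):
        if safe_to_patch(stacks, patched_functions):
            return attempt
    return None


# Two made-up snapshots of per-thread call stacks (outermost call first).
snapshots = [
    {"t1": ["sys_read", "vfs_read"], "t2": ["sys_ioctl", "buggy_ioctl"]},
    {"t1": ["sys_read", "vfs_read"], "t2": ["schedule"]},
]

# In the first snapshot thread t2 is still inside buggy_ioctl, so the patch
# has to wait; by the second snapshot nothing is executing it.
applied_at = try_apply(snapshots, ["buggy_ioctl"])
```

      A patch that touches a function some thread is almost always inside would never pass this check, which is one reason this style of update suits small, targeted fixes.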

    • ...it appears that patch preparation is based on a subscription service provided by the Ksplice Uptrack people.

      Yeah, I'll use it when my distribution vendor of choice is the one doing the preparation.

    • I think that (basic) tools for generating ksplice patches are available as Open Source, it's just that you probably don't want to actually have to generate them yourself (and you ought to have someone qualified look over them). It probably makes most sense for a distro to generate them but as that's not happening yet these guys have their niche.

      I ran ksplice on an Ubuntu box for a while without problems.

  • by JustAnObserver ( 1194117 ) on Tuesday August 31, 2010 @03:28PM (#33429530)
    ... and it has been free for Ubuntu, as indicated on their web page (http://www.ksplice.com/pricing)
  • Old hat (Score:5, Informative)

    by Kaz Kylheku ( 1484 ) on Tuesday August 31, 2010 @03:29PM (#33429548) Homepage

    Lisp systems did this 30+ years ago: reload new compiled functions, and keep going. New calls go to the new function, old function becomes garbage when no more threads are executing it.
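    The same pattern (new calls go to the new function, the old one is collected once nothing is running it) can be sketched in any late-bound language. Here is a minimal Python illustration, not Lisp and not kernel code; `FunctionTable`, `greet_v1`, and `greet_v2` are hypothetical names invented for the example.

```python
import threading


class FunctionTable:
    """Maps names to their current implementation; lookup happens at call time."""

    def __init__(self):
        self._lock = threading.Lock()
        self._funcs = {}

    def define(self, name, func):
        # Rebinding the name is the whole "hot replace": later calls see the new function.
        with self._lock:
            self._funcs[name] = func

    def call(self, name, *args):
        with self._lock:
            func = self._funcs[name]
        # The lock is released before the call. A call already in flight keeps
        # its own reference to the old function object and finishes with the old code.
        return func(*args)


table = FunctionTable()


def greet_v1(who):
    return "hello " + who


def greet_v2(who):
    return "hello, " + who + "!"


table.define("greet", greet_v1)
before = table.call("greet", "world")

table.define("greet", greet_v2)  # swap the implementation while "running"
after = table.call("greet", "world")
```

    Once the table no longer points at the old function and no running call holds a reference to it, the runtime is free to garbage-collect it, which is the behaviour described above.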

    • Re: (Score:2, Insightful)

      by Myopic ( 18616 )

      Did you just equate hot-replacing a kernel with adding a function to a runtime environment? Or did I not quite understand? If I understand, then that would be more like, say, upgrading a program without having to reboot, which is unremarkable.

      "Next time you open that app, it launches the new version!"

      • Re: (Score:3, Insightful)

        by imbaczek ( 690596 )
        back then, the runtime environment on a lisp machine was pretty much the kernel.
      • Did you just equate hot-replacing a kernel with adding a function to a runtime environment?

        No, he equated hot-replacing a kernel with hot-replacing a function in a piece of software while the software was still running.

        • Have you ever heard of a LISP Machine? Who says that LISP code is not in the kernel?

          • Re: (Score:3, Funny)

            by Abcd1234 ( 188840 )

            I never said anything of the kind.

            But hey, you got to show off that you know what a lisp machine is, so bully for you.

      • Re: (Score:3, Insightful)

        by Kaz Kylheku ( 1484 )

        No, I equated hot-replacing sets of functions in a run-time to hot-replacing a kernel (which is a set of functions).

        "Next time you open that app" isn't hot-replacement if you are first required to exit the current instance, such that a new process is started.

      • Re: (Score:3, Insightful)

        by TheRaven64 ( 641858 )
        Actually, the Lisp version was more impressive. The entire OS on the Lisp machines was written in Lisp and was introspectable. You could, at run time, inspect the code for the running system, modify it, and have the code compiled and the new version replace the old one without any downtime. Ksplice, in contrast, requires a separate program to do the compilation and requires a user to manually do some merging of nontrivial changes.
        • Re: (Score:3, Informative)

          by Abcd1234 ( 188840 )

          Incidentally, Smalltalk images work similarly. For a fun time, open up a Squeak image and start digging around. Now *that* is open source software.

          • Re: (Score:3, Interesting)

            by TheRaven64 ( 641858 )

            Almost, but not quite. In Squeak, the VM itself is statically compiled (via C), so you can't modify that at run time. With SqueakNOS, you can modify device drivers, but there are still some bits that you can't modify.

            This is actually pretty trivial for late-bound languages in general. With LanguageKit, we can replace methods written in Objective-C with methods written in Smalltalk or JavaScript at run time too.

            • Re: (Score:3, Interesting)

              by Abcd1234 ( 188840 )

              Bizarre that you got modded flamebait, but...

              Almost, but not quite. In Squeak, the VM itself is statically compiled (via C), so you can't modify that at run time.

              Eh, that's largely picking nits. Did the Lisp machine let you change the hardware running those lisp expressions? No? Then why would you expect to be able to modify the virtual machine running compiled Smalltalk bytecodes?

              With LanguageKit, we can replace methods written in Objective-C with methods written in Smalltalk or JavaScript at run time t

              • Re: (Score:3, Informative)

                by TheRaven64 ( 641858 )

                Did the Lisp machine let you change the hardware running those lisp expressions? No?

                Not exactly, but with a Lisp Machine everything other than the hardware was modifiable. The entire run-time environment was written in Lisp. The Squeak VM includes things like the frame buffer, for example, which are statically compiled.

                SqueakNOS is more impressive than Squeak, because everything from the interrupt handler layer and up is written in Smalltalk and can be modified. This is pretty close to being equivalent to a Lisp Machine. The only bits you can't modify at run time are the bits that a

  • How long before it is used to exploit machines, what could possibly go wrong.

    • Re: (Score:2, Insightful)

      Well, if you manage to get your "updates" accepted by the machine's update process, you pwn the machine after the update anyway, even with conventional rebooting updates.

  • I've been waiting for years!

    watch uname -r

    (from the man page)

  • Servers (Score:4, Informative)

    by DreamArcher ( 1690064 ) on Tuesday August 31, 2010 @03:50PM (#33429772)
    Other than just screwing around in your garage it's still $50 a year per server if you actually need it.
  • ...but does it have support for smooth full-screen Flash video yet?

    (It's http://xkcd.com/619/ [xkcd.com] for those of you who still have question marks over your heads)
  • Okay, so even suppose this is perfectly reliable. Let's say I'm running a high-availability server and can't stand any downtime. Now when my kernel needs an update, I don't have to reboot, great!

    So what about when, say, libc needs an update? As long as programs are still using it, they'll be using the outdated version. Am I supposed to restart all programs using libc? That will cause downtime just like a reboot (although maybe a bit less).

    Or what about when I need a hardware upgrade? Or there's a
