
DRBD To Be Included In Linux Kernel 2.6.33

Posted by timothy
from the now-you-can-sleep-nights dept.
An anonymous reader writes "The long-time Linux kernel module for block replication over TCP, DRBD, has been accepted into the mainline Linux kernel. Amid much fanfare and some slight controversy, Linus has pulled the DRBD source into the 2.6.33 tree, expected to be released in February 2010. DRBD has existed as open source and been available in major distros for 10 years, but lived outside the main kernel tree in the hands of LINBIT, based in Vienna. Being accepted into the main kernel tree means better cooperation and wider user access to HA data replication."
  • How does this differ from the Network Block Device (NBD)? http://en.wikipedia.org/wiki/Network_block_device [wikipedia.org]
    • by Anonymous Coward on Thursday December 10, 2009 @09:47PM (#30397560)

The "DR" stands for Distributed and Replicated. DRBD is much higher-level in function, but integrated at a lower level than the simple userspace daemon that NBD uses on the server side.

      Read the docs, the differences should be blindingly obvious.

There are a lot of different ways to get similar results. You might say I'm cloudy on which of these is really equivalent, which is a good idea or the best way to do it, and which has good performance.

There is Gluster [gluster.org], which sits on top of any existing disk file system, via FUSE, I think. No kernel module needed; it only runs a daemon. I tried version 2 and it worked fine, though I didn't demand much of it. They've just come out with version 3.0, which doesn't need libfuse anymore.

      Or there's Lustre, which does need a kernel mod

      • by wagnerrp (1305589)

        Are some of the new file systems under development, such as btrfs, going to have distributed, networked operation as a basic feature? I recall hearing that ZFS has some ability along those lines.

        Not exactly. ZFS has send/receive functions that let you copy a filesystem snapshot (full or incremental based off a previous snapshot) to another location. These functions just use stdin and stdout, relying on rsh/ssh/nc/whatever for network communication. It's designed more for remote backup purposes, rather than high availability.
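The send/receive flow described above can be sketched with a couple of commands (pool, dataset and host names here are made up for illustration):

```shell
# take a read-only snapshot, then stream it to another machine over ssh
zfs snapshot tank/data@monday
zfs send tank/data@monday | ssh backuphost zfs receive backup/data

# later, send only the blocks that changed since the previous snapshot
zfs snapshot tank/data@tuesday
zfs send -i tank/data@monday tank/data@tuesday | ssh backuphost zfs receive backup/data
```

Since zfs send just writes to stdout, ssh can be swapped for nc or a pipe to a file; there is no replication protocol beyond the stream itself.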

        • by thogard (43403)

The big problem with send and receive is that if you have any bit errors in the data stream, receive will back out everything. That means it's useless for long-term backup, where you might only need to get one file off a tape, since it's all or nothing. I think ZFS's biggest failing at this point is the lack of a way to do backups without modifying the metadata on the files.

If you are using zfs send/receive for backups, you should be using incremental replication. You take a snapshot on your live system, then use zfs send to replicate that snapshot on another system. For a long-term backup, you then dump the copied snapshot to a tape (reading a read-only snapshot doesn't modify anything). You don't want to use incremental backups for long-term backups because they multiply the chance of corruption.
        • If you use ZFS on FreeBSD then it fits into the GEOM stack. You can use ggate to provide physical devices that are exported over the network then add these to a ZFS pool. You can also do it the other way up and export ZVOLs via ggate and run some other filesystem on top of them, although that's probably less useful.
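The GEOM route mentioned above looks roughly like this; host and device names are hypothetical, and the exact flags should be checked against the ggated(8)/ggatec(8) man pages:

```shell
# on the machine exporting the disk: run the GEOM gate daemon, which
# serves the devices listed in its exports file (/etc/gg.exports by default)
ggated

# on the machine building the pool: attach the remote disk locally
# (it shows up as /dev/ggate0)
ggatec create -o rw storagehost /dev/ada1

# mirror a local disk with the network-backed one in a ZFS pool
zpool create tank mirror /dev/ada0 /dev/ggate0
```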
      • by dr.newton (648217)

        I have searched high and low for something truly equivalent to DRBD, and cannot find it.

        Not only does DRBD provide replicated storage that can be shared among multiple nodes with synchronous writes, but it also has HA features, like supporting failure and restoration of a node without a loss in service.

No combination of cluster filesystems and NBD-style storage-over-the-network software does this. They need shared storage to provide redundant, HA access to data.

        I have thought about trying to jimmy up something

  • by Anonymous Coward on Thursday December 10, 2009 @09:43PM (#30397534)

    About 15 years ago, I worked for a place that used Tru64. It offered very similar technology to this. Frankly, we found typical hardware solutions to work better. Software is better at some things, but for work like this, you want it done as much in hardware as is possible.

    • by MichaelSmith (789609) on Thursday December 10, 2009 @09:59PM (#30397656) Homepage Journal

      But your hardware device is just another computer running software for which this feature might be useful.

    • by Lemming Mark (849014) on Thursday December 10, 2009 @10:36PM (#30397904) Homepage

      Doing it in software for purely virtual hardware is useful. I know it's been used to sync disks across the network on Xen hosts, the idea being that if the local and remote copies of the disk are kept in close sync, you can migrate a virtual machine with very low latency. Should be able to do similar tricks with other Linuxy VMMs. Having software available to do this stuff makes it easy to configure this sort of thing quickly, especially if you're budget-constrained, hardware-wise.

      • by dgym (584252)
        You can achieve live migration with iSCSI and AoE too, and if you use a SAN you will probably continue to use one of these network block device protocols.

What DRBD does is make it relatively simple to set up a redundant SAN, using commodity hardware, from which you can export iSCSI devices etc.

        Of course if you are going to use local storage for your VPSs it is just as easy to set DRBD up on those hosts and forgo any network block device layer on top of it. Dual primary mode makes live migration in thi
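As a rough illustration of the dual-primary setup being described, a DRBD 8.x resource definition might look like this (hostnames, devices and addresses are made up; check the drbd.conf man page for your version):

```
resource r0 {
    protocol C;                  # fully synchronous replication
    net {
        allow-two-primaries;     # needed for dual-primary / live migration
    }
    on alpha {
        device    /dev/drbd0;
        disk      /dev/sdb1;
        address   10.0.0.1:7789;
        meta-disk internal;
    }
    on beta {
        device    /dev/drbd0;
        disk      /dev/sdb1;
        address   10.0.0.2:7789;
        meta-disk internal;
    }
}
```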
        • by Znork (31774)

          You can achieve live migration with iSCSI and AoE too

          Indeed, but you don't want to do live migration over high-latency links with iSCSI. DRBD may be a better way to go if you want live migration between data centers in different countries.

    • by fuzzyfuzzyfungus (1223518) on Thursday December 10, 2009 @11:13PM (#30398082) Journal
      I suspect that, like so many things, while there is room for the best way, there is a great deal of room for the "reasonably good and a whole lot cheaper" way.

      A whole lot of progress in modern IT, especially on the server side, is less about exceeding the architectural sophistication of 70s-80s UNIX systems and mainframes, and more about making some of those capabilities available on sucktastic x86s.
    • by dgym (584252) on Friday December 11, 2009 @02:01AM (#30398722)
      I'm not about to dismiss your experience, but things have changed over the last 15 years so it might not be as relevant as it once was.

In that time processors have become much faster, memory has become much cheaper, commodity servers have also become much cheaper, and a lot of software has become free. While all that has happened, hard disks have become only a little faster. As a result many people consider custom hardware for driving those disks to be unnecessary - generic hardware is more than fast enough and is significantly cheaper.

      There might still be some compelling reasons to go with expensive redundant SAN equipment, but for many situations a couple of generic servers full of disks and running Linux and DRBD will do an admirable job. The bottleneck will most likely be the disks or the network, both of which can be addressed by spending some of the vast amount of money saved by not going with typical enterprise solutions.
      • by afabbro (33948)

        While that has happened hard disks have become only a little faster.

"A little faster" is a bit of an exaggeration...see a 1994 hard drive [stason.org]: 13 MB/sec vs. the ~600 MB/sec of 2009's 6 Gb/s SAS interfaces. In 1994, people were running what, 50 MHz PCs? They haven't improved by the same factor, nor has the speed or quantity of RAM in the typical machine.

    • by Macka (9388)

What are you talking about? Tru64 has nothing that functions like DRBD, and never has. You need to re-read what DRBD actually does, because you're getting confused. Also, 15 years ago Tru64 was only 1 year old; only it wasn't Tru64 back then, it was DEC OSF/1, and it was really quite crude and buggy compared to the Tru64 in circulation today. So you would not have had a very spectacular experience with it.

  • Very Useful Software (Score:5, Interesting)

    by bflong (107195) on Thursday December 10, 2009 @09:58PM (#30397638)

We use DRBD for some very mission-critical servers that require total redundancy. Combined with Heartbeat, I can fail over from one server to another without any single point of failure. We've been using it for more than 5 years and never had any major issues with it. It will be great to have it in the mainline kernel.
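The DRBD-plus-Heartbeat pairing described here is typically wired up with a haresources line like the following (Heartbeat v1 style; node names, devices and addresses are hypothetical):

```
# /etc/ha.d/haresources
# node1 is the preferred holder; on failover the peer promotes the DRBD
# resource, mounts it, brings up the service IP and starts the daemons
node1 drbddisk::r0 Filesystem::/dev/drbd0::/data::ext3 \
      IPaddr::10.0.0.100/24 apache
```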

    • by Tuqui (96668)

We use it only for mirroring the databases; for mirroring files we use MogileFS and other methods. The problem with DRBD is that once the primary is down, checking both machines and deciding whether it is OK to resync the disks takes a lot of time. And only the DB needs low-latency mirroring in our case.

    • Re: (Score:3, Interesting)

      by DerPflanz (525793)

We have used drbd 0.7 for some mission-critical servers, but it gave more headaches than a warm (or even cold) standby. The main problem is keeping your nodes synchronised for the disks that are NOT in the drbd (e.g. /, /etc, /usr, etc). We put our software on one drbd disk and the database on another. However, when adding services, it is easy to 'forget' to add the startup script in /etc/ha.d, and the first failover results in not all services being started. Which leads to a support call.

      I understand that we shou

      • That's why configuration management systems like Bcfg2, Puppet, Chef, Cfengine, etc. exist. They can guarantee that all the relevant configuration is identical across your systems.

As for services managed by the HA daemon: with the modern configuration of OpenAIS/Pacemaker (even in Heartbeat 2.0) there's a CIB (Cluster Information Base) that shares the configuration between all the cluster nodes. It makes it pretty much impossible not to have identical HA services configured cluster-wide.
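With the Pacemaker stack mentioned above, a DRBD resource is entered into the CIB once and replicated to every node automatically. In crm shell syntax it looks roughly like this (resource and DRBD names are made up):

```
primitive p_drbd_r0 ocf:linbit:drbd \
    params drbd_resource="r0" \
    op monitor interval="30s"
ms ms_drbd_r0 p_drbd_r0 \
    meta master-max="1" clone-max="2" notify="true"
```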

    • We're also very happy with it. I have a RAIS array of two servers each with RAID1. So our Postgres database is configured on a quadruple-backup setup thus:

      Postgres on /var/lib/postgresql, backed by /dev/drbd1:

      primary server  -----  secondary server
      RAID 1                 RAID 1
      2 x X25-E SSD          2 x X25-E SSD

      The servers are connected back to back by a direct gigabit ethernet link, and we use DRBD in protocol B (memory synchronous).
      Thus all transactions are guaranteed to hit the disk, we get fast performance, and excellent re
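The protocol choice described above is a one-line setting in the resource definition; a minimal sketch (all names and addresses hypothetical):

```
resource r1 {
    protocol B;     # "memory synchronous": a write completes once it has
                    # reached the peer's RAM, not necessarily its platters
    on db1 { device /dev/drbd1; disk /dev/sdc1; address 192.168.1.1:7790; meta-disk internal; }
    on db2 { device /dev/drbd1; disk /dev/sdc1; address 192.168.1.2:7790; meta-disk internal; }
}
```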

Just what we need: yet another networking module built into the kernel. Creating a fresh config with the 2.6-series kernels has become even more of a hassle, since there are so many modules that are activated by default. To stop the insanity I have to go through and eliminate 90% of what's there so that 'make modules' doesn't take longer than the kernel proper. Most of them are targeted at special applications and don't need to be in a default build.

    • Re: (Score:1, Flamebait)

      by Abcd1234 (188840)

      Maybe stop building kernels by hand and you'll be a lot happier, then, eh? Seriously, there's virtually no reason to build a custom kernel unless you have some pretty unusual requirements. So quit wasting your time. And if you insist on building kernels by hand for no particularly good reason, quit bitching. It's not like you don't have a choice.

      • Re: (Score:3, Interesting)

        People who build (and test) their own custom kernels are important. Sometimes, a bug won't show up except with some weird combination of kernel options, because some code path dependencies are missed with the fully configured kernels that the distros build for you.
        • by Abcd1234 (188840)

          People who build (and test) their own custom kernels are important. Sometimes, a bug won't show up except with some weird combination of kernel options, because some code path dependencies are missed with the fully configured kernels that the distros build for you.

          Well, that's very noble. Nevertheless, those who make the choice to build their own kernels, as valuable as they may be, are still making a choice, and that choice means putting up with the tedium of configuring and building the kernel out. Don'

          • by richlv (778496) on Friday December 11, 2009 @03:28AM (#30399000)

            i'm sorry to say, but that's not a good attitude. and i'm being polite here.

developers need testers. some arrogant assholes might claim they don't, but then they're known as such. now, to attract testers you not only are polite to them, you also do not discourage them by breaking or ignoring things that hamper them (but might not concern casual users); you actually should build tools and other support functionality for testing.
essentially, having fewer testers will impact the quality of the software for everybody else, so casual users should also want the project to have more testers.

            i'm glad that at least some kernel hackers recognise this, and 2.6.32 actually has support for new configuration method, which looks at already loaded modules and some other stuff to create trimmed down kernel config - http://kernelnewbies.org/LinuxChanges#head-11f54cdac41ad6150ef817fd68597554d9d05a5f [kernelnewbies.org]

            • by ookaze (227977)

              i'm glad that at least some kernel hackers recognise this, and 2.6.32 actually has support for new configuration method, which looks at already loaded modules and some other stuff to create trimmed down kernel config - http://kernelnewbies.org/LinuxChanges#head-11f54cdac41ad6150ef817fd68597554d9d05a5f [kernelnewbies.org]

But "make oldconfig" has been there for years.
It's not tedious at all to configure your new kernel when you have your old config file. Only the new options or the modified ones will show up.
So the tools are already there for those who build their own kernel.

              • by richlv (778496)

                It's not tedious at all to configure your new kernel when you have your old config file.

exactly. when you have. for the same machine.
when you don't have any, or when you are compiling for a new machine, it _is_ a tedious task - i know, i'm a slackware user who recompiles kernels.

and even with make oldconfig (which helps a lot), sometimes the amount of new config entries is huge, especially if you skip a couple of kernel releases.

                i don't know how usable localmodconfig is, but i really appreciate kernel devs who try not to alienate, and even attract, other people.

            • by schon (31600)

              developers need testers

              So your solution to this is to stop adding new drivers so the testers have nothing to test?

              Yeah, that makes a *lot* of sense!

              • by richlv (778496)

i'm sorry, WHAT?
did you respond to the wrong post, or did you misread some parts of it?

      • by JWSmythe (446288)

            Actually, custom kernels work better for most applications. It reduces the bloat of unwanted code that's been compiled in, and gives you exactly what you want.

            Anyone who bitches about it just hasn't had enough practice.

        • How much do you actually remove? I've not compiled Linux for almost a decade now, but I used to compile a custom FreeBSD kernel after install. Now all of the things that I want are compiled in or in modules by default and everything else is in modules. The stuff I'm not using just doesn't get loaded. The only overhead you get from modules that are not loaded is a small amount of disk space and a slightly longer kernel compile time (which doesn't matter if you're not compiling your own kernel). Accordin
          • by JWSmythe (446288)

                I actually take a good bit out, only leaving what I'll be using. It not only helps in the running environment, but it helps a lot at boot time too.

          • by Ash Vince (602485)

            How much do you actually remove?

Remove? Why remove anything? Instead, just start afresh.

I generally only change my kernel config when I buy a new PC or add new hardware. If I am building a new PC I start with a vanilla kernel source and then go through enabling just the functionality I need, and screw all those modules: I just build it in unless it has to be a module. This may result in a kernel that does not fit on a floppy disk, but why would I care? It doesn't fit on a punched card either.

            I know this takes time, but it is a good way of learn

        • by drinkypoo (153816)

          Actually, custom kernels work better for most applications. It reduces the bloat of unwanted code that's been compiled in, and gives you exactly what you want.

          If you're trying to save a megabyte of RAM on a modern computer, you're a tool. Building your own kernel was totally mandatory back in the 386 days, but it's totally unnecessary for most users. They derive a lot more benefit from knowing that DKMS will function. With that said, I do have a laptop that I've pondered building a kernel for, because it's got a so-far-unsupported processor (Athlon 64 L110) and if I want cool n' quiet I need a custom kernel. But as a user the best bet is to buy something already

        • by Abcd1234 (188840)

          Actually, custom kernels work better for most applications. It reduces the bloat of unwanted code that's been compiled in, and gives you exactly what you want.

          Apparently *someone* doesn't understand what kernel modules are. Hint: The code *isn't* "compiled in".

          • by JWSmythe (446288)

No, but you have to have some compiled in. Most of the stock distros I've gone looking through have things I don't want or need compiled in. The modules tend to be a little slower, and at the very least I have to wait for all of them to load before things work. Why should you have a delay while they load, when they can be put in the kernel right off and just work? On a one-off machine, you don't really care, but when you have a network of hundreds or thousands of machines, you don't want to

            • by Abcd1234 (188840)

No, but you have to have some compiled in. Most of the stock distros I've gone looking through have things I don't want or need compiled in.

              Uhuh.

              Such as?

              The modules tend to be a little slower

              Citation please.

              and at very least I have to wait for all of them to load before things work.

              Arrange to have the ones you need loaded at startup so you only pay the (miniscule) cost at boot time.

              Why should you have a delay while they load, when they can be put in the kernel right off, and just work.

              Wait wait... now I

    • Re: (Score:2, Troll)

      by JohnFluxx (413620)

      Oh noes! It takes like 10 minutes to compile the default kernel for all those users that compile their own kernel! Clearly linux is going down the tubes! What insanity!

      What's with all the idiotic posts?

      • Re: (Score:3, Funny)

        by Anonymous Coward

        Just wait 'til next week when the Gentoo folks finish compiling and finally see this story.

    • by Lemming Mark (849014) on Thursday December 10, 2009 @10:40PM (#30397942) Homepage

You want "make localmodconfig", which I think was also added recently, possibly in 2.6.32 actually. This builds a kernel using your local .config file, except that it only compiles modules that show up in lsmod. So if you boot off your vendor kernel with a squillion modules, let it load the modules you actually *use*, then do make localmodconfig, you can build a kernel that contains only those modules. I don't know what it does if module names etc. change; maybe you'd need manual fixup then - it should still be less work than you're currently doing, though.

      There's some explanation here, though it might be for an out-of-date version of the patch:
      http://linux.derkeiler.com/Mailing-Lists/Kernel/2009-09/msg04230.html [derkeiler.com]

As the other reply said, make oldconfig is also useful to import settings from a previously configured kernel, which can save a lot of time.
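The workflow the two suggestions above describe boils down to a few commands in the kernel source tree (the config path is hypothetical and distro-dependent):

```shell
# boot the distro kernel and use the hardware you care about, so the
# relevant modules get loaded; then, in the kernel source tree:
make localmodconfig   # start from the running config, drop modules not in lsmod

# when upgrading an already-configured kernel instead:
cp /boot/config-$(uname -r) .config
make oldconfig        # prompt only for options added since the old config
```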

    • by eyepeepackets (33477) on Friday December 11, 2009 @12:35AM (#30398394)

      They are called modules for a reason: You can add or remove at will, including whether or not you bother to build them at all. To say modules are "built into the kernel" is incorrect; module code is included with the kernel source code, but the modules themselves are only built and used if you choose.

      As concerns the "insanity" of configuring a kernel, here again you have a choice: Use Ubuntu. But if you want a fast, lean, mean machine you really do want to craft your kernel to fit your specific needs.

    • by dr.newton (648217)

      I shall now make several statements that may prove informative to you.

      DRBD is not "another networking module".

      Adding this feature to mainline, and thus maybe getting some RHEL support for it, will benefit a large number of companies doing things for themselves with Free software.

      There is nothing else like this in the kernel.

      If one wants new features, one must accept the occasional addition of some code.

  • Linux FS rocks (Score:5, Interesting)

    by digitalhermit (113459) on Friday December 11, 2009 @12:02AM (#30398250) Homepage

I admin AIX systems for my day job... One thing that's really nice about AIX is that the filesystem and underlying block device are highly integrated. This means that to resize a volume you can run a single command that does it on the fly. For AIX admins who are new to Linux it seems a step backwards, and they liken it to HP-UX or some earlier volume management...

Ahh, but the beauty of having separate filesystem and block device layers is that it's so damn flexible. I can build an LVM volume group on iSCSI LUNs exported from another system. In that VG I can create a set of logical volumes that I can use as the basis of my DRBD volume. On that DRBD volume I can carve out other disks. Or I can multipath them. Or create a software RAID.

    Anyhoo, DRBD is a really cool technology. It gives the ability to create HA pairs on the cheap. You can put anything from a shared apache docroot there to the disks for Oracle RAC. With fast networking available for cheap, almost any shop can have the toys that were once only affordable to big companies...
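The layering described above can be sketched with stock LVM commands (device and volume names are hypothetical):

```shell
# iSCSI LUNs exported from another box show up as ordinary block devices
pvcreate /dev/sdb /dev/sdc
vgcreate vg_repl /dev/sdb /dev/sdc
lvcreate -L 100G -n lv_r0 vg_repl

# a DRBD resource can then use /dev/vg_repl/lv_r0 as its backing "disk",
# and the resulting /dev/drbd0 can itself be partitioned, multipathed
# or layered into a software RAID
```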

    • Re:Linux FS rocks (Score:4, Interesting)

      by mindstrm (20013) on Friday December 11, 2009 @12:05AM (#30398268)

      Or you could have ZFS where you don't even need to resize.. it just happens.

      And you still have block device representations if you want them, along with all the other benefits of zfs.

      • Re: (Score:2, Flamebait)

        by drinkypoo (153816)

        Every time someone talks about how much they like some filesystems on Linux, someone pops up to tell us about how great ZFS is. Well, the license is shit, it was chosen specifically for GPL incompatibility, and sun can fuck off into the air. Stop trolling.

        • Re: (Score:3, Insightful)

          by Anonymous Coward

          That's not a licensing problem with ZFS. That's a licensing problem with Linux. ZFS integrated perfectly well with FreeBSD, license-wise.

          • by KiloByte (825081)

            Uhm, wrong. Sun's employees have specifically said [wikipedia.org] they introduced the incompatibility on purpose. So it's not a problem with Linux, but a problem with Sun.

            As GP said, stop trolling.

            • So it's not a problem with Linux, but a problem with Sun.

And it's not even a license problem; it's a patent problem. Remember how ZFS is like 6000 LoC? It would have been re-implemented by now if not for patents - heck, it's far more useful than many of the other odd filesystems Linux supports.

              • by KiloByte (825081)

It can be done better, too. ZFS's COW snapshots can be taken only of entire filesystems, so you need to mount the snapshot. I have an idea for cloning directory hierarchies; this could be damn more powerful.

Of course, I don't have a crack team of coders, and I'm not suicidal enough to start something as big as a filesystem as a personal toy project -- but it's something to think about.

It can be done better, too. ZFS's COW snapshots can be taken only of entire filesystems, so you need to mount the snapshot. I have an idea for cloning directory hierarchies; this could be damn more powerful.

How would that work? ZFS filesystems are ~free, though; you could mount a new one at an arbitrary directory location if you really needed it.

        • ZFS works great on Linux. Some of us don't care about the license, only if the software works. If you're a lawyer then I'm sure you love to get in a tizzy about the license, but for technical people with real work to do it's about whether the code works and is stable. It does and it is.

        • Every time someone talks about how much they like some filesystems on Linux, someone pops up to tell us about how great ZFS is. Well, the license is shit, it was chosen specifically for GPL incompatibility, and sun can fuck off into the air

          I think the point is somebody is going on and on about their awesome new buggy whip and somebody pops up and says, "dude, it's the 21st century, buy a fucking car."

          But anyway, you identify the real problem well, and there's some hope that Oracle will liberate the talent a

      • by greg1104 (461138)

        Or you could have ZFS where you don't even need to resize.. it just happens.

        Right, except if you want to do something crazy like reduce pool capacity [opensolaris.org], which is impossible.

    • Re: (Score:1, Insightful)

      by Anonymous Coward

      As soon as you're paying for Oracle RAC, you're so far gone from the Realm of Cheap that saving some bucks with DRBD isn't a concern any more.

Uuum, what stops you from doing the same thing on Linux? Every partition / logical volume can be partitioned again, and so on.

      Maybe I don’t get the difference.

      • by Wodin (33658)

        He was saying that in AIX it's all integrated and therefore easy and AIX admins tend to think of the way it's done in Linux as a step backwards, BUT with the Linux way of doing things it's much more flexible exactly because "every partition / [logical] volume can be partitioned again, and so on."

  • by pjr.cc (760528) on Friday December 11, 2009 @12:31AM (#30398376)

I don't like drbd (though I've used it for a while)... it's a massive, convoluted and complex mess, and fairly inflexible.

Personally, I'm hoping dm-replicator gets near completion sometime soon, though details of it are rather scarce (I do have a kernel built with the dm-replicator patches, but trying to do anything with it seems near impossible)...

I do a fair amount of work inside the storage world, and drbd is just such a mess in so many ways.

I sound very critical of drbd, and that's not the way I mean to come across. What I really am trying to say is that it's bloated for the small amount of functionality it provides, and with a couple of minor tweaks could do much MUCH more. It's a kewl piece of software, but like many FOSS projects it has a hideous, weighty config prone to confusion (something you just don't need with DR).

Still, that is the way it is!

    • by Macka (9388)

Don't hold your breath for dm-replicator; it's still a way off. And even when it does hit, you'll only get active-passive replication. Active-active isn't even on the roadmap yet, and DRBD has that today. In addition, there is no support today for dm-replicator in any of the popular Linux cluster stacks, where DRBD is very well supported and has been for many years.

    • Re: (Score:3, Interesting)

      by sydb (176695)

      I implemented a DRBD/heartbeat mail cluster for a client about six years ago. At the same time I implemented a half-baked user replication solution using Unison when we should have been using LDAP. I picked up DRBD and heartbeat easily under pressure and found the config logical and consistent once I understood the underlying concepts. Certainly not bloated. Unison on the other hand caused major headaches. So quite clearly, like LSD, DRBD affects different users in different ways and perhaps you should stic

  • Yes but what does it all mean?

  • 2002 (Score:1, Interesting)

    by Anonymous Coward

FreeBSD users have been doing it for 7 years with the default kernel. I guess that's one reason why it's more popular with companies that depend on HA, such as Bank of America. I love having ZFS as well; the combination is sooooo bad ass :-)

Those that run DRBD and want to try it can read this [74.125.77.132].

  • by pesc (147035) on Friday December 11, 2009 @04:40AM (#30399304)

    Ah, Linux gets disk level clustering?

    It is interesting to compare with what VMS offered 25 years ago [wikipedia.org]:
    - VMS could have multiple nodes (can DRBD? It is not obvious from the web site.)
    - All VMS nodes have read and write access to the file systems
    - The distributed lock manager [wikipedia.org] helps with file locking in this case.
- VMS has the concept of quorum [hp.com] to avoid the "split brain" syndrome mentioned on the web page.

    • Re: (Score:3, Funny)

      by Macka (9388)

      Yes yes yes - but 99.9% of slashdot users have probably never seen VMS, never mind a VMS LAVC cluster, so they have no idea that even today their latest toys are still playing catch up. Hell, half of 'em probably weren't even born then.

      Now if only I had an excuse to shout "get off my lawn" ;-)

      • VMS? That's the toy OS from DEC for people who can't afford MULTICS but want something that sucks a bit less than UNIX, right?
    • by jabuzz (182671)

That is a cluster file system, which is something DRBD is not.

Oh, and yes, there are a number of clustered file systems, both free and non-free, for Linux.
