
Wikimedia Simplifies By Moving To Ubuntu

Posted by kdawson
from the all-eggs-one-basket dept.
David Gerard writes "Wikimedia, the organization that runs Wikipedia and associated sites, has moved its server infrastructure entirely to Ubuntu 8.04 from a hodge-podge of Ubuntu, Red Hat, and various Fedora versions. 400 servers were involved and the project has been going on for 2 years. (There's also a small amount of OpenSolaris on the backend. All open source!)"
This discussion has been archived. No new comments can be posted.

  • Re:And? (Score:3, Informative)

    by Anonymous Coward on Friday October 10, 2008 @11:53AM (#25328511)

    8.04 is an LTS release, which is obviously the reasoning behind the version choice.

  • by Rik Sweeney (471717) on Friday October 10, 2008 @11:53AM (#25328527) Homepage

    I did not know that Ubuntu was a player in the server market.

    http://www.ubuntu.com/products/whatisubuntu/serveredition [ubuntu.com]

  • Re:And? (Score:5, Informative)

    by Ngarrang (1023425) on Friday October 10, 2008 @12:00PM (#25328613) Journal

    How is this news?

    Well, they either should have stuck with 7.10 or waited for 8.10.

    That's news...

    8.04 is a long-term release. In the world of servers, that counts for something. Also, there were changes from 7.10 to 8.04 that were probably things Wikimedia wanted to take advantage of.

  • by Anonymous Coward on Friday October 10, 2008 @12:12PM (#25328741)

    The Ubuntu version is given right up there at the top of the page: 8.04, which is called Hardy Heron and, BTW, is an LTS (Long Term Support) release.

    8.04 is rock-solid stable and has all the stuff in it to be a lean, mean, yet well-equipped server platform right from the base install.

    I've been running Hardy Heron since May 2008 without a problem, after switching from being a long-time SuSE/OpenSuSE user.

  • by Shade of Pyrrhus (992978) on Friday October 10, 2008 @12:16PM (#25328791)
    Ubuntu server edition is stripped down and customizable, as well. I assume they didn't use the desktop edition.

    This may be an outdated experience, but...I ran a single server with Gentoo for a while - until updating became such a tremendous pain. Manually merging configuration changes and such is simply not a good way to spend time, and neither is reading release notes to see whether I can simply use the old config and ignore new changes. Ubuntu is nice because installing and updating apps is easy, there is a wide variety of apps available for it, and it's quick and easy to install. Gentoo distro installation was a very lengthy, manual process - has this changed?

    I'd agree with others that say that CentOS may have been a better choice, but in my eyes the choice between the two comes down to preference of package management systems rather than any difference in security or performance.
  • Re:And? (Score:5, Informative)

    by cyphercell (843398) on Friday October 10, 2008 @12:24PM (#25328891) Homepage Journal
    the cuticle doesn't properly detach itself from the nail as it grows. The nail's growth slowly tears your skin apart.
  • The first technical person was Brion, who'd done the job as a volunteer for quite a while before that.

    I started editing Wikipedia in early 2004. I believe they'd just made the radical jump from one box to three boxes.

    Now stuff is structured in a horizontally-expandable fashion: "Add some more Squids." "Add some more Apache servers." So a single platform is an obvious win, and picking one platform to standardise on matters more than which of the various near-indistinguishable free Unix-like operating systems they pick, since any of them could do the job.

  • Re:go on.... (Score:3, Informative)

    by David Gerard (12369) <slashdot@NospAm.davidgerard.co.uk> on Friday October 10, 2008 @12:46PM (#25329165) Homepage

    I put the story in the queue as an insight into how a top-10 free content site run by a severely under-resourced charity does its stuff. And it's all over the press this morning, fwiw.

  • by brion (1316) on Friday October 10, 2008 @12:49PM (#25329233) Homepage
    That is an entirely accurate summary of the situation. :) We still have a tiny technical staff, and re-organization of things that got thrown together in a hurry long ago is an ongoing task.
  • by joe_cot (1011355) on Friday October 10, 2008 @12:52PM (#25329285) Homepage
    You'd think that, but consider this:

    If you install Red Hat, it costs money, because they support it.

    If you install CentOS, it's free, but if you need support, there is none. You can get support from third parties, but not from Red Hat. To get support from Red Hat, they'd need to move from CentOS to RHEL.

    If you install Ubuntu, it's free. If you need commercial support, you can pay Canonical. They could, for example, pay Canonical for a year, and, if they can handle it on their own, not renew their support contract. They also can choose later to go back to them. That's a lot more freedom than Red Hat can give, and unlike CentOS, they have someone to fall back on if they need help.
  • by brion (1316) on Friday October 10, 2008 @12:56PM (#25329349) Homepage

    Canonical has recently provided us a donated support contract, but that didn't influence our (much earlier) decision to stick with Ubuntu.

    Primarily:

    • We liked it better
    • It's nice that people can run the same version locally (who runs CentOS on their desktop? Playing CentOS vs RHEL just feels like a big fat kludge and tells you there's something broken about the distro.)
    • Unlike Debian stable, and like Fedora, it's updated fairly frequently so we get a decent rate of package updates for infrastructure...
    • ...unlike Fedora, it's not so bleeding edge that things die all the time (SELinux breaking everything, yay!)
    • ...and Canonical actually puts out security updates for a decent amount of time.
  • Re:Not so happy (Score:3, Informative)

    by brion (1316) on Friday October 10, 2008 @12:59PM (#25329391) Homepage
    Strangely enough, none of the things that bother you are an issue for us. Either they were fixed over two years ago, or they don't affect us.
  • Re:How many admins? (Score:5, Informative)

    by brion (1316) on Friday October 10, 2008 @01:10PM (#25329519) Homepage

    Mass installation of a customized distro can do better than mass installation of a general distro (eg, the kernel and software can be optimized for your use case).

    And indeed, we use a slightly customized Ubuntu, in that we have our own patched versions of some packages (PHP, Squid, MySQL, some custom PHP extensions, etc) tweaked for performance or features we need, plus custom meta-packages to install the configurations we require on different server sub-types.

    This is pretty easy to do on any distro with a decent package manager. I still like apt better than yum, though!
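    A meta-package of the kind described above is little more than a Depends line. This is a hedged sketch: the package names, maintainer, and dependencies below are invented for illustration, not Wikimedia's actual packages.

```
Source: wm-appserver
Section: metapackages
Priority: optional
Maintainer: Ops Team <ops@example.org>
Standards-Version: 3.8.0

Package: wm-appserver
Architecture: all
Depends: apache2, php5 (>= 5.2), php-apc, memcached
Description: hypothetical meta-package for one app-server sub-type
 Installing this single package pulls in the (locally patched)
 stack and configuration this class of server needs.
```

    Installing the one meta-package on a freshly imaged box then drags in the whole stack for that server sub-type via ordinary dependency resolution.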

  • by brion (1316) on Friday October 10, 2008 @01:25PM (#25329749) Homepage

    These are on our new image/media-upload fileservers. We're trying out the wonders of ZFS (snapshotting for consistent backups and "rm -rf oops" protection, potentially filesystem-level replication, etc).
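    The features mentioned above map onto a handful of standard zfs subcommands. This is a generic command sketch: the pool/dataset names and the backup host are invented, not Wikimedia's, and it obviously only runs on a box with ZFS.

```shell
# Take a consistent point-in-time snapshot of the upload store:
zfs snapshot tank/uploads@nightly-2008-10-10

# List snapshots; each is a cheap copy-on-write reference:
zfs list -t snapshot

# "rm -rf oops" protection: roll the dataset back to a snapshot:
zfs rollback tank/uploads@nightly-2008-10-10

# Filesystem-level replication: stream a snapshot to another box:
zfs send tank/uploads@nightly-2008-10-10 | ssh backup1 zfs recv backup/uploads
```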

    Since they're an isolated service type it's not a *huge* burden to have them be a little funky (eg, we don't randomly have an OpenSolaris box in the middle of the Apache/PHP cluster), though if we could do ZFS on Linux without jumping through scary hoops we'd happily do that instead!

    We'll try it out for a while; if we're happy with it we'll keep using it, and if not we'll migrate to something else eventually (the machines should run Ubuntu as happily as they do OpenSolaris).

  • by Drew M. (5831) on Friday October 10, 2008 @01:37PM (#25329899) Homepage

    Just another person who's dealt with Ubuntu in a large enterprise setting. I don't mean for these comments to be flamebait, but it may come off that way. I'd just like to see more attention put toward them.

    1. Incomplete automated installer. You can do nearly anything from Red Hat's kickstart, but doing partitioning with d-i, especially more advanced LVM and software RAID setups, is nearly impossible without some custom scripting hacks outside of d-i. Also, don't even ask what happens when you have a USB disk (or even just a card reader) plugged into the machine at automated install time; guess what gets recognized as /dev/sda... Speaking of which, since Ubuntu has their own installer, they don't support, fix, or use d-i, which means a lot of the time you will run into other random d-i installation bugs.
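    For context, automated d-i partitioning is driven by preseed answers like the ones below (a generic partman-auto recipe along the lines of the Debian/Ubuntu preseeding docs, not anything from the poster's setup). The stock recipes cover the simple cases; anything fancier than this is where the custom-scripting pain starts:

```
d-i partman-auto/method string lvm
d-i partman-auto/disk string /dev/sda
d-i partman-auto/choose_recipe select atomic
d-i partman-lvm/confirm boolean true
d-i partman/confirm_write_new_label boolean true
d-i partman/choose_partition select finish
d-i partman/confirm boolean true
```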

    2. Ldap/krb5 stability. It's quite obvious that Ubuntu doesn't put a priority on testing or stability patching any of this, and in large scale deployments it just falls over on the server and client side.

    3. Nobody in the enterprise uses CDs or DVDs to install; everything is automated from PXE, which means creating a local mirror to install from. Guess how difficult it is to mirror the "pool" directory without also getting the packages from every other version of Ubuntu. Yes, you could use a script that parses the Packages file and only downloads the packages you need, but that just leaves more room for errors. Why can't I just have a single directory I can rsync?
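    For what it's worth, the parse-the-Packages-file workaround mentioned above is only a couple of lines; this is a generic sketch (the toy index and mirror URL are placeholders), and as the poster says it is more fragile than a real mirroring tool:

```shell
# A toy Packages index; a real one lives under dists/hardy/main/binary-i386/
printf 'Package: foo\nFilename: pool/main/f/foo/foo_1.0_all.deb\n' > Packages

# Pull out the pool path of every package the index lists:
awk '/^Filename: / { print $2 }' Packages > wanted-files.txt
cat wanted-files.txt

# ...then fetch only those files from the mirror, e.g.:
#   wget --base=http://archive.ubuntu.com/ubuntu/ -i wanted-files.txt -x -nH
```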

    4. When doing large scale automated apt-get update; apt-get upgrade tasks, ask what happens to apt-get/dpkg when a postinstall script fails or there are file conflicts. Yes, the machine never fetches updates again. dpkg --configure -a, dpkg --purge --force-remove-reinstreq, and apt-get -f install are your manual cleanup friends. Also don't ask what happens when a user wants to install a local package with dpkg -i. Yes, it prints an error, but unbeknownst to the user the package actually gets half installed and breaks the automated update jobs. Why isn't there a --force flag to prevent this from happening?

    5. When patching packages, there are at least 8 different ways a diff could be included in the sources. Here's an incomplete list of a few different schemes I've found over the years:
    - Just drop the patches into patches/
    - Just drop the patches into a non-standard patches/ directory
    - Drop the patches into patches/ and add it to 00list
    - Drop the patches into patches/ and manually patch the source yourself
    - Edit the rules file and add in the patches manually

    They really need to adopt a single patching format, rather than quilt, dpatch, dbs, cdbs, and a bunch of other minor ones.

    The sad part is that nearly all of these issues also exist in upstream Debian. I'd love to see them get fixed. I'd like more choices that I can run in the enterprise.

  • Re:And? (Score:3, Informative)

    by AmberBlackCat (829689) on Friday October 10, 2008 @01:55PM (#25330097)
    A nail clipper works better to remove the "flaps". And applying cocoa butter or shea butter to the area afterwards, as well as the area between the nail and the finger.
  • Re:Simple is good (Score:5, Informative)

    by moosesocks (264553) on Friday October 10, 2008 @02:22PM (#25330533) Homepage

    I need to overwhelmingly emphasize that OS X Server is *barely* suitable for a production environment.

    I'm a big fan of Apple, and do appreciate the nice GUIs that they provided with OS X Server. However, it's not particularly stable, tends to break at odd intervals, and ignores many common Unix conventions, making it a huge pain to perform certain tasks, or do things not supported by the GUI.

    It's a nice start, but I'd be very cautious about adopting it across your entire server infrastructure. Using it to host certain Apple-y apps might be fine, though I'd rely upon Linux/BSD for serious server tasks, especially if you already have the staff/experience to do so.

  • Re:How many admins? (Score:4, Informative)

    by Colin Smith (2679) on Friday October 10, 2008 @02:49PM (#25330859)

    Try this for an idea... The whole concept of "installation" is wrong.

    Build your own distributions. One per purpose.

    Use something like RockLinux [rocklinux.org]

    to build a ramdisk image which contains all of the software and configuration required for a particular application. By "all" I mean "only". You end up with a single file which you put on a tftp server, you boot your servers over dhcp, they pick up the OS image and boot to the image on a ramdisk.

    e.g. You might have one Squid image, one PHP app server image, one MySQL RDBMS server image, etc. When the image boots it does whatever is required to run the app successfully, e.g. putting a filesystem on the hard disk.

    The benefits:

    • Zero server configuration (or close to it). This means no need for YUM, no RPM, no APT. No dependencies.
    • Massive scalability because of above.
    • Only tested images reach production. You know it is going to work because the production image is the same single file, you know exactly how it is going to perform because you tested exactly the same file already.
    • Everything is version controlled and completely repeatable as part of the build process.

    2 admins can easily run 500-1000 systems in a site because there is really only one machine: the network. Effort increases logarithmically with the number of systems.
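    The image-build step being described can be sketched with standard tools. This assumes the per-purpose root tree has been assembled under ./rootfs (the tree here is a trivial stand-in, not a bootable system); Linux initramfs images are gzipped newc-format cpio archives:

```shell
# Assemble a (trivial) per-purpose root tree; a real one would hold
# the app, its libraries, and its configuration, and nothing else.
mkdir -p rootfs/etc
echo 'hypothetical app config' > rootfs/etc/app.conf

# Pack it into a gzipped newc-format cpio archive, the format the
# Linux kernel accepts as an initramfs:
( cd rootfs && find . | cpio -o -H newc ) | gzip > initrd.img

# This single file goes on the tftp server; PXE clients fetch it
# alongside a kernel and run entirely from RAM.
ls -l initrd.img
```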

  • Re:And? (Score:3, Informative)

    by Randle_Revar (229304) * <kelly.clowers@gmail.com> on Friday October 10, 2008 @02:53PM (#25330947) Homepage Journal

    >new Xorg ditching its config file

    You can run Xorg without a config file now, but you don't have to (I believe that was also true in Xorg 7.3). Every recent version has made more of the old config file redundant or unneeded, relying instead on autodetection and sane defaults, which is a good thing. But you can still use the config file to override things, if needed.
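    For instance, a leftover xorg.conf only needs the sections you actually want to override; everything omitted falls back to autodetection. The driver name here is just an example:

```
Section "Device"
    Identifier "Card0"
    Driver     "intel"
EndSection
```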

  • by petermgreen (876956) <plugwash@p10l[ ].net ['ink' in gap]> on Friday October 10, 2008 @07:06PM (#25333745) Homepage

    Guess how difficult it is to mirror the "pool" directory without also getting the packages from every other version of Ubuntu.
    Not too hard, you just have to use the right tool: https://help.ubuntu.com/community/Debmirror [ubuntu.com]

    Why can't I just have a single directory I can rsync?
    IIRC the main reason Debian introduced the pool structure was to allow packages to be shared between versions (particularly testing and unstable) and therefore reduce the archive size.
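    For reference, a debmirror invocation that grabs a single release looks roughly like this (flags per the debmirror documentation; the host, arch, and target path are placeholders, and this is a sketch rather than a tested command line):

```shell
debmirror --host=archive.ubuntu.com --root=ubuntu \
          --dist=hardy,hardy-updates,hardy-security \
          --section=main,restricted --arch=i386 \
          --method=http --progress /srv/mirror/ubuntu
```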

