Linux Software

Kernel.org Needs Some Help, Perl Foundation Got Some

Dante wrote in to say "I just read this on the Linux Kernel mailing list, it's from Peter Anvin, one of the ftp.kernel.org maintainers...
H. Peter Anvin writes: "The recent troubles we've had at kernel.org pretty much highlight the issues with having an offsite system with no easy physical access. This begs the question if we could establish another primary kernel.org site; this would not only reduce the load on any one site but deal with any one failure in a much more graceful way.

Anyone have any ideas of some organization who would be willing to host a second kernel.org server? Such an organization should expect around 25 Mbit/s sustained traffic, and up to 40-100 Mbit/s peak traffic (this one can be adjusted to fit the available resources.) If so, please contact me."

In related news, mbadolato wrote in to tell us that "there's a press release over at dyndns.org announcing that they've donated $20,000 to the Perl Foundation!

'Thanks primarily to Perl and other Open Source technologies, we are able to provide DNS services to over 180,000 members of the Internet community,' said Tim Wilde, founder and chief executive officer of DynDNS.org. 'This is our way of giving back to some of the people whose tireless devotion to writing quality software has enabled us to provide our services to the Internet community over the past three years.'

The donation page for the Perl Foundation can be found here.

  • by sebol ( 112743 ) on Sunday January 20, 2002 @07:07AM (#2871727) Homepage
    Perhaps RH (or AOL) could be persuaded to sponsor a second kernel.org site.
  • What is the composition of these bandwidth requirements? I mean, if it's primarily file downloads, the requirements could maybe be met by a decentralized system. Just curious...
    • Sounds like a bad idea to me. Centralised control is a good thing on occasion - at the moment we have kernel.org and a list of mirrors. I think this is a sweet spot between being too centralised (bandwidth problems) and too distributed (control problems)

      So if we have a decentralised system (I assume you mean P2P-style? If not, then surely the mirrors system is adequate?) how are we to stop people abusing the system? I could host a trojan version - where do people go to verify the MD5? How many people actually check MD5s? My point is that we need a trusted pool of servers; the truly paranoid can still check their MD5s against those at kernel.org, and the rest of us can be assured that this is one of the trusted mirrors, and not a server owned by j03 133+ h4X0R.

      Hey - my acct works again!
  • by Anonymous Coward on Sunday January 20, 2002 @07:15AM (#2871735)
    Perl Foundation Got Some

    Well it's about time! I couldn't bear to think about those 45 year old GNU hippie geek virgins working at the Perl Foundation anymore.

    -Metrollica

    Read my UPDATED journals!
  • Dyn Dns. (Score:4, Informative)

    by ImaLamer ( 260199 ) <john@lamar.gmail@com> on Sunday January 20, 2002 @07:26AM (#2871740) Homepage Journal
    Wow! Because of this donation to the Perl guys and gals, my check is in the mail.

    I use DynDNS, and have been thinking about sending them *something*. I don't have much, but to see them donate a little something in return is nice. Any donation is cheaper than getting a 'real' domain name. Plus *.ath.cx is kinda cool, I wonder if goatse.ath.cx is available?

    I just hope all these donations don't go to stuff like strippers. I could be spending my money on that.
  • by dreamquick ( 229454 ) on Sunday January 20, 2002 @07:30AM (#2871742) Homepage
    But I would imagine that every time a new kernel is released, world+dog go to view the site. I seriously doubt that everyone who goes to the site downloads; most just read - but lots of people reading still requires a fair old chunk of bandwidth.

    As we saw earlier in the week, the /. effect seemed to bring that site to its knees, but as regular news sites see Linux as more and more relevant to their audience, they too will link people in, adding to the problem.

    Realistically they are victims of their own success - people want information about the new kernel and doubtless they want to download stuff too.

    As these once limited interest sites become more mainstream, then it's clear that they need to maintain quality of service, and that means no /. effects which stop people going to your site and could potentially discourage them from going there in the future.

    Just my 2c
    • by Zico ( 14255 )

      I would imagine that every time a new kernel is released, world+dog go to view the site.


      Why would you think that? Let's say _all_ Linux users did that; you're still only talking about 0.24% of computers out there. Of course, realistically, not all Linux users are going to visit the site for every new kernel release. Just imagine if Linux ever gets 1% of the computer world - that's going to be over 4 times the load that they're complaining about now. Say it with me, kids: "Scalability nightmares."

  • donation ability (Score:3, Interesting)

    by Anonymous Coward on Sunday January 20, 2002 @07:37AM (#2871746)
    Is kernel.org a 501(c)(3) org, so that whoever decides to donate bandwidth can have 'some' help and not take on all the expense of moving that much traffic?
  • by AirLace ( 86148 ) on Sunday January 20, 2002 @08:25AM (#2871777)
    The real problem is not lack of bandwidth. There's plenty of it to go around. What saddens me is that the ISC is throwing away [iu.edu] most of $80,000 annually because people can't be bothered to patch their kernel, and instead rely on downloading the full 20MB tarball every time a new kernel is released.


    The solution to the problem is really quite simple. As Larry McVoy, who maintains the powerful but non-free BitKeeper [bitkeeper.com] RCS system and knows a thing or two about patches, has hinted [iu.edu], kernel.org may be better off not providing a tarball for each release, instead providing some kind of utility that downloads the latest available full kernel, but only if necessary, plus patches. I'd be all for it. In the meantime, there are a number of incremental patching systems for the Linux kernel that automatically download patches, verify their signatures and patch the kernel, which may be worth looking into to save time, bandwidth and resources:

    • dlkern [chaosreigns.com]
    • buildkernel [stearns.org]
    • lkpatch, which has fallen into disrepair


    Of course, it goes without saying that everyone should still use their local mirror, particularly as kernel.org will only be accessible to mirrors for the foreseeable future.
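    As a rough illustration of what such a utility boils down to (the mirror URL, version numbers and file names here are just placeholders, not a real tool):

    # Hypothetical sketch: fetch, verify and apply one incremental patch.
    MIRROR=ftp://ftp.XX.kernel.org/pub/linux/kernel/v2.4   # pick your country-code mirror
    OLD=2.4.16; NEW=2.4.17
    cd /usr/src
    wget $MIRROR/patch-$NEW.bz2 $MIRROR/patch-$NEW.bz2.sign
    gpg --verify patch-$NEW.bz2.sign patch-$NEW.bz2        # check the signature before trusting it
    cd linux-$OLD
    bzcat ../patch-$NEW.bz2 | patch -p1                    # apply the incremental patch
    cd .. && mv linux-$OLD linux-$NEW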

    • by GrafZahl ( 180304 ) <reb AT b4mad DOT net> on Sunday January 20, 2002 @08:47AM (#2871794) Homepage
      IMHO your last sentence is the key to the solution. Other projects use a system whereby they don't allow direct ftp / http download access. This should be purely for mirror sites.

      It would also help if /. would not put direct links to kernel.org but would instead provide or link to a list of mirrors.

      BTW, the following message I just got back from www dot kernel dot org:

      The Linux Kernel Archives is currently offline due to a hardware failure. However, mirror sites are receiving updates; please use a mirror site instead.

      Maybe this is the beginning of the end of direct access to them?!

      Regards,
      REB

      • BTW, the following message I just got back from www dot kernel dot org:

        The Linux Kernel Archives is currently offline due to a hardware failure. However, mirror sites are receiving updates; please use a mirror site instead.

        Maybe this is the beginning of the end of direct access to them?!

        More likely, it's the problem referred to in the email.
      • Re:The real problem (Score:3, Interesting)

        by axioun ( 119341 )
        I believe using mirror sites exclusively is the key, too. This could be done either with the users' knowledge or behind the scenes. The first method would probably be the quickest to implement at this time. It would involve something much like what Tucows has. Have a main site that links to mirrors based on region. This selector system might even temporarily not show links to sites that are currently experiencing high loads so people don't just pick the server on the top of the list and /. it.

        This could probably be simplified by creating a small program that automates the server search process, and possibly downloads and applies patches on the fly. Later we might have a mirror system that is distributed. This latter stage could have a system like Freenet, which is P2P, or a load-balancing, centralized system.

        Another method that just came to mind is to use a system similar to Audiogalaxy or Napster. Once you've downloaded a new kernel, your download software remains running. The software could have a default time to terminate. While the software is running, it acts as a small server, and the "Kernelgalaxy" software (what a fitting name) controls distribution in a distributed manner.
    • Re:The real problem (Score:3, Informative)

      by cperciva ( 102828 )
      kernel.org may be better off not providing a tarball for each release, instead providing some kind of utility that downloads the latest available full kernel, but only if necessary, plus patches

      I agree, that's a great idea. But it needs a good name... how about calling it CVSup [polstra.com]?
    • by kinnunen ( 197981 ) on Sunday January 20, 2002 @09:05AM (#2871813)
      There should be a 'make update' that automatically retrieves (from the nearest mirror) the patch, uncompresses it and performs the patching. I can't imagine it would be too hard to code, and the ease of use should convince even the "I have a 100M pipe so I don't bother with patches" people to use it.
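      A rough sketch of what such a 'make update' recipe might run underneath (the mirror URL and version numbers are placeholders; no such target exists today):

      MIRROR=http://ftp.XX.kernel.org/pub/linux/kernel/v2.4
      for v in 2.4.17 2.4.18; do                     # step through the intermediate releases
          wget -q $MIRROR/patch-$v.bz2 || exit 1
          bzcat patch-$v.bz2 | patch -p1 || exit 1   # apply each patch in order
      done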
      • by Dom2 ( 838 )
        In FreeBSD, CVSup is used to keep source trees in sync. It's a very efficient way of keeping several hundred megs of source code up to date.

        I realise that CVSup is oriented towards CVS trees, which the Linux kernel isn't, but even an rsync server would be better than continuously downloading the patch.

        The reason I mention this is because of the support infrastructure available in FreeBSD:

        1. Install cvsup (once)
        2. edit /etc/make.conf (once)
        3. cd /usr/src && make update

        CVSup is available at http://www.polstra.com/projects/freeware/CVSup/
    • by Molina the Bofh ( 99621 ) on Sunday January 20, 2002 @10:00AM (#2871847) Homepage
      I know system admins (if they can be called such) that don't know how to patch. Granted, it's not an intuitive process.

      Also, some are not updating from the last kernel, which requires more than one patch.

      I also believe that such a tool, one that downloaded as many patches as needed, should be explained and incentivated in the kernel site's motd. If they don't show it on the front page, and say it's an advantage to the user, then few people are going to get it.
      • What does "incentivated" mean? Is it anything like advocated?
        Or perhaps publicised?
      • I agree. I'm not exactly a system admin. (Just a computer-illiterate home user who manages to link directories to themselves...)

        I downloaded the patch from 2.4.16 to 2.4.17, but couldn't figure out what to do with it... So I downloaded the whole kernel. Which means I wasted even more bandwidth than I would have if I had just downloaded the whole thing in the first place.

    • I don't know if this is because I live in Europe, but I don't think I've ever downloaded a single kernel (or patch) directly from kernel.org. For me, that site is always quite slow, 30K/sec tops or so (the cable across the Atlantic is a real bottleneck). I always use a good mirror, usually hosted by a university in Sweden, Finland or Germany, where I get higher download speeds (usually around 500-600KB/sec)
      I've been using Linux since 1998, and my favorite mirrors are often faster than I am in finding out there's a new kernel available.
      I do have to agree, though, that I rarely download patches. I have a reason for this: I like to keep the old kernel tree for a while, in case the new kernel is broken in some way that necessitates changing back to the old one. I value that, so I value full tarballs.
      • Re:The real problem (Score:2, Informative)

        by pljones ( 129152 )
        You don't ever need to download the tarball. You've got tar and patch installed. Brew your own:

        mv $OLDKERNEL linux                # work on the old tree under the name "linux"
        cd linux
        mv .config my-config               # save your config out of the way
        make realclean
        cd ..
        tar cjf $OLDKERNEL.tar.bz2 linux   # archive the cleaned old tree
        mv linux $OLDKERNEL
        tar xjf $OLDKERNEL.tar.bz2         # unpack a fresh copy, named "linux"
        cd linux
        patch -p1 < ../$PATCHFILE          # $PATCHFILE = your downloaded patch (the file name was eaten by the page)
        cd ..
        mv linux $NEWKERNEL
        ln -s $NEWKERNEL linux
        cd linux
        cp my-config .config               # restore your saved config
        make oldconfig
        Then "going back" is as simple as changing the symlink back.

      • If you keep the patch around you can apply it in reverse (-R or --reverse in the patch program) and revert the tree to the original one.
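        For example, something along these lines (version numbers are just for illustration):

        cd linux                                   # tree currently at 2.4.17
        bzcat ../patch-2.4.17.bz2 | patch -p1 -R   # reverse-apply; back to 2.4.16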
    • by Quixote ( 154172 ) on Sunday January 20, 2002 @10:31AM (#2871892) Homepage Journal
      ISC is throwing away most of $80,000 annually because people can't be bothered to patch their kernel, and instead rely on downloading the full 20MB tarball every time a new kernel is released.

      Another thing: when I download the kernel (as an end-user), why should I have to download Sparc, MIPS, IA64, PPC, etc. sources when all I need is x86? Maybe the kernel sources can be broken apart into individual architectures for the end users (obviously not for the kernel hackers).
      Just did a quick check on my 2.4.17-xfs. The "arch" directory, compressed, takes 5.1MB. But the i386 subdirectory takes just 400KB (all figures with tar | gzip -9). I see a potential savings of 4.5MB right here.
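      If anyone wants to reproduce the numbers on their own tree, something like this will do it (version and paths will obviously vary):

      cd linux-2.4.17
      tar cf - arch | gzip -9 | wc -c        # all architectures, compressed size in bytes
      tar cf - arch/i386 | gzip -9 | wc -c   # just the i386 subdirectory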
      • Re:The real problem (Score:2, Interesting)

        by wampus ( 1932 )
        Give that (wo)man a cigar... and some karma!

        I would MUCH rather grab linux-base-2.4.18.tar.bz2 and linux-i386-2.4.18.tar.bz2. Possibly break it up even further... I don't need or want SCSI drivers, Video4Linux, ISDN support, etc., etc. Why not a simple utility, similar to make xconfig, that not only configures your kernel but also downloads the necessary subsystems to your tree? This could also be used to patch for new releases. Just hit the "Upgrade" button and your hard drive grinds for a few seconds and spits out a shiny new kernel.
        This could even automate the process of checking signatures, similar to Ximian's Red Carpet. I know I never bother to check when I download from kernel.org or mirrors, but if the utility that downloads automagically does it, those that are worried about j03 h@x0r putting up a rogue mirror will be that much more likely to use the mirrors.
      • Another thing: when I download the kernel (as an end-user), why should I have to download Sparc, MIPS, IA64, PPC, etc. sources when all I need is x86? Maybe the kernel sources can be broken apart into individual architectures for the end users (obviously not for the kernel hackers).

        I thought as you do, and about a year and a half ago - I think it was with 2.2.16, not sure - I tried removing the nonessential arch/* garbage, because at the time I was restricted to about 2.5 gigs total space on my system.

        The kernel failed to compile in some pretty horrific ways.

        Now, if that stuff were for the specified architecture ONLY, it should've done just fine. Apparently, though, somehow stuff for my i386 kernel needs to reference stuff from IA64 or Alpha or some such. I think that separate kernel downloads based on architecture - from a download perspective, at least - are a good idea, but it appears that there needs to be some code cleanup before it can happen.
    • Of course, it goes without saying that everyone should still use their local mirror, particularly as kernel.org will only be accessible to mirrors for the foreseeable future

      I've thought a bit about this. It seems that every separate open-source project has to set up its own mirrors; there is no automatic system for finding "nearby" connections or for load-balancing, and volunteering to be a mirror can cause you to incur quite a bit of bandwidth cost.


      I would be willing to pay a modest fee for downloads. I don't know if many of the other open-source fans (notorious cheapskates) would be willing, but if they were:

      • A site like SourceForge could take the lead.
      • Anybody who wants could have their software hosted there.
      • Anybody who wants to would create an account to download.
      • Charges would be proportional to bandwidth used, possibly with lower off-peak rates.
      • SourceForge (or whoever) could subcontract connections to Akamai, Connxion, or others.
      • Possibly a portion of the money collected could go to the creators of the software, the big problem being how to decide how the money gets divided for projects with many contributors or a lot of code re-use.
      • Anybody could still distribute independently. It would not (could not) be an exclusive arrangement.
    • by nehril ( 115874 ) on Sunday January 20, 2002 @11:45AM (#2872050)
      or they could provide patches only for the first few days after a release (forcing the rabid hordes to learn how to patch if they want the goods NOW) and then, at a random later time, post the full tarball. this might cause some percentage of ppl to get into the habit of patching, which should make a significant dent in their bandwidth needs.
    • Or, the whole bloody thing can be put into read-only CVS, which would only update what's necessary and not force people to apply 12 patches one after another - why bother coming up with sequential patch-applying schemes, when the work's been done already?
  • I feel somewhat sorry for the kernel developers; they work really hard to provide us with a nice piece of software, and then they are stopped by hardware failures... first the old server, and now the new Compaq machine!

    The problem is I can't help them; my 256k (yeah, that's what they call broadband here) DSL line is at full load, sucking down the newest Debian packages :(
    Yep, you're right: Shit Happens :-/
  • by mollusk ( 195851 ) on Sunday January 20, 2002 @08:47AM (#2871793) Homepage
    If only there were some organization out there with a vested interest in Linux. One that owes its existence to Linux. Preferably one with a history of involvement in the Linux community. Maybe even some corporation that runs its own websites dealing with open source issues. And while we're wishing, maybe even some entity with experience dealing with massive traffic requirements similar to the dreaded 'slashdot effect.'

    Nah, nothing comes to mind. Shame.
    • Though I am sure the parent post is not meant seriously, it highlights one of the current issues revolving around resource provision for Free Software projects.

      IMHO we have to move away from the idea that central resource allocation for projects is good. The current debate about SourceForge and VA is the best example.

      It is just dangerous to rely on one or two main sites run by corporations. Why not try to find many corporations that can share the load, and also minimise the risk of the project being affected by any one company's woes?

      One main server that is the central source for many, many mirrors and without direct access for the end-user might be the way ahead.

      Regards,
      REB

    • Actually there are many companies bigger than OSDN which benefit from Linux and to whom the bandwidth cost would be negligible or free (they already pay for fat net pipes).

      IBM would be top of my list, but there's also SGI, Compaq, maybe RedHat (soon to be AOL-Time-Warner?!), SuSE...
  • solutions.... (Score:1, Insightful)

    by Anonymous Coward
    i agree with limiting direct access to kernel.org. save it for the mirrors, and for key developers.

    as for the rest of us, how about having words with major shareware sites.

    Or possibly some sort of pay-per-mb-download scheme from official mirrors? that would certainly improve the popularity of patching.

    just ideas. flame as you see fit.
  • MS (Score:2, Funny)

    Maybe Microsoft could host kernel.org.
    Then again, maybe not...
  • by A Commentor ( 459578 ) on Sunday January 20, 2002 @09:39AM (#2871831) Homepage
    25 Mb/s = 3.125 MB/s = 187.5 MB/min = 11.25 GB/hr = 270 GB/day = 8.1 TB/month
  • erm, Google? (Score:2, Insightful)

    by doq ( 126365 )
    Didn't google say recently how they save so much money with Open Source, etc etc etc?

    They probably have that kind of bw... :/
  • by John Zero ( 3370 )
    This is just another case, where the Freenet Project [freenetproject.org] could help, in the future.

    Besides being an anonymous (but authentic) information store, it is also highly distributed.
    In this case, that would mean there would be no "bottleneck"; instead, the kernel tar.gz would be distributed in small blocks.

    Too bad it's still under development, but it's getting better and better.
    • Freenet has lots of unnecessary complexity from trying to provide both publisher and downloader untraceability. Why take huge performance hits for hosting legal data like the Linux kernel? Also, Freenet loses because it requires people to leave their Freenet node running after they are done downloading, or else they aren't really helping to relieve the main problem here. It turns out that the vast majority of people don't want to do this; they just want to get what they want and get the hell out.

      Burris
    • Check out the Everything Over Freenet project, especially the freenetified apt-get [sourceforge.net] (apt is the Debian package manager).
  • my 2 Euro-cents (Score:2, Insightful)

    by Anonymous Coward
    How 'bout if people who use P2P like Edonkey, who downloaded the kernel-source, just put it in their shared directory? That would distribute the load a little bit
    • No thanks! (Score:2, Informative)

      by Bake ( 2609 )
      I'd rather download my latest kernel from a known and reputable source.
      P2P is not the way to distribute a critical thing like the kernel source. It only takes one individual with malicious intent to spread a virus in the kernel itself! Linux has been virus-free for over 10 years and I would personally like to keep it that way.
      • perhaps have kernel.org host an md5sum of the authentic tarball. then, once you get it, run md5sum and compare. also, compare tarball sizes, to the byte.

        at least, that's what i would do...
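        Roughly like this, assuming the authentic checksum and size are published somewhere trusted (the file name is just an example):

        md5sum linux-2.4.17.tar.bz2   # checksum of what you downloaded
        ls -l linux-2.4.17.tar.bz2    # size, to the byte
        # then compare both against the values published on kernel.org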
      • p2p systems like the one you mention use cryptographic hashes to make sure you are getting the file you really wanted. It would be just as secure as downloading from kernel.org.

        Burris
  • by cpuffer_hammer ( 31542 ) on Sunday January 20, 2002 @10:22AM (#2871876) Homepage
    Could some form of broadcast or streaming help?

    What about Net-News? It is an existing system that could distribute the patch to many people within a day.

    The new kernel could be released; mirrors and approved developers could have access to kernel.org for the first 3 days, then only patch downloads from kernel.org for the next 4 days.

    But through net-news, most people would have it within a day.
    • I can't speak for everyone, but my RoadRunner news server is absolutely terrible when it comes to large and multi-part files. And they don't seem to have much interest in making things better. If this became the "regular" way to get kernels, it would really suck.
      • Which Road Runner server are you using? I am on the Texas server (Austin, I'm pretty sure) and have had pretty good service from it. Retention is pretty good, and I consistently get about 15 megs a minute.

        If new kernels were posted to a specific group even before the mirrors were updated, I'd sure get them from there. Of course, I'd have to hit kernel.org to get the md5sum before I did anything with them. Just include instructions along with the post on how to verify authenticity and how important it is. Have an automated process post the latest kernel every 2 or 3 days to make sure it's always there.

        The traffic generated would surely be less than that used by even a small warez or porn newsgroup.

        Or another idea... make a deal with the major Internet providers to create their own mirrors of major projects. It would save them bandwidth out of their network, and would also be a good PR move.
    • Could some form of broadcast or streaming help?

      You can't broadcast to the entire internet.

      Multicast/streaming is of little use for downloading files, unless you include code to correct errors and resend corrupted parts. Besides, multicast doesn't really work well across the entire internet.

      Akamai could help. They have lots of servers located near the edges of the internet. The ISPs hosting these servers would probably welcome less costly traffic to upstream providers, and instead get the files to the customers from a server on their own network.
      • I think Sprint has a fully meshed multicast network. Any Sprint customer can request to join the multicast "grid".

        As for the broadcasting of source code, you'd have to use some form of forward error correction (FEC) similar to what is used in satellite broadcasting.
  • by macemoneta ( 154740 ) on Sunday January 20, 2002 @10:33AM (#2871896) Homepage
    Asking for a big chunk of bandwidth and centralized management is the problem. It's expensive. Instead:

    - Use the existing file sharing networks

    - Netnews (I can get the file faster from my ISP's news server than anywhere else), and software like pan makes getting all the pieces trivial.

    - Are there any open file sharing projects that we could use? Something limited to a single download per user wouldn't be onerous. There are lots of cable/DSL Linux users.
    • The problem is people could tamper with it; it would be pretty bad if someone put backdoor patches into the kernel and distributed it through a common P2P program or maybe even Usenet... we need to have a secure place on the web to distribute these files.

      Now I am not saying this couldn't be solved, by md5 or whatever check is needed, but you know as well as I do that common people like me don't use those checks until something goes wrong....

      Quazion.
    • This is all well and good with the filesharing idea (I like the news server idea myself). But if it was peer-to-peer (I'm tired; if that's not what you're talking about, buy me another beer and put me to bed), what would happen the first time uber-l33t black hat Joey Joe Joe Junior Shabadoo inserted some sort of backdoor into the kernel?

      The user's system would be compromised and they wouldn't know it.
      But in a case like that, that's where the MD5 checksum would come in handy, which would still have to come from a reliable source (kernel.org) - but how many admins out there who were just tossed into the job would actually compare the two?

      It's a great idea, but for those MS system admins who are now running a Linux box, it's a black hat's oasis of compromised systems. He could just keep a log of who downloaded his modified kernel and start scanning their IP blocks.
      • The number of people who say something along the lines of what you've just said is astonishing. People currently DON'T use MD5 because they feel they DON'T have to. If you've got a trusted source, then no worries. People WOULD use MD5 if they got patches from newsgroups or P2P. There is currently no need for rampant MD5 use, so of course few use it. But just as by now most hapless n00bs know never to click on a questionable email attachment, people will learn quickly to use checksums.
        • Exactly. Classical "users" don't upgrade their kernels from source. Those that do should know that the MD5SUM is available. You can even make the MD5 verification part of the build process, if that is a concern:

          # make

          *** MD5SUM file not found. Copy MD5 file "xyz"
          from http://www.trustedsource.com to the kernel directory before performing a make.

          This might not be a bad idea anyway, to keep people from becoming complacent.
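          A minimal sketch of such a pre-build check (the checksum file name and the idea of wiring it into make are assumptions, not an existing kernel feature):

          # refuse to build unless the tree's tarball matches a trusted checksum file
          md5sum -c linux-2.4.17.tar.bz2.md5 || {
              echo "*** MD5SUM mismatch or file missing; fetch the MD5 file from your trusted source first"
              exit 1
          }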
    • There's no such thing as Netnews; perhaps you're talking about Usenet. File distribution over Usenet is HIGHLY inefficient because the stuff has to be base64 encoded, which blows it up quite a bit.
      • It may be inefficient in space, but my cable can transfer at 600KB (that's K-bytes, not bits) per second, because I am going to a "local" server at my cable company. It doesn't matter if a 22MB kernel becomes 60MB; I can still get it in under 2 minutes. And there are a lot of Usenet servers.
    • - Use the existing file sharing networks

      A legitimate use for P2P? Unpossible! ;)

  • Before starting a download, people should get the answers to two questions:

    1) Who *REALLY* needs to update, and why?

    2) How to patch an older kernel.

  • Perhaps I'm uneducated about what all is out there currently. But it seems to me that, with a common base of GNU and Open Source software building a public commonwealth of computer operating systems that benefits everyone around the world, there should be some sort of sponsorship program or organization that would help streamline the searching for, finding, and matching up of corporate sponsors with software projects.

    Would it be so bad if, in return, the sponsor got a mention in the source code, and perhaps even in any "about this program" information box or command line option?

    An old paper of mine that might generate some ideas [mindspring.com]
  • The message was from hpa: H. Peter Anvin. chrisd obviously needs to look a little harder at his bootscreens....
  • How about some of those porn sites that use Perl extensively donating some of their profits?

    Of course, maybe they do - if I was getting bucks from porn people I might not be issuing press releases about it :)

  • by sstamps ( 39313 ) on Sunday January 20, 2002 @11:04AM (#2871961) Homepage
    1) Only allow access by mirrors and those ACTUALLY working on the kernel (ie, the kernel maintainers).
    2) Get more mirrors. We're talking like several thousand here. As an ISP, I know I would not mind hosting a mirror, but I cannot afford $25,000/month in bandwidth. Splitting up the load using a large number of mirrors would make it MUCH cheaper to mirror the kernel files.
    3) Use a highly-efficient load-sharing/balancing mechanism to direct people to mirror sites. Make it so the user can browse/select the files from the main kernel.org site, but the downloads are redirected from there to the mirrors.
    4) Use a better patch process to reduce the size of the average download: (a) the x.x.0 release is the only full download, (b) use a patch system that downloads all the necessary updates and applies them to the x.x.0 version (or whatever version the user already has) to get the latest version, and (c) MD5-checksum EVERY file to verify that it was patched correctly.
  • rsync has been up on the kernel.org site for a while now, and it only sends binary diffs, so it's quite light. The overhead is like 1%, plus the diffs. In fact, Tridge wrote this exact case (rsyncing the Linux kernel) up in one of his early papers on the need for rsync. See rsync.samba.org [samba.org] for more information.
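    For instance, something along these lines against a mirror that exports the tree over rsync (the server and module path are just examples):

    # pull only the changed blocks of the tarball from an rsync mirror
    rsync -avz rsync://rsync.XX.kernel.org/pub/linux/kernel/v2.4/linux-2.4.17.tar .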
    • Rsync is going to want to work on uncompressed tarballs or plain old unpacked source trees. (Diffing .gz or .bz2 files doesn't work well; your first change usually causes the entire remainder of the file to be different.) That is fine for bandwidth because it will compress the data before sending, but you do need to watch out for CPU use. My very rough estimate is that pumping out 50Mbit/sec of traffic with rsync is going to take something like a pair of top-notch i386 CPUs.

      I think it would still be a win. CPUs are cheap compared to bandwidth, but it does change the hosting dynamic a bit. You can't just use a nasty old desktop PC or virtual server to soak up the excess bandwidth. You need something with a little meat to it. (Not to disparage virtual servers, but they usually have paying clients that care if their CPU gets saturated.)

      Now that you mention it, it is such a good idea that I will set one up today. I can't publish access to it, though. I only have 2Mbps uncommitted and that won't go far on a slashdotted kernel mirror. :-)

      I suppose I will settle for rsyncing the tar file around. It is seductive to rsync the unpacked source tree, but if I turn on --delete then it will whack my .o files and header links and I'll always be doing a full build; plus, if I need to do a quick 'forgot a module' build, my kernel version will have changed. If I do not turn on --delete it would mostly work, but I could accumulate obsolete files and there is a danger of date-stamp problems.
  • why not use cvsup?
    and also keep drivers separate from the kernel
    i think it's dumb to include drivers for EVERYTHING! the tarballs are getting huge
    imagine being able to use a 5-month-old kernel because it works, and just compiling the latest modules for your scsi adapter or soundcard
    doesn't that sound like a better way of doing things?
    but then again i'm the kind of person who uses the kernel that comes with a distro, because it works and there's no reason to waste your time compiling a new one
    guess i'm just not an ubergeek ;-/
  • Just do what a large number of larger sites like Yahoo do: ask the mirror list (currently 100+ sites) to act as full mirrors, and round-robin the DNS.

    Further, make kernel.org alpha.kernel.org, and have alpha be the site everyone mirrors from, and restrict access to it to only core kernel developers.

    Overnight, you'd have taken care of the problem.
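    You can already see this pattern with the country-code names; repeated lookups rotate through the mirror pool (output will vary, of course):

    for i in 1 2 3; do host ftp.us.kernel.org; done   # each lookup should rotate the address order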
    • I always use www.bz2.us.kernel.org when downloading kernels...
      It seems to give me a random mirror site every time
    • insightful?? (Score:1, Informative)

      by Anonymous Coward
      This is the way it is currently done, and the way it has been done since the beginning of kernel.org more or less. Go to ftp.us.kernel.org (replace 'us' with your country code for a closer mirror -- seriously, it works for any country code) and you should get a different server each time.
  • Mirroring Scheme? (Score:2, Interesting)

    by suwain_2 ( 260792 )
    The first thing I have to say is largely irrelevant, but quite a good idea, IMHO. Let's move the 'official' list of kernel.org mirrors *off* the kernel.org machine. When you can't load their main page at all, it makes no sense to expect me to use a mirror, since I don't have that list.

    That said... An idea struck me. Suppose kernel.org develops a system where incoming requests are sent to a server; the routing is based on the preferences of the admin of that server. For example, let's say I work at a small webhosting company, and have a couple of T3s. (I don't really.) All our servers run Linux, and I want to give back to the community, and show everyone how cool I am. But I'm gonna go out of business if I allow 90 Mbps of bandwidth to be going to kernel.myfakelittlehostingcompany.com, because my customers wouldn't have any bandwidth.

    So I decide "Well... I can spare 10 Mbps at the most." I could tell the kernel.org admins this, and when you went to kernel.org, you would be redirected to a site, based on what the mirror sites wanted.

    I'm willing to be that companies like OSDN, RedHat, Mandrake, Rackspace, etc. might be willing to let a kernel.org mirror have a small bit of their bandwidth, if they had a way of knowing that it would be controlled.

    • Let's move the 'official' list of kernel.org mirrors *off* the kernel.org machine. When you can't load their main page at all, it makes no sense to expect me to use a mirror, since I don't have that list.

      Two words: google cache.
    • I'm willing to be [sic] that companies like OSDN, RedHat, Mandrake, Rackspace, etc.

      Not Mandrake at least. They, wisely, don't host a thing. It's all mirrors and it works well, especially since most people are downloading 650MB ISO images. Something kernel.org should think about. The only problem with that is they need fast syncing of the mirrors, because a lot of -pre patches are only tested for a few days until the next one comes out....
  • DreamWorks and other Hollywood moguls are big beneficiaries of Linux capabilities in animation, farm processing, etc.

    Why not ask them to contribute?

    Also, the big biotech firms are using massive PC farms with Linux. Why not ask them for support?

    Support for this non-profit project could come from these and other big corporate players who are profiting a lot from the technology we are developing.
  • Why not a peer-to-peer setup of some sort, using a semi-custom app that uses checksums to prevent any nasties from getting added to the src? A central mirror would not be needed, and it would be nearly impossible for the whole thing to be down at once.
  • by Bowie J. Poag ( 16898 ) on Sunday January 20, 2002 @01:33PM (#2872417) Homepage


    If iBiblio [ibiblio.org] is willing to host Propaganda, I'm sure they're more than willing to host a kernel.org mirror. In my experience, they've been wonderfully good hosts and run a very professional operation. Better still, they aren't hiding ulterior motives by hosting free software projects, unlike the two-letter chameleon we've all grown to hate over the past year or two.

    As for SourceForge, I wouldn't bother... The company that runs it turned its back on the community that made its existence even possible. That alone should dissuade anyone. More tangible, perhaps, is that the company has only one product (which they can't sell), and only enough cash on hand to last another year at most.

    Cheers,
  • P2P (Score:4, Insightful)

    by burris ( 122191 ) on Sunday January 20, 2002 @02:31PM (#2872677)
    The Linux kernel and other open source projects should use some of the up-and-coming peer-to-peer distribution technology to host files. Tools like BitTorrent [bitconjurer.org] use the bandwidth of the current downloaders to relieve pressure on the main publishers. Downloaders get pieces of the file in random order and automatically exchange pieces with each other. From the user's perspective, they just clicked a link. This technology desperately needs to be used by the kernel archives, Debian, Red Carpet, etc...


    Burris

  • by drwho ( 4190 )
    OK, I am sure I am probably not the first person to suggest this, but P2P makes the most sense for large files. But of course someone might try to trojan the file. What we need is a cross-platform, open-source P2P system with ENFORCED MD5 checking.

    What I find really distressing is the number of times I have searched for things such as the Linux kernel or OpenBSD ISOs on the late Morpheus/Kazaa, or Gnutella, and not found any. After downloading, I put up the kernel, never to have it touched.

    Many people don't use Gnutella because of the high bandwidth consumption of the porn/warez/mp3 searches going on. What if we were to start a new Gnutella network strictly for open source software, maybe with network ID "opensource"? I have to admit I don't consider myself an expert on Gnutella; maybe someone who is can remark on the merits of this idea.

    If I get three other people who respond to this post and will tell me they are willing to be a part of the network, I will put up a node here.

    Now, what we need is kernel maintainers (and other project maintainers) to post MD5s in a place that won't be swamped by people using the traditional methods (c2s?) of leeching^h^h^h^h^hdownloading.
  • Twenty thousand dollars is, what, two months of the burdened cost of a single mid-range software engineer? Why is this worth an exclamation point? Many organizations pay several full-time programmers to work on open source projects -- any one of these organizations exceeds this tiny donation in a week.

    Tim
    • Re:chump change (Score:2, Informative)

      by CyberBry ( 196935 )
      You're missing the point, though. To a company like IBM, $20,000 is nothing. They make millions off of open source, so having a few developers on staff contributing to open source projects ultimately leads to more profit for them (and benefits the open source community, too).

      However, last I checked, dyndns.org wasn't a multimillion dollar company. It's run entirely by volunteers (myself being one of them), and almost every penny of our income comes from user donations. We don't have a single full time anyone on staff to run the service, let alone to develop open source applications.

      What this donation is supposed to signify is not so much a dollar amount, but what it stands for. It's a challenge to any company making money from open source. If a non-profit service run by volunteers can donate such a sizable amount to open source, imagine what for-profit companies are capable of doing. Come on, try and one-up us :)
  • I plan on setting up a 400GB RAID system this spring (would that be enough? I could add an additional 400GB, but it wouldn't be RAID); unfortunately I'm only on cable. If I had the bandwidth they're asking for, I'd host that site no problem. It'd be fun. So basically, if anyone would like to fund a nice pipe into my basement, I'll do it :P
    Although I'm sure there's no shortage of admins for kernel.org.
  • by Sentry21 ( 8183 ) on Sunday January 20, 2002 @05:14PM (#2873366) Journal
    I would suggest Canadians start using the Canadian Communications Research Centre [www.crc.ca]'s servers. They do have the bandwidth, especially to university students (CA*Net III and other academic/research networks), who probably make up a large share of Linux kernel users.

    Incidentally, just some of the files available via rsync [ftp.crc.ca] from ftp.crc.ca [ftp.crc.ca] (which, sadly, has an anon-ftp limit of 25 users):

    Perl CPAN mirror
    GNOME desktop and utilities
    Linux HowTo's
    KDE desktop and utilities
    XFree86
    ALSA Linux sound drivers
    Debian Linux
    Debian Linux ISO images
    FreeBSD
    Alexey Kuznetsov's IP Routing Tools for Linux
    Blackdown's port of Java for Linux
    CRC's Linux Kernel Archive (I wonder if this is different from the standard kernel? they don't say "CRC's" on everything)
    CRC's RedHat mirror
    CRC's RedHat Contrib (interesting)
    Slackware Linux
    SUSE Linux
    TurboLinux
    CRC's VQEG Digital Video Experiments
    CRC's XAnim mirror

    So if you are Canadian and use any of these software packages (or the others on the page I linked), PLEASE use this site; it's extremely fast on broadband and even more so for university students. I used it for my Debian packages until they dropped the limit on FTP users. Maybe if I ask real nice they'll give me a login....

    The site itself is interesting too. Neat stuff.

    --Dan
  • by macdaddy ( 38372 )
    ...but I don't have near that much bandwidth to donate. IMHO, if a given site needs that much bandwidth (especially an open source one), then there should be a dozen or more mirrors set up off round-robin DNS with fault tolerance. I don't know of anyone who could justify to the powers that be that they need to pay more $$ for 25+ Mbits of bandwidth to sustain something for which they get no return (no visible return to the suits, at least). If there was a need for only 2Mbits, then I'm sure many places already have that much to spare. IMHO there should be a dozen or two mirrors. I can't see it working any other way, unless there just happens to be a rich geek who wants to put up the green for a big fat pipe. I would if I could, but I'm not Bill G.

    That's an idea! Linus should ask Bill G to front the green for the Linux kernel site. I know Billy-boy would do it. He's all for helping the community... ;-)

  • "The recent troubles we've had at kernel.org pretty much highlight the issues with having an offsite system with no easy physical access. This begs the question if we could establish another primary kernel.org site; this would not only reduce the load on any one site but deal with any one failure in a much more graceful way."

    I'm confused; I don't see how it begs the question. I don't see an attempt to prove anything with itself.
