Ask Donald Becker

This is a "needs no introduction" introduction, because Donald Becker is one of the people who has been most influential in making GNU/Linux a usable operating system, and is also one of the "fathers" of Beowulf and commodity supercomputing clusters in general. Usual Slashdot interview rules apply, plus a special one for this interview only: "What if we made a Beowulf cluster of these?" is not an appropriate question.

  • One question... (Score:5, Interesting)

    by Noryungi ( 70322 ) on Monday October 14, 2002 @12:04PM (#4446015) Homepage Journal

    (And this is a serious one!)

    Why did you choose Linux, instead of *BSD, to create a Beowulf?

    This is a serious question, not a flame: why choose Linux over, say, FreeBSD? Is it just because your employer already used Linux? Because you had used Linux before and had more experience working with it? Because you had tested both and found Linux better than BSD? Or because Linux had tools the *BSDs did not have?

    Just a question...
    • Re:One question... (Score:2, Interesting)

      by Anonymous Coward
      I don't know why. The ports system on FreeBSD seems to make a lot of sense.

      What I do know is that Red Hat is often preferred for making clusters because of RPMs. Sure, you can say "RPMs suck blah blah blah", but they are very easy to install. If you know that an RPM will work, it is a cinch to upgrade a package on all computers in the cluster at once. This way you can be sure that they are running the same software.

      I guess the FreeBSD ports system would be just as easy to use... Am I right?
      • You're kinda right, er... correct. FreeBSD has two systems of software distribution: ports and packages. Ports are a good thing if you're into automating the downloading/dependency-checking/building/installing of software in its most fundamental way. Packages, on the other hand, have already been through that process, as in somebody (or you) has already compiled the source code into a usable binary executable. Packages are more comparable to an RPM, as they have the ability to check dependencies for dynamically linked binaries, or whatever, and know how to install themselves. FreeBSD packages are every bit the equal of an RPM. The key difference between your RPMs and the FreeBSD ports/package collection is that FreeBSD is centralized, as in we have a big team of people that maintain the ports and add new ports all the time. When software becomes a port, it also becomes a package, since every FreeBSD port has a "make package" target in its Makefile. When a new release of FreeBSD is about to be made public, the entire ports tree is compiled down into packages and put onto a set of CD-ROMs. Well, not all the ports are, just the ones with licenses that permit such redistribution, currently over 8000 titles! Packages are a nice way for administrators to build once and install in many places on their networks/clusters.
    • Re:One question... (Score:4, Informative)

      by kiolbasa ( 122675 ) on Monday October 14, 2002 @12:11PM (#4446065) Homepage

      If I recall, the definition of a Beowulf cluster does not require Linux specifically, only a free operating system.

      Look it up [canonical.org]

    • Re:One question... (Score:2, Informative)

      by jahjeremy ( 323931 )

      Note: Only logged in because AC posting is giving me formkey errors.

      This isn't a very well-informed question. Beowulf does not specify a particular platform.

      From the Beowulf FAQ [canonical.org]:
      [Beowulf is] a kind of high-performance massively parallel computer built primarily out of commodity hardware components, running a free-software operating system like Linux or FreeBSD, interconnected by a private high-speed network.

      Please mod accordingly. Let's not waste Becker's time or one of the ten questions on ill-informed pablum refuted in the first question of an FAQ.

  • What one thing would you like to see added to the Linux kernel? Why hasn't anyone done that already? And how would that "One Thing" be better than somebody else's suggestion?

  • Your dream machines (Score:3, Interesting)

    by trevry ( 225903 ) on Monday October 14, 2002 @12:07PM (#4446035) Homepage
    Seein' as we all want to make beowulf clusters out of toasters and keyrings and coffee makers......
    What are the five dream machines that you want to have on, under, near, beside or in you over the next 10 years? And what do you foresee actually happening?
    And can we make beowulf clusters out of them?
  • by Bob Abooey ( 224634 ) <bababooey@techie.com> on Monday October 14, 2002 @12:07PM (#4446037) Homepage Journal
    This reminds me of when I was working at Apple in the secret (heh... my NDA ran out and they did away with the division, so it's no longer a secret...) two-button mouse division. Basically we used open source tools like Linux/Emacs and Linux/gcc because they were fast and very functional, but we could never get any of the team leaders to permit them company-wide, because they didn't come shrink-wrapped and thus were not officially supported. Now I know that you can get great support from Usenet, but that's not good enough for the pinheads who are in upper management at Apple.

    So, my question would be: what's the best way for an engineer at a large company to address this issue with the people they report to?
    • by Anonymous Coward
      Nerf guns. Lots of Nerf guns.

      Seriously. If your bosses were putting that much effort and cash into the adoption of a common device that the rest of the world already uses, rational arguments just won't work.

      Nerf guns.
    • How long ago were these efforts? The current upper management of Apple has built the foundation of their company on FreeBSD with Darwin, so it seems that your crack about the pinheads in Apple upper management is a past-tense statement.
  • hardware insights? (Score:5, Interesting)

    by rambham ( 60312 ) on Monday October 14, 2002 @12:09PM (#4446047)
    With your experience creating so many ethernet drivers, do you have any opinions or suggestions for hardware makers? Aside from good documentation, what makes a given hardware device easy to work with, and what makes a device hard to work with?
  • by eadint ( 156250 ) on Monday October 14, 2002 @12:09PM (#4446055) Homepage Journal
    Where do you see the Beowulf project going in the future? Also, I hope this isn't a redundant question, but will you be adding MPI into your clusters to create a kind of PVM/MPI hybrid? How about really good documentation? And finally: have you considered porting your software over to the OS X platform? If so, how can the Apple community help?
    • Beowulf, or rather bproc, has supported MPI for quite some time already. And if you don't want any single system image thing, MPI has been available for Linux for bloody ages. And I don't think it requires any kernel parts, so probably it would be quite easy to port to OSX.
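      For readers who haven't seen it, here is a minimal sketch of what MPI message passing looks like (a generic, hypothetical example, not code from Beowulf, bproc or Scyld; it assumes an MPI implementation such as MPICH or LAM is installed, with the usual mpicc/mpirun wrappers):

      /* hello_mpi.c: rank 0 sends one integer to rank 1, which prints it.
       * Build and run with something like:
       *   mpicc hello_mpi.c -o hello_mpi && mpirun -np 2 ./hello_mpi
       */
      #include <stdio.h>
      #include <mpi.h>

      int main(int argc, char **argv)
      {
          int rank, value;

          MPI_Init(&argc, &argv);                  /* start the MPI runtime */
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);    /* which process am I? */

          if (rank == 0) {
              value = 42;
              MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);   /* to rank 1, tag 0 */
          } else if (rank == 1) {
              MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
              printf("rank 1 got %d from rank 0\n", value);
          }

          MPI_Finalize();
          return 0;
      }

      The same binary runs on every node; the rank returned by MPI_Comm_rank decides what each copy does, which is the whole programming model in a nutshell.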
  • If you could make a Beowulf cluster out of anything, what would it be and why?
    • by registro ( 608191 ) on Monday October 14, 2002 @01:17PM (#4446573)
      Well, if the goal is TeraFlops-league clusters, what about using other commodity chips, like nvidia/3dlabs/ATI GPUs? Some groups are already working on ways to use GPUs as mathematical coprocessors [uni-duisburg.de], using OpenGL to represent numerical vector operations as OpenGL-based graphics operations on images.
      This is not just academic. GPUs are real vector processors, some of them capable of 200+ GFLOPS, using up to 128-bit floating point precision.

      That's about 100 times faster than Intel-based CPUs.

      Extending math libs, and adapting MPI to use the cluster GPUs as a vector-oriented math co-processor, could potentially lead to ten-computer, TeraFlops-level Beowulf clusters.
  • Enterprise Computing (Score:5, Interesting)

    by llamalicious ( 448215 ) on Monday October 14, 2002 @12:12PM (#4446074) Journal
    What is - in your opinion - the single most important, necessary evolution of GNU/Linux systems to help them become a commodity in the enterprise arena?
  • by theBraindonor ( 577245 ) on Monday October 14, 2002 @12:13PM (#4446077) Homepage
    What do you see as the future of distributed computing? Will it be massive P2P distributed networks for the masses? Or will it be large commercial distributed networks?

    What tools exist that will be used to create this future? What tools still need to be invented?
  • by Hairy_Potter ( 219096 ) on Monday October 14, 2002 @12:15PM (#4446095) Homepage
    You've written code that's used by millions of people, just about anyone who's ever networked a Linux box has used your driver. Yet, you're not rich. Would you like to see Linux people chip in a few bucks out of gratitude?
    • by Anonymous Coward
      Nah. What most folks don't realize is that I own their box(es) thanks to some backdoor code in the drivers. I've got the world's most powerful beowulf cluster at my fingertips, so money means very little to me. Believe me, if I wanted money I could take whatever I wanted.

      -Don
    • I have seen him answer this question a few times.
      He said he was quite happy about it. He contributed just a bit to the Linux kernel, but he got the rest of it for free. He accepted that as rather good payment.
      So unless he has changed that view, I'm not really interested in an answer to your question. And I also wonder if he'd care for a PayPal'ed website, although, you never know :-)
  • The Future.. (Score:5, Interesting)

    by Anonymous Coward on Monday October 14, 2002 @12:15PM (#4446100)
    What do you see the future holding for:
    (a) Beowulf technology
    (b) Different uses for Beowulf
  • OSS methodologies (Score:3, Interesting)

    by Jack Wagner ( 444727 ) on Monday October 14, 2002 @12:17PM (#4446116) Homepage Journal
    How do you address the issues that GNU/Linux suffers from by sticking with legacy programming methodologies and legacy (sad but true) programming languages? Namely, the lack of modern programming methodologies like eXtreme Programming, and of C++ or Java on the language side.

    Warmest regards,
    --Jack
  • how does it feel (Score:2, Interesting)

    by Anonymous Coward
    how does it feel to be giving all your work away for free when you could be making money working for a big company like microsoft?
    • Bear in mind that Donald Becker works for NASA. I think, given the choice of the two, NASA would be more exciting to work for.

      Well.. as long as they got Steve Ballmer to work there.. Scientists, scientists, scientists!! :)
  • by Bonker ( 243350 ) on Monday October 14, 2002 @12:23PM (#4446157)
    If you could add features to the x86 processor or architecture to make clustering work better, what features would you add?
  • Why (Score:4, Interesting)

    by idontneedanickname ( 570477 ) on Monday October 14, 2002 @12:24PM (#4446161)
    Why did you name it after the epic "Beowulf [amazon.com]"?
    • Re:Why (Score:2, Interesting)

      by Anonymous Coward
      Actually, Thomas Sterling named it. I heard this straight from him. They asked him for a name for his project over the phone, and he looked up at his bookshelf and Beowulf was the first thing he saw.

  • Two questions (Score:5, Interesting)

    by Theodore Logan ( 139352 ) on Monday October 14, 2002 @12:24PM (#4446162)
    The first one I really think should be in your FAQ [canonical.org], but I haven't been able to find it there: why did you choose the name of a millennia-old epic about a Scandinavian warrior for something that does not even seem distantly related?

    Secondly, do you read Slashdot, and if so, what do you think about all the troll jokes about Beowulfs? Was it at least funny in the beginning to hear about people "imagining" clusters of just about anything?

    Ok, so it was more than two questions. Sue me.
  • OS X (Score:4, Interesting)

    by paradesign ( 561561 ) on Monday October 14, 2002 @12:24PM (#4446166) Homepage
    What are your thoughts on Mac OS X?

    It seems to have all of the polish and usability Linux/BSD people dream about, while still maintaining a fully open source BSD core (Darwin). Have you ever been tempted away from Linux like so many others?

  • by turgid ( 580780 ) on Monday October 14, 2002 @12:26PM (#4446179) Journal
    Why do you think that message passing clusters are more popular than single system image clusters, and do you see the balance changing eventually? In other words, is there no compelling reason to choose single system image for most problems? Also, when do you think that the 32-bit addressing limitations of x86 hardware will become a problem for doing Big Science on clusters?
    • These two ideas aren't mutually exclusive. The Cray T3E is a single system image machine, but applications running on it are almost exclusively message passing in nature. My opinion on why there isn't a proliferation of SSI clusters is that they are a lot harder to build. If you go with a set of separate machines, which means you don't have a single *memory* image, getting the various kernels involved to all talk to each other is non-trivial. If you go with a single memory image, then you're not really doing a cluster, you are building a real supercomputer. Examples of single memory image machines of large size include the Sun Enterprise 1x000 line, the SGI Origin 2000/3000 series, the Cray T3E, and the not-quite-in-full-production-yet Cray X1.


      As for the 32 bit address limit, it's already a problem. For large scientific code, 4GB per processor is already not enough. Now, people live with it, but that doesn't mean they like it. Intel's 36-bit addressing hack doesn't help, either, since you still have a single-virtual-address space limitation of 32 bits. This is probably the biggest motivation to go to a 64 bit architecture. Note that this problem also applies to large databases.
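      To make the ceiling concrete, here is a tiny hypothetical test program (not from any real code base) that keeps mapping 256 MB anonymous chunks until the virtual address space runs out; on a 32-bit x86 Linux box it typically stops around 3 GB no matter how much physical memory (PAE or not) is installed, while a 64-bit machine sails past 4 GB:

      #define _GNU_SOURCE
      #include <stdio.h>
      #include <sys/mman.h>

      #define CHUNK (256UL * 1024 * 1024)          /* 256 MB per mapping */

      int main(void)
      {
          unsigned long long mapped_mb = 0;
          int i;

          for (i = 0; i < 1024; i++) {             /* give up after 256 GB */
              void *p = mmap(NULL, CHUNK, PROT_READ | PROT_WRITE,
                             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
              if (p == MAP_FAILED)
                  break;                           /* address space exhausted */
              mapped_mb += CHUNK >> 20;
          }
          printf("mapped %llu MB of virtual address space\n", mapped_mb);
          return 0;
      }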

    • by joib ( 70841 ) on Monday October 14, 2002 @01:51PM (#4446824)
      Programming MPI (i.e. message-passing) is slow, difficult and error-prone. But I'd say making the hardware and especially the operating system for a single system image computer with thousands of processors is even more difficult. Or hey, why stop at thousands of processors? IBM is designing their Blue Gene computer, with 1 million processors. How do you make a single kernel scale on a system like that?

      The traditional approach is to use fine-grained locking in the kernel, but this tends to lead to unmaintainable code and low performance on lower-end systems (a toy coarse-vs-fine locking sketch follows this comment). For an example of this, see Solaris, or most other big-iron unix kernels.

      Another approach is the OS cluster idea championed by Larry McVoy (the BitKeeper guy). The idea is that you run many kernels on the same computer, one kernel taking care of something like 4-8 CPUs. And then they cooperate somehow so they can give the impression of SSI.

      A third approach seems to be the K42 exokernel project by IBM. They claim very good scalability without complicated lock hierarchies. The basic design idea seems to be to avoid global data whenever possible. Perhaps someone more knowledgeable might shed more light on this...

      But anyway, until someone comes up with a kernel that scales to zillions of CPUs, message passing is about the only way to go. Libraries that give you the illusion of using threads but are actually using message passing underneath might ease the pain somewhat, but for some reason they have not become popular. Perhaps there is too much overhead. And some people claim that giving the programmer the illusion that all memory access is equal speed leads to slow code. The same argument also applies to NUMA systems.

      And on the system administration side of things, projects like mosix and bproc already today give you the impression of a single system image. Of course your application still has to use message passing, but administration and maintenance of a cluster is greatly simplified.
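      As a rough user-space illustration of the locking trade-off mentioned above (a toy sketch only, not kernel code; names like NBUCKETS and table_put_* are made up for this example), compare a hash table protected by one global lock with one protected by per-bucket locks:

      #include <pthread.h>
      #include <stdlib.h>
      #include <string.h>

      #define NBUCKETS 64

      struct node { char *key; int value; struct node *next; };

      static struct node *buckets[NBUCKETS];
      static pthread_mutex_t global_lock = PTHREAD_MUTEX_INITIALIZER;  /* coarse */
      static pthread_mutex_t bucket_lock[NBUCKETS];                    /* fine */

      void table_init(void)
      {
          int i;
          for (i = 0; i < NBUCKETS; i++)
              pthread_mutex_init(&bucket_lock[i], NULL);
      }

      static unsigned hash(const char *k)
      {
          unsigned h = 5381;
          while (*k)
              h = h * 33 + (unsigned char)*k++;
          return h % NBUCKETS;
      }

      /* Coarse-grained: every insert serializes on one lock.  Simple, but the
       * lock becomes the bottleneck as the number of CPUs grows. */
      void table_put_coarse(const char *key, int value)
      {
          unsigned b = hash(key);
          struct node *n = malloc(sizeof(*n));
          n->key = strdup(key);
          n->value = value;
          pthread_mutex_lock(&global_lock);
          n->next = buckets[b];
          buckets[b] = n;
          pthread_mutex_unlock(&global_lock);
      }

      /* Fine-grained: inserts into different buckets run in parallel, but now
       * there are 64 locks whose ordering and interaction have to stay correct
       * for every other operation (resize, iterate, delete...). */
      void table_put_fine(const char *key, int value)
      {
          unsigned b = hash(key);
          struct node *n = malloc(sizeof(*n));
          n->key = strdup(key);
          n->value = value;
          pthread_mutex_lock(&bucket_lock[b]);
          n->next = buckets[b];
          buckets[b] = n;
          pthread_mutex_unlock(&bucket_lock[b]);
      }

      Scale the second pattern up to thousands of locks inside a kernel and you get the maintainability problem described above.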
  • Grid Computing? (Score:2, Interesting)

    by SilverThorn ( 133151 )
    Is Grid Computing (http://www.butterfly.net [butterfly.net]) really the foundation of enterprise-based Beowulf technology? If so, what other modernized aspects can this technology be applied to?
  • Do you ever regret leaving Steely Dan [steelydan.com]?
  • by johnnyb ( 4816 ) <jonathan@bartlettpublishing.com> on Monday October 14, 2002 @12:30PM (#4446212) Homepage
    In addition to being extremely smart, Donald Becker is a world-class guy. When I was new to Linux, I had trouble with one of his drivers. I emailed him, and within a day he emailed me back. It was a pretty stupid issue - I needed to download the latest driver :) However, he was very nice about it, and didn't send me an RTFM - in fact he included instructions for building and installing it.

    Anyway, Donald - thanks for helping me out when I was a stupid newbie, you are truly a world-class fellow.
    • Yeah, he does a lot of support on top of the stuff he writes.

      And his drivers have much better diagnostic capabilities than anything else out there.
    • Don Becker does indeed seem to have done some great work, especially on network drivers and clustering.
      I can't seem to find any information on his work regarding `making Linux usable' that's mentioned in the byline of this story. Am I missing something, or is that part of the intro a little bit confused (maybe Slashdot has a different definition of usability from the rest of the computing world)?
  • by unixmaster ( 573907 ) on Monday October 14, 2002 @12:31PM (#4446221) Journal
    What do you think about the effect of the next Linux kernel (v2.6/3.0) on clustering, when the new O(1) scheduler, the new VM, and many other new features are taken into consideration?
  • by Effugas ( 2378 ) on Monday October 14, 2002 @12:33PM (#4446235) Homepage
    Dr. Becker,

    As I'm sure you've noticed, the price of memory has been driven into the ground -- indeed, it's so inexpensive, the economics seem to have rendered the usage of virtual memory nearly obsolete. Need another 256MB? Spend the $20 and buy it. It's just that simple.

    Now, memory makers can't let their goods be absolutely commodified forever, and I'm unconvinced that further speed increases, either in latency or bandwidth, will remain permanently relevant. So I'm curious about your opinion of embedding highly localized simple logical operators amongst the core memory circuitry itself. I've heard a slight amount about work in this direction, and it seems fascinating -- instead of requesting the raw contents of a block of memory, request the contents run through a highly local but massively parallelizable operation -- bit/byte/word interleaved XOR/ADD/MUL, for example (a toy sketch of such a read follows this comment). Obviously semiconductors can do more than store and forward; do you believe we a) will and b) should see memory implement trivial operations directly? What about non-Turing-complete instruction sets?

    Yours Truly,

    Dan Kaminsky
    DoxPara Research
    http://www.doxpara.com

    P.S. Please forgive me if this entire post reads like "What about a beowulf cluster of DIMMs?"
    P.P.S. Be honest: Do you ever find it ironic that the Internet Gold Standard for Ethernet cards ended up being called Tulip?
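    As a toy software illustration of the kind of read described above (purely hypothetical; a real processing-in-memory part would fold the words inside the DRAM banks rather than in a CPU loop like this), a "read" that returns a block folded through a word-interleaved XOR instead of the raw contents might look like:

    #include <stddef.h>
    #include <stdint.h>

    /* A conventional read hands back all the words; this one hands back a
     * single word, the XOR of the whole block.  Each bank could compute its
     * partial fold in parallel and combine results on the way out. */
    uint32_t read_block_xor(const uint32_t *block, size_t nwords)
    {
        uint32_t acc = 0;
        size_t i;

        for (i = 0; i < nwords; i++)
            acc ^= block[i];
        return acc;
    }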
  • Have you actually read Beowulf [amazon.com]? If you have, could you please write a 5,000-word book report for the children that have difficulty finding information on the book because of the saturation of Beowulf trolls, jokes and legitimate information online?

    Thanks for thinking of the children!

  • Did you ever have anything to do with crynwr?
    And why don't you people like vowels? :)

    (Thanks for the ne2000 driver!)

  • by jahjeremy ( 323931 ) on Monday October 14, 2002 @12:36PM (#4446260)
    Please describe the general process you follow for writing and testing ethernet drivers on linux.

    A couple more specific questions...

    1) What approach do you take in creating drivers for cards which have inaccurate or insufficient documentation?

    2) What tools do you use for debugging and/or "discovering" the workings of old/obscure/poorly documented hardware?

    3) What skillset, i.e. languages, knowledge & tools, do you consider necessary to perform the kind of coding you routinely do (outside of hacker wizardry and C mastery)?

    I am also wondering how you got started writing ethernet drivers and clustering software for linux. What led you down this specific path rather than other aspects of kernel/OS development?

    JM
  • Java (Score:5, Interesting)

    by Anonymous Coward on Monday October 14, 2002 @12:38PM (#4446280)
    What do you think about Java and its role in distributed computing? Do you have much experience with Java, and what are your opinions of it?
  • by iamsure ( 66666 ) on Monday October 14, 2002 @12:41PM (#4446294) Homepage
    As the man responsible for writing multiple network card drivers, you are in a unique position to answer this..

    What (FastEthernet/100mb) Network gear manufacturer do you prefer and recommend to others?

    Whether it's servers or home use, it's an important question, as some are as buggy as all get out, and others are to die for.

    And if it's a different answer, which manufacturer do YOU use?
    • Yes! If any question is to be answered, please let it be this one. After my Tulip card, my ethernet HW has all been poo. Does anyone make decent gear these days?

      • I like Becker's drivers, but I ran into a problem with his Tulip ones -- on a *massively* overloaded Ethernet, if you get 16 retransmits failing and so the transmit fails, the driver does a full reset of the card. This makes the card not send data for about two seconds, which means on an extremely overloaded Ethernet, the card isn't that useful.

        Right now, I'm using a 3c905b card (though it isn't a Becker project) with great success.

        I think Linus likes eepro cards, IIRC from lkml.
    • by steveha ( 103154 ) on Monday October 14, 2002 @02:52PM (#4447256) Homepage
      I blush to admit this, but I have already asked him this question. Last January I was having trouble with a network card, and I sent email to Mr. Becker asking his advice.

      Here's my quick summary of what he told me:

      Some network cards are really pathetic and/or broken. As long as you don't buy one of those, it doesn't really matter very much which one you buy.

      The 3Com 3c905 cards are a little bit better than other cards.

      I found this web page:

      http://www.fefe.de/linuxeth/ [www.fefe.de]

      Based on that web page and Mr. Becker's comments, I bought myself some 3Com 3c905c network cards, and I have been very happy with them.

      P.S. I used to buy my net cards by brand name. Bad idea! You must look beyond the brand name and see what chipset the net card uses. I bought a Linksys LNE100TX card and liked it, so I kept buying that card. But Linksys started making different versions of the card, using completely different chipsets, so the last time I bought that card it turned out to be really broken under Linux. Older LNE100TX cards work well with the "Tulip" driver under Linux, but newer ones are really broken.

      steveha
      • It's not just Linksys...

        D-Link also does it...

  • OpenMosix (Score:5, Interesting)

    by GigsVT ( 208848 ) on Monday October 14, 2002 @12:52PM (#4446367) Journal
    As someone who has made small contributions to the OpenMosix project, I'm amazed at what clustering can do, but at the same time I'm disappointed at what it cannot.

    Distributed shared memory is a big hurdle facing the OpenMosix project over the next couple of years. Right now any program that allocates shared memory (see the minimal shmget/shmat sketch after this thread) cannot migrate. What do you think of projects like OpenMosix? Do you think we will reach a point where parallel programming is a thing of the past, discarded in favor of tools like OpenMosix that require no special programming considerations except implementing clean threading?
    • I can imagine an interesting architecture for SHM coherency involving L2 Broadcast as the backhaul and random hash broadcasts for most-recent-update-received synchronization. As long as updates are reasonably rare, this can work astonishingly well, though I must admit writes will inevitably block for significantly longer periods of time than they otherwise would locally. Fits in well with some other packet mangling I'm doing...toss me a mail, will ya?

      Of course, the obvious approach of only migrating processes and not the shared memory it allocated (instead using SHM-over-TCP-maybe-with-SEQ#'s-directly-mapping-to-the-2GB-space) also should work.

      --Dan
      www.doxpara.com
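    For anyone who hasn't run into it, here is a minimal sketch of the kind of allocation being discussed in this thread: a process creating and attaching a System V shared-memory segment with shmget/shmat (purely illustrative; under OpenMosix as described above, a process holding such a segment could not be migrated to another node):

    #include <stdio.h>
    #include <string.h>
    #include <sys/ipc.h>
    #include <sys/shm.h>

    int main(void)
    {
        /* create a private 4 KB shared-memory segment */
        int id = shmget(IPC_PRIVATE, 4096, IPC_CREAT | 0600);
        if (id < 0) { perror("shmget"); return 1; }

        char *mem = shmat(id, NULL, 0);      /* map it into our address space */
        if (mem == (void *)-1) { perror("shmat"); return 1; }

        strcpy(mem, "hello from shared memory");
        printf("%s\n", mem);

        shmdt(mem);                          /* detach... */
        shmctl(id, IPC_RMID, NULL);          /* ...and mark the segment for removal */
        return 0;
    }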
  • by gosand ( 234100 ) on Monday October 14, 2002 @12:54PM (#4446381)
    Donald, as the founder and CTO of Scyld, as well as a member of the board of directors, do you still get to hack, or is your time all taken up with business? Do you ever get the itch to get back to hacking code? If so, what are you working on?
  • by 4of12 ( 97621 ) on Monday October 14, 2002 @12:56PM (#4446393) Homepage Journal

    Would you care to comment on your experience in NASA working on an Open Source project? (I understand you've left NASA for Scyld, maybe that partially answers my questions, but I still want to know...)

    It seems as if your work on Beowulf clusters had a nice spin-off in terms of providing not only low cost supercomputing for academic, government and industrial users, but also in terms of Ethernet support for all sorts of Linux users.

    1. Are further spin-offs in the works, be it for advanced network interfaces or anything else?
    2. Are the program managers in government aware of the beneficial impact they have on a wider scale by funding work like yours?
    3. Do they even care?
  • by cyco/mico ( 178541 ) on Monday October 14, 2002 @01:02PM (#4446451)
    What made you believe in the future of Linux? What justified the efforts you put into its development? Was it the spirit of the community in the early days, or maybe the realization: "This is _my_ tool, better suited than *BSD, and I can bring it to the point I need it to be."?
  • What if we made a Beowulf cluster of Donald Becker?
  • Big companies are jockeying for good ground on the subject of Grid Computing. What role does Linux Beowulf have in the future of Grid Computing, and do you think that the community can come up with better Grid solutions than those being pushed by the Big Boys?
  • limits of clusters (Score:5, Interesting)

    by flaming-opus ( 8186 ) on Monday October 14, 2002 @01:05PM (#4446481)
    Beowulf and similar clusters have hugely lowered the cost of supercomputing for a great number of scientific problems. Due to the great interdependence of data and the relatively high latency of cluster interconnects, some problems are not easily worked on using clusters. What are the evolving areas of clustered computing? Where are the advances: are new algorithms being developed for these difficult problems, or are clusters becoming more capable?

    - Also -

    What tools are seriously lacking in Linux clusters? Are open source (or low cost) cluster filesystems necessary to expand the use of Beowulf clusters? Are better libraries needed? Where is research needed?
  • What comes next? (Score:5, Interesting)

    by Matt_Bennett ( 79107 ) on Monday October 14, 2002 @01:07PM (#4446494) Homepage Journal
    Ethernet seems to be reaching the end of its usable capacity: a gigabit Ethernet card running at full bore (wire speed) can max out many machines on both bus bandwidth and CPU utilization. Infiniband appears to be the best alternative, but acceptance is so slow it may never make it. There is a Linux effort with Infiniband [sourceforge.net], but due to the slow acceptance and development of Infiniband, it seems we may never see the combination of good working hardware and a complete software implementation of the standard [infinibandta.org].

    If Ethernet consumes too many resources, and Infiniband is stillborn, what's the next communications medium for networking and clustering?
    • At moderate cost, the most popular are Myrinet and Dolphin, at around $1000-$1500 per node IIRC. For the high-end stuff, there's Quadrics, at $3000+ per node. Of course, Quadrics is also mind-bogglingly fast, something like 350 MB/s in each direction, and 5 us latency.
      • Really, my point is that right now we don't have a standards-based high-speed interconnect other than Ethernet, and that is quickly running out of steam. I'm vaguely familiar with Myrinet, but it doesn't seem to have caught on. (I just read that Myrinet is standards [myri.com] based.)

        Even then, those standards you mention are all about clustering, not general purpose networking. Ethernet has really caught on in the 30 or so years it has been around because it has been very adaptable; maybe that adaptability is exactly what is preventing it from progressing further.
    • by delta407 ( 518868 )
      a gigabit ethernet card running at full bore (wire speed) can max out many machines both on bus bandwidth and CPU utilization ... Ethernet consumes too many resources
      Wait -- if gigabit maxes out the bus bandwidth and CPU of a machine, how is that consuming too many resources? If my system bus can only transmit data at, say, 100 MB/s, and that goes directly to the gigabit card, why is that a bad thing?

      Isn't that a limitation of the computer, not a limitation of gigabit Ethernet?
      • The point is: the computer has no time to do anything but transfer data and the overhead associated with it, so if you happen to be doing anything other than transferring data, something has to give, and your data transfer speed is lower. Gigabit will max out a 33 MHz, 32-bit PCI bus, but can only take up (about) 1/8th of a 133 MHz, 64-bit PCI-X bus; but even in that circumstance, you'll come close to maxing out a 1 GHz Xeon, because TCP/IP has so much computational overhead.
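        Back-of-the-envelope, using theoretical peak numbers (actual throughput is lower): 32 bits x 33 MHz PCI is roughly 133 MB/s, and gigabit Ethernet at wire speed is roughly 125 MB/s in each direction, so one busy port can saturate that bus. 64 bits x 133 MHz PCI-X is roughly 1066 MB/s, so the same port takes about 1/8 of it, which is where the figure above comes from.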
  • by Eric Seppanen ( 79060 ) on Monday October 14, 2002 @01:08PM (#4446502)
    You wrote and maintain a lot of Linux network drivers. Unfortunately, these drivers stopped being included in Linus' kernel because he dislikes the backwards-compatibility code in them (throwing out the baby with the bathwater, if you ask me, but this is Slashdot and I dare not criticize the Great Leader too much). Sadly, the end-users are the ones that really suffer.

    Is this still the case and is there any hope of this deadlock ending? I know some folks have stepped up to maintain what's left of your code in the kernel; are they doing an adequate job?

  • by Loki_1929 ( 550940 ) on Monday October 14, 2002 @01:14PM (#4446548) Journal
    Donald Becker,

    With all that you've accomplished to date, how much do you think a Beowulf cluster of Donald Beckers could accomplish?

  • by guygee ( 453727 ) on Monday October 14, 2002 @01:16PM (#4446569)
    Given the decreasing power efficiency per transistor of newer generations of commodity CPUs, what suggestions do you have to reduce the total cost of ownership (including the necessary electrical power and cooling infrastructure upgrades) over the lifetime of large computing clusters?
  • by cfulmer ( 3166 ) on Monday October 14, 2002 @01:25PM (#4446639) Journal
    In general, the architecture provided by Beowulf works well on specific classes of problems -- those that can be divided among a large number of processors for simultaneous processing. Figuring out how to do the division of a large problem, however, is decidedly non-trivial. What tools do the commercial supercomputer outfits have to solve the problem that could be adapted to a Beowulf environment?
  • Free time at NASA??? (Score:5, Interesting)

    by ICA ( 237194 ) on Monday October 14, 2002 @01:35PM (#4446714)
    Here goes:

    What drives a guy working at NASA to develop a plethora of Ethernet drivers and architect a distributed computing system?

    Was this based on a need for better tools at work? Spare time?
  • Why is it? (Score:3, Insightful)

    by Jeremiah Cornelius ( 137 ) on Monday October 14, 2002 @01:39PM (#4446744) Homepage Journal
    Why is it that the Slashdot crew --and the Open Source world in general-- seem largely oblivious to your achievements as a musician, composer and arranger?

    It would be nice to have an anecdote or two about your years with Steely Dan - or even the solo projects from the '80s.

  • Export restrictions (Score:4, Interesting)

    by Call Me Black Cloud ( 616282 ) on Monday October 14, 2002 @01:47PM (#4446803)
    Currently high-performance computers (supercomputers) are subject to export restrictions. Don't want the bad guys simulating their nuclear explosions in software or decrypting our secrets of course. This is an example of technology that can do a lot of good or a lot of bad depending on who's using it.

    Though it's certainly impossible at this point, do you think similar restrictions should apply to projects like Beowulf? At what point does the potential for bad things outweigh the potential for good things?
  • (open) mosix (Score:3, Interesting)

    by mulcher ( 241014 ) on Monday October 14, 2002 @01:59PM (#4446900)
    What do you think of OpenMosix? Is it the "true" answer for real Beowulf computing?
  • by painehope ( 580569 ) on Monday October 14, 2002 @02:03PM (#4446938)
    Donald,
    As a member of the beowulf@beowulf.org mailing list, I have noticed that your posts generally seem to be of a technical, "yes/no, this is how you do it" nature (which is quite good, actually), and I've never really seen much stating your opinion on the way things are. I've got a few questions:
    1) How do you feel about high-speed interfaces, and the parallel code (i.e. various flavors of MPI) to take advantage of them? I noticed that every time benchmarks come up for Myrinet or SCI interfaces, we get a minor flamewar between said parties, and no one ever really mentions Infiniband (and gigabit Ethernet to each node is still prohibitively expensive in terms of price/performance at the switch level). This also brings up issues of free vs. proprietary interfaces and software. What do you think are the futures of these technologies, and which model do you prefer: open source or Whatever Gets The Job Done(TM)?
    2) Why did you pick Linux, as opposed to, say, one of the BSDs? At the time when you started doing Beowulfs, GNU/Linux wasn't the beloved child of the community that it is now, so what prompted the choice?
    3) Also, what do you see the next wave of clustering to be? We saw mainframes (shared memory processors), then high-powered clusters (a la SP2 + SP3, SMP on each node, but no contiguous RAM across all nodes natively), then the introduction of COTS (Commodity-Off-The-Shelf) Beowulfs, then next-generation Beowulfs (higher-end dual (sometimes quad or even now some Xeon NUMA boxen) processor, large amounts of RAM, high-speed SCSI disks, 64-bit PCI or PCI-X, etc.), which argues that the community goes with the next bright idea (which is dependent on hardware), and companies go with whatever gives them the most bang for their buck. Where do you think we're going now (as far as the major trend, since there is no one answer to the various problems that MPPs are used to address)? Low-power-consumption, low-heat large farms? I'm all ears...
    Anyways, whether these questions get answered or not, thanks for the hard work you've done and all you've given to the community.
  • Dear Mr. Becker,

    What's your favorite flavor of high speed communications card for implementation within a beowulf cluster?

    Respectfully,
  • And now, of course, it's time for the famous questionnaire invented by Bernard Pivot for Bouillon de Culture...

    What is your favorite word?

    What is your least favorite word?

    What turns you on?

    What turns you off?

    What is your favorite curse word?

    What sound or noise do you love?

    What sound or noise do you hate?

    What profession other than your own would you like to attempt?

    What profession would you least like to attempt?

    If Heaven exists, what would you like to hear God say when you arrive at the Pearly Gates?
  • by Koos ( 6812 ) <koos@kzdoos.xs4all.nl> on Monday October 14, 2002 @02:23PM (#4447084) Homepage
    That is the one thing I'd like to know. You had (at least from the looks of it) a good career at NASA where the work on the clustering and the high-performance network drivers was sort of an added bonus to help you do your research work.

    You changed to Scyld [scyld.com], where the main objective is to earn money from the application of high-performance computing. You still make all those drivers available [scyld.com] and update them (many thanks for that), but the company also has to make money; you need to pay for your meals and your home.

    What made you change, and how do you feel about that change now that it's been a few years?

  • Is there a movement in the Beowulf community to develop an effective thread system that can operate over multiple machines?
  • by CresentCityRon ( 2570 ) on Monday October 14, 2002 @02:56PM (#4447313)
    Where would a greater return be found for the development effort today? Better cluster software or better end user application tools for cluster software?
  • by CresentCityRon ( 2570 ) on Monday October 14, 2002 @03:35PM (#4447732)
    I read from Dijkstra and Knuth that they both noted how many programmers also played musical instruments - more than the standard population.

    This will not further the clustering field, but do you play any musical instruments?
  • by minaguib ( 591953 ) on Monday October 14, 2002 @03:48PM (#4447871) Homepage Journal
    Hello Donald,

    I'm a perl hacker (with a bit of C knowledge) and have made a good career out of it so far.

    However, lately I've found myself getting interested in the linux kernel and specifically, device drivers.

    My question is: where to begin? I've seen your name in several drivers in the linux kernel (in my specific case, the Intel EtherExpress Pro 10/100 card) and have spoken to you on usenet on occasion.

    What should a complete beginner like me learn to get into this area? Specifically, kernel modules in general, hardware drivers in general, researching how to deal with a specific piece of hardware...

    Thanks for any tips :)

  • Rumor has it that when you were initially working on the Beowulf project (pre-infancy, while at NASA maybe?) and released some initial code on the web, some government entities were none too happy at the prospect of having foreign countries use that code to construct powerful clusters from commodity PCs... in essence, to side-step export controls. You may also have been abducted and/or charged with "heinous" crimes while they were investigating Beowulf (black-BMW shady-government-style abduction).

    Can you lend any insight as to what these rumors may be based on? Do you have any advice for budding programmers as to how the government might react if we just release world-altering software into the open, like you did?

  • With all the interest in clusters, vector-based systems seem to have fallen behind, at least in the US. Do you think that cheap cluster systems are hurting classical vector-based supercomputers?
  • by Anonymous Coward on Monday October 14, 2002 @04:38PM (#4448388)
    In the past, various network topologies were attempted on supercomputers such as the CM5 or T3D, ranging from fully interconnected to 2d toroids to hypercubes. Most Beowulf-class systems are of the fully interconnected variety, which doesn't necessarily scale well beyond a relatively small number of processors. Do you have any thoughts on alternatives, or is this just an issue that will affect too few sites to be worth addressing at this time? Do you anticipate it becoming an issue as we move from discrete clusters to Grids?
  • In certain Linux circles, the mention of Donald Becker's drivers is met with raised eyebrows - and polite but marked silence.

    Is this just nonsense, or do you actually favour a different spot in the stability/performance/ease-of-implementation triangle than most other driver developers? If so, why?
  • First, thank you for the drivers. Everyone else seems to be saying that too, but I guess it can't be said enough.


    Secondly, what applications are out there that you think Beowulf-style clusters are especially suitable for, but that you don't see people applying them to? Personally I have a mini-cluster for POV-Ray, and I know there's lots of people using clusters for more interesting projects like weather analysis, geographical mapping, and nuclear simulation, but what do you think *isn't* taking advantage of this technology that should be? Is there anything that you feel should be advancing that isn't?


    Thirdly (and this is totally personal, having grown up in Greenbelt and been a frequent visitor to GSFC), are you dismayed that PG County never did much to take advantage of having such a resource as Goddard Space Flight Center? Aside from naming apartment complexes things like "Goddard Space Village", of course. Or maybe things like Government pay scales are to blame?

  • Do you consider yourself primarily an "Open Source" programmer, a "GNU" programmer, or a "Linux-only" programmer?

    If you consider yourself an "Open Source" programmer, how do you justify your stance on withholding your driver code from proprietary OSes, since truly "Open" source lets people use the source code for whatever they want?

  • Ethernet Drivers (Score:2, Interesting)

    by marcilr ( 247981 )
    The Linux kernel has numerous ethernet drivers that you've written. I was wondering how you are able to write and maintain drivers that are compatible with literally dozens of different ethernet cards. How do you manage change control and regression testing?

  • by cmholm ( 69081 ) <cmholm@mauihol m . o rg> on Monday October 14, 2002 @10:14PM (#4450658) Homepage Journal
    Your work in making the "piles of PCs" approach to high performance computing a reality with Beowulf has been responsible for vastly expanding the construction and use of massively parallel systems. Now, virtually any high school - never mind college - can afford to construct a system on which students can learn and apply advanced numerical methods.

    In retrospect, however, it would seem that the obvious cost benefits of Beowulf very nearly killed the development and use of large SMP and vector processing systems in the US. My understanding of the situation is this:
    * Before Beowulf, academics had a very hard time getting time on hideously expensive HPC systems.
    * When Beowulf started to prove itself, particularly with embarrassingly parallel problems using MPI, those academics who happened to sit on DARPA review panels pushed hard to choke off funding for other HPC architectures, promising that they could make distributed memory parallel systems all singing, all dancing, and cheap(er).
    * They couldn't really deliver, but in the meantime, Federal dollars for large shared memory and vector processing systems vanished, and the product lines and/or vendors with them... at least in the US.
    * Eight years later, only Fujitsu and NEC make truly advanced vector systems [top500.org], and Cray is only now crawling back out of the muck to deliver a new product. Evidently someone near the Beltway needs a better vector machine, and Congress ain't paying for anything made across the pond.

    Cutting to the chase, did you advance a "political" stand among your peers within the public-funded HPC community, or were you just trying to get some work done with the budget available at NASA?

  • According to this page [scyld.com], the copyright for the rtl8139too driver, a substantial portion of which is your code, was claimed by Jeff Garzik illegally and fraudulently. Is this true? If so, have you done anything in an attempt to get the situation resolved? Do you think that other free software authors should be paranoid about protecting their copyrights?
