Ask Donald Becker
This is a "needs no introduction" introduction, because Donald Becker is one of the people who have been most influential in making GNU/Linux a usable operating system, and is also one of the "fathers" of Beowulf and commodity supercomputing clusters in general. Usual Slashdot interview rules apply, plus a special one for this interview only: "What if we made a Beowulf cluster of these?" is not an appropriate question.
One question... (Score:5, Interesting)
(And this is a serious one!)
Why did you choose Linux, instead of *BSD, to create a Beowulf?
This is a serious question, not a flame: why choose Linux over, say, FreeBSD? Is it just because your employer already used Linux? Because you had used Linux before and had more experience working with it? Because you had tested both, and found Linux better than BSD? Or because Linux had tools the *BSDs did not have?
Just a question...
Re:One question... (Score:2, Interesting)
What I do know is that Red Hat is often preferred for making clusters because of rpms. Sure you can say "rpms suck blah blah blah" but they are very easy to install. If you know that an rpm will work, it is a cinch to upgrade a package on all computers in the cluster at once. This way you can be sure that they are running the same software.
I guess the FreeBSD ports system would be just as easy to use... Am I right?
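For illustration, the cluster-wide upgrade described above could be scripted roughly as follows. This is only a sketch under stated assumptions: the node names and the package file are made up, it assumes ssh access to every node and a package path visible on each of them, and it builds the commands without running anything.

```python
import shlex

def upgrade_commands(nodes, package_path):
    """Build one `ssh <node> rpm -Uvh <package>` command per cluster node."""
    remote = "rpm -Uvh " + shlex.quote(package_path)
    return [["ssh", node, remote] for node in nodes]

# Hypothetical node names and package file, for illustration only.
nodes = ["node%02d" % i for i in range(1, 5)]
commands = upgrade_commands(nodes, "/shared/pkgs/example-1.0-1.i386.rpm")

for cmd in commands:
    # Print instead of executing; pass each list to subprocess.run() to run it.
    print(" ".join(cmd))
```

Installing the same package file on every node is what keeps the nodes on identical software versions.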
Re:One question... (Score:2)
Re:One question... (Score:4, Informative)
If I recall, the definition of a Beowulf cluster does not require Linux specifically, only a free operating system.
Look it up [canonical.org]
Re:One question... (Score:3, Informative)
Re:One question... (Score:3, Informative)
Re:One question... (Score:2, Informative)
Note: Only logged because AC is giving me formkey errors.
This isn't a very well-informed question. Beowulf does not specify a particular platform.
From the Beowulf FAQ [canonical.org]:
[Beowulf is] a kind of high-performance massively parallel computer built primarily out of commodity hardware components, running a free-software operating system like Linux or FreeBSD, interconnected by a private high-speed network.
Please mod accordingly. Let's not waste Becker's time or one of the ten questions on ill-informed pablum refuted in the first question of an FAQ.
Re:One question... (Score:2)
What one thing would you like to see added... (Score:5, Interesting)
Your dream machines (Score:3, Interesting)
What are the five dream machines that you want to have on, under, near, beside or in you over the next 10 years? And what do you foresee actually happening?
And can we make beowulf clusters out of them?
How to bring this up with your boss?? (Score:4, Interesting)
So, my question would be, what's the best way for an engineer at a large company to address this issue with the people they report to.
Re:How to bring this up with your boss?? (Score:2, Funny)
Seriously. If your bosses were putting that much effort and cash into the adoption of a common device that the rest of the world already uses, rational arguments just won't work.
Nerf guns.
Shouldn't you say "were"? (Score:2)
Re:How to bring this up with your boss?? (Score:3, Funny)
Thanks for pointing that out. To clear things up, some are shitheads, waterheads, airheads and buttheads. Let's remember that there's a diversity of stupidity and a uniformity of idiocy in management.
hardware insights? (Score:5, Interesting)
drivers do you have any opinions or suggestions for hardware makers? Aside from good documentation, what makes a given hardware device easy to work with and what makes a device hard to work with?
Thanks for all the Ethernet drivers, Don! (Score:5, Insightful)
Re:Thanks for all the Ethernet drivers, Don! (Score:2)
You ARE the man. May your business prosper.
No questions.
MPI OS X and the future (Score:4, Interesting)
Re:MPI OS X and the future (Score:2)
If you could make.. (Score:2, Funny)
10 computer, Teraflops GPU based Beowulf systems. (Score:5, Interesting)
This is not just academic. GPUs are real vector processors, some of them capable of 200+ GFlops, using up to 128-bit floating-point precision.
That's about 100 times faster than Intel-based CPUs.
Extending math libs and adapting MPI to use the cluster's GPUs as vector-oriented math co-processors could potentially lead to 10-computer, TeraFlops-level Beowulf clusters.
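As a quick sanity check on the numbers in this post, the aggregate figure can be worked out directly. This is a back-of-the-envelope sketch: the 200 GFlops and 10 machines come from the post, while one GPU per node is an assumption, and the result is a theoretical peak that ignores communication and host overhead.

```python
GFLOPS_PER_GPU = 200   # figure quoted in the post
NODES = 10             # the "10 computer" cluster from the post
GPUS_PER_NODE = 1      # assumption, not stated in the post

peak_gflops = GFLOPS_PER_GPU * NODES * GPUS_PER_NODE
peak_tflops = peak_gflops / 1000
print(f"theoretical peak: {peak_tflops} TFlops")
```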
Enterprise Computing (Score:5, Interesting)
Re:Enterprise Computing (Score:2)
Re:Enterprise Computing (Score:2)
What's the future of distributed computing? (Score:5, Interesting)
What tools exist that will be used to create this future? What tools still need to be invented?
Dear Don, does it suck not to be rich? (Score:5, Interesting)
Re:Dear Don, does it suck not to be rich? (Score:2, Funny)
-Don
Please, mod parent down (was Re:Dear Don, does it) (Score:2)
He said he was quite happy about it. He contributed just a bit to the Linux kernel, but he got the rest of it for free. He accepted that as rather good payment.
So unless he has changed that view, I'm not really interested in an answer to your question. And I also wonder if he'd care for a Paypalled website, although you never know.
The Future.. (Score:5, Interesting)
(a) Beowulf technology
(b) Different uses for Beowulf
OSS methodologies (Score:3, Interesting)
Warmest regards,
--Jack
how does it feel (Score:2, Interesting)
Re:how does it feel (Score:2)
Well.. as long as they got Steve Ballmer to work there.. Scientists, scientists, scientists!!
Processor/Architecture (Score:5, Interesting)
Why (Score:4, Interesting)
Re:Why (Score:2, Interesting)
Two questions (Score:5, Interesting)
Secondly, do you read Slashdot, and if so, what do you think about all the troll jokes about Beowulfs? Was it at least funny in the beginning to hear about people "imagining" clusters of just about anything?
Ok, so it was more than two questions. Sue me.
OS X (Score:4, Interesting)
It seems to have all of the polish and usability Linux/BSD people dream about, while still maintaining a fully open source BSD core (Darwin). Have you ever been tempted away from Linux like so many others?
Message Passing vs. Single System Image (Score:5, Interesting)
Re:Message Passing vs. Single System Image (Score:3, Informative)
As for the 32 bit address limit, it's already a problem. For large scientific code, 4GB per processor is already not enough. Now, people live with it, but that doesn't mean they like it. Intel's 36-bit addressing hack doesn't help, either, since you still have a single-virtual-address space limitation of 32 bits. This is probably the biggest motivation to go to a 64 bit architecture. Note that this problem also applies to large databases.
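The limits mentioned above follow from simple powers of two. The sketch below only does the arithmetic; in practice a process gets less than the full 32-bit range because the kernel reserves part of it, and real 64-bit processors implement fewer than 64 address bits.

```python
GIB = 2**30

virtual_32 = 2**32   # bytes addressable by one process with 32-bit pointers
physical_36 = 2**36  # bytes of physical memory reachable with 36-bit addressing
virtual_64 = 2**64   # bytes addressable with full 64-bit pointers

print(virtual_32 // GIB)   # 4
print(physical_36 // GIB)  # 64
print(virtual_64 // GIB)   # 17179869184
```

So the 36-bit extension raises the amount of RAM a machine can hold to 64 GiB, but any single process still sees at most a 4 GiB virtual address space.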
Re:Message Passing vs. Single System Image (Score:4, Insightful)
The traditional approach is to use fine grained locking in the kernel, but this tends to lead to unmaintainable code and low performance on lower end systems. For an example of this see Solaris, or most other big iron unix kernels.
Another approach is the OS cluster idea championed by Larry McVoy (the BitKeeper guy). The idea is that you run many kernels on the same computer, with each kernel taking care of something like 4-8 CPUs. And then they cooperate somehow so they can give the impression of SSI.
A third approach seems to be the K42 exokernel project by IBM. They claim very good scalability without complicated lock hierarchies. The basic design idea seems to be to avoid global data whenever possible. Perhaps someone more knowledgeable might shed more light on this...
But anyway, until someone comes up with a kernel that scales to zillions of CPUs, message passing is about the only way to go. Libraries that give you the illusion of using threads but are actually using message passing underneath might ease the pain somewhat, but for some reason they have not become popular. Perhaps there is too much overhead. And some people claim that giving the programmer the illusion that all memory access is equal speed leads to slow code. The same argument also applies to NUMA systems.
And on the system administration side of things, projects like mosix and bproc already today give you the impression of a single system image. Of course your application still has to use message passing, but administration and maintenance of a cluster is greatly simplified.
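To make the message-passing style discussed above concrete, here is a minimal sketch. It is an illustration only: it uses threads and queues inside one Python process so that it is self-contained, whereas a real cluster would pass messages between machines with something like MPI. The function names are made up for this example. The point is that each worker only touches data it received as a message and never reads shared state.

```python
import queue
import threading

def worker(rank, inbox, outbox):
    """Receive one chunk of work as a message, reply with a partial sum."""
    chunk = inbox.get()              # blocking receive
    outbox.put((rank, sum(chunk)))   # send the result back to the coordinator

def parallel_sum(data, n_workers=4):
    outbox = queue.Queue()
    inboxes = [queue.Queue() for _ in range(n_workers)]
    threads = [
        threading.Thread(target=worker, args=(rank, inboxes[rank], outbox))
        for rank in range(n_workers)
    ]
    for t in threads:
        t.start()
    # Scatter: each worker gets its own private slice of the data.
    for rank, inbox in enumerate(inboxes):
        inbox.put(data[rank::n_workers])
    # Gather: collect exactly one reply per worker.
    total = sum(outbox.get()[1] for _ in range(n_workers))
    for t in threads:
        t.join()
    return total

result = parallel_sum(list(range(1, 101)))
print(result)
```

The explicit scatter and gather steps are the cost the comment refers to: the programmer decides where data lives, instead of pretending all memory is equally close.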
Grid Computing? (Score:2, Interesting)
Only one question (Score:2, Funny)
Donald Becker is a world-class guy (Score:4, Interesting)
Anyway, Donald - thanks for helping me out when I was a stupid newbie, you are truly a world-class fellow.
Re:Donald Becker is a world-class guy (Score:2)
And his drivers have much better diagnostic capabilities than anything else out there.
Making Linux usable? (Score:2)
I can't seem to find any information on his work regarding `making Linux usable' that's mentioned in the byline of this story. Am I missing something, or is that part of the intro a little bit confused (maybe Slashdot has a different definition of usability from the rest of the computing world)?
Linux kernel 2.6/3.0 ? (Score:5, Interesting)
Memory-Oriented Logic (Score:5, Interesting)
As I'm sure you've noticed, the price of memory has been driven into the ground -- indeed, it's so inexpensive, the economics seem to have rendered the usage of virtual memory nearly obsolete. Need another 256MB? Spend the $20 and buy it. It's just that simple.
Now, memory makers can't let their goods be absolutely commodified forever, and I'm unconvinced that further speed increases, either in latency or bandwidth, will remain permanently relevant. So I'm curious about your opinion of embedding highly localized simple logical operators amongst the core memory circuitry itself. I've heard a slight amount about work in this direction, and it seems fascinating -- instead of requesting the raw contents of a block of memory, request the contents run through a highly local but massively parallelizable operation -- bit/byte/word interleaved XOR/ADD/MUL, for example. Obviously semiconductors can do more than store and forward; do you believe we a) will and b) should see memory implement trivial operations directly? What about non-Turing-complete instruction sets?
Yours Truly,
Dan Kaminsky
DoxPara Research
http://www.doxpara.com
P.S. Please forgive me if this entire post reads like "What about a beowulf cluster of DIMMs?"
P.P.S. Be honest: Do you ever find it ironic that the Internet Gold Standard for Ethernet cards ended up being called Tulip?
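The idea in the post above, asking memory for the result of a local operation instead of the raw block, can be modelled in a few lines. This is a toy software model and not a description of any real hardware: the class name and the `read_xor` operation are hypothetical and exist only to show the difference in how much data crosses the bus.

```python
from functools import reduce
from operator import xor

class LogicMemoryBank:
    """Toy model of a memory bank that can reduce a block locally."""

    def __init__(self, words):
        self.words = list(words)

    def read(self, start, length):
        # Conventional access: ship every word back to the CPU.
        return self.words[start:start + length]

    def read_xor(self, start, length):
        # Hypothetical in-memory operation: return only the XOR of the block.
        return reduce(xor, self.words[start:start + length], 0)

bank = LogicMemoryBank([0b1010, 0b0110, 0b0011, 0b1111])
raw = bank.read(0, 4)          # four words cross the bus
folded = bank.read_xor(0, 4)   # one word crosses the bus
print(len(raw), folded)
```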
Just curious... (Score:2)
Thanks for thinking of the children!
Post screwup... (Score:2)
Have you actually read Beowulf?
Considering you're on the board at scyld.. (Score:5, Funny)
And why don't you people like vowels?
(Thanks for the ne2000 driver!)
Re:Considering you're on the board at scyld.. (Score:2, Informative)
probably way too many but what the hey... (Score:5, Interesting)
A couple more specific questions...
1) What approach do you take in creating drivers for cards which have inaccurate or insufficient documentation?
2) What tools do you use for debugging and/or "discovering" the workings of old/obscure/poorly documented hardware?
3) What skillset, i.e. languages, knowledge & tools, do you consider necessary to perform the kind of coding you routinely do (outside of hacker wizardry and C mastery)?
I am also wondering how you got started writing ethernet drivers and clustering software for linux. What led you down this specific path rather than other aspects of kernel/OS development?
JM
Java (Score:5, Interesting)
Which Network gear manufacturer? (Score:5, Interesting)
What (FastEthernet/100mb) Network gear manufacturer do you prefer and recommend to others?
Whether it's servers or home use, it's an important question, as some are as buggy as all get out, and others are to die for.
And if it's a different answer, which manufacturer do YOU use?
Re:Which Network gear manufacturer? (Score:3, Interesting)
Re:Which Network gear manufacturer? (Score:3, Informative)
Right now, I'm using a 3c905b card (though it isn't a Becker project) with great success.
I think Linus likes eepro cards, IIRC from lkml.
Re:Which Network gear manufacturer? (Score:5, Interesting)
Here is my quick summary of what he told me:
Some network cards are really pathetic and/or broken. As long as you don't buy one of those, it doesn't really matter very much which one you buy.
The 3Com 3c905 cards are a little bit better than other cards.
I found this web page:
http://www.fefe.de/linuxeth/ [www.fefe.de]
Based on that web page and Mr. Becker's comments, I bought myself some 3Com 3c905c network cards, and I have been very happy with them.
P.S. I used to buy my net cards by brand name. Bad idea! You must look beyond the brand name and see what chipset the net card uses. I bought a Linksys LNE100TX card and liked it, so I kept buying that card. But Linksys started making different versions of the card, using completely different chipsets, so the last time I bought that card it turned out to be really broken under Linux. Older LNE100TX cards work well with the "Tulip" driver under Linux, but newer ones are really broken.
steveha
Re:Which Network gear manufacturer? (Score:2)
dlink also does it...
Re:Linksys / Tulip-compatibility (Score:2)
All I know is that I had problems with a LNE100TX under Linux. I was getting throughput of 2 Mbps, not 100 Mbps. I was also getting lots of errors.
One of the web pages I checked identified the chipset in that Linksys card as a "really broken chipset". I have used that card under Windows, so perhaps the drivers are partly to blame... but I want speedy and reliable networking, under Linux, so I don't want Linksys cards.
steveha
The answer is clear, of course... (Score:2)
OpenMosix (Score:5, Interesting)
Distributed shared memory is a big hurdle facing the OpenMosix project over the next couple years. Right now any program that allocates shared memory cannot migrate. What do you think of projects like OpenMosix? Do you think we will reach a point where parallel programming is a thing of the past, discarded in favor of tools like OpenMosix that require no special programming considerations except implementing clean threading?
Broadcast w/ Verification for SHM updates (Score:2)
Of course, the obvious approach of only migrating processes and not the shared memory it allocated (instead using SHM-over-TCP-maybe-with-SEQ#'s-directly-mapping-t
also should work.
--Dan
www.doxpara.com
What have you done for me lately? :-) (Score:5, Interesting)
NASA, Government, Linux, Open Source (Score:5, Interesting)
Would you care to comment on your experience in NASA working on an Open Source project? (I understand you've left NASA for Scyld, maybe that partially answers my questions, but I still want to know...)
It seems as if your work on Beowulf clusters had a nice spin-off in terms of providing not only low cost supercomputing for academic, government and industrial users, but also in terms of Ethernet support for all sorts of Linux users.
What made you believe ...? (Score:3, Interesting)
Let's see if I can burn what little karma I have.. (Score:2, Funny)
Grid Computing and Linux Beowulf (Score:2, Interesting)
limits of clusters (Score:5, Interesting)
- Also -
What tools are seriously lacking in linux clusters? Are open source (or low cost) cluster filesystems necessary to expand the use of beowulf clusters? - Are better libraries needed? Where is research needed?
What comes next? (Score:5, Interesting)
If Ethernet consumes too many resources, and Infiniband is stillborn, what's the next communications medium for networking and clustering?
Re:What comes next? (Score:2)
Re:What comes next? (Score:2)
Even then, those standards you mention are all about clustering, not general purpose networking. Ethernet has really caught on in the 30 or so years it has been around because it has been very adaptable- maybe that adaptability is exactly what is preventing it from progressing further.
Re:What comes next? (Score:3, Insightful)
Isn't that a limitation of the computer, not a limitation of gigabit Ethernet?
Re:What comes next? (Score:2)
Network driver fiasco (Score:5, Interesting)
Is this still the case and is there any hope of this deadlock ending? I know some folks have stepped up to maintain what's left of your code in the kernel; are they doing an adequate job?
Time to burn some karma... (Score:5, Funny)
With all that you've accomplished to date, how much do you think a Beowulf cluster of Donald Beckers could accomplish?
Total Cost of Ownership (Score:5, Interesting)
(Massively) parallel supercomputing (Score:3, Interesting)
Free time at NASA??? (Score:5, Interesting)
What drives a guy working at NASA to develop a plethora of Ethernet drivers and architect a distributed computing system?
Was this based on a need for better tools at work? Spare time?
Why is it? (Score:3, Insightful)
It would be nice to have an anecdote or two about your years with Steely Dan - or even the solo projects from the '80's.
Export restrictions (Score:4, Interesting)
Though it's certainly impossible at this point, do you think similar restrictions should apply to projects like Beowulf? At what point does the potential for bad things outweigh the potential for good things?
(open) mosix (Score:3, Interesting)
answer for real beowulf computing?
changes in the beowulf environment + community (Score:5, Interesting)
As a member of the beowulf@beowulf.org list, I have noticed that your posts generally seem to be of a technical, "yes/no, this is how you do it", etc. nature ( which is quite good actually ), and I've never really seen much stating your opinion on the way things are. I've got a few questions :
1) how do you feel about high-speed interfaces, and the parallel code ( i.e. various flavors of MPI ) to take advantage of them? I noticed that every time benchmarks come up for Myrinet or SCI interfaces, we get a minor flamewar between said parties, and no one ever really mentions Infiniband ( and Gigabit Ethernet to ea. node is still prohibitively expensive in terms of price/performance at the switch level ). This also brings up issues of free vs. proprietary interfaces and software. What do you think are the futures of these technologies, and which model do you prefer : open source or Whatever Gets The Job Done(TM)?
2) why did you pick Linux, as opposed to, say, one of the BSDs? At the time when you started doing Beowulfs, GNU/Linux wasn't the beloved child of the community that it is now, so what prompted the choice?
3) also, what do you see the next wave of clustering to be? We saw mainframes ( Shared Memory Processors ), then high-powered clusters ( ala SP2 + SP3, SMP on ea. node, but no contiguous RAM across all nodes natively ), then the introduction of COTS ( Commodity-Off-The-Shelf ) Beowulfs, then next-generation Beowulfs ( higher-end dual ( sometimes quad or even now some Xeon NUMA boxen ) processor, large amounts of RAM, high-speed SCSI disks, 64 bit PCI or PCI-X, etc. ), which argues that the community goes w/ the next bright idea ( which is dependent on hardware ), and companies go w/ whatever gives them the most bang for their buck. Where do you think we're going now ( as far as the major trend, since there is no 1 answer to the various problems that MPPs are used to address )? Low power consumption, low-heat large farms? I'm all ears...
Anyways, whether these questions get answered or not, thanks for the hard work you've done and all you've given to the community.
What's your favorite flavor? (Score:2, Interesting)
What's your favorite flavor of high speed communications card for implementation within a beowulf cluster?
Respectfully,
Inside the Coder's Studio (Score:2)
What is your favorite word?
What is your least favorite word?
What turns you on?
What turns you off?
What is your favorite curse word?
What sound or noise do you love?
What sound or noise do you hate?
What profession other than your own would you like to attempt?
What profession would you least like to attempt?
If Heaven exists, what would you like to hear God say when you arrive at the Pearly Gates?
What prompted you to leave NASA ? (Score:5, Interesting)
You changed to scyld [scyld.com] where the main objective is to earn money from the application of high-performance computing. You still make all those drivers available [scyld.com] and update them (many thanks for that) but the company also has to make money, you need to pay for your meals and your home.
What made you change, and how do you feel about that change now that it's been a few years?
Shared Memory Constraints in Beowulf Cluster (Score:2, Interesting)
Kernel or Applications? (Score:3, Interesting)
Do you play any musical instruments? (Score:4, Interesting)
This will not further the clustering field but do you play any musical instruments?
Device drivers - where to begin ? (Score:5, Interesting)
I'm a perl hacker (with a bit of C knowledge) and have made a good career out of it so far.
However, lately I've found myself getting interested in the linux kernel and specifically, device drivers.
My question is.. Where to begin ? I've seen your name in several drivers in the linux kernel (specifically to my case, the Intel EtherExpress Pro 10/100 card) and have spoken to you on usenet on occasion.
What should a complete beginner like me learn to get into this area ? Specifically, kernel modules in general, hardware drivers in general, researching how to deal with a specific piece of hardware...
Thanks for any tips
Did the men in black suits come for you? (Score:2)
Rumor has it that when you were initially working on the Beowulf project (pre-infancy, while at NASA maybe?) and released some initial code on the web, some government entities were none too happy at the prospect of having foreign countries use that code to construct powerful clusters from commodity PCs.. in essence, to side-step export controls. You may also have been abducted and/or charged with "heinous" crimes while they were investigating Beowulf (black-bmw shady government style abduction).
Can you lend any insight as to what these rumors may be based on? Do you have any advice for budding programmers as to how the government might react if we just release world-altering software into the open, like you did?
Have clusters hurt Super Computer Development? (Score:2, Interesting)
Other architectures for a Beowulf (Score:3, Interesting)
Tradeoffs (Score:2)
Is this just nonsense, or do you actually favour a different spot in the stability/performance/ease-of-implementation triangle than most other driver developers ? If so, why ?
Clustering Needs and PG county (Score:2, Interesting)
Secondly, what applications are there out there that you think that beowulf-style clusters are especially suitable for that you don't see people applying them to? Personally I have a mini-cluster for POV-Ray, and I know there's lots of people using clusters for more interesting projects like weather analysis, geographical mapping, and nuclear simulation, but what do you think *isn't* taking advantage of this technology that should be? Is there anything that you feel should be advancing that isn't?
Thirdly (and this is totally personal, having grown up in Greenbelt and been a frequent visitor to GSFC), are you dismayed that PG county never did much to take advantage of having such a resource as Goddard Space Flight Center? Aside from naming apartment complexes things like "Goddard Space Village", of course. Or maybe things like Government pay scales are to blame?
Open Source vs GNU vs Linux (Score:2)
If you consider yourself an "Open Source" programmer, how do you justify your stance on withholding your driver code from proprietary OSes, since truly "Open" source lets people use the source code for whatever they want?
Ethernet Drivers (Score:2, Interesting)
Did You Intend To "Kill Cray"? (Score:3, Interesting)
In retrospect, however, it would seem that the obvious cost benefits of Beowulf very nearly killed the development and use of large SMP and vector processing systems in the US. My understanding of the situation is this:
* Before Beowulf, academics had a very hard time getting time on hideously expensive HPC systems.
* When Beowulf started to prove itself, particularly with embarrassingly parallel problems using MPI, those academics who happened to sit on DARPA review panels pushed hard to choke off funding for other HPC architectures, promising that they could make distributed memory parallel systems all singing, all dancing, and cheap(er).
* They couldn't really deliver, but in the meantime, Federal dollars for large shared memory and vector processing systems vanished, and the product lines and/or vendors with it.... at least in the US.
* Eight years later, only Fujitsu and NEC make truly advanced vector systems [top500.org], and Cray is only now crawling back out of the muck to deliver a new product. Evidently someone near the Beltway needs a better vector machine, and Congress ain't paying for anything made across the pond.
Cutting to the chase, did you advance a "political" stand among your peers within the public-funded HPC community, or were you just trying to get some work done with the budget available at NASA?
Did someone infringe your copyright? (Score:2)
Re:Role of GNU in GNU/Linux (Score:3, Insightful)
Even Linus doesn't feel strongly one way or the other. The only person who seems to be working up a lather is RMS. It's sad.
Re:Role of GNU in GNU/Linux (Score:2, Insightful)
Bell/AT&T/GNU/Linux forever!