HA-OSCAR 1.0 Beta release - unleashing HA Beowulf

ImmO writes "The eXtreme Computing Research (XCR) group at Louisiana Tech University is pleased to announce the first public release of HA-OSCAR 1.0 beta. High Availability Open Source Cluster Application Resource (HA-OSCAR) is an open source project that aims to provide non-stop services in the HPC environment through the combined power of High Availability and High Performance Computing solutions. Our goal is to enhance a Beowulf cluster system for mission-critical applications and downtime-sensitive HPC infrastructures. To achieve high availability, component redundancy is adopted in an HA-OSCAR cluster to eliminate single points of failure, especially at the head node. HA-OSCAR also incorporates a self-healing mechanism: failure detection and recovery, automatic failover, and fail-back. The 1.0 beta release supports new high-availability capabilities for Linux Beowulf clusters based on OSCAR 3.0. It provides an installation wizard GUI and a web-based administration tool that allow a user to create and configure a multi-head Beowulf cluster. A default set of monitoring services is included to ensure that critical services, hardware components and important resources are always available at the control node."
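To make the failover idea in the announcement concrete, here is a minimal, purely illustrative sketch of the kind of monitor-and-failover loop a redundant head-node setup runs. It is not HA-OSCAR's actual code; the host name, the ping-based health check, and the promote/demote hooks are all assumptions for illustration.

    # Illustrative head-node monitor (not HA-OSCAR's implementation): ping the
    # primary head node, promote the standby after repeated failures, and fail
    # back when the primary returns. Host name and hooks are hypothetical.
    import subprocess
    import time

    PRIMARY = "head-primary"        # hypothetical primary head node
    CHECK_INTERVAL = 5              # seconds between health checks
    FAIL_THRESHOLD = 3              # consecutive failures before failover

    def alive(host):
        """One ICMP ping; returns True if the host answered."""
        return subprocess.call(
            ["ping", "-c", "1", "-W", "2", host],
            stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL,
        ) == 0

    def promote_standby():
        print("primary down: promoting standby head node (failover)")

    def demote_standby():
        print("primary back: returning service to primary head node (fail-back)")

    def monitor():
        failures, failed_over = 0, False
        while True:
            if alive(PRIMARY):
                failures = 0
                if failed_over:
                    demote_standby()
                    failed_over = False
            else:
                failures += 1
                if failures >= FAIL_THRESHOLD and not failed_over:
                    promote_standby()
                    failed_over = True
            time.sleep(CHECK_INTERVAL)

    if __name__ == "__main__":
        monitor()

A real multi-head setup would also replicate state (job queues, user accounts, node images) to the standby so that the promotion is transparent to running jobs.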
This discussion has been archived. No new comments can be posted.

  • ...written by Tong Liu (the lead developer) in last month's LinuxWorld [linuxworld.com].

    You have to be a subscriber to view the HTML, but it seems that you can download the PDF version for free...
  • Linuxworld (Score:5, Informative)

    by ViceClown ( 39698 ) * on Tuesday March 23, 2004 @10:16AM (#8644873) Homepage Journal
    Also worth noting: LinuxWorld magazine has a pretty good article [linuxworld.com] on HA-OSCAR this month!
  • CPU RAID (Score:3, Interesting)

    by manganese4 ( 726568 ) on Tuesday March 23, 2004 @10:24AM (#8644977)
    So on a multi-CPU server, if you started the same process simultaneously on multiple CPUs, how close in time would they finish, assuming there is sufficient memory and disk-controller bandwidth to prevent severe contention?
    • Re:CPU RAID (Score:3, Informative)

      by straponego ( 521991 )
      The only simple, honest answer to this is: it depends. If your jobs stay completely inside the CPU cache, and nothing else is happening in the system, and the scheduler is smart enough not to swap the tasks between CPUs without good reason, you should see very nearly 100% scalability. The larger the cache, the more likely this is, so at this point smaller jobs favor Xeon CPUs over Athlon/Opterons. Most jobs do need to access memory and disk, though. In these cases, the Opteron architecture does well,
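A crude way to check the scalability point above on a particular machine is to time the same CPU-bound job running alone and then with one copy per CPU; near-identical wall-clock times mean near-100% scalability. A minimal sketch (the workload size and worker counts are arbitrary placeholders):

    # Hypothetical scaling check: run the same CPU-bound job on 1 and then N
    # worker processes and compare wall-clock times. Similar times indicate the
    # jobs are not contending for memory or disk, as the comment describes.
    import time
    from multiprocessing import Pool

    def burn(n=2_000_000):
        # Small, cache-friendly integer loop standing in for a real job.
        total = 0
        for i in range(n):
            total += i * i
        return total

    def timed_run(workers):
        with Pool(workers) as pool:
            start = time.time()
            pool.map(burn, [2_000_000] * workers)  # one identical job per worker
            return time.time() - start

    if __name__ == "__main__":
        t1 = timed_run(1)
        t4 = timed_run(4)
        print(f"1 worker: {t1:.2f}s   4 workers: {t4:.2f}s   ratio: {t4 / t1:.2f}")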
  • The ratio of 'imagine a...'-jokes to 'now there will be a lot of 'imagine a...''-jokes
  • by PonyHome ( 625218 ) on Tuesday March 23, 2004 @10:30AM (#8645037)
    Darl McBride files suit against Louisiana Tech, saying "This is one more example of how SCO innovation has been misappropriated."
  • by DanoTime ( 677061 ) on Tuesday March 23, 2004 @10:31AM (#8645050)
    Boy, I could make my manager's head spin just by reading the summary of that article!
    • Haha. Yeah, I was kinda thinking there was a bit of buzzword overload there...

      That said, I think they missed the bit about it using "XML compliant Strategic Webservice Failover Product placements + Redundant steak knives!!"

      Ain't it scary, though, that when you read articles like that, despite having years of deep-fried IT knowledge, you'd probably have to pass them to marketing to decode?
  • More about beowulf? (Score:5, Informative)

    by Krik Johnson ( 764568 ) on Tuesday March 23, 2004 @10:39AM (#8645127) Homepage
    If you have seen all the jokes but still don't know what a Beowulf cluster is, then this site [beowulf.org] is for you. It has all you need to know.
  • by Electrawn ( 321224 ) <[electrawn] [at] [yahoo.com]> on Tuesday March 23, 2004 @10:42AM (#8645161) Homepage
    High amount of corporate buzzwords detected: self-healing, mission-critical, GUI, beowulf...

    Oh, this project actually does those things? Quaint!

    Just running the vaporware bullshit o-meter here...
  • I can hear the terrorist governments of the world licking their chops for this one! I'm just joking. Or am I?
  • I thought this was about Beowulf clusters in NASCAR. :o
  • by brechin ( 309008 ) on Tuesday March 23, 2004 @10:51AM (#8645242)
    I've been writing some articles about OSCAR and some of the related projects being developed at NCSA and other places. You can find the latest version of this newsletter at the Linux Developer Newsletter [uiuc.edu] site.
  • by brechin ( 309008 ) on Tuesday March 23, 2004 @10:56AM (#8645306)
    The link in the story to OSCAR 3.0 should be to http://oscar.sourceforge.net [sourceforge.net]. The other site is just the parent organization's info page.
  • So how come none of the linked sites have been slashdotted?

    Is it because they have un-killable servers, or is this just not a hot enough topic here?

  • "Just imagine one of these!" doesn't have the same ring ...
  • It isn't surprising that Beowulf clusters would want to incorporate mechanisms to deal with node failure, but I am curious whether those who have worked on actual clusters could expand on the most common causes of failure. I was surprised to read in a previous Slashdot post (sorry, no URL) that even clusters of mini-ITX boards without hard drives (the most failure-prone component, I would have thought) have frequent failures.
    • I've heard (no sources, google it) that Grendel is hard for Beowulf clusters to deal with...

      Maybe? I dunno. :)
    • but I am curious if those who have worked on actual clusters could expand on the most common causes of failure...

      As a research assistant who helps maintain a cluster, I can say the most frequent problems in our Commercial Off The Shelf (COTS) clusters are power supplies. We have at least one die each week. Hard drives are a close second.

    • The sources are essentially no different from those for your desktop, but if you do the math you'll see that failure is much more common when you have a bunch of them.

      What's the probability that your desktop will crash if you run it fully loaded for a week? Pick a number, say 1%. So it has a 99% chance of completing the job.

      Now suppose you have a job that runs in parallel on 100 such nodes flat out for a week. The probability that the job finishes successfully is (0.99)^100, or about 36%.

      So the job is about two
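For reference, the arithmetic in the comment above works out like this (the 1% per-node weekly failure rate is the comment's own figure, not measured data):

    # Probability that a week-long, 100-node parallel job completes, assuming each
    # node independently survives the week with probability 0.99.
    per_node_success = 0.99
    nodes = 100
    job_success = per_node_success ** nodes
    print(f"P(job completes) = {job_success:.3f}")  # ~0.366, i.e. about 36%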
  • Dr Box (Score:2, Interesting)

    by ChaserPnk ( 183094 )
    I actually go to Louisiana Tech. Chokchai Leangsuksun (Dr. Box), the director of the HA-OSCAR program, also teaches my Operating Systems class. He came into class today looking tired... he said he'd been working very hard on it.

    I think it's about time LaTech got some recognition.
    • Yes, it is time that LaTech got a little recognition. Dr. Box also deserves a great deal of credit; he's a very talented and gifted man. Box brought a small cluster of his IBM tablets into class the other day and we actually saw a few of HA-OSCAR's capabilities. Hopefully HA-OSCAR will pan out as expected.
  • High availability and beta don't seem to go together to me. I don't think an OS should be classified as such until it is STABLE.



    <mumble> I doubt anyone will read this, drowning as it is in stupid Beowulf jokes</mumble>

    This story is burning up enough mod points to give us all karma nirvana.

    Please, stop wasting points modding off-topic.
    • Good point. I was wondering why a beta was released at 1.0, which implies, to me, a production release. If it were up to me, I'd release a beta at 0.9 or something.

      If it's stable then they should probably drop the beta suffix.
  • when there was no mention of the satellites or amateur radio here.
  • I've got about 55 Compaqs that are bored to death and looking for something to do...

    Now, if the circuit breakers will only hold up long eno
  • I hear a certain terrorist group's Open Source Application Management Administrator (OSAMA) is already working hard to find some loopholes in the code.
  • Shameless Plug:

    There are now a magazine [clusterworld.com] and a news website [clusterworld.com] dedicated to HPC/Beowulf cluster computing. You may recognize the webpage format.

    We are still running our free three month trial issue offer as well.

  • Mosix does this, but does this project? Or do you have to recompile and optimize for clustering?

    In a 'regular' environment, auto-propagation would be more useful.
  • by jelle ( 14827 ) on Tuesday March 23, 2004 @02:53PM (#8648239) Homepage
    How does this compare to OpenSSI [openssi.org]? OpenSSI is nice because of its single system image approach, which makes administration very simple. AFAIK, an OpenSSI cluster also supports PVM and MPI in addition to exec and run-time load balancing (à la Mosix [openmosix.org]).

    OpenSSI has a lot of "HA-" support, including support for various clustered filesystems, failover of network interfaces across nodes, and failover of the first node (hopefully soon without needing shared SCSI storage but using something like drbd [drbd.org]).

"You'll pay to know what you really think." -- J.R. "Bob" Dobbs

Working...