
Debian Cluster Replaces Supercomputer For Weather Forecasting

wazza brings us a story about the Philippine government's weather service (PAGASA), which has recently used an eight-PC Debian cluster to replace an SGI supercomputer. The system processes data from local sources and the Global Telecommunication System, and it has reduced monthly operational costs by a factor of 20. Quoting: "'We tried several Linux flavours, including Red Hat, Mandrake, Fedora etc,' said Alan Pineda, head of ICT and flood forecasting at PAGASA. 'It doesn't make a dent in our budget; it's very negligible.' Pineda said PAGASA also wanted to implement a system which is very scalable. All of the equipment used for PICWIN's data gathering comes off-the-shelf, including laptops and mobile phones to transmit weather data such as temperature, humidity, rainfall, cloud formation and atmospheric pressure from field stations via SMS into PAGASA's central database."
  • by pembo13 ( 770295 ) on Friday March 14, 2008 @01:32AM (#22748554) Homepage
    How different can Debian really be from Red Hat in terms of stability? They both use the Linux kernel and GNU tools, and follow the LSB, no?
    • Many distros add kernel patches and ship different drivers in the initrd.
      The core OS (the most minimal installation) also has many different tools and libs.

      Also, at release time they can pick from many different versions of a single package.
      That, in combination with the GCC version and compile flags, can and does make a huge difference.

      And at least with Debian you really do know how the system was built; with RedHat I still wonder...

      Marcel
      • by prefect42 ( 141309 ) on Friday March 14, 2008 @03:39AM (#22748962)
        You don't have to wonder with RedHat. Just look at the SRPMs and see what patches they've applied.
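        For instance, a quick sketch (the SRPM filename is only illustrative, and the SPECS path assumes the stock RHEL build root):

        rpm -qpl kernel-2.6.18-92.el5.src.rpm | grep -i patch   # list the patches shipped in the SRPM
        rpm -ivh kernel-2.6.18-92.el5.src.rpm                   # unpack it into the build tree
        less /usr/src/redhat/SPECS/kernel*.spec                 # see which patches actually get applied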
      • by 0racle ( 667029 )
        Given that Linus's approach to 2.6 was to develop and experiment in the main tree and let the distros stabilize it, Red Hat patching the kernel is hardly a bad thing. Besides, there are more than a few Debian packages that are heavily patched to work 'The Debian Way'. If I wanted a distro that was pretty much vanilla everything, I'd go for Slackware, not Debian.
    • by Zantetsuken ( 935350 ) on Friday March 14, 2008 @01:46AM (#22748592) Homepage
      Because each major distro, while using the same base kernel, GNU command-line tools, and the same GNOME/KDE environments, can ship radically different kernel patches and drivers, implemented by one distro's developers but not another's. If you're using whatever GUI tools a distro provides, each can configure the same backend very differently, and depending on how the tool writes the config file that can also affect stability, security, and other functions. Also, Fedora/RHEL tend to use tools created or modified by Red Hat specifically, which aren't easily available for Debian or SuSE, and those distros have their own tools in the same manner.
    • by Xero_One ( 803051 ) on Friday March 14, 2008 @02:30AM (#22748724)
      Debian will run multiple services reliably under heavy load. From my limited experience, it's one of those distros where you "Set It And Forget It" and that's that.

      Once you've got it configured correctly, there's little intervention involved in maintaining it. It'll just keep chugging along. The key word there is "correctly": follow the readmes, howtos, and best practices, and you're golden.

      It's also one of the oldest distributions, and it has always kept to the spirit of GNU/Linux in general: community development and enrichment. Debian developers pride themselves on that spirit of making the best software for humans (at least that's what I gather from hanging out with Debian folk). These people are not only passionate about the software they write, they do it humbly and without wanting anything in return. To them, the reward is other people using their software and loving it! In my opinion they're not recognized enough.

      But what do I know? I just use the software.
      • According to http://www.top500.org/stats/list/30/os [top500.org], when the Linux flavour is identified, most are SUSE.
    • Also, the stable version of Debian is very stable, as in it doesn't change. Security fixes are almost always backported, so you don't wind up with new features or changed behaviours, etc. I don't follow Red Hat, so I don't know how much they differ in that regard, but when you have a server that's configured how you want it and working fine, it's really nice to know that installing a security update isn't going to change any of the functionality.

      In addition, packages go from unstable through testing and si
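      For what it's worth, keeping a stable box patched is basically one sources.list line plus an upgrade; a minimal sketch using the etch-era security archive (adjust the release name for whatever stable currently is):

      deb http://security.debian.org/ etch/updates main contrib non-free
      apt-get update && apt-get upgrade   # on stable this pulls in only the backported fixes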

    • by emag ( 4640 ) <slashdot@gursk[ ]rg ['i.o' in gap]> on Friday March 14, 2008 @01:52PM (#22753618) Homepage
      Well, one of the things I'm running into, having walked into a RHEL-heavy shop, is that every single RH box has what I've come to derisively refer to as /usr/local/hell. Every. Single. One. Basically, because of the extreme pain of upgrading (or others' laziness, though my limited experience in the distant past says it's mostly the former), we have RHEL3, RHEL4, and RHEL5 servers, all at whatever the then-current patchlevel was (AS, ES, WS, BS, XYZS, ASS...Taroon, Macaroon, Buffoon...you get the idea), that have almost everything important duplicated in /usr/local, built from tarballs. Can't remove the system-supplied stuff, since what's left that expects it will balk, but can't use it either, since users or security concerns dictate significantly newer releases of everything from Apache to perl to php to mysql to...

      This is, of course, a nightmare. Worse, the kernels on all of these have the notorious RH patchsets, so as far as anyone knows, each and every one has a mish-mash of backported "features" from newer kernels, but few of the bugs fixed that those newer ones have. In fact, several are still at 2.4.x kernels that, even years later, suffer from a kswapd problem that makes the machines unusable after a while. And we're getting in newer hardware that the 2.6 kernels that ARE supplied don't support. Everyone here has given up trying to build a plain vanilla kernel from the latest sources, because there are so many RH-applied patches to their kernels that may or may not even be applicable or available for the vanilla Linus kernels, that no one can say with any degree of certainty that the system will even boot. With Debian, I gave up on vanilla kernels because I was just tired of sitting through recompiles for the "advantage" of just removing a few things that were modules that I knew I would never use, customized to each of a half-dozen machines.

      With Debian, which I've run for years without a "reinstall", updates are significantly simpler to perform, and if you want to throw backports.org into your sources.list (which may or may not be a fair thing to do), even *stable* has 2.6.22, or 2.6.18 without bpo in the mix. Remote updates? No problem, Debian was *designed* to be updatable like that from the start. The dpkg/apt* tools are still worlds ahead of the RH (and SUSE) equivalents. Dependencies are easier to deal with, as are conflicts, and security.d.o seems to be a lot more on the ball about patches than anyone else.
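      A rough sketch of the backports bit, for anyone who hasn't tried it (the package name is just a placeholder):

      deb http://www.backports.org/debian etch-backports main contrib non-free
      apt-get update
      apt-get -t etch-backports install some-package   # backports stay out of the way unless you ask for them with -t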

      In fact, I'm often telling our security guys about updates that fix vulnerabilities that their higher-up counterparts haven't started riding them about yet, so they can get a head start on going through the major hassle of begging/cajoling/threatening the RH admins to grab the latest sources, rebuild, and re-install so we don't get slammed on the next security scan for having vulnerable servers. Not that it's ever "the latest", but always "the least amount of upgrade needed to avoid being flagged", which means that next month we go through this all again. With Debian, "aptitude update ; aptitude upgrade" (or just "aptitude"/whatever and then manually select packages, though in stable, it's only going to be fixes anyway most of the time), and the handful of systems I've gotten in under the radar are up-to-date security-wise with significantly less effort.

      Even the "you get what you pay for in terms of support" canard has been proven false. We had a couple of brand-new, freshly-patched RHEL5 systems that just would not stay up. The first thing RH Support had us do was run some custom tool of theirs before they'd even attempt to help us. A tool that, I should add, is rather difficult to run on a machine that goes down within minutes of coming up. Finally we re-installed and didn't apply the RH-sanctioned updates. Machine...stays up. Same thing with some RH-specific clustering software. Another RH-approved release resulted in...no clustering at all. For whatever reason, re-installing the prior RPM wasn't possible, but the
  • by dhavleak ( 912889 ) on Friday March 14, 2008 @01:55AM (#22748628)

    What was the age and the specs of the SGI being replaced?

    Going by Moore's law, a factor-of-20 performance improvement takes about 6 to 8 years (with a doubling roughly every 18 months, 2^(6.5/1.5) is about 20). If the SGI was at least that old, this isn't news -- it's just the state of the art these days. In other words, small clusters capable of weather forecasting are relatively run-of-the-mill.

    Of course, props to linux for being the enabler in this case.

    • 6 to 8 years, you say? Well, then, they'll be ready to upgrade about the time the next version of Debian is released.

      • Debian "unstable" Sid is upgraded every day, or at least several times per week.

        Debian "testing" is upgraded several times a month.

        Debian "stable" is upgraded every one or two years.

        Take your pick.

        I chose "unstable" which is stable enough to be on my home machine. I have never had any serious issues, so far, after one year of usage.

        For a production server I would use "stable" but for a research machine the "unstable" looks like a good choice. I guess the people who built it would know what to do.

        The only one I have avoided is "Debian experimental"... :)
        • by IkeTo ( 27776 ) on Friday March 14, 2008 @06:27AM (#22749474)
          This is inaccurate; as a long-time Debian user I really cannot resist correcting it.

          > Debian "unstable" Sid is upgraded every day, or at least several times per week.

          True.

          > Debian "testing" is upgraded several times a month.

          Wrong. Debian testing is updated automatically from packages in Debian unstable. The difference is simply that a package has to sit in Debian unstable for a few days, with no significant bugs introduced by the new version, before it migrates. Since the process is automatic, Debian testing is updated only slightly less continuously than unstable (it depends on a robot checking the package dependencies and bug reports rather than on the maintainer uploading a new version).

          The only time the update rate drops much below that is when testing is in deep freeze, i.e., when a new stable release is about to be made.

          > Debian "stable" is upgraded every one or two years.

          It usually takes slightly longer than two years.

          > The only one I have avoided is "Debian experimental"... :)

          You cannot have a pure "Debian experimental" system. Debian experimental holds subsystems that could have a profound effect on the rest of the system, and so are provided for trial in isolation. E.g., back in the GNOME 1 to GNOME 2 transition days, or the XFree86 3 to XFree86 4 days, these subsystems were tested in experimental before moving to unstable. These packages are meant to be used on top of, or to replace, some unstable packages. Since each affects one particular subsystem, experienced testers can try a particular one based on their needs.
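          In practice, trying a single subsystem from experimental is just an extra sources.list line plus a target release; a minimal sketch (the package name is a placeholder):

          deb http://ftp.debian.org/debian experimental main
          apt-get update
          apt-get -t experimental install some-package   # only this package (and what it drags in) comes from experimental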
          • Re: (Score:3, Interesting)

            > Debian "testing" is upgraded several times a month.

            > Wrong. Debian testing is updated automatically from packages in Debian unstable. [...]

            I don't think the process of migrating packages from unstable to testing is quite as automatic as you describe. At least, the most important packages (like linux, gcc, glibc, dpkg, python, xorg, gnome, kde, ...) don't migrate automatically. These transitions are made only when the maintainers think they're ready to be included in the next stable debian release and when they're sure that they don't break anything.

            • Re: (Score:3, Informative)

              by IkeTo ( 27776 )
              > I don't think the process of migrating packages from unstable to testing is quite as
              > automatic as you describe. At least, the most important packages (like linux, gcc, glibc,
              > dpkg, python, xorg, gnome, kde, ...) don't migrate automatically. These transitions are
              > made only when the maintainers think they're ready to be included in the next stable
              > debian release and when they're sure that they don't break anything.

              The process is automatic. There is even a script to tell you why a particu
    • by Gutboy ( 587531 )
      a factor of 20 performance improvement takes about 6 to 8 years

      It was a cost reduction of a factor of 20, not a performance improvement.
      • It was a cost reduction of a factor of 20, not a performance improvement.

        Err.. if the old SGI had 400 CPUs and the new cluster needs only 20 CPUs to match its performance, then the new cluster performs the same at 1/20th the cost.

        I know this is an over-simplification; the SGI probably used vector processors and relied a lot less on parallel processing, so cost-per-processor and number of processors would be completely skewed from my example if that's the case, yada yada yada.. but you get the point, I hope..

    • by TwoCans ( 255524 )
      > What was the age and the specs of the SGI being replaced?

      Well, RTFA:

      "Previously, for almost a decade PAGASA used an SGI Irix supercomputer...."

      So they were probably using an O2k. Somebody wake me when there is some real news.....

    • Re: (Score:3, Informative)

      by fm6 ( 162816 )
      It's not news that an old system was replaced by a new system. It is interesting that an old supercomputer wasn't replaced by a new supercomputer; a cluster of cheap commodity systems does the job just as well when you don't need real-time performance. This sort of creative use of PCs is what drove SGI into bankruptcy and irrelevance [sgi.com].

      This Philippine newspaper story [manilatimes.net] fills in some important details missing from the Australian PC News article: the age of the SGI system (10 years) and the reason it was costing s
      • by emag ( 4640 )
        Few even within SGI seem to be working on IRIX either. At least, that's the impression I get from our unfortunate IRIX admin here. The amount of cursing and swearing that ensues when SGI has a "recommended" change/fix, especially for our CXFS [wikipedia.org] servers, is...impressive. The fact that even the "5 minute" changes still require us to warn users to expect at least a half day of downtime (based on experience) is significantly less impressive...
        • by fm6 ( 162816 )
          "Few"? Try none. SGI EOLed its last IRIX product in 2006. There might still be somebody working on patches for their support customers, but that's a small part of what is now a very small company.
    • by Lennie ( 16154 )
      It didn't say performance improved by a factor of 20; the article (and summary) say the operational cost went down by a factor of 20.
  • by toby34a ( 944439 ) on Friday March 14, 2008 @02:03AM (#22748646)
    Most weather prediction centers have adapted their weather forecast models to run on Linux clusters. By running an operational forecast model on a cluster, it's easy for forecasters to scale the models so that they can be run (albeit slowly) on desktop machines, and the models are easily worked on by real meteorologists (versus IT professionals). At my university, we use a large cluster of machines running Red Hat Enterprise, and we're then able to scale the models and run them on multiple processors using the MPICH compiler wrappers and batch jobs. Really, using a Debian cluster is no different than using a RedHat cluster. My colleague has access to the NOAA machine, which has more processors than you can shake a stick at... he talks about some code that takes 3 days to run on his personal workstation but takes 2 minutes on 40 processors. With the relatively low cost of a Linux cluster, weather forecasting models can be run quickly and efficiently on numerous processors at a local level. And with the ease of use of a Linux machine versus some of the supercomputers, it puts the power in the meteorologists' hands to make the changes to the model that improve forecasts.
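    For anyone curious what that looks like in practice, a minimal sketch with MPICH (file and host names are made up, and mpif90 assumes the wrappers were built with Fortran support):

    mpif90 -O2 -o model_demo model_demo.f90       # MPICH's Fortran wrapper handles the MPI include/link plumbing
    mpirun -np 8 -machinefile hosts ./model_demo  # spread 8 processes over the nodes listed in "hosts"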
    • The article mentions the Global Telecommunication System (GTS). It would be cool to know
      how they get their GTS data; they probably use a satellite downlink. There is a GPL GTS switch developed for Debian:

      http://metpx.sourceforge.net/ [sourceforge.net]

       
      • GTS was until recently largely an X.25 [wikipedia.org] PSTN [wikipedia.org]. I learned X.25 helping maintain message-switching software at a military weather forecasting center; we were a subsidiary node of GTS in that capacity.

        I know that many nodes in the GTS have gone to FTP or TCP socket streaming over the Internet (or VLANs running under the Internet). Old-sk00l by 'net standards, but Très moderne in the WMO timescale.

          Exactly! MetPX is a TCP/IP-only switch. It implements WMO Manual 386 TCP/IP sockets, as well as file exchanges over FTP and SFTP. It was written to accomplish a transition away from proprietary mainframe stuff exchanging X.25 with the GTS. It also does AFTN (Aeronautical Fixed Telecommunication Network) over TCP/IP, in contrast to traditional X.25. It is used for the Canadian gateway between the GTS and AFTN, as well as for Canada's GTS node itself.

          Many think of the GTS as an X.25 network, but X.25 is going
    • by jd ( 1658 )
      If you look at scientific open-source software for netcdf and other parallel data management systems, a very substantial portion is for climate and weather modelling. High-energy physics is another major area for open source. Computational fluid dynamics is also popular, but most of the code requires a posted and faxed agreement that the source not be opened to hostile countries. I would expect Linux and the *BSDs to be in widespread use in all three areas, simply because anyone who needs that level of cont
  • Obligatory (Score:3, Funny)

    by Nefarious Wheel ( 628136 ) on Friday March 14, 2008 @02:23AM (#22748694) Journal
    Imagine a Beowulf cluster of ...

    oh, wait...

  • Where is... (Score:5, Funny)

    by darekana ( 205478 ) on Friday March 14, 2008 @02:27AM (#22748706) Homepage
    I tried:
    apt-get -f -y install gweather
    But it failed with something about "ldconfig: /lib/libearthquake-2.3.so.0 is not a symbolic link"

    Is libearthquake in unstable?
  • "an eight-PC Debian cluster"

    "[we] wanted to implement a system which is very scalable"

    8 PCs? 64 cores at most. And they call that scalable? Come on, today's top500 top machines scale across 10,000 cores. They're 15 years late.

    • Re: (Score:3, Informative)

      Perhaps they mean it is scalable in the sense that one could simply add more machines to the cluster, rather than adding more cores to the machines already in place.
    • As was already mentioned, 'scalable' doesn't mean 'already at max scale'; it means they have the ability to increase the size of the cluster easily. And as far as the Top500 goes, you don't need 10k cores to get on the list. Lower on the list (around 450-500) you'll find machines with under a thousand dual-core Xeon 5100 Woodcrests.
  • From TFA
    What was even surprising to us is that Intel FORTRAN is also free of charge ...
    I bet Intel are surprised too. Their compilers are free of charge only for non-commercial use, and the people at the Philippine government's official weather service are hardly "not getting compensated in any form": http://www.intel.com/cd/software/products/asmo-na/eng/219771.htm [intel.com]
  • The article lacks, as usual, information about what those machines actually do when they compute together.

    What I want to know is: Do they have a big 64-bit addressable RAM image spread over all nodes, communicating with pthreads, like I prefer? Or perhaps they have several 32-bit RAM images communicating with some special message protocol. Or perhaps they just have lots of quite independent but equal programs running as an ensemble. Or perhaps some kind of pipeline where the different parts of the calculatio
  • what else is new?

    FYI, I am a little biased, but Debian is the distro that consistently gives me the least trouble.
    • Agreed. Debian (and all its variants) are my favorite distros by far. Everything just works as expected, not to mention their package management system (apt) is by far the superior binary package manager (Gentoo's Portage is really nice as well, but I'm not so much a fan of installing from source).
