Slashdot
IBM Saves $250M Running Linux On Mainframes
Posted by kdawson on Tue Jul 31, 2007 08:54 PM
from the mmmm-dogfood dept.
coondoggie writes "Today IBM will announce it is consolidating nearly 4,000 small computer servers in six locations onto about 30 refrigerator-sized mainframes running Linux, saving $250 million in the process. The 4,000 replaced servers will be recycled by IBM Global Asset Recovery Services. The six data centers currently take up over 8 million square feet, or the size of nearly 140 football fields."
Proof of Linux's Environmentalist Friendliness (Score:5, Funny)
Of course they're saving money (Score:5, Funny)
(it's a joke)
Must be SCO jacked up the rates... (Score:4, Funny)
A pleasure to work with, as well.. (Score:5, Informative)
2000 sq feet per small computer? (Score:5, Interesting)
Re:2000 sq feet per small computer? (Score:5, Funny)
(I'm so sorry)
My employer recently 'consolidated' too. (Score:5, Interesting)
Now we have ten VM servers running all the migrated services, PLUS a room with about fifty aging Dell PowerEdge servers, each running independently, requiring massive support, cooling, and electricity.
I never thought 'consolidation' would require so much more space, electricity, air conditioning, and upgrades to core switches and UPS units.
Re:My employer recently 'consolidated' too. (Score:5, Funny)
(Strange thing is, I make a good living replacing aging mainframes with Linux clusters. Mainframes are fine when you're doing transaction processing. But for CPU-bound stuff, you're better off with a room full of Opterons.)
Re:My employer recently 'consolidated' too. (Score:4, Insightful)
I wonder if IBM factored in the number of oddball projects that require Windows systems in their server count? Windows won't run in a zSeries VM, and there is plenty of software out there that is still Windows only.
$250M?? (Score:4, Insightful)
Re:$250M?? (Score:5, Funny)
Re:$250M?? (Score:5, Informative)
Combine IT salary for 3-5 years, power over 3-5 years, etc. etc. and that number makes sense.
what does this have to do with Linux? (Score:5, Insightful)
The story here is about consolidation, virtualization, etc.
Linux is a small part of the technology involved here. z/OS is the real story here.
Well Duh. (Score:5, Insightful)
The reason why companies are in this pickle is because they thought more was better. They thought, "All we need to do is buy 4000 x86 servers and we can do tons of work." They didn't realize how HARD it is to get 4000 servers to operate in a cluster so you can take advantage of those individual systems as one body. So, they ended up with islands of computing power instead of a cluster. Naturally the mainframe consolidates these islands back to computing continents and you end up running the mainframes at near capacity all the time. Modern mainframes make this easy with dynamic CPU/RAM allocation, as well as dynamic storage. So you segment out the mainframe into four or eight chunks. Chunk 1 is hot, chunks 3 and 5 are idle. Simply re-assign some of the CPUs from chunks 3 and 5 to chunk 1 until the load goes down. You can take advantage of this in a big way if you segment your workload to match global demand. So chunk 1 might be data for the western USA, and chunk 7 might be EMEA. You can bounce resources between those segments much more easily. You can even script it. HP has an offering that does this automagically; I'm sure IBM has something similar.
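To make the "you can even script it" point concrete, here is a minimal sketch of that kind of rebalancing loop. It is not IBM's or HP's tooling: the function name `rebalance`, the thresholds, and the idea of expressing demand in CPU-equivalents are all assumptions made up for illustration.

```python
def rebalance(cpus, demand, hot=0.85, idle=0.30, min_cpus=1):
    """Move CPUs one at a time from idle chunks to hot chunks.

    cpus   -- dict: chunk name -> CPUs currently assigned (hypothetical model)
    demand -- dict: chunk name -> work demanded, in CPU-equivalents
    Returns a new allocation; the total CPU count is unchanged.
    """
    cpus = dict(cpus)  # work on a copy, leave the caller's dict alone

    def util(chunk):
        return demand[chunk] / cpus[chunk]

    while True:
        hot_chunks = [c for c in cpus if util(c) > hot]
        donors = [c for c in cpus if cpus[c] > min_cpus and util(c) < idle]
        if not hot_chunks or not donors:
            return cpus
        receiver = max(hot_chunks, key=util)  # most overloaded chunk
        donor = min(donors, key=util)         # most idle chunk
        cpus[donor] -= 1
        cpus[receiver] += 1


allocation = rebalance(
    {"chunk1": 4, "chunk3": 4, "chunk5": 4},
    {"chunk1": 3.9, "chunk3": 0.4, "chunk5": 0.8},
)
print(allocation)
```

In this toy run chunk 1 starts above the hot threshold, borrows a CPU from the idlest chunk, and the loop stops once nothing is hot any more. A real system would also have to move memory and I/O and respect licensing limits, which this sketch ignores.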
Now, my personal question is: why Linux? Some of the more advanced features like dynamic RAM, CPU, and IO allocation don't appear to be that solid to me. Perhaps IBM added these features to Linux or made them more robust? Maybe they run Linux inside an AIX virtualization container?
A very confusing endeavor for us (Score:5, Interesting)
We did some calculations and determined that for the price of a zLinux engine we could buy an entire rack of high-end HP servers that would outperform the single engine by a factor of 200:1. Again, maybe it was just the workload we were doing, but even IBM couldn't figure it out and our server work profile isn't exactly uncommon. Granted you can cram a lot of guests onto a host system provided that none of the guests want to use more than 10% of their CPU at any given time, but that defeats the purpose. I could probably run a VMWare host with 100 guests and call it a success, provided they all sat idle.
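The guest-density point above is simple arithmetic, sketched here under a deliberately crude assumption: every guest peaks at the same time, so the host must cover the sum of the peaks. The function name `max_guests` and its parameters are hypothetical, and the CPU counts are illustrative, not figures from our evaluation.

```python
def max_guests(host_cpus, guest_vcpus, peak_util_pct):
    """Guests that fit if each one peaks at peak_util_pct percent
    of its guest_vcpus simultaneously (worst case, no overcommit).
    Integer math avoids floating-point rounding surprises."""
    return (host_cpus * 100) // (guest_vcpus * peak_util_pct)


# A 10-CPU host with single-vCPU guests:
print(max_guests(10, 1, 10))   # guests that never exceed 10% CPU
print(max_guests(10, 1, 50))   # guests that peak at 50% CPU
```

The count falls in direct proportion to how busy each guest gets, which is why a box full of nearly idle guests says little about a demanding workload.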
It was kind of funny because the IBM engineers would shake their heads and admit that for our workload it just wasn't going to work out. Then the next week the sales guy would call and ask if we were ready to buy that third mainframe since he just read the engineer's report and our visit was obviously a smashing success.
I'm not knocking the whole Linux on the mainframe concept, I'm just sharing our experience and how the whole thing seemed to be like someone in IBM Marketing declared "we need to sell Linux on the mainframe" and the Dilberts were forced to sell a product that worked about as well as a chocolate fireguard. It was a very awkward experience and even the IBM engineers seemed like they were stuck in an uncomfortable position of supporting sales for a product that even moderately demanding customers wouldn't be able to run with.
Personally I consider Linux on the mainframe to be on par with running Linux on an iPhone. Sure you probably can, but does it actually do anything uniquely useful for the business? I have a hard time selling technology to the CIO on the grounds that because it's Linux it's a good business decision regardless of the context.
Amplification re: CPU Sparing (Score:5, Interesting)
Actually, on a System z9 EC (Enterprise Class), a single CPU chip failure is not a "Call Home" repair event. Only the second CPU chip failure would result in an automatic call, while your business keeps running of course. (There are a minimum of two spares in each machine.) The average time to first failure for a particular machine is somewhere in the many decades range.
OK, just for fun (because it never actually happens in the real world), what happens with a triple failure? If you happen to have a "fully configured" mainframe -- all processors turned on -- then.... your business still keeps running. Yes, the system might lose some processing capacity, but it keeps running. The higher priority stuff (from a business view) takes precedence automatically, and life goes on. This is all on a single machine still.
If you've got an S18, S28, S38, or S54 model, then, at your business's convenience, the faulty hardware can be replaced. (You might do this at night, for example.) The repair technician tells the mainframe to "evacuate" memory on a portion of the machine while the OS and applications keep chugging along, possibly with reduced capacity, often not. (Depends on what configuration you choose.) When the evacuation is complete, the technician can pull a processor/memory group (called a "book"), insert the new one, bring the new one online, and... everything still keeps running. Again, this is all on a single machine -- no clusters required for any of this.
It's just an endless cycle (Score:5, Interesting)
But I bet that a small farm of modern medium-sized servers running Linux on VMWare would be even less expensive. Or Solaris/Niagara. Why would you want to run an open source operating system, whose major benefits are openness and affordability, on what is literally the most expensive and most proprietary computing platform in the world?
These server consolidation projects are just giant boondoggles spawned because the server sprawl finally got insane. It's an endless cycle:
A. Giant server consolidation project that takes 4000 servers down to 30 servers.
B. Department B complains that Department A's application keeps hanging and consuming all of the CPU. They demand their own hardware "for availability reasons".
C. Vendor C demands dedicated hardware for licensing/capacity planning/supportability reasons. Rather than constantly bicker with the vendor over supportability they get dedicated hardware.
D. Department D complains that the IT department is charging outrageous prices for time sharing on the mainframe. After all a dedicated server only costs $XXX.
E. Suddenly there are 4000 servers again.
F. IT department spends some insane amount of money on infrastructure to manage the 4000 servers.
G. IT department budget gets insanely large trying to manage that much stuff.
H. Some CIO gets the idea that all of this money managing servers is ridiculous and we should do a server consolidation project.
I. IT department spends an even larger amount of money on the latest super high availability gear and consulting services so that they can run 4000 commodity servers inside a few big servers. All because it will "cost less to maintain".
J. Go back to A.
Big Iron. Right concept, wrong platform. (Score:5, Insightful)
For example, for email we run Lotus Notes on a couple of BIG pSeries (AIX) servers. We could have run it on a farm (technical term) of Windows boxes.
For webservers, which you could run on AIX or on Linux on zSeries, we have multiple (read: many) x86 servers running Linux+Apache. Why? They connect to a backend app server (pSeries), which connects to a backend zSeries DB2 (I'd prefer Oracle; however, running Oracle on zSeries requires a Linux VM).
We definitely subscribe to the school of using VMs, whether they are zSeries, pSeries, or VMWare on x86. Even if the x86 server is running ONE application, we still put VMWare underneath, as it allows us to move the image to a newer hardware platform when it's time to upgrade. Even some of the larger x86 servers run VMWare, but in each partition there is a single instance of Apache. Makes managing storage that much easier (fewer zones, less cabling, etc.).
Would I consider moving our Apache on Linux on x86 to Apache on Linux on zSeries? Not really. It's a waste of CPU cycles (MIPS). I'd rather use zSeries MIPS for something a bit more critical, like keeping my database up and running, than serving out webpages (static or dynamic).
IT isn't about religion, it's about finding the best tool considering your requirements. I have no problems telling IBM that product XYZ is trash. While my servers are IBM, you won't see IBM disk or IBM tape, and at least once a quarter some salesman from IBM's storage group is at my door. He buys me lunch and every quarter he is sent packing. You won't see IBM BladeCenters, as the thought of hundreds of additional servers to manage isn't appealing (but I'll gladly take 100s of VMs across larger x86/pSeries boxes).
I know many of you were expecting to hear me say 5000 linux servers, but there are options for my requirements that did not lead to big "google style" linux farms.
BTW: I have no problems kicking out IBM on x86 if HP/Dell/Sun have a better product, and knowing this and letting IBM know this gives me a great advantage over them, as they very well know I'm capable of bringing in something more suitable. (I *used* to have IBM storage).
Re:No (Score:4, Informative)
Re:System z Mainframes (Score:5, Informative)
Part of that is because IBM will customize the machines to your heart's content. The sky and your budget are the only limits. They leave a good many of the loadout details (xGB/TB of RAM, DASD storage size, # of CPUs per card, # of CPU cards, even number of mainframes - they can be chained in parallel) up to you. You should look at the Z series hardware specs [ibm.com] for the general details and look up what details you don't know.
If you're looking for benchmarks or comparisons to x86/x86-64 or other commodity architectures, good luck - they are nearly impossible to find. This is due to the implementations being on entirely different scales. The best comparison you can find is the MIPS per CPU. You can find some slightly stale numbers here [isham-research.co.uk] (BTW: an LPAR [wikipedia.org] is something that's been around on mainframes for several decades - one LPAR can run up to several hundred x86 VMs concurrently).
IBM's been doing this for-ever, dude. (Score:5, Interesting)
It was pension and payroll software and it was legally blessed.
It was such a frigging song and dance trying to get anything done that it was cheaper and faster for the company to emulate their butts off rather than trying to go through the management and the unions and the employees.
But I did learn about optimizing instruction fetches by scattering the compiled code around the circumference of a magnetic drum so that the drum would have rotated around beneath the read head in time for the next instruction.
Try and tell that to the young people of today, and they won't believe you, eh Obadiah?
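For anyone who hasn't seen the drum trick, here is a toy sketch of the placement idea. It assumes a much simplified machine: drum positions are counted in word-times, reading an instruction takes one word-time, and each instruction names the address of its successor. The function `place_instructions` and its parameters are made up for illustration and do not describe any particular machine or assembler.

```python
def place_instructions(exec_times, drum_size):
    """Pick a drum address for each instruction of a straight-line program.

    exec_times -- execution time of each instruction, in word-times
    drum_size  -- number of word slots around the drum track
    Each instruction goes in the first free slot at or after the position
    the read head will be over when the previous instruction finishes.
    """
    if len(exec_times) > drum_size:
        raise ValueError("program does not fit on the drum")
    occupied = [False] * drum_size
    addresses = []
    pos = 0  # slot under the read head when the next fetch can start
    for t in exec_times:
        while occupied[pos]:            # slot taken: wait one more word-time
            pos = (pos + 1) % drum_size
        occupied[pos] = True
        addresses.append(pos)
        # one word-time to read the instruction, then t to execute it
        pos = (pos + 1 + t) % drum_size
    return addresses


print(place_instructions([3, 3, 3], 50))
```

Laid out sequentially at addresses 0, 1, 2, each fetch would wait almost a full revolution; spacing the instructions by their execution time means the next one arrives under the head just as it is needed.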
Re:IBM's been doing this for-ever, dude. (Score:5, Funny)
Re:IBM's been doing this for-ever, dude. (Score:5, Informative)
Re:single points of failure (Score:5, Informative)
These are machines that don't break, period. We're talking the types of machines that run the major banking systems of the world and the like. They simply do not go down. In this situation, if one of the 133 apps buggers up, it's only that VM that's shot. You just nuke it and restart it, the rest of the machine just keeps ticking along.
OS/2 (Score:4, Insightful)
I think you're hesitant to accept IBM because of the whole 70's/80's "Big Blue" stuff, but after Microsoft swept the rug out from under their feet the company's strength was permanently compromised. The consumer market rejected them (hence the sale of the PC division to Lenovo), and until they committed to Linux, software was a major vulnerability for them. The openness of Linux enabled them to get back in the game - their customers didn't have to worry about the future of the platform, while their immense contributions to Linux enabled the OS to really threaten Microsoft. So yeah - as a Slashdotter, IBM are the good guys. They support Linux and they don't aggressively protect their many, many patents (they use their patents to protect themselves rather than trying to sue everyone they can for $$$). Personally, I think IBM is the most important tech company in the world.