

Linux On Another New Architecture: PowerPC 64-bit
An unnamed correspondent writes: "This one rather silently whizzed by on the kernel mailing list. IBM reports that they have ported Linux to PowerPC hardware running in 64-bit mode. This no doubt applies only to the larger processors but it's pretty cool all the same." I don't see this processor yet listed on the NetBSD page, even on the mind-bending list of not-yet-integrated ports; is this a first? :)
Re:IBM (Score:1)
Systems that use the POWER3 chip (Score:1)
Re:OK, dumb 32/64 bit question (Score:1)
64-bit is going to be helpful if you're interested in high-precision floating point math. A 64-bit processor can do operations on 64-bit FP numbers as a single operation, rather than the four operations you'd need on a 32-bit processor, so there's a big speedup in heavy-duty number crunching. That's why you always see people doing really chunky numerical modeling, like predicting the weather, using 64-bit computers instead of cheaper 32-bit ones. Not what everyone needs, mind you, but the people who do need it are willing to pay.
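(A minimal C sketch of the general idea, using integers rather than floating point for clarity. Illustrative only; a real compiler would just use the carry flag directly:)

#include <stdint.h>

/* Synthesizing one 64-bit add from 32-bit halves, roughly what a
   32-bit machine has to do when it lacks native 64-bit operations. */
typedef struct { uint32_t lo, hi; } u64;

u64 add64(u64 a, u64 b)
{
    u64 r;
    r.lo = a.lo + b.lo;
    r.hi = a.hi + b.hi + (r.lo < a.lo);  /* propagate the carry */
    return r;
}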
how long until... (Score:1)
Re:The real question (Score:1)
--
Re:Big Time Linux: Itanium, S/390, PPC64 (Score:1)
And where is a semi-usable UltraSPARC distribution?
I actually have been running RH 6.2 for some time on this machine - runs like a champ. Often have to recompile software, but no real problems to speak of...
[chris@gomez chris]$ cat /proc/cpuinfo
cpu : TI UltraSparc IIi
fpu : UltraSparc IIi integrated FPU
promlib : Version 3 Revision 15
prom : 3.15.2
type : sun4u
ncpus probed : 1
ncpus active : 1
BogoMips : 665.19
MMU Type : Spitfire
Re:Dumb answer (Score:2)
--
Re:OK, dumb 32/64 bit question (Score:1)
PowerPC - does anyone care? (Score:2)
Given the current trend of consolidation, I see room for Intel, AMD, and a high-end player yet to be named - either Alpha or PPC. I'm discounting the Mac userbase in advance as I believe Mac users care the least about the technical details of their platform, and hence constitute an OS market more than a microprocessor market.
Re:IBM Gaining Marketability in Mainframe Industry (Score:2)
--
apple, 64 bit ppc, etc. (Score:2)
> out six weeks before the internally developed 630.
The 620 made it into at least one Apple server, IIRC. And when it trounced the Wintel boxes in a benchmark, the predictable response was that it wasn't fair to compare a 64-bit machine to 32-bit machines.
hawk
Why does this matter? (Score:2)
Re:IBM (Score:2)
> The guys in the server room who clean the crap off the floor get hot
> and bothered by operating systems. The guys these people clean up
> after only care about getting the job done right.
For the most part. But if you told them to run NT on big iron, they would probably get hot under the collar and very bothered.
hawk
:)
Re:Actually, SMT is much better. (Score:2)
Thanks, all!
--JRZ
Re:Big Time Linux: Itanium, S/390, PPC64 (Score:1)
Peter
--
www.alphalinux.org
Benchmarks used. (Score:2)
For that particular project, I was using the go and cc1 integer benchmarks from the SPEC suite (not sure which year). No special reason; these were just the ones I had on-hand, and for the project it didn't really matter (as I was interested in relative and not absolute results).
I cite the figures from that project as a reference point to give some idea of the ballpark values that can be expected. A 50% increase in ILP for "average" code I might believe. A 400% increase I wouldn't.
Yes, certain scientific applications can be written to be easily parallelized, but this is only one niche. For most code, I am deeply skeptical of filling 8 issue units per clock. SMT offers the potential for across-the-board speedup (as long as you're running more than one CPU-bound thread on the machine at once).
Multiple cores on a chip. (Score:2)
The multiple cores idea has been around for a while, and certainly works; SMT is just more resource-efficient.
My impression is that the Power4 is going to have two cores, but I haven't been following it closely, so I could easily be wrong about that.
In a SMT system, functional units are indeed shared dynamically between the threads. As far as most of the chip's concerned, there's only one instruction stream, composed of interleaved instructions from the two threads (well, not interleaved in lockstep, but close enough). All you'd need to do would be to add an extra bit onto the register specifier tags (so that the two threads access non-overlapping sections of the register file) and give each thread its own page table identifier (selected by a few bits tacked on to the address). You could even get away with having a single TLB cache.
In summary, you can keep most of the design the same as for a single-thread machine, and make relatively minor changes in a few places to implement SMT. This takes far less silicon than dual cores, and lets you use the functional units more efficiently and use a wide issue unit efficiently (by boosting parallelism in the instruction stream).
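A toy C model of the register-specifier trick described above. The field widths here are made up for illustration; real designs differ:

#include <stdint.h>

#define ARCH_REGS 32  /* a 5-bit architectural register specifier */

/* One physical register file holding both threads' architectural
   registers in non-overlapping halves. */
static uint64_t regfile[2 * ARCH_REGS];

/* The extra thread bit is simply prepended to the specifier. */
static uint64_t *reg(unsigned thread, unsigned specifier)
{
    return &regfile[(thread << 5) | (specifier & (ARCH_REGS - 1))];
}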
Now what? (Score:1)
This message was encrypted with rot-26 cryptography.
Dumb answer (Score:1)
Re:OK, dumb 32/64 bit question (Score:1)
Re:Dumb answer (Score:1)
Re:The real question (Score:1)
Re:OK, dumb 32/64 bit question (Score:2)
pipeline efficiency, memory bus bandwidth, smp cache coherency efficiency.
If you don't need >4GB of address space, then you're probably better off with a high-clock 32-bit chip and a good memory bus.
read the link first, dumbass! (Score:1)
Now (Score:1)
sizeof(int)==4
sizeof(long)==8
sizeof(void*)==4 (IIRC)
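A trivial check you can compile on any port to see which data model it actually uses:

#include <stdio.h>

int main(void)
{
    /* Print the basic type sizes for this compiler/ABI. */
    printf("int:   %u\n", (unsigned)sizeof(int));
    printf("long:  %u\n", (unsigned)sizeof(long));
    printf("void*: %u\n", (unsigned)sizeof(void *));
    return 0;
}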
Re:IBM (Score:1)
I think that this precisely underlines why Free/Open Source software is such a great idea. When you share, everyone wins. Getting more people onto the platform increases the development effort much less than the support base, so the average effort per user is less. As long as IBM is truly sharing by adding some effort into the system rather than leeching off everybody else, bringing them onboard helps everybody.

Admittedly, adding new platforms as IBM is doing is more effort than adding more users on already-supported platforms, but there's also potentially more benefit. Adding users with different needs adds new features (which is why it's more effort), but many of them will provide trickle-down benefits to other users who wouldn't necessarily have been willing to develop them by themselves. And IBM is playing fair by putting in the development effort of adding those new platforms and features themselves rather than demanding that others do the work.
Get a life (Score:1)
You have to understand that IBM has traditionally been a hardware company, but services really drive revenue and profit now. Sun hasn't stepped up to Linux the way that IBM, Compaq, and HP have, so IBM is hoping to pick up some of the sales and services as Linux momentum picks up.
Re:Linux is dying (Score:1)
I've used GNU/Linux since early '98, and I haven't posted to Usenet since 1996, sooo... what good are your figures?
Seriously, most of us have better things to do with our time than chat in newsgroups or IRC (I finally gave up MUDding in 1997, and I still miss it...), like work for a living!
Sheesh. This doesn't even qualify as FUD; your 'logic' is ^severely^ flawed!
Articles on Testing Web Apps, Kernel (Score:2)
I've posted this here before, but don't want the IBM folks who might be reading to miss it:
I could also use some help from someone with expertise in designing database schemas.
Thank you for your attention.
Mike [goingware.com]
Re:Multiple cores on a chip. (Score:2)
I'm not sure I buy this argument. It is far easier to duplicate an existing design to make another core than to modify the design to support SMT. SMT requires big fetch/decode/rename hardware, a big register file and probably big caches, too. All of this stuff is in the critical path and will be difficult to run at the high clock speeds expected from modern cores.
Actually, unless you want to take a moderate speed hit from recycling external bus protocols internally, you'll have quite a bit of design work on your hands building the internal communications bus for a multi-core system. Whether this is comparable to the amount of work needed for a SMT system is an open question, but it's definitely not negligible.
The fetch/decode hardware is a straight duplication of the existing hardware - it doesn't take up any more space than for two duplicated cores.
The architectural register file is in two independent banks; again, no more space than you'd have normally. Your physical register file is probably in the form of distributed reservation stations on the functional units; again, no more space than for duplicate cores.
You do need more bandwidth on your result busses if you want to use more functional units at once, but this holds true for an aggressively-superscalar single-thread processor too. This is a manageable problem. If necessary, you can trade off bandwidth and latency when building the thing, because your execution stream is much less sensitive to latency than it would be with a single-thread machine.
Renaming hardware is manageable. You'd need the bandwidth anyways on a wide-issue single-thread processor with the same issue rate.
Caches will have less locality, which can be partly addressed by the operating system (keeping related threads on the same die), but will still be a problem. I don't think this will kill performance. You can pull tricks like having multiple cache banks to fake multiporting, or you can pipeline the cache and run it at a higher clock speed (as you don't mind _some_ extra latency), or do a number of other things. We're reaching the point of diminishing returns with cache size anyways, so you'll probably still have enough cache to effectively handle both threads.
Re. clock speed, again, you don't mind a _moderate_ amount of extra latency, because you have enough parallelism to reschedule around it. You also can get away with a smaller instruction window, because you won't have to work as hard to find independent instructions. This saves latency in the scheduler.
In summary, while you raise legitimate concerns, I don't think that they'll be significant problems.
Re:Multiple cores on a chip. (Score:1)
I don't know exactly what you mean by "recycling external bus protocols." Certainly you'd want to design the core interface to be efficient in a single-die environment, but it seems to me that the fundamental protocols (cache coherence, etc.) are the same. But I'm not an MP expert, so you probably have more insight on this than I do.
Not true. The SMT has to worry about fetch policy. This is not a trivial problem to solve. Starvation is a real concern here. Two independent cores don't need to worry about the fetch stream of one interfering with that of the other.
Decode is not a trivial problem, either. IA32 has big problems with this. My (and others') guess is that's why a trace cache was put on the P4. It's a decode cache!
Then there are problems with a large, multi-ported L1 instruction cache. Or interleaved fetch, which gets back to the first problem.
Except for the additional routing logic and wiring.
First off, pet peeve of mine. This is not directed to you personally, but to the computer architecture community in general. It's function (or execution, etc.) unit, not functional unit. I would hope all our execution units are functional. :)
As for the physical file, distributing it implies a non-uniform register access time a la the 21264. It's not an impossible problem to handle, but there is a penalty with such a large file. Register caching can help with this and it may not be a large concern in the end. More study is needed in this area.
Certainly. This is why engineering is fun. :) The point of my post is that SMT is not a guaranteed win. It might be beneficial in some situations, but not all. I'm not sure it's justified for a POWER-class machine.
Ah, but there is no single-thread machine that has the bandwidth of the proposed SMT schemes. Building a 4-way machine is a challenge. 8-way should be much more challenging.
That's not true on server-class machines. Capacity is still a problem. This is why we see superlinear speedup on MP systems. The extra cache on each chip makes the machine as a whole run faster than you would expect, given the number of processors. SMT takes this distributed cache and puts it in one big array. This will slow it down in one way or another.
Eh? ILP has nothing to do with cycle time, save the impact on cycle time that complex O-O-O hardware can have. One cannot "get around" a slow clock through scheduling. Pipelining can be used, but that has its own costs.
Hmm...maybe. But you argue above that any extra latency in the system can be masked by the O-O-O engine. To me, this implies a larger window. Eventually one thread or another is going to get backed up waiting on memory. When that happens you have to have room available to fetch from the other threads. Has anyone done any studies of instruction queue utilization in SMT? I'd like to know how often the queue is full and how big it has to be to sustain execution. I seem to recall some of the work out of U. Wash. doing this, but I can't find the reference at the moment.
All this is not to say that SMT is worthless. Far from it. In fact, the fast thread context switching allows some super-cool techniques not previously possible. I'm trying to temper the enthusiasm for SMT a bit. Think of me as a Devil's Advocate. :)
--
Re:OK, dumb 32/64 bit question (Score:1)
So much so that Linus has shunned use of 'long long' in the kernel as much as possible. This is rather a pity. I hope that someone improves this in the future.
Tim
Re:IBM (Score:2)
The system from the benchmark report has 24 RS64-IV 64-bit processors running at 600 MHz with 96GB (yes, GB) of system DRAM. Each processor has a 128kB L1 data cache, a 128kB L1 instruction cache, and a 16MB L2 cache. The chips also support coarse-grained multithreading (simpler than, but similar to, SMT).
(600 MHz sounds slow until you realize that it uses a simple, very efficient 5-stage pipeline. Intel and others achieve high clock rates through deep pipelines and rely on branch prediction and other techniques to keep the pipe full. Branch mispredictions and cache misses can kill the actual performance of these chips on real server code.)
This system with 24 processors outperforms HP's 48-processor "SuperDome" and Sun's 64-processor UE10k (though the UE10k is an old system by now, it is the fastest server Sun is shipping).
The above system is not using the Power3 chip from the posted story. You can bet IBM will port Linux to this beast next. We won't see a 24-processor system with Linux right away, but an S80-like system would make a sweet 4-processor Linux server.
One last note: these systems are not vapor-ware. A 12-processor system with an earlier version of the same processor has been shipping since the summer of '98.
Re:Multiple cores on a chip. (Score:1)
From what I've seen, SMT is based on the idea that "We have all these execution units to supply peak demand, but they go unused a lot." You add more of everything else, and share the execution units. It's unlikely that both threads will be making peak demands at the same time, so you can probably keep them satisfied. It's a good way to get the average throughput up. This is like ethernet, where the bandwidth is shared, but not everybody uses it all at once, so it works.
I also want to reply to some of your specific points:
Decode is pretty much trivial on anything but an IA32. They have this problem because they have to run an old instruction set that wasn't designed to be easy for hardware like we have now to deal with. New designs try to be good compiler targets, and to make the CPU's job easy. Most instruction sets are like the decoded x86 instructions that are generated internally on modern IA32 processors.
Of course you'd use pipelining! Pipelining has a similar goal to SMT: keep more of the hardware busy more of the time, to increase total work/time. Since run time is the quantity of interest when measuring how fast a computer is, ILP and scheduling have everything to do with cycle time. You can trade off one to get the other, and have a CPU that gets the job done in the same amount of time.
Time = instructions * CPI * clock period
So raising the clock speed (smaller clock period) has the same effect as decreasing the average number of cycles per instruction (more ILP).
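A worked example with made-up numbers, showing two designs that take exactly the same time:

#include <stdio.h>

int main(void)
{
    double insns = 1e9;  /* same program on both designs */

    /* Design A: more ILP (lower CPI) but a slower clock. */
    double time_a = insns * 0.5 / 500e6;   /* CPI 0.5 at 500 MHz */

    /* Design B: less ILP but a faster clock. */
    double time_b = insns * 1.0 / 1000e6;  /* CPI 1.0 at 1 GHz */

    printf("A: %.3f s  B: %.3f s\n", time_a, time_b);  /* both 1.000 s */
    return 0;
}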
As for window size, remember the law of diminishing returns. Two small windows on two threads will find more instructions to run than one large window on one thread.
Cool. Just remember that we're not claiming an SMT cpu can do the work of two whole CPUs. As I said, it's an idea in the same category as pipelining. One pipelined CPU with 5 stages probably isn't as good as 5 non-pipelined CPUs (as long as there are five tasks to keep all the CPUs busy. If there's only one job to do, and the compiler didn't parallelize it, the single pipelined CPU will be faster). However, a pipelined CPU only takes a bit more silicon, and gives a big (but not quite x5) speedup.
You probably figured out some of that before I said it, but I hope it helped :)
#define X(x,y) x##y
Re:Multiple cores on a chip. (Score:1)
Yes, but I'm assuming equal (or as near equal as is practically possible) execution resources.
See, now that's where I disagree. In terms of raw transistors you are right. But you are forgetting the cost of design and slippage in time-to-market. It is less costly to design and fabricate an MP design from previously designed cores than it is to take such a core and modify it for SMT.
Realize that I am not saying this should never be done. Just that more evaluation is needed. Some of that evaluation will involve silicon, probably in the consumer market.
If you look at raw transistor count, I might agree with you. It depends greatly on the architecture and the expense of the duplicated vs. shared resources of the two designs (decode logic, etc.). SMT has a cycle time impact, and you have to balance that against the extra transistors required for a CMP.
This is an argument I've never understood. The execution units are such a tiny, tiny part of the die that I don't see much benefit in sharing them. Sharing the decode/O-O-O logic seems more beneficial, but even an SMT requires more of that (in terms of bandwidth).
Sure about that? The POWER architecture is pretty complex. Even on a MIPS-like machine, the rename and dependency logic complexity rises rapidly with fetch/issue width. As does the wakeup logic with larger instruction windows.
My point is that raising ILP with SMT can decrease the clock speed (increase the cycle time). So you get more ILP but everything runs more slowly. A CMP with an equivalent number of contexts should get the same ILP without the extra cycle time penalty (ignoring messaging and coherence overhead). SMT does not make any one thread run faster. It increases throughput, which is exactly what a CMP does.
Pipelining is not a panacea, either. There is a limit to how deep you want to make your pipe. This is one of the reasons good branch prediction is so important -- it allows a longer pipe.
Remember also that cycle time is pretty much the only thing companies have to market, everything else (i.e. number of threads) being equal. Is this unfortunate? Clearly. But it is reality and engineers need to deal with it when making design decisions. Which reminds me to plug the excellent book The Soul of a New Machine [barnesandnoble.com] by Tracy Kidder, a fascinating account of Data General's race to kill the VAX. There's a bit of discussion devoted to market times and perfect designs.
A CMP has two small(er) windows for two threads. An SMT has one big window for two threads. Two smaller windows should run faster. Whether they find more ILP is an open question. SMT does have the advantage that it can trade off window space, etc. between threads. I think this is more difficult to do than most people realize due to the challenges with fetch policy.
In any event, I am extremely curious to see what happens with the SMT chips coming out. Let's sit back and see if they make it! :)
--
Re:The real question (Score:1)
Re:Power3 or PowerPC? (Score:2)
POWER == Performance Optimization With Enhanced RISC
PowerPC == POWER for Personal Computers
The PowerPC was developed as a cut-down (32-bit instead of 64-bit and lacking a few rarely-used and complex instructions), largely binary-compatible version of the POWER.
PowerPC isn't really any particular processor, but a specification, which was first implemented as the PowerPC 601 back in 1994 (remember how it totally wiped out the Pentium-75?). Subsequently, embedded versions have been made, along with more powerful desktop versions of the PowerPC - the 603, 604, 750 (G3), 7400 (G4) and now the 7450 (G4+).
Meanwhile, the POWER has been developed as well, remaining a high-end 64-bit monster for the enterprise-level RS/6000 machines. The PowerPC 601 was based more on the POWER1 than anything else, the chip shown in the log is a POWER3, and the current hot topic is the POWER4 with all these nice new features (one or two of which have reportedly already made it into the 7450...).
The bottom line is that the POWER and the PowerPC are different but surprisingly similar beasts. They are nearly binary-compatible, which is why the kernel reports it as a PowerPC-class processor.
IBM (Score:1)
Re:IBM (Score:1)
Re:IBM (Score:1)
The real question (Score:1)
Nope. (Score:1)
Well, not exactly. Perhaps some of us remember MkLinux? Apple ported that to the older NuBus-based Macs, something nobody else seems to have done with any other OS. So IBM porting Linux to a chip they designed? Whoo hoo. Yay Linux. They could've just as easily ported NetBSD or any other operating system to it, if they had the inclination (which, obviously, they don't, given their latest Linux kick). Now, if they'd ported Mac OS X to the PPC64, that'd be something to write home about. =-)
Re:Big Time Linux: Itanium, S/390, PPC64 (Score:2)
Linux was the first OS ever to boot on Itanium. (*bsd not there).
Where are the Itanium computers? This port isn't of much use to almost anyone.
Linux was first on PPC64. (*bsd not there).
Where are the PPC64 computers?
Linux was first free OS on S/390. (*bsd not there.)
How many people own an S/390?
Linux was first on UltraSPARC.
And where is a semi-usable UltraSPARC distribution?
Heck, all of these ports require much hand-rolling. And you also mentioned hardware which the vast majority of people here have never even touched or seen; have you?
Proof-of-concept ports, and ports that aren't deployed anywhere in the real world: these aren't of much use, regardless of whether the port is of a Linux or a BSD.
Re:IBM (Score:4)
--
Re:Big Time Linux: Itanium, S/390, PPC64 (Score:1)
Also, most people cannot just buy new hardware; that is why NetBSD is ported to the VAX, so why not Linux?
Re:OK, dumb 32/64 bit question (Score:2)
It performs pretty well for a kludge but does require your application to use the MS AWE (Address Windowing Extensions) memory allocation APIs, which have some restrictions, such as only providing page-fixed memory and only allowing you to dealloc in the same unit you alloc'd (so writing dynamic memory handling is not easy).
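For the curious, here's a minimal sketch of the AWE calls involved. This is my reconstruction, not production code: every return value should be checked, and the process needs the "Lock pages in memory" privilege:

#include <windows.h>
#include <stdlib.h>

int main(void)
{
    SYSTEM_INFO si;
    GetSystemInfo(&si);

    /* Grab page-fixed physical memory (the restriction noted above). */
    ULONG_PTR npages = 1024;
    ULONG_PTR *pfns = malloc(npages * sizeof *pfns);
    AllocateUserPhysicalPages(GetCurrentProcess(), &npages, pfns);

    /* Reserve a window in the 32-bit address space... */
    void *window = VirtualAlloc(NULL, npages * si.dwPageSize,
                                MEM_RESERVE | MEM_PHYSICAL, PAGE_READWRITE);

    /* ...and map (and later remap) physical pages into it as needed. */
    MapUserPhysicalPages(window, npages, pfns);

    /* ... use the memory through 'window' ... */

    MapUserPhysicalPages(window, npages, NULL);  /* unmap */
    FreeUserPhysicalPages(GetCurrentProcess(), &npages, pfns);
    return 0;
}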
The increased address space is cool if your O/S has a good (fast, influenceable) VM manager - you can strip out buffer management code from your app (reduces complexity).
It's also great for server apps that do lots of read I/O, as you can keep buffering even at large concurrent user workloads, so I can see Real/Oracle/Akamai-type apps benefiting.
Re:IBM (Score:1)
I have a number of RS/6000s at work, running AIX currently. It's nice to know that as soon as I don't need AIX (read: as soon as I don't need V4 CATIA, or CATIA is ported to GNU(!)) I can run GNU/Linux on those bad boys!
For those of you who haven't used a late model rs/6000, I highly recommend them. Those chips are darned fast!
Re:Big Time Linux: Itanium, S/390, PPC64 (Score:4)
Itanium represents the first commodity 64-bit enterprise computing platform. A major advance if you ask me (regardless of performance), and Linux will be there first, along with SCO, with Win2k bringing up the rear.
Where are the PPC64 computers?
Ever hear of Power3 and Power4, and AIX? 'nuff said.
How many people own an S/390?
I think the count of people that use S/390 is far less important than the importance of those people. S/390 has no peer in its class as a mainframe. Sun's Starfire comes close.
And where is a semi-usable UltraSPARC distribution?
Debian has a semi-usable distribution for UltraSPARC. I believe they have XFree86 working, among other things, along with the trivial ports that just require a Linux kernel.
Proof of concept ports... these aren't of much use...
Needless to say, I disagree.
No SMP yet though (Score:1)
Of course, I assume SMP will be arriving sometime shortly.
Re:IBM (Score:1)
One of the well-known problems with TPC-C is that it uses a hierarchical system where all the data is part of a particular warehouse and only a small percentage of transactions need to access any cross-warehouse data. The little inter-warehouse communication required is evenly distributed, so load imbalance is not a problem.
For this reason, the TPC-C benchmark can run more efficiently on a cluster of servers than many real-world OLTP setups. From what I've been told, real clustered DBMS setups suffer from load imbalance and are even more difficult to set up and tune than 'single instance' systems. The majority of OLTP systems in the field don't use clusters for performance.
That said, one of the best uses of small clusters is to provide fault tolerance and high availability of data, but that is a different setup than the 'clustered' TPC-C results you mentioned.
Re:Big Time Linux: Itanium, S/390, PPC64 (Score:1)
-lx
Re:Big Time Linux: Itanium, S/390, PPC64 (Score:1)
I'm a long time BSD user, who's also tried the top 10 Linux distros at various points. The only ones I even came close to liking were Slackware and SuSE. SuSE is very commercialized, and it's hard to get ISOs for, so I don't mess with it much.
But I do know that the various times I've played with Linux over the years, it's proven to be quite a bit less stable (the first time I installed Red Hat, it locked up after the first reboot; Red Hat, Caldera, Debian, and Corel all failed to install and boot on a rather flaky P166, whereas FreeBSD did flawlessly), far behind in terms of package management, i.e., RPM (I know about apt, and I think it's cool, but that's one Linux distro, and one I don't dig on much), and fragmented (even though this is what people say about the BSDs, the different distros are very dissimilar, and are quite large in number).
These are the things that have made me stick with BSD over the years. I get my OS from a central location, worked on by people in an actual team environment with democracy and accountability, released under a license which is truly free, and easy to get ahold of and install (many Linuxes didn't have install-over-FTP for years; some still don't). They have great package management, great performance, great support, and great stability. I'm not saying there's no place for Linux, but given the reasons that I've just mentioned, why would I want to use it?
Anyhow, that's all just my opinion, and you expressed yours. What I *do* take issue with is the "never caught up" bit. In what way is any given BSD distribution not equal or superior to Linux?
-lx
Re:Dumb answer (Score:1)
Re: That's why you need a damn good compiler (Score:1)
This sounds a bit like the Apache patches SGI did for the Accelerated Apache project. The patches were submitted but aren't (apparently) going into Apache.
My concern is that as more and more code is given to us, will we refuse it because we didn't write it? And would that be a bad thing?
Re:Dumb answer (Score:1)
Re:Power3 or PowerPC? (Score:1)
I know that PowerPC is a subset architecture of the Power (not Power2, Power3, or Power4).
I'm not sure if current PowerPC processors have binary compatibility with current Power processors. Although the beginning of PowerPC is similar to the Power, I don't think the PowerPC architecture is identical to the Power architecture.
You can't say that the Z80 is an x86. (The Z80 is said to have binary compatibility with the 8080.)
Does AIX for RS/6000 with PowerPC processors run on an RS/6000 with Power2 or Power3 without modification? I'm not sure of it.
Has anyone tried it?
Re:OK, dumb 32/64 bit question (Score:1)
Re:Linux is dying (Score:1)
Re:IBM Gaining Marketability in Mainframe Industry (Score:1)
Re:PowerPC - does anyone care? (Score:1)
Wonderful (Score:1)
Re:Big Time Linux: Itanium, S/390, PPC64 (Score:1)
1) Q: Where are the Itanium computers?
A: Still vapour.
2) Q: Where are the PPC64 computers?
A: Any late-model RS/6000. I have quite a few at work; don't you? *grin*
3) Q: How many people own an S/390?
A: A great many large corporations. Duh. What could anyone do with one in their living room? For that matter, how many people owned VAXen when they were new? Again, corporate hardware.
4) Q: And where is a semi-usable UltraSPARC distribution?
A: Try SuSE; you'll be glad you did.
Re:Big Time Linux: Itanium, S/390, PPC64 (Score:1)
Don't forget Mandrake. They also have a version for SPARC: ftp://fr.rpmfind.net/linux/Mandrake-iso/sparc/ [rpmfind.net]
It is beta, though. I must confess that I have never used it, but I would think it's at least semi-usable.
use LaTeX? want an online reference manager that
Re:OK, dumb 32/64 bit question (Score:5)
Re:IBM (Score:2)
IBM haven't ported Lunix (A UNIX implementation for the Commodore 64/128) to the 64-bit PPC platform. They ported Linux. Get It Right.
Sorry... someone had to do it though
Power3/4 and OS/400 (Score:1)
Re:IBM (Score:1)
i was referencing some Jeff K stuff.
i suppose the moderators are gone by now, here's the link ;)
USAR FREINDLEY [efront.com]
and this strip in particular [efront.com]
Re:Dumb answer (Score:1)
I'm sorry to correct you, but the 68000 series of microprocessors does not have a combined address/data bus. The 68008 had a separate address bus of 20 or 22 pins, so it could directly address 1 or 4 MB of memory (depending on package).
Re:its pretty bad when the editors troll... (Score:1)
First,
Second,
Third, for anyone coming from outside, the flame wars help identify the major players, and occasionally even yield some useful information.
There seems to be some kind of natural progression from Windoze to Linux to BSD, with a number of people running a mix. My idea of "World Domination" is half the desktops running OpenBSD, but then I've got a warped sense of humor.
Execution units rapidly reach diminishing returns. (Score:5)
Eight execution units! I recall that the x86 line has half of that. And 6.4GB/s memory is not to be laughed at either!
Memory bandwidth is a good thing. Low latency cache hits are great thing, if you can get them (no idea if PPC does this or not).
However, adding more execution units won't buy you much beyond a fairly small number. The reason: you just don't have that much extractable parallelism in the serial instruction stream.
I had the good fortune to be playing with this recently via simulation. If you give the processor a *huge* instruction window (256 instructions) and the ability to execute *any* number of instructions of *any* type in parallel (except for memory accesses - see below), you still get an average Instructions Per Clock of about 2.1-2.2. 95% of the time, you're getting four instructions or fewer issued (and most of the time, far fewer than that).
When SMT is put in silicon, wider issue will become practical (due to increased parallelism in the instruction stream), but as it is, you're better off spending the silicon on other improvements.
Re. memory accesses; the reason why it's extremely difficult to do memory accesses out-of-order with each other is that you have to check to see if any given two memory accesses refer to the same location (indicating a dependence). You often don't know what the target address is until late in the pipeline, and you'll still need to do a TLB translation to get the physical address, and compare two large bit vectors (the addresses).
Remember, to be useful for scheduling, you have to be able to do all of this very quickly and very early in the pipeline.
All of this makes out-of-order memory accesses very difficult to implement theoretically, and a nightmare to implement in real silicon. It's still sometimes done in a limited manner, but this doesn't affect the IPC very much.
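A two-line illustration in C of the aliasing problem: unless the hardware can prove that a and b never hold the same address, the load cannot safely be hoisted above the store.

int cant_reorder(int *a, int *b)
{
    *a = 42;      /* store; target address known only at run time */
    return *b;    /* load; may or may not alias *a */
}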
IBM using Macs? (Score:1)
Re:List of CPU architectures supported by Linux? (Score:1)
Wow, you're right. All this time I have been interfacing with my kernel by hand. You OS bigots were right all along! What I need is an "operating system". I feel so bad that I have wasted all this time using a kernel when I could have been using a "cool" operating system. Sign me up!
Re:IBM (Score:1)
Re:Big Time Linux: Itanium, S/390, PPC64 (Score:2)
Itanium: Linux has an Itanium emulator written specifically for it, by Intel, I believe. That makes it somewhat easier. Besides that, BSD does boot on the Itanium, even though they were severely impeded by lack of tools.
PPC64: It was ported by a corporation, fuckwit. A corporation with more resources than a non-profit organization could ever put towards porting to a platform, porting Linux to run on their own hardware, whereas NetBSD is an independent effort. They can't just run out and get a PPC64 box for themselves.
S/390: Same story.
UltraSPARC: Both run on UltraSparc, but I don't know dates of when they first booted, or the extent of Linux/Sparc support. This might actually be...a *relevant point*!
And then you call this stuff "mainstream, state of the art hardware". For all but the UltraSPARC, it's impossible for a normal person to even lay hands on one of those machines. Even in the case of corporations, how many do you know that are running Linux on IBM boxes instead of AIX? Why the hell would anyone want to, seeing as how AIX generally outperforms it anyhow?
In any event, how about high-end hardware that people can actually buy? NetBSD was the first to be running on the Alpha, for instance, a high performance platform that actually matters. First on SGI boxes. How about i386, the architecture everyone uses? In the early 90s, NetBSD was far more complete and usable than Linux, and to this day has very complete hardware support for the platform. One could also point out that Linux has been lagging behind on new technologies, like IPv6. Might want to take that into account when you're tallying up the final "Score".
Re:Linux VAX at Sourceforge (Score:1)
Re:IBM (Score:1)
Re:Power3 or PowerPC? (Score:1)
Re:List of CPU architectures supported by Linux? (Score:2)
Of course, this is not all of them; S/390 is even missing.
And uClinux runs on architectures like the DragonBall, and other things too. I don't know of a complete list anywhere.
Re:List of CPU architectures supported by Linux? (Score:2)
Re:Dumb answer (Score:2)
Ouch!
Re:Dumb answer (Score:2)
Re:Execution units rapidly reach diminishing retur (Score:2)
Re:Execution units rapidly reach diminishing retur (Score:2)
Yes. You average 2.1-2.2, and 95% of the time you're only getting 4 or fewer. However, when you look at the other stuff the Power3 architecture includes, it's pretty obvious what the overall intent is:
Hardware Loop Unrolling.
IBM has got some customers that use some serious CPU. We're talking national labs and the like. For them, the ability to run 8 of those neat 'multiply-and-add' instructions per clock cycle is quite an important feature.
The chip *starts* at 375MHz, and can do 16 floating point ops/clock (an amazing amount of code uses that multiply-and-add over an array - and the IBM compilers are smart enough to detect divides and convert them to multiply-by-inverse and add/subtract issues).
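The kind of inner loop in question is a multiply-and-add over arrays (DAXPY being the classic example). Each iteration maps onto one fused multiply-add, and the compiler can unroll it to keep several FP units busy per clock:

void daxpy(int n, double a, const double *x, double *y)
{
    int i;
    for (i = 0; i < n; i++)
        y[i] = y[i] + a * x[i];   /* one multiply-and-add per element */
}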
And of course, IBM is hoping that even though the big SP/2 iron is limited to national labs and Fortune-500 companies (see The Top500 List [top500.org] for details), they'll be able to sell a lot of the smaller 43P deskside boxes (1-4 Power3 CPUs) and the 8-16 CPU rackmount servers to all the smaller companies that need number-crunching.
Re:OK, dumb 32/64 bit question (Score:2)
...and a 64-bit integer can be manipulated on a 32-bit machine - and even fairly conveniently, if the compiler cooperates, and GCC does cooperate here (think long long int).
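For example, the following compiles and runs fine with GCC on a 32-bit target; the compiler quietly emits an add/add-with-carry pair (or a helper call) for the 64-bit arithmetic:

#include <stdio.h>

int main(void)
{
    long long a = 1LL << 32;   /* 2^32 doesn't fit in 32 bits */
    long long b = a + a;
    printf("%lld\n", b);       /* prints 8589934592 */
    return 0;
}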
List of CPU architectures supported by Linux? (Score:3)
My question is: is there a page kept updated with such info? I don't believe that Linus Torvalds maintains all the different architecture branches...
Thanks!
r. ghaffari
(25/M/Baltimore, MD)
Power3 (and Power4) have some really cool features. (Score:5)
Taken from:
http://www.rs6000.ibm.com/resource/pressreleases/
"Unlike a typical PC microprocessor, the chip features eight execution units fed by a 6.4 gigabyte-per-second memory subsystem, allowing the POWER3 to outperform competitors' processors running at two to three times the clock speed"
Eight execution units! I recall that the x86 line has half of that. And 6.4GB/s memory is not to be laughed at either!
Re:IBM (Score:2)
So, you can see this as yes, IBM is scratching an itch, but at the same time making Linux more available in the high-end enterprise environment.
really useful ports (Score:2)
These are platforms I'd like to see some porting being done for:
Other people's brains (remote access, perhaps X-10 integration)
Garage door opener (so I can apply sound themes to it, and replace the "so 1991" screeching)
Alarm clock (see garage door opener)
Pets (obedience school in C, Perl, or any language you want).
Re:IBM (Score:2)
The PowerPC running in 64-bit mode will help them get Linux up and running on the eServer iSeries (that platform has been 64-bit for longer than just about any other major server). It allows them to fulfill their goal of getting Linux running across all the eServers.
Re: That's why you need a damn good compiler (Score:2)
In a word, yes. If the "Not Invented Here" syndrome becomes rampant, then large corporations will have less incentive to build improvements upon the system, and therefore, will start again with either a fork or a potentially closed-source proprietary system. Either way, support for the original open-source system will wither away, reducing the potential for corporate uptake.
However, this is just a generalisation; I cannot comment specifically upon Apache itself.
--
Re:Dumb answer (Score:2)
The 64-bit PowerPC architecture antedated the RISC AS/400's, as far as I know - as I remember, I saw a PowerPC architecture manual describing 64-bit mode before the RISC AS/400's came out (it was some time in 1994, I think, when I saw it).
The PowerPC 620 was supposed to be the first 64-bit PowerPC; I don't know whether any machines shipped with it. IBM now have 64-bit PowerPC's in both the AS/400 and RS/6000 machines (I think some of the RS/6000's use the same chip as some of the AS/400's, with the tag bits and other AS/400 extensions disabled in the RS/6000's).
Re:its pretty bad when the editors troll... (Score:2)
http://slashdot.org/article.pl?sid=01/02/21/14502
"it looks like NetBSD could give Linux a run for its money in the handheld arena."
http://slashdot.org/bsd/01/02/05/1859221.shtml [slashdot.org]
" 'Linux 2.4.0 is available for no money. So is FreeBSD. Linux uses advanced hardware, so does FreeBSD. FreeBSD is more stable and faster than Linux, in my opinion. "
Basically, the precedent is that it is acceptable to be inflammatory as long as you aren't Linux. A majority of the articles comparing BSD and Linux do so on a well-known point: stress under high loads. Notice that Slashdot does not post articles comparing native application support, user base, or multi-processor support. Posting of such articles or comments will likely be considered inflammatory.
In posting this, in no way am I trying to start a flamewar. However, I do feel Slashdot holds a double standard in how it treats BSD remarks, especially on the front page. Being immature and biased is useless, regardless of OS choice. Thoughts?
Political considerations (Score:2)
Far better for them to put the work into ensuring a stable port to their new chip. Now all they need to do is to wait for Intel to put out a sickly version of the Itanium (like they did with the first release of the P4).
--
Re:List of CPU architectures supported by Linux? (Score:2)
Re:IBM (Score:4)
This is probably IBM anticipating. After all, just because there's no demand now doesn't mean that there won't be demand when the system is available. Getting the system ready ahead of demand is smart; it means that when people running PPC want more horsepower, IBM will be able to provide them with a nice smooth path to 64 bit PPC. This looks like it's just a regular part of IBM's Linux strategy. They want to make it available everywhere, so companies can upgrade to more and more powerful systems without having to relearn everything.
Re:IBM (Score:2)
Something that I can see this having a use for is boxes that are going to soon (very probably) reach an 'end of life' in the IBM OS camp. True, there is Project Monterey, but some of these 64-bit machines could definitely end up lying in the dust, especially with a whole new OS and designers that may decide the earlier systems are too 'hard' to support easily.
I'd much rather see IBM make sure that something decent still runs on older boxes, than having no shipping OS at all that will support such platforms.
And I'm not saying IBM will drop support for these platforms anytime soon, but it's much easier to get the thing started now, than later, when it could be too little, too late.
HP's PA-RISC machines are a prime example. Linux runs on these, as various numbers of the machines fell into the hands of Linux developers with experience in kernel code/porting. Unfortunately the machines never really made it, but at least the people out there have something that runs and is probably better supported than what you would otherwise get. It was the right thing, just too late in the game. At least IBM aren't making the mistake of leaving people lingering with unusable hardware.
Also remember that IBM has a considerably large second-hand group that reconditions trade-in systems and then sells them to more disadvantaged groups or countries - and if they don't have an OS to ship on those machines, what do they do with them?
Of course, the contributions they get from some of the truly bright sparks in the community with the Linux port may actually improve the way their own code monkeys write/implement their next OS kernel, which they can only view as a win-win situation.
One size doesn't fit all. (Score:2)
Don't fall into the arrogant assumption of thinking one architecture is enough for everything. You wouldn't want just VGA graphics now would you?
The 370 architecture is alive and quite well, thank you, and processing payroll, accounting and other mundane crap that you can't live without.
But you wouldn't want to have to write a game for a 3279 terminal now, would you? No more than you'd want to bank with somebody who'd balance your accounts on a PS2.
The PPC architecture is alive and well and the G4 is very useful for some types of processing and totally useless for other things but what it does, it does damn fast.
The x86 is as much of a dead-end as the z80. It will be utterly swamped by the requirements of voice processing and image recognition that a wired economy needs. Forget passwords. Just say your name and smile for the cam. (And that's only the first app. The one at the gate, so to speak. )
Re: That's why you need a damn good compiler (Score:2)
That's exactly why IA-64 is going to kick the crap out of other architectures 3 years down the line (once the compilers actually get good). And it's also why RedHat is screwing over their users by sticking with gcc over SGI's (GPL'ed) IA-64 compiler. As the first author said, conventional compilers get only 2.x IPC at best, so two-thirds of the Itanium execution units are wasted when you use a compiler like gcc. SGI's, on the other hand, was redesigned from the ground up (starting with the gcc parser for compatibility) to use all of the neat, theoretical tricks that you need to get ILP in this situation. TurboLinux has already gone with it and demonstrated good results (that's one reason why NCSA will be using Turbo for the second stage of their huge, new cluster). But gcc is Cygnus' baby, and they will fight to keep using it, no matter how badly it hurts performance in the end.
--JRZ
Actually, SMT is much better. (Score:2)
You seem to have an incomplete picture of what SMT is.
SMT - Simultaneous Multi-Threading - is simply the ability to have multiple threads running on a chip at the same time, with separate fetch units and register files but with the instruction window and the functional units still shared.
The threads don't even have to be from the same program, or in the same address space (though it'll reduce TLB and cache load if they are).
No extra effort is needed on the part of the programmer, and you get N times as much instruction level parallelism with N threads as you would for one thread. In one instruction stream, you'll always have dependencies that can't be avoided - true dependencies. Parallel threads don't have any shared dependencies for register operations, and are much less likely to have dependencies for memory operations (under most conditions).
A compiler, on the other hand, has to be made extremely complex to extract much more parallelism than is currently extracted, and still won't be able to capture a lot of it. I know this far too well, having seen the guts of compilers on a few occasions. You'll also get no benefit for legacy code or for code that was compiled with a mediocre compiler (as almost all code is, to Intel's continuing dismay).
SMT is especially nice because there's almost no extra hardware overhead for implementing SMT. It's a winning strategy from all angles.
Re:OK, dumb 32/64 bit question (Score:2)
+++