

Kernel Benchmarks 136
kitplane01 writes: "Three students and a professor from Northern Michigan University spent the semester benchmarking a bunch of Linux kernels, from 2.0.1 to 2.4.0. Many functions improved in speed, but some did not. Total lines of code have tripled, and are on an exponential growth curve. Read their results here."
Re:silly graphs (Score:1)
It's a good thing Moore didn't have a pompous ass like you as an instructor, or he might have been too traumatized to make the observation that processing capacity doubles every 18 months.
`strings' strikes again (Score:1)
$ strings total_growth.gif | head -2
GIF89a
Software: Microsoft Office
Nice graphwork (Score:3)
This shows why computer guys are not scientists. My first year phys chem prof would tear his own arm off and beat you to death with it if you gave him a graph that looked that ugly.
The Excel defaults may be ugly, but you can change them.
GCC optimizations and benchmarking (Score:5)
Thus "0" indicates byte alignment, "1" word (16 bit) alignment, "2" doubleword (32 bit), "3" quadword (64 bit), and "4" paragraph (128 bit). The other optimization of interest is the "-O" setting. Here arguments can take the value of 0, 1, 2, or higher. Personally, I found that -O2 was not necessarily the best setting, although it seems very common to find it set to that in Makefiles. I found using -O1 and tuning the alignment optimizations by hand provided better results.
My findings from benchmarking all the combinations of settings were that for a Cyrix 5x86, optimal alignment values were numerically lower than might be expected. For example, the settings that came out close to optimal, as I recall, were:
It wouldn't be a bad starting point for any Intel processor. On modern processors it is more important to achieve high cache hit rates, which is thwarted by certain wrong optimizations such as aggressive loop unrolling and excessive alignment. One particular setting to avoid on anything other than an actual 486 is -m486, because the 486's alignment requirements tend to over-align for both its predecessors and its descendants. And if you don't need a debugging version of your code, -fomit-frame-pointer is almost always useful, as it frees up an extra general-purpose register.
Yeah, but... (Score:1)
I can DEFINITELY tell the difference between 2.2.x and 2.4.x -- 2.4 beats the hell out of 2.2.
- A.P.
--
Forget Napster. Why not really break the law?
Some analysis comments (Score:2)
They note that this was all run on the same hardware, but all that means is that the results are valid *for* that hardware. Some of the drastic changes in some areas might be due to, for example, the replacement of a generic driver with a specific driver optimized for one of the pieces of hardware they used. Obviously this change wouldn't carry over to all other systems.
All in all not bad, though. It would've been nice to see some more rigorous data analysis (the data analysis expected in a typical college freshman chem lab class is more extensive than this).
Re:Signal handling - so what? (Score:2)
--
Mike Mangino
Sr. Software Engineer, SubmitOrder.com
the lines they should have counted... (Score:1)
when it really should have just been something like:
(root@mustard)-(/dev/tty0)-(/usr/src/linux)
(Wed May 9)-(05:53pm) 19 # find . -name '*.[ch]' -exec egrep "< some terrible curse words >" {} \; -print | wc -l
yeah, that would'a worked.
Benchmarks are fun. (Score:1)
Too bad they measure little detailed things like lines of code and stat() latency rather than how much RAM/CPU your dynamic web server needs to saturate a T1.
Still educational for the non-kernel-hacker in any case.
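For what it's worth, that kind of capacity number isn't hard to collect yourself; a rough sketch with ApacheBench against a test page (the hostname and URL below are placeholders), watching vmstat on the server while it runs:
#!/bin/sh
# Hammer a dynamic page with 50 concurrent clients; run `vmstat 1` on the
# server at the same time to see how much CPU and RAM the load actually eats.
ab -n 10000 -c 50 http://webserver.example.com/cgi-bin/test.cgi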
Re:Signal handling - so what? (Score:1)
--
Re:How to lie with statistics (Score:1)
Why not? Just because you love Linux doesn't mean you don't use anything else. Heck, it doesn't mean you don't love anything else. I'm in a polyamorous relationship with both Linux and NetBSD... :)
--
What about older kernels? (Score:1)
I heard from some people who were using 1.2.something in an embedded project that its context switch times were quite a bit better than the latest.
Anyone out there know how the older kernels stack up?
Re:I learned something (Score:1)
Why can't you admit that it's boring up there! Come on, you know it is! All they talk about is how many feet of snow will be left on the ground when June comes around.
Re:I learned something (Score:1)
Re:Yeah, but... (Score:1)
I learned something (Score:2)
Re:Devices (Score:2)
Re:I learned something (Score:2)
Except YOU maybe. heh heh.
Re:Yeah, but... (Score:4)
Boosted performance a good idea (Score:1)
A lot of work went into making the UNIX scheduler automagically give programs that are currently interacting with the user a higher priority. Reducing the latency of an interactive program makes the system seem very snappy and makes users happier. A slow-boat-to-China job doesn't need a high priority because it's gonna take forever anyway. Letting an interactive job run before it won't hold it up long, especially since most interactive jobs do only one or two timeslices of work before sleeping on the keyboard again.
In a single-user environment this can be done well with the focus boosting MS uses. There is a problem, however, with MS's implementation. The UNIX priority system was designed to make interactive jobs responsive without starving CPU-intensive jobs; MS's doesn't do this. Focus boosting is a good idea, but MS's priority scheme is hostile to low-priority jobs. UNIX doesn't have focus boosting, since a UNIX box is usually multi-user/remote-user, so identifying the right process to boost is more difficult.
Interesting note: in Win2K, if you set a CPU-intensive job to a high enough priority on a single-CPU system, it will use 100% of that processor's time without letting ANYTHING else run. Talk about starving low-priority jobs.
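The UNIX-side knob being described above is just plain old nice(1)/renice(8); a minimal sketch of keeping a long batch job out of the interactive jobs' way (the program name here is made up):
#!/bin/sh
# Start a CPU hog at the lowest priority so the scheduler hands it only the
# cycles nothing interactive wants.
nice -n 19 ./number_cruncher < input.dat > output.dat &

# Or demote it after the fact (needs permission to renice the process).
renice 19 -p $(pidof number_cruncher)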
Re:GCC optimizations and benchmarking (Score:5)
How to lie with statistics (Score:2)
One can get a graph of any shape he wants by carefully choosing the axes.
features and drivers (Score:2)
I quote" Hardware compatibility is a large part of the growth."
I don't want a lot, I just want it all!
Flame away, I have a hose!
Re:I am Michael's raging bile duct (Score:1)
I find it interesting that both Michael and Jamie McCarthy post stories on /. - you would think Michigan wouldn't be big enough for the both of them :)
Caution: contents may be quarrelsome and meticulous!
Re:silly graphs (Score:1)
I don't dislike what they did, I dislike their presentation. They did a reasonably good job of data collection (not exceptional, but okay). FYI, I am 25, am a PhD candidate in chemistry at a Big Ten university, and have been teaching for 6 years.
Re:silly graphs (Score:1)
What community college do you teach at, stupid arrogant cock-sucker?
Very eloquent. I teach at the University of Minnesota, Twin Cities.
Re:silly graphs (Score:1)
It's a good thing Moore didn't have a pompous ass like you as an instructor, or he might have been too traumatized to make the observation that processing capacity doubles every 18 months.
If Moore's observation is correct (which most people seem to think has been shown by the history of the industry), there are ways to "prove" it, like trying different fitting functions and looking at their errors and/or correlation coefficients. In the case of Moore's law, one would find that an exponential is the best-fitting function.
For most of the graphs in the article, a linear or simple polynomial (e.g., quadratic) fit would appear to give better or comparable results for the presented data. It seems they chose exponential because it is more impressive to say "this is growing exponentially!" than to say "we fit the growth to a quadratic with coefficients blah-blah-blah."
Lies, damn lies, and statistics.
silly graphs (Score:4)
Silly graphs are a pet peeve of mine. I hate it when my students give me graphs like these: needless gridlines, unlabeled legends, connected dots, and poor statistical analysis.
I also find it ironic that they used MS Excel (which they don't say they did, but it sure looks like it)...
Re:Page fault latency: in all of 2.2, or fixed? (Score:1)
release - without explaining where it actually came about.
That's because the stable kernels are mostly on a security-maintenance and driver-update path; they don't tinker with the scheduling, memory, signal, and disk I/O routines.
Plotting development kernels is actually more relevant.
Re:Another study: (Score:1)
In short, supported hardware grows exponentially.
Notice that if the rate of hardware driver development grew linearly, the cumulative number of drivers would be quadratic. Since the rate of adding hardware drivers is probably a little faster than linear, the curve comes out somewhere between quadratic and exponential.
This is far from being a sign of bloat disease. This is actually quite healthy growth.
Wah. (Score:1)
Sure, I only have a 400MHz K6-III vs. their 850 MHz Pentium III, but it's not like Linux does everything twice as fast; it's much worse than that.
This benchmark was not that useful (Score:4)
Three students and a professor from Northern Michigan University spent the semester benchmarking a bunch of Linux kernels
Second, no data were presented on the main areas of the kernel that were improved. How is SMP performance in kernel space? Did the finer grained locks help? How is the performance from the threaded IP stack? Does it prevent IO blocking?
THAT kind of information would have been interesting. They tested only things that the kernel has done forever.
lmbench for BSD or Windows? (Score:1)
For those who are interested, here [bitmover.com] is the LMbench home page.
answered my own question (Score:1)
I would be curious to see your test results. (Score:1)
I'm especially interested in FreeBSD.
thanks,
chris
Re:Quite limited really (Score:1)
Oops. Make the obvious correction.
--
Re:Quite limited really (Score:3)
I use a "real world" benchmark (which of course might be completely irrelevant to you, however relevant it happens to be to me).
Here are some recent observations regarding this specific benchmark, ranked in order of effect:
If performance tuning is your forte, then clearly you've got your work cut out for you.
--
Re:Yeah, but... (Score:3)
Re:Yeah, but... (Score:4)
Re:Yeah, but... (Score:4)
For this application the 2.4 kernel kicks butt up and down the street all day. YMMV.
Another study: (Score:3)
http://plg.uwaterloo.ca/~migod/papers/icsm00.pd
Edward Tufte (Score:1)
oh no, not the bar graph! (Score:1)
Re:We'll beat Microsoft yet! (Score:2)
(what I'd REALLY want on a Windows system is an X server and Cygwin, but for the sake of argument, I'll leave that stuff out)
I'm guessing we'd be approaching some huge numbers on both sides, and all I can really speculate is that I think Windows would have more overlapping functionality in its apps, but I can't say as for lines of code.
Anyway, lines of code is not directly a measure of bloat. In my mind, bloat is lines of code divided by (functionality times stability times performance), but I realize that not everyone shares my view on that.
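The numerator of that formula, at least, is cheap to get for any tree you have the source to; a rough sketch (GNU find plus wc, counting .c and .h files only, with no attempt to skip comments or blank lines):
#!/bin/sh
# Crude lines-of-code count for a kernel tree (or any C project).
cd /usr/src/linux || exit 1
find . -name '*.[ch]' -print | xargs cat | wc -l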
Yeah.
-ben.c
Re:We'll beat Microsoft yet! (Score:2)
Re:Aaaah! Exponential! (Score:1)
--
Re:GCC optimizations and benchmarking (Score:1)
Can someone point me to some links on GCC optimization?
Re:This is off topic as hell but.... (Score:1)
Re:This is off topic as hell but.... (Score:2)
Re:Devices (Score:3)
Yeah, right. The problem with this approach is that it leads to unnecessarily narrow definitions of functionality, and can prevent hardware manufacturers from doing things cheaper. Not only that, but the examples you chose are kind of screwy. "Current modems" without a qualifier implies the N+1 varieties of WinModems out there, which all do things differently. Many old sound cards did things their own way and had a small DOS TSR that provided SB compatibility in software. The floppy, IDE, and ATAPI command sets, as well as the RS232 serial-port standards, are published and standardized, but these are properly communications protocols between devices, not the devices themselves. The PCI and ISA busses are, again, more like protocols to allow devices to communicate than devices themselves. I don't see too many non-PCI, non-ISA devices that plug into the insides of an x86.
Non-x86 hardware platforms have it easier; one vendor like Apple/Sun/IBM says, "This is the list of hardware that works on our platform," and you use it. The multitude of hardware vendors for x86 boards and devices has led to a large amount of conflicting standards and weird, proprietary hardware. (If a vendor can save $0.10 per unit on a device by leaving out hardware functions which can be replicated by a kludged binary driver, they will. Think WinModems.) This approach has also made x86 hardware cheaper than the alternatives.
Simply put, things will change, and change quickly, in hardware. Standards are a good idea, but they quickly become lowest common denominators; think "VGA".
Re:Quite limited really (Score:2)
------
Re:We'll beat Microsoft yet! (Score:2)
------
Re:How to lie with statistics (Score:2)
What is the 1.4G that you are referring to? 1.4GB HDD? 1.4Gbps Ethernet? 1.4GB RAM? 1.4GHz? $1400?
Nothing's more confusing to the non-computer-"literate" people than having people like us talking ambiguously.
------
Re:Quite limited really (Score:2)
------
Re:Quite limited really (Score:3)
RAM, base mem & extended mem. (Score:2)
Re:GCC optimizations and benchmarking (Score:2)
----
Re:More lines of code...so what? (Score:1)
I can't say I find these benchmarks very credible. Unfortunately, people will see these "benchmarks" from a college professor and instantly think this is some seriously authoritative info on the comparative performance of various Linux kernels. Bleh.
If they are so authoritative on OS design and on performance bottlenecks at such fine-grained levels of OS mechanics, perhaps they should put their 4.5 years into improving Linux to where they think it should be, performance-wise.
But alas, they wait and wait for the next kernel release, run some non-real-world benchmarks, and then try to ponder some conclusion from their numbers. Four and a half years and this is all they could come up with?
Don't get me wrong, I think these types of profiling benchmarks have their place, but they should usually be used in the pursuit of finding the culprit behind performance degradation seen in real-world benchmarks, with a view to actually fixing these smallest yet most significant of bottlenecks.
Re:long term goals? (Score:1)
Huh? First off, a good printer will interpret some common printing language like PostScript or PCL, rendering differences between various printer hardware irrelevant for anything beyond plain text. So really, for these printers, the version of the kernel, or even what OS is running the spooler, is never going to be an issue as long as the printing app speaks PS or PCL.
In the case of crap printers that can't even print plain text without having the CPU tell them when to move the print head and when to splatter ink from which holes, the kernel version or OS can still be irrelevant, because all of this can be done well outside the kernel. A filter program that accepts PostScript and converts it into signals the printer can accept doesn't have to be reliant on a particular kernel version.
Ghostscript compiles on practically any Unix, MS-DOS, Win9x, WinNT, Win2k, OS/2, VMS... kernel shmernel.
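To make that concrete, a filter of the kind described above can be little more than a wrapper around gs; a minimal sketch, where ljet4 is just an example device and your printer's name will differ:
#!/bin/sh
# Bare-bones print filter: PostScript in on stdin, LaserJet-style output on
# stdout. Point your spooler's filter entry at this script.
exec gs -q -dSAFER -dNOPAUSE -dBATCH -sDEVICE=ljet4 -sOutputFile=- -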
Even if this were something kernel-specific, the OEM could simply release a kernel driver for one version of Linux as source code, and then someone (or several someones) would most likely build it into something much better, faster, and more stable on kernels up to the current ones. Witness the history of the SBLive drivers: they started out from Creative quite closed, buggy, and featureless; Creative released the code under pressure, and now the SBLive is one of the best-supported sound cards in the latest Linux kernels.
Re:We'll beat Microsoft yet! (Score:3)
I've always wondered why people say that. I can make several valid comparisons between apples and oranges:
________________________
Re:answered my own question (Score:1)
Re:This benchmark was not that useful (Score:3)
And that is exactly the point that Linux is often criticized for, compared to competitors (Solaris, FreeBSD): it may perform well under no- or light-load conditions, but it doesn't scale well. It would have been interesting to check whether this criticism is still valid for the 2.4 kernels.
Pretty sloppy presentation. (Score:3)
I'm not impressed.
Re:Aaaah! Exponential! (Score:1)
Three Students and a Professor (Score:2)
gotta love the mis-wordings (Score:1)
"As mentioned in our methodology section, this is due to a bug in the kernel code that lead to a feature freeze in subsequent kernels."
if a bug in the kernel code can cause a feature freeze, someone better debug the developers!
jon
Re:Signal handling - so what? (Score:1)
Does your brain have a serious design flaw?
Re:long term goals? (Score:1)
Re:We'll beat Microsoft yet! (Score:1)
Re:MS Graph? (Score:1)
Re:Yeah, but... (Score:1)
Re:We'll beat Microsoft yet! (Score:2)
Interesting... I think I do!
But there are still factors to consider. I think we at least need to multiply by the spaghetti ratio, but other factors, such as the usefulness index, the design-cleanliness coefficient, and the ugly-hack quotient, need to be taken into account. :-)
Oh well.
Re:We'll beat Microsoft yet! (Score:5)
Depending on point of view, that has already happened long ago...
To make the comparison meaningful, you have to compare systems of somewhat equal capacity. The Linux kernel by itself is in no way comparable to Windows 2000.
In addition we need various file utilities, an accelerated X11 server (with Mesa/OpenGL, the video extension, and antialiasing), one of GNOME/KDE (file manager, basic desktop utilities, a simple text editor, something akin to COM (which would be Bonobo or KParts)), a working web browser (Mozilla or Konqueror), some user-friendly utilities to replace the Control Panel, a user-friendly email client and newsreader, a simple web server, basic networking utilities (Samba with a user-friendly network-neighborhood browser, telnet, ftp, ping, ...), a good media player (capable of playing at least wav, mp3, CDs, mpeg, avi, mov, and preferably asf and wmf), minicom, a PPP dialer, and probably quite a few other goodies I've forgotten to mention.
If we put all this into a Linux distribution, I doubt we would do much better than W2k. But to make things even worse, that wouldn't make much of a Linux system. Most Linux users wouldn't be too happy without emacs, gcc and friends, perl, python, tcl/tk, and most of the common command-line utilities (sed, awk, find, etc.), and probably also apache, MySQL or PostgreSQL, gimp, and so on.
Line count? Well, guess what... Linux has become bloatware... even more than what's produced in Redmond!
Re:Devices (Score:1)
Re:Devices (Score:1)
Re:You contradicted yourself there pal.. (Score:1)
Re:Devices (Score:1)
Besides... Apple only seems like it qualifies because there isn't much different hardware for it. It's not that all video cards use one driver, it's that there are only 2 video cards (exaggeration, I know). If Apple got popular, they would be in the same boat. At least if x86 set the precedent, other platforms could take over and not run into the same problems later.
Re:Devices (Score:2)
The most important 'benchmark' (Score:3)
Re:The entire semester? (Score:1)
Re:Quite limited really (Score:2)
Re:Quite limited really (Score:1)
That old open source saw
If you're interested in some results that no one appears to have produced, go do them yourself. Don't criticise someone who has scratched their itch.
Re:Pretty sloppy presentation. (Score:1)
Good points. But numbers are numbers, and as long as they performed the benchmarks consistently across all the kernels tested, these numbers should be useful. Besides, do you think a professor would put his best grad student on something like this?
Very Poor 2.2 Page Fault Latency (Score:2)
According to this graph [nmu.edu], page fault latencies suck in kernel 2.2. Is this true? I think I'm running a 2.2.17-ac kernel, though, and if I'm just doing development and not causing swapping, then it doesn't really matter, right?
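If you'd rather measure than guess, page fault numbers in studies like this usually come from lmbench; a rough sketch, assuming you've built lmbench and its lat_pagefault binary is on your PATH:
#!/bin/sh
# Create a scratch file, let lat_pagefault mmap it, and time how long it
# takes to fault its pages in. Repeat under each kernel you care about.
dd if=/dev/zero of=/tmp/pf.scratch bs=1024k count=64 2> /dev/null
lat_pagefault /tmp/pf.scratch
rm -f /tmp/pf.scratch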
Re:Aaaah! Exponential! (Score:1)
I'd worry more if vmlinuz and modules start to grow exponentially.
---
---
is that the sound of pigs flying? (Score:1)
We'll beat Microsoft yet! (Score:3)
So when will line count surpass Windows 2000?
----
Re:Yeah, but... (Score:2)
MS Graph? (Score:2)
Re:This benchmark was not that useful (Score:2)
Why I would mod you down (Score:2)
Yesterday I modded some of these Michael-related posts WAY down. Why?
1. Because they are often insulting, and I don't like to read lame insults on my Slashdot. :-)
If you make an offtopic comment about a delicate subject, it really doesn't help if you start insulting.
Just state your opinion calmly and have respect for other people. If you'd post like that, I would mod it up. (But sadly I wasted all my points modding you down yesterday.)
2. You also always post so mysteriously. Why? I still don't really understand what all the fuss is about, and that's also really irritating. So would you please explain thoroughly what the problem is? Only if we all know what the problem is can we solve it.
So please post something objective and insightful about this, so we can discuss and solve the whole thing. If you keep posting like this you will only get modded down > get frustrated > post more insults > ...
Kernel Compilation for Performance (Score:2)
Is it worth the trouble?
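If you do want to find out for yourself, the mechanics on a 2.2/2.4 tree are cheap to try; a rough sketch of the usual sequence (the processor family and everything else worth tuning lives in menuconfig):
#!/bin/sh
# Build a kernel configured for your actual CPU instead of a distribution's
# lowest-common-denominator default.
cd /usr/src/linux || exit 1
make menuconfig            # pick your exact processor family, drop unused drivers
make dep                   # still required on 2.2/2.4 trees
make bzImage modules
make modules_install       # then install bzImage and update your bootloader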
Re:Linux Summarized Nicely (Score:2)
Then they are saying that it will take twice as long for Linux to tell my apps that I have ordered them killed.... (-1) so maybe that extra 1.5 microseconds might prevent a -9 switch.
Page fault latency: in all of 2.2, or fixed? (Score:3)
One thing that I wonder about: that huge performance hit on page fault latency shown in 2.2.6. Is it still there as of 2.2.19? Did the fix make its way back into the 2.2 series, or is it only fixed as of the later 2.3s and the 2.4 series? 2.2.6 is the only 2.2 kernel in their study, so the study doesn't answer the question.
-Rob
Converging in the Cauchy sense. (Score:2)
I.e., there is not much fat left to trim...
Therefore the next dramatic improvements, if they are to come, will not be from tweaking this part or that part of the kernel, but rather from implementing entirely new classes of functionality.
I.e., Linux has arrived. It's settled down; time for it to start exploring as-yet-unimagined new things to do instead of new ways to do old things.
The future will be, umm, fun.
This post is not designed or intended for use in on-line control of aircraft, air traffic, aircraft navigation or aircraft communications; or in the design, construction, operation or maintenance of any nuclear facility.
Re:The most important 'benchmark' (Score:2)
Reread his post. He's not suggesting anything of the sort. He's suggesting that a) many people still find it easier to use Windows than Linux, and b) that's a more important benchmark than speed.
An on-going study would be really useful. (Score:3)
Quite limited really (Score:4)
Where are the results for networking?
I definitely noticed a jump in performance between 2.2.16 and 2.4.0, so they must be missing something here.
They note the large increase in hardware support, but don't seem to realise that this new and improved support has given Linux much more performance than their benchmarks might show.
Maybe the improvements in X etc. have helped, but no real performance difference between 2.1.38 and 2.4.0? Put any such machines through real-world work and you'll soon spot the difference...
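If someone wants to fill the networking gap, a rough sketch with netperf between two boxes (the hostname below is a placeholder), rebooting the client into each kernel under test:
#!/bin/sh
# TCP bulk-transfer throughput for 60 seconds against a box running netserver.
# Repeat after booting into each kernel you want to compare.
netperf -H netserver.example.com -t TCP_STREAM -l 60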
Re:Some Info on NMU (Score:2)
More info on the growth of linux (Score:2)