Linux Kernel Gets Fully Automated Test
An anonymous reader writes "The Linux Kernel is now getting automatically tested within 15 minutes of a new version being released, across a variety of hardware, and the results are being published for all to see. Martin Bligh announced this yesterday, running on top of IBM's internal test automation system. Maybe this will enable the kernel developers to keep up with the 2.6 kernel's rapid pace of change. Looks like it caught one new problem with last night's build already ..."
now all we need is automated.... (Score:5, Funny)
Re:now all we need is automated.... (Score:2)
Re:now all we need is automated.... (Score:3, Funny)
Re:now all we need is automated.... (Score:5, Interesting)
Apparently it works for Samba [samba.org].
Re:now all we need is automated.... (Score:2)
say for example producing similar load on demand wrappers for a load of functions in a dynamic library.
p.s.
It's been 14 minutes since you last success
Re:now all we need is automated.... (Score:2)
There's a fair bit of repetitive code in the kernel. I had to do some hacking to make some RS-422 cards we had work properly, and found that a lot of the char drivers especially contain very similar code, and structure. Code generation might help with older drivers that nobody cares about until they break. They tend to rot from the looks of things.
Re:now all we need is automated.... (Score:3, Insightful)
Re:now all we need is automated.... (Score:1)
Re:now all we need is automated.... (Score:2)
#include <stdio.h>
int main(void)
{
    puts("#include<stdio.h>\n"
         "int main(void)\n"
         "{\n"
         " puts(\"hello world!\");\n"
         " return 0;\n"
         "}");
    return 0;
}
Re:now all we need is automated.... (Score:3, Insightful)
If you're going to use fixed-length buffers, though, at least use sNprintf!
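To make the snprintf point concrete, here is a minimal sketch. The helper name format_greeting and the message text are made up for illustration and are not from the thread. It relies only on standard C99 behaviour: snprintf never writes more than the given size (terminator included) and returns the length the full string would have needed, so truncation can be detected.

```c
#include <stdio.h>

/* Hypothetical helper: formats "hello, <name>!" into a fixed-length buffer.
 * Returns 1 if the whole string fit, 0 if it was truncated or an error
 * occurred. snprintf writes at most size bytes, including the NUL. */
int format_greeting(char *buf, size_t size, const char *name)
{
    int n = snprintf(buf, size, "hello, %s!", name);

    /* n is the length the complete string needs, excluding the NUL. */
    return n >= 0 && (size_t)n < size;
}
```

A caller would pass sizeof on the destination array as the size and check the return value before trusting the contents as complete.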
Re:now all we need is automated.... (Score:1, Insightful)
Re:now all we need is automated.... (Score:2)
Re:now all we need is automated.... (Score:1)
Re:now all we need is automated.... (Score:2)
Being one fully aware of the possibilities of auto coding or using code generators, both of which exist today in one form or another, just not so completely available wide scope on much of any user/consumer platform..
I was being serious but certainly found the hum
IBM's Rational Software (Score:2)
It's based off of Eclipse. Check it out if you can.
Re:NOT FUNNY: Chinese Military Software (Score:1, Offtopic)
Sometimes, though, he interjects his posts into unrelated articles.
Why has it taken so long? (Score:1, Redundant)
Re:Why has it taken so long? (Score:2, Insightful)
Re:Why has it taken so long? (Score:3, Informative)
Re:Why has it taken so long? (Score:2, Funny)
Re:Why has it taken so long? (Score:1)
Stop playing the "then do it yourself" card, guys (Score:2)
Please don't play this card all the time. We hear it way too often in the Free Software/Open Source communities, and it's really quite silly.
The grandparent post asked if it would make more sense to do it another way. That's a perfectly valid and logical question. Either he's right, and it does make more sense, or he's wrong (for a variety of reasons), and it's best to keep it the way it is. None of these require one person to do it incorrectly, and another to do it proper
Re:Why has it taken so long? (Score:2)
Re:Why has it taken so long? (Score:2)
Re:Why has it taken so long? (Score:1, Funny)
Within 15 Minutes? WTF (Score:1, Insightful)
Would be much better to test it BEFORE a new version is being released, otherwise this is completely useless...
Re:Within 15 Minutes? WTF (Score:2)
There may be (and probably are) other test beds out there, testing releases. It would be better for Linus (and the world) if he could release already-tested code to the world, instead of having the world duplicate all the testing effort, and IBM seems like a perfect solution.
Re:Within 15 Minutes? WTF (Score:5, Informative)
So it's fairly well tied in already
Re:Within 15 Minutes? WTF (Score:1)
as far as I know (US) developers sleeping during the night time in... China
Re:Within 15 Minutes? WTF (Score:2)
Re:Within 15 Minutes? WTF (Score:1)
Re:Within 15 Minutes? WTF (Score:2)
And since the entire test run only takes 15 minutes, IBM (and the world) would benefit from allowing him multiple tests per release.
Comment removed (Score:4, Insightful)
Re:Within 15 Minutes? WTF (Score:4, Insightful)
If everyone did this, the newest kernels would never get tested. I think it is important that we have a diverse range of users using new, almost new, and older but well tested kernels.
Re:Within 15 Minutes? WTF (Score:3, Insightful)
Taking this into account, I believe this is meant to catch bugs mainly in nightly (unstable) builds and release candidates, not in "final" versions (those should, at least in theory, have no serious bugs left around as th
Re:Within 15 Minutes? WTF (Score:5, Informative)
Wait a minute... (Score:3)
You say it's "completely useless" because you have to wait 15 minutes when a kernel is released.
And this is modded "insightful".
Question: (Score:5, Interesting)
Re:Question: (Score:3, Informative)
Re:Question: (Score:3, Funny)
Re:Question: (Score:2)
Re:That would be good, except (Score:2)
Re:Question: (Score:1)
After a new kernel was released, power meters on mothers' basements everywhere saw a little blip. Add up all these blips, and you get a (somewhat) tested kernel.
Re:Question: (Score:2)
How much testing? (Score:2, Interesting)
Re:How much testing? (Score:5, Informative)
in yellow, rather than green or red. I have a few of those in the internal tests, but not the external set.
This is only the tip of the iceberg as to what can be done. We're already running LTP, etc internally, and several other tests. Some have licensing restrictions on results release (SPEC)
What took so long (Score:4, Interesting)
Presumably... (Score:5, Insightful)
Kjella
Re:Presumably... (Score:5, Informative)
Going from 90% working to 99.9% working is frigging hard. I had all this working 3-6 months ago, but the results weren't good enough quality to be published. Several people internally put a massive amount of work into improving the quality and stability of the harness.
Re:Presumably... (Score:3, Insightful)
The first 90% takes 10% of the time.
The last 10% takes 90% of the time.
I expect one could substitute "money", "labor", "effort" for "time" in the above.
Bob-
Re:Presumably... (Score:2)
The idea is the same though.
Re:Presumably... (Score:2)
It's magic [netbsd.org]! A single script and I can build a complete operating system for a big-endian 64bit architecture on a 32bit little-endian architecture, or any of the other 48 supported archs. More than that, I can build a complete NetBSD for any arch on any halfway POSIXish system.
build.sh bootstraps its own contained build utils (compiler, binutils et al) and builds the system with that. You can even build the complete system as non-root and get tarballs that you ca
Re:Presumably... (Score:2)
Re:Presumably... (Score:2)
2. http://cvsweb.netbsd.org/bsdweb.cgi/src/regress/ [netbsd.org] is supposedly used for regression testing. Ask a developer, I'm just a user
Re:Presumably... (Score:2)
You can basically retask servers in something like 10-60 minutes depending on what you are doing, and it's a completely automatic process.
That is what aegis does (Score:3, Interesting)
and it can do a lot of other things too, like making sure that each change has an accompanying test and that all tests pass before anybody else is bothered with that change.
The biggest downside for aegis (as I see it) is that it needs to run on a central development server, it is not server based like CVS or the others (it has a cvs-like interface for reading). But OTOH, would it be so hard to have the kernel developers log into a central compile farm where the linux kernel i
Re:What took so long (Score:2)
Maybe... (Score:2, Interesting)
Re:Maybe... (Score:5, Informative)
The numbers are there, it's just a question of drawing graphs, etc. I have some for kernbench already, but I'm not finished automating them. If anyone wants to email me code to generate them from the directory structure published there, feel free
Re:Maybe... (Score:2)
It's too bad the Stanford Checker can't be integrated into your system.
This is awesome (Score:5, Insightful)
Sound issues? Older network and SCSI cards? There are a lot of drivers that break, and no one notices it because there is nobody with the hardware testing the -rc or -mm kernels.
Wouldn't it make more sense to package these tools for someone to install on their collection of oddball equipment, and assist in the debugging/testing?
Where's the ARM, MIPS, and SH?
Re:This is awesome (Score:5, Insightful)
Automated tests are not intended to catch everything or test strange permutations of pre-conditions. Their purpose is to provide a mechanism for verifying that a build satisfies the basic requirements of the project.
More exotic configs need to be tested manually as usual but automated tests can provide a "failsafe" just in case a basic part of the build is broken.
Furthermore, it prevents regressions (Score:4, Insightful)
An automated test for B will catch regressions caused by my fix in A, making it harder to backslide. Backsliding is very expensive because bugs are far removed from their cause. If an automated test sees that changes in A caused a regression in B, the cause is immediately obvious.
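A toy sketch of that A/B situation, with everything in it hypothetical (clamp_percent stands in for A, bar_width for B, and the numbers are arbitrary). The check exercises only B, yet it would fail the moment a change to A stopped clamping:

```c
#include <assert.h>

/* Component A (hypothetical): clamp a value into the range 0..100. */
int clamp_percent(int v)
{
    if (v < 0)
        return 0;
    if (v > 100)
        return 100;
    return v;
}

/* Component B (hypothetical): relies on A to stay within the bar width. */
int bar_width(int percent, int columns)
{
    return clamp_percent(percent) * columns / 100;
}

/* Automated check for B. If a later "fix" in A breaks the clamping,
 * these assertions fail on the very next run, right after the change. */
void test_bar_width(void)
{
    assert(bar_width(50, 80) == 40);
    assert(bar_width(150, 80) == 80);
    assert(bar_width(-5, 80) == 0);
}
```

Run after every change, a check like this keeps the failure close in time to the change that caused it, which is the point being made above.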
Re:This is awesome (Score:2)
Isn't that what a compiler is for? ;)
Re:This is awesome (Score:2)
Re:This is awesome (Score:2)
Who should map the hardware testing platforms? I don't know, but I do know that if the new kernel builds are tested for a generic group of hardware and released, then other testers report on their tests using hardware X, you would end up with a relatively quick listing of a new build against many variants of hardware. Published correctly, it would allow people to search for
Re:This is awesome (Score:2)
The only real way to automate something like that would be a dummy load facility. Some software which would emulate the hardware being in place. Something conceptually similar to that effect anyway.
So then, for every driver for a device, you have a
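One way the dummy-load idea might look in miniature, purely as a sketch: the driver logic talks to the device only through a small table of function pointers, and the test supplies an in-memory fake instead of real hardware. Every name here (reg_ops, fake_dev, drv_send_byte, the register layout) is invented for illustration and is not a kernel interface.

```c
#include <stdint.h>

/* Hypothetical register-access interface the driver code is written against. */
struct reg_ops {
    uint8_t (*read)(void *ctx, unsigned reg);
    void (*write)(void *ctx, unsigned reg, uint8_t val);
};

/* Made-up register layout for the toy device. */
enum { REG_STATUS = 0, REG_DATA = 1, NUM_REGS = 4, STATUS_READY = 0x01 };

/* Fake device: a small register file in memory, standing in for hardware. */
struct fake_dev {
    uint8_t regs[NUM_REGS];
};

static uint8_t fake_read(void *ctx, unsigned reg)
{
    struct fake_dev *d = ctx;
    return reg < NUM_REGS ? d->regs[reg] : 0;
}

static void fake_write(void *ctx, unsigned reg, uint8_t val)
{
    struct fake_dev *d = ctx;
    if (reg < NUM_REGS)
        d->regs[reg] = val;
}

const struct reg_ops fake_ops = { fake_read, fake_write };

/* Toy "driver" routine: send one byte if the device reports ready.
 * Returns 0 on success, -1 if the device is busy. */
int drv_send_byte(const struct reg_ops *ops, void *ctx, uint8_t byte)
{
    if (!(ops->read(ctx, REG_STATUS) & STATUS_READY))
        return -1;
    ops->write(ctx, REG_DATA, byte);
    return 0;
}
```

A fake like this only checks the driver's logic against the behaviour someone wrote into the fake. It says nothing about timing or quirks of the real card, so it complements testing on actual hardware rather than replacing it.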
Re:This is awesome (Score:2)
IBM doesn't sell any ARM, MIPS or SH-based systems. So, they don't test them.
The Debian buildd system is an automatic building and semi-testing system for, of course, all the archs that Debian supports, and that includes ARM, MIPS, and SH.
Re:This is awesome (Score:2)
That's how the PostgreSQL build farm [pgbuildfarm.org] works. People with weird hardware [onlamp.com] apply to be added to the automated test farm. ARM, MIPS, PARISC, Alpha, PowerPC, Sparc, etc. are all represented well in the PostgreSQL automated tests.
ARM Linux has something similar (Score:5, Informative)
ARM Linux has had something similar in Kautobuild [simtec.co.uk] for some time.
Although the testing and building is limited to the ARM platform.
The site also has a who's who that's worth looking at ;-)
Related projects at OSDL (Score:2, Informative)
http://osdl.org/projects/26lnxstblztn/results/ [osdl.org]
http://developer.osdl.org/cherry/compile/ [osdl.org]
News Flash (Score:5, Informative)
That said, pushing tests upstream is a great idea. Just not revolutionary or anything.
Re:News Flash (Score:2)
Red Hat has several engineers that *are* upstream.
Re:News Flash (Score:2)
The other six machines seem OK. But that's a 50% buggered rate from various flavors of 2.6 upgrades, mostly from nightly 'yum update's. These are all IBM, Compaq, HP, and Dell machines, so somebody's
Long uptimes (Score:5, Interesting)
I hope they are using code from the Linux testing suite. That piece of work has already formed a nice set of tests. Also, I hope that the kernel is automatically built with many different combinations of options. And with time, I hope this will become better. The more tests, with the more hardware configurations, with the more kernel configurations, with the more types of input data (including many imaginative forms of incorrect input data to test that the kernel handles it gracefully and thwarts attacks based on such methods), the better quality we will have in the kernel, and it is likely that Linux will be unmatched in quality, stability, efficiency (well, maybe not efficiency necessarily), and long uptimes.
through the looking glass... (Score:4, Funny)
Re:through the looking glass... (Score:4, Informative)
Plus, there's a separate development grid where we test new test-harness code before it's put onto the production grid.
Re:through the looking glass... (Score:1)
Re:through the looking glass... (Score:1)
Re:through the looking glass... (Score:3, Funny)
You're right, I made a mistake. I shall modify my test suite forthwith... [divide-by-zero error]
Does this mean... (Score:3)
Safety issues (Score:5, Funny)
Martin Bligh announced this yesterday, running on top of IBM's internal test automation system.
Hope he doesn't fall off and hurt himself.
Re:Safety issues (Score:2)
cool to see this publicly announced (Score:1)
I think Martin Bligh said that IBM has been using this for a while now, automatically downloading kernels upon release and testing them. The new thin
2.6.12 on amd64 (Score:2)
Re:2.6.12 on amd64 (Score:1)
Linux enters the world of QA 101! (Score:1)
Re:Linux enters the world of QA 101! (Score:1)
I realize that this is not the same as testing the entire package on dissimilar hardware like he is doing here; for instance, there are bound to be a few issues when developers of code and its underlying code base both submit updates the same evening. IMHO, it'd especially help new developers if there existed unit tests
Re:Linux enters the world of QA 101! (Score:1)
Regards,
Steve
Re:Linux enters the world of QA 101! (Score:2)
Secondly, it's MUCH, MUCH easier to fix a bug the night after it went in, not 3 months later. Everyone has context as to what's going on fresh in their minds, and the change hasn't been buried under 7 tons of other crap.
Is this even worth anything? (Score:2)
Re:Is this even worth anything? (Score:2)
published?
For another, this is only the tip of the iceberg as to what can be done, but I'm not going to lock whatever I have now in some dingy dungeon until it's "finished". What's there is useful, albeit incomplete. Testing is *never* complete.
The main goal, as you put it, is to improve the quality of the linux kernel. If we can ensure the kernel builds, boots, and runs basic tests
Re:Is this even worth anything? (Score:2)
Any other Open Source projects have similar? (Score:2)
Any other projects out there with similar transparency in their automated testing?
The same thing for NetBSD (Score:2)
- Hubert
Re:Well, this time I am really unhappy! (Score:2, Insightful)
Testing a product to make it better doesn't mean the product is bad to start with. Some code has higher aspirations than that.
Re:Well, this time I am really unhappy! (Score:2)
Re:Well, this time I am really unhappy! (Score:2)
If you think that the 2.6.x kernels are unstable, you can use the 2.0, 2.2, or 2.4 kernels. All those versions are still being maintained, and they are definitely stable.
Re:Well, this time I am really unhappy! (Score:2)
Re:Well, this time I am really unhappy! (Score:1)
2000 and XP are way different than 95.
Windows '95, '98 and ME are descended from DOS and Windows 3.x, and contain significant portions of old 16-bit legacy code. These Windows versions are essentially DOS-based, with 32-bit extensions. Process and resource management, memory protection and security were added as an afterthought and are rudimentary at best. This Windows product line is totally unsuited for applications where