Kernel Hacker Keith Owens On kbuild 2.5, XFS, More
Jeremy Andrews writes: "Kerneltrap interviews Keith Owens this week, an experienced kernel hacker who has long contributed to the Linux kernel. His contributions include updating ksymoops and modutils, both of which he maintains. He also works on kbuild 2.5. Earlier, he built the original Integrated Kernel Debugging patch. He's also working on kdb and XFS.
Check out the interview." Lots of good information in here about things to expect in 2.5.
2.5 hurry up. (Score:1)
Re:2.5 hurry up. (Score:1)
Re:The Welsh Problem in the UK (Score:1)
XFS is pretty stable.... (Score:1)
Re:XFS is stable.... (Score:1)
Re:We believe you (Score:1)
This is so cool (Score:3, Interesting)
I'm wondering, do kernel developers use tools like vmware/plex86 to debug their running kernels? It seems like we've come a long way since debugging with strategically placed printfs.
Re:This is so cool (Score:1)
And no, they are still useful; as Linus has said (and you know Linus is always right, don't you?), they force you to think about what you are doing before trying another run.
Trust me, nothing beats a bunch of printks and a serial console when debugging a kernel.
Jeroen
Re:This is so cool (Score:3, Informative)
>too scared of this 'black art'. It's good to see someone taking time out to make it a bit more comprehensible for 'the rest of us'.
Try www.kernelnewbies.org. Esp. look in the book section for Linux Device Drivers v.2 and the online version.
Wrt. debugging, printks are still used alongside everything else. I do not think that debugging is done with vmware or plex86 yet, but there is a port of the kernel to userland (User-Mode Linux) which is used by some.
Rasmus
Re:This is so cool (Score:1)
Jeff Dike and User Mode Linux [sourceforge.net] to the rescue.
Re:This is so cool (Score:3, Interesting)
Vmware or plex86 could possibly be of some use, except that they're no good for debugging device drivers since the real devices are hidden behind a virtualization layer. For non-device driver work User Mode Linux [sourceforge.net] is a more lightweight solution, and tracks the latest development kernels much more closely. An increasing number of the core developers are using User Mode Linux regularly.
For heavy debugging work on live kernels, kgdb is the preferred solution, with a serial cable link to a test machine. It takes a little more work to set this up, and you need two machines. Kdb is a simpler debugger that can be patched into the kernel, useful for tracking down elusive kernel problems. It's included by default in SGI's XFS patch and pre-patched kernels.
There are some great tools available including LTT, the Linux Trace Toolkit and various lock-monitoring patches. Unfortunately, most driver development is still being done by the printk/reboot method. If this is your preferred method, make sure you install a journalling filesystem unless you like spending most of your time watching fsck work.
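A minimal sketch of the serial-console setup mentioned above, assuming a null-modem cable on the first serial port (the port, speed, and bootloader syntax are example values, not a prescription):

```shell
# 1) On the test box, append console parameters to the kernel boot line
#    (lilo.conf or GRUB's kernel line):
#
#      console=ttyS0,115200 console=tty0
#
# 2) On the development box, capture whatever comes over the cable, e.g.:
#
#      cu -l /dev/ttyS0 -s 115200 | tee boot.log
```

With that in place, printk output (and a kernel oops) survives on the second machine even when the test box locks up solid.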
--
Daniel
Grammar flame (sorry :) (Score:1)
There's no need for an apostrophe* in "interviews".
* The superscript sign ( ' ) used to indicate the omission of a letter or letters from a word, the possessive case, or the plurals of numbers, letters, and abbreviations. (ex dictionary.com)
Re:Grammar flame (sorry :) (Score:1)
Don't be sorry. Putting an apostrophe in a third-person verb is a particularly dismal error, and deserves to be mocked without mercy.
PS. help for idiots [angryflower.com].
Broken Link (Score:1)
Re:Broken Link (Score:1)
I wonder will they incorporate ACPI (Score:2)
This could be very significant since ACPI allows for highly-automated system configuration, which is necessary if you want seamless hot-docking of external devices and ease of system upgrades.
Re:I wonder will they incorporate ACPI (Score:1)
http://content.techweb.com/wire/story/TWB200104
Here is a great article from the summit, with more 2.5 info:
http://lwn.net/2001/features/KernelSummit/
Re:I wonder will they incorporate ACPI (Score:1)
See http://lwn.net/2001/features/KernelSummit/ [lwn.net] for a fairly comprehensive list of changes planned for 2.5.
The third section up from the bottom covers what they plan to do for power management.
Intel's ACPI implementation is out there now, and is being used by FreeBSD. They are currently waiting for the 2.5 fork to submit it for Linux. As a measure of the complexity of ACPI, consider that this implementation has "5-7 person years" of development work in it already, and does not yet have support for putting systems to sleep.
So ACPI is important, despite its bulk. 2.4 already has the ACPI interpreter in it, but 2.5 will be where we see a truly working implementation of this standard.
It's a shame about Linus's opinion on kdb (Score:3, Interesting)
"In theory, there is no difference between theory and practice. In practice, there is." In hardware development, there is the theory of what the hardware documentation says the chip will do, and then there is the practice of what it actually does. DMAs don't, interrupts stick, registers report stale data. Obviously, you START by writing a user space app that pokes at the hardware (and this is one area in which Linux is head and shoulders above WinNT - there is NO way for a user space app to access hardware in NT, while in Linux you simply have to be root), but when you finally need to hook interrupts, allocate DMA buffers, etc., you need a debugger that can look at these events.
Also, when porting to other CPUs, you sometimes need to see what is going on at the hardware level, and how it affects the drivers in the kernel.
Yes, allowing debugging without analysis is bad. But throwing us back to the stone knives and bear skins era just to encourage hardier folks is an overreaction. Sure, make a KDB kernel bitch and moan during startup. Make it only allow root access, not normal user access. Force all file systems to run in full sync mode. But please don't make debugging buggy hardware any harder than it needs to be.
(Now, if only AMD would add a JTAG debugger to the Athlon chip, I'd be a happy man.)
Re:It's a shame about Linus's opinion on kdb (Score:1)
Have you ever used JTAG? It's the buggiest thing I've ever seen. I've used it a lot on PPC, and what they always fail to mention is that the JTAG controller has to know what rev mask you're using on the chip. If anything is amiss, you'll get random operation. Not to mention the way it badly interacts with caches. And also that most board manufacturers fsck their JTAG ports up. Did I also fail to mention that the damn JTAG controllers cost like 5-8 grand? In the end all it does is restrict development to groups willing to lay down big money for the privilege of developing for your system.
t.
Re:It's a shame about Linus's opinion on kdb (Score:1)
True, the debugger program has to know what mask device you are working with, but that is usually a simple matter of selecting the correct mask when the debugger is launched.
Perhaps you are thinking of a full JTAG implementation, with full scan chain support et cetera. I am talking about a simple implementation like Motorola's Background debugging mode support.
Besides, if you think JTAG is flaky, try using a bond-out pod. I've not seen bond-outs for CPUs faster than about 20MHz, and those were always "stand on one foot, hold your tongue just right, think happy thoughts, and don't breathe while debugging" affairs.
Global Makefile! (Score:4, Insightful)
From the interview:
This is excellent, and I hope more open source projects start to go this way. It's been known for a while that recursive make is a bad idea [pcug.org.au] because it's inaccurate. Naive recursive makefile structures tend to miss stuff that needs to be built/installed and fixing that problem (usually with ugly hacks like make dep) generally results in building stuff that doesn't need to be built.
What Keith describes is a nice solution that provides the benefits of recursive make without the problems: Use per-directory makefile fragments which can be maintained locally, but automatically generate a complete, tree-wide makefile that is actually used for the build.
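A toy sketch of that scheme (directory and file names invented for illustration): each directory keeps a local fragment listing its objects, and one top-level makefile includes them all, so make sees the complete dependency graph in a single pass.

```shell
#!/bin/sh
set -e
mkdir -p demo/libfoo demo/app

# Per-directory fragments: maintained locally, next to the sources they describe.
echo 'objs += libfoo/foo.o' > demo/libfoo/Makefile.frag
echo 'objs += app/main.o'   > demo/app/Makefile.frag

# One top-level makefile that includes every fragment and builds from the
# combined object list.  (printf is used so the recipe line gets a real tab;
# 'cp' stands in for the compiler to keep the sketch self-contained.)
printf '%s\n' 'include $(wildcard */Makefile.frag)' 'all: $(objs)' > demo/Makefile
printf '%%.o: %%.c\n\tcp $< $@\n' >> demo/Makefile

echo 'int foo(void){return 1;}' > demo/libfoo/foo.c
echo 'int main(void){return 0;}' > demo/app/main.c

make -C demo all   # builds both objects from the single global makefile
```

Run it twice and the second `make -C demo all` has nothing to do: since make holds the whole graph, an up-to-date tree is detected immediately instead of being re-walked directory by directory.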
There are tools other than make that provide more elegant solutions, but given that they never seem to catch on, I'm happy to see that someone is applying the tool we have (make) correctly, for once.
I'm looking forward to this one.
Re:that paper is weird (Score:2)
It's slow too.
Re:that paper is weird (Score:2)
I read the paper, and it seems to basically say "it's pretty hard to get your dependencies right with recursive makes."
"Pretty hard" is an enormous understatement on large projects. When the project consists of thousands of source files it quickly becomes the case that *no one* knows what all of the dependencies are.
The whole point of make is that the dependency management should be automatic. make is able to understand the full dependency tree all at once, if you allow it to. Multi-pass makes can solve that problem, but it's very hard to know how many passes are required and the process does get to be very slow.
I'm not going to comment on which method is better or takes less work, but the paper misrepresents just how bad recursive makes are.
I'll comment on it, and my experience is that you're dead wrong; for big projects recursive make is really bad. At one company I worked for a few years back I was the lead developer on a large Unix project. The build system was based on a recursive make that kept degrading over time. We'd give someone the task of fixing it occasionally, and they'd go off and spend a week or so getting it cleaned up, but within a couple of months it would be wrong again, and all of the developers were having to do frequent complete builds (which took five hours). Eventually we took to rebuilding nightly, and everyone got used to the idea that if you were working and stuff started to misbehave badly, you just had to come back the next day. I finally decided that the whole thing was costing us way too much in productivity, so I rewrote the build system myself.
My goals were simple: I wanted a build system that (a) would do nothing if everything was up to date, (b) would not build anything more than once during a run and (c) would build correctly. Oh, and I wanted it to be easy to maintain. After three weeks of time I couldn't really afford to waste on a stupid build system, I achieved my goals, but I had to write a shell script that checked a lot of the cross-module dependencies itself and directed the makes. Build times dropped from five hours to three hours and an up-to-date check took less than 10 minutes. I was very proud of myself.
On the suggestion of one of the other engineers I decided to try a global makefile, using distributed fragments; it would be easy to construct and easy to maintain, but I was sure it would be too slow. Still, it only took three days to set up, so I tried it.
To my surprise, an up-to-date check took 30 seconds the first time, and 5 seconds after that (because of file system caching). Overall, complete build times dropped to just over two hours. The makefile fragments were easy to maintain and the resulting build system was very robust. I discarded my other system, we switched to the global makefile and we never again had to waste days futzing with build processes.
So I have no doubt about which is better and which takes less work. A global make is fast, trivial to build and maintain, and always works properly. Recursive makes are often very slow, require maintenance and sometimes screw up, which wastes lots of effort. The *only* advantage a recursive make has is that it makes local builds deep in the directory tree a few seconds faster. Even that advantage is nullified by adding a few extra targets to the global makefile fragments and specifying a build target on the command line.
Recursive make is, simply, misuse of make. Tools are much more effective when used properly.
Re:Global Makefile! (Score:1)
XFS or Ext3? (Score:1)
I was using Redhat 7.1 with XFS, but now that Redhat 7.2 came out with Ext3, I'm considering switching, and re-ghosting the machines.
Is Ext3 more stable than XFS?
Is the Kernel that came with the SGI distro of Redhat 7.1 stable? Or should I switch to 7.2 with Ext3? Which would I be better off with in the long run? Which will run the longest? And survive the most power outages? (this is going in the 7-8th grade building).
Thanks
Re:XFS or Ext3? (Score:1)
Basically, in ext3-speak, there are two kinds of metadata journaling. Writeback mode, supported by all the linux jfses, will guarantee that your directory and file structure is consistent after a hard crash, but it doesn't make any guarantees re: file data. This level of protection is about the same as ext2 + a really fast fsck on boot -- so you might see files with blocks from other files in reiserfs, for example. However, xfs is worse than that -- with its delayed allocation feature, you'll see entire files zeroed out after a crash. (See their mailing list archives for details.)
ext3 supports this mode too, but the default is ordered mode, which forces stricter ordering on data writes. Data always goes to disk before file metadata is updated, so you'll either see the "right" data after a crash or the old data -- but never damaged data.
AFAIK, ext3 is the only linux jfs with working ordered-mode support, though reiserfs apparently has patches in the works.
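For reference, the modes described above map directly onto ext3's `data=` mount option; a sketch, with placeholder device and mount point:

```shell
# ext3 journaling modes (example device and mount point):
#
#   mount -t ext3 -o data=writeback /dev/hda2 /mnt  # metadata-only, weakest guarantees
#   mount -t ext3 -o data=ordered   /dev/hda2 /mnt  # the default: data written before metadata commits
#   mount -t ext3 -o data=journal   /dev/hda2 /mnt  # file data journaled too, slowest
```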
Re:XFS or Ext3? (Score:2)
The only way to guarantee that things do or don't get out to a drive is to run fully sync'd with the cache on the drive disabled. I do know that I can run in fully synchronous mode on XFS and guarantee that the write got out, but then you are throwing away all of your system cache and your system will be bogged to hell and back. Ext3 ordered mode is faster than XFS, Reiser, etc. because it essentially doesn't do journaling anymore (whereas the others would write to the journal and then write the data out before the commit is done). When you really start to do any mildly heavy I/O, this mode pukes all over itself, since it requires all the data to get written to the drive before the transaction is considered committed. When you use either ordered or full data-journaled mode on ext3 you throw out all of your filesystem cache, and you had better turn off the cache on your drive.
A word of advice: *never* leave your drive cache on with ordered mode. Turn off the power to the drive, and all those "supposedly committed" writes that were guaranteed to get out to the drive are not there. Now you are completely screwed: you have to do a *full* fsck of the entire fs, since ordered mode isn't journaled. If you are running in "data-journalling" mode on ext3 you do get the journal, and it still blocks the transaction until it gets written to disk, but it also has a journal, meaning you get the two-write hit just like XFS, Reiser, etc. running in synchronous mode.
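The drive-cache advice corresponds to hdparm's `-W` flag; a sketch, with a placeholder device name:

```shell
# Toggle the IDE drive's write-back cache:
#
#   hdparm -W0 /dev/hda   # disable write caching (safer with journaling)
#   hdparm -W1 /dev/hda   # re-enable it, if you accept the risk
```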
So unless you are willing to take a performance hit, ext3 gains you nothing over XFS, Reiser, or JFS; even then, depending upon what you are doing, it may be faster to run with a full synchronous journal (on any one of them) with drive cache turned on, making the ordered-mode performance benefit nil. Any FS can guarantee that data will get out to the drive; I doubt any serious server will ever want to run that way. If you can safely assume that your system will stay up, ext3 performance in writeback sucks rocks compared to pretty much all the others. So the only benefit I see to ext3 (and admittedly it is a fairly significant one) is the ability to go from ext2 to ext3 without any data migration required.
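As for that last point, the in-place ext2-to-ext3 upgrade really is minimal; a sketch, with a placeholder device and mount point:

```shell
# ext2 -> ext3 in place: add a journal to the existing filesystem,
# then mount it as ext3.
#
#   tune2fs -j /dev/hda3
#   mount -t ext3 /dev/hda3 /home
```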
Re:XFS or Ext3? (Score:2)
Is it really that hard to convert from ext2 to ReiserFS or XFS? I've never tried it.
I just installed WinXP and converted my 10 GB FAT32 partition to NTFS. The conversion took about 2 reboots and 10 minutes. It was totally automatic, with no input necessary on my part. Is it that much harder to convert in Linux?
Re:XFS or Ext3? (Score:2)
Re:XFS or Ext3? (Score:2)
I was incorrect in stating that ext3 ordered mode is not journaled, but again, due to the synchronous data writes you have the same issue where you take a big performance hit from the *long* amount of time it takes for the disk to spin to the place it needs to be. Not only that, but you lose all the benefit of being able to stack multiple transactions together before they get out to the drive (people are amazed at what command tag queueing can do), along with all the other benefits of letting the OS flush things out when needed.
We switched back from ReiserFS to ext2... (Score:2, Interesting)
After power cuts on frozen development systems, it regularly happened that files written minutes ago were completely corrupted; they were there, but with just garbage in them. What you have written explains what probably happened; still, it troubled me that files written minutes earlier were affected. What really drove me to throw out ReiserFS on every machine was when, after a crash, every file I had created within the last two hours was destroyed; I never thought a filesystem might take out many hundreds of files with such precision. Even if I were not to blame ReiserFS for this disaster (I do), I consider it completely unacceptable that all this happened without the slightest warning: no entry in the syslog, no boot message, nothing. ReiserFS pretended everything was fine. Do you have any explanation for such behaviour? Are such effects just the downside of using a journaling fs, or is it something ReiserFS-specific? What added to my loss of confidence in ReiserFS was that a few months ago reiserfsck dumped core when I tried to repair a file system that showed strange behaviour, which I regarded as exceptional at the time.
For now I have switched back to ext2 and feel pretty good seeing a thorough filesystem check after a crash. I do not remember much trouble using XFS with IRIX, but I have no experience so far with any journaling fs on Linux except those mentioned above. So do you have any recommendation for a filesystem on an unstable development system, where I cannot sacrifice too much performance but need at least confidence in the integrity of my fs? (I did not lose much data, but it easily takes a few hours to bring back a system from the backups, and unnoticed damage to vital files can drive you crazy.) p.
PRCS (Score:1)
What exactly can PRCS do better than CVS in terms of maintaining multiple branches?
Are there any known big open source projects currently using PRCS?
Re:Slightly off topic (Score:1)
Signs u r a hard core geek :) (Score:2)
JA: Why does Linus refuse to include kdb?
Keith Owens: http://www.lib.uaa.alaska.edu/linux-kernel/archiv
JA: Why should it be included?
Keith Owens: http://marc.theaimsgroup.com/?l=linux-kernel&m=96
Its not good (Score:2, Insightful)
But it's a good thing that a kernel debugger exists; it will help you understand how it works inside. But it WON'T help you FIX things.
bruj0-
ext3 (Score:1)
Re:ext3 (Score:1)
Kernel now has Kernel debugger? (Score:2)
Re:Kernel now has Kernel debugger? (Score:1)
www.kernel.org
but you have something like a distribution kernel (RH/Mandrake/SuSE/Debian/Slack/whatever it is)
Re:Kernel now has Kernel debugger? (Score:1)
#
# Kernel hacking
#
CONFIG_DEBUG_KERNEL=y
# CONFIG_DEBUG_HIGHMEM is not set
# CONFIG_DEBUG_SLAB is not set
# CONFIG_DEBUG_IOVIRT is not set
CONFIG_MAGIC_SYSRQ=y
# CONFIG_DEBUG_SPINLOCK is not set
# CONFIG_DEBUG_BUGVERBOSE is not set