Linux Kernel Surpasses 10 Million Lines of Code

javipas writes "A simple analysis of the most up-to-date version of the Linux kernel (a Git checkout) reveals that its source code now surpasses 10 million lines, but note: that figure includes blank lines, comments, and text files. A deeper analysis with the SLOCCount tool yields the real number of pure code lines: 6,399,191, with 96.4% of them written in C and 3.3% in assembler. The count grows clearly with each new version of the kernel, which seems to be released roughly every 90 days."
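
For the curious, "pure code lines" here means lines on which something other than whitespace and comments remains. A toy counter in C sketches the idea (illustrative only: it ignores comment markers inside string literals, and the real SLOCCount handles dozens of languages and many more corner cases):

    /* sloc.c - toy physical-SLOC counter for a single C file.
     * Counts lines containing at least one character outside
     * comments and whitespace. Illustrative only: it does not
     * handle comment markers inside string literals. */
    #include <stdio.h>
    #include <ctype.h>

    int main(int argc, char **argv)
    {
        if (argc != 2) {
            fprintf(stderr, "usage: %s file.c\n", argv[0]);
            return 1;
        }
        FILE *f = fopen(argv[1], "r");
        if (!f) { perror(argv[1]); return 1; }

        char line[4096];
        int in_block = 0;            /* inside a block comment */
        long total = 0, sloc = 0;

        while (fgets(line, sizeof(line), f)) {
            total++;
            int has_code = 0;
            for (char *p = line; *p; p++) {
                if (in_block) {
                    if (p[0] == '*' && p[1] == '/') { in_block = 0; p++; }
                } else if (p[0] == '/' && p[1] == '*') {
                    in_block = 1; p++;
                } else if (p[0] == '/' && p[1] == '/') {
                    break;           /* rest of line is a comment */
                } else if (!isspace((unsigned char)*p)) {
                    has_code = 1;    /* something real on this line */
                }
            }
            if (has_code) sloc++;
        }
        fclose(f);
        printf("%ld total lines, %ld SLOC\n", total, sloc);
        return 0;
    }

Point it at any .c file in the tree to see how far the raw line count and the SLOC figure diverge.
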
  • Lines of Code (Score:1, Insightful)

    by Flyin Fungi ( 888671 ) on Wednesday October 22, 2008 @01:36PM (#25471063)
    Lines of code is not a good metric for performance. I'm in a software engineering class listening to how to use metrics on code.
  • by OrangeTide ( 124937 ) on Wednesday October 22, 2008 @01:38PM (#25471121) Homepage Journal

    Because we'd all like to know how many man-months something as big as the Linux kernel should take to implement. And laugh at the huge price tag sloccount will put on it.
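
    For reference, the price tag sloccount prints comes from the Basic COCOMO "organic" model. A back-of-envelope sketch of that arithmetic, using the constants sloccount documents as its defaults (treat the output as illustration, not a real estimate):

        /* cocomo.c - rough reproduction of sloccount's price tag,
         * using the Basic COCOMO "organic" model it applies by
         * default. Salary and overhead are sloccount's documented
         * defaults; all figures are illustrative only.
         * Build with: cc -o cocomo cocomo.c -lm */
        #include <stdio.h>
        #include <math.h>

        int main(void)
        {
            double ksloc  = 6399.191;                /* 6,399,191 SLOC */
            double effort = 2.4 * pow(ksloc, 1.05);  /* person-months */
            double sched  = 2.5 * pow(effort, 0.38); /* calendar months */
            double salary = 56286.0;                 /* USD/person-year */
            double ovh    = 2.4;                     /* overhead factor */
            double cost   = (effort / 12.0) * salary * ovh;

            printf("effort:   %.0f person-months (%.0f person-years)\n",
                   effort, effort / 12.0);
            printf("schedule: %.1f months\n", sched);
            printf("cost:     $%.0f\n", cost);
            return 0;
        }

    With these default assumptions it works out to roughly 2,000 person-years and over a quarter of a billion dollars, which is exactly the kind of figure worth laughing at.
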

  • Re:Lines of Code (Score:1, Insightful)

    by Anonymous Coward on Wednesday October 22, 2008 @01:47PM (#25471251)

    The amount of code doesn't always correlate to the size of the final binary. You have to consider a slew of things when considering the Linux kernel. First of all, there is a lot of architecture-specific code in there, since Linux can run on everything from ARM chips to Sparc machines. You also have to consider the built-in drivers that are included in the source but aren't usually compiled into the kernel binary unless you're running an embedded or specialized system. Anyone who has ever configured a kernel build has seen the giant combination of things that can be added and removed. The kernel getting larger just reflects more improvements and support for a wider range of machines. The final binary of a typical kernel has grown in size over the years, but not at the rate of the lines of code, so I wouldn't call Linux bloated because of the sheer size of the code base.
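
    For context on how all that conditional code stays out of the binary: whole files are pulled in or skipped via obj-$(CONFIG_FOO) += foo.o lines in the kbuild Makefiles, and within a file the same job is done by CONFIG_* preprocessor symbols set at configuration time. A minimal sketch of the in-file pattern (CONFIG_FROBNICATOR is made up for this example; real symbols look like CONFIG_SMP or CONFIG_X86_64):

        /* demo.c - illustrative sketch of the kernel's pattern of
         * gating feature-specific code behind CONFIG_* macros.
         * CONFIG_FROBNICATOR is a hypothetical option. */
        #include <stdio.h>

        #ifdef CONFIG_FROBNICATOR
        static void frobnicator_init(void)
        {
            puts("frobnicator driver compiled in");
        }
        #else
        static void frobnicator_init(void) { /* compiled out entirely */ }
        #endif

        int main(void)
        {
            frobnicator_init();
            return 0;
        }

    Build it with cc -DCONFIG_FROBNICATOR demo.c and again without the flag: the driver code appears and vanishes, no matter how many lines of it sit in the source tree.
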

  • by pembo13 ( 770295 ) on Wednesday October 22, 2008 @01:56PM (#25471415) Homepage
    I think they are including modules as well. And there is a growing number of userland drivers, too. So you can't come to a conclusion without knowing the size of the parts that live outside the kernel.
  • by qoncept ( 599709 ) on Wednesday October 22, 2008 @02:01PM (#25471511) Homepage
    Funny that the summary calls attention to the fact that the number of lines includes comments and whitespace, without any mention of how worthless lines of code is as a metric. Someone could easily go in and add or remove newlines wherever they wanted and, without changing a bit of code, make it 50 million or 50 thousand.
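
    To make that concrete, here are two functions the compiler treats identically but a naive line counter scores as 1 versus 8 lines (a contrived example, obviously):

        /* Byte-for-byte the same logic to the compiler, wildly
         * different to a line counter: 1 "line of code" vs 8. */
        int clamp(int v, int lo, int hi) { return v < lo ? lo : v > hi ? hi : v; }

        int clamp2(int v,
                   int lo,
                   int hi)
        {
            if (v < lo)
                return lo;
            if (v > hi)
                return hi;
            return v;
        }
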
  • by glavenoid ( 636808 ) on Wednesday October 22, 2008 @02:02PM (#25471533) Journal
    Makefiles, build scripts, etc., perhaps?
  • by Microlith ( 54737 ) on Wednesday October 22, 2008 @02:08PM (#25471629)

    While Linux is huge, for a backdoor to be successful it would need to hit a huge number of systems. The majority of the kernel at this point tends to be drivers, not all of which are used in a given kernel.

    For it to be even remotely worthwhile, it'd have to be placed into something that was both heavily used AND given little attention. Those two properties are almost mutually exclusive.

    Can anyone think of a place that would fall into these two categories? Even the more seemingly obscure parts of the kernel get attention fairly often and malicious changes wouldn't go unnoticed for long.

  • Re:Lines of Code (Score:5, Insightful)

    by hondo77 ( 324058 ) on Wednesday October 22, 2008 @02:11PM (#25471683) Homepage
    Why? Are you still using an 80s-era Mac as your primary computer?
  • Re:Lines of Code (Score:5, Insightful)

    by QRDeNameland ( 873957 ) on Wednesday October 22, 2008 @02:14PM (#25471725)
    If 1 Line of Code = 1 Library of Congress, you should acquaint yourself with the Enter key.
  • Re:Lines of Code (Score:1, Insightful)

    by Anonymous Coward on Wednesday October 22, 2008 @02:21PM (#25471821)

    If you were a kernel developer, you'd know that yes, people are consolidating code all the time, to reduce LoC.

    Dropping old drivers happens, too, but at a much more sedate pace, since unlike Winblows, Linux still supports 20-year-old devices.

    But the thing is modular, so chances are you are never compiling more than half those LoCs, and usually a lot less than that...

  • by earlymon ( 1116185 ) on Wednesday October 22, 2008 @02:34PM (#25472033) Homepage Journal

    I'm a developer and was wondering what kind of testing is done to verify the code.

    Guinea pigs. Millions of us.

  • by cthulhuology ( 746986 ) on Wednesday October 22, 2008 @02:38PM (#25472097) Homepage
    This only proves that the Linux kernel is in need of a significant refactoring effort. The capacity for any single developer to understand or even read a significant portion of this code is nil. As a result, the opportunity to reduce duplication of effort is quickly diminishing, and the ability of new contributors to add anything other than additional bloat is diminishing with it.

    And while the core of the kernel may be "small", and much of this code deals with special cases for specific hardware, the sheer volume involved makes it increasingly difficult to identify what is a substantial difference between two drivers and what is merely stylistic. Rising LOC counts are a sure sign of under-analysis and over-reliance on the availability of cheap labor.

    You could pick any arbitrary number of lines (less than, say, 20k) as the number the kernel should occupy. Since an individual line may define a new abstraction, LOC represents the potential for a geometric increase in complexity. So either these 6-10 million lines of code represent some truly staggering level of irreducible complexity (most unlikely), or they are merely the result of not refactoring the code sufficiently (most likely). This really is a milestone in gratuitous complexification that should be mourned, not hailed.
  • "Actual" code? (Score:5, Insightful)

    by TuringTest ( 533084 ) on Wednesday October 22, 2008 @02:51PM (#25472295) Journal

    Comments are also code.

    If you only count as code what can be fed to the machine, you should look at the size of the compiled binary. Source code is meant to be read by *humans*, so comments do count. That's why the GPL requires them to be left in the files (the "preferred form" for editing); otherwise it wouldn't be source code.

  • by Abreu ( 173023 ) on Wednesday October 22, 2008 @03:25PM (#25472789)

    ...still, we should think about adding Asimov's three laws before we reach such an event horizon, no?

  • by RAMMS+EIN ( 578166 ) on Wednesday October 22, 2008 @03:49PM (#25473203) Homepage Journal

    ``Unfortunately, as you approach the limit, the performance must drop as you've now abstracted so far that your code becomes essentially a virtual machine on which your data runs.''

    I don't see that. Not all abstraction makes things slower. In many cases, abstraction lets you write code at a higher level, while still compiling down to the code you would have written if working at a lower level.
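
    A concrete example of abstraction with no run-time cost: with optimization on, a compiler will typically emit identical code for both loops below, which you can verify by comparing the output of cc -O2 -S (the function names are invented for this example):

        #include <stddef.h>

        /* A tiny abstraction: reads as intent at the call site. */
        static inline int square(int x) { return x * x; }

        long sum_squares_abstract(const int *a, size_t n)
        {
            long s = 0;
            for (size_t i = 0; i < n; i++)
                s += square(a[i]);   /* inlined: no call overhead */
            return s;
        }

        long sum_squares_manual(const int *a, size_t n)
        {
            long s = 0;
            for (size_t i = 0; i < n; i++)
                s += a[i] * a[i];    /* hand-written "low-level" version */
            return s;
        }
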

  • by Anonymous Coward on Wednesday October 22, 2008 @03:56PM (#25473315)

    This almost makes one think that you take LOC as an indicator of complexity, which is simply ridiculous. If you consider that the majority of that code tends to be in drivers and architecture code, the complexity argument goes out the window. Specific drivers and architectures are only of interest to certain people, and the vast majority of users are never going to interact with anything outside of their narrow window. While it is true that no one really understands the kernel in its entirety, it has been that way for well over a decade or so, yet things still seem to be making forward progress somehow. There is more to be said for having a strict hierarchy of subsystem maintainers and the associated trust metric for merging up, but this is the fundamental methodology that permits the system to scale so effectively.

    The other thing your rant overlooks is that there is no need for any one person to understand the entire system, even at the core level. The kernel is much more of a social atmosphere built around trust and interpersonal interaction. When various VM issues are encountered, all of the usual folks working in that area are CCed and left to work it out, etc, etc. It is much more an issue of knowing who to defer to in order to see results than it is someone at the top having a hand in everything. Subsystem maintainers are usually in their positions because they understand their problem space and work in it on a daily basis. Trying to work around them or displace them undermines the entire process.

    The barrier for entry has gone up over time, but drivers (where people usually start) are still a well-documented area, and one where a lot of resources and help exist to get one going. It is also arguable that writing a driver for the current kernel is orders of magnitude simpler than it was in the pre-2.6 days, prior to the introduction of the driver model, when interfacing was much more ad hoc. While there is a lot of inherent complexity in the driver model, the vast majority of that is stuff a driver developer simply doesn't need to care about.

    Rather than whining about LOC, perhaps you can point to a few specific cases of why you believe the current system is error prone, since it seems to be working just fine for the rest of us.

  • Re:Lines of Code (Score:1, Insightful)

    by Anonymous Coward on Wednesday October 22, 2008 @04:07PM (#25473461)

    maybe somebody should be working to pare it down some?

    Personally, I'd much rather have a functional OS that, for instance, has drivers for whatever I connect to it.

  • by DrVxD ( 184537 ) on Wednesday October 22, 2008 @04:29PM (#25473797) Homepage Journal

    i believe a more appropriate measure of the 'bloat' (i.e. useless functions) or the size of any software package is through function point analysis

    I recall many years ago, a PHB (this is long enough ago that nobody called them that yet) was talking about developer productivity metrics; he announced that the powers that be were considering either KLoC or Function Points. The guy sitting next to me said "I have no idea what function points are, but they've got to be better than KLoC". The remark made one of those wonderful whooshing sounds as it sailed straight over the PHB's head...

    LOC is without question one of the easiest measurements (aside from total package size in bytes, which is nearly as uninformative)

    +1 - Fundamental Law Of Physics.
    LoC's only redeeming feature as a metric of anything is that it's (relatively) easy to measure. Of course, there's the debate about "do we count comments", "do we count whitespace", "how do we count curly braces" - so it turns out it's actually NOT all that easy to measure. But don't let a PHB hear you speaking such heresy...

  • Re:Lines of Code (Score:1, Insightful)

    by Anonymous Coward on Wednesday October 22, 2008 @05:04PM (#25474391)

    I'm probably going to be marked a troll but...

    when did efficiency become outdated? Not every system is for the home PC either.

  • Re:"Actual" code? (Score:3, Insightful)

    by bonch ( 38532 ) on Wednesday October 22, 2008 @05:46PM (#25475049)

    I don't really care much about theoretical programming paradigms. "Code" refers to the instruction statements written in a programming language for a compiler to interpret, not the comments written off to the side that the compiler ignores.

  • Re:Lines of Code (Score:3, Insightful)

    by Just Some Guy ( 3352 ) <kirk+slashdot@strauser.com> on Thursday October 23, 2008 @11:47AM (#25482821) Homepage Journal

    No, but a modern PC running Windows uses 1000 times more RAM than GEOS on a Commodore 64 and doesn't really do anything extra. The OS needs to go on a diet.

    GEOS supported thousands of printers, hundreds of hard drive adapters, hundreds of video cards, streaming network video, 3d gaming, virtual memory, several CPU vendors, hundreds of mice, and all that in 20KB of memory? Impressive!

    Less sarcastic answer: modern computers do a whole awful lot more than GEOS did.
