The Linux Kernel Has Grown By 225,000 Lines of Code This Year, With Contributions From About 3,300 Developers

The Linux Kernel Has Grown By 225,000 Lines of Code This Year, With Contributions From About 3,300 Developers (phoronix.com) 88

Posted by msmash on Sunday September 16, 2018 @01:03PM from the where-we-are dept.

Here's an analysis of the Linux kernel repository that attempts to find some fresh numbers on the current kernel development trends. The author writes: The kernel repository is at 782,487 commits in total from around 19,009 different authors. The repository is made up of 61,725 files and from there around 25,584,633 lines -- keep in mind there is also documentation, Kconfig build files, various helpers/utilities, etc. So far this year there have been 49,647 commits that added 2,229,836 lines of code while dropping 2,004,759 lines of code, for a net gain of just 225,077 lines. Keep in mind there was the removal of some old CPU architectures and other code removed in kernels this year, so while a lot of new functionality was added, thanks to some cleaning, the kernel didn't bloat up as much as one might have otherwise expected. In 2017 there were 80,603 commits with 3,911,061 additions and 1,385,507 deletions. With just over one quarter to go, 2018 might come in lower than the two previous years on both commit and line counts.

Linus Torvalds remains the most frequent committer at just over 3% while the other top contributors to the kernel this year are the usual suspects: David S. Miller, Arnd Bergmann, Colin Ian King, Chris Wilson, and Christoph Hellwig. So far in 2018 there were commits from 3,320 different email addresses. This is actually significantly lower than in previous years.
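
The net figures follow directly from the quoted additions and deletions. A minimal Python sketch of that arithmetic, using only the numbers from the summary (the dictionary layout and the function name `net_growth` are illustrative, not taken from the article):

```python
# Figures quoted in the summary; the data layout is illustrative.
STATS = {
    "2018 (to date)": {"commits": 49_647, "added": 2_229_836, "removed": 2_004_759},
    "2017": {"commits": 80_603, "added": 3_911_061, "removed": 1_385_507},
}


def net_growth(added: int, removed: int) -> int:
    """Net change in line count: lines added minus lines removed."""
    return added - removed


for year, s in STATS.items():
    net = net_growth(s["added"], s["removed"])
    print(f"{year}: {s['commits']:,} commits, net {net:+,} lines")
```

On these figures 2018 nets 225,077 lines so far, against 2,525,554 for all of 2017.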


doesn't seem surprising (Score:2)

by thePsychologist ( 1062886 ) writes:

This doesn't seem surprising, given that new architecture comes out all the time. What's great is that this amazing piece of work is still FLOSS, powering my Macbook as I type this comment.
Lines of code (Score:3)

by SCVonSteroids ( 2816091 ) writes: on Sunday September 16, 2018 @01:26PM (#57323744)

Why do we measure in lines of code? Serious question.

- Re: (Score:2)
  
  by Archtech ( 159117 ) writes:
  
  What's a good alternative? Serious answer.
  Function points?
  At least LOC gives us a rough estimate of how much work has been done writing source code. Of course it may be bad code, or completely wasted, and of course it must be calibrated by the language used.
  The problem IMHO is not the use of LOC but the foolish assumptions some people make based on that metric.
  - Re: (Score:2)
    
    by Zero__Kelvin ( 151819 ) writes:
    
    The question isn't Why measure in Lines of Code, but why not put those numbers in a context. It is necessary to understand how LOC increase does, and more importantly perhaps, doesn't increase overall complexity, vulnerability / attack surface, etc. I don't think there is any intention to deceive here ... I just think the author forgot how much they know that the target audience doesn't.
  - Re: (Score:2)
    
    by arth1 ( 260657 ) writes:
    
    What's a good alternative? Serious answer.
    Function points?
    At least LOC gives us a rough estimate of how much work has been done writing source code.
    Z factor is a better one - the size of the code after being fed through compression (like compress, giving .Z files). Large amounts of extra spacing or unnecessary line breaks won't be factored in, while code originality gives a higher score than copy/paste jobs.
  - Re: Lines of code (Score:5, Informative)
    
    by jd ( 1658 ) writes: <imipak@ y a hoo.com> on Sunday September 16, 2018 @04:09PM (#57324334) Homepage Journal
    
    Lines of code tells you how much work was put in.
    The ratio of lines of code to code blocks tells you how maintainable the code is.
    Defect density tells you the quality of the code.
    A triple of these would give you a reasonable analysis.
    
- Re: (Score:1)
  
  by Anonymous Coward writes:
  
  Because we are lacking better ways to measure code.
  Sure, lines of code doesn't have to be a good thing and neither does binary size, but those are easy to quantify.
  Would you rather that we used electro-psychometry to measure it?
- Re:Lines of code (Score:5, Insightful)
  
  by ShanghaiBill ( 739463 ) writes: on Sunday September 16, 2018 @01:43PM (#57323800)
  
  Why do we measure in lines of code? Serious question.
  LOC is an important metric, for quantifying both progress and complexity.
  The mistake is assuming that more is better.
  
  - Re:Lines of code (Score:4, Insightful)
    
    by fahrbot-bot ( 874524 ) writes: on Sunday September 16, 2018 @02:23PM (#57323940)
    
    Why do we measure in lines of code? Serious question.
    LOC is an important metric, for quantifying both progress and complexity.
    And yet, LOC doesn't necessarily quantify either progress or complexity.
    
  - Re: (Score:2)
    
    by gTsiros ( 205624 ) writes:
    
    considering that the best code is the one you don't write, i'd say LOC is a horrible, horrible metric.
    at work, our main product is roughly 2 million LOC (not counting blank lines, lines with only comments, preprocessor statements...)
    considering that so far any class i've tried to simplify was down by *at least* 95% (not exaggerating, in some cases i straight up *removed* classes because they were duplicated), i'd say the actual code is less than 100k. Possibly on the order of 10k.
- Re: (Score:2)
  
  by Kjella ( 173770 ) writes:
  
  Why do we measure in lines of code? Serious question.
  Lack of a better metric? Though at this level of abstraction I think classifying a project as small = 1 kLOC, medium = 10 kLOC, big = 100 kLOC and huge = 1000 kLOC project works just fine. I would think the number of maintainers you need scales pretty linearly with LOC. But it doesn't mean it's a useful measure of productivity....
- Re: (Score:2)
  
  by QuietLagoon ( 813062 ) writes:
  
  ... Why do we measure in lines of code? Serious question. ...
  
  Some metrics have as their base the number of lines of code. In and of itself, LoC is pretty meaningless. Other views which use LoC, however, can be quite useful.
- Re: (Score:2)
  
  by AlanObject ( 3603453 ) writes:
  
  Why do we measure in lines of code? Serious question.
  I would like to know a decent alternative. For the past 20 years whenever I was involved in a contractual transfer of intellectual property of software I would invariably get asked "how many lines of code?"
  Generally the question is not too hard to answer as long as you don't get too picky. After all we are talking about a "find" command piped into "wc" on various checked out directory trees. Who knows what percentage of that is actual compile-active source code versus everything else. The legal and ac
- Re: (Score:2)
  
  by AHuxley ( 892839 ) writes:
  
  The idea was for a small amount of code to do DOTADIW, or "Do One Thing and Do It Well."
  That would see a lot of really well understood code working hard to make a really great OS.
- Re: (Score:2)
  
  by Pseudonym ( 62607 ) writes:
  
  Because that's what diff measures.
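
Two of the measurements suggested in this thread, a raw line count (the "find" piped into "wc" approach) and the compressed-size "Z factor", can be compared directly. The sketch below is only an illustration under stated assumptions: it uses Python's `zlib` (DEFLATE) as a stand-in for `compress`, which uses a different algorithm, and the three C-like sample sources are made up for the comparison.

```python
import zlib


def raw_lines(source: str) -> int:
    """Count every line, blank or not, much as `wc -l` would."""
    return len(source.splitlines())


def compressed_size(source: str) -> int:
    """Size in bytes after zlib compression (a stand-in for compress)."""
    return len(zlib.compress(source.encode("utf-8"), 9))


# Hypothetical sample sources, invented for this comparison.
original = "\n".join(
    f"int f{i}(int x) {{ return x * {i} + {i % 7}; }}" for i in range(200)
)
padded = original.replace("\n", "\n\n\n")  # same code, extra blank lines
pasted = "\n".join(["int f(int x) { return x + 1; }"] * 200)  # copy/paste job

for name, src in [("original", original), ("padded", padded), ("pasted", pasted)]:
    print(f"{name:8s} lines={raw_lines(src):4d} compressed={compressed_size(src)} bytes")
```

The padded sample roughly triples the line count while barely changing the compressed size, and the copy/paste sample keeps the line count while compressing to a small fraction of the original, which is the behaviour the Z factor comment describes.
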
LOC != kernel bloat (Score:5, Informative)

by Zero__Kelvin ( 151819 ) writes: on Sunday September 16, 2018 @01:26PM (#57323746) Homepage

An important caveat here is that an increase in LOC count does not mean a linear increase in loaded kernel memory usage. For example, for every new driver each line of code is counted, but that driver may or may not be compiled in or loaded as a module. If a driver for wireless card X is 1200 lines of code, but your system doesn't have that card and it was compiled as a module, then none of the machine code generated from those added lines gets loaded at runtime.

There are more than 1000 .config options, and over 30 supported hardware architectures, so your code mileage *will* vary.
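
A toy model of the point above, assuming the usual y/m/n build choices for a driver. The driver names other than "wireless card X" and all line counts other than its 1200 are hypothetical, and `resident_loc` is an illustrative helper, not a kernel tool:

```python
from dataclasses import dataclass
from enum import Enum


class Build(Enum):
    DISABLED = "n"  # not compiled at all
    BUILTIN = "y"   # linked into the kernel image
    MODULE = "m"    # compiled separately, loaded on demand


@dataclass
class Driver:
    name: str
    loc: int
    build: Build
    loaded: bool = False  # only meaningful for modules


def resident_loc(drivers):
    """Source lines whose generated machine code is in memory at runtime."""
    return sum(
        d.loc
        for d in drivers
        if d.build is Build.BUILTIN or (d.build is Build.MODULE and d.loaded)
    )


# Hypothetical configuration for one machine.
drivers = [
    Driver("core code", 5000, Build.BUILTIN),
    Driver("wireless card X", 1200, Build.MODULE, loaded=False),
    Driver("ethernet card Y", 800, Build.MODULE, loaded=True),
    Driver("legacy bus Z", 3000, Build.DISABLED),
]

total = sum(d.loc for d in drivers)
print(f"{resident_loc(drivers)} of {total} source lines are resident")
```

In this made-up configuration the 1200 lines for wireless card X count toward the repository total but contribute nothing to what is loaded.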

- Re: LOC != kernel bloat. FALSE!! (Score:1)
  
  by Anonymous Coward writes:
  
  The kernel is bloat. Most of the new code should be outside and independent of the kernel. Look at ZFS: it is outside of the kernel and proves that NO file system code is needed in the kernel except for the boot file system, so the kernel can load the first needed modules, like a FS, and the rest will follow.
  Break the kernel up and we can get to the point of truly replaceable parts and modules. Even having 2 or more of the "same" FS so you do not have to do Big Bang upgrades.
  - Re: (Score:2)
    
    by Zero__Kelvin ( 151819 ) writes:
    
    Congratulations ... You win the SFC award!
- Re: (Score:2)
  
  by deathguppie ( 768263 ) writes:
  
  the Linux kernel is a mature piece of code. The amount of changes to core architecture should be limited to planned events and take years to mature to the point of inclusion. What we are looking at is just the amount of interest in making sure that the Linux kernel is being maintained at a prodigious rate so that we can wake up knowing that our bugs are being fixed and our security patches are on time. If there are people out there thinking that any amount of "lines of code" are some kind of development
  - Re: (Score:2)
    
    by Zero__Kelvin ( 151819 ) writes:
    
    Absolutely, my brother!
Linux jumped the shark years ago (Score:1)

by Anonymous Coward writes:

Never had the critical mass to win over desktop users (real workstation desktops, not hobbyists). Now it's just a bunch of change for the sake of change, plus a ton of meltdown/spectre bloat.
It's too bad, it could have been a contender had the chips fallen their way and had they made a few critical decisions differently. Now with the systemd malware being stuffed down everyone's throat, it's assuredly game over.
Embedded = Fubar (Score:1)

by Anonymous Coward writes:

If/When you look at the linux embedded scene, you'll notice that there is way too much duplication of functionality going on.
Then you have to look at the linux embedded kernel forks which haven't been integrated into the mainstream kernel, and you find even more duplication going on.
It is horrible, and it is getting worse every day.
Also, the newly integrated Cryptocoin miner for the x86 SMM EC is a real LOC hog, but that's what you get when you need to hide its functionality.
All in C? (Score:1)

by Anonymous Coward writes:

Is all written in C by men? That's horrible!
How many bugs in, say, 10,000 lines of code? (Score:4, Interesting)

by QuietLagoon ( 813062 ) writes: on Sunday September 16, 2018 @02:11PM (#57323886)

What's the rate of bug occurrence per 10k LoC in the Linux kernel? I'm less concerned about additional kernel bloat than I am about additional kernel bugs.

- Re: (Score:3, Informative)
  
  by Anonymous Coward writes:
  
  For example, this: https://scan.coverity.com/proj... [coverity.com]
  They state 0.45 defects/kLOC. Of course, they won't "find" all defects... and there might be some false positives in there. But you get the ballpark.
  Use your favourite search engine (hopefully not Google and its ilk).
  Kids, these days. When I was young, I queried Altavista with telnet. Tsk, tsk.
  - Re: (Score:2)
    
    by phantomfive ( 622387 ) writes:
    
    They state 0.45 defects/kLOC. Of course, they won't "find" all defects...
    They won't find most defects.
    But you get the ballpark
    Not really because the kernel developers know how to avoid the kind of bugs Coverity scans for (Coverity has been haranguing them over it for nearly two decades now).
- Re: (Score:2)
  
  by Lost Race ( 681080 ) writes:
  
  If kernel code is like most code, and it probably is, there is about 1 bug per line of code. So 10,000 or so.
- Re: (Score:2)
  
  by cyberchondriac ( 456626 ) writes:
  
  Latest release name for Redhat AES8: Bloaty McBloatface.
  Just poking fun. Linux is still quite lean compared to most OSes.
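
For a rough sense of scale, the defect density quoted in this thread can be applied to the line counts from the summary. This is back-of-the-envelope arithmetic only: it assumes the 0.45 defects/kLOC figure applies uniformly, and the repository total includes documentation and build files, so the result is an illustration rather than a real bug count.

```python
DEFECTS_PER_KLOC = 0.45   # Coverity figure quoted in the thread
TOTAL_LINES = 25_584_633  # repository size from the summary (includes non-code files)


def expected_defects(lines: int, defects_per_kloc: float) -> float:
    """Scale a defects-per-thousand-lines density to a given line count."""
    return lines / 1000 * defects_per_kloc


per_10k = expected_defects(10_000, DEFECTS_PER_KLOC)
whole_repo = expected_defects(TOTAL_LINES, DEFECTS_PER_KLOC)
print(f"per 10,000 lines: {per_10k:.1f}")
print(f"whole repository: {whole_repo:,.0f}")
```

At that density, 10,000 lines would carry about 4.5 detected defects, and the whole repository roughly 11,500, subject to the caveats raised above about what such scans do and do not find.
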
Kernel code or drivers? (Score:3)

by Gravis Zero ( 934156 ) writes: on Sunday September 16, 2018 @05:12PM (#57324552)

What really matters here is whether this is code that directly affects the functionality of all kernels or code for specific kernel drivers. Last I read, the kernel core was like 2M LOC while kernel drivers made up 31M LOC.

So, these contributors.... (Score:2)

by dwywit ( 1109409 ) writes:

is Sievers still banned?
Uncontrollable Bloat (Score:3)

by aberglas ( 991072 ) writes: on Sunday September 16, 2018 @08:12PM (#57325076)

If those statistics are really true, over 200,000 NEW lines in one year in an existing, complex system is as unmanageable as 3,000 contributors. As products age, the rate of additions goes down as things have to be integrated into a complex system.
Sure, Linux is not actually as monolithic as described. But a bug in any one of those lines could bring down the whole kernel.
It is a credit to the skill of the maintainers that they can make this work. And a debit that they try.

- Re: (Score:2)
  
  by fibonacci8 ( 260615 ) writes:
  
  You're one of those people that compiles kernels with every available option enabled, regardless of the use case, aren't you?
- Re: (Score:2)
  
  by phantomfive ( 622387 ) writes:
  
  But a bug in any one of those lines could bring down the whole kernel.
  Most of it is drivers, and most of those drivers are for devices not running on your computer, so if there is a bug, it will be in a code path that is impossible to reach on your system (are you using JFS?). The core kernel is a lot smaller.
  - - Re: (Score:2)
      
      by phantomfive ( 622387 ) writes:
      
      It really doesn't beg any question, you are not writing clearly. If you want to know why little-used drivers are in the kernel, the answer is that even if only .05% of the users actually need it, and someone is willing to maintain it, then that is a good enough reason to put it in. The kernel is well architected so adding another driver won't affect the quality of the other parts of the code.
5 commits per day (Score:2)

by mapkinase ( 958129 ) writes:

Linus Torvalds makes about 5 commits per day
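
A quick check of that estimate against the summary's figures. The sketch assumes a share of exactly 3% (the summary only says "just over 3%") and counts calendar days from January 1 through the posting date, September 16, 2018, inclusive:

```python
from datetime import date

COMMITS_2018 = 49_647  # commits so far this year, from the summary
LINUS_SHARE = 0.03     # assumption: the summary says "just over 3%"

# Calendar days from January 1 through the posting date, inclusive.
days = (date(2018, 9, 16) - date(2018, 1, 1)).days + 1
per_day = COMMITS_2018 * LINUS_SHARE / days

print(f"{days} days, about {per_day:.1f} commits per day")
```

Under those assumptions the rate comes out between 5 and 6 commits per calendar day.
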
Yup. Moving to L4 (Score:2)

by meburke ( 736645 ) writes:

As a person who's been doing UNIX since 1984, ATT SVR3.2 was my favorite, although I've used other variants. I'm tired of Linux crap. I'm tired of systemd and the systemd wars. I'm tired of having to learn nuanced differences in various distros just to do basic, common, tasks. I'm tired of package repositories that suck when it comes to good maintenance (although they are still better than rpm hell). I'm tired of half-baked security measures that are badly designed and beyond human understanding. (IMO, admi
- - Re: (Score:2)
    
    by Bengie ( 1121981 ) writes:
    
    FreeBSD is still maintained by many of the original AT&T Unix, BSD, and Solaris engineers. Some have taken more high level project management positions, but quite a few are still in the trenches writing code.
Line counting (Score:1)

by peretto ( 2541640 ) writes:

And how many lines did they remove? :)
- Re: (Score:2)
  
  by sabbede ( 2678435 ) writes:
  
  2,004,759
Totally massive fail!!! (Score:2)

by LostMyBeaver ( 1226054 ) writes:

25 million lines of code is inexcusable at any level.

The amount of code required to make Linux work is not even a million. Let's assume you can get every feature of interest into a LOT less than that and then depend on modules for everything else.

Consider that the Linux Kernel as it stands today is one massive repository of trash on trash on trash on something less trashy.

Linus made Linux and he made Git and Git has things like submodules and there are things like GVFS as well in some environments. Why not s
So, they're counting documentation? (Score:2)

by sabbede ( 2678435 ) writes:

The summary seems to indicate that part of those changes are in documentation, which is not code.
