Intel Sees a 3888.9% Performance Improvement in the Linux Kernel - From One Line of Code (phoronix.com)
An anonymous reader shared this report from Phoronix:
Intel's Linux kernel test robot has reported a 3888.9% performance improvement in the mainline Linux kernel as of this past week...
Intel thankfully has the resources to maintain this automated service for per-kernel commit/patch testing and has been maintaining their public kernel test robot for years now to help catch performance changes both positive and negative to the Linux kernel code. The commit in question causing this massive uplift to performance is mm, mmap: limit THP alignment of anonymous mappings to PMD-aligned sizes. The patch message confirms it will fix some prior performance regressions and deliver some major uplift in specialized cases...
That mmap patch merged last week affects just one line of code.
This week the Register also reported that Linus Torvalds revised a previously-submitted security tweak that addressed Spectre and Meltdown security holes, writing in his commit message that "The kernel test robot reports a 2.6 percent improvement in the per_thread_ops benchmark."
Re:riiiiiight (Score:5, Interesting)
"Intel's Linux kernel test robot has reported a 3888.9% performance improvement in the mainline Linux kernel as of this past week..."
Or, in normal language everyone can immediately understand, a 38-odd times improvement. Why the insane urge to express everything in percentages? It's almost as bad as using microfortnights instead of seconds.
Re: (Score:3)
Why the insane urge to express everything in percentages?
To make the number bigger, obviously. People don't understand basic math, so the framing matters. For example:
"Sure, they say it's 3800% faster, but it's really just a 38x improvement."
People also don't understand how taxes work. The combination can be expensive: "I can give you a 2% raise, but it'll put you just over the line into the next tax bracket. It's up to you."
Re: (Score:3)
People also don't understand how taxes work. The combination can be expensive: "I can give you a 2% raise, but it'll put you just over the line into the next tax bracket. It's up to you."
I'm always amazed by this, because all semi-sane tax systems have marginal tax brackets, i.e. the higher tax rate only applies to the part of the income that falls within the new bracket.
Re:riiiiiight (Score:4, Insightful)
More than a few conservative anti-tax politicians and news organizations like to misrepresent how a graduated income tax works in order to keep the rubes fearful. If all you do is look up your tax owed in a table or have someone else prepare it for you, you may not understand how the formula works.
Re: (Score:2)
If your income goes up in one area but nothing else changes, you will still be making more money after taxes. There are some contrived exceptions, but they won't apply to your general wage/salary worker. Did you go from the 32% to the 35% tax bracket? If yes, then congratulations on your raise; you are keeping more after-tax money. I think where some people get confused is because most taxpayers aren't calculating all these graduated rates by hand, either you look up the end result in a table (if income les
Re:riiiiiight (Score:4, Insightful)
It's also not a 38x increase in overall kernel performance; it only applies in a particular case. It fixes an older patch that caused a 6x slowdown in that particular case. Your average user won't notice any changes, your power users probably won't either, and mostly only the automated performance regression tests were seeing changes.
Re: (Score:2)
38-odd times improvement
Actually a 39.889 times improvement, so "40-odd" is much more accurate than "38-odd".
Re: (Score:2)
38-odd times improvement
Actually a 39.889 times improvement, so "40-odd" is much more accurate than "38-odd".
The number cited was 3888.9% , not 3988.9%.
Re: (Score:2)
Yes, but that was the percentage increase. Notice how a 100% increase is a 2 times improvement.
Re: (Score:2)
Or, in normal language everyone can immediately understand
This is technical news. Not everyone will immediately understand it anyway, and all of the people who are capable of understanding it are capable of understanding a percentage improvement. There is absolutely no value to making sure the layman can understand the performance increase when they cannot understand when it will apply.
Re: riiiiiight (Score:5, Interesting)
Or a really simple optimisation that was non-obvious. Aligning things in memory can have an *enormous* performance impact. Finding a place where things weren't aligned and making sure they are now is very much the kind of thing I'd expect to be a massive win. I used to work at one of the major OS vendors, and this absolutely is the kind of thing we'd on occasion find, completely legitimately.
Re: riiiiiight (Score:2)
Automatic struct optimization isn't a thing in C yet?
Re: (Score:3)
One reason C doesn't reorder struct fields is that it's forbidden by the standard.
In any case, to make it happen optimally you'd need to know the access patterns, but C compilation units are typically compiled separately, so you wouldn't know what the optimal order would be. And how do you automatically determine the best order for fields in the first place, when it's related to the access patterns of such fields? The problem becomes more intractable if the fields are part of the provided API.
Something like profi
Re: riiiiiight (Score:2)
Rust reorders the structs to optimize on alignment boundaries and minimize cache misses. You can override this for e.g. FFI using repr.
Re: (Score:3)
You'd have to turn this off if Rust were used in a systems-level program, or if the data was shared with non-Rust routines or programs, or there were ABI or API requirements, etc. I don't know enough Rust to know if they have "external" structs for this purpose, but I've seen some obscure languages use this.
Re: (Score:2)
You'd use #[repr(C)] or a crate like PacketStruct to deal with that.
Re: (Score:2)
Good to know. I do write Rust but never considered that. From the (safe) programmer's point of view that's invisible, so Rust is permitted to do it.
Alignment-based field reordering is a good feature, but how about dealing with false sharing? It would be very wasteful to align everything to the cache-line size, and you can't know from the struct definition itself how the fields should best be ordered.
Re: riiiiiight (Score:4, Informative)
When you're in kernel space, things often have to be aligned to match hardware requirements. You don't want the compiler re-organizing the fields in your page table or your GPU command list.
You also don't want it re-organizing things when you're writing code that gets called from other languages.
Most of the use cases for C nowadays are cases where you really need things to be exactly how you specified.
Re: riiiiiight (Score:4, Insightful)
No, and it's forbidden! The reason being that C is a low-level language and the layout of a struct is very often vital to the proper working of the code, because the data is shared with other routines possibly built with a different compiler, or the struct is shared with other machines, etc. Languages have an ABI (application binary interface) which states how data is laid out, what registers are used, etc., so that they can all interoperate. If a compiler could just change the rules, it would not be complying with the ABI.
OS (Score:1)
Re: (Score:2)
Or, more specifically to this optimisation, align things with respect to the data the benchmark uses. The summary doesn't go into the 6x slower performance in other scenarios.
It's a trade off between aligned memory and fragmented memory.
Misleading (Score:5, Informative)
This change has been shown to regress some workloads significantly.
One reports regressions in various spec benchmarks, with up to 600% slowdown.
Re:Misleading (Score:5, Insightful)
Also, no mention of AMD. Is anybody savvy enough to tell if this should apply to AMD as well?
Re:Misleading (Score:5, Interesting)
The misalignment caused the 600% slowdown in benchmarks, according to the commit message.
This fixes that by skipping this codepath on unaligned requests.
What's weird is that Linus reverted the problematic change two years ago for the same reason, but the committer put it back a few weeks later.
https://git.kernel.org/pub/scm... [kernel.org]
To me this looks like a partial solution with room for doing it right.
Re: (Score:2)
Re: Misleading (Score:2)
So asking: improvements in things like threading while things like pixel mapping take a hit? No big deal, I think.
"Performance improvement" (Score:3)
Way too much dramatization. No way it's a 38x overall performance improvement, that would defy all logic and something like that would hit mainstream news. So next time you want people to swallow BS, use a much smaller number like say 3.8%.
Re: (Score:1)
That's just your ignorance talking. The 38x improvement was in a very specific case (it's even implied as such in TFS), and yes, a buggy implementation of code can definitely screw up your results like this, and in some cases even worse.
Just because you're afraid of big numbers or don't understand what is going on doesn't mean it isn't actually real.
Sounds like they are trying (Score:2)
Re: (Score:3)
It's not an Intel specific change. This impacts all platforms.
Re: (Score:1, Flamebait)
Not really. And Intel is dead. They will just take a while to die.
Re:Sounds like they are trying (Score:4, Interesting)
Intel is not dead (cue the shovel to the head). They do need to get back to their knitting.
Remember Apple was pronounced dead once, and so was AMD.
On the other hand GE did die due to thinking they were a bank.
So, the question is should Intel concentrate on the super chips that have bragging potential but a market of 0.1% of users, or for the power efficient desktop/laptop market?
I ask because looking at the M4 I realized I can't really put my M1 Air to full load. My Linux box has about the same performance, twice the RAM, six times the storage, and five times the USB ports. Apple's desktops are not a good fit for me.
Re: (Score:2)
Intel is dead. All they ever had was superior manufacturing, which came from their _memory_ business. That is over. Their CPUs always sucked in one way or another, and they have not managed any innovation for a long time now. And recently they try to pull stunts like jeopardizing CPU reliability to boost performance. That has the stink of desperation.
Re: (Score:2)
"Intel is dead" - actually far from it.
There's still a huge amount of talent at Intel, and their single-core performance is still exceptionally good, if not on equal footing with AMD [techspot.com].
Besides, their business isn't limited to CPU production.
They just need better management, ideally someone who isn't guzzling millions for his own sake [reuters.com].
Re: (Score:2)
There's still a huge amount of talent at Intel, and their single-core performance is still exceptionally good, if not on equal footing to AMD.
If you're not #1, you're #2.
AMD now has the fastest processor in every segment and is outselling Intel in the datacenter.
Intel had better get their shit together fast or they really will die.
Re: (Score:2)
Indeed. And the thing is, AMD did this from a position of weakness. That shows how truly fucked Intel is. Intel may survive, but from available history, it is very unlikely they will ever be #1 again.
Add to that that ARM is becoming more and more of a thing. Intel is inactive in that space or late to the game. AMD can deliver things like mixed AMD64/ARM chiplets.
Read the fine print. (Score:4, Interesting)
The patch message confirms it will fix some prior performance regressions and deliver some major uplift in specialized cases...
If you read the THP documentation [kernel.org] you'll learn that "THP only works for anonymous memory mappings and tmpfs/shmem", which means that unless you're using anonymous mappings, tmpfs, or shared memory with a request in excess of PMD_SIZE bytes (2 MiB on my system), this has no impact.
It seems unlikely that many programs will see much difference in performance but it's always nice to see improvements added to the kernel.
Re: (Score:2)
tmpfs used to be default for some systems, I think it was mostly reverted because of the stuff that for some reason thinks it's smart to unpack a massive archive there, like Nvidia driver runfiles. (They also don't respect the TMPDIR variable so you have to add the flag --tmpdir=$TMPDIR to get them to act like everyone else's software. The command line option is literally named after the environment variable used in Unix[likes] for decades! Just support the fucking variable!)
*ahem*
I am using tmpfs because i
Re: (Score:2)
I think anything writing files of substance (more than 100 MiB) should check that the destination has enough storage space (and permissions) before any writing is attempted. It seems logical enough that POSIX should have a function dedicated to this practice. I'm a bit surprised that nobody has tried to make it standard practice.
Re: (Score:2)
Yes, on the one hand I am glad that nvidia provides a runfile driver so I don't have to mess with their repo; on the other hand I wish they were more competent about it; and on the gripping hand, why does their driver install conflict with Multiarch for other things so that I have to use the runfile?
Since I am all-Linux now, my next GPU will probably be from AMD, and it's largely for reasons like this (but also price-motivated, ofc.)
Back on topic, it is pretty surprising that you don't specify how big a file
That's just f'ing great... (Score:5, Funny)
Now all my crappy code will crash much faster. ;)
Re: (Score:2)
Hey, the faster it crashes, the faster a fix may come.*
*excluding Microsoft software which relies on crashes
Impacts all CPUs (Score:4, Interesting)
This isn't an Intel specific change or even a x86_64 specific change, this impacts every Linux platform. The only reason "Intel Sees..." is in there is because they are the ones doing regression testing for the kernel.
Let me guess (Score:2)
It really does not matter how many lines of code (Score:3)
That is maybe a curiosity, but irrelevant. Also, 4000%? Most of that will _not_ map to general performance.
All that is to see here is pretty stupid reporting.
Re: (Score:2)
All that is to see here is pretty stupid reporting.
I see only stupid people failing to notice that TFS doesn't say it's a general performance improvement. In fact, it literally uses the phrase "uplift in specialized cases".
Re: (Score:2)
Yeah, there's a whole lot of skeptics on Slashdot basing their skepticism on, "That's a really big number. Can't be right." I will remind them that there is a big difference between skepticism and doubt. Skepticism demands more than a surface evaluation.
Re: (Score:1)
I throw 35 years of CS experience, a CS engineering PhD and experience with CPUs on all levels and understanding of system design into the pot. That enough for you to make it "more than a surface evaluation"?
Re: (Score:1)
You cannot keep quiet when you have nothing to say, can you? I am pointing something out. I am not making a claim. Of course, the difference is lost on you.
great (Score:5, Funny)
Impact, please. (Score:1)
Is this something that now takes microseconds instead of milliseconds for something that isn't done often, or something that takes milliseconds instead of large fractions of a second for something that's maybe done once or twice per boot on your average system?
If it were something done often enough to be noticeable by Joe Average Linux User, I'd expect it to be bigger news.
Older equipment (Score:2)
The group I volunteer with got half a truckload of older systems. A speed up in the kernel would be welcome for these systems to see new life and boldly go where no cpu has gone before.
The only problem... (Score:2)
The specialized case seeing a 38x increase in performance is the HCF instruction.
But boy does it burn HOT!
int _low_level_write(int chan, char *buf, int len) (Score:2)
{
    return 0;
}
One weird trick (Score:2)
And Linux magically runs 38x faster!
But like all of those "one weird tricks," this one line of code isn't all it's cracked up to be.