Intel Linux Hardware

Intel Thread Director Is Headed to Linux for a Major Boost in Alder Lake Performance (hothardware.com) 38

The Linux 5.18 kernel is adding support this spring for the Intel Hardware Feedback Interface to make better decisions about where to place given work among available CPU cores/threads, reports Phoronix.

This is significant because Intel's Alder Lake CPUs "are the first x86-64 processors to embrace a hybrid paradigm with two separate CPU architectures on the same die," explains Hot Hardware: These two separate CPU architectures have different strengths and capabilities. The Golden Cove "performance cores" (or P-cores) feature Intel's latest high-performance desktop CPU architecture, and they are blisteringly fast. Meanwhile, the Gracemont "efficiency cores" (or E-cores) are so small that four of them, along with 2MB of shared L2 cache, can nearly fit in the same space as a single Golden Cove core. They're slower than the Golden Cove cores, but also much more efficient, at least in theory.

The idea is that background tasks and light workloads can be run on the E-cores, saving power, while latency-sensitive and compute-intensive tasks can be run on the faster P-cores. The benefits of this may not have been exactly as clear as Intel would have liked on Windows, but they were even less visible on Linux. That's because Linux isn't aware of the unusual configuration of Alder Lake CPUs.

Well, that's changing in Linux 5.18, slated for release this spring. Linux 5.18 is bringing support for the Intel Enhanced Hardware Feedback Interface, or EHFI...

This is essentially the crux of Intel's "Thread Director," which is an intelligent, low-latency hardware-assisted scheduler.

  • by evanh ( 627108 ) on Sunday February 13, 2022 @01:50AM (#62263195)

    The effect is very clear on Linux - The P-cores run very hot.

  • Rip off of what Apple introduced in the M series and A12 CPU?
    • by storkus ( 179708 )

      rip off of what Apple introduced in the M series and A12 CPU?

      No, it's a rip-off of ARM's big.LITTLE architecture, which Apple (presumably) licensed for their M series.

      Perhaps they've come up with a better way to share the load among the cores, but otherwise I can't see the difference.

      • Don't know about the licensing, but Apple has had both performance and efficiency cores in their A series (iPhone, iPad) processors since the A10 [wikipedia.org] - so the past six generations.

        • by DamnOregonian ( 963763 ) on Sunday February 13, 2022 @03:56AM (#62263271)
          big.LITTLE existed in the ARM world 5 years before Apple implemented it.
          The first commercial implementation of it that I'm aware of is Samsung's, which predated Apple by about 3 years.
          Apple's implementation also had a limitation that Samsung chips (and Alder Lake, and current Apple silicon) didn't have: it could run either the P-cores or the E-cores at any given time, not both. This closely matched ARM's solution for fast-tracking OS support (using a hypervisor to switch between clusters transparently to the OS).
          Samsung left that decision up to the OS scheduler.
          • Apple's implementation also had a limitation that Samsung chips (and Alder Lake, and current Apple silicon) didn't have: it could run either the P-cores or the E-cores at any given time, not both. This closely matched ARM's solution for fast-tracking OS support (using a hypervisor to switch between clusters transparently to the OS).

            Yup, the first-generation A10 had that limitation, while the second-generation (the A11) did not.

            • Meaning Samsung beat them to the punch there, too.

              I'm not trying to engage in a flame war on "Who's better, Apple or Samsung", because, frankly, Apple's processors are flat out fucking better in every way.
              But they were way behind the curve in HMP processors. I interpreted (maybe erroneously) your post saying, "They've been doing this for 6 generations...", as somehow contradicting the person you replied to, who said "Apple didn't come up with this shit."

              If I was mistaken, I apologize.
    • by serviscope_minor ( 664417 ) on Sunday February 13, 2022 @04:08AM (#62263285) Journal

      My god Apple fanbois are a tedious bunch. You probably think iTunes was when Apple invented music.

    • lol.
      Nope- much more fundamental than that. This is a ripoff of what Apple introduced when they invented microprocessors.
    • Comment removed based on user account deletion
  • Simple Algorithm? (Score:4, Insightful)

    by crow ( 16139 ) on Sunday February 13, 2022 @01:51AM (#62263199) Homepage Journal

    Shouldn't this be a simple matter of tracking how often tasks yield, and using the p-cores for tasks likely to need more than a full tick of processor time? It would seem like a bit of tuning would tend to get that right without a lot of complication.
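A minimal sketch of the heuristic proposed above, assuming a fixed tick length and an exponential moving average of how long each task runs before it yields. Every name and number here (`TICK_MS`, `TaskStats`, `pick_core_type`, the 4 ms tick) is hypothetical and chosen for illustration; this is not how the Linux scheduler or Thread Director works.

```python
from dataclasses import dataclass

TICK_MS = 4.0  # assumed scheduler tick length (hypothetical value)


@dataclass
class TaskStats:
    """Running estimate of how long a task runs before it yields."""
    avg_burst_ms: float = 0.0
    samples: int = 0

    def record_burst(self, burst_ms: float, alpha: float = 0.25) -> None:
        # Exponential moving average, so recent behavior outweighs old history.
        if self.samples == 0:
            self.avg_burst_ms = burst_ms
        else:
            self.avg_burst_ms = alpha * burst_ms + (1 - alpha) * self.avg_burst_ms
        self.samples += 1


def pick_core_type(stats: TaskStats, tick_ms: float = TICK_MS) -> str:
    """Return "P" for tasks that tend to use a full tick, otherwise "E"."""
    # Assumption: a task with no history starts on an E-core.
    if stats.samples == 0:
        return "E"
    return "P" if stats.avg_burst_ms >= tick_ms else "E"


# A task that yields quickly versus one that keeps using its whole tick.
interactive = TaskStats()
for burst in (0.5, 1.0, 0.2):
    interactive.record_burst(burst)

batch = TaskStats()
for burst in (4.0, 4.0, 4.0):
    batch.record_burst(burst)
```

With these samples, `pick_core_type(interactive)` selects an E-core and `pick_core_type(batch)` selects a P-core. What a heuristic like this cannot see is which kind of instructions a task executes, only how long it runs.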

    • I'm not sure the shit-cores have the full instruction set that the arson-cores do. Either way, a good way to burn your house down is to set one of these arson-cores to a long heavy task and then go on vacation.
      • by DamnOregonian ( 963763 ) on Sunday February 13, 2022 @04:02AM (#62263277)
        Hah. My house survived the AMD Barton cores, it'll survive this.
      • You mean blocking for I/O on a disk/network/keyboard/USB should be easy to detect and run on an e-core? In fact, shouldn't all processes default to e-cores and only have the high CPU threads allocated to p-cores? That can't be all that difficult.

        • Blocking for I/O on a disk/network/keyboard/USB doesn't run on any core. That's what blocking is. Not running.
          If you're implying that a thread that blocks should only run on an e-core, I find that confusing, since you seem to be suggesting that threads that require I/O should only be able to process it slowly.
    • by DamnOregonian ( 963763 ) on Sunday February 13, 2022 @04:05AM (#62263279)
      Linux already has HMP support. However, Intel has a CPU-directed approach that claims to be better (i.e., it's aware of the types of instructions being executed, etc)
      That is, it looks at a thread of execution and tells the OS what the power/performance/latency tradeoff is for running it on each type of core.
      It's pretty neat, actually. It's a lot better than the "dumb" HMP that exists currently.
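A rough illustration of the idea, not the kernel's HFI code: assume the hardware publishes a performance score and an efficiency score for each CPU, and the scheduler consults that table when placing a thread. The `CoreFeedback` type, the `pick_cpu` function, and the scores below are all made up for this sketch.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class CoreFeedback:
    """One hypothetical table row: hardware-reported scores for a CPU."""
    cpu: int
    perf: int  # higher means faster (illustrative scale)
    eff: int   # higher means more energy-efficient (illustrative scale)


def pick_cpu(table: Sequence[CoreFeedback], prefer_efficiency: bool) -> int:
    """Pick the CPU with the best score for the goal, tie-breaking on the other score."""
    if prefer_efficiency:
        best = max(table, key=lambda c: (c.eff, c.perf))
    else:
        best = max(table, key=lambda c: (c.perf, c.eff))
    return best.cpu


# Made-up scores: CPU 0 stands in for a P-core, CPU 1 for an E-core.
table = [
    CoreFeedback(cpu=0, perf=255, eff=100),
    CoreFeedback(cpu=1, perf=140, eff=255),
]
```

Here `pick_cpu(table, prefer_efficiency=False)` returns CPU 0 and `pick_cpu(table, prefer_efficiency=True)` returns CPU 1. The point of the hardware-assisted approach is that the scores come from the CPU itself and can change at runtime, rather than being fixed guesses baked into the OS.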
  • by Powercntrl ( 458442 ) on Sunday February 13, 2022 @03:40AM (#62263261) Homepage

    My first thought was "Who is Intel's Thread Director and why is he going to work for Linux?" Also, is he in any way related to that General Failure guy who has been reading my drive?

    • Re: (Score:2, Funny)

      by Anonymous Coward

      No relation. Thread Director used to work for General Protection Fault, whose portfolio also includes malware that makes your programs crash.

      • by Anonymous Coward

        Just make sure Major Catastrophe doesn't get involved.

  • by spaceman375 ( 780812 ) on Sunday February 13, 2022 @04:33AM (#62263299)
    Many very smart people have spent countless hours addressing the issues of scheduling tasks on CPUs, traffic control on buses of differing speeds, intelligently choosing cache or RAM or other memory spaces, and just allocating resources efficiently based on time constraints. I want to see this expanded to produce the first true Network OS. As in, write a tiny kernel that just enumerates the hardware of the device it boots on, and offers a standardized interface to access any and all of it on the local network. Remove the OS from any specific instance; instead treat each CPU and core as a pool, and conglomerate all the RAM into a field of memory accessible at widely varying speeds depending on location. All peripherals are available to all tasks; just use the nearest mouse, keyboard, and display to the user regardless of what device they may be connected to. Have a meta-kernel with a network-level scheduler that handles which hardware gets what tasks, and run it on whatever CPU in the pool has enough resources to handle it. Possibly even define a virtual CPU that gives it independence from individual hardware architectures. Dynamically allocate hardware resources to each task from whatever is available on the local LAN. Different instances of this BorgOS would need to decide how much to trust each other across the internet; though one huge Skynet might be possible, humans are too territorial to let true integration happen. I've seen this many times in SciFi: I want to see it for real.
    • by Anonymous Coward

      To what end do you want this BorgOS ? What's the use case ?

      If all you want to do is run generic computing jobs, like, say, run things like SETI@home on most of your devices, that don't depend on any I/O except network, all the pieces exist to do it today, at least for devices where OS/firmware can be replaced to run some version of Linux, which is many of them, thanks to inevitable security vulnerabilities. Of course, no single hypothetical BorgOS actually exists. But you can load Linux on many devices, some

    • This doesn't sound all that "hard to do" (as in, I don't think you have to write any code beyond a few simple shell scripts, not that it would necessarily be trivial to configure) with existing Unix software. To wit, you would construct applications to run on a personal cloud, and run each one against its own network display server, then you'd connect that server to whichever display you were near. If your application can be constructed in such a fashion that it calls out sub-processes to do heavy lifting w

  • I assume that they'll fuck it up and introduce a bunch of security problems that will end up giving a net slowdown once patched (if they bother to patch them).

    Why is anyone still buying Intel's shit processors?

    • Why is anyone still buying Intel's shit processors?

      Because they have more marketing than AMD. Intel inside your brain.

    • Because they're currently the best processors on the consumer market.

      Of course, that's subject to change. We'll see when Zen4 arrives.
  • Nobody smart is running that 2nd rated, overpriced and insecure Intel crap anyways these days.

    • 2nd rated? [tomshardware.com]
      • by gweihir ( 88907 )

        For anything other than gaming when you have too much money? Yes. Nobody sane buys Intel for a DC these days.

          For anything other than gaming when you have too much money? Yes. Nobody sane buys Intel for a DC these days.

          That's just untrue.
          I am part of an organization that manages a little under a dozen datacenters :P
          AMD is putting along at around 10% of new purchases.
          You can't define 90% of purchases as insane.

          Also, the 12900K is significantly faster than, and cheaper than, competing AMD desktop parts.
          In fact, as of Alder Lake, the only edge AMD has is performance per watt (which they still hold a significant advantage in)
          But performance per dollar is now far below Intel, as well as the crown for top performer (whic
