Linux Kernel Power Bug Is Fixed
An anonymous reader writes "The Linux kernel power bug that caused high power usage for many Intel Linux systems has finally been addressed. Matthew Garrett of Red Hat has devised a solution for the ASPM Linux power problem by mimicking Microsoft Windows' power behavior in the Linux kernel. A patch is on LKML for this solution to finally restore the battery life under Linux."
Re:Good News (Score:5, Informative)
Canonical regularly ships kernel upgrades without upgrading the whole distro, and this one is a major issue for a lot of people.
Re:Good News (Score:4, Informative)
Re:Good News (Score:5, Informative)
That's a wrong impression given by a poorly written article summary. If you read the patch submission, the only involvement of Windows here was using a presentation about their OS as a way to clarify the minimal documentation about this area. Nothing was copied from Windows.
Re:overblown (Score:5, Informative)
Reporting was a bit sensationalist, but the problem was both real and significant. I don't doubt that the 14 to 36% regression they're reporting exists and is about that large.
Be careful with ASPM... (Score:5, Informative)
The onboard Intel NIC on my Intel motherboard randomly disappears with ASPM enabled, due, I think, to a hardware issue.
For me it is pcie_aspm=off or all hell breaks loose.
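If you want to check what your own box is doing before reaching for pcie_aspm=off, here is a minimal sketch. It assumes the kernel exposes the ASPM policy at /sys/module/pcie_aspm/parameters/policy with the active entry in square brackets, and that the boot options are readable from /proc/cmdline. The function names are made up for illustration.

```python
import re
from pathlib import Path


def active_aspm_policy(policy_text):
    """Return the bracketed (active) entry from the policy file contents."""
    match = re.search(r"\[(\w+)\]", policy_text)
    return match.group(1) if match else None


def aspm_boot_option(cmdline):
    """Return the value given to pcie_aspm= on a kernel command line, or None."""
    for token in cmdline.split():
        if token.startswith("pcie_aspm="):
            return token.split("=", 1)[1]
    return None


if __name__ == "__main__":
    # Both paths are assumptions about a typical Linux system; skip them if absent.
    policy_file = Path("/sys/module/pcie_aspm/parameters/policy")
    cmdline_file = Path("/proc/cmdline")
    if policy_file.exists():
        print("policy:", active_aspm_policy(policy_file.read_text()))
    if cmdline_file.exists():
        print("boot option:", aspm_boot_option(cmdline_file.read_text()))
```

Seeing "off" as the boot option means ASPM was disabled at boot, which is the workaround described above.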
Don't feed the troll (Score:5, Informative)
This is making Linux look bad without reason. Before the whole "Linux Power Regression" came up and was advertised as a problem by Phoronix, I did enjoy reading the occasional article (benchmarks, etc.) by Phoronix, but after this whole thing I have completely lost respect for Phoronix.
It's not a Linux bug but the BIOS misbehaving. Linux is simply playing it safe.
Summary: http://www.fewt.com/2011/09/about-kernel-30-power-regression-myth.html [fewt.com] (been posted before in older threads)
Re:Good News (Score:5, Informative)
It's funny they had to fix it by copying the method from Windows though.
Unfortunately (as is too often the case) the "bug" was an interaction between the Linux kernel and the absolutely fucked state of the BIOS in general, and ACPI in particular.
Because not all boards support PCIe Active State Power Management (a part of the PCIe spec that provides for powering down an unused link to save power), and bad things can happen if you try to use it on a board that doesn't, a board that does support it is supposed to advertise that fact. In practice, a large swath of boards where it works just fine were failing to declare that. The Linux kernel obligingly didn't try to use it (unless pcie_aspm=force was used). Since what is supposed to happen apparently usually doesn't, they've had to examine the mechanism used by Windows systems to infer whether or not ASPM is good to go, reasoning that vendors are unlikely to ship BIOSes where the Windows default behavior causes horrible things to happen.
ACPI is a bit of a problem child...
Re:Good News (Score:5, Informative)
1. You need to detect boards that are capable of it, so that you don't try to shut down idle links in a system where that could cause crashes, losing touch with peripherals, or other havoc.
2. You need the actual logic for detecting idle PCIe links, and the appropriate driver support and so on for instructing the PCIe controller(s) to change link power states.
Part two is the bulk of the matter, and it has worked for some time now if your board declared ASPM support or if you booted with pcie_aspm=force. Part one is comparatively simple, but the approach that Linux previously used was hobbled by the fact that boards frequently don't declare ASPM support even when they have it, while enough boards genuinely lack it that just defaulting to force would be risky. To deal with this, the latest patch adds the heuristics that Windows uses to detect ASPM, since the method that is supposed to work frequently doesn't, but vendors aren't going to ship gear that doesn't support Windows...
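To make the two parts concrete, here is a rough sketch of the older approach as described in this comment. It is a toy model, not kernel code: the Link class and both function names are hypothetical, and real hardware handles link transitions itself once ASPM is enabled. L0 is the fully active link state and L1 a low-power one.

```python
from dataclasses import dataclass


@dataclass
class Link:
    name: str
    idle: bool


def aspm_allowed(board_declares_aspm, force=False):
    """Part one: only trust ASPM if the board declares it or the user forces it."""
    return board_declares_aspm or force


def plan_link_states(links, board_declares_aspm, force=False):
    """Part two: put idle links into L1, but only when part one allows it."""
    if not aspm_allowed(board_declares_aspm, force):
        # Play it safe: every link stays fully powered.
        return {link.name: "L0" for link in links}
    return {link.name: ("L1" if link.idle else "L0") for link in links}
```

With a board that supports ASPM but fails to declare it, plan_link_states keeps every link in L0 unless force is set, which is the wasted power people were seeing.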
Sandy Bridge Still Brutal (Score:3, Informative)
Re:Be careful with ASPM... (Score:4, Informative)
Re:Good News (Score:5, Informative)
Re:Good News (Score:2, Informative)
The patch imitates a nonstandard behavior: if it sees the BIOS claiming that ASPM is disabled, it leaves devices in whatever state the BIOS left them instead of following what the BIOS claims about the device. This is certainly a bug in the BIOS, and possibly a bug in Windows that happened to cancel out the BIOS bug. However, it is very likely that incompetent hardware or BIOS vendors (I am pretty sure it's a certain company known as "the largest BIOS vendor") first produced the wrong BIOS behavior, and Microsoft seized the opportunity to exploit it instead of reporting the bug, so now the buggy BIOS behavior is a de facto standard, even though the actual ACPI standard says otherwise.
Please note that this is just one more link in the very long chain of "bug for bug compatibility" that ACPI support has turned into. For example, Linux now identifies itself as Windows Vista (!!!) in ACPI scripts just to avoid broken tables that are given to it if it identifies itself as Linux. How the hell any OS-dependent things ended up in the supposedly platform-independent ACPI standard in the first place is a separate question, and very likely the answer is this: http://antitrust.slated.org/www.iowaconsumercase.org/011607/3000/PX03020.pdf [slated.org]
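Spelled out as a toy decision function, the behavior change looks roughly like this. It is a sketch of the description in this comment only, not the actual patch: the function and its return strings are invented for illustration, and the pre-patch branch (turning ASPM off on the device) is an assumption about what "following what the BIOS claims" meant in practice.

```python
def aspm_action(bios_claims_aspm_disabled, patched):
    """Decide what the kernel does with a device's ASPM setting at boot."""
    if not bios_claims_aspm_disabled:
        # The BIOS says ASPM is fine, so the kernel manages it normally.
        return "manage"
    if patched:
        # Patched: keep whatever the BIOS already programmed into the device.
        return "leave-as-bios-set"
    # Unpatched (assumed): believe the BIOS claim and turn ASPM off.
    return "disable"


def aspm_enabled_after_boot(bios_claims_aspm_disabled, bios_enabled_on_device, patched):
    """Return whether ASPM ends up enabled on the device in this toy model."""
    action = aspm_action(bios_claims_aspm_disabled, patched)
    if action == "disable":
        return False
    if action == "leave-as-bios-set":
        return bios_enabled_on_device
    return True  # "manage": assume the kernel enables ASPM under its policy
```

The interesting case is a BIOS that claims ASPM is disabled but has already enabled it on the device: the unpatched branch ends with ASPM off, the patched branch ends with it on.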