Linus Torvalds Calls Intel Patches 'Complete and Utter Garbage' (lkml.org) 507
An anonymous reader writes:
On the Linux Kernel Mailing List, Linus Torvalds ended up responding to a long-time kernel developer (and former Intel engineer) who'd been describing a new microcode feature addressing Indirect Branch Restricted Speculation "where a future CPU will advertise 'I am able to be not broken' and then you have to set the IBRS bit once at boot time to *ask* it not to be broken."
Linus calls it "very much part of the whole 'this is complete garbage' issue. The whole IBRS_ALL feature to me very clearly says 'Intel is not serious about this, we'll have a ugly hack that will be so expensive that we don't want to enable it by default, because that would look bad in benchmarks'. So instead they try to push the garbage down to us. And they are doing it entirely wrong, even from a technical standpoint. I'm sure there is some lawyer there who says 'we'll have to go through motions to protect against a lawsuit'. But legal reasons do not make for good technology, or good patches that I should apply."
Later Linus says forcefully that these "complete and utter garbage" patches are being pushed by someone "for unclear reasons" -- and adds another criticism. The whole point of having cpuid and flags from the microarchitecture is that we can use those to make decisions. But since we already know that the IBRS overhead is huge on existing hardware, all those hardware capability bits are just complete and utter garbage. Nobody sane will use them, since the cost is too damn high. So you end up having to look at "which CPU stepping is this" anyway. I think we need something better than this garbage.
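Linus's complaint about the capability bits can be sketched in code. The point is that a capability bit is only useful if the kernel can trust it unconditionally; if the mitigation is too slow on some parts, the kernel ends up consulting a model table anyway. The function and the table below are hypothetical illustrations, not actual kernel code:

```python
# Hypothetical sketch of the decision logic Linus is objecting to: even when
# the CPU advertises an IBRS capability bit, the kernel still has to consult
# a family/model table because the cost of IBRS on some parts is prohibitive.
# The table entries are made up for illustration.

EXPENSIVE_IBRS_MODELS = {(6, 0x3C), (6, 0x5E)}  # illustrative entries only

def choose_mitigation(has_ibrs: bool, family: int, model: int) -> str:
    """Pick a Spectre v2 mitigation strategy for a given CPU."""
    if has_ibrs and (family, model) not in EXPENSIVE_IBRS_MODELS:
        return "ibrs"       # the capability bit can be trusted here
    # The capability bit alone is not enough: fall back to model checks,
    # which is exactly the "which CPU stepping is this" problem.
    return "retpoline"

print(choose_mitigation(True, 6, 0x8E))   # ibrs
print(choose_mitigation(True, 6, 0x3C))   # retpoline: bit set, but model is slow
print(choose_mitigation(False, 6, 0x8E))  # retpoline
```

So the flag doesn't remove the per-model special-casing; it just adds one more input to it, which is Linus's "nobody sane will use them" point.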
Is there any other option, Linus? (Score:5, Interesting)
You are right, Linus, as usual.
But I'd prefer the Linux Kernel Development team to push a complete proposal on the table.
Like totally ditching support for Intel, starting with the releases next March 1st (or better, April?).
Linus Haiku (Score:5, Funny)
Linus proclaims thus:
This patch is a piece of shit.
So what else is new?
Re:Linus Haiku (Score:5, Interesting)
So I'm gonna submit his email as evidence in my small claims court action against Intel.
Re:Linus Haiku (Score:5, Insightful)
Linus proclaims thus: This patch is a piece of shit. So what else is new?
If you mean "useful, straight communication from Linus as usual", then I'm with ya.
But if you're trying to imply that Linus indiscriminately calls *everything* a piece of shit, then you're so offbase that I'll wonder if you're astroturfing on behalf of Intel. When Linus criticizes stuff, he's spot on. This patch is indeed a piece of shit.
Re: (Score:3)
Linus proclaims thus:
This patch is a piece of shit.
So what else is new?
If you mean "useful, straight communication from Linus as usual", then I'm with ya.
But if you're trying to imply that Linus indiscriminately calls *everything* a piece of shit, then you're so offbase that I'll wonder if you're astroturfing on behalf of Intel. When Linus criticizes stuff, he's spot on. This patch is indeed a piece of shit.
Haiku yours is not
The point missed by you it was
It was a joke. Whoosh.
Re:Linus Haiku (Score:4, Informative)
Haiku yours is not The point missed by you it was It was a joke. Whoosh.
For what it's worth, your post isn't a haiku either. Nor was the original "Linus Haiku". A haiku need not have a 5-7-5 syllable structure; and a 5-7-5 syllable structure does not make something a haiku. Haiku require a cutting word (kireji), and carry imagery of the natural world.
These are closer to senryu than haiku.
Re:Is there any other option, Linus? (Score:4, Insightful)
And how does excluding 80-90% of the installed user base help Linux exactly?
I understand the sentiment, it's just not a professional way of handling the situation.
Re: (Score:2)
And how does excluding 80-90% of the installed user base help Linux exactly?
I understand the sentiment, it's just not a professional way of handling the situation.
It doesn't help anyone and neither does the patch in question. Until new CPU models from Intel hit the market, this shitty patch will do exactly nothing. And we should thank Linus for telling Intel to make the new CPU models less shitty than they were obviously planning to.
Re:Is there any other option, Linus? (Score:5, Insightful)
I understand the sentiment, it's just not a professional way of handling the situation.
Linus always tells it like it is, which you can either view as professional or not. But from an engineering perspective, it seems better to do that than just say something polite so you don't upset people.
It appears to me he's directing his displeasure at Intel management/legal/marketing making decisions where really they shouldn't.
And how does excluding 80-90% of the installed user base help Linux exactly?
I very much doubt he's going to do anything of the sort. I would suggest the exact opposite in fact; he wants the best solution for all and is complaining that Intel's patches are constructed for their own benefit (legal/ass-covering), rather than that of their customers.
Re: (Score:3)
Tell that to Challenger.
It's unfortunate that "professional" has become a synonym for not hurting anybody's feelings, particularly when "anybody" is a corporation.
Re:Is there any other option, Linus? (Score:4, Informative)
Why not? Somebody has to call bullshit on this.
What would you have him do, get some PR flunky to "corporatize" the message until nobody is really sure what it's all about?
Re: Is there any other option, Linus? (Score:3, Insightful)
While that is completely true, saying it doesn't solve the problem.
Re: Is there any other option, Linus? (Score:5, Insightful)
While that is completely true, saying it doesn't solve the problem.
It is no more Linus' problem to solve than it is your or mine.
It is entirely up to Intel to do this and do it properly.
Thank goodness Linus is saying this, because it will force Intel to solve it.
Re: (Score:3)
Thank goodness Linus is saying this, because it will force Intel to solve it.
Oh you sweet summer child.
Re: (Score:3, Insightful)
"be the OS that is still susceptible to this flaw after everyone else isn't."
This patch does not fix the problem anyway. It is a patch to bypass the fix by default so that NEW Intel processors do not lose speed (but will lose the security).
All Linus has to do is ignore this patch and run every Intel processor, new and old, with the fixes. It'll run them slower, but they are at least somewhat more secure.
And you're an idiotic Intel fanboy.
Intel: Years of insufficient management. (Score:3)
The arrogance and attempts to mislead seem to be a present-day response to problems created during years of ignorance. Intel's recent CEOs have not had the enthusiasm for technology or the social ability necessary to manage Intel, in my opinion. Intel suffered from insufficient management 11 1/2 years ago: More Intel employees should say in public what they have said in private: Intel CEO Paul Otellini is no [slashdot.org]
Re:Is there any other option, Linus? (Score:5, Insightful)
Nobody is claiming Linus is being unprofessional or Intel is being professional, they are claiming that excluding Intel support from the kernel would be unprofessional, which is why it is never going to happen.
Re:Is there any other option, Linus? (Score:5, Informative)
I went in expecting the usual Linus ranting, and although he doesn't disappoint in that department, he also has a valid point.
As I understand it, Intel proposes to build a switch into future CPUs which tells the CPU to stop being insecure. The switch is going to be off by default and must be switched on by the kernel during boot. In other words, Intel proposes to let all future CPUs be insecure by default.
Re:Is there any other option, Linus? (Score:5, Interesting)
by doing this it magically becomes the operating system's fault that the CPUs are insecure by design.
"we documented how OS vendors could turn on the secure mode and cripple performance at the same time. they chose not to use it, so any security flaws are their fault".
Re: (Score:2)
Linus usually has a valid point when he goes on one of his rants. He isn't just a cranky guy slagging people at random; the rants are his way of calling out especially bad bullshit. That's the only reason people are willing to put up with them.
Re:Is there any other option, Linus? (Score:5, Insightful)
IMHO, Intel is basically operating in PR-nightmare-cleanup-mode right now.
They fucked up badly and are trying to lie, cheat and manipulate their way out of it.
They are desperately trying to make it look like this is a generic problem (it's not; the AMD and ARM variants of these bugs are much less evil) and they are trying to save face and shift blame however they can.
Plain and simple truth is that Intel has knowingly made malicious choices and now they've been caught.
Re:Is there any other option, Linus? (Score:5, Interesting)
Linus seems to (begrudgingly) accept the need for a temporary fix, and there is already a temporary fix that works for current CPUs.
The problem is Intel calling it a permanent fix and implying that every future CPU will be insecure by default unless the OS flips a switch.
That way Intel can blame any performance issues on the OS and still pretend their CPU is fast, even though it isn't when running in the secure mode that no sane person would ever use.
How about a car analogy:
Imagine all cars have two bugs in the gearbox that trigger on putting it in reverse certain ways.
Bug 1 makes a dashboard light blink one time.
All car manufacturers have this bug, and they all fixed it when found.
Bug 2 makes your car explode.
AMD and ARM knew about this and fixed it. It made their cars a bit slower, but at least they wouldn't explode.
Intel knew about it too, but they chose to ignore it. Their cars are a bit faster because of this.
Intel fixed this by sending out a widget that stops the car from exploding, this widget does make Intel cars go slower.
The widget doesn't fix it automatically, though! The driver has to switch the widget on every time he starts the car. If the driver doesn't switch the widget on, putting the car in reverse will still make it explode.
Intel also says that this is how all future cars will be prevented from exploding; by adding this widget to every future car and requiring the driver to switch it on; it'll always be in "explode-on-reverse" mode by default.
Intel does get to claim their car is faster by default though. Just don't put it in reverse.
As a bonus analogy; Intel claims both bugs are the same because they are both triggered by the same action, so therefore all car manufacturers are vulnerable to the exploding car bug.
Re:Is there any other option, Linus? (Score:5, Insightful)
And fundamental problems are still fundamental problems. The reason practically every processor has the same issues is because the same optimizations we used to make processors faster had the same fundamental design error.
I mean, either someone designed the core branch predictor block and everyone worldwide copied it for every processor, or everyone implemented it differently, yet it has the same Spectre flaw, implying that the flaw is inherent in the way branch predictors work.
The only way you can guarantee the designs are error free is to abandon everything that makes modern processors fast - OOO, speculation, branch prediction, and plenty more, including potentially pipelining (the fundamental technology everyone is trying to speed up by avoiding pipeline stalls). Go back to the old fetch-and-execute cycle, where memory operands are fully decoded and retrieved prior to even considering fetching the next instruction.
Everyone will hate it, because now your 4GHz processor will be as fast as a 500MHz one.
Re:Is there any other option, Linus? (Score:4, Insightful)
The only reason why many people need that 4Ghz to begin with is because of how bloated software has become.
Re:Is there any other option, Linus? (Score:5, Informative)
The only reason why many people need that 4Ghz to begin with is because of how bloated software has become.
And because sensors have got better, giving us much larger datasets. You know: sound, images and video are the common ones.
I challenge you to find the bloat in FFmpeg, for example. Now try transcoding an HD video on a modern 4GHz desktop versus a 1st-gen Raspberry Pi.
Re: (Score:2)
I wrote "many people", not "all people". Specifically, I am talking about all the people running wasteful browsers and applications written in slow languages. You are absolutely correct that for transcoding, and for many other heavy backend processing jobs, the CPU is utilized efficiently.
Re: (Score:3, Interesting)
I think you'll agree that the new school is the majority.
Re:Is there any other option, Linus? (Score:5, Insightful)
If everyone was still running with 640x480 and PC speakers the way they were supposed to
640x480? lol n00b.
No, we should be running in 40x25 teletext mode like God intended. Or are we meant to be in 80x25 VT100 like we were supposed to? Or are we supposed to be running on electromechanical teletypes like we were supposed to?
Or are we meant to be running on punched cards like we were supposed to? Or front panel switches?
Also, the PC speaker was always shite even by the standards of the day. We should all be running the TI SN76489 as we're supposed to.
The biggest issue with computers is the seemingly ever-present need to build entertainment machines out of them. If we just did away with the entertainment requirements
OK, I'll bite.
Suppose we did as you wished, and separated off entertainment to a different enclave. How do you suppose one would make such entertainment? Them PCs are awfully convenient and it helps to have a fast one.
we would be just fine with 500MHz CPUs.
Yeah... no. I like having a massive, fast machine with huge screens for coding even if I'm targeting an 8051 or something of the sort with a few k of RAM. I also want fast machines for scientific data processing. We have better languages and better optimizers than were practical two decades ago, and I like it.
If I'm running some factory test rig or OctoPrint on my 3D printer, that 500MHz CPU is just fine and that's what I use (well, they're faster now).
Who the fuck really needs FFMPEG anyway?
Anyone who deals with video. Like computer vision researchers too of which I am one.
But they have the money to build render farms for their needs. What do the rest of us really need 4K video for?
A news for nerds website is not really the best place to be a Luddite.
Re:Is there any other option, Linus? (Score:5, Insightful)
The only reason why many people need that 4Ghz to begin with is because of how bloated software has become.
What an ignorant and useless comment. Software bloat is a very minor portion of what our computers do. Almost none of what makes up bloat in software ever even comes close to pegging the CPU, and removing it would give you a few percent speed increase at the most.
What we depend on CPUs for now is raw computing power. I was fine with a 500MHz PC back when my digital camera had a floppy disc in the back; now that it generates a 50Mpx 14bpp file, that's not going to cut it, and that has nothing to do with the relative bloat of the image viewer.
Likewise for web browsers. It's not bloat to blame for the fact I expect a browser to be able to stream a 4K movie in surround sound. It's not bloat to blame for the 40 tabs I run concurrently. It's not bloat that I have 10 office apps open alongside video conferencing software along with those browser windows.
And among all that my CPU is sitting at 30% utilisation, 25% of which is being taken up by a full system virus scan (fuck monday mornings on my work machine).
Spare us your "we could make this less bloated and we would all be happy with 500MHz" garbage.
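For scale, the camera figure above is easy to check with back-of-envelope arithmetic: a 50-megapixel sensor at 14 bits per pixel produces roughly 87 MB of raw data per shot, before any demosaicing or processing.

```python
# Raw data size of one shot from a 50Mpx, 14-bits-per-pixel sensor.
pixels = 50_000_000
bits_per_pixel = 14
raw_bytes = pixels * bits_per_pixel // 8
print(raw_bytes / 1e6)  # 87.5 (MB)
```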
Re:Is there any other option, Linus? (Score:4, Funny)
You're probably the reason pornhub is so slow lately.
I'll take your word for it.
Re:Is there any other option, Linus? (Score:4, Insightful)
And why is all that JavaScript needed? Usually as a hacky method of writing something that could have been implemented better (by improving HTML/CSS), as a way to prevent users from controlling their app (web apps), or as malware (trackers, and now apparently miners).
Re: (Score:3)
Java, Flash, and the JS interpreter are all forms of VMs. I don't see why Java/Flash are inherently more insecure than JS. Hell, I suspect that modern Java, with all its revisions and safety work, is likely more secure than JavaScript.
Re: (Score:2)
The only reason why many people need that 4Ghz to begin with is because of how bloated software has become.
You are completely WRONG. Games, multimedia encoding, 4K streaming, etc. People are asking for more CPU-intensive applications. Office applications, email and so forth barely make a 4GHz machine break a sweat. And before you say "all games happen on the GPU": there have been huge advancements in AI, open-world games and the like that have to keep track of massive amounts of game state to give you the impression that you're in an immersive world. Maybe you're okay with a Pentium 2 because you can barely get on s
Re: (Score:2)
Which games require that much processing power purely for game logic? Can you give an actual example? Also, most game AIs are not trying to play at the best possible level, but to simulate a weak human to play against.
Re: (Score:3, Informative)
240hz monitor and trying to get a steady 240fps. And boy do I mean steady. Even a blip between 240fps and 230fps can be perceived as microstutter, and that was with a gsync monitor. Gonna need a fast CPU to generate a frame in 1/240th of a second.
Wrong. Your frame rate is determined primarily by your GPU. The CPU component (for graphics rendering) is the API where the data goes from the CPU to the GPU. That is why we have DX12 and Vulkan, to get much closer to the metal. With games that are less demanding visually, when you turn v-sync off with, for example, an i7-4790K Devil's Canyon, the frame rate goes way over 240fps even without Vulkan and DX12. So you see, there is no CPU bottleneck for graphics rendering. It's all GPU bound.
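The frame-time budgets behind the 240Hz claim are worth spelling out: at 240fps a frame must be ready in about 4.17 ms, and the gap between 240fps and 230fps is only about 0.18 ms per frame, which is why even tiny CPU-side hiccups show up as microstutter.

```python
# Per-frame time budgets at high refresh rates.
budget_240 = 1000 / 240   # ms per frame at 240 fps
budget_230 = 1000 / 230   # ms per frame at 230 fps
print(round(budget_240, 2))               # 4.17 ms
print(round(budget_230 - budget_240, 2))  # 0.18 ms of slack between the two
```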
Re: (Score:3)
Try feeding the data required for a GTX 1080 with a Pentium 3. Ain't fucking happening.
Well obviously it "ain't fucking happening" because Pentium 3 motherboards didn't have PCI Express slots, they only had AGP and PCI slots. What I'm referring to are Intel/AMD motherboards that have PCI-e slots and more specifically, they need to be x16 slots for a graphics card like a GTX 1080. Otherwise, the bottleneck would be the PCI-e slot itself not being able to handle the throughput of the actual graphics card. In all of these scenarios, even your fictitious troll one, let's assume there was an AG
Re: (Score:3)
It is very nice that you are showing links that say that CPU-intensive games exist. None of the links explain why they're like that, whether it is graphics or whether it is necessary game logic. My question was, and remains, under what case a game really CAN'T DO WITHOUT a SINGLE high end CPU
Re: (Score:3)
My question was, and remains, under what case a game really CAN'T DO WITHOUT a SINGLE high end CPU
You were already provided with a nearly immediate answer by someone else: Dwarf Fortress
Look dude, I've been building PC's for over 20 years and I've been reading hardware sites and bench-marking for as long. There are bottlenecks in just about every aspect of hardware depending on the situation: CPU, GPU, hard disk, memory, front side bus, north bridge controller, south bridge controller, L1/L2/L3 cache and the list goes on and on. There's multi-core vs. single core vs. dual core vs. quad core and each
Re: (Score:3)
To a lay person like me, this sounds like normal design compromises, which are common to any design problem, in everything from kettles to houses.
So in a sense, the reason this became a blame issue, isn't that there are technical compromises, but that these did not come with clear advice about how these technical compromises could not guarantee something which the operating systems were relying upon.
Kettles are dangerous, as they are near the limits of what is safe -- eg. do not plug two kettles into one ex
Re:Is there any other option, Linus? (Score:5, Insightful)
It's not a normal design compromise. AMD isn't affected by Meltdown because they did it right, Intel cut corners to get a small performance boost that they didn't need.
Worse still, Intel's botched microcode fix can brick systems. Apparently 7 months wasn't enough to properly test it.
Re:Is there any other option, Linus? (Score:4, Insightful)
to get a small performance boost that they didn't need.
So since when does a CPU company compete on anything other than performance of that CPU? Or are you suggesting we all buy Intel CPU's because we love the IME? :-)
Re: (Score:3)
To a lay person like me, this sounds like normal design compromises,
I think a normal person could understand that it makes more sense for the security guard to check their ID before they go into a building, not after they have gone into a building and rifled through the filing cabinets. Intel chose to behave in this fashion while AMD (and literally everyone else) did it the correct way.
(edit) (Score:3)
Intel chose to behave in this fashion while AMD (and literally everyone else) did it the correct way.
Correction, everyone else but IBM [itjungle.com], whose Power7 through Power9 processors are also vulnerable and where mitigation will be expensive. How far POWER has fallen, from the PPC601 where everything was done right... to today. (PPC601 was actually a POWER processor in a sense, in that it actually implemented the full ISA.)
Re: (Score:2)
Indeed scary. Just as well it is a very very big universe, with lots of space separating everyone.
Re:Is there any other option, Linus? (Score:5, Insightful)
Your comment is marked Insightful, 4 -- but it is overall totally wrong. You are presenting a false dichotomy!
The issue with Spectre is not that there is a fundamental problem with branch prediction (or BP misses, or pipelining, or speculation, or any combination of these). The issue is that some processors don't actually clean up after themselves when branch mis-prediction occurs. They roll back instruction execution in some cases (great). In others cases they may simply abandon execution (okay). But they don't ever do so much rollback work as to invalidate cache lines (bad!).
There is such a thing as provably reversible computation and storage, and it can be done correctly. But you have to limit the length of instructions over which you will continue to speculate to something you can reverse; and you **have** to flush cached information that should have never been available in the first place.
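The failure-to-clean-up point above can be demonstrated with a toy model: the CPU rolls back the architectural results of a mispredicted path, but not the cache fill it caused, so the secret-dependent line stays resident and can be found afterwards. This is purely a simulation of the mechanism, not an actual exploit:

```python
# Toy model: architectural state is rolled back after misspeculation, but the
# cache fill (a microarchitectural side effect) survives and leaks the secret.

class ToyCache:
    def __init__(self):
        self.lines = set()
    def load(self, addr):
        self.lines.add(addr // 64)      # fill a 64-byte cache line
    def is_cached(self, addr):
        return addr // 64 in self.lines

def speculate(cache, secret):
    # Mispredicted path: architecturally this load "never happened"...
    cache.load(secret * 64)
    raise RuntimeError("rollback")      # ...its register results are discarded

cache = ToyCache()
secret = 42
try:
    speculate(cache, secret)
except RuntimeError:
    pass  # architectural rollback complete

# ...but the cache was never invalidated, so probing recovers the secret:
leaked = [s for s in range(256) if cache.is_cached(s * 64)]
print(leaked)  # [42]
```

In the real attack the probe is a timing measurement (cached loads are fast, uncached ones slow); the set-membership test above stands in for that.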
Re:Is there any other option, Linus? (Score:5, Informative)
Somebody with mod-points should mod up the AC parent. They are completely correct.
The design flaw is not in using speculation or branch prediction. It is in allowing the side-effects of instructions in those streams to be visible in the machine before the branches are retired. This is really basic stuff - I remember a discussion about this very issue in a processor design course back in '00-'01.
Intel gambled that the state of the cache was not visible to the programmer. Flush+reload showed that they were wrong.
Re: (Score:3)
The processors all start executing instructions ahead of the current one when possible. If the code branches in a different direction than expected, they discard the results of any future instructions, and make it look like the CPU never even attempted to run that future code.
For 25 years or so, all chip designers assumed that the state of the CPU cache was sufficiently complex that software wouldn't be able to predict what memory was or wasn't in the cache. This meant that they didn't have to worry about t
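The pattern being described is the classic Spectre variant 1 gadget. Sketched in Python for shape only (Python does not model hardware speculation, so this is illustrative, not an attack): the bounds check is architecturally correct, but a CPU speculating past it can issue the dependent load before the check resolves.

```python
# The Spectre v1 shape: a correct bounds check followed by a load whose
# address depends on the (possibly out-of-bounds) data. Architecturally the
# out-of-bounds path returns 0; the danger is the speculative cache fill.

array1 = [1, 2, 3, 4]
array2 = bytearray(256 * 512)

def victim(x: int) -> int:
    if x < len(array1):                 # check the CPU may speculate past
        return array2[array1[x] * 512]  # data-dependent load, the side channel
    return 0

print(victim(2))    # 0: in-bounds, reads array2[3 * 512]
print(victim(100))  # 0: out-of-bounds, architecturally rejected
```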
Re: Is there any other option, Linus? (Score:5, Interesting)
Even invalidating the loaded cache pages isn't necessarily sufficient. Because the act of loading one page means the flushing of another page, it may be possible to run Spectre in the opposite direction: preload the cache, and if any preloaded pages become slower to access, then you can determine that the branch predictor caused them to be flushed. At least in theory... in practice that becomes more difficult in a multiprocess environment where other processes could be responsible for the flush, but I certainly wouldn't want to predict that it isn't possible.
So the full solution may need to be more complex. Just as the CPU includes more registers than the architecture specifies, so it can do scrap work in these extra registers and then roll it back without affecting the real registers, the CPU may need extra cache lines so that it can load a line and then flush it without having lost any of the previously loaded ones.
Or alternatively, approach the problem from the opposite perspective. The problem is caused not just by speculative execution but also because (for performance reasons) the OS leaves all process memory mapped into every process's address space and then uses permissions to try to make that memory unavailable. The other fix is to redesign virtual memory so that other processes' memory is NOT mapped into each other's address space and is thus truly inaccessible. But that may be an even more difficult solution to implement.
Re:Is there any other option, Linus? (Score:5, Interesting)
The reason practically every processor has the same issues is because the same optimizations we used to make processors faster had the same fundamental design error.
I mean, either someone designed the core branch predictor block and everyone worldwide copied it for every processor, or everyone implemented it differently, yet it has the same Spectre flaw, implying that the flaw is inherent in the way branch predictors work.
No. The fix is to not read from memory into the CPU cache during speculative execution when that block of data is not there already. Changing this in the CPU's core would solve both Spectre and Meltdown at a reasonable cost (it would not defeat many current optimizations).
Re:Is there any other option, Linus? (Score:5, Insightful)
This is the correct answer. Proceed to Go and collect your $200.
You may want to set aside more silicon for caching and less for handling the speculation.
You may also want more cores rather than greater complexity. This would not have been a good choice ten years ago, but now that people are learning how to use the extra cores, it will probably sell well.
Alternatively, you could set a flag to say whether your application cares about the risk (if the entire machine is dedicated to a single offline task, you probably don't).
Re:Is there any other option, Linus? (Score:5, Informative)
And fundamental problems are still fundamental problems. The reason practically every processor has the same issues is because
Is because it doesn't. AMD is not vulnerable to MELTDOWN and is less vulnerable to SPECTRE because they are more scrupulous and responsible than Intel, FULL STOP. There is no other reasonable way to regard the situation.
Every speculative processor has some of the same issues, to some degree, but that is not every processor, and you are still using Intel's bullshit excuse FUD language when you say that all processors are vulnerable to these attacks. That is a lie as stated.
Re: (Score:3)
Is because it doesn't. AMD is not vulnerable to MELTDOWN and is less vulnerable to SPECTRE because they are more scrupulous and responsible than Intel, FULL STOP. There is no other reasonable way to regard the situation.
Are you really arguing that this was a deliberate case of malicious and cynical irresponsibility -- an actual moral failing -- by Intel, and that AMD is actually more moral than Intel?
I'm willing to buy into the idea that near monopoly status led Intel to a set of somewhat predictable moral hazards where a series of every day management decisions regarding engineering resources vs. profits led them to choose profits over resource expenditure. Product roadmaps and model changes that minimized engineering ch
Fundamental problems : Yes, but... (Score:5, Informative)
And fundamental problems are still fundamental problems. The reason practically every processor has the same issues is because the same optimizations we used to make processors faster had the same fundamental design error.
I mean, either someone designed the core branch predictor block and everyone worldwide copied it for every processor, or everyone implemented it differently, yet it has the same Spectre flaw, implying that the flaw is inherent in the way branch predictors work.
Well, there are different level in the whole Spectre/Meltdown debacle.
Not all CPUs are affected the same.
(And nitpicking: only CPUs doing speculative execution are affected. Lots of RISC cores don't; only some recent ones, like 64-bit ARM cores, do. And there are still cores that don't, even in modern days, like older Atoms and the Xeon Phi on Intel MIC boards.)
Speculative execution, from the moment it was presented as a new technology (around the era of the Intel Pentium Pro), was criticised as potentially executing past important checks if the speculation goes wrong. But it was dismissed back then, because back in those days (the era of RC4 and 3DES encryption, MD4/MD5 checksums, etc.), attacking cryptography was still done by breaking imperfect algorithms and brute-forcing small search spaces. Timing side channels were of academic interest only: known to exist, but in practice there were simpler ways to attack encryption.
It was also dismissed:
- because in the end, nothing is committed to memory/registers; instead it is discarded, so there are no (direct) permanent effects of the wrong speculation.
- because nobody paid much attention to the indirect, less significant effects that could still be measured (like bringing memory into cache).
(So nobody in the early-to-mid 90s would have thought that the cache could lead to a usable, exploitable attack.)
"Spectre" is just some researcher figuring out a way to exploit this "known from the beginning" knowledge by putting it into the light of how crypto is attacked nowadays (side-channels, timing, etc.)
That's the thing which every single CPU is affected by and which is still speculative execution working as it should (and normally should still be contained to data that could be accessed by the application anyway).
But then there is the cause of the "Spectre variants / Meltdown": due to "excessive" optimisations, suddenly the CPU accesses things beyond what the application could already access anyway. It usually boils down to the CPU (and its designers) trying to be way too smart.
Meltdown exclusively affects Intel CPUs. On Intel CPUs, to speed things up, memory protection checks are postponed. If something happens to already be available in the cache, speculative execution might pick it up even if that violates memory protection.
This runs contrary to how memory protection is supposed to work, and it is undocumented (unlike the basic characteristics of speculative execution).
(AMD CPUs, as a counter-example, are guaranteed by AMD not to be affected, because they do the expensive checks at the beginning of the pipeline and never let speculation through if it reads from an unauthorised memory location. There, everything works according to the docs.)
The Spectre variants affect Intel CPUs: to speed things up, even if the destination of a jump is unknown (because it depends on a memory location that isn't even known yet, e.g. a not-yet-computed index into a jump table), an Intel CPU will try to speculate where the execution would go (by keeping a list to remember where this position usually ends up jumping to). Due to the specific way Intel CPUs work internally and keep this list of "possible destinations of a jump", they can get confused and jump to an impossible situation: the speculative execution will jump to an address that is not even in the jump table, and execute at a position that could never be reached under normal circumstances.
(AMD cannot exclude that their CPUs are affected. They definitely do not work the same way as Intel's, so the current demo code created by Google certainly won't work, but they might still be exploitable, albeit with a different exploit implementation.)
That, again, is not just speculative execution working as it has been known to work for 20 years; it's speculative execution doing what it should never be able to do.
Again:
- there are things which affect nearly everyone, because that's how speculative execution is known to work (it just only recently started to get abused, due to being partially forgotten and to shifting interests).
- there are things which affect specific CPUs from specific vendors (mostly Intel, perhaps a few from AMD too), because those vendors (mostly Intel's engineers) went completely overboard and used formally invalid shortcuts.
The only way you can guarantee the designs are error-free is to abandon everything that makes modern processors fast: OOO, speculation, branch prediction, and plenty more, including potentially pipelining (the fundamental technology everyone is trying to speed up by avoiding pipeline stalls). Go back to the old fetch-and-execute cycle, where memory operands are fully decoded and retrieved before even considering fetching the next instruction.
First: no, things like OOO (out-of-order execution) and multi-threading are safe; all of these execute only stuff that is guaranteed to be executed. (It would always get executed anyway, even if the CPU were non-pipelined.)
(Lots of modern-day GPUs and similar designs, like the Xeon Phi, work on this principle. They do try to keep the pipeline busy, but they only do so by running stuff that is guaranteed to be executed eventually anyway; they just pick from more different threads to do so.)
Only speculative execution/branch prediction is about executing things that might not be *needed*.
(So only those are susceptible to attack.)
And only specific implementations (mostly by Intel, perhaps a few by AMD too) sometimes execute things that there is no reasonable, rational way to ever execute.
Second, although it's hard to guarantee, it is possible to build speculative execution that gets rid of everything, even side effects. (It will require extra silicon, but just as speculative results are tracked and discarded, speculative cache fills could also be tracked, and lines that were only brought in by speculation could be flushed, so no measurable difference would be left to exploit as a timing side channel.)
You need to reconsider some of the optimisations that were done, and not only look for things that could break actual code (e.g. discarding the results of wrong speculation), as has always been done until now, but also look for things which were neglected but are popular in modern-day exploits, which are mostly geared around side channels.
Again, that's no guarantee (some hacker could think up another, different exploit), but there are ways to remove known exploits without sacrificing optimisations.
Re: (Score:3)
I would like to see some benchmarks to see exactly what would happen if we killed branch prediction. I bet Intel has a command to turn it off for debugging.
No doubt it'd kill performance, but I'm curious by how much.
For instance, the Raspberry Pi 3's CPU doesn't include any branch prediction at all. It runs at 1.2 Ghz. Different architecture, obviously, but still. If it can chug along pretty good playing h.264 video, run Netflix from Chromium, and do quite a lot of things rather well with a crippled chi
Re: (Score:3)
It's quite untrue.
The basic problem with Spectre is not the branch predictor and speculation, but the weak hashes used to identify cache entries, which allow for purposeful collisions.
Speculative execution down a branch is cancelled, but the results are cached, the cache entry having an identifier based on the branch address and some other details. The next time the program enters that branch, the results are there, free to pick from the cache by following the identifier.
The problem is the identifier is *not un
Re: (Score:3)
An open hardware platform would not solve the issue. This is not some super-secret crap hidden deeply within the silicon; it's actually a documented feature of the CPU, for everyone to plainly see. And since it's hardware, there isn't much anyone can do to patch it because, well, soldering irons small enough to manipulate the die are hard to come by.
Even if you somehow magically managed to manipulate the hardware in such a way that it works again, how many do you think have the skill and equipment to
Re: (Score:3)
it can be fixed with a microcode update!
The ideal and inexpensive (performance-wise) fix is to *not* read from memory into the CPU cache during speculative execution when that block of data is *not* already there. That cannot be done with a microcode update.
Linus inventing a new chips (Score:3)
Or maybe starting up a serious open source hardware spec. He's probably one of the few people with the knowledge of what chips need to have, an audience,
He might have the knowledge of what a chip needs to have,
but the last time he tried making a chip [wikipedia.org] it didn't work that well~
---
Note: I'm just being silly. I actually admire the guy for what he's done (Linux, Git, etc.), but I couldn't resist pointing out that he DID already try making a CPU.
Re:Obligatory: Intel CPU Backdoor Report (Jan 1 20 (Score:5, Funny)
Is there a tl;dr version of that tl;dr version?
Re: Is there any other option, Linus? (Score:5, Informative)
AMD has one problem in common with Intel: Spectre. Meltdown is Intel's problem alone.
Meltdown is fairly easy to exploit and quite serious. Spectre could be as serious, but so far nobody has shown conclusively that it is actually exploitable in a real life situation. Intel spun it to make people think they're the same, so everyone thinks Intel and AMD have the same problem. They don't. Intel has a serious, potentially crippling security hole and a potentially serious but most likely not usable security hole. AMD only has the latter.
Re: (Score:3)
I read nothing in that link that says Power processors are vulnerable to Variant 3, commonly called Meltdown. The very article you link to says that Intel processors are vulnerable to Variant 3.
Maybe you don't understand the difference between Meltdown and Spectre.
Don't forget guys (Score:5, Informative)
Don't forget, guys: Intel is the biggest contributor of code to the Linux kernel, and it was they who wrote the code that would have crippled AMD as well as Intel CPUs against Intel's own flaw. Luckily AMD picked up on it and submitted an "else if" statement to Intel's code so AMD users wouldn't be needlessly affected by Intel's CPU flaw.
Re:Don't forget guys (Score:5, Informative)
Yep, I'm sure it was just a simple oversight that Intel's patch, which hurt performance on Intel and AMD and wasn't necessary on AMD, was applied by default to CPUs from both vendors. You know, Intel has only known about this for 6-7 months, so they were really rushed to get a working patch out in time. /sarcasm
Re:Don't forget guys (Score:5, Informative)
Here is CPU optimization expert Agner Fog's blog on the subject: Intel's "cripple AMD" function [agner.org]
What is going on here...? (Score:5, Interesting)
From the email correspondence; Linus says to Mr. Woodhouse:
"As it is, the patches are COMPLETE AND UTTER GARBAGE.
They do literally insane things. They do things that do not make
sense. That makes all your arguments questionable and suspicious. The
patches do things that are not sane.
WHAT THE F*CK IS GOING ON?"
In the post, Linus does not address much technical detail (he just mentions "garbage MSR writes", whatever that means), but his bullshit detector goes off big time.
It is clear that he thinks the patches are sub-optimal, but that in itself cannot be a first in Linux kernel history. There seems to be something else behind it, or why would he ask the "WHAT THE F*CK IS GOING ON" question? Why does he play the "questionable" and "suspicious" card? Does he think there is something shady going on at Intel that goes beyond the technical stuff?
Can anyone shed some light?
Re:What is going on here...? (Score:5, Insightful)
Linus is pointing out that the patches as submitted do things that should not be necessary. For example, the Linux kernel now uses this code technique called “retpoline” to avoid one of the Spectre bug variants. But this set of new patches also includes a performance-hurting workaround for the same Spectre variant that was already worked around. Why would that be necessary? It suggests that maybe Intel isn’t fully disclosing everything that they know, and that maybe the “retpoline” workaround is insufficient for reasons that Intel is keeping secret.
Don't Bet On Malice When Stupidity Will Do? (Score:5, Interesting)
We're seeing similar problems to this with other very-long-established technologies, such as Windows [with Windows 10]. Things that have worked for decades up until W10 are breaking, or they are breaking in new and frustrating ways.
For example, I have a triple-screen setup and using removable SSDs via a caddy unit, I can boot my computer into 2 different W10 instances, as well as multiple Linux builds. The 2 W10 instances behave in completely different ways, despite being set up, by me, with EXACTLY the same approach [scripted]. On one of them the Task Bar keeps relocating itself around the desktop, on the other it remains static. I've been back-and-forth with Microsoft and they don't know why...
At the root of the problem I suspect they have changed something in W10, written by someone no longer at the company, possibly poorly documented and possibly with unknown consequences.
Maybe Intel are having similar issues... A decision was made a very long time ago to do something insecure and stupid with speculative execution, but the person who made that decision is no longer with the company, so a new Team are trying to fix it and simply don't know what they're doing...
I honestly don't know what the source is, but I do know that I am seeing "existing" functionality break with much greater frequency on core platforms like this. It just smacks of carelessness...
Re:What is going on here...? (Score:5, Informative)
He's saying that you shouldn't have to "opt-in" to the security that everybody expects when you boot up your processor.
At the moment, the processor just says "Hey, if you flip some magic bits when I boot I'll slow myself down and try to apply a fix".
The processor should instead say "Hey, I'm one of the fixed models, don't bother trying to fix me again".
It's a marketing/legal tactic so they can say the processor runs at such-and-such a speed (but insecurely), whereas anyone who actually cares about using the processor has to, on every boot, flip lots of magic bits to make it secure and kill its performance. If you forget: insecure. If you do it wrong: insecure. If your OS doesn't support it: insecure.
What Linus wants, and I can't disagree with him, is a flag that says "this processor isn't vulnerable, so you don't need to do anything". If that flag is not present, the OS knows it has to apply as many protections as it can, and can say "Hey, you have an insecure processor, we'll do our best" in the syslogs.
Lazy Intel? (Score:3)
Is it just my impression that Intel didn't do squat during the past half year, and only started searching for fixes now that the vulnerabilities are public?
What's also shocking to me, Intel is introducing new CPU models to the market that still don't have the flaws fixed. They really think the whole problem is overrated and no urgent action is needed.
Re:Lazy Intel? (Score:5, Informative)
But i have to say, even as i prefer AMD, AMD does not have spectre resistant CPUs either.
Yes, they do. SPECTRE attacks are more difficult to carry out against AMD than against Intel. In fact, AMD CPUs are only vulnerable to one of the two classes of SPECTRE attack. Please don't lie.
Meme (Score:3)
Linus said some bad words about Intel's behavior to Mr. Woodhouse, an Amazon employee.
Amazon is a major cloud provider.
Linus is now in his late 40s.
So.... the headline should read "OLD MAN YELLS AT CLOUD! CLOUD ANSWERS!"
Re:and your solution is? (Score:5, Insightful)
I think his response to all of this was a verbal kick to the scrotum of intel in a very public way.
I am glad of it too; had it not been for this thread, I would not have known about the issues with these 'patches', which now seem more like last-minute, frantically cobbled-together garbage.
Because of Linus' efforts, I and many other lurkers here on Slashdot will be VERY wary of any 'updates' involving Intel CPUs.
Linus does not need to fix this, the community does not need to fix this; Intel needs to fix this. Let's be realistic.
Re: (Score:3)
Intel has billions of dollars in cash on hand, let alone their yearly profits. They could spend some of that compensating people.
But no, Intel is a corporation, and consequences are for little people.
Re:and your solution is? (Score:5, Interesting)
we must fix things with what is possible, no matter how ugly.
Intel went straight to ugly, and did not satisfactorily explore the realm of the possible. Linus perceived this, and announced it to the world. The ball is now in Intel's court. They can be responsible and competent, or the whole world can know that they are the fuckups that they are. It's their call.
Re: (Score:2)
Why is fixing it Linus' job? He pointed out, accurately, that Intel's fix for this Intel problem is complete bullshit. The importance of this is that he has stripped away Intel's effort to portray its fix as something we should accept.
I applaud the man. Linus has proved that the Emperor has no clothes. It isn't his job to serve as his tailor.
Re:and your solution is? (Score:5, Insightful)
My reading also. Intel did some shady things they likely knew were shady, in order to have the best performance. Now that they have been caught, and the shady things actually turn out to be really bad, they still do not want to fix them, because they do not want to admit how much they padded the performance of their chips, and they still do not want to compete with an actually good design, because everybody would see how they have been screwed over by Intel.
Linus is just calling that out. I mean, a feature that fixes a very critical security bug, and it is _off_ by default? That is insane!
Re: (Score:3)
They knew damn well it was shady. It was widely discussed at the time. And then they patented it so AMD could not do it!
Intel are up to their neck in it.
Re: (Score:2)
How about Intel compensating everyone who's bought one of their flawed chips?
That'd be about 25% of the people on the planet, right?
Might affect the stock price and bonuses, right?
Nonstarter.
Re: (Score:3)
Why? Intel had to recall Pentiums due to the FDIV bug and that was arguably way less serious than Meltdown.
That was just P54Cs. This is every intel processor since the Pentium Pro. And many of the motherboards will require BIOS updates to take a newer processor, so even if Intel could make new processors which fit into the same package and solve these problems, they literally wouldn't work in a huge number of scenarios — especially OEM PCs whose BIOS is designed to carefully restrict which processors and memory you're allowed to install.
Re: (Score:2)
Used to be a lot of clever spods in Cambridge where ARM started up(through Acorn
In British English SPOD is an acronym - Sole Purpose Obtain Degree for a student who is academically good but socially reclusive.
Just a lump of ice for all you septics on /.
Re: (Score:3)
It was kind of you to decode SPOD. Thank you.
Now could you do the same for:
The closest Google got me was the dangers of flushing toilets in a house using a septic system that had iced up in the winter time, and I'm not sure that is where you were going...
Re:ARM guys will probably do it right (Score:5, Insightful)
ARM does not have to fix anything for the issue under discussion, and neither does AMD. Meltdown is Intel-only. They did it to get more performance while everybody else was careful and did not. Intel was warned by numerous research papers that this could go badly. Now they are lying about it and are trying to a) confuse the issue and b) have the fix (which exposes their real performance when running securely) not active at startup. a) is dishonorable and b) is insane. Linus is just calling them out here.
Spectre is something else, and hits almost everybody. While it is fortunately much, much harder to exploit (Meltdown is easy), Spectre will also be much harder to fix. It is possible that we will see an arms race for a while with Spectre and that, in the end, it will need a compiler-level fix to finally settle things. Not good, but apparently an actual hardware fix at this time would mean a performance loss of 5x-20x.
But to re-iterate: the only reason Intel tries to lump Meltdown and Spectre together is so that they do not look as grossly incompetent and dismissive of their customers' security as they are with Meltdown.
Re: (Score:2)
Ryzen 7 1st gen vs Intel i7 8th gen is something like 15% faster in multi-core (Ryzen) vs 15% faster in gaming workloads (Intel). About. There are variations like AVX (where Intel HEDT is faster still, relative to the i7) and games where the performance gap is closer to 40%. The Ryzen 7 chip and motherboard are cheaper. The ThreadRipper socket is cool.
With the Ryzen 2000-series, for the "2600" engineering sample the clock was 200 MHz higher. Guess we'll see if it stops at that or not.
The AMD chips have a bit more PCI-express lanes and
Re: (Score:2)
You can fix it.
You just need to ensure that you treat speculative loads in a way such that they can't reveal information. Say, another cache just for speculative loads. This would also mean you don't clutter up the normal cache with unused results, and for the brief moment that you're executing down a path you're using already-cached results, just from another cache. There's no reason that cache can't then send a hint, or even directly transfer results, to the main cache.
The problem is really: Is fixing it worth while co
Re: (Score:3)
How good are Intel engineers these days?
About as good as Volkswagen's. Both followed their remit perfectly (Intel: Performance above all else, VW: pass the emissions test above all else)