One idea I had for making a spectre-proof speculative execution engine is to build a theoretical model CPU that compares a branch oracle against the results of its branch prediction unit: when the predictor mispredicts, the model CPU stalls for the misprediction penalty but never actually executes any mispredicted instructions. This makes the model CPU immune to spectre-style vulnerabilities, since it does no actual speculative execution.
Then, a physical CPU is built using the exact same design (every instruction and every cache state change completes on the exact same clock cycle as the corresponding instruction in the theoretical model), but using speculative execution instead of the branch oracle. By proving that the physical CPU follows the same steps as the model CPU, it can be proven to have no timing vulnerabilities that the model CPU doesn't have, ruling out spectre-style vulnerabilities.
I started writing some code to simulate that, but didn't finish.
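The simulation was never finished, but the model-CPU timing rule can be sketched in a few lines. This is a hypothetical illustration, not the unfinished code: the names (`ModelCPU`, `MISPREDICT_PENALTY`) and the 12-cycle penalty are assumptions made up for the example.

```python
# Sketch of the "branch oracle" model CPU: it never executes mispredicted
# instructions, but still charges the full misprediction penalty so that
# its timing matches a real speculative design cycle-for-cycle.

MISPREDICT_PENALTY = 12  # assumed penalty in clock cycles

class ModelCPU:
    def __init__(self, predictor, oracle):
        self.predictor = predictor  # callable: pc -> predicted taken/not-taken
        self.oracle = oracle        # callable: pc -> actual branch outcome
        self.cycles = 0

    def execute_branch(self, pc):
        predicted = self.predictor(pc)
        actual = self.oracle(pc)   # the model CPU consults the oracle
        self.cycles += 1           # cost of the branch instruction itself
        if predicted != actual:
            # Stall for the penalty instead of speculatively executing down
            # the wrong path: no wrong-path loads, no cache state changes.
            self.cycles += MISPREDICT_PENALTY
        return actual              # always continue down the correct path

# usage: a predictor that always guesses "taken" vs. an alternating oracle
cpu = ModelCPU(predictor=lambda pc: True,
               oracle=lambda pc: pc % 2 == 0)
for pc in range(4):
    cpu.execute_branch(pc)
print(cpu.cycles)  # 2 mispredictions -> 4 + 2*12 = 28 cycles
```

The key property: the cycle count depends only on prediction accuracy, never on what the wrong-path instructions would have touched.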
Having an interesting discussion with Mitch and others on news:comp.arch about whether my idea works or not; feel free to check it out if you're interested.
Created attachment 33 [details]
archive of conversation on comp.arch
archive.org doesn't seem to currently archive usenet and doesn't work with Google Groups, so I downloaded the messages for the Libre-SOC branch prediction thread, which also includes the discussion about my idea for making a speculative processor immune to spectre-style bugs.
According to the Spectre paper here: https://spectreattack.com/spectre.pdf
there are multiple Spectre style vulnerabilities. I will focus here on Spectre variant 1 from the paper.
The solution I present below comes from the Solutions section of the paper in "Mitigation Options".
Variant 1 of the attack can be prevented by having commit buffers from L2 (or memory) to L1 during speculative execution, and by having a commit buffer from the registers to L1.
It's all about making sure that the L1 cache is reverted after a misprediction so that cache fills cannot leak sensitive information.
Oh, and cache flushes should be disallowed during speculative execution.
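The commit-buffer behaviour described above can be sketched as follows. This is a hypothetical illustration of the idea only, assuming a dict-based cache model; the class and method names are invented for the example and are not from any Libre-SOC code.

```python
# Sketch of the commit-buffer idea: during speculation, fills from
# L2/memory go into a side buffer instead of L1, and are committed to L1
# only once the branch resolves as correctly predicted. On a misprediction
# the buffer is discarded, so L1 state (and hence cache timing) is exactly
# as if no speculation had happened.

class SpeculativeL1:
    def __init__(self):
        self.l1 = {}              # committed cache lines: addr -> data
        self.commit_buffer = {}   # speculative fills awaiting commit

    def fill(self, addr, data, speculative):
        if speculative:
            self.commit_buffer[addr] = data   # held outside L1
        else:
            self.l1[addr] = data

    def resolve_branch(self, correctly_predicted):
        if correctly_predicted:
            self.l1.update(self.commit_buffer)  # commit the fills to L1
        # on misprediction, just drop them: L1 is untouched
        self.commit_buffer.clear()

    def flush(self, addr, speculative):
        if speculative:
            # per the rule above: no cache flushes under speculation
            raise RuntimeError("cache flush disallowed during speculation")
        self.l1.pop(addr, None)

cache = SpeculativeL1()
cache.fill(0x1000, "secret-dependent line", speculative=True)
cache.resolve_branch(correctly_predicted=False)
print(0x1000 in cache.l1)  # False: the mispredicted fill never reached L1
```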
The timing concept you describe
(In reply to Yehowshua from comment #3)
> It's all about making sure that the L1 cache is reverted after a
> misprediction so that cache fills cannot leak sensitive information.
so basically this extends the Dependency Matrix "Shadow" concept actually
right down into the Caches.
(if a Shadow exists it is because there is some "damage" that could occur due
On Mon, May 18, 2020 at 8:25 PM Yehowshua <firstname.lastname@example.org> wrote:
> > so basically this extends the Dependency Matrix "Shadow" concept actually
> > right down into the Caches.
> Exactly - keep in mind that this is only for variant 1.
there may still be some impact. reduction in TLB entry availability
may cause timing alterations.
the L0 Cache/Buffer *might* actually be the accidentally-already-designed
place where speculation stops. i stress "might".
> I’ll read up on the other variants later.
> Why do CPUs have to be so complicated...
actually, you know what? collating all "shadow" signals right down into
the L0CacheBuffer might be a really good way to ensure that there is time
for the L0CacheBuffer to "collate" (merge) multiple requests.
right now, as it stands, each request that comes in, if sent immediately
on an elstrided sequence, could be followed up by another request that
*could* have been merged with the 1st into the exact same cache-line...
if only the 1st had been delayed for just one cycle.
if there is a good reason to delay them, this actually could be beneficial.
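The merging opportunity described above can be sketched with a simple greedy pass: requests to the same cache line that arrive back-to-back get collapsed into one line access. This is an illustrative model only; the function name, the 64-byte line size, and the greedy policy are assumptions, not the L0CacheBuffer design.

```python
# Sketch of the request-merging idea: if the L0CacheBuffer delays an
# incoming request by one cycle, a following element-strided request to
# the same cache line can be merged with it, saving a line access.

LINE_BYTES = 64  # assumed cache-line size

def merge_requests(addrs):
    """Greedily merge consecutive requests hitting the same cache line."""
    merged = []  # list of (line_number, [addresses merged into that access])
    for addr in addrs:
        line = addr // LINE_BYTES
        if merged and merged[-1][0] == line:
            merged[-1][1].append(addr)   # merged with the previous request
        else:
            merged.append((line, [addr]))
    return merged

# element-strided sequence, stride 8 bytes: 8 requests all land in one
# cache line, so they collapse into a single line access
reqs = [0x100 + 8 * i for i in range(8)]
print(len(merge_requests(reqs)))  # 1
```

Sending each request immediately would have issued 8 separate accesses to the same line; the one-cycle delay is what creates the window for the merge.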