Bug 510 - check for combinatorial path through DIV FSM
Summary: check for combinatorial path through DIV FSM
Status: RESOLVED FIXED
Alias: None
Product: Libre-SOC's first SoC
Classification: Unclassified
Component: Source Code (show other bugs)
Version: unspecified
Hardware: Other Linux
: High enhancement
Assignee: Luke Kenneth Casson Leighton
URL:
Depends on:
Blocks: 383
  Show dependency treegraph
 
Reported: 2020-10-04 20:11 BST by Luke Kenneth Casson Leighton
Modified: 2020-10-11 15:01 BST (History)
1 user (show)

See Also:
NLnet milestone: ---
total budget (EUR) for completion of task and all subtasks: 0
budget (EUR) for this task, excluding subtasks' budget: 0
parent task for budget allocation:
child tasks for budget allocation:
The table of payments (in EUR) for this task; TOML format:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Luke Kenneth Casson Leighton 2020-10-04 20:11:17 BST
https://git.libre-soc.org/?p=soc.git;a=blob;f=src/soc/fu/div/fsm.py;h=befb463d0ef1f23fcbde6933e0c0e9070cf08b3f;hb=a1f8daaf0e4e2fcea1edf8b9037a2082cc1a59a2#l170

jacob when running nextpnr5 the highest reported path latency is through the DIV FSM because the CompUnit Manager input snd output latches, controlled by go/req from the regfiles are combinatorial *and* nextpnr5 is finding and reporting a combinatorial through the FSM itself.

can you please take a look and see where that is, you can use "ltp" in yosys on the compunit or pileline ilang file (after synth or synth_ecp5)

i suspect it is this:

      m.d.comb += self.n.valid_o.eq(~self.empty & self.div_state_next.o.done)

although it us likely to be more.  i think the root of the issue is that no output should be copied excepy synchtonously or through a synchronous register.

i.e. referring to saved_state is ok because that is copied synxhronously and thus creates a firebreak.

however referring to data.o directly and combinatorially is not.

this is a high priority issue as it will be picked up later during ASIC timing analysis which is a month closer to the deeadline, which would be really bad to try to "fix" with only days left before GDSII files are due.
Comment 1 Jacob Lifshay 2020-10-04 23:20:34 BST
commit a779f6c7d11302781f831e274cbf84b4e65593f5 (HEAD -> master, origin/master, origin/HEAD)
Author: Jacob Lifshay <programmerjake@gmail.com>
Date:   Sun Oct 4 15:17:41 2020 -0700

    change div FSM pipeline unit to not have a combinatorial path directly from inputs to outputs
    
    Fixes #510
Comment 2 Luke Kenneth Casson Leighton 2020-10-04 23:29:15 BST
neat trick, i love it! :)

diff --git a/src/soc/fu/div/fsm.py b/src/soc/fu/div/fsm.py
index 9083d6b9..2a78b19a 100644
--- a/src/soc/fu/div/fsm.py
+++ b/src/soc/fu/div/fsm.py
@@ -106,7 +106,14 @@ class DivState:
 
     @property
     def done(self):
-        return self.q_bits_known == self.quotient_width
+        return self.will_be_done_after(steps=0)
+
+    def will_be_done_after(self, steps):
+        """ Returns 1 if this state will be done after
+            another `steps` passes through DivStateNext."""
+        assert isinstance(steps, int), "steps must be an integer"
+        assert steps >= 0
+        return self.q_bits_known >= max(0, self.quotient_width - steps)
 
     @property
     def quotient(self):
Comment 3 Jacob Lifshay 2020-10-04 23:38:51 BST
Oops, forgot to check the ilang. The loop is still there.
Comment 4 Luke Kenneth Casson Leighton 2020-10-04 23:42:01 BST
(In reply to Jacob Lifshay from comment #3)
> Oops, forgot to check the ilang. The loop is still there.

aw doh :)

btw i get this from pia, i did a git pull on salsa just now, nothing pushed
recently?
    return pia.InstructionInput(ra=inp["ra"], rb=inp["rb"], overflow=overflow)
TypeError: PyInstructionInput.__new__() missing required positional argument: rc
Comment 5 Luke Kenneth Casson Leighton 2020-10-04 23:49:30 BST
(In reply to Jacob Lifshay from comment #3)
> Oops, forgot to check the ilang. The loop is still there.

i think... you're always going to have a loop: in a FSM that's unavoidable.
however the thing is that the input and the output to that loop *must* be
separated entirely by "sync".

given that both self.empty and self.saved_state are sync, i think that's
been successfully achieved.
Comment 6 Jacob Lifshay 2020-10-04 23:51:59 BST
(In reply to Jacob Lifshay from comment #3)
> Oops, forgot to check the ilang. The loop is still there.

Guess it's not exactly a loop. according to yosys, the max path length was reduced from 13 gates to 8 gates. Luke, I'll let you close this if you think it's sufficient.

command used:
yosys <<<$'read_ilang div_pipeline_FSMDivCore.il\nproc\nflatten\nsynth\n;;;\nltp -noff'
Comment 7 Jacob Lifshay 2020-10-05 00:03:47 BST
Assigning to Luke to test with nextpnr and stuff. Close if you think the fix was sufficient, otherwise reassign to me.
Comment 8 Luke Kenneth Casson Leighton 2020-10-11 15:01:27 BST
COOL!

well done jacob.  the sdram path is in litex.  and is around the 10ns range.


Info: Critical path report for clock '$glbnet$sys_clk' (posedge -> posedge):
Info: curr total
Info:  0.0 10.2  Setup sdram_choose_req_grant_TRELLIS_FF_Q_DI_PFUMX_Z_1_SLICE.CE
Info: 2.7 ns logic, 7.4 ns routing


Info: Max frequency for clock              '$glbnet$sys_clk': 98.47 MHz (PASS at 55.01 MHz)
Info: Max frequency for clock          '$glbnet$crg_clkout1': 307.50 MHz (PASS at 25.00 MHz)
Info: Max frequency for clock '$glbnet$clk100$TRELLIS_IO_IN': 339.33 MHz (PASS at 100.00 MHz)