https://git.libre-soc.org/?p=soc.git;a=blob;f=src/soc/fu/div/fsm.py;h=befb463d0ef1f23fcbde6933e0c0e9070cf08b3f;hb=a1f8daaf0e4e2fcea1edf8b9037a2082cc1a59a2#l170 jacob when running nextpnr5 the highest reported path latency is through the DIV FSM because the CompUnit Manager input snd output latches, controlled by go/req from the regfiles are combinatorial *and* nextpnr5 is finding and reporting a combinatorial through the FSM itself. can you please take a look and see where that is, you can use "ltp" in yosys on the compunit or pileline ilang file (after synth or synth_ecp5) i suspect it is this: m.d.comb += self.n.valid_o.eq(~self.empty & self.div_state_next.o.done) although it us likely to be more. i think the root of the issue is that no output should be copied excepy synchtonously or through a synchronous register. i.e. referring to saved_state is ok because that is copied synxhronously and thus creates a firebreak. however referring to data.o directly and combinatorially is not. this is a high priority issue as it will be picked up later during ASIC timing analysis which is a month closer to the deeadline, which would be really bad to try to "fix" with only days left before GDSII files are due.
commit a779f6c7d11302781f831e274cbf84b4e65593f5 (HEAD -> master, origin/master, origin/HEAD) Author: Jacob Lifshay <programmerjake@gmail.com> Date: Sun Oct 4 15:17:41 2020 -0700 change div FSM pipeline unit to not have a combinatorial path directly from inputs to outputs Fixes #510
neat trick, i love it! :) diff --git a/src/soc/fu/div/fsm.py b/src/soc/fu/div/fsm.py index 9083d6b9..2a78b19a 100644 --- a/src/soc/fu/div/fsm.py +++ b/src/soc/fu/div/fsm.py @@ -106,7 +106,14 @@ class DivState: @property def done(self): - return self.q_bits_known == self.quotient_width + return self.will_be_done_after(steps=0) + + def will_be_done_after(self, steps): + """ Returns 1 if this state will be done after + another `steps` passes through DivStateNext.""" + assert isinstance(steps, int), "steps must be an integer" + assert steps >= 0 + return self.q_bits_known >= max(0, self.quotient_width - steps) @property def quotient(self):
Oops, forgot to check the ilang. The loop is still there.
(In reply to Jacob Lifshay from comment #3) > Oops, forgot to check the ilang. The loop is still there. aw doh :) btw i get this from pia, i did a git pull on salsa just now, nothing pushed recently? return pia.InstructionInput(ra=inp["ra"], rb=inp["rb"], overflow=overflow) TypeError: PyInstructionInput.__new__() missing required positional argument: rc
(In reply to Jacob Lifshay from comment #3) > Oops, forgot to check the ilang. The loop is still there. i think... you're always going to have a loop: in a FSM that's unavoidable. however the thing is that the input and the output to that loop *must* be separated entirely by "sync". given that both self.empty and self.saved_state are sync, i think that's been successfully achieved.
(In reply to Jacob Lifshay from comment #3) > Oops, forgot to check the ilang. The loop is still there. Guess it's not exactly a loop. according to yosys, the max path length was reduced from 13 gates to 8 gates. Luke, I'll let you close this if you think it's sufficient. command used: yosys <<<$'read_ilang div_pipeline_FSMDivCore.il\nproc\nflatten\nsynth\n;;;\nltp -noff'
Assigning to Luke to test with nextpnr and stuff. Close if you think the fix was sufficient, otherwise reassign to me.
COOL! well done jacob. the sdram path is in litex. and is around the 10ns range. Info: Critical path report for clock '$glbnet$sys_clk' (posedge -> posedge): Info: curr total Info: 0.0 10.2 Setup sdram_choose_req_grant_TRELLIS_FF_Q_DI_PFUMX_Z_1_SLICE.CE Info: 2.7 ns logic, 7.4 ns routing Info: Max frequency for clock '$glbnet$sys_clk': 98.47 MHz (PASS at 55.01 MHz) Info: Max frequency for clock '$glbnet$crg_clkout1': 307.50 MHz (PASS at 25.00 MHz) Info: Max frequency for clock '$glbnet$clk100$TRELLIS_IO_IN': 339.33 MHz (PASS at 100.00 MHz)