475 – cxxsim improvements

Bug 475 - cxxsim improvements

Summary: cxxsim improvements

Status:	RESOLVED FIXED

Alias:	None

Product:	Libre-SOC's first SoC
Classification:	Unclassified
Component:	Source Code (show other bugs)
Version:	unspecified
Hardware:	PC Linux

Importance:	--- enhancement
Assignee:	whitequark

URL:

Depends on:
Blocks:

Reported:	2020-08-27 11:33 BST by Luke Kenneth Casson Leighton
Modified:	2021-05-23 12:20 BST (History)
CC List:	2 users (show)

See Also:	364 622 638
NLnet milestone:	NLNet.2019.10.043.Wishbone
total budget (EUR) for completion of task and all subtasks:	1750
budget (EUR) for this task, excluding subtasks' budget:	1750
parent task for budget allocation:	362
child tasks for budget allocation:
The table of payments (in EUR) for this task; TOML format:	"whitequark"={amount=1500, submitted=2021-05-05, paid=2021-05-11} "cesar"={amount=250, submitted=2021-05-05, paid=2021-05-11}

Attachments
Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Luke Kenneth Casson Leighton 2020-08-27 11:33:45 BST

https://github.com/nmigen/nmigen/issues/439 - cxxsim fsm bug (DONE)

cxxsim needs some adjustments after its first iteration encountered issues

Comment 1 Luke Kenneth Casson Leighton 2020-09-02 12:24:33 BST

whitequark, hi, i saw this extensive explanation:
https://freenode.irclog.whitequark.org/nmigen/2020-09-02

if you feel that you need to do more than would be reasonably fairly
covered by EUR 1500 please do not think that you have to limit decisions
because of that.  we can go up to EUR 3000 as that is the available
budget within #362.

Comment 2 Luke Kenneth Casson Leighton 2020-09-20 22:25:19 BST

whitequark hi, apologies i am dealing woth a deadline and rtlil sim is sufficient for that.

cesar reports that he tried running test_issuer.py with the latest revisions to cxxsim and it "hung" i.e. likely went into a combinatorial loop.

libresoc has some unusual combinatorial bypass modules which may be triggering unanticipated behaviour, making it a very good way to catch errors.

unfortunately the fact that it is such a massive design means that creating MVCEs is tricky (not helped by the fact that i am occupied with the deadline).

cesar if you can help, you can do a couple of things:

1) run any of the test_pipe_caller.py unit tests in soc/fu/*

2) run all of the compunit tests in soc/fu/compunits/test/*

3) cut back test_issuer.py as i described in the list message by changing the number of ALUs (to zero) and also cutting out the tests that are added to it (at the bottom of test_issuer.py)

you may not be able to cut out LDST entirely without an extensive overhaul of test_issuer.py

if test_ldst_compunits.py works but a severely cut back test_issuer.py does not then we have narrowed things down considrrably.

i will add some EUR for you to investigate this, cesar.  whitequark this will not come from the amount agreed with you.

Comment 3 Cesar Strauss 2020-09-21 10:35:11 BST

(In reply to Luke Kenneth Casson Leighton from comment #2)

> cesar if you can help, you can do a couple of things:

Sure. Will let you know what I can find.

Comment 4 Luke Kenneth Casson Leighton 2020-09-21 11:04:15 BST

(In reply to Cesar Strauss from comment #3)
> (In reply to Luke Kenneth Casson Leighton from comment #2)
> 
> > cesar if you can help, you can do a couple of things:
> 
> Sure. Will let you know what I can find.

star.  there's at least 20 pipe_caller and compunit tests that can be
used, to at least narrow down the area.

Comment 5 Cesar Strauss 2020-09-26 17:51:57 BST

All non-broken pipe_caller and compunit tests seem to hang in cxxsim.

I will try to see what they have in common, and investigate further.

Interestingly, these seem to hang even in pysim:

src/soc/fu/compunits/test/test_alu_compunit.py
src/soc/fu/compunits/test/test_cr_compunit.py

The following are broken for me, and were not tested:

src/soc/fu/alu/test/test_pipe_caller.py
src/soc/fu/branch/test/test_pipe_caller.py
src/soc/fu/cr/test/test_pipe_caller.py
src/soc/fu/compunits/test/test_div_compunit.py
src/soc/fu/ldst/test/test_pipe_caller.py
src/soc/fu/mul/test/test_pipe_caller.py
src/soc/fu/shift_rot/test/test_pipe_caller.py

Comment 6 Luke Kenneth Casson Leighton 2020-09-26 18:37:36 BST

(In reply to Cesar Strauss from comment #5)
> All non-broken pipe_caller and compunit tests seem to hang in cxxsim.
> 
> I will try to see what they have in common, and investigate further.
> 
> Interestingly, these seem to hang even in pysim:
> 
> src/soc/fu/compunits/test/test_alu_compunit.py
> src/soc/fu/compunits/test/test_cr_compunit.py

damnit i've not been keeping these up-to-date, relying on test_issuer instead.
this means that one of the operands needed is not being set up properly

this leaves either the rd.req "unanswered" (with the obvious result that
the compunit waits indefinitely for an operand that is never going to be
supplied) or the wr.req "unanswered" (with the obvious result that the
compunit waits indefinitely for a result to be read that is never going to
be read).

this is since creating the PowerDecoder2 "subsets".

test_issuer.py uses its own completely separate (full) PowerDecoder2
called "simdec", but the */test_pipe_caller.py tests do not do that.
they'll need to be updated.

Comment 7 Luke Kenneth Casson Leighton 2020-09-26 21:05:12 BST

ok cesar i found the issue, it is unique to the simulator which needs
an assembly code ("asmcode") which is the comment field from the
CSV files, to be able to interpret the assembly listings in the unit
tests.

can you try the tests again, they should now work.

Comment 8 Cesar Strauss 2020-09-27 11:52:00 BST

(In reply to Luke Kenneth Casson Leighton from comment #7)
> can you try the tests again, they should now work.

Thanks, it helped a lot.

Every */test_pipe_caller.py test is affected.

I already found a single combination of inputs that reproduces the issue, for those cases.

I will proceed with the reducing task.

Comment 9 Luke Kenneth Casson Leighton 2020-09-27 11:57:09 BST

(In reply to Cesar Strauss from comment #8)
> (In reply to Luke Kenneth Casson Leighton from comment #7)
> > can you try the tests again, they should now work.
> 
> Thanks, it helped a lot.
> 
> Every */test_pipe_caller.py test is affected.
> 
> I already found a single combination of inputs that reproduces the issue,
> for those cases.

fantastic.  and yet, compalu_multi is not?  that narrows it down quite considerably, given that there's actually not a huge difference between them.

> I will proceed with the reducing task.

awesome.

Comment 10 Cesar Strauss 2020-09-27 12:26:50 BST

(In reply to Luke Kenneth Casson Leighton from comment #9)
> 
> fantastic.  and yet, compalu_multi is not?  that narrows it down quite
> considerably, given that there's actually not a huge difference between them.

I hadn't tried compalu_multi. It does seem to be affected.

I had only tried alu_fsm.py, which is not affected.

Comment 11 Luke Kenneth Casson Leighton 2020-09-27 13:25:14 BST

(In reply to Cesar Strauss from comment #10)

> I hadn't tried compalu_multi. It does seem to be affected.

ok that's also good as it is way less code.

> 
> I had only tried alu_fsm.py, which is not affected.

which is even more interesting.

the likely candidate then, taking a wild guess, is the use of the latchregister function.

this creates a mux with a loop that is guaranteed 100% not to ever become a combinatorial loop however that requires manual analysis to verify.

Comment 12 Cesar Strauss 2020-09-27 15:36:59 BST

Found it!

https://github.com/nmigen/nmigen/issues/439#issuecomment-699641967

In a wild turn of events, the hang turned out to be triggered just by updating an input wider than 32 bits, in the simulation process.

I guess we tended to use 32 bits or less in our own experiments, which resulted a lack of coverage for wider signals.

This was only caught in higher-level tests, which used the full 64 bits of our processor design.

Comment 13 Luke Kenneth Casson Leighton 2020-09-27 16:44:49 BST

an intriguing but ultimately simple find :)

it is quite normal to expect the remaining bits (MSBs) to be set to zero, and also for any bits that will not fit in the LHS to be thrown away.

this for example is how all 1s are set: x.eq(~0) or x.eq(-1) where x is unsigned and way less than 64 bit.

Comment 14 Cesar Strauss 2020-09-27 17:40:06 BST

(In reply to Luke Kenneth Casson Leighton from comment #13)
> it is quite normal to expect the remaining bits (MSBs) to be set to zero,
> and also for any bits that will not fit in the LHS to be thrown away.
> 
> this for example is how all 1s are set: x.eq(~0) or x.eq(-1) where x is
> unsigned and way less than 64 bit.

Any non-zero update triggers the issue, provided that the signal being updated has a width greater than 32 bits.

For instance, in my trials, alu_fsm.py and alu_hier.py were not affected. This is because they instance 8-bit and 16-bit ALUs in their unit tests. I can easily make them hang by instancing a 64-bit ALU instead.

On the other hand, compalu_multi hangs because the imm_data field in the CompALUOpSubset Record is hardcoded to be 64 bits.

Comment 15 Luke Kenneth Casson Leighton 2020-09-27 18:13:39 BST

(In reply to Cesar Strauss from comment #14)
> (In reply to Luke Kenneth Casson Leighton from comment #13)
> > it is quite normal to expect the remaining bits (MSBs) to be set to zero,
> > and also for any bits that will not fit in the LHS to be thrown away.
> > 
> > this for example is how all 1s are set: x.eq(~0) or x.eq(-1) where x is
> > unsigned and way less than 64 bit.
> 
> Any non-zero update triggers the issue, provided that the signal being
> updated has a width greater than 32 bits.

interesting.  that's a very important distinction that should go in the
nmigen bugreport.  i suspect it's likely down to the use of ctypes.

it would be worthwhile specifically adding in very long signal unit tests
(128, 256, 512 bit).

Comment 16 Cesar Strauss 2020-09-29 11:02:33 BST

Whitequark fixed the hanging cxxsim issue in nMigen.

Now that the tests no longer hang, I can see that the test_pipe_caller.py tests do give the same results as pysim.

However, the compunit tests seem to wait for a result that never comes.

I'll continue investigating.

Comment 17 Luke Kenneth Casson Leighton 2020-09-29 11:06:12 BST

(In reply to Cesar Strauss from comment #16)
> Whitequark fixed the hanging cxxsim issue in nMigen.

i saw that, fantastic, and thank you for narrowing it down.
 
> Now that the tests no longer hang, I can see that the test_pipe_caller.py
> tests do give the same results as pysim.

excellent.

> However, the compunit tests seem to wait for a result that never comes.

and pysim works?
 
> I'll continue investigating.

ok

Comment 18 Cesar Strauss 2020-09-29 11:37:20 BST

(In reply to Luke Kenneth Casson Leighton from comment #17)
> > However, the compunit tests seem to wait for a result that never comes.
> 
> and pysim works?

Yes.

I will proceed by comparing pysim and cxxsim traces for
~/src/soc/src/soc/experiment/test/test_compalu_multi.py

Comment 19 Cesar Strauss 2020-09-29 22:33:10 BST

(In reply to Cesar Strauss from comment #18)
> However, the compunit tests seem to wait for a result that never comes.

This is the problem:

self.shadown_i = Signal(name="cu_shadown_i", reset=1)

On pysim, it keeps high, even if not driven by the simulation process.

On cxxsim, it stays low.

So, shadown_i is kept active, the result is never written, and the process keeps waiting for a result that never comes.

Actually, if I explicitly set shadown_i high, most of the compunit tests begin to pass already.

Comment 20 Luke Kenneth Casson Leighton 2020-09-30 00:12:42 BST

(In reply to Cesar Strauss from comment #19)
> (In reply to Cesar Strauss from comment #18)
> > However, the compunit tests seem to wait for a result that never comes.
> 
> This is the problem:
> 
> self.shadown_i = Signal(name="cu_shadown_i", reset=1)
> 
> On pysim, it keeps high, even if not driven by the simulation process.
> 
> On cxxsim, it stays low.

ah yes.

> Actually, if I explicitly set shadown_i high, most of the compunit tests
> begin to pass already.

ahh very interesting.

this might be optimised out (by yosys), or just forgot that reset arg can be nonzero.

the signal is not driven yet because we have not yet added the shadow matrices.

Comment 21 Luke Kenneth Casson Leighton 2020-12-02 15:12:44 GMT

waiting for merge, but not a blocker for this task:

* https://github.com/YosysHQ/yosys/pull/2466
* https://github.com/YosysHQ/yosys/pull/2459