Bug 325 - create POWER9 TRAP pipeline
Summary: create POWER9 TRAP pipeline
Status: RESOLVED FIXED
Alias: None
Product: Libre-SOC's first SoC
Classification: Unclassified
Component: Source Code (show other bugs)
Version: unspecified
Hardware: Other Linux
: --- enhancement
Assignee: Samuel A. Falvo II
URL:
Depends on: 356 421 436
Blocks: 383 409
  Show dependency treegraph
 
Reported: 2020-05-19 13:28 BST by Luke Kenneth Casson Leighton
Modified: 2020-10-01 23:58 BST (History)
4 users (show)

See Also:
NLnet milestone: NLNet.2019.10.Wishbone
total budget (EUR) for completion of task and all subtasks: 500
budget (EUR) for this task, excluding subtasks' budget: 0
parent task for budget allocation: 383
child tasks for budget allocation:
The table of payments (in EUR) for this task; TOML format:
cole={amount=100,paid=2020-10-01} lkcl={amount=300,paid=2020-10-01}


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Luke Kenneth Casson Leighton 2020-05-19 13:28:46 BST
we need a "trap" pipeline, or to decide if this is to be considered part of Branch, due to the changes that trap makes to the PC.  this by analysing the register port allocation.

Ops:

* OP_TRAP     done and test
* OP_MTMSR    not done d in simulator
* OP_MFMSR    done needs test
* OP_RFID     done and test
* OP_SC       done needs test
* OP_ADDPCIS  TODO

all of these fit nicely into the same 5 regfile ports

SPRSET and its get-friend are one too many, because SPRSET/get need
to access the "slow" SPR regfile as well.


TODO: RFID priv/virt

# XXX f_out.virt_mode <= b_in(MSR.IR) or b_in(MSR.PR);
# XXX f_out.priv_mode <= not b_in(MSR.PR);


source:



source:

https://git.libre-soc.org/?p=soc.git;a=tree;f=src/soc/fu/trap;hb=HEAD
Comment 1 Luke Kenneth Casson Leighton 2020-05-19 15:11:03 BST
* twi TO,RA,SI
* tw TO,RA,RB

    a <- EXTS((RA)[32:63])
    b <- EXTS((RB)[32:63])
    if (a < b) & TO[0] then TRAP
    if (a > b) & TO[1] then TRAP
    if (a = b) & TO[2] then TRAP
    if (a <u b) & TO[3] then TRAP
    if (a >u b) & TO[4] then TRAP

Special Registers Altered:

    None

so, here, inputs are:

* RA
* RB (or imm)
* MSR (where does this come from?)

logic is identical to CMP


outputs are:

* NIA
* SRR1 (SPR)

see
https://github.com/antonblanchard/microwatt/blob/master/execute1.vhdl

irq_nia <= (to_unsigned(16#700#, 64));
srr1 <= msr_copy(ctrl.msr);
-- set bit 46 to say trap occurred
srr1(63 - 46) <= '1';
                

this one although it smells like a Branch it actually seems to smell more like a MicroCoded double op, a combination of OP_CMP and OP_BR.

i am reluctant to suggest we do microcoding for this design right now
Comment 2 Luke Kenneth Casson Leighton 2020-05-19 15:57:27 BST
this code in microwatt execute1.vdhl tends to suggest that two SPRs need to be
outputted from TRAP:

* SRR0 (containing the PC *after* the trap is to return - NIA)
* SRR1 (containing the new MSR)

note they are also using a FSM, writing out the two registers SRR0 and SRR1 across
2 clock cycles.

    ctrl_tmp.irq_state <= WRITE_SRR0;
    exception := '0';
        illegal := '0';
        exception_nextpc := '0';
        v.e.exc_write_enable := '0';
        v.e.exc_write_reg := fast_spr_num(SPR_SRR0);
        v.e.exc_write_data := e_in.nia;

    if ctrl.irq_state = WRITE_SRR1 then
        v.e.exc_write_reg := fast_spr_num(SPR_SRR1);
        v.e.exc_write_data := ctrl.srr1;
            v.e.exc_write_enable := '1';
            ctrl_tmp.msr(MSR_SF) <= '1';
            ctrl_tmp.msr(MSR_EE) <= '0';
            ctrl_tmp.msr(MSR_PR) <= '0';
            ctrl_tmp.msr(MSR_IR) <= '0';
            ctrl_tmp.msr(MSR_DR) <= '0';
            ctrl_tmp.msr(MSR_RI) <= '0';
            ctrl_tmp.msr(MSR_LE) <= '1';
        f_out.redirect <= '1';
        f_out.redirect_nia <= ctrl.irq_nia;
        v.e.valid := e_in.valid;
Comment 3 Luke Kenneth Casson Leighton 2020-05-19 16:49:15 BST
section 7.5.9 covers TRAP, and shows that yes:

* SRR0 is set to "PC following trap"
* SRR1 contains the *old* MSR (sorry, not the new MSR)
* MSR is updated to meet the conditions of the trap
Comment 4 Jacob Lifshay 2020-05-19 17:54:55 BST
I think this should also cover interrupts.

I would suggest having the execution unit be a FSM since traps/interrupts are rare, so the FSM is not activated unless a trap/interrupt is taken. When we receive an external interrupt, we could switch the decoder to insert a special trap instruction instead of decoding normally.
Comment 5 Luke Kenneth Casson Leighton 2020-05-19 19:13:27 BST
(In reply to Jacob Lifshay from comment #4)
> I think this should also cover interrupts.
> 
> I would suggest having the execution unit be a FSM since traps/interrupts
> are rare, so the FSM is not activated unless a trap/interrupt is taken. When
> we receive an external interrupt, we could switch the decoder to insert a
> special trap instruction instead of decoding normally.

traps are stunningly simple, the only thing they do is change the MSR and NIA (program counter).  and change SRR0 and SRR1. it is hardly even an FSM :)

i like the interrupt idea, i believe it is kinda expected to do this.

l.
Comment 6 Luke Kenneth Casson Leighton 2020-05-19 21:43:34 BST
* did the usual code-morphs blah blah just to make me feel happy
  (and also so i know what the code is doing, so i feel confident
  when it comes to connecting it up).

  i'm done with that :)

* branch's data structure uses Data to indicate when things
  are updated.  i felt it might be sensible to keep a consistent
  API so did the same thing to Trap's data structure.

  thus, o.nia.ok, o.srr0.ok and o.srr1.ok are all set to 1
  when the trap signals "go"

  these signals, for the two SPRs, can be wired directly to
  the FunctionUnit "WRITE_REQUEST" signals with no additional
  logic needed for decoding the operand at the FunctionUnit.
Comment 7 Luke Kenneth Casson Leighton 2020-05-20 00:19:00 BST
(In reply to Luke Kenneth Casson Leighton from comment #5)
> (In reply to Jacob Lifshay from comment #4)
> > I think this should also cover interrupts.
> > 
> > I would suggest having the execution unit be a FSM since traps/interrupts
> > are rare, so the FSM is not activated unless a trap/interrupt is taken. When
> > we receive an external interrupt, we could switch the decoder to insert a
> > special trap instruction instead of decoding normally.
> 
> traps are stunningly simple, the only thing they do is change the MSR and
> NIA (program counter).  and change SRR0 and SRR1. it is hardly even an FSM :)
> 
> i like the interrupt idea, i believe it is kinda expected to do this.

hints of this technique:

irq_valid := '0';
if ctrl.msr(MSR_EE) = '1' then
    if ctrl.dec(63) = '1' then
	ctrl_tmp.irq_nia <= std_logic_vector(to_unsigned(16#900#, 64));
	report "IRQ valid: DEC";
	irq_valid := '1';
    elsif i_in.irq = '1' then
	ctrl_tmp.irq_nia <= std_logic_vector(to_unsigned(16#500#, 64));
	report "IRQ valid: External";
	irq_valid := '1';
    end if;
end if;

so hypothetically, yes, we could create an OP_IRQ that is covered by this pipeline and is responsible for changing nia etc.
Comment 8 Luke Kenneth Casson Leighton 2020-05-20 00:31:05 BST
here is the continuation of that process (the FSM in microwatt which has to take a couple of cycles to write out SRR0/1)

the first bit drops SRR1 into the SPR regfile (we can do 1 cycle) then moves on

the second elsif is near identical to TRAP except it sets the priv bit in the outgoing MSR.

this we would need to detect at the instruction issue phase (MSR PR bit set) and yes that is a pain in the neck, for *issue* phase to have to do that.

however once detected, again, an internal OP_INTPRIVthing can be created, sent to the Trap pipeline for it to action and generate SRR1 and 0


elsif irq_valid = '1' and e_in.valid = '1' then
    -- we need two cycles to write srr0 and 1
    -- will need more when we have to write HEIR
    -- Don't deliver the interrupt until we have a valid instruction
    -- coming in, so we have a valid NIA to put in SRR0.
    exception := '1';
    ctrl_tmp.srr1 <= msr_copy(ctrl.msr);

elsif e_in.valid = '1' and ctrl.msr(MSR_PR) = '1' and
            instr_is_privileged(e_in.insn_type, e_in.insn) then
    -- generate a program interrupt
    exception := '1';
    ctrl_tmp.irq_nia <= std_logic_vector(to_unsigned(16#700#, 64));
    ctrl_tmp.srr1 <= msr_copy(ctrl.msr);
    -- set bit 45 to indicate privileged instruction type interrupt
    ctrl_tmp.srr1(63 - 45) <= '1';
    report "privileged instruction";
Comment 9 Luke Kenneth Casson Leighton 2020-05-24 04:41:42 BST
suggest adding mfmsr and mtmsr to TRAP pipeline.  does mean adding RT and MSR as additional output regs.

needs adding to CSV files.
2#0001010011#  =>  
(ALU,    OP_MFMSR,     NONE,       NONE,        NONE, RT,   '0', '0', '0', '0', ZERO, '0', NONE, '0', '0', '0', '0', '0', '0', NONE, '0', '1'), -- mfmsr
2#0010110010#  => 
(ALU,    OP_MTMSRD,    NONE,       NONE,        RS,   NONE, '0', '0', '0', '0', ZERO, '0', NONE, '0', '0', '0', '0', '0', '0', NONE, '0', '1'), -- mtmsrd # ignore top bits and d
Comment 10 Luke Kenneth Casson Leighton 2020-05-24 04:43:22 BST
could also add SPRSET as well (1 more SPR reg to input)
Comment 11 Luke Kenneth Casson Leighton 2020-05-25 03:41:26 BST
and probably addpcis:

D <- d0||d1||d2
RT <- NIA + EXTS(D || 16 0)
Comment 12 Luke Kenneth Casson Leighton 2020-05-27 15:43:44 BST
(In reply to Luke Kenneth Casson Leighton from comment #11)
> and probably addpcis:
> 
> D <- d0||d1||d2
> RT <- NIA + EXTS(D || 16 0)

and OP_RFID:

	    when OP_RFID =>
		f_out.redirect <= '1';
                f_out.virt_mode <= b_in(MSR_IR) or b_in(MSR_PR);
                f_out.priv_mode <= not b_in(MSR_PR);
		f_out.redirect_nia <= a_in(63 downto 2) & "00"; -- srr0
                -- Can't use msr_copy here because the partial function MSR
                -- bits should be left unchanged, not zeroed.
                ctrl_tmp.msr(63 downto 31) <= b_in(63 downto 31);
                ctrl_tmp.msr(26 downto 22) <= b_in(26 downto 22);
                ctrl_tmp.msr(15 downto 0)  <= b_in(15 downto 0);
                if b_in(MSR_PR) = '1' then
                    ctrl_tmp.msr(MSR_EE) <= '1';
                    ctrl_tmp.msr(MSR_IR) <= '1';
                    ctrl_tmp.msr(MSR_DR) <= '1';
                end if;

although i have no idea what to do about those "priv_mode" and "virt_mode"
bits: we don't have global data structures like they do in microwatt.
Comment 13 Luke Kenneth Casson Leighton 2020-05-27 15:48:07 BST
(In reply to Luke Kenneth Casson Leighton from comment #10)
> could also add SPRSET as well (1 more SPR reg to input)

no, it's 2.  too many.  2 because one is FAST (covering TAR, LR, SRR0/1 etc.)
the other is for the (slow) SPR regfile SRAM.  not a good idea.  that's 6
input regs, and 6 output.
Comment 14 Luke Kenneth Casson Leighton 2020-05-27 15:49:42 BST
and OP_SC.  this also sets NIA and SRR1.

	    when OP_SC =>
		-- check bit 1 of the instruction is 1 so we know this is sc;
                -- 0 would mean scv, so generate an illegal instruction interrupt
		-- we need two cycles to write srr0 and 1
                if e_in.insn(1) = '1' then
                    exception := '1';
                    exception_nextpc := '1';
                    ctrl_tmp.irq_nia <= std_logic_vector(to_unsigned(16#C00#, 64));
                    ctrl_tmp.srr1 <= msr_copy(ctrl.msr);
                    report "sc";
                else
                    illegal := '1';
                end if;
Comment 15 Luke Kenneth Casson Leighton 2020-06-02 12:21:39 BST
https://git.libre-soc.org/?p=soc.git;a=commitdiff;h=2955619bc39a83b09920f8452882c4c327d30fef

cole, this is important to understand: you *do not* need to "understand"
what is "going on".  you do not need to read 1,300 page PDFs - in full -
memorised in their entirety - before being able to take action.

this can be extremely uncomfortable.  it feels... "unnatural", like,
"i should know everything because if i don't i might make a mistake or
implement the wrong thing".

sound familiar?

please understand: "understanding" != "action".

the trick that i learned - a long time ago, from doing reverse-engineering -
is to find someone else's *working* implementation, and *without* understanding,
perform as direct a one-to-one translation as is humanly and conveniently
possible.

forget optimisation.

forget "understanding".

forget even *bug* fixes.

just... get... it... translated.

through that process, it turns out that by walking over the code (which you
quotes don't understand quotes), this puts that code into your brain.

**THEN** when you read the spec - the 1300 page PDF - you go, "ohhhh, yeahhh, i
recognise that.  that's those bits/flags/functions/insert-whatever that i saw
when i was reading and translating the code, line-by-line".

and *NOW* you have quotes understanding quotes.

without having taken that first step, i guarantee to you that you will get
nowhere, fast.


so just - and i mean this literally - translate the comments that i've put
from microwatt execute1.vhdl into nmigen.  you can see how Michael and i
did that already for both OP_TRAP and OP_MFMSR.

i deliberately inserted the original microwatt code even for those, so that
you have a direct side-by-side comparison, and can follow the adage
"A is to B, as C is to D".

you'll also need to translate the bit-patterns MSR_* - this again should be
obvious and easy.
Comment 16 Luke Kenneth Casson Leighton 2020-06-02 15:06:17 BST
TODO: add OP_SC line but not exactly like it is done in microwatt:
it has to go in extra.csv so as to be able to detect (and exclude) scv.
this through bit 1 (actually bit 30 sigh) "if e_in.insn(1) = '1' then"

     17 =>       (ALU,    OP_SC,        NONE,       NONE,        NONE, NONE, '0', '0', '0', '0', ZERO, '0', NONE, '0', '0', '0', '0', '0', '0', NONE, '0', '0'), -- sc
	
so this would be something like:

17 in binary ------  bit 30 here
010001------------------------1-,TRAP,OP_SC,....,sc,NONE

see

https://github.com/antonblanchard/microwatt/commit/6f7ef8b1b91d2d5e0ed52f0538a727c6f39aa899#diff-bcf9aa33b3759d5675f1426aa1aeb0f2

                elsif std_match(f_in.insn, "010001--------------0000000---1-") then
                        report "PPC_sc";
                        v.decode := sc_instr;
Comment 17 Luke Kenneth Casson Leighton 2020-06-02 18:54:29 BST
Cole: example (A is to B, as C is to D).  examine case OP_TRAP and case OP_SC.

A:
                ctrl_tmp.irq_nia <= std_logic_vector(to_unsigned(16#C00#, 64));
                ctrl_tmp.srr1 <= msr_copy(ctrl.msr);

B:

                ????


C:
                    ctrl_tmp.irq_nia <= std_logic_vector(to_unsigned(16#700#, 64));
                    ctrl_tmp.srr1 <= msr_copy(ctrl.msr);

D:

                    comb += self.o.nia.data.eq(0x700)         # trap address
                    comb += self.o.nia.ok.eq(1)


it should be BLINDINGLY obvious that B should be:

                    comb += self.o.nia.data.eq(0xC00)         # trap address
                    comb += self.o.nia.ok.eq(1)


therefore, in switch statement OP_SC, that's what goes into that function.

and that's it.

ta-daaa.  OP_SC is completed.

wasn't that easy?

Q: do you understand or know what OP_SC is or does?
A: i don't... and don't care, and it is completely irrelevant to you, as well.

   (it's *optional* to have understanding: you may *desire* understanding,
    you may *like* to have understanding, but it is, fundamentally, *completely
    irrelevant* to the actual task)

Q: do we *need* to understand or know what OP_SC is or does?
A: no we do not.

Q: will it work?
A: yes it will.

Q: will there be bugs?
A: most probably, and those can be found with unit tests.

Q: do absolutely all the required unit tests have to be written *right now*?
A: no they do not.

Q: do we need to freeze and lock up solid in total fear at our total and
   complete lack of understanding just because those unit tests do not exist?
A: of course not.

so reading a 1300 page PDF is completely and utterly pointless.  and making reading that 1300 page PDF a hard, fixed, absolute critical dependency on *completing this task* is a false assumption, isn't it?
Comment 18 Cole Poirier 2020-06-03 00:07:13 BST
(In reply to Luke Kenneth Casson Leighton from comment #15)
> https://git.libre-soc.org/?p=soc.git;a=commitdiff;
> h=2955619bc39a83b09920f8452882c4c327d30fef
> 
> cole, this is important to understand: you *do not* need to "understand"
> what is "going on".  you do not need to read 1,300 page PDFs - in full -
> memorised in their entirety - before being able to take action.
> 
> this can be extremely uncomfortable.  it feels... "unnatural", like,
> "i should know everything because if i don't i might make a mistake or
> implement the wrong thing".
> 
> sound familiar?
> 
> please understand: "understanding" != "action".
> 
> the trick that i learned - a long time ago, from doing reverse-engineering -
> is to find someone else's *working* implementation, and *without*
> understanding,
> perform as direct a one-to-one translation as is humanly and conveniently
> possible.
> 
> forget optimisation.
> 
> forget "understanding".
> 
> forget even *bug* fixes.
> 
> just... get... it... translated.
> 
> through that process, it turns out that by walking over the code (which you
> quotes don't understand quotes), this puts that code into your brain.
> 
> **THEN** when you read the spec - the 1300 page PDF - you go, "ohhhh,
> yeahhh, i
> recognise that.  that's those bits/flags/functions/insert-whatever that i saw
> when i was reading and translating the code, line-by-line".
> 
> and *NOW* you have quotes understanding quotes.
> 
> without having taken that first step, i guarantee to you that you will get
> nowhere, fast.
> 
> 
> so just - and i mean this literally - translate the comments that i've put
> from microwatt execute1.vhdl into nmigen.  you can see how Michael and i
> did that already for both OP_TRAP and OP_MFMSR.
> 
> i deliberately inserted the original microwatt code even for those, so that
> you have a direct side-by-side comparison, and can follow the adage
> "A is to B, as C is to D".
> 
> you'll also need to translate the bit-patterns MSR_* - this again should be
> obvious and easy.

With you all the way Luke, just needed this redirection/confidence boost.
Comment 19 Cole Poirier 2020-06-03 00:07:53 BST
(In reply to Luke Kenneth Casson Leighton from comment #17)
> Cole: example (A is to B, as C is to D).  examine case OP_TRAP and case
> OP_SC.
> 
> A:
>                 ctrl_tmp.irq_nia <= std_logic_vector(to_unsigned(16#C00#,
> 64));
>                 ctrl_tmp.srr1 <= msr_copy(ctrl.msr);
> 
> B:
> 
>                 ????
> 
> 
> C:
>                     ctrl_tmp.irq_nia <=
> std_logic_vector(to_unsigned(16#700#, 64));
>                     ctrl_tmp.srr1 <= msr_copy(ctrl.msr);
> 
> D:
> 
>                     comb += self.o.nia.data.eq(0x700)         # trap address
>                     comb += self.o.nia.ok.eq(1)
> 
> 
> it should be BLINDINGLY obvious that B should be:
> 
>                     comb += self.o.nia.data.eq(0xC00)         # trap address
>                     comb += self.o.nia.ok.eq(1)
> 
> 
> therefore, in switch statement OP_SC, that's what goes into that function.
> 
> and that's it.
> 
> ta-daaa.  OP_SC is completed.
> 
> wasn't that easy?
> 
> Q: do you understand or know what OP_SC is or does?
> A: i don't... and don't care, and it is completely irrelevant to you, as
> well.
> 
>    (it's *optional* to have understanding: you may *desire* understanding,
>     you may *like* to have understanding, but it is, fundamentally,
> *completely
>     irrelevant* to the actual task)
> 
> Q: do we *need* to understand or know what OP_SC is or does?
> A: no we do not.
> 
> Q: will it work?
> A: yes it will.
> 
> Q: will there be bugs?
> A: most probably, and those can be found with unit tests.
> 
> Q: do absolutely all the required unit tests have to be written *right now*?
> A: no they do not.
> 
> Q: do we need to freeze and lock up solid in total fear at our total and
>    complete lack of understanding just because those unit tests do not exist?
> A: of course not.
> 
> so reading a 1300 page PDF is completely and utterly pointless.  and making
> reading that 1300 page PDF a hard, fixed, absolute critical dependency on
> *completing this task* is a false assumption, isn't it?

Very, very helpful. Indeed my false assumption was thankfully very false :)
Comment 20 Cole Poirier 2020-06-03 00:46:44 BST
Pushed a commit with my attempt at OP_RFID and OP_SC. Can you take a look at OP_RFID because I found it hard and confusing to translate? Is the only remaining TRAP instruction to be implemented OP_ADDPCIS? I see the pseudo code here for it, but I cannot understand it. Should I look for it's implementation in microwatt on github?
Comment 21 Luke Kenneth Casson Leighton 2020-06-03 00:54:24 BST
 comb += self.o.msr.data.eq(Cat(b[63:31], b[26:22], b[15:0]))

this says "concatenate bits and put thrm into the LSBs of MSR".

look again at the VHDL.  it says, "transfer 3 sets of bits into their exact same corresponding bit places.

also, python numbering is start:end+1

you have specified end:start

so the ranges are 0:16
22:27
31:64

make 3 separate assignments ok?



 comb += self.o.msr.ok.eq(a)


the ok field is 1 bit, saying "this reg is to be written".  by placing the LSB of a into it (because msr.ok is only 1 wide), you end up with random situations where the MSB will randomly get written.

msr.ok should be set to 1, to gyarantee it will be written.


also remember to translate this bit:

          if c_in(MSR_PR) = '1' then
 131                         ctrl_tmp.msr(MSR_EE) <= '1';
 132                         ctrl_tmp.msr(MSR_IR) <= '1';
 133                         ctrl_tmp.msr(MSR_DR) <= '1';



OP_SC looks great.
Comment 22 Cole Poirier 2020-06-03 00:59:16 BST
(In reply to Luke Kenneth Casson Leighton from comment #21)
>  comb += self.o.msr.data.eq(Cat(b[63:31], b[26:22], b[15:0]))
> 
> this says "concatenate bits and put thrm into the LSBs of MSR".
> 
> look again at the VHDL.  it says, "transfer 3 sets of bits into their exact
> same corresponding bit places.
> 
> also, python numbering is start:end+1
> 
> you have specified end:start
> 
> so the ranges are 0:16
> 22:27
> 31:64
> 
> make 3 separate assignments ok?

Thanks, will do. 

> 
>  comb += self.o.msr.ok.eq(a)
> 
> 
> the ok field is 1 bit, saying "this reg is to be written".  by placing the
> LSB of a into it (because msr.ok is only 1 wide), you end up with random
> situations where the MSB will randomly get written.
> 
> msr.ok should be set to 1, to gyarantee it will be written.

Ah thank you that was a typo, had meant to put 1.
 
> also remember to translate this bit:
> 
>           if c_in(MSR_PR) = '1' then
>  131                         ctrl_tmp.msr(MSR_EE) <= '1';
>  132                         ctrl_tmp.msr(MSR_IR) <= '1';
>  133                         ctrl_tmp.msr(MSR_DR) <= '1';
> 
> 
> 
> OP_SC looks great.

Thanks. I'm not sure how to tranlate c_in and in OP_MTMSR this seems to be done by just:

```
comb += self.o.msr.data.eq(a)
comb += self.o.msr.ok.eq(1)
```

Is this correct?
Comment 23 Cole Poirier 2020-06-03 01:11:03 BST
Pushed a commit with your suggested fixes, however I don't think what I wrote makes sense, but since I don't understand b_in and c_in I tried follow what you did in OP_MTMSR.

Also, does OP_ADDPCIS need to be implemented? I cannot find it's implementation in microwatt.
Comment 24 Luke Kenneth Casson Leighton 2020-06-03 01:38:12 BST
(In reply to Cole Poirier from comment #23)
> Pushed a commit with your suggested fixes, however I don't think what I
> wrote makes sense, but since I don't understand b_in and c_in I tried follow
> what you did in OP_MTMSR.

:)

you're still thinking too much.  less brain, more automatonic robot :)

the bitassignnents look great.

however, with this:
                comb += self.o.msr.data.eq(b)
 171                 comb += self.o.msr.ok.eq(1)
 172 

you just trashed all the good work :)

lose those lines.


> Also, does OP_ADDPCIS need to be implemented? I cannot find it's
> implementation in microwatt.

correct, we leave it for now.
Comment 25 Luke Kenneth Casson Leighton 2020-06-03 01:47:10 BST
i added some more TODOs, and also translated the "if blah blah" in OP_MTMSR
which you can now look at and cookie-cut for OP_RFID.

also you missed srr1 in OP_SC.  remember to set srr1.ok.eq(1)
Comment 26 Cole Poirier 2020-06-03 02:13:32 BST
Thanks for your assisstance, it made things much easier, and more robotic :)
I just pushed a new commit with your suggested changes. If this is now complete is the next step writing unit tests?
Comment 27 Luke Kenneth Casson Leighton 2020-06-03 02:28:01 BST
(In reply to Cole Poirier from comment #26)
> Thanks for your assisstance, it made things much easier, and more robotic :)
> I just pushed a new commit with your suggested changes. If this is now
> complete is the next step writing unit tests?

yes... but it's not complete.  every single line needs to be translated,
which means paying attention to every line.  it's quite... tedious.
you missed several: i highlighted them for you.
Comment 28 Luke Kenneth Casson Leighton 2020-06-03 02:35:18 BST
(In reply to Cole Poirier from comment #22)

> Thanks. I'm not sure how to tranlate c_in

err... errr... that means *sigh* having to look at the CSV files and see if
reg c can be moved to reg a, for MTMSR.

b _definitely_ needs to be in position "b" - so that means that in OP_RFID
this needs to be b not a:
     comb += self.o.msr.data[stt:end].eq( --->a<---- [stt:end])


> and in OP_MTMSR this seems to be
> done by just:
> 
> ```
> comb += self.o.msr.data.eq(a)
> comb += self.o.msr.ok.eq(1)
> ```
> 
> Is this correct?

it is... but we need to check minor_31.csv - ah:
0b0010110010,ALU,OP_MTMSRD,NONE,NONE,RS,NONE

that's wrong.  it says "RS goes into reg c" and if you look at fu/trap/pipe_data.py
we don't *have* an rs.

so i'll go ahead and change that to:

0b0010110010,ALU,OP_MTMSRD,RS,NONE,NONE,NONE

and that will put RS into *a* (which we have)

gimme 1 sec...
Comment 29 Luke Kenneth Casson Leighton 2020-06-03 02:41:06 BST
(In reply to Luke Kenneth Casson Leighton from comment #28)

> 0b0010110010,ALU,OP_MTMSRD,RS,NONE,NONE,NONE
> 
> and that will put RS into *a* (which we have)
> 
> gimme 1 sec...

commit 7e32307f763f693a405a5d1691fb23894bcd3be3 (HEAD -> master, origin/master)
Author: Luke Kenneth Casson Leighton <lkcl@lkcl.net>
Date:   Wed Jun 3 02:39:29 2020 +0100

    update submodule for ISA tables


done, do make sure to do a "git submodule update" and make sure
"git submodule" reports this:
1970241a6db97d4ace1e053e72f2a3d9462e98b2 libreriscv
Comment 30 Luke Kenneth Casson Leighton 2020-06-03 02:50:05 BST
oo... errr... do another git pull - i added this:

                # L = self.fields.FormXL[0:-1]
                # if e_in.insn(16) = '1' then  <-- this is X-form field "L".

so rather than "with m.If(insn[16]):"

you want: "with m.If(L):"

ok?
Comment 31 Cole Poirier 2020-06-03 03:31:04 BST
(In reply to Luke Kenneth Casson Leighton from comment #27)
> (In reply to Cole Poirier from comment #26)
> > Thanks for your assisstance, it made things much easier, and more robotic :)
> > I just pushed a new commit with your suggested changes. If this is now
> > complete is the next step writing unit tests?
> 
> yes... but it's not complete.  every single line needs to be translated,
> which means paying attention to every line.  it's quite... tedious.
> you missed several: i highlighted them for you.

Thanks for the help!!

Ok, I *think* I have made the necessary changes. Just pushed a commit. I'll take a look at your comments and make any new recommended changes in the morning.
Comment 32 Luke Kenneth Casson Leighton 2020-06-03 04:27:01 BST
-                # TODO translate this, import and use br_ext from branch stage
-                # f_out.redirect_nia <= a_in(63 downto 2) & "00"; -- srr0
+                comb += self.o.nia.data.eq(br_ext(a[63:1] & 0))
+                comb += self.o.nia.ok.eq(1)

br_ext already adds the 2 zeros.  & in vhdl means "concatenate".  however they
go MSB -> LSB order, python is LSB->MSB order

                comb += self.o.nia.data.eq(br_ext(a[2:]))

you removed all of these, please do put them back:

-        a_i, b_i, cia_i, msr_i = self.i.a, self.i.b, self.i.cia, self.i.msr
-        o, msr_o, nia_o = self.o.o, self.o.msr, self.o.nia
-        srr0_o, srr1_o = self.o.srr0, self.o.srr1
+        a_i, b_i = self.i.a, self.i.b


-                #     -- just update EE and RI
-                #     ctrl_tmp.msr(MSR_EE) <= c_in(MSR_EE);
-                #     ctrl_tmp.msr(MSR_RI) <= c_in(MSR_RI);
+                L = self.fields.FormX.L[0:-1]
+                with m.If(L):
+                    comb += self.o.msr[MSR_EE].eq(self.i.msr[MSR_EE])
+                    comb += self.o.msr[MSR_RI].eq(self.i.msr[MSR_RI])
+

here, this is copying the incoming MSR into the outgoing MSR.  that's
just wasting time.    look again: it says c_in, and we discussed that
this is to be moved in comment #28 to "a".

also, those defines MSR_xx do not exist.  note the comment at the top:
# TODO: turn these into python constants (just "MSR_SF = 63-0 # comment" etc.)
"""
    Listed in V3.0B Book III Chap 4.2.1
    -- MSR bit numbers
    constant MSR_SF  : integer := (63 - 0);     -- Sixty-Four bit mode
....
"""

i'm leaving it to you to create those constants.
Comment 33 Cole Poirier 2020-06-04 01:29:18 BST
(In reply to Luke Kenneth Casson Leighton from comment #32)
> -                # TODO translate this, import and use br_ext from branch
> stage
> -                # f_out.redirect_nia <= a_in(63 downto 2) & "00"; -- srr0
> +                comb += self.o.nia.data.eq(br_ext(a[63:1] & 0))
> +                comb += self.o.nia.ok.eq(1)
> 
> br_ext already adds the 2 zeros.  & in vhdl means "concatenate".  however
> they
> go MSB -> LSB order, python is LSB->MSB order
> 
>                 comb += self.o.nia.data.eq(br_ext(a[2:]))

I see. Thank you I will make this fix next...

> you removed all of these, please do put them back:
> 
> -        a_i, b_i, cia_i, msr_i = self.i.a, self.i.b, self.i.cia, self.i.msr
> -        o, msr_o, nia_o = self.o.o, self.o.msr, self.o.nia
> -        srr0_o, srr1_o = self.o.srr0, self.o.srr1
> +        a_i, b_i = self.i.a, self.i.b
> 
> 
> -                #     -- just update EE and RI
> -                #     ctrl_tmp.msr(MSR_EE) <= c_in(MSR_EE);
> -                #     ctrl_tmp.msr(MSR_RI) <= c_in(MSR_RI);
> +                L = self.fields.FormX.L[0:-1]
> +                with m.If(L):
> +                    comb += self.o.msr[MSR_EE].eq(self.i.msr[MSR_EE])
> +                    comb += self.o.msr[MSR_RI].eq(self.i.msr[MSR_RI])
> +
> 
> here, this is copying the incoming MSR into the outgoing MSR.  that's
> just wasting time.    look again: it says c_in, and we discussed that
> this is to be moved in comment #28 to "a".

Yes, sorry, I found that part confusing, and misinterpreted your change of the csv to mean that this change was unnecessary, will fix this next next.

> also, those defines MSR_xx do not exist.  note the comment at the top:
> # TODO: turn these into python constants (just "MSR_SF = 63-0 # comment"
> etc.)
> """
>     Listed in V3.0B Book III Chap 4.2.1
>     -- MSR bit numbers
>     constant MSR_SF  : integer := (63 - 0);     -- Sixty-Four bit mode
> ....
> """
> 
> i'm leaving it to you to create those constants.

I did this, and restored the convenience variables, and restored the deleted VHDL comments. It should be in the right state now, so I can move on to making the two above, next, and next next changes.
Comment 34 Luke Kenneth Casson Leighton 2020-06-04 01:34:43 BST
> go MSB -> LSB order, python is LSB->MSB order

sorry, that should be "python is LSB->MSB+1" order.
Comment 35 Cole Poirier 2020-06-04 23:41:16 BST
(In reply to Luke Kenneth Casson Leighton from comment #34)
> > go MSB -> LSB order, python is LSB->MSB order
> 
> sorry, that should be "python is LSB->MSB+1" order.

I think everything has been brought into line with your suggestions. Please take a look and let me know if I've made a mistake, I'm specifically not confident about my putting 'comb += nia_o.data.eq(br_ext(a_i[2:]))' on line 193, in OP_MFMSR.
Comment 36 Luke Kenneth Casson Leighton 2020-06-05 01:32:42 BST
 153                 with m.If(L):
 154                     comb += msr_o[MSR_EE].eq(msr_i[MSR_EE])
 155                     comb += msr_o[MSR_RI].eq(msr_i[MSR_RI])
 156 

look again at the VHDL. it inputs from c_in.  we assigned a_i to that.

here instead the copy is ciming from the incoming msr (msr_i).  clearly thus is wrong.

line by line.

compare, literally, by putting two fingers pointing one at the VHDL and the other at the nmigen code, like you are 5 years old learning to read, if you have to!
Comment 37 Luke Kenneth Casson Leighton 2020-06-05 01:35:56 BST
 197                 with m.If(a[MSR_PR]):
 198                         msr_o[MSR_EE].eq(1)
 199                         msr_o[MSR_IR].eq(1)

bug.  line 187 says "b_in" doesn't it? so why is a (a variable that does not exist) being tested at line 197?
Comment 38 Luke Kenneth Casson Leighton 2020-06-05 02:06:31 BST
i'm looking at the 3.0B pseudocode for OP_SC and going "hmmm".

SC-Form

* sc LEV

    SRR0 <-iea CIA + 4
    SRR1[33:36] <- 0
    SRR1[42:47] <- 0
    SRR1[0:32]  <- MSR[0:32] 
    SRR1[37:41] <- MSR[37:41]
    SRR1[48:63] <- MSR[48:63]
    MSR <- new_value 
    NIA <- 0x0000_0000_0000_0C00
Comment 39 Luke Kenneth Casson Leighton 2020-06-05 02:09:59 BST
(In reply to Luke Kenneth Casson Leighton from comment #38)
> i'm looking at the 3.0B pseudocode for OP_SC and going "hmmm".
> 
> SC-Form
> 
> * sc LEV
> 
>     SRR0 <-iea CIA + 4

looking at what OP_TRAP does i think we need these 2 extra lines in OP_ SC:

      comb += srr0_o.data.eq(cia_i+4)   # old PC
      comb += srr0_o.ok.eq(1)
Comment 40 Luke Kenneth Casson Leighton 2020-06-05 02:25:19 BST
OP_RFID POWER9 3.0B pseudocode also has this:

    NIA <-iea SRR0[0:61] || 0b00

oooooo ngggh ok.  so in microwatt they are inputting SRR0 from a_in, and SRR1 from b_in.

we are not going to do that.

so in these lines 193 to 197:

 193                 comb += nia_o.data.eq(br_ext(a_i[2:]))
 194                 comb += nia_o.ok.eq(1)
 195                 for stt, end in [(0,16), (22, 27), (31, 64)]:
 196                     comb += msr_o.data[stt:end].eq(b_i[stt:end])
 197                 with m.If(a[MSR_PR]):

substitute srr0_i for a_i (and a at line 197) and srr1_i for b_i
Comment 41 Luke Kenneth Casson Leighton 2020-06-05 03:13:03 BST
https://git.libre-soc.org/?p=soc.git;a=blob;f=src/soc/decoder/power_decoder2.py;hb=HEAD

for OP_SC, OP_RFID and OP_TRAP we need to set the fast spr fields in DecodeA, DecodeB and so on, so that the fields are correctly set.

this may involve the CSV files, putting the information there.

example: OP_TRAP needs to output SRR0, therefore DecodeOut needs to set fast_out to SRR0 when the opcode is OP_TRAP.
Comment 42 Luke Kenneth Casson Leighton 2020-06-05 03:34:18 BST
from microwatt decode1.vhdl

elsif v.decode.insn_type = OP_RFID then
    report "PPC RFID";
    v.ispr1 := fast_spr_num(SPR_SRR0);
    v.ispr2 := fast_spr_num(SPR_SRR1);
end if;

that's what needs to go in PowerDecode2
Comment 43 Luke Kenneth Casson Leighton 2020-06-05 04:37:24 BST
(In reply to Luke Kenneth Casson Leighton from comment #42)
> from microwatt decode1.vhdl
> 
> elsif v.decode.insn_type = OP_RFID then
>     report "PPC RFID";
>     v.ispr1 := fast_spr_num(SPR_SRR0);
>     v.ispr2 := fast_spr_num(SPR_SRR1);
> end if;
> 
> that's what needs to go in PowerDecode2

done.  DecodeOut --> SRR0, DecodeOut2 --> SRR1

so.

when the TRAP pipeline is called, SRR0 will go into spr1, SRR1 will go into spr2.



--- a/src/soc/decoder/power_decoder2.py
+++ b/src/soc/decoder/power_decoder2.py
@@ -241,6 +241,12 @@ class DecodeOut(Elaboratable):
             with m.If(~self.dec.BO[2]): # 3.0B p38 BO2=0, use CTR reg
                 comb += self.fast_out.data.eq(FastRegs.CTR) # constant: CTR
                 comb += self.fast_out.ok.eq(1)
+
+        # RFID 1st spr (fast)
+        with m.If(op.internal_op == InternalOp.OP_RFID):
+                comb += self.fast_out.data.eq(FastRegs.SRR0) # constant: SRR0
+                comb += self.fast_out.ok.eq(1)
+
         return m
 
 
@@ -275,6 +281,11 @@ class DecodeOut2(Elaboratable):
                 comb += self.fast_out.data.eq(FastRegs.LR) # constant: LR
                 comb += self.fast_out.ok.eq(1)
 
+        # RFID 2nd spr (fast)
+        with m.If(op.internal_op == InternalOp.OP_RFID):
+                comb += self.fast_out.data.eq(FastRegs.SRR1) # constant: SRR1
+                comb += self.fast_out.ok.eq(1)
+
         return m
Comment 44 Luke Kenneth Casson Leighton 2020-06-05 04:52:55 BST
i added a version of msr_copy, and this version it has the option to
zero the destination msr or not.  so unlike in the comments, it *can*
be used in OP_RFID.
Comment 45 Luke Kenneth Casson Leighton 2020-06-05 05:19:12 BST
 184                     with m.If(a[MSR_PR]):
 185                             msr_o[MSR_EE].eq(1)
 186                             msr_o[MSR_IR].eq(1)
 187                             msr_o[MSR_DR].eq(1)

should be msr_o.data
Comment 46 Cole Poirier 2020-06-05 15:16:05 BST
Ok. I think now the nmigen should match the microwatt VHDL, however, there may be small errors as you made some modifications but left others for me to fix, and I got confused about some of your comments re a_i, b_i, but I did my best to interpret your comments while reading, line-by-line the vhdl from the comments.
Comment 47 Luke Kenneth Casson Leighton 2020-06-05 16:12:47 BST
this is going to be a mess.

mtmsr pseudocode:

if L = 0 then
    MSR48 <- (RS)48 | (RS)49
    MSR58 <- ((RS)58 | (RS)49)
        & ¬(MSR41 & MSR3 & (¬(RS)49))
    MSR59 <- ((RS)59 | (RS)49)
        & ¬(MSR41 & MSR3 & (¬(RS)49))
    MSR32:40 42:47 49:50 52:57 60:62
       (RS)32:40 42:47 49:50 52:57 60:62
else:
    MSR48 62 <- (RS)48 62

i forgot to add the pseudocode from mtmsr and mfmsr to the isa markdown
files.  the above is note-form from the 3.0B PDF

mfmsr:

RT <- MSR
Comment 48 Luke Kenneth Casson Leighton 2020-06-05 16:21:30 BST
just did some updates/comments so we can see what's going on, in english.
Comment 49 Cole Poirier 2020-06-05 16:33:03 BST
(In reply to Luke Kenneth Casson Leighton from comment #48)
> just did some updates/comments so we can see what's going on, in english.

Appreciated, significantly clearer than it was before. Looking at the code, it seems to be in line with the VHDL in the comments. One question, what does '# XXX which bit?' on line 147 refer to?
Comment 50 Luke Kenneth Casson Leighton 2020-06-05 16:43:30 BST
(In reply to Cole Poirier from comment #46)
> Ok. I think now the nmigen should match the microwatt VHDL, however, there
> may be small errors as you made some modifications but left others for me to
> fix, and I got confused about some of your comments re a_i, b_i, but I did
> my best to interpret your comments while reading, line-by-line the vhdl from
> the comments.

microwatt decided to use a_in, b_in and c_in as the input lines for *SPRs*.

we are not going to do that.

therefore all places where SPRs are inputs, the names need to *be* SRR0,
*be* SRR1, *NOT* a_i, b_i, c_i.

however - some of the instructions *DO TAKE REGISTERS AS INPUT*.
see section 5.4.3, "mtmsr" for example: this takes a *REGISTER* (RS)
as the input, and moves it to the MSR (as output)

whereas OP_RFID, you can see clearly the pseudocode says:

  nia <= SRR0
  msr <= SRR1

however microwatt has this:

  nia <- a_in
  msr <- b_in

this is obviously a design decision that WE ARE NOT GOING TO COPY.

where they must have done this:

  a_in <- SRR0
  b_in <- SRR1

and then and only then could this microwatt code succeed:

  nia <- a_in
  msr <- b_in

we are **NOT** repeat **NOT** going to re-purpose register file lanes.
we have *SEPARATE* lanes for SPRs, *SEPARATE* lanes for integer registers.

if we were to copy what microwatt have done we would need full data-routing
crossbars in front of the register file ports to get data from the SPR
register files over to the integer inputs on the TRAP pipeline.

in an out-of-order system such "hacks" creates a severe architectural problem for
Dependency Tracking and Management.
Comment 51 Luke Kenneth Casson Leighton 2020-06-05 16:46:38 BST
i've removed the microwatt pseudocode because it now serves no purpose.
with microwatt using a_in, b_in and c_in as overloaded-lanes for incoming
SPRs, it it no longer useful.
Comment 52 Cole Poirier 2020-06-05 17:14:11 BST
(In reply to Luke Kenneth Casson Leighton from comment #51)
> i've removed the microwatt pseudocode because it now serves no purpose.
> with microwatt using a_in, b_in and c_in as overloaded-lanes for incoming
> SPRs, it it no longer useful.

Looks good. What can I do now. Is there something left to be done here or should I move on to another pipeline? I apologize, I cannot determine this for myself by looking at fu/trap/main_stage.
Comment 53 Luke Kenneth Casson Leighton 2020-06-05 19:14:41 BST
(In reply to Cole Poirier from comment #52)
> (In reply to Luke Kenneth Casson Leighton from comment #51)
> > i've removed the microwatt pseudocode because it now serves no purpose.
> > with microwatt using a_in, b_in and c_in as overloaded-lanes for incoming
> > SPRs, it it no longer useful.
> 
> Looks good. What can I do now. Is there something left to be done here or
> should I move on to another pipeline? 

> I apologize, I cannot determine this
> for myself by looking at fu/trap/main_stage.

nono, i've already started referencing the PDF.

two things (on this bugreport):

1) make a test_pipe_caller.py and remaining data structures/code.  (cookie-cut an existing one, delete all the switch statements).  don't overwrite pipe_data.py.  do cookie-cut a trap_input_data.py (cr_input_record.py)

2) add the pseudocode to sprset.mdwn (yes sprset.mdwn is a bad name) from comment #47

https://libre-soc.org/openpower/isa/sprset/

or if you're bored of this one for a while, move on to
https://bugs.libre-soc.org/show_bug.cgi?id=348
Comment 54 Cole Poirier 2020-06-05 20:08:03 BST
(In reply to Luke Kenneth Casson Leighton from comment #47)
> this is going to be a mess.
> 
> mtmsr pseudocode:
> 
> if L = 0 then
>     MSR48 <- (RS)48 | (RS)49
>     MSR58 <- ((RS)58 | (RS)49)
>         & ¬(MSR41 & MSR3 & (¬(RS)49))
>     MSR59 <- ((RS)59 | (RS)49)
>         & ¬(MSR41 & MSR3 & (¬(RS)49))
>     MSR32:40 42:47 49:50 52:57 60:62
>        (RS)32:40 42:47 49:50 52:57 60:62
> else:
>     MSR48 62 <- (RS)48 62
> 
> i forgot to add the pseudocode from mtmsr and mfmsr to the isa markdown
> files.  the above is note-form from the 3.0B PDF
> 
> mfmsr:
> 
> RT <- MSR

Luke, in order to complete task 2 of your last comment, should this be added with the note 'Note-Form from the 3.0B PDF' instead of 'X-Form' or 'XFX-Form'?

Additionally should the pseudo code for OP_SC from comment 38 be added to sprset.mdwn as well?
Comment 55 Luke Kenneth Casson Leighton 2020-06-05 20:15:22 BST
(In reply to Cole Poirier from comment #54)
> (In reply to Luke Kenneth Casson Leighton from comment #47)
> > this is going to be a mess.
> > 
> > mtmsr pseudocode:
> > 
> > if L = 0 then
> >     MSR48 <- (RS)48 | (RS)49
> >     MSR58 <- ((RS)58 | (RS)49)
> >         & ¬(MSR41 & MSR3 & (¬(RS)49))
> >     MSR59 <- ((RS)59 | (RS)49)
> >         & ¬(MSR41 & MSR3 & (¬(RS)49))
> >     MSR32:40 42:47 49:50 52:57 60:62
> >        (RS)32:40 42:47 49:50 52:57 60:62
> > else:
> >     MSR48 62 <- (RS)48 62
> > 
> > i forgot to add the pseudocode from mtmsr and mfmsr to the isa markdown
> > files.  the above is note-form from the 3.0B PDF
> > 
> > mfmsr:
> > 
> > RT <- MSR
> 
> Luke, in order to complete task 2 of your last comment, should this be added
> with the note 'Note-Form from the 3.0B PDF' instead of 'X-Form' or
> 'XFX-Form'?

whatever is in the PDF from the spec.  the format however has to be *exactly* like the others, down to the linebreaks, because it is machine readable.

see pywriter.py,in the top level Makefile.

so you can test it before committing

> 
> Additionally should the pseudo code for OP_SC from comment 38 be added to
> sprset.mdwn as well?

grep sc or scv *.mdwn and you'll find those are in system.mdwn already.
Comment 56 Cole Poirier 2020-06-05 20:49:44 BST
(In reply to Luke Kenneth Casson Leighton from comment #55)
> whatever is in the PDF from the spec.  the format however has to be
> *exactly* like the others, down to the linebreaks, because it is machine
> readable.
> 
> see pywriter.py,in the top level Makefile.
> 
> so you can test it before committing

Apologies Luke, after 20 minutes of checking and rechecking the psuedo code against the PDF and making lots of small modifications I am unable to get pywriter to complete. It errors with the following stack trace, I think I require your assistance.

```
UnusedElaboratable: Enable tracemalloc to get the object allocation traceback
[None, <_ast.Name object at 0x7f19c3d02320>, '=', <_ast.Constant object at 0x7f19c3d02358>]
Traceback (most recent call last):
  File "src/soc/decoder/pseudo/pywriter.py", line 135, in <module>
    isa.write_pysource(source)
  File "src/soc/decoder/pseudo/pywriter.py", line 58, in write_pysource
    pycode, rused = convert_to_python(pcode, d.form)
  File "/home/colepoirier/src/soc/src/soc/decoder/power_pseudo.py", line 219, in convert_to_python
    tree = gsc.compile(pcode, mode="exec", filename="string")
  File "/home/colepoirier/src/soc/src/soc/decoder/pseudo/parser.py", line 797, in compile
    tree = self.parser.parse(code)
  File "/home/colepoirier/src/soc/src/soc/decoder/pseudo/parser.py", line 784, in parse
    result = self.parser.parse(code, lexer=self.lexer, debug=self.debug)
  File "/usr/local/lib/python3.7/dist-packages/ply-3.11-py3.7.egg/ply/yacc.py", line 333, in parse
    return self.parseopt_notrack(input, lexer, debug, tracking, tokenfunc)
  File "/usr/local/lib/python3.7/dist-packages/ply-3.11-py3.7.egg/ply/yacc.py", line 1201, in parseopt_notrack
    tok = call_errorfunc(self.errorfunc, errtoken, self)
  File "/usr/local/lib/python3.7/dist-packages/ply-3.11-py3.7.egg/ply/yacc.py", line 192, in call_errorfunc
    r = errorfunc(token)
  File "/home/colepoirier/src/soc/src/soc/decoder/pseudo/parser.py", line 767, in p_error
    raise SyntaxError(p)
SyntaxError: LexToken(NUMBER,48,2,29)
```

I cannot identify what in my current pseudo-code is causing this problem:

```
 # Move To Machine State Register
 
 X-Form
 
 * mtmsr RS,L
 
     if L = 0 then
       MSR48 <- (RS)48 | (RS)49
       MSR58 <- ((RS)58 | (RS)49)
         & ¬(MSR41 & MSR3 & (¬(RS)49))
       MSR59 <- ((RS)59 | (RS)49)
         & ¬(MSR41 & MSR3 & (¬(RS)49))
       MSR32:40 42:47 49:50 52:57 60:62
         <- (RS)32:40 42:47 49:50 52:57 60:62
     else
       MSR48 62 <- (RS)48 62
 
 Special Registers Altered:
 
     MSR
 
 # Move From Machine State Register
 
 X-Form
 
 * mfmsr RT
 
     RT <- MSR
 
 Special Registers Altered:
 
     None
 
```

> 
> > 
> > Additionally should the pseudo code for OP_SC from comment 38 be added to
> > sprset.mdwn as well?
> 
> grep sc or scv *.mdwn and you'll find those are in system.mdwn already.

Ah yes thank you, I mistakenly thougt OP_SC because I worked on it at the same time as MTMSR and MFMSR belonged to sprset not system.
Comment 57 Cole Poirier 2020-06-05 21:12:58 BST
(In reply to Luke Kenneth Casson Leighton from comment #53)
> two things (on this bugreport):
> 
> 1) make a test_pipe_caller.py and remaining data structures/code. 
> (cookie-cut an existing one, delete all the switch statements).  don't
> overwrite pipe_data.py.  do cookie-cut a trap_input_data.py
> (cr_input_record.py)

This is done. Should I start trying to figure out how to write these unit tests using fu/cr/test/test_pipe_caller as a guide? Or if this will be outside of my current capacity, should I move on to bug #348?

> 2) add the pseudocode to sprset.mdwn (yes sprset.mdwn is a bad name) from
> comment #47
> 
> https://libre-soc.org/openpower/isa/sprset/

Waiting on your help with this one, I figured I shouldn't waste more time on it given that it's likely something that you're capable of identifying and fixing in a matter of seconds because you understand the parser completely.

> or if you're bored of this one for a while, move on to
> https://bugs.libre-soc.org/show_bug.cgi?id=348
Comment 58 Luke Kenneth Casson Leighton 2020-06-05 22:59:51 BST
(In reply to Cole Poirier from comment #56)
> (In reply to Luke Kenneth Casson Leighton from comment #55)
> > whatever is in the PDF from the spec.  the format however has to be
> > *exactly* like the others, down to the linebreaks, because it is machine
> > readable.
> > 
> > see pywriter.py,in the top level Makefile.
> > 
> > so you can test it before committing
> 
> Apologies Luke, after 20 minutes of checking and rechecking the psuedo code
> against the PDF and making lots of small modifications I am unable to get
> pywriter to complete. It errors with the following stack trace, I think I
> require your assistance.

yep, the pseudo-language is not documented, it's one of those things you
have to look at other scripts and just work it out by inference.

do a git pull on the libre-soc.org repo, i added the code to it there,
you should find it works.

one thing: you had a space in front of every line.  that would not have worked:
the parser was written very very quickly.
Comment 59 Luke Kenneth Casson Leighton 2020-06-05 23:02:15 BST
(In reply to Cole Poirier from comment #57)
> (In reply to Luke Kenneth Casson Leighton from comment #53)
> > two things (on this bugreport):
> > 
> > 1) make a test_pipe_caller.py and remaining data structures/code. 
> > (cookie-cut an existing one, delete all the switch statements).  don't
> > overwrite pipe_data.py.  do cookie-cut a trap_input_data.py
> > (cr_input_record.py)
> 
> This is done. Should I start trying to figure out how to write these unit
> tests using fu/cr/test/test_pipe_caller as a guide?

yes sure.  although bear in mind that only the instructions which are supported
in the emulator will work.

> Or if this will be
> outside of my current capacity, should I move on to bug #348?

generally speaking if there are several tasks which you can do, just
do another if you're blocked: don't wait to ask, just do it.

> > 2) add the pseudocode to sprset.mdwn (yes sprset.mdwn is a bad name) from
> > comment #47
> > 
> > https://libre-soc.org/openpower/isa/sprset/
> 
> Waiting on your help with this one, I figured I shouldn't waste more time on
> it given that it's likely something that you're capable of identifying and
> fixing in a matter of seconds because you understand the parser completely.

actually i don't: it has been several months, and i wrote it very fast,
to perform a minimal job!

i've done a git push on the libre-soc repo you should be able to retry.
Comment 60 Luke Kenneth Casson Leighton 2020-06-06 00:25:24 BST
 161                     with m.If(a_i[MSR_PR]):
 162                         msr_o.data[MSR_EE].eq(1)
 163                         msr_o.data[MSR_IR].eq(1)
 164                         msr_o.data[MSR_DR].eq(1)

and at line 185 comb += is missing
Comment 61 Cole Poirier 2020-06-06 00:46:06 BST
(In reply to Luke Kenneth Casson Leighton from comment #58)
> yep, the pseudo-language is not documented, it's one of those things you
> have to look at other scripts and just work it out by inference.
> 
> do a git pull on the libre-soc.org repo, i added the code to it there,
> you should find it works.
> 
> one thing: you had a space in front of every line.  that would not have
> worked:
> the parser was written very very quickly.

Ah that's just when I pasted it here (I'm 90% sure), not in the file, for some reason buzilla likes to prepend a space when I'm pasting code here.

(In reply to Luke Kenneth Casson Leighton from comment #59)
> (In reply to Cole Poirier from comment #57)
> > This is done. Should I start trying to figure out how to write these unit
> > tests using fu/cr/test/test_pipe_caller as a guide?
> 
> yes sure.  although bear in mind that only the instructions which are
> supported
> in the emulator will work.

Oh interesting, is this our fork of gem5? Or is this  the one that is generated from parsing the csv files? I recall this being discussed on list but I can't remember which one we are currently using.
 
> > Or if this will be
> > outside of my current capacity, should I move on to bug #348?
> 
> generally speaking if there are several tasks which you can do, just
> do another if you're blocked: don't wait to ask, just do it.

Don't worry, like 'waiting days or hours between commits' this is a bad habit you helped me break a few weeks ago. I ended up working on Systemes Libres stuff with Yehowshua for Wednesday's meeting. I'll be leaving this bug for now, to work on bug #348. After I finish the pseudo-code translation for that, I plan come back to do the unit tests for this, unless, as often happens with the project, plans change.

> > > 2) add the pseudocode to sprset.mdwn (yes sprset.mdwn is a bad name) from
> > > comment #47
> > > 
> > > https://libre-soc.org/openpower/isa/sprset/
> > 
> > Waiting on your help with this one, I figured I shouldn't waste more time on
> > it given that it's likely something that you're capable of identifying and
> > fixing in a matter of seconds because you understand the parser completely.
> 
> actually i don't: it has been several months, and i wrote it very fast,
> to perform a minimal job!
> 
> i've done a git push on the libre-soc repo you should be able to retry.

Hmm yes, looking at the pseudo-code you've written there is no way I would have figured out all of the necessary modifications to the spec pseudo code needed to get the parser to validate it. Thanks for doing this for me, with this knowledge I think I'll be able to do it solo next time.
Comment 62 Luke Kenneth Casson Leighton 2020-06-06 03:23:21 BST
> (In reply to Luke Kenneth Casson Leighton from comment #59)
> > (In reply to Cole Poirier from comment #57)
> > > This is done. Should I start trying to figure out how to write these unit
> > > tests using fu/cr/test/test_pipe_caller as a guide?
> > 
> > yes sure.  although bear in mind that only the instructions which are
> > supported
> > in the emulator will work.
> 
> Oh interesting, is this our fork of gem5?

no

> Or is this  the one that is
> generated from parsing the csv files?

yes.  hence the need for the pseudocode.  until that exists there *is* no mfmsr (etc) to test against!

in this case even the TRAP function has to be written.

this would (in python) move the SPRs about, set NIA to 0x700 blah blah.

  
> > > Or if this will be
> > > outside of my current capacity, should I move on to bug #348?
> > 
> > generally speaking if there are several tasks which you can do, just
> > do another if you're blocked: don't wait to ask, just do it.
> 
> Don't worry, like 'waiting days or hours between commits' this is a bad
> habit you helped me break a few weeks ago.

:)


> > i've done a git push on the libre-soc repo you should be able to retry.
> 
> Hmm yes, looking at the pseudo-code you've written there is no way I would
> have figured out all of the necessary modifications to the spec pseudo code
> needed to get the parser to validate it. Thanks for doing this for me, with
> this knowledge I think I'll be able to do it solo next time.

cool.

the syntax is python-indented (that was a pig to implement), and borrowed python concepts such as brackets.  i replaced operators such as = with <- and so on.

subscripted ranges were obviously not possible to do in ASCII so i did straight brackets python slices but kept the POWER numbering.

so the translator pushes an extra +1 out on the "end" of the slice.

you saw that when doing bperm.
Comment 63 Cole Poirier 2020-06-07 01:31:50 BST
(In reply to Luke Kenneth Casson Leighton from comment #60)
>  161                     with m.If(a_i[MSR_PR]):
>  162                         msr_o.data[MSR_EE].eq(1)
>  163                         msr_o.data[MSR_IR].eq(1)
>  164                         msr_o.data[MSR_DR].eq(1)
> 
> and at line 185 comb += is missing

fixed
Comment 64 Cole Poirier 2020-06-07 01:38:30 BST
(In reply to Luke Kenneth Casson Leighton from comment #62)
> > Or is this  the one that is
> > generated from parsing the csv files?
> 
> yes.  hence the need for the pseudocode.  until that exists there *is* no
> mfmsr (etc) to test against!
> 
> in this case even the TRAP function has to be written.
> 
> this would (in python) move the SPRs about, set NIA to 0x700 blah blah.

This makes so much sense now and is very, very cool! Especially in light of your comments from comment #59 "actually i don't [completely understand pywriter.py]: it has been several months, and i wrote it very fast,
to perform a minimal job!"

Regarding your comment on today's kanban daily update, "put a preliminary version of mtmsr into the mdwn for Cole to check", how do I go about checking the mdwn pseudo code with the emulator and the nmigen module?

> the syntax is python-indented (that was a pig to implement), and borrowed
> python concepts such as brackets.  i replaced operators such as = with <-
> and so on.
> 
> subscripted ranges were obviously not possible to do in ASCII so i did
> straight brackets python slices but kept the POWER numbering.
> 
> so the translator pushes an extra +1 out on the "end" of the slice.
> 
> you saw that when doing bperm.

Thanks this helps me understand the 'problem' even better than I did from your earlier comments. Also very ingenious.
Comment 65 Luke Kenneth Casson Leighton 2020-06-07 01:39:27 BST
(In reply to Cole Poirier from comment #63)
> (In reply to Luke Kenneth Casson Leighton from comment #60)
> >  161                     with m.If(a_i[MSR_PR]):
> >  162                         msr_o.data[MSR_EE].eq(1)
> >  163                         msr_o.data[MSR_IR].eq(1)
> >  164                         msr_o.data[MSR_DR].eq(1)
> > 
> > and at line 185 comb += is missing
> 
> fixed

superb.  i moved that to a separate function (msr_check_pr) because
it's duplicated.  note that i used the fact that in *both* cases,
the m.If() check comes from the argument that just got copied.

   comb += msr_o.data.eq(a_i)
   with m.If(a_i[MSR_PR]):

is exactly the same as:

   comb += msr_o.eq(a_i)
   with m.If(msr_o[MSR_PR]):

and therefore it is not necessary to pass a_i *and* msr_o into
msr_check_pr().
Comment 66 Cole Poirier 2020-06-07 01:44:51 BST
(In reply to Luke Kenneth Casson Leighton from comment #65)
> superb.  i moved that to a separate function (msr_check_pr) because
> it's duplicated.  note that i used the fact that in *both* cases,
> the m.If() check comes from the argument that just got copied.
> 
>    comb += msr_o.data.eq(a_i)
>    with m.If(a_i[MSR_PR]):
> 
> is exactly the same as:
> 
>    comb += msr_o.eq(a_i)
>    with m.If(msr_o[MSR_PR]):
> 
> and therefore it is not necessary to pass a_i *and* msr_o into
> msr_check_pr().

Fantatsic!
Comment 67 Luke Kenneth Casson Leighton 2020-06-07 01:51:09 BST
(In reply to Cole Poirier from comment #64)
> (In reply to Luke Kenneth Casson Leighton from comment #62)
> > > Or is this  the one that is
> > > generated from parsing the csv files?
> > 
> > yes.  hence the need for the pseudocode.  until that exists there *is* no
> > mfmsr (etc) to test against!
> > 
> > in this case even the TRAP function has to be written.
> > 
> > this would (in python) move the SPRs about, set NIA to 0x700 blah blah.
> 
> This makes so much sense now and is very, very cool! Especially in light of
> your comments from comment #59 "actually i don't [completely understand
> pywriter.py]: it has been several months, and i wrote it very fast,
> to perform a minimal job!"
> 
> Regarding your comment on today's kanban daily update, "put a preliminary
> version of mtmsr into the mdwn for Cole to check", how do I go about
> checking the mdwn pseudo code with the emulator and the nmigen module?

in the libreriscv submodule first to "git pull origin master" - watch out
because submodules are a pain - then you can generate the emulator code.

run pywriter.py (see top-level Makefile).  then - and only then - will
the simulator "understand" mtmsr.

you can inspect the results by looking at... errr... what did we say
it was added to?  sprset.mdwn?

so... pywriter.py will.. err.. put that into... 1sec....
soc/decoder/isa/sprset.py

[btw do *NOT* "git add" that to the repository (it's auto-generated.  we
 do *NOT* add auto-generated files to the git repository, because they
 change).]

example:  mcrxrx, from the sprset.mdwn file

* mcrxrx BF

    CR[4*BF+32:4*BF+35] <-  XER[OV] || XER[OV32] || XER[CA] || XER[CA32]

---> is translated to, in sprset.py --->

    @inject()
    def op_mcrxrx(self, CR):
        CR[4 * BF + 32:4 * BF + 35 + 1] = concat(XER[OV], XER[OV32], XER[CA], XER[CA32]
            )
        return (CR,)


at that point, you *should* be able to add something to
soc/fu/trap/test/test_pipe_caller.py in class TrapTestCase

which i see you deleted all the unit tests (i usually leave one
in there on cookie-cutting so that it's less typing when it comes
to creating one.  now you will have to re-open one of the
test_pipe_caller.py files to cut/paste an example from.  this is
more work for you)


here's the thing: it is highly likely that op_trap() from sprset.py
will fail.  this because it's literally the first time that the emulator
has ever seen the "trap" instruction, and that code also has to be written.

do not let this stop you from actually writing the unit test!

the unit test *is* the way to highlight that the code needs to be written!
[this is called test-driven development]
https://en.wikipedia.org/wiki/Test-driven_development


> > so the translator pushes an extra +1 out on the "end" of the slice.
> > 
> > you saw that when doing bperm.
> 
> Thanks this helps me understand the 'problem' even better than I did from
> your earlier comments. Also very ingenious.

just lazy :)
Comment 68 Luke Kenneth Casson Leighton 2020-06-07 02:32:06 BST
actually... cole, can you do the same thing as i just did with msr_check_pr() create a function called trap(m, addr, nia)?

except this time it is a member function of the class, so trap(self, m, addr, nia)

and just as you see i redid comb=m.d.comb in msr_check_pr, repeat/copy the convenience variables that allow you to set self.o.nia etc etc

the reason for making this function is because you can see it is used twice already.

also i am thinking it may be a good idea to add the address (0x700, etc) actually to the trap_input_record and pass it in from the *decode* phase.

that would allow us to do exceptions in a micro-code fashion, and also when it comes to illegal instructions or privileged ones, in the *decode* phase (PowerDecoder2) we *CHANGE* the opcode from whatever it is from the incoming instruction, into an OP_TRAP, set the trap address to 0x700 (or whatever) manually, and let the trap pipeline execute it.
Comment 69 Luke Kenneth Casson Leighton 2020-06-07 02:42:35 BST
see https://bugs.libre-soc.org/show_bug.cgi?id=325#c8

there, the only difference between OP_TRAP and that code is: one of the bits of SRR is changed (to indicate "priv execution")

we can pass that in as a flag to CompTrapOpSubset, and change the instruction to OP_TRAP.

illegal instruction trap happens. ta-daaa
Comment 70 Cole Poirier 2020-06-07 02:52:29 BST
(In reply to Luke Kenneth Casson Leighton from comment #67)
> in the libreriscv submodule first to "git pull origin master" - watch out
> because submodules are a pain - then you can generate the emulator code.
> 
> run pywriter.py (see top-level Makefile).  then - and only then - will
> the simulator "understand" mtmsr.
> 
> you can inspect the results by looking at... errr... what did we say
> it was added to?  sprset.mdwn?
> 
> so... pywriter.py will.. err.. put that into... 1sec....
> soc/decoder/isa/sprset.py
> 
> [btw do *NOT* "git add" that to the repository (it's auto-generated.  we
>  do *NOT* add auto-generated files to the git repository, because they
>  change).]
> 
> example:  mcrxrx, from the sprset.mdwn file
> 
> * mcrxrx BF
> 
>     CR[4*BF+32:4*BF+35] <-  XER[OV] || XER[OV32] || XER[CA] || XER[CA32]
> 
> ---> is translated to, in sprset.py --->
> 
>     @inject()
>     def op_mcrxrx(self, CR):
>         CR[4 * BF + 32:4 * BF + 35 + 1] = concat(XER[OV], XER[OV32],
> XER[CA], XER[CA32]
>             )
>         return (CR,)
> 
> 
> at that point, you *should* be able to add something to
> soc/fu/trap/test/test_pipe_caller.py in class TrapTestCase
> 
> which i see you deleted all the unit tests (i usually leave one
> in there on cookie-cutting so that it's less typing when it comes
> to creating one.  now you will have to re-open one of the
> test_pipe_caller.py files to cut/paste an example from.  this is
> more work for you)
> 
> 
> here's the thing: it is highly likely that op_trap() from sprset.py
> will fail.  this because it's literally the first time that the emulator
> has ever seen the "trap" instruction, and that code also has to be written.
> 
> do not let this stop you from actually writing the unit test!
> 
> the unit test *is* the way to highlight that the code needs to be written!
> [this is called test-driven development]
> https://en.wikipedia.org/wiki/Test-driven_development

Very helpful, thank you. Will get to work on this.
Comment 71 Cole Poirier 2020-06-07 03:09:30 BST
(In reply to Luke Kenneth Casson Leighton from comment #68)
> actually... cole, can you do the same thing as i just did with
> msr_check_pr() create a function called trap(m, addr, nia)?
> 
> except this time it is a member function of the class, so trap(self, m,
> addr, nia)
> 
> and just as you see i redid comb=m.d.comb in msr_check_pr, repeat/copy the
> convenience variables that allow you to set self.o.nia etc etc
> 
> the reason for making this function is because you can see it is used twice
> already.
> 
> also i am thinking it may be a good idea to add the address (0x700, etc)
> actually to the trap_input_record and pass it in from the *decode* phase.

Sorry, I'm very shaky on the dynamic input records and how they work with the decoder.

One difference between OP_TRAP and OP_SC is that the end of the OP_TRAP function is:

```
# take a copy of the current PC in SRR0
comb += srr0_o.data.eq(cia_i)   # old PC
comb += srr0_o.ok.eq(1)
```
While the end of the OP_SC function is:

```
# and store the (next-after-return) PC in SRR0
comb += srr0_o.data.eq(cia_i+4) # addr to begin from on return
comb += srr0_o.ok.eq(1)
```

So should the function be:

```
def trap(self, m, addr, nia_o, srr0_o, trap_addr):
    comb = m.d.comb
    comb += nia_o.data.eq(trap_addr)
    comb += nia_o.ok.eq(1)

    comb += srr0_o.data.eq(addr) # addr to begin from on return
    comb += srr0_o.ok.eq(1)
```

The below is where my shakiness on the CompXOpSubset and the decoder starts to confuse me. If you can briefly explain that to me here, I can use this when doing the documentation after the tape-out deadline.

> that would allow us to do exceptions in a micro-code fashion, and also when
> it comes to illegal instructions or privileged ones, in the *decode* phase
> (PowerDecoder2) we *CHANGE* the opcode from whatever it is from the incoming
> instruction, into an OP_TRAP, set the trap address to 0x700 (or whatever)
> manually, and let the trap pipeline execute it.

(In reply to Luke Kenneth Casson Leighton from comment #69)
> see https://bugs.libre-soc.org/show_bug.cgi?id=325#c8
> 
> there, the only difference between OP_TRAP and that code is: one of the bits
> of SRR is changed (to indicate "priv execution")
> 
> we can pass that in as a flag to CompTrapOpSubset, and change the
> instruction to OP_TRAP.
> 
> illegal instruction trap happens. ta-daaa
Comment 72 Cole Poirier 2020-06-07 03:36:42 BST
(In reply to Luke Kenneth Casson Leighton from comment #67)
> in the libreriscv submodule first to "git pull origin master" - watch out
> because submodules are a pain - then you can generate the emulator code.
> 
> run pywriter.py (see top-level Makefile).  then - and only then - will
> the simulator "understand" mtmsr.
> 
> you can inspect the results by looking at... errr... what did we say
> it was added to?  sprset.mdwn?
> 
> so... pywriter.py will.. err.. put that into... 1sec....
> soc/decoder/isa/sprset.py
> 
> [btw do *NOT* "git add" that to the repository (it's auto-generated.  we
>  do *NOT* add auto-generated files to the git repository, because they
>  change).]
> 
> example:  mcrxrx, from the sprset.mdwn file
> 
> * mcrxrx BF
> 
>     CR[4*BF+32:4*BF+35] <-  XER[OV] || XER[OV32] || XER[CA] || XER[CA32]
> 
> ---> is translated to, in sprset.py --->
> 
>     @inject()
>     def op_mcrxrx(self, CR):
>         CR[4 * BF + 32:4 * BF + 35 + 1] = concat(XER[OV], XER[OV32],
> XER[CA], XER[CA32]
>             )
>         return (CR,)
> 
> 
> at that point, you *should* be able to add something to
> soc/fu/trap/test/test_pipe_caller.py in class TrapTestCase
> 
> which i see you deleted all the unit tests (i usually leave one
> in there on cookie-cutting so that it's less typing when it comes
> to creating one.  now you will have to re-open one of the
> test_pipe_caller.py files to cut/paste an example from.  this is
> more work for you)
> 
> 
> here's the thing: it is highly likely that op_trap() from sprset.py
> will fail.  this because it's literally the first time that the emulator
> has ever seen the "trap" instruction, and that code also has to be written.
> 
> do not let this stop you from actually writing the unit test!
> 
> the unit test *is* the way to highlight that the code needs to be written!
> [this is called test-driven development]
> https://en.wikipedia.org/wiki/Test-driven_development

I'm having a difficult time understanding how to write the first test for class TrapTestCase looking both at cr/test/test_pipe_caller and logical/test/test_pipe_caller... Do I have to translate all the code that was set up for CR into setting up for TRAP? Trying to think through it I can't figure it out, and I suspect it is related to my previous comment about not understanding how the test_pipe_caller, XXXInput/OutputData, CompALUOpSubset, and CompTrapOpSubset, and the decoder, relate to and interact with each other... I have a little idea of each but I'm doing the 'freeze cause I don't understand' right now.
Comment 73 Luke Kenneth Casson Leighton 2020-06-07 04:09:59 BST
(In reply to Cole Poirier from comment #71)
> (In reply to Luke Kenneth Casson Leighton from comment #68)
> > actually... cole, can you do the same thing as i just did with
> > msr_check_pr() create a function called trap(m, addr, nia)?
> > 
> > except this time it is a member function of the class, so trap(self, m,
> > addr, nia)
> > 
> > and just as you see i redid comb=m.d.comb in msr_check_pr, repeat/copy the
> > convenience variables that allow you to set self.o.nia etc etc
> > 
> > the reason for making this function is because you can see it is used twice
> > already.
> > 
> > also i am thinking it may be a good idea to add the address (0x700, etc)
> > actually to the trap_input_record and pass it in from the *decode* phase.
> 
> Sorry, I'm very shaky on the dynamic input records and how they work with
> the decoder.


will explain tomorrow when it is not 4am.
briefyly: every Record has a dictionary called fields.  by walking this dictionary we can cheat and copy *only* a subset of fields... as long as the names all match up.

see eq_from_execute1 and grep source code for where it is used.


> One difference between OP_TRAP and OP_SC is that the end of the OP_TRAP
> function is:
> 
> ```
> # take a copy of the current PC in SRR0
> comb += srr0_o.data.eq(cia_i)   # old PC
> comb += srr0_o.ok.eq(1)
> ```
> While the end of the OP_SC function is:
> 
> ```
> # and store the (next-after-return) PC in SRR0
> comb += srr0_o.data.eq(cia_i+4) # addr to begin from on return
> comb += srr0_o.ok.eq(1)
> ```
> 
> So should the function be:
> 
> ```
> def trap(self, m, addr, nia_o, srr0_o, trap_addr):
>     comb = m.d.comb
>     comb += nia_o.data.eq(trap_addr)
>     comb += nia_o.ok.eq(1)
> 
>     comb += srr0_o.data.eq(addr) # addr to begin from on return
>     comb += srr0_o.ok.eq(1)

yes! well done.  ok i would cut out srr0_o from those args and just redo them inside the fn.

so now also cookie cut srr0_0 and nia_o from lines 89 and 90 "convenience" variables.  see how that works?

so OP_SC should call self.trap(0xc00, cia_i+4) and OP_TRAP should be obvious.


> ```
> 
> The below is where my shakiness on the CompXOpSubset and the decoder starts
> to confuse me. If you can briefly explain that to me here, I can use this
> when doing the documentation after the tape-out deadline.

tomorrow.

late.
Comment 74 Luke Kenneth Casson Leighton 2020-06-07 04:23:45 BST
(In reply to Cole Poirier from comment #72)

> I'm having a difficult time understanding how to write the first test for
> class TrapTestCase looking both at cr/test/test_pipe_caller and
> logical/test/test_pipe_caller... Do I have to translate all the code that
> was set up for CR into setting up for TRAP? Trying to think through it I
> can't figure it out, and I suspect it is related to my previous comment
> about not understanding how the test_pipe_caller, XXXInput/OutputData,
> CompALUOpSubset, and CompTrapOpSubset, and the decoder, relate to and
> interact with each other... I have a little idea of each but I'm doing the
> 'freeze cause I don't understand' right now.

ok.  suggestion.  take alu test_pipe_caller.py and cut out absolutely every test_* except for one, make sure to pick one with only one assembly instruction.

(or take copy of tpc.py, call it tpc2.py)

then run it and load the generated alu_simulator.gtk in gtkwave.

at the top level you will see an instruction.

add that to the window.

then just go through the hierarchy adding signals.

look especially for these

fn_unit
dest1
dest2
src1_i
src2_i
issue_i
busy_o
rd_go_i
rd_rel_o
wr go req as well

then go *back* to the source code (remember have SEVERAL files open at once)

note that "hey anywhere i have m.submodules those show up in the hierarchy!"

note that the Record contents also show up in the gtkwave output.

look at the debug output and note the instruction assembly code 0xNnnnff1234 etc got outputted and that it also matches that toplevel signal in gtkwave.

the tree hierarchy in gtkwave will start to give you an idea of the interrelationships of the modules, in ways that staring at the static code simply cannot give you, and the  "time" dimension will show you valuable insights too.

once you have got used to this, add *one* more unit test back in, rerun tpc2.py, and hit ctrl-shift-r in gtkwave.

you should be able to track the change in outputs, the result from the operation, with what the debug output shows.
Comment 75 Luke Kenneth Casson Leighton 2020-06-07 07:11:44 BST
commit d2e0e6bea57795fb522e0805b16e6f2852472f98
Author: Luke Kenneth Casson Leighton <lkcl@lkcl.net>
Date:   Sun Jun 7 07:03:58 2020 +0100

    add MSR to simulator context

as comment says.  i also updated the git submodule so that sprset now
contains the (new, untested) mtmsr/mfmsr.  if you rerun pywriter.py
you'll see the code now gets created.

and that in turn means that if you add a unit test that uses mfmsr
or mtmsr, the code in the auto-generated sprset.py for op_mfmsr/op_mtmsr
will get called.

whether that (completely untested) code is *accurate* or not is another matter.

we now have a situation of two "unknowns"

1. unknown simulator
2. unknown hardware code.

the normal solution for this is to "call in" something that *is* known,
and that would be the qemu emulator.  except we kinda need some help from
michael to do that in a short timescale, or we need to investigate it
in a longer one.  given that there's a critical task needed (investigating
the CR bit-ordering) i'd rather we not pull him off of that.
Comment 76 Luke Kenneth Casson Leighton 2020-06-07 07:26:29 BST
i'm also attempting to add support for TRAP - as a function - in the
parser and simulator, *without* changing fixedtrap.mdwn, which is
taken directly from the 3.0B PDF.  bit of a pain.

now, if you create a test_tdi() function in trap test_pipe_caller.py,
the simulator *should* print out "TRAP TODO".


diff --git a/src/soc/decoder/isa/caller.py b/src/soc/decoder/isa/caller.py
index 61aa451..ae5d479 100644
--- a/src/soc/decoder/isa/caller.py
+++ b/src/soc/decoder/isa/caller.py
@@ -233,6 +233,11 @@ class ISACaller:
         self.decoder = decoder2.dec
         self.dec2 = decoder2
 
+    def TRAP(self, trap_addr=0x700):
+        print ("TRAP: TODO")
+        # store PC in SRR0, set PC to 0x700
+        # store MSR in SRR1, set MSR to um errr something
+
     def memassign(self, ea, sz, val):
         self.mem.memassign(ea, sz, val)
Comment 77 Cole Poirier 2020-06-07 21:42:16 BST
(In reply to Luke Kenneth Casson Leighton from comment #73)
> will explain tomorrow when it is not 4am.

You are very bad at taking your own advice :) Thank you for explaining... the entire process I should follow to gain MUCH greater understanding of the SOC and how it works... all at 4 am!!

> briefyly: every Record has a dictionary called fields.  by walking this
> dictionary we can cheat and copy *only* a subset of fields... as long as the
> names all match up.
> 
> see eq_from_execute1 and grep source code for where it is used.
> 
> 
> > One difference between OP_TRAP and OP_SC is that the end of the OP_TRAP
> > function is:
> > 
> > ```
> > # take a copy of the current PC in SRR0
> > comb += srr0_o.data.eq(cia_i)   # old PC
> > comb += srr0_o.ok.eq(1)
> > ```
> > While the end of the OP_SC function is:
> > 
> > ```
> > # and store the (next-after-return) PC in SRR0
> > comb += srr0_o.data.eq(cia_i+4) # addr to begin from on return
> > comb += srr0_o.ok.eq(1)
> > ```
> > 
> > So should the function be:
> > 
> > ```
> > def trap(self, m, addr, nia_o, srr0_o, trap_addr):
> >     comb = m.d.comb
> >     comb += nia_o.data.eq(trap_addr)
> >     comb += nia_o.ok.eq(1)
> > 
> >     comb += srr0_o.data.eq(addr) # addr to begin from on return
> >     comb += srr0_o.ok.eq(1)
> 
> yes! well done.  ok i would cut out srr0_o from those args and just redo
> them inside the fn.

Thanks! Will do.

> so now also cookie cut srr0_0 and nia_o from lines 89 and 90 "convenience"
> variables.  see how that works?
> 
> so OP_SC should call self.trap(0xc00, cia_i+4) and OP_TRAP should be obvious.
> 
> 
> > ```
> > 
> > The below is where my shakiness on the CompXOpSubset and the decoder starts
> > to confuse me. If you can briefly explain that to me here, I can use this
> > when doing the documentation after the tape-out deadline.
> 
> tomorrow.
> 
> late.

Hehe :)
Comment 78 Cole Poirier 2020-06-07 21:45:35 BST
(In reply to Luke Kenneth Casson Leighton from comment #74)
> (In reply to Cole Poirier from comment #72)
> 
> > I'm having a difficult time understanding how to write the first test for
> > class TrapTestCase looking both at cr/test/test_pipe_caller and
> > logical/test/test_pipe_caller... Do I have to translate all the code that
> > was set up for CR into setting up for TRAP? Trying to think through it I
> > can't figure it out, and I suspect it is related to my previous comment
> > about not understanding how the test_pipe_caller, XXXInput/OutputData,
> > CompALUOpSubset, and CompTrapOpSubset, and the decoder, relate to and
> > interact with each other... I have a little idea of each but I'm doing the
> > 'freeze cause I don't understand' right now.
> 
> ok.  suggestion.  take alu test_pipe_caller.py and cut out absolutely every
> test_* except for one, make sure to pick one with only one assembly
> instruction.
> 
> (or take copy of tpc.py, call it tpc2.py)
> 
> then run it and load the generated alu_simulator.gtk in gtkwave.
> 
> at the top level you will see an instruction.
> 
> add that to the window.
> 
> then just go through the hierarchy adding signals.
> 
> look especially for these
> 
> fn_unit
> dest1
> dest2
> src1_i
> src2_i
> issue_i
> busy_o
> rd_go_i
> rd_rel_o
> wr go req as well
> 
> then go *back* to the source code (remember have SEVERAL files open at once)
> 
> note that "hey anywhere i have m.submodules those show up in the hierarchy!"
> 
> note that the Record contents also show up in the gtkwave output.
> 
> look at the debug output and note the instruction assembly code 0xNnnnff1234
> etc got outputted and that it also matches that toplevel signal in gtkwave.
> 
> the tree hierarchy in gtkwave will start to give you an idea of the
> interrelationships of the modules, in ways that staring at the static code
> simply cannot give you, and the  "time" dimension will show you valuable
> insights too.
> 
> once you have got used to this, add *one* more unit test back in, rerun
> tpc2.py, and hit ctrl-shift-r in gtkwave.
> 
> you should be able to track the change in outputs, the result from the
> operation, with what the debug output shows.

Thank you, this is the process for MUCH greater understanding I mentioned above. I plan on working on this today after doing a more polished draft of the b-deck, and fixing the trap() convenience function. Probably will work on this for a few days until I feel like I'm starting to get a handle on things. Will also work on translating psuedo code to nmigen for SPR tomorrow.
Comment 79 Cole Poirier 2020-06-07 21:51:52 BST
(In reply to Luke Kenneth Casson Leighton from comment #75)
> commit d2e0e6bea57795fb522e0805b16e6f2852472f98
> Author: Luke Kenneth Casson Leighton <lkcl@lkcl.net>
> Date:   Sun Jun 7 07:03:58 2020 +0100
> 
>     add MSR to simulator context
> 
> as comment says.  i also updated the git submodule so that sprset now
> contains the (new, untested) mtmsr/mfmsr.  if you rerun pywriter.py
> you'll see the code now gets created.
> 
> and that in turn means that if you add a unit test that uses mfmsr
> or mtmsr, the code in the auto-generated sprset.py for op_mfmsr/op_mtmsr
> will get called.
> 
> whether that (completely untested) code is *accurate* or not is another
> matter.
> 
> we now have a situation of two "unknowns"
> 
> 1. unknown simulator
> 2. unknown hardware code.
> 
> the normal solution for this is to "call in" something that *is* known,
> and that would be the qemu emulator.  except we kinda need some help from
> michael to do that in a short timescale, or we need to investigate it
> in a longer one.  given that there's a critical task needed (investigating
> the CR bit-ordering) i'd rather we not pull him off of that.

So until we have the known qemu emulator I should just try writing appropriate unit tests for when we have the emulator working? Should I raise a bug report for investigating the qemu powerpc emulator on a longer timescale? What should I say in the bug description?

(In reply to Luke Kenneth Casson Leighton from comment #76)
> i'm also attempting to add support for TRAP - as a function - in the
> parser and simulator, *without* changing fixedtrap.mdwn, which is
> taken directly from the 3.0B PDF.  bit of a pain.
> 
> now, if you create a test_tdi() function in trap test_pipe_caller.py,
> the simulator *should* print out "TRAP TODO".
> 
> 
> diff --git a/src/soc/decoder/isa/caller.py b/src/soc/decoder/isa/caller.py
> index 61aa451..ae5d479 100644
> --- a/src/soc/decoder/isa/caller.py
> +++ b/src/soc/decoder/isa/caller.py
> @@ -233,6 +233,11 @@ class ISACaller:
>          self.decoder = decoder2.dec
>          self.dec2 = decoder2
>  
> +    def TRAP(self, trap_addr=0x700):
> +        print ("TRAP: TODO")
> +        # store PC in SRR0, set PC to 0x700
> +        # store MSR in SRR1, set MSR to um errr something
> +
>      def memassign(self, ea, sz, val):
>          self.mem.memassign(ea, sz, val)

This is our autogenerated from parsing csv files emulator? Is this something I can help with or are you handling it?

So is this TRAP function added in sort of ad hoc via caller.py, because it isn't contained in fixedtrap.mdwn? Is it absent from fixedtrap.mdwn because this is a different kind of trap?
Comment 80 Luke Kenneth Casson Leighton 2020-06-07 22:11:59 BST
(In reply to Cole Poirier from comment #79)
> (In reply to Luke Kenneth Casson Leighton from comment #75)
> > commit d2e0e6bea57795fb522e0805b16e6f2852472f98
> > Author: Luke Kenneth Casson Leighton <lkcl@lkcl.net>
> > Date:   Sun Jun 7 07:03:58 2020 +0100
> > 
> >     add MSR to simulator context
> > 
> > as comment says.  i also updated the git submodule so that sprset now
> > contains the (new, untested) mtmsr/mfmsr.  if you rerun pywriter.py
> > you'll see the code now gets created.
> > 
> > and that in turn means that if you add a unit test that uses mfmsr
> > or mtmsr, the code in the auto-generated sprset.py for op_mfmsr/op_mtmsr
> > will get called.
> > 
> > whether that (completely untested) code is *accurate* or not is another
> > matter.
> > 
> > we now have a situation of two "unknowns"
> > 
> > 1. unknown simulator
> > 2. unknown hardware code.
> > 
> > the normal solution for this is to "call in" something that *is* known,
> > and that would be the qemu emulator.  except we kinda need some help from
> > michael to do that in a short timescale, or we need to investigate it
> > in a longer one.  given that there's a critical task needed (investigating
> > the CR bit-ordering) i'd rather we not pull him off of that.
> 
> So until we have the known qemu emulator I should just try writing
> appropriate unit tests for when we have the emulator working? Should I raise
> a bug report for investigating the qemu powerpc emulator on a longer
> timescale? What should I say in the bug description?
> 
> (In reply to Luke Kenneth Casson Leighton from comment #76)
> > i'm also attempting to add support for TRAP - as a function - in the
> > parser and simulator, *without* changing fixedtrap.mdwn, which is
> > taken directly from the 3.0B PDF.  bit of a pain.
> > 
> > now, if you create a test_tdi() function in trap test_pipe_caller.py,
> > the simulator *should* print out "TRAP TODO".
> > 
> > 
> > diff --git a/src/soc/decoder/isa/caller.py b/src/soc/decoder/isa/caller.py
> > index 61aa451..ae5d479 100644
> > --- a/src/soc/decoder/isa/caller.py
> > +++ b/src/soc/decoder/isa/caller.py
> > @@ -233,6 +233,11 @@ class ISACaller:
> >          self.decoder = decoder2.dec
> >          self.dec2 = decoder2
> >  
> > +    def TRAP(self, trap_addr=0x700):
> > +        print ("TRAP: TODO")
> > +        # store PC in SRR0, set PC to 0x700
> > +        # store MSR in SRR1, set MSR to um errr something
> > +
> >      def memassign(self, ea, sz, val):
> >          self.mem.memassign(ea, sz, val)
> 
> This is our autogenerated from parsing csv files emulator? 

nono, that's actually the function that gets called *by* the emulator.
it's decoder/isa/caller.py (which is the base class for the simulator)

> Is this something I can help with 

actually, probably yes.  it should be as brain-dead simple as
putting things into the self.spr dict:

    self.spr['SRR0'] = self.pc.CIA

and also modify the NIA.  that sort of thing.


> So is this TRAP function added in sort of ad hoc via caller.py, because it
> isn't contained in fixedtrap.mdwn?

it is... however look at the PDF SPEC (or see comment #1 which contains
the pseudo-code.  that's not a function, is it?

so what i did was, identify the keyword "TRAP" and got it to output
a *function* call "self.TRAP()".  auto-generated decoder/isa/fixedtrap.py
now contains this:

class fixedtrap:

    @inject()
    def op_twi(self, RA):
        a = EXTS(RA[32:64])
        if lt(a, EXTS(SI)) & TO[0]:self.TRAP()
        if gt(a, EXTS(SI)) & TO[1]:self.TRAP()


*now* it can call the function which was added in caller.py, *now* we
can get that function to make the required modifications to the SPRs
and to the PC (NIA - Next Instruction Address).
Comment 81 Cole Poirier 2020-06-07 22:18:35 BST
(In reply to Luke Kenneth Casson Leighton from comment #80)
> (In reply to Cole Poirier from comment #79)
> > This is our autogenerated from parsing csv files emulator? 
> 
> nono, that's actually the function that gets called *by* the emulator.
> it's decoder/isa/caller.py (which is the base class for the simulator)
> 
> > Is this something I can help with 
> 
> actually, probably yes.  it should be as brain-dead simple as
> putting things into the self.spr dict:
> 
>     self.spr['SRR0'] = self.pc.CIA
> 
> and also modify the NIA.  that sort of thing.
> 
> 
> > So is this TRAP function added in sort of ad hoc via caller.py, because it
> > isn't contained in fixedtrap.mdwn?
> 
> it is... however look at the PDF SPEC (or see comment #1 which contains
> the pseudo-code.  that's not a function, is it?

No indeed it seems more similar to a keyword. Ah, so just translating that psuedo code into the TRAP method of ISACaller?
 
> so what i did was, identify the keyword "TRAP" and got it to output
> a *function* call "self.TRAP()".  auto-generated decoder/isa/fixedtrap.py
> now contains this:
> 
> class fixedtrap:
> 
>     @inject()
>     def op_twi(self, RA):
>         a = EXTS(RA[32:64])
>         if lt(a, EXTS(SI)) & TO[0]:self.TRAP()
>         if gt(a, EXTS(SI)) & TO[1]:self.TRAP()
> 
> 
> *now* it can call the function which was added in caller.py, *now* we
> can get that function to make the required modifications to the SPRs
> and to the PC (NIA - Next Instruction Address).

Aha, this is actually starting to make A LOT of sense :)
Comment 82 Luke Kenneth Casson Leighton 2020-06-07 22:23:07 BST
(In reply to Cole Poirier from comment #79)
> (In reply to Luke Kenneth Casson Leighton from comment #75)

> > the normal solution for this is to "call in" something that *is* known,
> > and that would be the qemu emulator.  except we kinda need some help from
> > michael to do that in a short timescale, or we need to investigate it
> > in a longer one.  given that there's a critical task needed (investigating
> > the CR bit-ordering) i'd rather we not pull him off of that.
> 
> So until we have the known qemu emulator I should just try writing
> appropriate unit tests for when we have the emulator working? Should I raise
> a bug report for investigating the qemu powerpc emulator on a longer
> timescale?

yes please.

> What should I say in the bug description?

that we need one example unit test of how to run some assembly code
"qemu vs simulator" rather than "qemu vs hardware"

or, even just "qemu with some asserts" but looking *exactly* like how
it's done in the simulator, right now.

and that ultimately it should be possible to have a runtime switch
(or a suite of tests) that allow us to verify *aaalll* the simulator
test_pipe_caller.py tests, in the following way:

* simulator vs qemu
* qemu vs hardware
* hardware vs simulator

later i would very much like to be able to add microwatt to that list.
Comment 83 Luke Kenneth Casson Leighton 2020-06-07 22:31:40 BST
(In reply to Cole Poirier from comment #81)
> (In reply to Luke Kenneth Casson Leighton from comment #80)
> > it is... however look at the PDF SPEC (or see comment #1 which contains
> > the pseudo-code.  that's not a function, is it?
> 
> No indeed it seems more similar to a keyword. 

i basically disregarded entirely the statement at the top of the POWER v3.0B
PDF, "this pseudocode is in no way intended to be an executable language"
and did exactly that.  teehee


> Ah, so just translating that
> psuedo code into the TRAP method of ISACaller?

yeeees :)

> >     @inject()
> >     def op_twi(self, RA):
> >         a = EXTS(RA[32:64])
> >         if lt(a, EXTS(SI)) & TO[0]:self.TRAP()
> >         if gt(a, EXTS(SI)) & TO[1]:self.TRAP()
> > 
> > 
> > *now* it can call the function which was added in caller.py, *now* we
> > can get that function to make the required modifications to the SPRs
> > and to the PC (NIA - Next Instruction Address).
> 
> Aha, this is actually starting to make A LOT of sense :)

ta-daaaa :)

so i saw the latest commit, and added some TODO comments.  if you
can do them *at the same time* as actually adding things in
caller.py ISACAller.TRAP, then you will see clearly the direct connection
between the two.

oh - yes, add a test_twi (or test_tdi) as well, you will have to
set an initial register RA, such as:

            initial_regs = [0] * 32
            initial_regs[2] = random.randint(0, (1<<32)-1)

or maybe do a fixed number, initially, so you know exactly what
you are testing against.  remember if you set initial_regs[2]
then that means that "RA" in the instruction has to be 2.
from fixedtrap.mdwn:

D-Form

* twi TO,RA,SI

therefore, what you put in the listing, would have to be twi TO,-->2<--,SI


TO is a bitfield, saying whether you want to compare equals, greater, less-than etc. etc.

SI is an immediate that you want to compare against register RA.

so just pick something that, *manually*, you know will be "equal", or
something.
Comment 84 Cole Poirier 2020-06-07 23:32:17 BST
(In reply to Luke Kenneth Casson Leighton from comment #82)
> (In reply to Cole Poirier from comment #79)
> > (In reply to Luke Kenneth Casson Leighton from comment #75)
> 
> > > the normal solution for this is to "call in" something that *is* known,
> > > and that would be the qemu emulator.  except we kinda need some help from
> > > michael to do that in a short timescale, or we need to investigate it
> > > in a longer one.  given that there's a critical task needed (investigating
> > > the CR bit-ordering) i'd rather we not pull him off of that.
> > 
> > So until we have the known qemu emulator I should just try writing
> > appropriate unit tests for when we have the emulator working? Should I raise
> > a bug report for investigating the qemu powerpc emulator on a longer
> > timescale?
> 
> yes please.
> 
> > What should I say in the bug description?
> 
> that we need one example unit test of how to run some assembly code
> "qemu vs simulator" rather than "qemu vs hardware"
> 
> or, even just "qemu with some asserts" but looking *exactly* like how
> it's done in the simulator, right now.
> 
> and that ultimately it should be possible to have a runtime switch
> (or a suite of tests) that allow us to verify *aaalll* the simulator
> test_pipe_caller.py tests, in the following way:
> 
> * simulator vs qemu
> * qemu vs hardware
> * hardware vs simulator
> 
> later i would very much like to be able to add microwatt to that list.

Done https://bugs.libre-soc.org/show_bug.cgi?id=368
Comment 85 Cole Poirier 2020-06-07 23:34:24 BST
(In reply to Luke Kenneth Casson Leighton from comment #83)
> (In reply to Cole Poirier from comment #81)
> > (In reply to Luke Kenneth Casson Leighton from comment #80)
> i basically disregarded entirely the statement at the top of the POWER v3.0B
> PDF, "this pseudocode is in no way intended to be an executable language"
> and did exactly that.  teehee

Seems to have saved us many months of work!
 
> > Ah, so just translating that
> > psuedo code into the TRAP method of ISACaller?
> 
> yeeees :)

Makes sense :)

> so i saw the latest commit, and added some TODO comments.  if you
> can do them *at the same time* as actually adding things in
> caller.py ISACAller.TRAP, then you will see clearly the direct connection
> between the two.

Perfect, will do.

> oh - yes, add a test_twi (or test_tdi) as well, you will have to
> set an initial register RA, such as:
> 
>             initial_regs = [0] * 32
>             initial_regs[2] = random.randint(0, (1<<32)-1)
> 
> or maybe do a fixed number, initially, so you know exactly what
> you are testing against.  remember if you set initial_regs[2]
> then that means that "RA" in the instruction has to be 2.
> from fixedtrap.mdwn:
> 
> D-Form
> 
> * twi TO,RA,SI
> 
> therefore, what you put in the listing, would have to be twi TO,-->2<--,SI
> 
> 
> TO is a bitfield, saying whether you want to compare equals, greater,
> less-than etc. etc.
> 
> SI is an immediate that you want to compare against register RA.
> 
> so just pick something that, *manually*, you know will be "equal", or
> something.

Thanks, very helpful, will do as well.
Comment 86 Luke Kenneth Casson Leighton 2020-06-08 13:59:34 BST
hi michael, apologies i couldn't help it, i did some of the TODO
comments on trap main_stage:

-        # TODO: MSR (into srr1)
+        # take a copy of the current MSR in SRR1
+        comb += msr_copy(srr1_o.data, msr_i) # old MSR
+        comb += srr1_o.ok.eq(1)
 
also i wandered through the 3.0B PDF and found the section that
covers "Program Interrupts" - Book III 7.5.9

i therefore added these defines:

PI_TRAP = (63 - 46)    # 1 if exception is "trap" type

and changed the setting of the relevant SRR1 bit to use that definition:

                    self.trap(0x700, cia_i)
                    comb += srr1_o.data[PI_TRAP].eq(1)

now we have words instead of numbers when it comes to reading through the
code.  we can see, in two lines of code:

1. "it's a trap" (self.trap - duh). 
2. execution continues at address 0x700
3. the trap handler (in software) can read SRR1 and know exactly what to do
Comment 87 Luke Kenneth Casson Leighton 2020-06-08 14:42:15 BST
(In reply to Luke Kenneth Casson Leighton from comment #86)
> hi michael,

cole!  gaah :)   names... massive headache (again) at the moment.
Comment 88 Cole Poirier 2020-06-08 22:03:36 BST
(In reply to Luke Kenneth Casson Leighton from comment #86)
> hi michael, apologies i couldn't help it, i did some of the TODO
> comments on trap main_stage:
> 
> -        # TODO: MSR (into srr1)
> +        # take a copy of the current MSR in SRR1
> +        comb += msr_copy(srr1_o.data, msr_i) # old MSR
> +        comb += srr1_o.ok.eq(1)
>  
> also i wandered through the 3.0B PDF and found the section that
> covers "Program Interrupts" - Book III 7.5.9
> 
> i therefore added these defines:
> 
> PI_TRAP = (63 - 46)    # 1 if exception is "trap" type
> 
> and changed the setting of the relevant SRR1 bit to use that definition:
> 
>                     self.trap(0x700, cia_i)
>                     comb += srr1_o.data[PI_TRAP].eq(1)
> 
> now we have words instead of numbers when it comes to reading through the
> code.  we can see, in two lines of code:
> 
> 1. "it's a trap" (self.trap - duh). 
> 2. execution continues at address 0x700
> 3. the trap handler (in software) can read SRR1 and know exactly what to do

(In reply to Luke Kenneth Casson Leighton from comment #87)
> (In reply to Luke Kenneth Casson Leighton from comment #86)
> > hi michael,
> 
> cole!  gaah :)   names... massive headache (again) at the moment.

Hehe no problem :) I took a look at this yesterday evening, and I was going to ask you for help because I was stuck, so, so much the better!
Comment 89 Luke Kenneth Casson Leighton 2020-06-08 23:51:25 BST
ok so i added two new signals to Decode2Execute1Type, called
traptype and trapaddr.  the idea is that rather than get main_stage.py
to *hard-code* what type of trap it is, and *hard-code* the address to
jump to, the *decoder* says what the trap type is, and where to jump to.

in this way we can detect interrupts, privileged instructions, and so on,
in the *decoder*.

unfortunately... this means that the decoder needs access to MSR.  whoops.
this is slightly unfortunate, because up until now, PowerDecoder2 needed
absolutely nothing - no information *at all* - about the "state".  didn't
need the PC, didn't need anything.


+        with m.If(op.internal_op == InternalOp.OP_TRAP):
+            comb += e.traptype.eq(TT_TRAP) # request trap interrupt
+            comb += e.trapaddr.eq(0x70)    # addr=0x700 (strip first nibble)
+
+        return m
 
+        # privileged instruction
+        with m.If(instr_is_privileged(m, op.internal_op, e.insn) &
+                  msr[MSR_PR]):
+            # privileged instruction trap
+            comb += op.internal_op.eq(InternalOp.OP_TRAP)
+            comb += e.traptype.eq(TT_PRIV) # request privileged instruction
+            comb += e.trapaddr.eq(0x70)    # addr=0x700 (strip first nibble)
Comment 90 Cole Poirier 2020-06-09 20:32:32 BST
(In reply to Luke Kenneth Casson Leighton from comment #89)
> ok so i added two new signals to Decode2Execute1Type, called
> traptype and trapaddr.  the idea is that rather than get main_stage.py
> to *hard-code* what type of trap it is, and *hard-code* the address to
> jump to, the *decoder* says what the trap type is, and where to jump to.
> 
> in this way we can detect interrupts, privileged instructions, and so on,
> in the *decoder*.
> 
> unfortunately... this means that the decoder needs access to MSR.  whoops.
> this is slightly unfortunate, because up until now, PowerDecoder2 needed
> absolutely nothing - no information *at all* - about the "state".  didn't
> need the PC, didn't need anything.
> 
> 
> +        with m.If(op.internal_op == InternalOp.OP_TRAP):
> +            comb += e.traptype.eq(TT_TRAP) # request trap interrupt
> +            comb += e.trapaddr.eq(0x70)    # addr=0x700 (strip first nibble)
> +
> +        return m
>  
> +        # privileged instruction
> +        with m.If(instr_is_privileged(m, op.internal_op, e.insn) &
> +                  msr[MSR_PR]):
> +            # privileged instruction trap
> +            comb += op.internal_op.eq(InternalOp.OP_TRAP)
> +            comb += e.traptype.eq(TT_PRIV) # request privileged instruction
> +            comb += e.trapaddr.eq(0x70)    # addr=0x700 (strip first nibble)

Very cool. Can you comment on the implications of the decoder needing access to MSR here? or is that meant to be a comment on a different bug report. If so, please link it here, because I'd like to understand the decoder better. Going to finally get to try the learning process involving gtkwave today!
Comment 91 Luke Kenneth Casson Leighton 2020-06-09 21:59:35 BST
(In reply to Cole Poirier from comment #90)
> (In reply to Luke Kenneth Casson Leighton from comment #89)
> > ok so i added two new signals to Decode2Execute1Type, called
> > traptype and trapaddr.  the idea is that rather than get main_stage.py
> > to *hard-code* what type of trap it is, and *hard-code* the address to
> > jump to, the *decoder* says what the trap type is, and where to jump to.
> > 
> > in this way we can detect interrupts, privileged instructions, and so on,
> > in the *decoder*.
> > 
> > unfortunately... this means that the decoder needs access to MSR.  whoops.
> > this is slightly unfortunate, because up until now, PowerDecoder2 needed
> > absolutely nothing - no information *at all* - about the "state".  didn't
> > need the PC, didn't need anything.
> > 
> > 
> > +        with m.If(op.internal_op == InternalOp.OP_TRAP):
> > +            comb += e.traptype.eq(TT_TRAP) # request trap interrupt
> > +            comb += e.trapaddr.eq(0x70)    # addr=0x700 (strip first nibble)
> > +
> > +        return m
> >  
> > +        # privileged instruction
> > +        with m.If(instr_is_privileged(m, op.internal_op, e.insn) &
> > +                  msr[MSR_PR]):
> > +            # privileged instruction trap
> > +            comb += op.internal_op.eq(InternalOp.OP_TRAP)
> > +            comb += e.traptype.eq(TT_PRIV) # request privileged instruction
> > +            comb += e.trapaddr.eq(0x70)    # addr=0x700 (strip first nibble)
> 
> Very cool. Can you comment on the implications of the decoder needing access
> to MSR here?

sigh it's annoying. however given that we gave to pass state information around, containing the PC, adding the MSR to that,state is not so bad.  it's just that it messes with dependency tracking.

> or is that meant to be a comment on a different bug report. If
> so, please link it here, because I'd like to understand the decoder better.

the decoder's really not difficult.  funnily enough i updated this page

https://libre-soc.org/Documentation/SOC/index/

it links to PowerDecoder.  the thing yo remember is that there is no hardcoded list of junk.  i will edit the page further, or we discuss on list. thus bugrrport us intended for TRAP
Comment 92 Cole Poirier 2020-06-20 02:38:03 BST
Hi Luke,

I added some code to TRAP in decoder.isa.caller but commented out because I didn't want to break test_caller. I implemented most of the pseudo-code but felt like I was on very shaky ground as I didn't fully grok the way I should be implementing the pseudo-code for this function. I think I need some guidance on how to set the 'TRAP exception type', and review of my commented code to see if I'm on the write track.

Am I correct in my understanding that once this is done in decoder.isa.caller it should be added as two tests (twi, tw) in fu.trap.test_pipe_caller?

After this all that remains for the TRAP pipeline is writing the formal proof?
Comment 93 Luke Kenneth Casson Leighton 2020-06-20 03:06:33 BST
(In reply to Cole Poirier from comment #92)
> Hi Luke,
> 
> I added some code to TRAP in decoder.isa.caller but commented out because I
> didn't want to break test_caller.

given that absolutely nothing calls it, that is not possible.

therefore commenting it out is pointless and extra work.

only when a unit test is added will the cide actually be called.

when called, it will call uncommented code, which will fail, and you will have achieved nothing of value.

if however you add the code, and the unit test in test_sim.py, you actually have something to start comparing against.

that is progress.

adding commented out code is succumbing to lack of confidence :)


> I implemented most of the pseudo-code but
> felt like I was on very shaky ground as I didn't fully grok the way I should
> be implementing the pseudo-code for this function.

only by actually trying it, and getting the test_sim.py to check the PC, abd SPRS SRR0 and SRR1 and the MSR against qemu will we find out.

until then it is guesswork.

> I think I need some
> guidance on how to set the 'TRAP exception type',

traptype? i did it already.  and added references to the 3.0B spec page.

> and review of my commented
> code to see if I'm on the write track.

yes looks perfectly reasonable to me.  may need to create a mirror msr_copy function though.

will check tomorrow when it's not 3am

> Am I correct in my understanding that once this is done in
> decoder.isa.caller it should be added as two tests (twi, tw) in
> fu.trap.test_pipe_caller?

yyyep.  and also we need to add the sane to test_sim.py and check against qemu.

> After this all that remains for the TRAP pipeline is writing the formal
> proof?

yes.
Comment 94 Luke Kenneth Casson Leighton 2020-06-20 11:40:18 BST
ok so let's go through how you could have determined, for yourself,
if the additions "would break" something.  there are two approaches:

1) run all unit tests.  that means all of them.  this basically means
   "nosetests3" at the top level.

   however this is only useful in cases where the unit tests all work
   already and where they do not take several days to complete
   (such as the ieee754fpu ones).

   therefore, perhaps a sensible thing to do would be to run
   *relevant* unit tests and only run the full lot perhaps once a week.

2) track the function that you're writing through (using source code
   grep and by asking questions) to see if it is actually used.


for (2) here, we can do this:

a) function is called TRAP therefore let us grep -r TRAP all python source.

   decoder/isa/caller.py:    def TRAP(self, trap_addr=0x700):
   decoder/isa/caller.py:        print ("TRAP: TODO")
   decoder/isa/fixedtrap.py:        if lt(a, EXTS(SI)) & TO[0]:self.TRAP()
   decoder/isa/fixedtrap.py:        if gt(a, EXTS(SI)) & TO[1]:self.TRAP()
   ...
   ...

b) the only location appears to be fixedtrap.py therefore we inspect that
   file.

   turns out it is called *only* by op_twi, op_tw, op_tdi or op_td.

c) now we need to work out how _those_ get called (we already know the
   answer, from our knowledge of the purpose of the simulator,
   but let's walk through it)

d) where does the class fixedtrap get used?  grep -r "fixedtrap" shows:

    decoder/isa/all.py:from soc.decoder.isa.fixedtrap import fixedtrap
    decoder/isa/all.py:class ISA(ISACaller, comparefixed, 
             fixedarith, fixedload, fixedstore, condition,
---->>       fixedtrap, fixedshift, stringldst, bcd, system,
             branch, fixedlogical, sprset):

   (a better way to have done that would be to grep by import as well:
    grep -r "fixedtrap" | grep import)

e) now we open all.py and find that it contains a class ISA which is
   multiply-derived from something that includes ISACaller

ah ha!

and, also, we noted some dictionaries from (d) which added op_tdi etc.
etc. to a dictionary fixedtrap_instrs

and that in all.py that dictionary merges into self.instrs.

f) therefore, we now focus our attention on ISACaller, looking for
   self.instrs.

    def call(self, name):
--->     info = self.instrs[name]  <----

g) therefore let us check where call() is called...

        def execute_one(self):
            opname = code.split(' ')[0]
--->        yield from self.call(opname)  <----

h) therefore we now grep -r for execute_one:

simulator/test_sim.py:                yield from simulator.execute_one()
fu/compunits/test/test_compunit.py:   yield from sim.execute_one()


and TA-DAAAAA!  now we know which unit tests to call!

or, we can continue with further examination of the chain (and also
check if ISACaller.call() is also used somewhere - which i believe
it is).  and part of that continued investigation will find
(grep -r "test_compunit") that test_compunit.py contains a base class
from which multiple unit tests derive or import and consequently we
need to examine those (or just simply run them).

if we _did_ examine them we would find that ABSOLUTELY NONE of the unit
tests make an assembly code call "tdi", or "twi" and so on.

and we know this also because we have been discussing adding them and
*have not yet done so*.

therefore our examination of the trail happens to link back up with our
internal knowledge and expectations.

with zero of the unit tests calling "trap" opcodes, we **KNOW** that the
ISACaller.TRAP() function cannot be called, and that, therefore, no matter
what we put in it there *can be no quotes damage quotes whatsoever*

despite appearances this task did not take long.  perhaps a couple of minutes.
however it determined *for ourselves* that there can be no "damage".

this skill - the willingness to apply this technique - is something that
you really must develop for yourself, and i describe it in summary form
in the HDL_workflow.

in addition, you will notice how many files were opened up during that
process.  a *lot*.

and *this* is also why it is so essential to have the code respect the 80
char limit, because if you have 6, 8, 10 files open and are working with
them or referring to them all the time, you *need* as many terminals
on-screen as you can possibly get.

if those files "wrap" over it becomes a readability headache that
significantly decreases everyone's productivity.
Comment 95 Cole Poirier 2020-06-21 23:58:05 BST
(In reply to Luke Kenneth Casson Leighton from comment #93)
> (In reply to Cole Poirier from comment #92)
> > Hi Luke,
> > 
> > I added some code to TRAP in decoder.isa.caller but commented out because I
> > didn't want to break test_caller.
> 
> given that absolutely nothing calls it, that is not possible.
> 
> therefore commenting it out is pointless and extra work.
> 
> only when a unit test is added will the cide actually be called.
> 
> when called, it will call uncommented code, which will fail, and you will
> have achieved nothing of value.
> 
> if however you add the code, and the unit test in test_sim.py, you actually
> have something to start comparing against.
> 
> that is progress.
> 
> adding commented out code is succumbing to lack of confidence :)

Yes mostly it was the end of the day and I was exhausted so I wasn't at that point mentally capable of doing the necessary investigation, but still wanted your input.

> > I implemented most of the pseudo-code but
> > felt like I was on very shaky ground as I didn't fully grok the way I should
> > be implementing the pseudo-code for this function.
> 
> only by actually trying it, and getting the test_sim.py to check the PC, abd
> SPRS SRR0 and SRR1 and the MSR against qemu will we find out.
> 
> until then it is guesswork.

Thanks, that's very helpful.

> > I think I need some
> > guidance on how to set the 'TRAP exception type',
> 
> traptype? i did it already.  and added references to the 3.0B spec page.

Ah yes saw that, I meant how I should set it in the code I write for caller.py

> > and review of my commented
> > code to see if I'm on the write track.
> 
> yes looks perfectly reasonable to me.  may need to create a mirror msr_copy
> function though.
> 
> will check tomorrow when it's not 3am

Sounds likely.

> > Am I correct in my understanding that once this is done in
> > decoder.isa.caller it should be added as two tests (twi, tw) in
> > fu.trap.test_pipe_caller?
> 
> yyyep.  and also we need to add the sane to test_sim.py and check against
> qemu.

Helpful, makes sense.
 
> > After this all that remains for the TRAP pipeline is writing the formal
> > proof?
> 
> yes.

Excellent, helps me understand the scopes of tasks needed to write modules, will use this in the tutorial I write.
Comment 96 Cole Poirier 2020-06-22 00:00:00 BST
(In reply to Luke Kenneth Casson Leighton from comment #94)
> ok so let's go through how you could have determined, for yourself,
> if the additions "would break" something.  there are two approaches:
> [...snip...]
> and *this* is also why it is so essential to have the code respect the 80
> char limit, because if you have 6, 8, 10 files open and are working with
> them or referring to them all the time, you *need* as many terminals
> on-screen as you can possibly get.
> 
> if those files "wrap" over it becomes a readability headache that
> significantly decreases everyone's productivity.

Thanks very much for going through this in such detail for me Luke, I will refer back to this frequently for the next few days, and include this in the tutorial I write.
Comment 97 Luke Kenneth Casson Leighton 2020-06-22 00:10:38 BST
(In reply to Cole Poirier from comment #95)

> 
> Ah yes saw that, I meant how I should set it in the code I write for
> caller.py

well, as i am guessing as much as anyone, here, given that nothing has been tested i'd suggest just going with it and adapting iteratively.

starting by running a tdi instruction in test_sim.py and seeing what happens, checking whether the PC changes to 0x700 or not.
Comment 98 Cole Poirier 2020-06-22 00:36:20 BST
(In reply to Luke Kenneth Casson Leighton from comment #97)
> (In reply to Cole Poirier from comment #95)
> 
> > 
> > Ah yes saw that, I meant how I should set it in the code I write for
> > caller.py
> 
> well, as i am guessing as much as anyone, here, given that nothing has been
> tested i'd suggest just going with it and adapting iteratively.
> 
> starting by running a tdi instruction in test_sim.py and seeing what
> happens, checking whether the PC changes to 0x700 or not.

I meant, don't I have to compare against TO and set the correct bit for the TRAP exception type? How to implement the following in caller.py:
(comment #1 of this bug report)
```
if (a < b) & TO[0] then TRAP
if (a > b) & TO[1] then TRAP
if (a = b) & TO[2] then TRAP
if (a <u b) & TO[3] then TRAP
if (a >u b) & TO[4] then TRAP
```

and the related:
```
PI_FP   = (63 - 43)    # 1 if FP exception
PI_PRIV = (63 - 45)    # 1 if privileged interrupt
PI_TRAP = (63 - 46)    # 1 if exception is "trap" type
PI_ADR  = (63 - 47)    # 0 if SRR0 = address of instruction causing             exception
```

Do I just set the appropriate bit in SRR1? And I do this after setting SRR1 to the old MSR? Or I set MSR to (?) with the appropriate bit for the TRAP exception set?

Right now comment #3 and the code you wrote for fu/trap/main_stage.py's 
`with m.Case(InternalOp.OP_TRAP):` section seem to conflict. Perhaps I'm just not getting it, but it does seem like there's an inconsistency that needs to be resolved. 

(comment #3 of this bug report)
```
section 7.5.9 covers TRAP, and shows that yes:

* SRR0 is set to "PC following trap"
* SRR1 contains the *old* MSR (sorry, not the new MSR)
* MSR is updated to meet the conditions of the trap
```
fu/trap/main_stage.py's 
`with m.Case(InternalOp.OP_TRAP):`
```
#### trap ####
with m.Case(InternalOp.OP_TRAP):
    # trap instructions (tw, twi, td, tdi)
    with m.If(should_trap):
        # generate trap-type program interrupt
        self.trap(trapaddr<<4, cia_i)
        with m.If(traptype == 0):
            # say trap occurred (see 3.0B Book III 7.5.9)
            comb += srr1_o.data[PI_TRAP].eq(1)
        with m.If(traptype & TT_PRIV):
            comb += srr1_o.data[PI_PRIV].eq(1)
        with m.If(traptype & TT_FP):
            comb += srr1_o.data[PI_FP].eq(1)
        with m.If(traptype & TT_ADDR):
            comb += srr1_o.data[PI_ADDR].eq(1)
```
Comment 99 Luke Kenneth Casson Leighton 2020-06-22 00:53:23 BST
(In reply to Cole Poirier from comment #98)


> I meant, don't I have to compare against TO and set the correct bit for the
> TRAP exception type? How to implement the following in caller.py:

look in fixedtrap.py.  search for op_tdi.  does it not look familiar?

> (comment #1 of this bug report)
> ```
> if (a < b) & TO[0] then TRAP
> if (a > b) & TO[1] then TRAP
> if (a = b) & TO[2] then TRAP
> if (a <u b) & TO[3] then TRAP
> if (a >u b) & TO[4] then TRAP
> ```

does op_tdi etc not look like that?

> and the related:
> ```
> PI_FP   = (63 - 43)    # 1 if FP exception
> PI_PRIV = (63 - 45)    # 1 if privileged interrupt
> PI_TRAP = (63 - 46)    # 1 if exception is "trap" type
> PI_ADR  = (63 - 47)    # 0 if SRR0 = address of instruction causing         
> exception
> ```

ok you miiight simply be able to import thise and use them.

although strictly speaking they ahoukd be moved to a common location.

> Do I just set the appropriate bit in SRR1? And I do this after setting SRR1
> to the old MSR? Or I set MSR to (?) with the appropriate bit for the TRAP
> exception set?

i don't know! like i said i would simply try it and check SRR1 (etc) in test_sim.py alongside testing of PC.

then if the test fails you have the beginnings of a comparative answer.

tests failing *is* ok.

just go with it.

write the unit test.

run it.

stop flapping about, worrying about what you don't know: write the darn test and run it already! :)

i know what is going on.  you are false correlating "i don't know what i am doing" with "i should not take action UNTIL.i know what i am doing".

this will get you precisely nowhere.

by setting up an iterative loop (based around the unit test) you can make progress *despite* not knowing.

write the darn test and run it! :)
Comment 100 Cole Poirier 2020-06-22 01:11:04 BST
(In reply to Luke Kenneth Casson Leighton from comment #99)

> i don't know! like i said i would simply try it and check SRR1 (etc) in
> test_sim.py alongside testing of PC.
> 
> then if the test fails you have the beginnings of a comparative answer.
> 
> tests failing *is* ok.
> 
> just go with it.
> 
> write the unit test.
> 
> run it.
> 
> stop flapping about, worrying about what you don't know: write the darn test
> and run it already! :)
> 
> i know what is going on.  you are false correlating "i don't know what i am
> doing" with "i should not take action UNTIL.i know what i am doing".
> 
> this will get you precisely nowhere.
> 
> by setting up an iterative loop (based around the unit test) you can make
> progress *despite* not knowing.
> 
> write the darn test and run it! :)

Ok, makes sense to me. Thanks for the necessary kick-in-the-pants to just DO it. Tomorrow actually, brain starting to melt from email overload.
Comment 101 Cole Poirier 2020-07-01 02:04:21 BST
Unfortunately I'm running into trouble trying to get test_sim.py to work. Running into the following error, which when I googled it led me to the index.libre-soc.org archive of bug #186 'Create Power9 Decoder'. What I discerned from that is that this was a bug that Michael fixed a few months ago, so I'm not sure what I'm missing/doing wrong.

```
ERROR: test_loop (__main__.DecoderTestCase)
in godbolt.org:
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test_sim.py", line 200, in test_loop
    self.run_tst_program(program, [9], initial_mem={})
  File "test_sim.py", line 251, in run_tst_program
    with run_program(prog, initial_mem) as q:
  File "/home/colepoirier/src/soc/src/soc/simulator/qemu.py", line 110, in run_program
    q.gdb_eval('$cr=0')
  File "/home/colepoirier/src/soc/src/soc/simulator/qemu.py", line 89, in gdb_eval
    return self.gdb.write(f'-data-evaluate-expression {expr}')
  File "/usr/local/lib/python3.7/dist-packages/pygdbmi-0.9.0.3-py3.7.egg/pygdbmi/gdbcontroller.py", line 247, in write
    timeout_sec=timeout_sec, raise_error_on_timeout=raise_error_on_timeout
  File "/usr/local/lib/python3.7/dist-packages/pygdbmi-0.9.0.3-py3.7.egg/pygdbmi/gdbcontroller.py", line 285, in get_gdb_response
    "Did not get response from gdb after %s seconds" % timeout_sec
pygdbmi.gdbcontroller.GdbTimeoutError: Did not get response from gdb after 1 seconds
```
Comment 102 Luke Kenneth Casson Leighton 2020-07-01 06:15:11 BST
(In reply to Cole Poirier from comment #101)
> Unfortunately I'm running into trouble trying to get test_sim.py to work.
> Running into the following error, which when I googled it led me to the
> index.libre-soc.org archive of bug #186 'Create Power9 Decoder'. What I
> discerned from that is that this was a bug that Michael fixed a few months
> ago, so I'm not sure what I'm missing/doing wrong.

absolutely no idea.  you'll need to update everything and re-run
pywriter.py first.
Comment 103 Luke Kenneth Casson Leighton 2020-07-01 07:30:57 BST
also need to know exactly which version of qemu and which version of gdb.
Comment 104 Luke Kenneth Casson Leighton 2020-07-22 14:48:47 BST
https://git.libre-soc.org/?p=soc.git;a=blob;f=src/soc/fu/trap/main_stage.py;h=119d3e918be408472cfe0b428fe3f3ea201c2c71;hb=5e73a60c7beb52480dcf73469f597aa28b845227#l267

 266                 # if the opcode's LEV sub-field is equal to 1.
 267                 trap_to_hv = Signal(reset_less=True)
 268                 lev = Signal(6, reset_less=True)
 269                 comb += lev.eq(op[31-26:32-20])
 270                 comb += trap_to_hv.eq(lev == Const(1, 6))
 
samuel: two things.

1. we are following microwatt, which has not implemented hypervisor.  therefore if doing LEV at all this should be an illegal instruction (which is tested and raised in PwerDecode2, not here) however this needs checking as there is something odd about SC LEV !=0

2. direct field bit accessing of op.insn is a "no-no".  it is impossible to understand and read.  using fields.FormSC.LEV is the "correct" convention.  only code from microwatt that has not been decoded yet has "direct" bit access (bit 20 of insn for example)

3. the bit fields are in *PowerISA* order and need reversing: this is why we use DecodeFields, why FieldSelectableInt exists.

4. the only reason the trap formal proof assertion is correct against this newly-added code is because *both are the wrong bit order*

i will update code-comments to reference here.
Comment 105 Luke Kenneth Casson Leighton 2020-07-22 14:50:40 BST
(In reply to Luke Kenneth Casson Leighton from comment #104)

> samuel: two things.

four. 

noooobody expects the spanish inquisitionnn.  the keyyy element is surprise.  surprise and...   the twoo key elements are...
Comment 106 Luke Kenneth Casson Leighton 2020-07-22 15:05:43 BST
v3.0B p952

Programming Note

If LEV **IN POWERISA FIELD NOTATION ORDER** == 1 the hypervisor is invoked.
This is the only
way that executing an instruction can cause hyper-
visor state to be entered.

Because this instruction is not privileged, it is possi-
ble for application software to invoke the hypervi-
sor. However, such invocation should be
considered a programming error.

except, V3.0B book I chapter 1 section 1.3.3 has nothing to say about
Reserved fields except by referring us to section 1.8, which *still*
does not have anything to say.. or it does... by circularly referring
in section 1.9.2...

ok, yes: LEV > 1, by being "reservded", should trigger "illegal instruction"
however because we are not supporting hypervisor i believe that LEV=1
should *also* raise illegal instruction (where it can be sorted out
from there).

i will check with openpower-hdl-cores.
see:
http://lists.mailinglist.openpowerfoundation.org/pipermail/openpower-hdl-cores/2020-July/000084.html
Comment 107 Luke Kenneth Casson Leighton 2020-07-22 15:47:19 BST
due to PowerISA numbering, this *might be largernumber:smallernumber
and i do not know if that works as intended.

                comb += expected_msr[MSR.TEs:MSR.TEe+1].eq(0)


>>> l = [1,2,3,4]
>>> l[2:0]
[]

it doesn't.  it returns an empty list.

MSR.TEs i believe is *greater* than TSR.TEe.

    TEs = (63 - 53)    # Trace Enable (subfield)
    TEe = (63 - 54)    # Trace Enable (subfield)

>>> 63-53
10
>>> 63-54
9

it is.  so that means that the above is:

            expected_msr[10:9+1]

which is going to be an empty list.

also in proof:

commit 64087f35f6b1e10429615836ec9077b55e8e85fd (HEAD -> master, origin/master)
Author: Luke Kenneth Casson Leighton <lkcl@lkcl.net>
Date:   Wed Jul 22 15:48:18 2020 +0100

    field number ordering wrong way round?
Comment 108 Samuel A. Falvo II 2020-07-22 16:56:16 BST
(In reply to Luke Kenneth Casson Leighton from comment #104)
> 1. we are following microwatt, which has not implemented hypervisor. 

After reading its readme, it also doesn't yet implement supervisor state either.  Can you confirm whether we are aiming to implement supervisor state?

> 3. the bit fields are in *PowerISA* order and need reversing: this is why we
> use DecodeFields, why FieldSelectableInt exists.

In all the IBM specs I've read in the past, bit 0 (or, alternatively, the smallest numbered bit) is considered to be the most significant bit for any field.  This is a convention that dates back to before the IBM System/360.

In other words, if there is a 5-bit field, its bits are *labelled* according to the top row, but have arithmetic *meaning* according to the bottom row:

    0 1 2 3 4

    M . . . L
 
    4 3 2 1 0

where M=2^4, and L=2^0, if those bits are set.

What you're saying contradicts my understanding, and what I've read before.  I tried looking in the OpenPower ISA books for an explanation (it seems like anyone wanting to write an assembler would have an absolute need to know this data), but I found nothing!

This seems like a gaping hole in the spec, and I'd argue it needs top priority for resolution by the OpenPower group.
Comment 109 Samuel A. Falvo II 2020-07-22 17:00:42 BST
(In reply to Luke Kenneth Casson Leighton from comment #107)
> due to PowerISA numbering, this *might be largernumber:smallernumber
> and i do not know if that works as intended.
> 
>                 comb += expected_msr[MSR.TEs:MSR.TEe+1].eq(0)
> 
> 
> >>> l = [1,2,3,4]
> >>> l[2:0]
> []
> 
> it doesn't.  it returns an empty list.
> 
> MSR.TEs i believe is *greater* than TSR.TEe.
> 
>     TEs = (63 - 53)    # Trace Enable (subfield)
>     TEe = (63 - 54)    # Trace Enable (subfield)
> 
> >>> 63-53
> 10
> >>> 63-54
> 9
> 
> it is.  so that means that the above is:
> 
>             expected_msr[10:9+1]
> 
> which is going to be an empty list.
> 
> also in proof:
> 
> commit 64087f35f6b1e10429615836ec9077b55e8e85fd (HEAD -> master,
> origin/master)
> Author: Luke Kenneth Casson Leighton <lkcl@lkcl.net>
> Date:   Wed Jul 22 15:48:18 2020 +0100
> 
>     field number ordering wrong way round?

Then we're going to need a better abstraction for specifying arbitrary register fields than just x[63-y].  This is perennial point of confusion, and I guarantee anyone coming into the project will run into the same issues.
Comment 110 Samuel A. Falvo II 2020-07-22 17:08:23 BST
(In reply to Samuel A. Falvo II from comment #108)
> This seems like a gaping hole in the spec, and I'd argue it needs top
> priority for resolution by the OpenPower group.

To add two more things to this train of thought:

1. If bit 0 of a field is the least significant bit, just written to the left of the most significant bit, then why do we bother with the 63-x computation at all?  It shouldn't matter how the bits are printed on paper.

2. This implies that some 32-bit registers (those which are documented using bits ranging from 32 to 63) actually occupy the upper half of a GPR when moved, not the lower-half.
Comment 111 Luke Kenneth Casson Leighton 2020-07-22 17:14:23 BST
(In reply to Samuel A. Falvo II from comment #108)
> (In reply to Luke Kenneth Casson Leighton from comment #104)
> > 1. we are following microwatt, which has not implemented hypervisor. 
> 
> After reading its readme, it also doesn't yet implement supervisor state
> either.  Can you confirm whether we are aiming to implement supervisor state?

we're tracking microwatt explicitly.  when microwatt implements supervisor,
we read the code and update Libre-SOC accordingly.

> > 3. the bit fields are in *PowerISA* order and need reversing: this is why we
> > use DecodeFields, why FieldSelectableInt exists.
> 
> In all the IBM specs I've read in the past, bit 0 (or, alternatively, the
> smallest numbered bit) is considered to be the most significant bit for any
> field.  This is a convention that dates back to before the IBM System/360.
> 
> In other words, if there is a 5-bit field, its bits are *labelled* according
> to the top row, but have arithmetic *meaning* according to the bottom row:
> 
>     0 1 2 3 4
> 
>     M . . . L
>  
>     4 3 2 1 0
> 
> where M=2^4, and L=2^0, if those bits are set.

yep.  this *directly* contradicts the expectation of everyone else's
conventions, *and* it is inverted from python behaviour.

this is why we set up the Decode Fields - and use them.  self.fields.FormNNNN
*specifically* sets up (named) fields, *specifically* performs the bit-order
inversion needed so that it is *not* necessary for programmers to do that,
laboriously, each and every time.

use of "op.insn[....:....]" we're therefore *not* doing.  the code that
you wrote actually checks the *MSB* (bit 5) of LEV, in both the trap
main_stage and the spec, because Const(1, 6) is *NOT* an IBM/PowerISA
bit-order constant.

so that needs fixing: by following the convention i set, and not removing
code in places where that convention is used.

 
> What you're saying contradicts my understanding, and what I've read before. 

it took Michael and i several days to get right.


> I tried looking in the OpenPower ISA books for an explanation (it seems like
> anyone wanting to write an assembler would have an absolute need to know
> this data), but I found nothing!

i know.  it's there - i have said a couple of times, now: section 1.3.4
"Notation", page 4:

- Bits in instructions, fields, and bit strings are
  numbered from left to right, starting with bit 0

- The leftmost bit of a sequence of bits is the
  most significant bit of the sequence.
 
> This seems like a gaping hole in the spec, and I'd argue it needs top
> priority for resolution by the OpenPower group.

it's not missing: it's just not emphasised, it's stated plainly, once and
only once, on p4  then the remaining 1,500 pages assume that you've read
those two sentences and know what they mean.
Comment 112 Luke Kenneth Casson Leighton 2020-07-22 17:22:10 BST
(In reply to Samuel A. Falvo II from comment #110)
> (In reply to Samuel A. Falvo II from comment #108)
> > This seems like a gaping hole in the spec, and I'd argue it needs top
> > priority for resolution by the OpenPower group.
> 
> To add two more things to this train of thought:
> 
> 1. If bit 0 of a field is the least significant bit, 

it's not - it's the MSB.  read the spec again.  p4.  section 1.3.2

- The leftmost bit of a sequence of bits is the
  most significant bit of the sequence.


> just written to the
> left of the most significant bit, then why do we bother with the 63-x
> computation at all?  It shouldn't matter how the bits are printed on paper.

see above.

 
> 2. This implies that some 32-bit registers (those which are documented using
> bits ranging from 32 to 63) actually occupy the upper half of a GPR when
> moved, not the lower-half.

that's incorrect.  a register of length 64 bits, the lower half LSBs are
in bit positions numbered 32-63.

yes.  it's very very odd.  mulli pseudocode for example:

    XO-Form

        mulhw RT,RA,RB (Rc=0)
        mulhw. RT,RA,RB (Rc=1)

    Pseudo-code:

        prod[0:63] <- MULS((RA)[32:63], (RB)[32:63])
        RT[32:63] <- prod[0:31]

funfunfun - note there we're taking the *high* word from the
product... by taking the *lower*-indexed parts of prod!

oh and because it's 32-bit inputs, the LSBs are taken from
RA and RB using indexing RA[32:63].
Comment 113 Luke Kenneth Casson Leighton 2020-07-22 17:23:41 BST
(In reply to Luke Kenneth Casson Leighton from comment #112)

> oh and because it's 32-bit inputs, the LSBs are taken from
> RA and RB using indexing RA[32:63].

you are permitted as many opportunites as needed involving running
outside, screaming loudly, and coming back to the keyboard.  try
not to scare the neighbours in the next county.
Comment 114 Samuel A. Falvo II 2020-07-22 17:26:21 BST
(In reply to Luke Kenneth Casson Leighton from comment #111)
> this is why we set up the Decode Fields - and use them.  self.fields.FormNNNN

But, do we have anything similar for registers?  There's a metric ton of decoders for instruction formats, which I get; but I'm talking also about arbitrary fields of *registers*.

Per your own advice, I was to translate the pseudocode found in the specs into formal properties.  The pseudocode itself refers to raw bit fields of registers by number.  If that is an error, then maybe I should invest some time writing translation/mapping functions for IBM_regs to EveryoneElse_Regs, assuming these don't already exist somewhere.

But, looking throughout the code (not just trap, but everywhere), I routinely see bit-fields referenced using direct mathematical mapping (63-x).
Comment 115 Luke Kenneth Casson Leighton 2020-07-22 18:58:44 BST
(In reply to Samuel A. Falvo II from comment #114)
> (In reply to Luke Kenneth Casson Leighton from comment #111)
> > this is why we set up the Decode Fields - and use them.  self.fields.FormNNNN
> 
> But, do we have anything similar for registers?  There's a metric ton of
> decoders for instruction formats, which I get; but I'm talking also about
> arbitrary fields of *registers*.

right.  ok so here - for accessing registers (self.i.a, self.i.msr) - we have followed the convention of manually translating the hectares-per-potato conbention into "normal" industry-standard LSB is bit 0" convention.

in the nmimgen HDL.

in the ISACaller *simulator* we have **NOT** done that because we have actually made the pseudocode executable.
> 
> Per your own advice, I was to translate the pseudocode found in the specs
> into formal properties.  The pseudocode itself refers to raw bit fields of
> registers by number.

ah yes i see.  ok

>  If that is an error, then maybe I should invest some
> time writing translation/mapping functions for IBM_regs to
> EveryoneElse_Regs, assuming these don't already exist somewhere.

that has already been done.  i have mentioned it three times: the Field Decoders.

these are self.fields.FormSC for example and self.fields.FormSC contains for example an LEV member.

> 
> But, looking throughout the code (not just trap, but everywhere), I
> routinely see bit-fields referenced using direct mathematical mapping (63-x).

oh ok you are referring to those.

sigh, yes.

because the same constants (e.g MSR.HV) are used in *both* the unit tests for the ISACaller *and* in the HDL, and because the simulator does bit reversing but the HDL does not require it, we have to satisfy both.

i have considered auto creating soc.consts.MSRr which is bitreversed from soc.consts.MSR

one of those can be used in the simulator and in unit tests.

the other can be used in the HDL code.

honestly it is pretty irritating to have this stuff inverted from the industry stsndard convention for bitfields.  it took 3 months to even notice that we had the meaning of LE/BE self-consistently inverted throughout the entire codebase.
Comment 116 Luke Kenneth Casson Leighton 2020-07-22 19:04:50 BST
(In reply to Luke Kenneth Casson Leighton from comment #115)

> > Per your own advice, I was to translate the pseudocode found in the specs
> > into formal properties.  The pseudocode itself refers to raw bit fields of
> > registers by number.
> 
> ah yes i see.  ok

didn't finish.

i didn't clarify: i meant, only the accessors to MSR and SRR1.  not the LEV field, TO field etc.

sorry.

the instruction opcode fields are not part *of* the pseudocode, they are used *by* the pseudocode.

some fields such as sh, me, mb, from ShiftRotate, are "constructed" (IBM actually got these wrong in the spec!)

manual bitinversion combined with manual reordering would be so unclear and so confusing that i cannot countenance it.

at least by using the Decode Fields/Forms that reduces confusion there.

for registers such as MSR etc yes we could do with something better.
Comment 117 Luke Kenneth Casson Leighton 2020-07-22 20:09:16 BST
commit 8fef04dc4954c146277e1534c89d013d7a4174d7 (HEAD -> master, origin/master)
Author: Luke Kenneth Casson Leighton <lkcl@lkcl.net>
Date:   Wed Jul 22 20:08:29 2020 +0100

    sigh, auto-create some little/big-endian classes for accessing MSR/PI fields

ok!  been meaning to do this for a while.
Comment 118 Luke Kenneth Casson Leighton 2020-07-22 20:15:40 BST
https://git.libre-soc.org/?p=soc.git;a=commitdiff;h=6e9aab358be8cd5f53494c035f0bc67dd7826b5d

this is in ISACaller.TRAP()

-        self.msr[63-MSR.SF] = 1
-        self.msr[63-MSR.EE] = 0
-        self.msr[63-MSR.PR] = 0
-        self.msr[63-MSR.IR] = 0
-        self.msr[63-MSR.DR] = 0
-        self.msr[63-MSR.RI] = 0
-        self.msr[63-MSR.LE] = 1
+        self.msr[MSRb.SF] = 1
+        self.msr[MSRb.EE] = 0
+        self.msr[MSRb.PR] = 0
+        self.msr[MSRb.IR] = 0
+        self.msr[MSRb.DR] = 0
+        self.msr[MSRb.RI] = 0
+        self.msr[MSRb.LE] = 1

i liked the idea of not doing the (silly) 63-NNN stuff, so changed things
to use explicit class/enums that do the 63- for us.

also i need to update ISACaller.TRAP() to reflect the bitfield changes
in trap main_stage.py (and the proof):

                ...
                ...
                comb += expected_msr[MSR.VSX].eq(0)
                comb += expected_msr[MSR.PR].eq(0)
                comb += expected_msr[MSR.FP].eq(0)
                comb += expected_msr[MSR.PMM].eq(0)

i'll do that now.
Comment 119 Luke Kenneth Casson Leighton 2020-07-22 20:37:43 BST
On Wed, Jul 22, 2020 at 8:13 PM Samuel Falvo II <sam.falvo@gmail.com> wrote:
>
> I'm not suggesting you need to divert what you're working on.  I'm
> just going to take a break for a bit and meditate on what you've
> written.  I need to find (or develop for myself) tools (code,
> convention, or both) to prevent the massive proliferation of silly
> mistakes that I'm injecting.

sam: really, don't worry about it.  it's part of learning. you've identified
at least two bugs, one spec discrepancy, and more.  this is valuable!

plus, as you've probably noticed, i miss things (takes a couple of
repetitions and i eventually "get it") - for example you had to emphasise
twice about the register bitfields.  i got it in the end.

apparently, it's well-known in the POWER community that everyone dealing
with PowerISA goes through this "adaptation" phase to LE / BE.

now - and this is actually part of the Charter - what i am *not* going
to do is take away the opportunity for you to fix the errors that you
introduced: that's why i put code-comments (that were more lines to
write than if i'd "fixed" the things myself).
Comment 120 Luke Kenneth Casson Leighton 2020-07-24 09:49:37 BST
-                # TODO: check ordering (which is smaller, which is larger)
-                # MSR.TSs or MSR.TSe+1?

+                comb += field(expected_msr, MSRb.TEs, MSRb.TEe).eq(0)


this is still wrong.  the left side of the slice is still greater tham the right side of the slice because MSR.TEs is less than MSR.TEb which, after subtraction from 63 become the other way round.

therefore not only does the code do nothing, the Assertion returns an rempty list on both sides of the "==" which evaluates to TRUE which will always silently succeed.
Comment 121 Luke Kenneth Casson Leighton 2020-07-24 09:53:24 BST
-                # Power ISA V3.0B, Book 2, Section 3.3.1
+                # We are not supporting hypervisor or transactional semantics,
+                # so we skip enforcing those fields' properties.

code good, however the removing of the line referencing the spec, reduces the educational and documentational
aspect of the project and makes maintenance and review harder.

finding that section took time, so is important to preserve the comment.
Comment 122 Luke Kenneth Casson Leighton 2020-07-24 10:00:58 BST
                 # check EE (48) IR (58), DR (59): PR (49) will over-ride
-                comb += [
-                    Assert(msr_o[48] == (srr1_i[48] | srr1_i[48])), # EE
-                    Assert(msr_o[58] == (srr1_i[58] | srr1_i[58])), # IR
-                    Assert(msr_o[59] == (srr1_i[59] | srr1_i[59])), # DR
-                ]
+                for bit in [48, 58, 59]:
+                    comb += Assert(
+                        field(msr_o, bit) ==
+                        (field(srr1_i, bit) | field(srr1_i, 49))
+                    )
 

i appreciate where you're coming from, here, creating a loop.  however because it is only 3 items it saves zero lines of code, and actually looks less elegant.

in addition, the purpose here is to *literally* allow a reader to line-by-line, side-by-side, no "thought" required, compare the proof against the pseudocode.

the loop actively goes against being able to do that, doesn't it?

that extra thought step of having to "interpret" the loop *could* be a missed review of a mistake, and is extra mental effort.

so, on balance, revert this one.
Comment 123 Luke Kenneth Casson Leighton 2020-07-24 10:05:25 BST
@@ -242,13 +256,6 @@ class TrapMainStage(PipeModBase):
                 # check problem state
                 msr_check_pr(m, msr_o.data)
 
-                # hypervisor stuff.  here: bits 3 (HV) and 51 (ME) were
-                # copied over by msr_copy but if HV was not set we need
-                # the *original* (msr_i) bits
-                with m.If(~msr_i[MSR.HV]):
-                    comb += msr_o.data[MSR.HV].eq(msr_i[MSR.HV])
-                    comb += msr_o.data[MSR.ME].eq(msr_i[MSR.ME])
-
                 # don't understand but it's in the spec.  again: bits 32-34
                 # are copied from srr1_i and need *restoring* to msr_i
                 bits = slice(63-31,63-29+1) # bits 29, 30, 31 (Power notation)


mm removing that does not feel right.  i believe this is what microwatt is doing and removing what microwatt does is inadviseable.

have to check 1 sec
Comment 124 Luke Kenneth Casson Leighton 2020-07-24 10:10:04 BST
(In reply to Luke Kenneth Casson Leighton from comment #123)
> @@ -242,13 +256,6 @@ class TrapMainStage(PipeModBase):
>                  # check problem state
>                  msr_check_pr(m, msr_o.data)
>  
> -                # hypervisor stuff.  here: bits 3 (HV) and 51 (ME) were
> -                # copied over by msr_copy but if HV was not set we need
> -                # the *original* (msr_i) bits
> -                with m.If(~msr_i[MSR.HV]):
> -                    comb += msr_o.data[MSR.HV].eq(msr_i[MSR.HV])
> -                    comb += msr_o.data[MSR.ME].eq(msr_i[MSR.ME])
> -
>                  # don't understand but it's in the spec.  again: bits 32-34
>                  # are copied from srr1_i and need *restoring* to msr_i
>                  bits = slice(63-31,63-29+1) # bits 29, 30, 31 (Power
> notation)
> 
> 
> mm removing that does not feel right.  i believe this is what microwatt is
> doing and removing what microwatt does is inadviseable.
> 
> have to check 1 sec

no i know: i copied the behaviour of the spec pseudocode here for OP_MTMSR/D.

so this needs to be restored (and the proof to follow exactly the mtmsr/d pseudocode as well)
Comment 125 Luke Kenneth Casson Leighton 2020-07-24 10:12:14 BST
@@ -239,24 +211,6 @@ class Driver(Elaboratable):
                 comb += [
                     Assert(msr_o.ok),
                     Assert(nia_o.ok),
-                    ]
-
-                # Note: going through the spec pseudo-code, line-by-line,
-                # in order, with these assertions.  idea is: compare
-                # *directly* against the pseudo-code.  therefore, leave
-                # numbering in (from pseudo-code) and add *comments* about
-                # which field it is (3 == HV etc.)
-
-                # spec: MSR[51] <- (MSR[3] & SRR1[51]) | ((¬MSR[3] & MSR[51]))
-                with m.If(field(msr_i, 3)): # HV
-                    comb += Assert(field(msr_o, 51) == field(srr1_i, 51) # ME
-                with m.Else():
-                    comb += Assert(field(msr_o, 51) == field(msr_i, 51)) # ME
-
-                comb += [
-                    # spec: MSR[3] <- (MSR[3] & SRR1[3])
-                    Assert(field(msr_o, 3) == (field(srr1_i, 3) &
-                                               field(msr_i, 3))), # HV
                 ]
 
                 # if (MSR[29:31] != 0b010) | (SRR1[29:31] != 0b000) then


revert that in the proof.
Comment 126 Luke Kenneth Casson Leighton 2020-07-24 10:25:26 BST
i will commit this shortly.  do you note two things:

1) comments are added which "explain why".  removing comments is detrimental
   to communication and code maintenance.

2) is it now clear what i said: that the python slice numbering has to
   be LHS < RHS?  in each and every case (including TEs and TEe but
   now where field_slice is used 30, 31 you've *introduced* a bug)
   the BE indexing "LHS small number, RHS large number" is ***INVERTED***
   by the subtraction from 63.

with that commit (which i will do in a minute) you should now find bugs
and python exceptions come up when running the proof and the trap unit test
test_pipe_caller.py, which i will leave for you to fix.


diff --git a/src/soc/consts.py b/src/soc/consts.py
index fe29743e..8ed63eab 100644
--- a/src/soc/consts.py
+++ b/src/soc/consts.py
@@ -13,12 +13,15 @@ def field_slice(start, end):
     where the start and end bits use IBM conventions.  start < end.
     The range specified is inclusive on both ends.
     """
+    start = 63 - start
+    end = 63 - end
+    # XXX must do the endian-reversing BEFORE doing the comparison
+    # if done after, that instead asserts that (after modification)
+    # start *MUST* be greater than end!
     if start >= end:
         raise ValueError(
             "start ({}) must be less than end ({})".format(start, end)
         )
-    start = 63 - start
-    end = 63 - end
     return slice(end, start + 1)
Comment 127 Luke Kenneth Casson Leighton 2020-07-24 10:33:05 BST
this comment has been removed:

@@ -239,24 +211,6 @@ class Driver(Elaboratable):
                 comb += [
                     Assert(msr_o.ok),
                     Assert(nia_o.ok),
-                    ]
-
-                # Note: going through the spec pseudo-code, line-by-line,
-                # in order, with these assertions.  idea is: compare
-                # *directly* against the pseudo-code.  therefore, leave
-                # numbering in (from pseudo-code) and add *comments* about
-                # which field it is (3 == HV etc.)

its purpose is to explain why the strategy of how the code has been written
was chosen.

removing it leaves people confused as to why the code is written the way
that it is.

plus, a reviewer will think, "why is this code written in this fashion??
it's so laborious, plenty of opportunities to quotes optimise quotes",
not understanding that those optimisations *interfere* with clarity.
Comment 128 Luke Kenneth Casson Leighton 2020-07-24 10:40:53 BST
ok committed.

commit 0c0d37306316246c12b79e0982e97689531f97e9 (HEAD -> master)
Author: Luke Kenneth Casson Leighton <lkcl@lkcl.net>
Date:   Fri Jul 24 10:40:32 2020 +0100

    code review comments for trap and proof
Comment 129 Samuel A. Falvo II 2020-07-24 14:48:21 BST
(In reply to Luke Kenneth Casson Leighton from comment #126)
> i will commit this shortly.  do you note two things:
> ...
> +    start = 63 - start
> +    end = 63 - end
> +    # XXX must do the endian-reversing BEFORE doing the comparison
> +    # if done after, that instead asserts that (after modification)
> +    # start *MUST* be greater than end!
>      if start >= end:
>          raise ValueError(
>              "start ({}) must be less than end ({})".format(start, end)
>          )
> -    start = 63 - start
> -    end = 63 - end
>      return slice(end, start + 1)

This is a bug.

Consider: if you perform the mapping prior to the if-statement, if you invoke field_slice(x, 0, 31), then by the time the if-statement executes, start=63 and end=32.  start >= end will always be true for every valid use of the field() call.

Putting the error-check prior to the mapping does two things:
1. It enforces correct usage of the function, and,
2. Gives correct diagnostics when the exception is raised, in terms of the correct abstraction (IBM's "MSB 0" notation, as I learned last night that's what it's called, per Wikipedia).

This will need to be switched back.
Comment 130 Samuel A. Falvo II 2020-07-24 14:52:02 BST
(In reply to Samuel A. Falvo II from comment #129)
> Putting the error-check prior to the mapping does two things:
> 1. It enforces correct usage of the function, and,
> 2. Gives correct diagnostics when the exception is raised, in terms of the
> correct abstraction (IBM's "MSB 0" notation, as I learned last night that's
> what it's called, per Wikipedia).
> 
> This will need to be switched back.

Alternatively, swap the sense of the if-statement, and somehow preserve the input parameters for the exception report, but I think that just makes the code harder to read.
Comment 131 Samuel A. Falvo II 2020-07-24 14:54:00 BST
(In reply to Luke Kenneth Casson Leighton from comment #127)
> this comment has been removed:
> 
> @@ -239,24 +211,6 @@ class Driver(Elaboratable):
>                  comb += [
>                      Assert(msr_o.ok),
>                      Assert(nia_o.ok),
> -                    ]
> -
> -                # Note: going through the spec pseudo-code, line-by-line,
> -                # in order, with these assertions.  idea is: compare
> -                # *directly* against the pseudo-code.  therefore, leave
> -                # numbering in (from pseudo-code) and add *comments* about
> -                # which field it is (3 == HV etc.)
> 
> its purpose is to explain why the strategy of how the code has been written
> was chosen.
> 
> removing it leaves people confused as to why the code is written the way
> that it is.
> 
> plus, a reviewer will think, "why is this code written in this fashion??
> it's so laborious, plenty of opportunities to quotes optimise quotes",
> not understanding that those optimisations *interfere* with clarity.

Sorry; it's hard for me to know which comments are for me-only versus general consumption.  I'll put it back in.
Comment 132 Samuel A. Falvo II 2020-07-24 14:58:31 BST
(In reply to Luke Kenneth Casson Leighton from comment #126)
> 1) comments are added which "explain why".  removing comments is detrimental
>    to communication and code maintenance.

My aim was to remove only those comments I felt were addressed solely to me.  I wasn't clear which comments qualified.

> 2) is it now clear what i said: that the python slice numbering has to
>    be LHS < RHS?  in each and every case (including TEs and TEe but

That has always been clear, and has always been the intention.  However, bugs remain despite my best efforts.

>    now where field_slice is used 30, 31 you've *introduced* a bug)

Actually, removed one.  ;)  Walk through the code again.  :)  That's why I wrote the function: to automate the error-checking, so that I wouldn't run into this issue again.
Comment 133 Samuel A. Falvo II 2020-07-24 15:04:57 BST
(In reply to Samuel A. Falvo II from comment #132)
> Actually, removed one.  ;)  Walk through the code again.  :)  That's why I
> wrote the function: to automate the error-checking, so that I wouldn't run
> into this issue again.

Just to illustrate what I mean, with your changes, this is what I get when I run the trap proof now.  Note that the bit numbers are wrong relative to what's in the source code, and they're demonstrably out of order in the exception report, even though they're *in* order in the source code.


======================================================================
ERROR: test_ilang (proof_main_stage.TrapMainStageTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/kc5tja/git/libre-soc/soc/src/soc/fu/trap/formal/proof_main_stage.py", line 272, in test_ilang
    vl = rtlil.convert(Driver(), ports=[])
  File "/home/kc5tja/git/libre-soc/nmigen/nmigen/back/rtlil.py", line 1056, in convert
    fragment = ir.Fragment.get(elaboratable, platform).prepare(**kwargs)
  File "/home/kc5tja/git/libre-soc/nmigen/nmigen/hdl/ir.py", line 39, in get
    obj = obj.elaborate(platform)
  File "/home/kc5tja/git/libre-soc/soc/src/soc/fu/trap/formal/proof_main_stage.py", line 126, in elaborate
    comb += field(expected_msr, MSRb.TEs, MSRb.TEe).eq(0)
  File "/home/kc5tja/git/libre-soc/soc/src/soc/consts.py", line 36, in field
    return r[field_slice(start, end)]
  File "/home/kc5tja/git/libre-soc/soc/src/soc/consts.py", line 22, in field_slice
    raise ValueError(
ValueError: start (10) must be less than end (9)

----------------------------------------------------------------------
Comment 134 Luke Kenneth Casson Leighton 2020-07-24 15:39:48 BST
(In reply to Samuel A. Falvo II from comment #129)
> (In reply to Luke Kenneth Casson Leighton from comment #126)
> > i will commit this shortly.  do you note two things:
> > ...
> > +    start = 63 - start
> > +    end = 63 - end
> > +    # XXX must do the endian-reversing BEFORE doing the comparison
> > +    # if done after, that instead asserts that (after modification)
> > +    # start *MUST* be greater than end!
> >      if start >= end:
> >          raise ValueError(
> >              "start ({}) must be less than end ({})".format(start, end)
> >          )
> > -    start = 63 - start
> > -    end = 63 - end
> >      return slice(end, start + 1)
> 
> This is a bug.

wait.... end, start+1

ah damn, you inverted end and start, without commenting it.

that's why i didn't notice, because the expectation is that if it's called "start" it will be first.

can you use different variable names?

msb0_end = 63 - start
msb0_start = 63 - end

followed by return slice(msb0_start, msb0_end + 1)

actually the other way round.  the *parameters* need to be called msb0_start and msb0_end

end = 63 - msb0_start
start = 63 - msb0_end
Comment 135 Luke Kenneth Casson Leighton 2020-07-24 15:43:24 BST
(In reply to Samuel A. Falvo II from comment #132)
> (In reply to Luke Kenneth Casson Leighton from comment #126)
> > 1) comments are added which "explain why".  removing comments is detrimental
> >    to communication and code maintenance.
> 
> My aim was to remove only those comments I felt were addressed solely to me.
> I wasn't clear which comments qualified.

yehyeh, my fault. i've added XXX at start and end of bits that can be removed, now.

> >    now where field_slice is used 30, 31 you've *introduced* a bug)
> 
> Actually, removed one.  ;)  Walk through the code again.  :)  That's why I
> wrote the function: to automate the error-checking, so that I wouldn't run
> into this issue again.

good idea.  even as a tiny function it is still critically important to properly comment and document, because of the massive confusion and interaction with python slices.
Comment 136 Luke Kenneth Casson Leighton 2020-07-24 16:04:22 BST
(In reply to Luke Kenneth Casson Leighton from comment #134)

> end = 63 - msb0_start
> start = 63 - msb0_end

commit 02d18dca18c0736de135158436eed3b7d64bbed7 (HEAD -> master, origin/master)
Author: Luke Kenneth Casson Leighton <lkcl@lkcl.net>
Date:   Fri Jul 24 16:03:31 2020 +0100

    returned field_slice to original, and added comments

done
Comment 137 Luke Kenneth Casson Leighton 2020-07-24 18:28:16 BST
                 # check EE (48) IR (58), DR (59): PR (49) will over-ride
-                comb += [
-                    Assert(msr_o[48] == (srr1_i[48] | srr1_i[48])), # EE
-                    Assert(msr_o[58] == (srr1_i[58] | srr1_i[58])), # IR
-                    Assert(msr_o[59] == (srr1_i[59] | srr1_i[59])), # DR
-                ]

nuts i wrote a nice explanation then lost the browser connection.  can you restore these lines because they look like the pseudocode and were carefully crafted tgat way.

expanding the loop is even less clear.
Comment 138 Samuel A. Falvo II 2020-07-24 23:23:56 BST
(In reply to Luke Kenneth Casson Leighton from comment #137)
>                  # check EE (48) IR (58), DR (59): PR (49) will over-ride
> -                comb += [
> -                    Assert(msr_o[48] == (srr1_i[48] | srr1_i[48])), # EE
> -                    Assert(msr_o[58] == (srr1_i[58] | srr1_i[58])), # IR
> -                    Assert(msr_o[59] == (srr1_i[59] | srr1_i[59])), # DR
> -                ]
> 
> nuts i wrote a nice explanation then lost the browser connection.  can you
> restore these lines because they look like the pseudocode and were carefully
> crafted tgat way.
> 
> expanding the loop is even less clear.

Wouldn't msr_o[48] refer to the LSB-0 bit 48 though?  That seems like it'd pick the wrong bit to check.
Comment 139 Luke Kenneth Casson Leighton 2020-07-25 00:19:49 BST
(In reply to Samuel A. Falvo II from comment #138)

> Wouldn't msr_o[48] refer to the LSB-0 bit 48 though?  That seems like it'd
> pick the wrong bit to check.

right, yes, running it through fields() first.

which might go over the line limit.

arse.

see what you can do.
Comment 140 Luke Kenneth Casson Leighton 2020-07-25 00:23:00 BST
(In reply to Samuel A. Falvo II from comment #138)
> (In reply to Luke Kenneth Casson Leighton from comment #137)
> >                  # check EE (48) IR (58), DR (59): PR (49) will over-ride
> > -                comb += [
> > -                    Assert(msr_o[48] == (srr1_i[48] | srr1_i[48])), # EE
> > -                    Assert(msr_o[58] == (srr1_i[58] | srr1_i[58])), # IR
> > -                    Assert(msr_o[59] == (srr1_i[59] | srr1_i[59])), # DR
> > -                ]

it should have been srr[49] i.e. the PR field at the end, anyway.

see what happens if you do PR = field(srr_i, 49) then use that shortened variable name.
Comment 141 Samuel A. Falvo II 2020-08-04 03:54:51 BST
Is there anything more which needs to be done for this issue?  I think I've completed it.  Is it safe to mark this accordingly?
Comment 142 Luke Kenneth Casson Leighton 2020-08-04 10:07:39 BST
yes and make sure to track it on your wiki homepage.  i will work out percentage contributions and budgets.  cole same for you.
Comment 143 Cole Poirier 2020-08-04 19:15:20 BST
(In reply to Luke Kenneth Casson Leighton from comment #142)
> yes and make sure to track it on your wiki homepage.  i will work out
> percentage contributions and budgets.  cole same for you.

I don't think I really contributed to this one.
Comment 144 Luke Kenneth Casson Leighton 2020-08-04 19:20:32 BST
(In reply to Cole Poirier from comment #143)

> I don't think I really contributed to this one.

discussion equals "valuable contribution".
Comment 145 Cole Poirier 2020-08-04 19:28:11 BST
(In reply to Cole Poirier from comment #143)
> (In reply to Luke Kenneth Casson Leighton from comment #142)
> > yes and make sure to track it on your wiki homepage.  i will work out
> > percentage contributions and budgets.  cole same for you.
> 
> I don't think I really contributed to this one.

In that case, great.
Comment 146 Cole Poirier 2020-09-14 20:38:06 BST
Luke, (when your RSI has abated) can you update the "The table of payments (in EUR) for this task; TOML format:" field so I know what amount to submit for this bug in my pending rfp?