Bug 269 - auto-conversion / parser of POWER ISA Spec v3.0B
Summary: auto-conversion / parser of POWER ISA Spec v3.0B
Status: PAYMENTPENDING FIXED
Alias: None
Product: Libre-SOC's first SoC
Classification: Unclassified
Component: Source Code (show other bugs)
Version: unspecified
Hardware: PC Linux
: --- enhancement
Assignee: Luke Kenneth Casson Leighton
URL:
Depends on: 280 282
Blocks: 241
  Show dependency treegraph
 
Reported: 2020-03-28 11:12 GMT by Luke Kenneth Casson Leighton
Modified: 2020-12-02 15:04 GMT (History)
5 users (show)

See Also:
NLnet milestone: NLNet.2019.10.Standards
total budget (EUR) for completion of task and all subtasks: 1000
budget (EUR) for this task, excluding subtasks' budget: 1000
parent task for budget allocation: 241
child tasks for budget allocation:
The table of payments (in EUR) for this task; TOML format:
"lkcl"=500 "mnolan"=500


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Luke Kenneth Casson Leighton 2020-03-28 11:12:43 GMT
see http://lists.libre-riscv.org/pipermail/libre-riscv-dev/2020-March/005470.html
and http://lists.mailinglist.openpowerfoundation.org/pipermail/openpower-hdl-cores/2020-March/000007.html

the idea is, rather than write the simulator (and hardware) by hand
(which will be tedious, slow, and full of mistakes), parse the pseudo-code
*in the spec*, and turn it into *actual code*.
Comment 1 Luke Kenneth Casson Leighton 2020-03-28 17:35:22 GMT
screw it, just decided to go with a PDFtoTXT conversion and muddle through.
started adding these, could people please look at them and review, make
ABSOLUTELY sure that they're the same as what's in the v3.0B PDF?

list table here:
https://libre-riscv.org/openpower/isa/
Comment 2 Luke Kenneth Casson Leighton 2020-03-30 16:16:31 BST
the basics of the parser now work.

this:

    n  <- 0
    do while n < 64
       if (RS)[63-n] = 0b1 then
            leave
       n  <- n + 1
    RA <- EXTZ64(n)

gets turned into this:

    n = 0
    while n < 64:
        if RS[63 - n] == 1:
            break
        n = n + 1
    RA = EXTZ64(n)

which is executable python code.

one little piece of syntax not currently supported is the regfile read.
(RS) means "look up in the GPR using the bits from the field named RS of
the opcode", not "change the bits in the opcode"

unfortunately, assignment to RT means "change the regfile".

it's a little more non-obvious than at first glance.
Comment 3 Luke Kenneth Casson Leighton 2020-03-31 14:21:44 BST
okaay, i have this:

# Count Trailing Zeros Word

X-Form

* cnttzw RA,RS (Rc=0)
* cnttzw.  RA,RS (Rc=1)

    n <- 0

    do while n < 32
       if (RS)[63-n=] = 0b1 then
            leave
       n  <- n + 1

    RA <- EXTZ64(n)

which gets turned into:

    n = 0
    while n < 64:
        if GPR(RS)[63 - n] == 1:
            break
        n = n + 1
    RA = EXTZ64(n)

that may need to be "self.GPR(RS)" or just
"self.regfile.read_reg(RS")

hmmm nuts, it needs to be more like this, doesn't it?

            r1_sel = yield pdecode2.e.read_reg1.data
            operand1 = self.regfile.read_reg(r1_sel)

because we are pushing things through decode2. that's code from the
internal_ops simulator.

actually... y'know what?  i'm inclined to suggest skipping decode2
for this, because the "Form" (X-Form, etc.) is *already* encoded in
the spec.
Comment 4 Luke Kenneth Casson Leighton 2020-04-01 14:42:39 BST
aaaaa it's a horrible mess however it's functional.

i need to do a parser for the actual pages in openpower/isa, next.
btw i don't know why this didn't occur to me earlier, i've added
the wiki as a submodule into soc.

this allows us to treat the isa files as "local" (in the repo).
no need to try to grab them by wget.
Comment 5 Luke Kenneth Casson Leighton 2020-04-01 14:48:24 BST
the way it works is:

* lexers (PowerLexer) and filters sort out python-style indentation
* parser (PowerParser) turns pseudo-code into an AST
* as part of that parsing, registers that were read and written to
  are noted
* all reg index numbers (fields - RA, RB) that were "read" from GPR
  are "yield" obtained from the opcode.
* a variable named after the field is dropped into the dictionary
  of code to be executed
* code is executed
* all reg index numbers that were "written" in the code are "yield"
  obtained (again), and, from the dictionary of exec()'d code,
  the value is obtained, then stored in the "regfile".
Comment 6 Luke Kenneth Casson Leighton 2020-04-02 16:06:13 BST
ok subdirectory-reader completed, it parses the markdown files and
creates a dictionary of instructions:
https://git.libre-riscv.org/?p=soc.git;a=blob;f=src/soc/decoder/pseudo/pagereader.py;hb=HEAD

i therefore just completed a read on all pages at
http://libre-riscv.org/openpower/isa and made formatting-corrections

we *should* now be able to piece together code-generation followed by
actually plugging these into a simulator.

next task will be to auto-generate code-fragments.
Comment 7 Luke Kenneth Casson Leighton 2020-04-03 19:37:23 BST
michael i've started doing code-fragments, they're created with
decoder/pseudo/pywriter.py

you'll need to run "make gitupdate" in the top level of the soc
directory, that will pull in the git submodule for the wiki

can you take a look, see what the code "actually" would need to
look like?  i'd like to keep it really clean and simple.
Comment 8 Michael Nolan 2020-04-03 19:45:10 BST
(In reply to Luke Kenneth Casson Leighton from comment #7)

> can you take a look, see what the code "actually" would need to
> look like?  i'd like to keep it really clean and simple.

What do you mean? I'm looking at the code in fixedarith.py and it looks pretty good...
Comment 9 Luke Kenneth Casson Leighton 2020-04-03 20:11:26 BST
(In reply to Michael Nolan from comment #8)
> (In reply to Luke Kenneth Casson Leighton from comment #7)
> 
> > can you take a look, see what the code "actually" would need to
> > look like?  i'd like to keep it really clean and simple.
> 
> What do you mean? I'm looking at the code in fixedarith.py and it looks
> pretty good...

i know, right! :)

ok so take a look at... mmm... where is it...  power_pseudo.py
and do this:

diff --git a/src/soc/decoder/power_pseudo.py b/src/soc/decoder/power_pseudo.py
index 86306b0..d05ca25 100644
--- a/src/soc/decoder/power_pseudo.py
+++ b/src/soc/decoder/power_pseudo.py
@@ -96,10 +96,10 @@ D <- d0||d1||d2
 """
 
 #code = testreg
-#code = cnttzd
+code = cnttzd
 #code = cmpi
 #code = cmpeqb
-code = addpcis
+#code = addpcis
 #code = bpermd
 
 def tolist(num):


then that should modify register 17 to be an (incorrect *sigh*) value
except, arse, right now it doesn't work :)

oh i know why.  forgot to add __bool__.  sorted.  working now.

ok so, if you look at the bits that go round the "exec (code, d)",
you see that the registers are "pre-prepared" (pre-read) into
variables that are dropped into the dictionary (d).

this creates *local* variables (in d) named RA, RT, RB, RS, which can
end up being modified in the code-fragment.

following code-execution, the *modified* variables have to be "extracted"
for storage into the register file.  if they're in fact registers.

when *not* using exec() but instead writing out the code-fragments to
an actual file, we need some pre- and post- functions which are called
before and after each function, that do the exact same thing.

several ways to do that:

1) pass in a dictionary of variables that are
to be added to locals():

https://www.pythondoeswhat.com/2017/03/when-you-can-update-locals.html

then *return* a copy of locals() from the function.


2) make everything class-based (yuk) self.RA, self.RT, blah blah.

this one i feel looks messy.


3) use exec() treating the code as a string

   however using exec() means that if you get a syntax error it's
   impossible to work out where the hell it was.  it's also really slow.


4) do everything as modules (!) yes really.  modules can be executed on-demand

   https://docs.python.org/2/library/runpy.html

   it then becomes possible to drop the global variables into the module
   and also "extract" them afterwards.  will need investigation.

   also might need an __main__submodule.  maybe.   have to test it.


5) other
Comment 10 Michael Nolan 2020-04-03 20:45:05 BST
(In reply to Luke Kenneth Casson Leighton from comment #9)
> oh i know why.  forgot to add __bool__.  sorted.  working now.

Yeah the values look correct to me now.


> when *not* using exec() but instead writing out the code-fragments to
> an actual file, we need some pre- and post- functions which are called
> before and after each function, that do the exact same thing.
> 
> several ways to do that:
> 
> 1) pass in a dictionary of variables that are
> to be added to locals():

This seems a little odd to me, but I'd be ok with it.

> 2) make everything class-based (yuk) self.RA, self.RT, blah blah.
>
> this one i feel looks messy.

I don't have a problem with this one, as it seems easier to understand to me than locals() or dropping variables into a module
 
> 5) other

It'd make the generated code a little messier, but couldn't we do what power_pseudo.py is currently doing and pass in a dict as a second argument? This seems similar to option 2 in both pros and cons, in that the generated code isn't as nice looking but it's a bit easier to understand
Comment 11 Luke Kenneth Casson Leighton 2020-04-03 21:34:11 BST
(In reply to Michael Nolan from comment #10)
> (In reply to Luke Kenneth Casson Leighton from comment #9)
> > oh i know why.  forgot to add __bool__.  sorted.  working now.
> 
> Yeah the values look correct to me now.
> 
> 
> > when *not* using exec() but instead writing out the code-fragments to
> > an actual file, we need some pre- and post- functions which are called
> > before and after each function, that do the exact same thing.
> > 
> > several ways to do that:
> > 
> > 1) pass in a dictionary of variables that are
> > to be added to locals():
> 
> This seems a little odd to me, but I'd be ok with it.
> 
> > 2) make everything class-based (yuk) self.RA, self.RT, blah blah.
> >
> > this one i feel looks messy.
> 
> I don't have a problem with this one, as it seems easier to understand to me
> than locals() or dropping variables into a module
>  
> > 5) other
> 
> It'd make the generated code a little messier, but couldn't we do what
> power_pseudo.py is currently doing and pass in a dict as a second argument?

then use it as d['RA'] = xxx etc.

i know from experience when working for Pine Digital (we did surpriiise
a python compiler) that dictionary accesses are a *lot* slower than attribute
accesses.

> This seems similar to option 2 in both pros and cons, in that the generated
> code isn't as nice looking but it's a bit easier to understand


*dinggg* lightbulb.  decorators.

https://realpython.com/primer-on-python-decorators/#decorating-functions-with-arguments

i always tend to get confused with decorators so will write a quick test.
Comment 12 Luke Kenneth Casson Leighton 2020-04-03 21:55:21 BST
https://stackoverflow.com/questions/17862185/how-to-inject-variable-into-scope-with-a-decorator

nope.  epic fail.

suddenly realised, it's real simple:

def cmpi(RA, RB): # pass in regs as parameters
    code
    blah
    return (RS,) # return modified params as results.

duh.
Comment 13 Luke Kenneth Casson Leighton 2020-04-03 22:41:10 BST
(In reply to Luke Kenneth Casson Leighton from comment #12)

> def cmpi(RA, RB): # pass in regs as parameters
>     code
>     blah
>     return (RS,) # return modified params as results.
> 
> duh.

ok can you take a look?  (run pywriter.py again)

it does mean that we have to establish a "caller" system
which will also understand the number of parameters (called and returned).

in preparation for that, i got the auto-generated code to derive
from something called "ISACaller" - would you like to write that
class?

it should look up the instruction in the dictionary (ISACaller.instrs)
which you can see has been constructed (and appended at the bottom
of the derived class)

it contains the list of "variables" that need to be initialised
to zero (technically, uninitialised), the list of variables that
need to be "read" from the regfile (with a "yield") and, once
the function returns, the list of variables to be written to the
regfile.

to call a given function, you'll need those like this:

spec = self.instrs['cmpi']

actual_function, read_regs, uninit_regs, write_regs = spec

then merge read+unint (actually, move create_args() into soc.decoder.isa.something and import it)

so: function_arg_names = create_args(read_regs+uninit_regs)

create function_args based on that list

and then call it:

result = actual_function(*function_args)

match the result with write_regs using zip:

for (fname, result) in zip(result, write_regs):
     store result in GPR[fname]

can i leave that with you?
Comment 14 Michael Nolan 2020-04-03 23:55:10 BST
(In reply to Luke Kenneth Casson Leighton from comment #13)

> can i leave that with you?

yeah
Comment 15 Luke Kenneth Casson Leighton 2020-04-04 04:04:39 BST
(In reply to Michael Nolan from comment #14)
> (In reply to Luke Kenneth Casson Leighton from comment #13)
> 
> > can i leave that with you?
> 
> yeah

star.  technically once that base class is in place  it should be possible to go straight to dropping into internal_op.py

if you want to test out the new base class feel free to take one function from one autogenerated file, as a test/dev case and hack something together that works with interncal_op.

you will find however that much of decode2 is ignored (not useful).  the opcode *name* (not InternalOp), and you will need the Form name.

cry_in flags, add1, neg, etc. all ignored because the pseudocode on a case by case already does that. however carry_out on the other hand we need something for that.


i am still doing stuff (have to add % operator) i will keep ploughing through to get all of them compiled.

please do us a favour and apart from test cases don't add the compiled .py files to the repo :) yes they are .py however they are not the source code: the pseudocode is!
Comment 16 Luke Kenneth Casson Leighton 2020-04-04 17:45:56 BST
i'm trying _not_ to do ISACaller because i really need to move on
to the hardware-parser version

however, i'm laying the groundwork:

* as you saw in the bug #272 there's a problem with EXTS, which alerted
  me to adding the syntax (RA|0) - this meant adding a function
  "getz" to the prototype class GPR, and also inventing a new syntax
  "_REG" where "_REG" holds the actual bits of the opcode (RA, RB),
  and "REG" holds the *value from the regfile*.

* accessing GPR as GPR(5) is a function: i've converted it to GPR[5]
  then that alerted me to the fact that GPR and MEM do not exist in
  the opcode-function namespace.

  sooo i dragged out that namespace-injection-by-decorator trick,
  and have added a helper function.  this *may* be incorrect
  (as in, self.namespace has to be inside the decorator, not as
  an argument *to* the decorator).

i am going to keep chewing through the files.  moving onto fixedstore.mdwn
Comment 17 Michael Nolan 2020-04-04 20:27:13 BST
(In reply to Luke Kenneth Casson Leighton from comment #15) 
> star.  technically once that base class is in place  it should be possible
> to go straight to dropping into internal_op.py

I don't see an internal_op.py anywhere, is this instead for internalop_sim.py?
Comment 18 Luke Kenneth Casson Leighton 2020-04-04 20:38:42 BST
urrrr okaaay...

i just added parser support for this:

   MEM(EA, 4) <- GPR(r)[32:63]

which requires two nested depths of brackets:

   GPR (r) level 1
   GPR (r) [32:63] level 2

in the process, you can see i added a class Mem and also added
to the class GPR

https://git.libre-riscv.org/?p=soc.git;a=commitdiff;h=2ecec80b841136e052b6e7d39b76abaca77bb612

i don't know if you're familiar with __call__, michael: it allows
you to (literally) call an object as if it was a function.

so with gsc.gpr being an object of type GPR, rather than mess with
the parser any further i did two things:

1. assign a GPR instance into the dict d for the exec(), by key 'GPR'

2. add an __call__ to class GPR.

therefore, when the code sees 'GPR' - GPR(r) for example - it calls
GPR.__call__(5)

the return result from _that_ is then passed through a second level of
processing (a slice), and that gets assigned into Mem

an instance of Mem i *also* assigned into the dict d for the exec(),
by key 'MEM'.

this - again - through the Mem.__call__(addr, bytesize) trick - allows
*getting* memory.

however for setting i didn't want to overload dict because the key
is not just a single index (address), it's actually an address-bytesize
tuple:

   MEM(EA, 2) <- RA

which, hmm, whilst that *could* be turned into this:

   MEM[(EA,2)] <- RA

which would, if __setitem__ was overloaded, could allow for recognition
of the tuple, decoding it back into (addr, bytesize).

yuk, is my honest opinion :)

so instead, i "recognise" the pattern "MEM(addr, sz) <- x" back in the
parser "assignment" phase, and *REPLACE* it with this:

     memassign(addr, sz, x)

which you'll see in... hmmm i'm moving class Mem into decoder/isa/caller.py
ok?
Comment 19 Luke Kenneth Casson Leighton 2020-04-04 20:39:01 BST
(In reply to Michael Nolan from comment #17)

> I don't see an internal_op.py anywhere, is this instead for
> internalop_sim.py?

yes, sorry.
Comment 20 Michael Nolan 2020-04-04 21:28:36 BST
(In reply to Luke Kenneth Casson Leighton from comment #16)

> * as you saw in the bug #272 there's a problem with EXTS, which alerted
>   me to adding the syntax (RA|0)

If we're sure that power_decoder2 is correct, then all of the EXTS(...) and EXTZ(...) cases can be replaced with pdecode2.e.imm_data, because the decoder already handles sign/zero extending immediates.
Comment 21 Luke Kenneth Casson Leighton 2020-04-04 21:37:05 BST
(In reply to Michael Nolan from comment #20)
> (In reply to Luke Kenneth Casson Leighton from comment #16)
> 
> > * as you saw in the bug #272 there's a problem with EXTS, which alerted
> >   me to adding the syntax (RA|0)
> 
> If we're sure that power_decoder2 is correct, then all of the EXTS(...) and
> EXTZ(...) cases can be replaced with pdecode2.e.imm_data, because the
> decoder already handles sign/zero extending immediates.

mmm... yyeeees... except... that means modifying the pseudo-code which
comes directly from the spec.

this has two implications:

1) it's not a (strict) verification of the spec
2) as you point out, we'd be using something that we're not sure is correct

the more "direct", the better.

i'm not happy at the deviations from the spec that have had to be made!
the rather terrible solution is to have EXTZ and EXTS set the SelectableInt
bitlength to (say) 1,000,000 then use that to detect that when you get this:

   x + EXTS(d)

it means that in the addition, the rhs takes the bit-length of x.

okok we can't do 1,000,000 bits because in EXTS that would be a 1,000,000 bit
signed integer.  maybe... 256 bits or something.

i will modify SelectableInt so it is:

    def __add__(self, rhs):
          if rhs.bits == 256:
               rhs = SelectableInt(rhs.value, self.bits)

that will "do" as a workaround.
Comment 22 Luke Kenneth Casson Leighton 2020-04-04 21:39:27 BST
nice job btw

commit 237d4f455ae825941b0542819c02fdc87585808a
Author: Michael Nolan <mtnolan2640@gmail.com>
Date:   Sat Apr 4 16:23:01 2020 -0400

    Sorta working add instruction


oh.  i got into a horrible mess on the submodule, actually lost a
commit.  

if you get a rebase error just due to the submodule libreriscv (the wiki)
do let me know if you manage to fix it!

https://stackoverflow.com/questions/826715/how-do-i-manage-conflicts-with-git-submodules
Comment 23 Luke Kenneth Casson Leighton 2020-04-04 21:46:28 BST
fixedload done
fixedstore done
fixedlogical done
Comment 24 Luke Kenneth Casson Leighton 2020-04-04 22:36:56 BST
ah, michael, before you get too far into writing unit tests, you'll
find that the opcode reg spec information has been decoded from
the mdwn files, in pagereader.py opfields "regs".

feel free, back in pywriter.py, to push that out as an extra parameter
into the instrs[opcodename] tuple, then actually *use* it in the
unit tests (!)

oh also i think the form (X-Form) may be necessary to drop into that
tuple as well.  maybe.  probably.
Comment 25 Luke Kenneth Casson Leighton 2020-04-04 23:25:15 BST
oh, and one other thing... :)

the names of registers that are "recognised" are waay back in
decoder/pseudo/parser.py currently hard-coded

these need to be "lifted" from the current Form (Form-X)

however they will also need to be augmented by (p30):

* LR
* CTR
* CR (which can also be specified as CR0-CR7)
* XER
* FPSC

and TAR (p32)

oh, and instruction addresses:

* NIA (next)
* CIA (current)

and possibly some of the SPRs.
Comment 26 Jacob Lifshay 2020-04-05 05:28:47 BST
(In reply to Luke Kenneth Casson Leighton from comment #21)
> i'm not happy at the deviations from the spec that have had to be made!
> the rather terrible solution is to have EXTZ and EXTS set the SelectableInt
> bitlength to (say) 1,000,000 then use that to detect that when you get this:
> 
>    x + EXTS(d)
> 
> it means that in the addition, the rhs takes the bit-length of x.
> 
> okok we can't do 1,000,000 bits because in EXTS that would be a 1,000,000 bit
> signed integer.  maybe... 256 bits or something.
> 
> i will modify SelectableInt so it is:
> 
>     def __add__(self, rhs):
>           if rhs.bits == 256:
>                rhs = SelectableInt(rhs.value, self.bits)
> 
> that will "do" as a workaround.

See bug 278, comment 1 for a better workaround that doesn't require giant bit counts.
Comment 27 Michael Nolan 2020-04-05 19:23:19 BST
Hmm. I figured I'd be able to create a class with multiple types of instruction (so I can use loads and arithmetic instructions) like so:

class ISA(fixedarith, fixedload, fixedstore, ...):

But it seems that the different superclasses interfere with each other when they initialize their instrs dict by doing "instrs = {}"

I tried removing that from each generated class, and adding it to the ISACaller class, but now the generated classes can't find the dict from the superclass. It looks like I can get it to sort of work by rearranging the generated class like so: 

class fixedarith(ISACaller):
    @inject()
    def op_add(self, RA, RB):
        RT = RA + RB
        return (RT,)
    def __init__(self, dec, regs):
        super().__init__(dec, regs)
        
        self.instrs['add'] = (self.op_add, OrderedSet(['RA', 'RB']),
                OrderedSet(), OrderedSet(['RT']))

Is there a better way to do this?
Comment 28 Luke Kenneth Casson Leighton 2020-04-05 19:55:09 BST
(In reply to Michael Nolan from comment #27)
> Hmm. I figured I'd be able to create a class with multiple types of
> instruction (so I can use loads and arithmetic instructions) like so:
> 
> class ISA(fixedarith, fixedload, fixedstore, ...):
> 
> But it seems that the different superclasses interfere with each other when
> they initialize their instrs dict by doing "instrs = {}"

funny, i just wrote about this:
http://bugs.libre-riscv.org/show_bug.cgi?id=272#c19

> Is there a better way to do this?

i don't know about "better", one possibility is to use uniquely-named
dictionaries

commit 9276f73c76a342af9555bb4dd255f348c297f3ab (HEAD -> master, origin/master)
Author: Luke Kenneth Casson Leighton <lkcl@lkcl.net>
Date:   Sun Apr 5 19:54:30 2020 +0100

    drop instr info into uniquely-named dict
Comment 29 Luke Kenneth Casson Leighton 2020-04-05 20:05:15 BST
(In reply to Luke Kenneth Casson Leighton from comment #15)

> you will find however that much of decode2 is ignored (not useful).  the
> opcode *name* (not InternalOp), and you will need the Form name.

i added Form onto the end (and removed ISACaller as a base class)

class sprset:

    sprset_instrs = {}
    sprset_instrs['mtspr'] = (op_mtspr, OrderedSet(['RS']),
                OrderedSet(), OrderedSet(), 'XFX')
Comment 30 Luke Kenneth Casson Leighton 2020-04-05 20:52:21 BST
commit 6f640248c2bebfd5fd522cae2727bdbb0b9bf9a3
Author: Michael Nolan <mtnolan2640@gmail.com>
Date:   Sun Apr 5 15:41:23 2020 -0400

    Autogenerate all.py


haha i was wondering when you might do that :)


i just added Form name parsing, spotting when they're used in the
pseudo-code, and dropping them into a dictionary that's passed
back along with the other regs.

as the fields are supposed to be read-only (you don't modify the
actual instruction fields) the logic is much simpler.

it *should* be possible to "yield" the fields, from the Form
(sd.sigform[op.form]) and drop the values into the dictionary
for "injection".
Comment 31 Luke Kenneth Casson Leighton 2020-04-05 21:03:58 BST
i put a comment in where that would need to go:

lkcl@fizzy:~/src/libreriscv/soc/src/soc$ git diff d4c6
diff --git a/src/soc/decoder/isa/caller.py b/src/soc/decoder/isa/caller.py
index 5b5e3c1..5b6e158 100644
--- a/src/soc/decoder/isa/caller.py
+++ b/src/soc/decoder/isa/caller.py
@@ -123,17 +123,22 @@ class ISACaller:
     def memassign(self, ea, sz, val):
         self.mem.memassign(ea, sz, val)
 
-    def prep_namespace(self):
+    def prep_namespace(self, formname, op_fields):
+        # TODO: get field names from form in decoder*1* (not decoder2)
+        # decoder2 is hand-created, and decoder1.sigform is auto-generated
+        # from spec
+        # then "yield" fields only from op_fields rather than hard-coded
+        # list, here.
         for name in ['SI', 'UI', 'D', 'BD']:
             signal = getattr(self.decoder, name)
             val = yield signal
             self.namespace[name] = SelectableInt(val, bits=signal.width)
 
     def call(self, name):
Comment 32 Michael Nolan 2020-04-06 00:27:25 BST
I'm trying to get addpcis working (so I can test NIA and CIA without messing with branches), and the instruction builds the displacement from "concat(d0, d1, d2)". We don't have separate d0, d1, and d2 fields, but we do have a D field and a d0_d1_d2 field. Should this sort of thing (I expect we'll run into this again with shifts) be corrected in the ISA spec, or by adding the relevant fields in in fields.text?
Comment 33 Luke Kenneth Casson Leighton 2020-04-06 03:28:21 BST
(In reply to Michael Nolan from comment #32)
> I'm trying to get addpcis working (so I can test NIA and CIA without messing
> with branches), and the instruction builds the displacement from "concat(d0,
> d1, d2)". We don't have separate d0, d1, and d2 fields, but we do have a D
> field and a d0_d1_d2 field. Should this sort of thing (I expect we'll run
> into this again with shifts) be corrected in the ISA spec, or by adding the
> relevant fields in in fields.text?

they're already there, what i did however was to take d0,d1,d2 and cat them with underscores into a compound named field.

   d0,d1,d2 (16:25,11:15,31)
        Immediate fields that are concatenated to specify a
        16-bit signed two's complement integer which is
        sign-extended to 64 bits.
        Formats: DX
 
what should actually happen is that 3 fields should be created.

might need to keep the compound name as well.

hmmm line 225

https://git.libre-riscv.org/?p=soc.git;a=blob;f=src/soc/decoder/power_fields.py;h=26b59d46863fcd313e6d952b8f5fab22129afef0;hb=1117d7c3cd52cc73a757f3c6f62c6602551d7ad3#l225

i apologise, i write these types of line-based parsers on a one-off basis quite regularly, without expectation of modifying them, afterwards.

i think that one would be easier to process back in decode_instructions.

line 24

https://git.libre-riscv.org/?p=soc.git;a=blob;f=src/soc/decoder/power_fields.py;h=26b59d46863fcd313e6d952b8f5fab22129afef0;hb=1117d7c3cd52cc73a757f3c6f62c6602551d7ad3#l24

here it is just collecting d0,d1,d2 *without* splitting by commas

but in between the brackets *is* split

err nowait, line 213 is where the brackets are silently discarded (spec[1:-1])

then split by comma to recognise 16, 23:24, 31

then split by colon to recognise slices from indices

so hmmm..

suggest taking f from line 210 and splitting it by comma.  if no commas skip this next bit.

then take s from line 213

use zip on those two lists.

then *recreate* the lines as if they were actually in fields.text with "%s: [%s]"


 d0 : [16]
 d1 ....
 d2 ....

append those together and *re-call* decodeinstrfields, and do res.update() on the return dict.

only do that if f from line 210 has a comma in it.

sorry it is 3:30 am if that is totally confusing i can tackle it tomorrow morning
Comment 34 Luke Kenneth Casson Leighton 2020-04-06 03:35:14 BST
sorry should be

fs = f.split(",")
ss = spec[1:-1].split(",")
individualfields = []
for f0, s0 in zip(fs, ss):
  txt = "%s (%s)" 
  indfls.append(txt)
if len(fs) > 1:
  res.update(decodeinstrflds(indflds))

something like that, around line 211
Comment 35 Luke Kenneth Casson Leighton 2020-04-06 03:39:55 BST
(In reply to Michael Nolan from comment #32)

> field and a d0_d1_d2 field. Should this sort of thing (I expect we'll run
> into this again with shifts) be corrected in the ISA spec,

fields.txt was taken directly from the PDF. so modifying it is problematic.  it should be a direct correspondance with the PDF.

when we get the 3.0 spec source code this will become a lot easier.

and also those files will either go or they will themselves be aurogenerated directly from that spec source file.
Comment 36 Jacob Lifshay 2020-04-06 04:07:27 BST
(In reply to Luke Kenneth Casson Leighton from comment #35)
> (In reply to Michael Nolan from comment #32)
> 
> > field and a d0_d1_d2 field. Should this sort of thing (I expect we'll run
> > into this again with shifts) be corrected in the ISA spec,
> 
> fields.txt was taken directly from the PDF. so modifying it is problematic. 
> it should be a direct correspondance with the PDF.

maintain a set of patches for fields.txt?
Comment 37 Luke Kenneth Casson Leighton 2020-04-06 11:29:20 BST
(In reply to Jacob Lifshay from comment #36)

> maintain a set of patches for fields.txt?

i really don't think that's a route that should be explored.
if the spec is "wrong" then we should be raising it with the
OpenPOWER Foundation as a bug.

deviating from the spec means that we have to explain to people
why that has been done.

it's unnecessary, as well.  the field names were designed to be
unique (*sigh* except they're not - SPR opcode field clashes with 
SPR the global dictionary of SPRs....)
Comment 38 Luke Kenneth Casson Leighton 2020-04-06 12:18:15 BST
done, michael.  d0_d1_d2 is "non-standard".  probably should be dropped,
on reflection.


commit 3a40cddd16f77259c9f15edaa5aecb12f1a1324d (HEAD -> master, origin/master)
Author: Luke Kenneth Casson Leighton <lkcl@lkcl.net>
Date:   Mon Apr 6 12:15:28 2020 +0100

    add individual field-detection where field spec is "d0,d1,d2"



Form DX
    field d0 BitRange([(0, 16), (1, 17), (2, 18), (3, 19), (4, 20), (5, 21), (6, 22), (7, 23), (8, 24), (9, 25)])
    field d1 BitRange([(0, 11), (1, 12), (2, 13), (3, 14), (4, 15)])
    field d2 BitRange([(0, 31)])
    field d0_d1_d2 BitRange([(0, 16), (1, 17), (2, 18), (3, 19), (4, 20), (5, 21), (6, 22), (7, 23), (8, 24), (9, 25), (10, 11), (11, 12), (12, 13), (13, 14), (14, 15), (15, 31)])
    field RT BitRange([(0, 6), (1, 7), (2, 8), (3, 9), (4, 10)])
    field XO BitRange([(0, 26), (1, 27), (2, 28), (3, 29), (4, 30)])
Comment 39 Luke Kenneth Casson Leighton 2020-04-07 01:16:26 BST
https://git.libre-riscv.org/?p=soc.git;a=commitdiff;h=5d496a6b50ec3b9e00586e777651c60a1921688d

michael that should just be for fieldname in op_fields otherwise a lot more fields get added than is necessary.

op_fields includes the names of everything spotted during parsing... that is *not* a GPR.

no need to exclude RS RA RT either because i already excluded them back in the parser.
Comment 40 Luke Kenneth Casson Leighton 2020-04-07 16:55:09 BST
(In reply to Luke Kenneth Casson Leighton from comment #39)
> https://git.libre-riscv.org/?p=soc.git;a=commitdiff;
> h=5d496a6b50ec3b9e00586e777651c60a1921688d
> 
> michael that should just be for fieldname in op_fields otherwise a lot more
> fields get added than is necessary.
> 
> op_fields includes the names of everything spotted during parsing... that is
> *not* a GPR.
> 
> no need to exclude RS RA RT either because i already excluded them back in
> the parser.

commit 8e261f6ee07e2ccf271f8838daa372d3e2dcb2ed (HEAD -> master, origin/master)
Author: Luke Kenneth Casson Leighton <lkcl@lkcl.net>
Date:   Tue Apr 7 16:54:50 2020 +0100

    op_fields is passed over (and excludes register names)
Comment 41 Luke Kenneth Casson Leighton 2020-04-07 17:57:19 BST
michael i've just made a half-broken effort to add CR to the simulator.
to get it to work, i added CR (and other special regs) to the list of
parameters that will be passed in and also returned.  whether these
should be added to namespace is debatable (probably not), kept separate
instead.

i also created something called FieldSelectableInt which can "refer"
to a subset (of SelectableInt) just like in the Power Decoder.

the idea being to allow CR0 to be "accessed" (arithmetically) including
assignments, and have that actually go *directly* into CR.
Comment 42 Luke Kenneth Casson Leighton 2020-04-27 21:29:51 BST
declaring this one completed as the parser itself is done.
additions / enhancements to be added under a new bugreport.