1177 – create a way to do "do not store in regfile unless modified" in ISACaller

Bug 1177 - create a way to do "do not store in regfile unless modified" in ISACaller

Summary: create a way to do "do not store in regfile unless modified" in ISACaller

Status:	RESOLVED FIXED

Alias:	None

Product:	Libre-SOC's first SoC
Classification:	Unclassified
Component:	Source Code (show other bugs)
Version:	unspecified
Hardware:	PC Linux

Importance:	Highest enhancement
Assignee:	Jacob Lifshay

URL:

Depends on:	1205 1207
Blocks:	1125 1136 1035
	Show dependency tree / graph

Reported:	2023-10-01 11:28 BST by Luke Kenneth Casson Leighton
Modified:	2023-11-14 23:19 GMT (History)
CC List:	2 users (show)

See Also:	1087 1197
NLnet milestone:	---
total budget (EUR) for completion of task and all subtasks:	0
budget (EUR) for this task, excluding subtasks' budget:	0
parent task for budget allocation:
child tasks for budget allocation:
The table of payments (in EUR) for this task; TOML format:

Attachments
Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Luke Kenneth Casson Leighton 2023-10-01 11:28:50 BST

the change to the pseudocode language parser was not discussed and not authorized.

https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;h=870f7f232

it needs immediate reversion.

the way to handle this particular case is to *always* initialize
the variable (RT).

    if vex_flag = 0 then
        RT <- result
        FPSCR.FPRF <- undefined(0b00000)
        FPSCR.FR <- inc_flag
        FPSCR.FI <- xx_flag
    else
        RT <- undefined(0)
        FPSCR.FR <- 0
        FPSCR.FI <- 0

if there is *genuinely* an intention to not make a modification
to RT based on whether vex_flag is set or unset then that is a
different matter, but to create this type of unauthorized change
*without discussion* is just not acceptable.

it has taken months to find this, it was purely by chance.

the change should have been *raised and discussed*.

you cannot continue to blithely go ahead with making these kinds of changes
jacob on a repeated systematic long-term basis without there being consequences

Comment 1 Luke Kenneth Casson Leighton 2023-10-02 00:48:09 BST

ok i took a look at power isa 3.0 using vex_flags and yes it looks
like vex_flags is a "gate" to disable writing to RT.

it is therefore reasonable to "special-case" vex_flags just like
the use of "overflow" in ISACaller to stop RT from being written,
rather than poison the language by adding extraneous external
"guidance" in the markdown.

the language is supposed to be self-describing and self-contained.

the other way to do this is:

    GPR(RT) <- val

but that gets quite involved and is a bit more work, especially when
doing vectorisation and element-width overrides, so it is better to
add a hack to detect vex_flags as the "write-gate" just like overflow
is the write-gate to XER.OV/SO.

please please for goodness sake start describing your intentions when
thinking up these types of changes as there are implications of the
hacks that you come up with which are just not thought through, and
although they appear to "get the job done" in fact cause a lot of damage.

for future work please use branches only so that there is an opportunity
for a full review.

Comment 2 Luke Kenneth Casson Leighton 2023-10-02 00:52:07 BST

https://git.libre-soc.org/?p=openpower-isa.git;a=blob;f=src/openpower/decoder/isa/caller.py;h=ce5a80a;hb=HEAD#l2197

2197         # get output named "overflow" and "CR0"
2198         overflow = outs.get('overflow')
2199         cr0 = outs.get('CR0')
2200         cr1 = outs.get('CR1')
2201 
2202         # soc test_pipe_caller tests don't have oe
2203         has_oe = hasattr(self.dec2.e.do, "oe")
2204         # yeah just no. not in parallel processing
2205         if has_oe and not self.is_svp64_mode:
2206             # detect if overflow was in return result

this dreadful-hack in ISACaller which spots if there is an output
variable called "overflow" keeps the pseudocode clean, and i have
absolutely no problem with it.  but it *will* need documenting
that that is what is going on.

Comment 3 Jacob Lifshay 2023-10-03 05:10:12 BST

(In reply to Luke Kenneth Casson Leighton from comment #1)
> ok i took a look at power isa 3.0 using vex_flags and yes it looks
> like vex_flags is a "gate" to disable writing to RT.

not writing [F]RT is used for all FP traps so the trap handler can see the original input values rather than e.g. fadd f1, f1, f1 overwriting f1 with junk making it impossible to read the original value.

this isn't actually needed for cffpr since RT can't overwrite any inputs since they're in a different register file, but you asked that we retain that pseudocode anyway.
https://bugs.libre-soc.org/show_bug.cgi?id=1087#c30
> 
> it is therefore reasonable to "special-case" vex_flags just like
> the use of "overflow" in ISACaller to stop RT from being written,

imo a better approach is to have the simulator read all destination registers (marking the reads as ignored for the in-order pipeline simulator) before executing the pseudo-code, so *any* conditional overwrites will work without needing any vex_flags special casing (special casing should be avoided whenever possible). this also matches what a reasonable reader would expect the pseudocode to mean, where any unwritten outputs keep their original values.

so, like:
if (RA) = 0 then
    RT <- (RB)

would approximately translate to:
def op_theinsn(self, RA, RB, RT):
    if eq(RA, 0):
        RT = copy_it_icr_the_name(RB)
    return (RT,)
    
> rather than poison the language by adding extraneous external
> "guidance" in the markdown.

the way i saw it is that the pseudocode still described completely what's happening, the html comment is not part of the pseudocode and is merely there to inform our simulator that its overly-simplistic analysis of what registers need to be passed into the translated pseudocode need RT added too.

> 
> the language is supposed to be self-describing and self-contained.

imo it was even with the html comment.

> for future work please use branches only so that there is an opportunity
> for a full review.

imo our project is to the size where *everyone* should be working in separate branches and then merging to master only after both someone else okays the changes *and* CI passes on their branch.

Comment 4 Luke Kenneth Casson Leighton 2023-10-03 09:39:09 BST

(In reply to Jacob Lifshay from comment #3)
> imo a better approach is to have the simulator read all destination
> registers

nom nom nom.. thinking it over... this is a great idea / principle. i don't
know why it wasn't brought up before now :)

how to implement without causing chaos on some of the "overwrite" instructions
like ternlogi...

where, there, if done naively (always-write-back) the fact that the
instruction may choose any one of *3* registers to write-back, the
simulator could write *all* of them, corrupting results.

so... a solution to that: mirror what is in TestIssuer, with the
data/ok flag.  if you *really* want the register written, you set
the "ok" flag.

> (marking the reads as ignored for the in-order pipeline simulator)
> before executing the pseudo-code, so *any* conditional overwrites will work
> without needing any vex_flags special casing (special casing should be
> avoided whenever possible).

yes i never liked it (overflow-hack) but it worked pending a better solution.
this is a better solution.

> this also matches what a reasonable reader would
> expect the pseudocode to mean, where any unwritten outputs keep their
> original values.
> 
> so, like:
> if (RA) = 0 then
>     RT <- (RB)
> 
> would approximately translate to:
> def op_theinsn(self, RA, RB, RT):
>     if eq(RA, 0):
>         RT = copy_it_icr_the_name(RB)
>     return (RT,)

better:

     if eq(RA, 0):
         RT = copy_it_icr_the_name(RB)
         RT.ok = True
     return (RT,)

which then back in ISACaller it is detected "if outputreg.ok then store"

> the way i saw it is that the pseudocode still described completely what's
> happening, the html comment is not part of the pseudocode and is merely
> there to inform our simulator that its overly-simplistic analysis of what
> registers need to be passed into the translated pseudocode need RT added too.

unfortunately it becomes part of the entire process, the storage of information,
about how to do what needs to be done in the compiler / simulator.

for example, now the c-based compiler *also* needs to know about that HTML
parsing!

which is... bad.

> imo our project is to the size where *everyone* should be working in
> separate branches and then merging to master only after both someone else
> okays the changes *and* CI passes on their branch.

well, it's more that *if* everyone had the same level of working-knowledge
that i do (because i have been doing or working closely with literally
everything, hundreds of thousands of lines of code, now) *then* it would
not be necessary.

as the only person with the full working knowledge, it is "safe" for me to not
need CI, and to work in the master branches, but most definitely definitely
*not* safe for anyone else.

that was a mistake made by me to assume that everyone else would be able to
keep up - they can't, we know that now.

actually... this needs to go on the list. and also bug #1126. let's transfer
there.

in the meantime, back on topic for this bug: do let me know what you think
about the "ok" flag idea.

Comment 5 Jacob Lifshay 2023-10-04 07:18:00 BST

(In reply to Luke Kenneth Casson Leighton from comment #4)
> (In reply to Jacob Lifshay from comment #3)
> > imo a better approach is to have the simulator read all destination
> > registers
> 
> nom nom nom.. thinking it over... this is a great idea / principle. i don't
> know why it wasn't brought up before now :)

cuz no one thought of it before? :)

> 
> how to implement without causing chaos on some of the "overwrite"
> instructions
> like ternlogi...
> 
> where, there, if done naively (always-write-back) the fact that the
> instruction may choose any one of *3* registers to write-back, the
> simulator could write *all* of them, corrupting results.

well, for ternlogi specifically, since it always writes RT and the simulator knows RT is an output (due to the csvs), it isn't actually a problem. just do:
read gprs into RT, RA, RB variables
run pseudocode
write RT variable to gpr.

> 
> so... a solution to that: mirror what is in TestIssuer, with the
> data/ok flag.  if you *really* want the register written, you set
> the "ok" flag.

I think that's a good idea, with some caveats:

> better:
> 
>      if eq(RA, 0):
>          RT = copy_it_icr_the_name(RB)
>          RT.ok = True

SelectableInt should not be where we stick the .ok field, either we should have:

RT be a class with .data and .ok fields and all pseudocode read/writes of RT access .data
or have some kind of separate output for the `ok`s such as a dict or have each output fill two slots, like so:
* option #1:
    # we need to return RT and RS:
    return (RT, RT_ok, RS, RS_ok)
* option #2:
    return ((RT, RT_ok), (RS, RS_ok))
* option #3:
    def op_theinsn(self, RT, RA, RB, oks):
        if eq(RA, 0):
            RT = copy_it_icr_the_name(RB)
            oks['RT'] = True
        return (RT,)


though, tbh we should use a different name than `ok`, how about `written` or `outs`?

Comment 6 Jacob Lifshay 2023-11-01 03:24:08 GMT

(In reply to Jacob Lifshay from comment #5)
> * option #2:
>     return ((RT, RT_ok), (RS, RS_ok))

I'm picking this option, luke, since you never replied to comment #5 and since it seems the simplest. I'm working on this now since you just recently pushed a commit which breaks a lot of the fp-int conversion tests

Comment 7 Jacob Lifshay 2023-11-01 07:20:13 GMT

I have a WIP version (doesn't yet work):

I'm implementing it as a post-parsing transform over the Python AST, since I tried doing it while parsing and that quickly became very confusing due to all the parse nodes' types changing, so I decided that was a bad idea, and the AST transformer works much better. I have it generate an ...__ok variable for every name mentioned in a statement, since that's easier than filtering out just assignments (and SetFX), all the extra variables just get ignored.

So far, it seems to generate correct .py files, I just haven't finished implementing the corresponding ISACaller changes.

https://git.libre-soc.org/?p=openpower-isa.git;a=shortlog;h=1e42b9e14a5bb31552905bca3f46427beeedb55d

commit 1e42b9e14a5bb31552905bca3f46427beeedb55d
Author: Jacob Lifshay <programmerjake@gmail.com>
Date:   Wed Nov 1 00:11:38 2023 -0700

    WIP adding ...__ok for all outputs

commit ef94f1b73ba3194b5f893229ff10d054b2b97393
Author: Jacob Lifshay <programmerjake@gmail.com>
Date:   Tue Oct 31 22:50:40 2023 -0700

    misc AST correctness fixes

commit e5838b62aa181544d8013f444fa78abf92de3ecd
Author: Jacob Lifshay <programmerjake@gmail.com>
Date:   Tue Oct 31 22:44:22 2023 -0700

    fix `fallthrough` handling
    
    previously it tried to call self.fallthrough() in mtspr's pseudo-code

Comment 8 Luke Kenneth Casson Leighton 2023-11-01 10:43:39 GMT

stopstopstop, i am not able to keep up.

it is very simple.

in GPR.set, set self.ok=True.

likewise in SPR.set.

if that does not work then do it in SelectableInt when value is
altered.

incredibly simple.  about... 5-10 lines of noninvasive code.

please try that first rather than making massive intrusive "clever"
2 stage compiler work.

Comment 9 Luke Kenneth Casson Leighton 2023-11-01 11:54:32 GMT

class fixedarith:
    @inject()
    def op_addi(self, RA):
        RT = GPR.getz(_RA) + self.EXTS(SI)
        return (RT,)

ok. it can be done by setting all "ok" flags in all inputs
to FALSE, followed by modifying SelectableInt to set
data.ok=True in its constructor and in all math operations.

https://git.libre-soc.org/?p=openpower-isa.git;a=blob;f=src/openpower/decoder/selectable_int.py;hb=HEAD

it can go into merge and eq.

FieldSelectableInt is not necessary as we never return FSI from
any operations. but worth keeping an eye on.

Comment 10 Jacob Lifshay 2023-11-01 21:51:50 GMT

(In reply to Luke Kenneth Casson Leighton from comment #9)
> ok. it can be done by setting all "ok" flags in all inputs
> to FALSE, followed by modifying SelectableInt to set
> data.ok=True in its constructor and in all math operations.

ok, yeah, that would work. I didn't initially pick that since I really think SelectableInt should just be a bit-string with nothing else...but it is the easiest place to put it.

I think it shouldn't be named `ok` tho, I think we need to pick a name that makes sense and when you first see it you don't need to look at the docs to figure out what it means (since people often don't). I think I'll use `touched` since what we really want to detect is if the variable hasn't been written or replaced. touched will be set by all operations that produce a SelectableInt, including copies.

> FieldSelectableInt is not necessary as we never return FSI from
> any operations. but worth keeping an eye on.

FieldSelectableInt is converted to SelectableInt by copy_assign_rhs anyway...we don't want:
a <- b[3:5]
a[1] <- 0  # modifies b too

Comment 11 Luke Kenneth Casson Leighton 2023-11-01 23:40:10 GMT

(In reply to Jacob Lifshay from comment #10)

> ok, yeah, that would work. I didn't initially pick that since I really think
> SelectableInt should just be a bit-string with nothing else...

abolutely not.

performance is low enough as it is.


> I think it shouldn't be named `ok` tho, I think we need to pick a name that
> makes sense and when you first see it you don't need to look at the docs to
> figure out what it means (since people often don't). I think I'll use
> `touched` since what we really want to detect is if the variable hasn't been
> written or replaced. touched will be set by all operations that produce a
> SelectableInt, including copies.
>0

no because it matches the HDL which already has precedet. and is short.
ensure that the comment links here and also says "this matches the HDL
it is a stupid name" or something more appropriate but explicitly clear.

comments are precisely and exactly to warn people about counterintuitive
things.


with some thought i realised it only needs setting to True in the
constructor.

then explicitly set ok=False on all "read" (input) parameters.

2174         # main input registers (RT, RA ...)
2175         inputs = []
2176         for name in input_names:
2177             regval = (yield from self.get_input(name, ew_src))
2178             log("regval name", name, regval)
       ->        regval.ok = False
2179             inputs.append(regval)



however when passing in RT (and other write/output parameters)
those must *also* have their ok flag set to False.

when the python-autogenerated-code replaces RT it will be replaced
with a SelectableInt that has an "ok=True"

Comment 12 Jacob Lifshay 2023-11-02 02:03:11 GMT

(In reply to Luke Kenneth Casson Leighton from comment #11)
> (In reply to Jacob Lifshay from comment #10)
> 
> > ok, yeah, that would work. I didn't initially pick that since I really think
> > SelectableInt should just be a bit-string with nothing else...
> 
> abolutely not.
> 
> performance is low enough as it is.

I don't see how that affects performance.

> no because it matches the HDL which already has precedet. and is short.
> ensure that the comment links here and also says "this matches the HDL
> it is a stupid name" or something more appropriate but explicitly clear.

I just added docs for all SelectableInt fields (except overflow, cuz I forgot):
    value: int
        the bits contained by `self`
    bits: int
        the number of bits contained by `self`.
    ok: bool
        a flag to detect if outputs have been written by pseudo-code

        instruction inputs have `ok` set to `False`, all changed or new
        SelectableInt instances set `ok` to `True`.

https://git.libre-soc.org/?p=openpower-isa.git;a=shortlog;h=afa6db74d7dca5ccb83e1186a0f49c85bf810921

rebased programmerjake/bug-1177 on master:

commit afa6db74d7dca5ccb83e1186a0f49c85bf810921
Author: Jacob Lifshay <programmerjake@gmail.com>
Date:   Wed Nov 1 18:51:23 2023 -0700

    WIP fixing bug #1177

commit ffc3710214a7c4828f994fddffcba6f4383603ac
Author: Jacob Lifshay <programmerjake@gmail.com>
Date:   Wed Nov 1 18:36:10 2023 -0700

    format code

commit 370b69a982df2dbfba3675e4a3a2f998555df1db
Author: Jacob Lifshay <programmerjake@gmail.com>
Date:   Tue Oct 31 22:50:40 2023 -0700

    misc AST correctness fixes

I dropped the fallthrough commit, since that needs a better fix: bug #1197

Comment 13 Luke Kenneth Casson Leighton 2023-11-02 06:54:36 GMT

(In reply to Jacob Lifshay from comment #12)

> I don't see how that affects performance.

integers in python have always been a special high-performance
path in the python byte-code interpreter.

> I just added docs for all SelectableInt fields

ahh that's a great idea.

Comment 14 Luke Kenneth Casson Leighton 2023-11-02 07:08:28 GMT

         # XXX TODO: allow CR-vector to be written out even if ffirst fails
         if not ffirst_hit or vli:
             for name, output in outs.items():
-                yield from self.check_write(info, name, output, ca_en, ew_dst)
+                out_ok = outs_ok[name]
+                yield from self.check_write(
+                    info, name, output, ca_en, ew_dst, out_ok)


ok i went to some trouble to ensure that the number of arguments
was below 80 chars to avoid this type of indentation. it is
"not clean or clear" (visual noise)

there will only ever be one call to check_write.
can you please kindly put that back the way it was, and use:

        if not outs_ok[name]:
           log("skipping writing output with .ok=False", name, output)
            continue
        yield from self.check_write(info, name, output, ca_en, ew_dst)

it achieves the same effect with less noise and less lines. reduce the words in
the log call to get it to stay on one line.
(this code is so complex a Finite State Machine that it *has* to remain
"uncluttered", both visually and functionally, with REALLY clear comments).

Comment 15 Jacob Lifshay 2023-11-02 07:27:00 GMT

(In reply to Luke Kenneth Casson Leighton from comment #13)
> (In reply to Jacob Lifshay from comment #12)
> 
> > I don't see how that affects performance.
> 
> integers in python have always been a special high-performance
> path in the python byte-code interpreter.

i didn't say don't use an integer -- you can make a bitstring out of an integer, like SelectableInt does (except SelectableInt also has some extraneous fields: ok and overflow).

though, if we had a C extension for bitstrings, it'd probably be faster than SelectableInt is (though maybe harder to understand).

actually, now that I think of it, using __slots__ in SelectableInt would probably make it faster than SelectableInt is now. another thing that'd make it faster is not logging in all the SelectableInt methods.

though, improving SelectableInt probably won't change the overall simulator's speed by much, since last time I benchmarked it, nmigen's simulator for the decoder was taking up like 50% of the time.

Comment 16 Luke Kenneth Casson Leighton 2023-11-02 07:48:18 GMT

(In reply to Jacob Lifshay from comment #15)

> actually, now that I think of it, using __slots__ in SelectableInt would
> probably make it faster than SelectableInt is now. another thing that'd make
> it faster is not logging in all the SelectableInt methods.

good ideas. log both of these as "Future"

> though, improving SelectableInt probably won't change the overall
> simulator's speed by much, since last time I benchmarked it, nmigen's
> simulator for the decoder was taking up like 50% of the time.

that's what was *supposed* to be done under cavatools but unfortunately
dmitry didn't listen and rushed ahead.

Comment 17 Jacob Lifshay 2023-11-02 07:58:57 GMT

(In reply to Luke Kenneth Casson Leighton from comment #16)
> (In reply to Jacob Lifshay from comment #15)
> 
> > actually, now that I think of it, using __slots__ in SelectableInt would
> > probably make it faster than SelectableInt is now. another thing that'd make
> > it faster is not logging in all the SelectableInt methods.
> 
> good ideas. log both of these as "Future"

actually, i'll just add __slots__ when fixing this bug, since it's one line of code and super easy:
class SelectableInt:
    __slots__ = 'value', 'bits', 'overflow', 'ok'
    ...  # rest of class

deleting the log() calls is also pretty easy, i just haven't done it since i thought you'd complain about removing debugging code.

Comment 18 Luke Kenneth Casson Leighton 2023-11-02 10:52:32 GMT

(In reply to Jacob Lifshay from comment #17)

> actually, i'll just add __slots__ when fixing this bug, since it's one line
> of code and super easy:
> class SelectableInt:
>     __slots__ = 'value', 'bits', 'overflow', 'ok'
>     ...  # rest of class

pffh awesome.

> 
> deleting the log() calls is also pretty easy, i just haven't done it since i
> thought you'd complain about removing debugging code.

no it is really important, as when examining the behaviour, *not* writing
to the regfile is incredibly important to know, just as much as when it does.

remember when debugging i read the *entire* log, it's that complex a FSM,
knowing what gets called in what order is absolutely essential.

Comment 19 Jacob Lifshay 2023-11-06 04:58:07 GMT

i rebased on master and made further progress fixing all the test failures introduced by merging the shriya_add_descriptions branch:

https://git.libre-soc.org/?p=openpower-isa.git;a=shortlog;h=refs/heads/programmerjake/bug-1177

Comment 20 Jacob Lifshay 2023-11-07 05:54:47 GMT

Ok, I finally fixed all the issues caused by merging shriya_add_descriptions.

we're back to 6 test failures (same as before). The test failures are listed at the end of this comment.

I also fixed sc and the previously-working tests for it, since SRR1 is specified to have bits 33:36 set to 0, 42:43 set to LEV (always 0 for now), and 44:47 set to 0. SRR1[TRAP] is *not* set.

calling TRAP isn't really what sc's pseudo-code should be doing (v3.1B just writes NIA and MSR directly), but I left that mostly un-modified, just passed in None so it knows not to try to set SRR1[TRAP].

So, can I merge to master?

https://git.libre-soc.org/?p=openpower-isa.git;a=shortlog;h=aa32b6a5063f32306a8b7e34f4d7741ea58cb79d

commit aa32b6a5063f32306a8b7e34f4d7741ea58cb79d
Author: Jacob Lifshay <programmerjake@gmail.com>
Date:   Mon Nov 6 20:54:52 2023 -0800

    misc fixes for fallout of copying insn inputs

commit d87097d170c563cb8a24856854ec8628b5fcb7e0
Author: Jacob Lifshay <programmerjake@gmail.com>
Date:   Mon Nov 6 20:38:18 2023 -0800

    System Call Interrupts do *not* set SRR1[TRAP]
    
    See PowerISA v3.1B Book III 7.5.14

commit 5df2836830494706e1d1a39bdf2470b26b32154c
Author: Jacob Lifshay <programmerjake@gmail.com>
Date:   Mon Nov 6 20:37:07 2023 -0800

    support TRAP being called without setting a trap_bit

commit d60bd9dca97512869b8a811086b7620904b6ffc5
Author: Jacob Lifshay <programmerjake@gmail.com>
Date:   Mon Nov 6 20:54:05 2023 -0800

    only write outputs that have .ok == True

commit b347cc3e750a82b2a94154636f1f0e1347fc98d7
Author: Jacob Lifshay <programmerjake@gmail.com>
Date:   Mon Nov 6 20:49:19 2023 -0800

    use create_full_args to generate insn arg list

commit 8710b3b496d1e38877ae6123e565c790359dc8a4
Author: Jacob Lifshay <programmerjake@gmail.com>
Date:   Mon Nov 6 20:46:03 2023 -0800

    add SelectableInt.ok

commit e554bd3457831925ea4f307a258d8f3cf8083825
Author: Jacob Lifshay <programmerjake@gmail.com>
Date:   Mon Nov 6 20:43:13 2023 -0800

    helper for one-source-of-truth for insn argument list for ISACaller and parser

commit 837d3f162bf071dc21fb4c7a7517a520b9cc2ae8
Author: Jacob Lifshay <programmerjake@gmail.com>
Date:   Mon Nov 6 20:41:11 2023 -0800

    copy_assign_rhs must retain subclasses of SelectableInt

commit 392744106fe2e7c59def9e19d81bfb711d94251e
Author: Jacob Lifshay <programmerjake@gmail.com>
Date:   Sun Nov 5 18:14:29 2023 -0800

    log load/stores to InstrInOuts

commit 2e76c0cbf9ef1620c32a1b7c3ccdc1369366a5ef
Author: Jacob Lifshay <programmerjake@gmail.com>
Date:   Wed Nov 1 18:36:10 2023 -0700

    format code

commit 34bdb3eba18e55c8620893aae235559919fd951a
Author: Jacob Lifshay <programmerjake@gmail.com>
Date:   Tue Oct 31 22:50:40 2023 -0700

    misc AST correctness fixes


Test Failures:
[case_2_rfid] SUBFAIL src/openpower/decoder/isa/test_caller_trap.py::TrapTest::test
[7:sv.cmp/zz/ff=gt/m=r3] SUBFAIL src/openpower/sv/trans/test_pysvp64dis.py::SVSTATETestCase::test_20_cmp
[0:sv.rldimi] SUBFAIL src/openpower/sv/trans/test_pysvp64dis.py::SVSTATETestCase::test_36_extras_rldimi
[0:rldimi] SUBFAIL src/openpower/sv/trans/test_pysvp64dis.py::SVSTATETestCase::test_37_extras_rldimi
[0:sv.rldimi.] SUBFAIL src/openpower/sv/trans/test_pysvp64dis.py::SVSTATETestCase::test_36_extras_rldimi_
[case_sc] SUBFAIL src/openpower/decoder/isa/test_caller_syscall.py::TestSysCall::test
= 6 failed, 461 passed, 75 skipped, 19 xfailed, 822 warnings in 403.70s (0:06:43) =

Comment 21 Luke Kenneth Casson Leighton 2023-11-07 06:58:25 GMT

(In reply to Jacob Lifshay from comment #20)
> Ok, I finally fixed all the issues caused by merging shriya_add_descriptions.
> 
> we're back to 6 test failures (same as before). The test failures are listed
> at the end of this comment.

fantastic.

> I also fixed sc and the previously-working tests for it, since SRR1 is
> specified to have bits 33:36 set to 0, 42:43 set to LEV (always 0 for now),
> and 44:47 set to 0. SRR1[TRAP] is *not* set.

please revert aaaalll of them. we are compatible with MICROWATT.
i keep having to tell you this.

the microwatt team had to entirely disregard hypervisor mode and
make some extremely complex investigations and subsequent modifications
to the linux kernel.

we CANNOT afford to go "off reservation" here, it will be LITERALLY months
if not over a year's worth of extra work.

> calling TRAP isn't really what sc's pseudo-code should be doing (v3.1B just
> writes NIA and MSR directly), but I left that mostly un-modified, just
> passed in None so it knows not to try to set SRR1[TRAP].

put it back please.

> So, can I merge to master?

not until those unauthorized sc changes are fully reverted.

Paul Mackerras and the rest of the IBM Research team spent months getting
microwatt and the linux-kernel-5.7 into shape.

if you make changes like this we end up NOT BEING ABLE TO RUN LINUX.

raise a *completely separate* bugreport please so that a FULL investigation
can be made including running the appropriate TestIssuer unit tests *and*
ensuring microwatt interoperability.

keep the changes you made around (in *another* branch) referenced under the
new bugreport, so that "git gc" does not lose them.  name to be prefixed
with the bugreport number (also please do stop doing "personalname/description",
use "bugnumber_description" instead, as a project standard)

Comment 22 Jacob Lifshay 2023-11-07 08:56:13 GMT

(edit: please continue conversation on sc on bug #1207)

(In reply to Luke Kenneth Casson Leighton from comment #21)
> (In reply to Jacob Lifshay from comment #20)
> > I also fixed sc and the previously-working tests for it, since SRR1 is
> > specified to have bits 33:36 set to 0, 42:43 set to LEV (always 0 for now),
> > and 44:47 set to 0. SRR1[TRAP] is *not* set.
> 
> please revert aaaalll of them. we are compatible with MICROWATT.
> i keep having to tell you this.

well, my changes make us *compatible* with microwatt again:
on sc-test branch (based off latest master):
https://git.libre-soc.org/?p=microwatt.git;a=commitdiff;h=44eab364aee1a7cb6198cb148d68a065715a434a

make -C sc_test
ln -sf sc_test/hello_world.bin main_ram.bin
make DOCKER=1
./core_tb > /dev/null

gives the following output on my computer:
before sc:
SRR0 = 0x0000000012345678 SRR1 = 0x0000000000005678 MSR = 0x90000000783F2903
sc
after sc:
SRR0 = 0x000000000000112C SRR1 = 0x8000000000002903 MSR = 0x80000000783F0001
before sc:
SRR0 = 0x0000000012345678 SRR1 = 0x0000000000005678 MSR = 0x9000000000000003
sc
after sc:
SRR0 = 0x000000000000112C SRR1 = 0x8000000000000003 MSR = 0x8000000000000001
<snip>

as you can see, microwatt sc doesn't set SRR1[TRAP] (which would be 0x20000), and it also clears all the SRR1 bits that sc is supposed to
(basically SRR1 &= ~0x783f0000)

Comment 23 Luke Kenneth Casson Leighton 2023-11-07 14:55:31 GMT

(In reply to Jacob Lifshay from comment #22)
> (edit: please continue conversation on sc on bug #1207)
> 
> (In reply to Luke Kenneth Casson Leighton from comment #21)
> > (In reply to Jacob Lifshay from comment #20)
> > > I also fixed sc and the previously-working tests for it, since SRR1 is
> > > specified to have bits 33:36 set to 0, 42:43 set to LEV (always 0 for now),
> > > and 44:47 set to 0. SRR1[TRAP] is *not* set.
> > 
> > please revert aaaalll of them. we are compatible with MICROWATT.
> > i keep having to tell you this.
> 
> well, my changes make us *compatible* with microwatt again:

okaay,interesting. can you check that they don't set the LEV bit,
in OP_SC? implementing Hypervisor is completely off-limits.

> on sc-test branch (based off latest master):
> https://git.libre-soc.org/?p=microwatt.git;a=commitdiff;
> h=44eab364aee1a7cb6198cb148d68a065715a434a

oh! good idea. niice teeest :)

> make -C sc_test
> ln -sf sc_test/hello_world.bin main_ram.bin
> make DOCKER=1
> ./core_tb > /dev/null
> 
> gives the following output on my computer:
> before sc:
> SRR0 = 0x0000000012345678 SRR1 = 0x0000000000005678 MSR = 0x90000000783F2903
> sc
> after sc:
> SRR0 = 0x000000000000112C SRR1 = 0x8000000000002903 MSR = 0x80000000783F0001
> before sc:
> SRR0 = 0x0000000012345678 SRR1 = 0x0000000000005678 MSR = 0x9000000000000003
> sc
> after sc:
> SRR0 = 0x000000000000112C SRR1 = 0x8000000000000003 MSR = 0x8000000000000001
> <snip>
> 
> as you can see, microwatt sc doesn't set SRR1[TRAP] (which would be
> 0x20000), and it also clears all the SRR1 bits that sc is supposed to
> (basically SRR1 &= ~0x783f0000)

ok let's track the source... execute1.vhdl

1055             when OP_SC =>
1056                 -- check bit 1 of the instruction is 1 so we know this is sc;
1057                 -- 0 would mean scv, so generate an illegal instruction interrupt
1058                 if e_in.insn(1) = '1' then
1059                     v.trap := '1';
1060                     v.advance_nia := '1';
1061                     v.e.intr_vec := 16#C00#;
1062                     if e_in.valid = '1' then
1063                         report "sc";
1064                     end if;
1065                 else
1066                     illegal := '1';
1067                 end if;

so trap=1, intrvec=0xc00...

ahh but for OP_TRAP

1093             when OP_CMP =>
1094             when OP_TRAP =>
1095                 -- trap instructions (tw, twi, td, tdi)
1096                 v.e.intr_vec := 16#700#;
1097                 -- set bit 46 to say trap occurred
1098                 v.e.srr1(47 - 46) := '1';

YYEP,nice catch jacob,sorry for freaking out. yes use a "None" option but
make sure it is documented. i must have missed that when doing the original
HDL/test

Comment 24 Jacob Lifshay 2023-11-09 02:52:45 GMT

(In reply to Luke Kenneth Casson Leighton from comment #21)
> (In reply to Jacob Lifshay from comment #20)
> > So, can I merge to master?
> 
> not until those unauthorized sc changes are fully reverted.

ok, since the sc issues have been fully resolved and we're keeping the changes I made (bug #1207), can I merge now?

Comment 25 Luke Kenneth Casson Leighton 2023-11-12 21:14:21 GMT

From git.libre-soc.org:openpower-isa
   aa32b6a5..7062e727  programmerjake/bug-1177 -> origin/programmerjake/bug-1177

jacob i asked you not to use that style. you are not the "personal private owner"
as we do not run that style of project.

please *listen*

i said *use bugnumber_description* not
personalnameindicatingrwstrictiveownership/bugnumber.

Comment 26 Jacob Lifshay 2023-11-12 21:26:31 GMT

(In reply to Luke Kenneth Casson Leighton from comment #25)
> From git.libre-soc.org:openpower-isa
>    aa32b6a5..7062e727  programmerjake/bug-1177 ->
> origin/programmerjake/bug-1177
> 
> jacob i asked you not to use that style. you are not the "personal private
> owner"
> as we do not run that style of project.

yes. I will do that for all future branches, I didn't rename existing branches because that means all the links and references to that branch no longer work.

Comment 27 Luke Kenneth Casson Leighton 2023-11-13 02:09:35 GMT

(In reply to Jacob Lifshay from comment #26)

> yes. I will do that for all future branches, I didn't rename existing
> branches because that means all the links and references to that branch no
> longer work.

then you should have communicated that (unilateral, unauthorized)
decision to me in order to avoid placing me into distress.

Comment 28 Jacob Lifshay 2023-11-13 02:14:23 GMT

(In reply to Luke Kenneth Casson Leighton from comment #27)
> then you should have communicated

sorry, I forgot.

Comment 29 Luke Kenneth Casson Leighton 2023-11-13 10:23:14 GMT

(In reply to Jacob Lifshay from comment #28)

> sorry, I forgot.

not a problem jacob.

Comment 30 Jacob Lifshay 2023-11-13 10:29:52 GMT

(In reply to Jacob Lifshay from comment #24)
> can I merge now?

I'll merge on tue unless i get told to merge now or to wait.

Comment 31 Jacob Lifshay 2023-11-14 23:19:46 GMT

(In reply to Jacob Lifshay from comment #30)
> (In reply to Jacob Lifshay from comment #24)
> > can I merge now?
> 
> I'll merge on tue unless i get told to merge now or to wait.

rebased and merged. tests pass on my computer:
https://git.libre-soc.org/?p=openpower-isa.git;a=shortlog;h=22970a2dd5e9dadd01256410e0c1d2b6854f3931