Besides the actual decoding needed to understand which instruction we're going to simulate, we also need a way to specify the actual logic behind the instruction to be executed. We already have a human-readable specification in Markdown format, and have successfully used it for our reference Python-based simulator. We need to follow the same approach for cavatools, and generate the code for all PowerISA instructions.
An important remark: even though we're able to execute an instruction only after it's been decoded, the decoder-related task (#979) is not a show-stopper here. These tasks can be completed in parallel.
IRC discussion https://libre-soc.org/irclog/%23libre-soc.2023-05-29.log.html#t2023-05-29T20:10:07
i recommend using memcpy instead of *(uint16_t *)addr, since it avoids C language UB and generally gets compiled to a single load/store, not a function call: https://gcc.godbolt.org/z/sj9o1novn

#include <stdint.h>
#include <string.h>

uint16_t load(uint8_t *mem, uint64_t EA) {
    uint16_t retval;
    memcpy(&retval, &mem[EA], 2);
    return retval;
}

load:                                   # @load
        lhzx 3, 3, 4
        blr
        .long 0
        .quad 0
Dmitry, is this task doable? Again apologies, I don't remember if we agreed this task is actually possible within the time constraints. My chart on the discussion page proposed to cancel this task.
(In reply to Andrey Miroshnikov from comment #4)
> Dmitry, is this task doable?

yes.

> Again apologies, I don't remember if we agreed this task is actually
> possible within the time constraints.

it is.

> My chart on the discussion page
> proposed to cancel this task.

no.
Hi guys! Andrey, I'm sorry, I've missed your message.

(In reply to Luke Kenneth Casson Leighton from comment #5)
> (In reply to Andrey Miroshnikov from comment #4)
> > Dmitry, is this task doable?
>
> yes.
>
> > Again apologies, I don't remember if we agreed this task is actually
> > possible within the time constraints.
>
> it is.
>
> > My chart on the discussion page
> > proposed to cancel this task.
>
> no.

Luke, I agree with you it's both doable and can be kept, but we have to establish the criteria, otherwise we'll put it at risk. We'd better be damn careful! Here's what I suggest:

1. Stick to developing the key concepts for code generation. Likely only vanilla PPC at this stage.

2. Generate something to be reused as yet another library; this allows us to combine it with libopid. I'm thinking of generating a table of per-instruction callbacks along with their bodies and some context around, like this:

/*
 * Intended to be called from the decoder.
 * Prefix is NOP for vanilla PPC.
 * Just a quick approximation, something more complex is needed.
 */
static void
opsim_add(struct opsim_ctx *ctx, uint64_t prefix, uint64_t suffix)
{
    /* can be optimized later, don't care for now */
    uint64_t RT = opsim_reg_fetch(ctx, opsim_add_RT(suffix));
    uint64_t RA = opsim_reg_fetch(ctx, opsim_add_RA(suffix));
    uint64_t RB = opsim_reg_fetch(ctx, opsim_add_RB(suffix));

    RT = (RA + RB);
    opsim_reg_store(ctx, opsim_add_RT(suffix), RT);
}

or even:

static void
opsim_add(struct opsim_ctx *ctx, uint64_t prefix, uint64_t suffix)
{
    opsim_reg_store(ctx, opsim_add_RT(suffix),
        (opsim_reg_fetch(ctx, opsim_add_RA(suffix)) +
         opsim_reg_fetch(ctx, opsim_add_RB(suffix))));
}

3. Nothing to be touched yet in the existing codegen; we'll just add a C generation mode into the current generator, due to time constraints. Yes, there's room for improvement, and some things might be done differently. But for now -- no changes unless the lack of them stops us.
We'll be under a huge risk otherwise.
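As an aside, the opsim_add_RT/RA/RB accessors in the sketch above would just slice operand fields out of the 32-bit suffix. A Python model of what such generated accessors would compute (the opsim_* names are the hypothetical ones from the C sketch; the bit positions are the standard XO-form RT/RA/RB fields, IBM bit numbering with bit 0 as MSB):

```python
# Model of generated field accessors: extract RT/RA/RB from a 32-bit
# PowerISA word. IBM bit numbering: bit 0 is the MSB of the 32-bit word.
def field(insn, lo, hi):
    """Return bits lo..hi (inclusive, IBM numbering) of a 32-bit word."""
    width = hi - lo + 1
    shift = 31 - hi
    return (insn >> shift) & ((1 << width) - 1)

# XO-form operand fields (hypothetical opsim_* naming as in the C sketch)
def opsim_add_RT(suffix): return field(suffix, 6, 10)
def opsim_add_RA(suffix): return field(suffix, 11, 15)
def opsim_add_RB(suffix): return field(suffix, 16, 20)
```

For `add 3,4,5` (encoded 0x7C642A14) these return 3, 4 and 5 respectively.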
(In reply to Dmitry Selyutin from comment #6) > Luke, I agree with you it's both doable and can be kept, but we have to > establish the criteria, otherwise we'll put it under risk. yes. i recommend a compatible drop-in replacement for PowerDecoder. (power_decoder.py) which is easier because the existing unit tests can be used (like, all of them: TestIssuer as well as ISACaller). this means a cffi interface to return PowerOp instances. do not attempt to rewrite PowerOp, it is too fundamental. a replacement class that does the exact same job, and is used *only* by the new c-based decoder: not a problem.
(In reply to Luke Kenneth Casson Leighton from comment #7) > yes. i recommend a compatible drop-in replacement for PowerDecoder. > (power_decoder.py) which is easier because the existing unit tests > can be used (like, all of them: TestIssuer as well as ISACaller). oh hang on sorry, pseudocode compiler not power decoder. (In reply to Dmitry Selyutin from comment #6) > 3. Nothing to be touched yet in the existing codegen, we'll just add C > generation mode into the current generator, due to time constraints. yes. cffi interface as well, to allow unit tests comparing against existing python-based simulator, ultimately i want it to *replace* the python-based compiler. skip functions such as EXTS and others (Power ISA Section 1.3) for now. > Yes > there's a room for improvement, some things might be done differently. But > for now -- no changes unless lack of them stops us. We'll be under a huge > risk otherwise. v. true
(In reply to Dmitry Selyutin from comment #6) > opsim_reg_store(ctx, opsim_add_RT(suffix), RT); opsim_reg_store must set an "ok" bit in a struct to indicate that the register has been modified. long story.
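A tiny Python sketch of that "ok" convention (names hypothetical, just modelling what opsim_reg_store would do in the generated C): the store routine flags every register it touches, so the caller can tell exactly which registers an instruction modified.

```python
# Hypothetical model of a register file whose store operation records an
# "ok" bit per register, marking it as modified by the instruction.
class RegFile:
    def __init__(self, nregs=32):
        self.regs = [0] * nregs
        self.ok = [False] * nregs   # "ok" bit: register was written

    def store(self, idx, value):
        self.regs[idx] = value & 0xFFFFFFFFFFFFFFFF  # keep to 64 bits
        self.ok[idx] = True

    def fetch(self, idx):
        return self.regs[idx]
```

After executing one instruction, only the registers it actually wrote have their "ok" bit set.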
OK, the first thing I did was replace print() incantations with log(). I want to at least be able to observe MY debug output instead of everything else. You should observe no differences unless you explicitly incantate SILENCELOG=true. https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;h=659f58e0144c0ddb59439501ed9c50ba993f92fe
(In reply to Dmitry Selyutin from comment #10)
> OK, first thing I did was replacing print() incantations with log().

goood. i did some yesterday, already.

> I want
> to at least be able to observe MY debug instead of everything else. You
> should observe no differences unless you explicitly incantate
> SILENCELOG=true.

fantastic. i have no problem with that at all. (as i am dealing all the time with *individual damn bits* i need to literally see everything and i do mean everything. as long as that's not interfered with i'm good. i say good but what i mean is, "my mind is utterly twisted and melted by the insane level of detail *which i memorise and look out for")

> https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;
> h=659f58e0144c0ddb59439501ed9c50ba993f92fe
I've spent a LOT of hours debugging and checking the opportunities. Here are some conclusions I've come to so far.

1. I might appear to be a devil's advocate, but I actually think that python-ply is a good choice. Regardless of our implementation's pros and cons, for me it already proves python-ply is a sufficiently good instrument, and I'm going to stick with it.

2. What is bad from the C code generation perspective is that we produce Python AST. It's not a suitable candidate for C code generation. What's worse, we generate multiple nodes for a single entity, and it gets too complicated even in Python, let alone C.

3. It gets especially hairy when we produce Python calls and modify the code after it has been obtained with AST. That is, we get the AST result and patch it in place.

4. We cannot avoid modifying the nodes produced, but there are several issues. First, all our nodes are Python AST. Second, some nodes are built in place, e.g. concat and selectassign and many others (basically all which lead to ast.Call). Third, whenever I update this code, it could potentially bring a disaster.

All in all, I don't believe that the current code is a good basis for a C code generator. It works for Python, and I'd keep it so that the code WORKS. Here's what I suggest instead.

1. Keep python-ply, but recreate the overall code so that it produces CUSTOM nodes instead. That is, where we have <u, there will be a `class LTUnsigned(Node)` construction, and similarly for all other stuff. Basically a real language-agnostic AST. Language-agnostic is an absolutely critical item!

2. Once we have the AST collected and working, we can have walkers and visitors for it.

3. This AST can be used for C code generation. Ideally, had we no time constraints, we'd build ANOTHER AST atop of that: PseudocodeAST => CAst. But for now we'll likely have to follow the simpler approach.

4. If we have time and opportunities, we'd eventually follow the same approach for Python code.
That is, just convert PseudocodeAST to ast.Module. I don't think it's doable in terms of the current approach, mostly due to the fact that it produces a pretty complicated Python AST. I could try changing the current code, but I'm absolutely sure my changes would drive all the code around it completely nuts.

I wish I had returned with more encouraging news, but these are the conclusions I've been able to reach so far. Basically my suggestion is to use python-ply again but output completely different nodes. So the code will eventually look familiar, but still build something completely different.

I'd like to know your opinion, and will meanwhile move on with checking other projects using ply to better understand how it can be used. Before we proceed, two important items: 1) python-ply is to stay, we don't have time for brand new frameworks, plus I'm mostly satisfied with it; 2) there's no time (and a strong lack of desire) to teach the whole Python ast to produce Python code.
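The custom-node idea above can be sketched in a few lines of Python (all class names here are hypothetical, chosen to echo the `class LTUnsigned(Node)` example): a language-agnostic tree, plus separate walkers that render the same tree back to pseudocode or to C.

```python
# Minimal sketch of language-agnostic AST nodes. Class names hypothetical;
# "<u" (unsigned less-than) becomes its own node kind instead of a
# patched-up Python ast.Compare.
class Node:
    pass

class Symbol(Node):
    def __init__(self, name):
        self.name = name

class LTUnsigned(Node):
    def __init__(self, left, right):
        self.left, self.right = left, right

# Separate walkers render the *same* tree into different languages.
def to_pseudocode(node):
    if isinstance(node, Symbol):
        return node.name
    if isinstance(node, LTUnsigned):
        return f"{to_pseudocode(node.left)} <u {to_pseudocode(node.right)}"
    raise NotImplementedError(type(node))

def to_c(node):
    if isinstance(node, Symbol):
        return node.name
    if isinstance(node, LTUnsigned):
        # real codegen would cast operands to an unsigned type first
        return f"({to_c(node.left)} < {to_c(node.right)})"
    raise NotImplementedError(type(node))
```

The point is that neither walker ever sees Python AST: the tree carries the pseudocode semantics, and each backend decides on its own spelling.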
(In reply to Dmitry Selyutin from comment #12) > I'd like to know your opinion and will meanwhile move on with checking other > projects using ply to better understand how it can be used. Before we > proceed, two important items: 1) python-ply is to stay, we don't have time > for brand new frameworks, using python ply is fine with me, but please modify the parser rather than introduce kludges like pre-parsing steps to insert colons (unless you're just reusing existing parsing code unmodified). also, we need to change operator precedence to match the PowerISA specification: bug #1082 (it isn't a lot of work, just change the operator list order and run all the tests.)
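For reference, the precedence fix amounts to reordering a table. A toy precedence-climbing evaluator (purely illustrative, not the actual ply grammar from bug #1082) shows that the binding-power table alone decides how "1 + 2 * 3" groups, which is exactly what reordering ply's precedence list changes:

```python
# Toy precedence-climbing evaluator: prec maps operator -> binding power,
# and changing the table changes the parse, with no grammar edits needed.
import re

def evaluate(source, prec):
    tokens = re.findall(r"\d+|[+*]", source)
    pos = 0

    def parse(min_bp):
        nonlocal pos
        lhs = int(tokens[pos]); pos += 1
        while pos < len(tokens) and prec[tokens[pos]] >= min_bp:
            op = tokens[pos]; pos += 1
            rhs = parse(prec[op] + 1)       # +1 => left-associative
            lhs = {"+": lhs + rhs, "*": lhs * rhs}[op]
        return lhs

    return parse(0)
```

With the conventional table `{"+": 1, "*": 2}` the expression "1+2*3" evaluates to 7; swap the two entries and the same input evaluates to 9.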
(In reply to Jacob Lifshay from comment #13) > using python ply is fine with me, but please modify the parser rather than > introduce kludges like pre-parsing steps to insert colons (unless you're > just reusing existing parsing code unmodified). or copying.
(In reply to Dmitry Selyutin from comment #12)
> 3. This AST can be used for C code generation. Ideally, had we not time
> constraints, we'd build ANOTHER AST atop of that: PseudocodeAST => CAst. But
> for now likely we'll have to follow the simpler approach.

i'm good with that. if this was a "full" language (scopes, exceptions, global/local variables, classes, gotos) i would say NO hell NO.

(btw please drop functions. they were not added and should not have been added by jacob. please drop "def" from the AST and modify pyfnwriter and appendix pseudocode to not support functions. this means separating each "function" into an explicit separate pseudocode file)

> 4. If we have time and opportunities, we'd just eventually follow the same
> approach for Python code. That is, just convert PseudocodeAST to ast.Module.
>
> I don't think it's doable in terms of the current approach mostly due to the
> fact that it produces a pretty complicated Python AST.

yes. if you have seen the notes, i started from dabeaz's GardenSnake.py as *literally* the first commit, then fixed the lexer problem (with an additional lexer phase that inline-inserted the required TOKENs):

https://git.libre-soc.org/?p=openpower-isa.git;a=blob;f=src/openpower/decoder/pseudo/lexer.py;h=c42f8330;hb=HEAD#l62

and then spent about FOUR MONTHS over the next 2 years performing incremental changes and fixes that got it to do the job. the only reason i could start from GardenSnake.py was because i spotted that fundamentally the underlying syntax (not the grammar) of python and Section 1.3 POWER Book I ISA are identical.

*over time* the use of python-astor, which i used because some dickhead in the python community ripped out ast.py of python2.7 and replaced it with fucking c code *with no introspection or source* "bcuz faster", has become a serious impediment.
i can barely edit parser.py without also creating a prerequisite code-fragment in power_pseudo.py and using astor.dump()

https://git.libre-soc.org/?p=openpower-isa.git;a=blob;f=src/openpower/decoder/power_pseudo.py;hb=HEAD

> I could try changing
> the current code, but I'm absolutely sure my changes will make all the code
> around completely nuts.

yyep. it's reached the "mature complex but stable" phase that all software developed incrementally will get to.

> I wish I had returned with a more encouraging news, but these are
> conclusions I've been able to reach so far. Basically my suggestion is to
> use python-ply again and output a completely different nodes. So the code
> eventually will look familiar but still build something completely different.

seriously and honestly, i was expecting that anyway. an astor-to-c converter (basically a replication of cython) was a "hope" rather than a requirement.

> I'd like to know your opinion and will meanwhile move on with checking other
> projects using ply to better understand how it can be used. Before we
> proceed, two important items: 1) python-ply is to stay, we don't have time
> for brand new frameworks,

no we do not.

> plus I'm mostly satisfied with it;

it's pretty damn good. it is sad that dabeaz abandoned it.

> 2) no there's
> no time (and there's a strong lack of desire) to teach the whole Python ast
> to produce Python code.

it *used* to do that. go right back to the original very first commit.

(context: GardenSnake.py was - is - an interactive python interpreter written *in* python using python-ply. it worked perfectly for *very early* versions of python and i am going back 20+ years here but at some point the syntax changed and dabeaz never got round to fixing GardenSnake.py... that was my very first task)
(In reply to Luke Kenneth Casson Leighton from comment #15)
> (btw please drop functions. they were not added and should not have been
> added by jacob. please drop "def" from the AST and modify pyfnwriter
> and appendix pseudocode to not support functions. this means
> separating each "function" into an explicit separate pseudocode file)

PowerISA very much has functions (though they don't use the def keyword): Book I Appendix A

Round Integer( sign, frac0:64, gbit, rbit, xbit, round_mode ):
    inc <- 0
    if round_mode = 0b00 then do /* Round to Nearest */
        if sign || frac64 || gbit || rbit || xbit = 0bU11UU then inc <- 1
        if sign || frac64 || gbit || rbit || xbit = 0bU011U then inc <- 1
        if sign || frac64 || gbit || rbit || xbit = 0bU01U1 then inc <- 1
    end
    if round_mode = 0b10 then do /* Round toward +Infinity */
        if sign || frac64 || gbit || rbit || xbit = 0b0U1UU then inc <- 1
        if sign || frac64 || gbit || rbit || xbit = 0b0UU1U then inc <- 1
        if sign || frac64 || gbit || rbit || xbit = 0b0UUU1 then inc <- 1
    end
    if round_mode = 0b11 then do /* Round toward -Infinity */
        if sign || frac64 || gbit || rbit || xbit = 0b1U1UU then inc <- 1
        if sign || frac64 || gbit || rbit || xbit = 0b1UU1U then inc <- 1
        if sign || frac64 || gbit || rbit || xbit = 0b1UUU1 then inc <- 1
    end
    frac0:64 <- frac0:64 + inc
    FPSCR[FR] <- inc
    FPSCR[FI] <- gbit | rbit | xbit
    return
(In reply to Jacob Lifshay from comment #16) > (In reply to Luke Kenneth Casson Leighton from comment #15) > > (btw please drop functions. they were not added and should not have beem > > added by jacob. please drop "def" from the AST and modify pyfnwriter > > and appendix pseudocode to not support functions. this means > > separating each "function" into an explicit separate pseudocode file) > > PowerISA very much has functions (though they don't use the def keyword): see also bfp_ROUND_TO_BFP32 which in the spec PDF is basically a function: it has a function signature line, has calls to other functions, has return statements, etc.
(In reply to Jacob Lifshay from comment #16)
> > separating each "function" into an explicit separate pseudocode file)
>
> PowerISA very much has functions (though they don't use the def keyword):
> Book I Appendix A
>
> Round Integer( sign, frac0:64, gbit, rbit, xbit, round_mode ):

i expect and anticipate the pseudocode *file* to be called round_integer.mdwn
(In reply to Luke Kenneth Casson Leighton from comment #18) > (In reply to Jacob Lifshay from comment #16) > separating each "function" into an explicit separate pseudocode file) > > > > PowerISA very much has functions (though they don't use the def keyword): > > Book I Appendix A > > > > Round Integer( sign, frac0:64, gbit, rbit, xbit, round_mode ): > > i expect and anticipate the pseudocode *file* to be called round_integer.mdwn so you're suggesting *deleting* the function signatures from the PowerISA specification? I don't think they'll like that. please don't do that. i'm fine if you want one function per file (but it'll be annoying), but we *need* the function signatures to stay. they will *still be functions*.
(In reply to Jacob Lifshay from comment #19) > so you're suggesting *deleting* the function signatures from the PowerISA > specification? I don't think they'll like that. please don't do that. > > i'm fine if you want one function per file (but it'll be annoying), but we > *need* the function signatures to stay. they will *still be functions*. function signatures are necessary so we know what function arguments we have and what to name them. icr where and couldn't find it from some cursory searching, but iirc we discussed pseudocode functions before and concluded we need them.
(edit: ignoring is fine too) (In reply to Jacob Lifshay from comment #20) > (In reply to Jacob Lifshay from comment #19) > > i'm fine if you want one function per file (but it'll be annoying), but we > > *need* the function signatures to stay. they will *still be functions*. dmitry, so it's clear, I'm fine if you decide it's too much work to implement pseudocode functions for now and just raise an error or ignore the parsed functions, but please don't delete them.
(In reply to Jacob Lifshay from comment #20) > icr where and couldn't find it from some cursory searching, but iirc we > discussed pseudocode functions before and concluded we need them. ok good enough. so many details.
(In reply to Jacob Lifshay from comment #19)
> so you're suggesting *deleting* the function signatures from the PowerISA
> specification?

no. don't be absurd.

if they are in section 1.3 book I i need to see the wording. i don't ever recall seeing them. if not there that needs raising with OPF ISA WG.

just re-read v3.0C, sections 1.3.1, 1.3.2, 1.3.3 and 1.3.4. there is nothing i can see, no definition of "function" or parameters or the keyword "return"
(In reply to Luke Kenneth Casson Leighton from comment #23) > (In reply to Jacob Lifshay from comment #19) > > > so you're suggesting *deleting* the function signatures from the PowerISA > > specification? > > no. don't be absurd. > > if they are in section 1.3 book I i need to see the wording. afaict they're not there. functions are just implicitly defined. they are explicitly mentioned though (v3.1B): section title: 7.6.2.2 VSX Instruction RTL Function Calls and then it proceeds to define a bunch of functions, some in pseudo-code, some partially english description, many calling other defined functions. so, they have functions that they define and use lots of places, just they never define what a function is or how to interpret one -- they probably assume the reader has heard of function calls and returns before (which imo is a good assumption, but being explicit is almost always better). > i don't ever recall seeing them. if not there that needs raising > with OPF ISA WG. you want to do that?
(In reply to Jacob Lifshay from comment #24) > afaict they're not there. functions are just implicitly defined. which is a direct violation of their rules on the spec. > assume the reader has heard of function calls and returns before (which imo > is a good assumption, but being explicit is almost always better). it's actually required: the ISA TWG *requires* that under no circumstances can there be "assumptions" > > i don't ever recall seeing them. if not there that needs raising > > with OPF ISA WG. > > you want to do that? not really. i've had enough of dealing with them. when they agree to properly cooperate then yes.
I've started playing with this, and found some complexities in mdis module. I had to drop parent/path in walker otherwise this becomes too complex. Basically the issue is that yielding tuples inside tuple hook (unsurprisingly) leads to an infinite recursion. Since we don't use this functionality for now, I decided it'd be easier to drop it and return to this subject later. https://git.libre-soc.org/?p=mdis.git;a=commitdiff;h=835549b3d024ea03f6363f6131cce3268c9aa52a https://git.libre-soc.org/?p=mdis.git;a=commitdiff;h=84f9c40cff102be5de4a4063931764d74d7abbae
I'm currently writing the basic classes which represent pseudocode nodes, and have also started implementing the first simple visitor, which produces the pseudocode given the root node. Ignore the contents, they are complete nonsense.

tree = Module([
    If(
        test=BinaryOp(
            left=Symbol("RA"),
            op=LtU(),
            right=Symbol("RB"),
        ),
        body=Block([
            Assign(
                lvalue=Symbol("RT"),
                rvalue=BinaryOp(
                    left=Symbol("RA"),
                    op=Add(),
                    right=Symbol("RB"),
                ),
            ),
            AssignIEA(
                lvalue=Symbol("NIA"),
                rvalue=Call([
                    Symbol("RA"),
                    Call([Symbol("RB")]),
                ]),
            ),
        ]),
        orelse=Block([
            AssignIEA(
                lvalue=Symbol("NIA"),
                rvalue=Symbol("CIA"),
            ),
        ]),
    ),
])

for (level, line) in util.pcode(tree):
    stmt = ((" " * (level * 4)) + line)
    print(stmt)
The result is:

if RA <u RB then
    RT <- RA + RB
    NIA <-iea CALL(RA, CALL(RB))
else
    NIA <-iea CIA
I need the pseudocode printing both for debugging and to ensure the overall approach is fine. The code goes like this (example handlers for particular node types):

@Hook(If)
def If(self, node):
    yield node
    stmt = " ".join(["if", str(self[node.test]), "then"])
    self[node].emit(stmt=stmt)
    for (level, stmt) in self[node.body]:
        self[node].emit(stmt=stmt, level=level)
    if node.orelse:
        self[node].emit("else")
        stmt = str(self[node.orelse])
        self[node].emit(stmt=stmt)

@Hook(Symbol)
def Symbol(self, node):
    yield node
    self[node].emit(stmt=str(node))

As you see, each node gets a code object associated with it. The code object has an emit method, which feeds yet another statement, taking the current indentation level into account.
Obviously C will need more context to be collected and stored upon visiting, but you get the idea.
(In reply to Dmitry Selyutin from comment #28) > The result is: > > if RA <u RB then > RT <- RA + RB > NIA <-iea CALL(RA, CALL(RB)) > else > NIA <-iea CIA nice! already! that will come in handy for debug purposes during AST-Node-morphing/manipulation (why astor has a dump function) (In reply to Dmitry Selyutin from comment #29) > @Hook(If) > def If(self, node): > yield node > stmt = " ".join(["if", str(self[node.test]), "then"]) ... ... quite ridiculously clear and simple, isn't it? > As you see, each node gets some code object associated with it. The code > object has an emit method, which feeds yet another statement considering the > current indentation level for it. very cool.
Supported more nodes; I'm not saying all nodes are already implemented, but many are.

tree = Module(
    description=String("Subtract From"),
    form=Form.XO,
    mnemonics=Mnemonics([
        Mnemonic(
            name=String("subf"),
            dynamic_operands=DynamicOperands([
                DynamicOperand("RT"),
                DynamicOperand("RA"),
                DynamicOperand("RB"),
            ]),
            static_operands=StaticOperands([
                StaticOperand(key=String("OE"), value=Int(0)),
                StaticOperand(key=String("Rc"), value=Int(0)),
            ]),
        ),
        Mnemonic(
            name=String("subf."),
            dynamic_operands=DynamicOperands([
                DynamicOperand("RT"),
                DynamicOperand("RA"),
                DynamicOperand("RB"),
            ]),
            static_operands=StaticOperands([
                StaticOperand(key=String("OE"), value=Int(0)),
                StaticOperand(key=String("Rc"), value=Int(1)),
            ]),
        ),
        Mnemonic(
            name=String("subfo"),
            dynamic_operands=DynamicOperands([
                DynamicOperand("RT"),
                DynamicOperand("RA"),
                DynamicOperand("RB"),
            ]),
            static_operands=StaticOperands([
                StaticOperand(key=String("OE"), value=Int(1)),
                StaticOperand(key=String("Rc"), value=Int(0)),
            ]),
        ),
        Mnemonic(
            name=String("subfo."),
            dynamic_operands=DynamicOperands([
                DynamicOperand("RT"),
                DynamicOperand("RA"),
                DynamicOperand("RB"),
            ]),
            static_operands=StaticOperands([
                StaticOperand(key=String("OE"), value=Int(1)),
                StaticOperand(key=String("Rc"), value=Int(1)),
            ]),
        ),
    ]),
    alterations=Alterations([
        Alteration(
            registers=Registers([
                String("CR0"),
            ]),
            condition=Condition(key=String("Rc"), value=Int(1)),
        ),
        Alteration(
            registers=Registers([
                String("SO"),
                String("OV"),
                String("OV32"),
            ]),
            condition=Condition(key=String("OE"), value=Int(1)),
        ),
    ]),
    code=Code([
        Assign(
            lvalue=String("RT"),
            rvalue=BinaryOp(
                left=BinaryOp(
                    left=UnaryOp(
                        op=Not(),
                        expr=String("RA"),
                    ),
                    op=Add(),
                    right=String("RB"),
                ),
                op=Add(),
                right=Int(1),
            ),
        ),
    ]),
)

for (level, line) in util.pcode(tree):
    stmt = ((" " * (level * 4)) + line)
    print(stmt)
# Subtract From

XO-form

* subf RT,RA,RB (OE=0 Rc=0)
* subf. RT,RA,RB (OE=0 Rc=1)
* subfo RT,RA,RB (OE=1 Rc=0)
* subfo. RT,RA,RB (OE=1 Rc=1)

Pseudo-code:

    RT <- ((¬(RA) + RB) + 1)

Special Registers Altered:

    CR0 (if Rc=1)
    SO OV OV32 (if OE=1)
(In reply to Dmitry Selyutin from comment #33)
> # Subtract From
>
> XO-form
>
> * subf RT,RA,RB (OE=0 Rc=0)
> * subf. RT,RA,RB (OE=0 Rc=1)
> * subfo RT,RA,RB (OE=1 Rc=0)
> * subfo. RT,RA,RB (OE=1 Rc=1)

ah - the focus and the scope is solely and exclusively the pseudocode. there is no money payable for work on anything other than the pseudocode, plus it is too risky to start adding to or changing the scope in an unplanned manner when there is so little time (cavatools and cryptoprimitives are on a HARD deadline of March 1st 2024 - only 9 weeks).

therefore don't, whatever you do, increase the scope *in any way*, but if you have ideas raise them as "Future" and we can discuss the possibility of adding them to one of the two upcoming grants when appropriate, if appropriate, ok?

really *really* don't increase the scope here, it is just too scary to contemplate missing the commitment made to NLnet because of some unplanned addition to the scope, no matter how good it is.
...pause... sorry :)
OK Luke, that's fair, I totally agree with you. I won't pursue this target in the scope of this task, and will abandon these (quite incomplete) changes now. Still, I think this would be the right choice, and we'll eventually do it. This largely relates to the binutils-related work as well; we need the possible forms there. I'll raise the relevant task later, OK?
(In reply to Dmitry Selyutin from comment #36)
> OK Luke, that's fair, I totally agree with you. I won't pursue this target
> in scope of this task,

they have quite a lot of additional implications for changes that make me nervous if not properly evaluated (under time-pressure at that)

> and will abandon these (quite incomplete) changes now.

suggest to keep the git commit around (no force-branch-push which loses it. or.. ah! could explicitly tag it?)

> Still I think this would be the right choice, and eventually will do it.

i like the very deliberate brutal simplicity of the current parser. however an actual AST (esp. in BNF form) is ultimately better but hm the lexer/parser split in python-ply leaves me drumming my fingers at the additional complexity.

> This largely relates to binutils-related works as well, we need possible
> forms there. I'll raise the relevant task later, OK?

no problem. parent: bug #1212
(In reply to Luke Kenneth Casson Leighton from comment #37)
> > Still I think this would be the right choice, and eventually will do it.
>
> i like the very deliberate brutal simplicity of the current parser. however
> an actual AST (esp. in BNF form) is ultimately better but hm the lexer/parser
> split in python-ply leaves me drumming my fingers at the additional
> complexity.

I think we should aim for an intermediate representation that captures the *semantics* (since those are what we're trying to translate) rather than trying to exactly match the *syntax*. So, no, I don't think an AST is appropriate; we should instead make a tree structure that may be similar to an AST but prioritizes capturing semantic differences over being in BNF form or meeting other syntax-focused requirements.

An example of what I mean: the difference between FPSCR[VXSNAN] (a named field access, not indexing, usually written FPSCR.VXSNAN) and v[i + 3] (bit indexing). If we were focused on syntax, those would both just be the indexing operator with subexpressions of FPSCR and VXSNAN, or v and i + 3, respectively. Focusing on semantics means those would instead be different node kinds, since they are completely different kinds of operations. FPSCR[VXSNAN] would probably be the same kind of node as FPSCR.VXSNAN, with a flag if necessary to tell the two syntactical forms apart.
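A small Python sketch of that distinction (node names hypothetical): the same surface syntax a[b] lowers to different IR node kinds depending on whether b names a field or computes a bit index.

```python
# Hypothetical IR nodes: FPSCR[VXSNAN] is a named-field access, while
# v[i + 3] is bit indexing -- different node kinds despite identical syntax.
from dataclasses import dataclass

@dataclass
class FieldAccess:
    target: str
    field: str

@dataclass
class BitIndex:
    target: str
    index: object   # an arbitrary expression subtree

def lower_subscript(target, sub, known_fields):
    """Choose the IR node kind from semantics rather than syntax."""
    if isinstance(sub, str) and sub in known_fields.get(target, ()):
        return FieldAccess(target, sub)
    return BitIndex(target, sub)
```

The lookup table of known register fields is the extra semantic information the parser alone doesn't have, which is exactly why this is an IR decision rather than a grammar one.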
(In reply to Jacob Lifshay from comment #38)
> the difference between FPSCR[VXSNAN] (a named field access, not indexing,
> usually written FPSCR.VXSNAN)

dmitry, in case that was confusing: note that we decided to use the a[b] syntax whenever the PDF uses (html-ified since bugzilla doesn't support formatting) a<sub>b</sub> (where b is a subscript right after a).
(In reply to Jacob Lifshay from comment #38) > I think we should aim more for an intermediate representation that aims to > capture the *semantics* a lot of the benefit of an IR is that it makes simple and obvious all of the semantic differences that are not obvious from just parsing the input syntax.
After thinking a lot, I've decided to take the current lexer and parser. Even though I had good success on this road, I have to reserve some time for the generation itself. I suggest we raise relevant tasks for the ideas I have in mind once this task is completed; from the code it'll be obvious why these ideas and improvements are good to have.
That said, whilst I'm keeping the current parser, it still has to produce different nodes, so all previous remarks are still valid.
(In reply to Dmitry Selyutin from comment #42) > keeping the current parser ...basically having its copy. No intents to modify what we have now.
I have to say that we'll have to reimplement bits of it eventually: I'd like to make the lexing/parsing routines part of the classes. I don't have time for this now (I'm basically hacking on a copy of the existing code), but this is inevitable. The names of tokens and rules are not sufficiently apt, and I wouldn't say the code is really maintainable. But I'm reserving this for the future due to time constraints. The next (long and boring) comments are the code we had, the tree we got, and the code we obtained from the tree. There still might be issues, but mostly the output meets my expectations.
The original pseudocode (simplev.mdwn):

tree[0:63] <- (RB)
mode <- tree[62:63]
tree[62:63] <- 0
ra_used <- 0b0
in_bits[0:63] <- (RC|0)
if in_bits = 0 then
    in_bits[0:63] <- 1
orig_in_bits <- in_bits
tree_index <- 1
found <- 0b0
hit_end <- 0b0
do bit_length = 1 to 6
    in_bit <- in_bits[63]
    if in_bits = 1 then
        if ra_used | (_RA = 0) then
            hit_end <- 0b1
            leave
        ra_used <- 0b1
        in_bit <- (RA)[63]
        in_bits <- 0b1 || (RA)[0:62]
    else
        in_bits <- 0b0 || in_bits[0:62]
    tree_index <- tree_index * 2
    if in_bit = 1 then
        tree_index <- tree_index + 1
    if tree_index < 64 then
        if tree[63 - tree_index] then
            found <- 0b1
            leave
compressed_index <- 0
do i = 0 to 127
    possible <- 1
    j <- i
    do while j >= 4
        j <- j / 2
        if tree[63 - j] then
            possible <- 0
            leave
    if i = tree_index then
        leave
    else
        if i >= 64 then
            compressed_index <- compressed_index + possible
        else
            if tree[63 - i] = 1 then
                compressed_index <- compressed_index + possible
switch(mode)
    case(0):
        RT[0:63] <- tree_index
        if ¬found then
            in_bits <- orig_in_bits
            ra_used <- 0b0
    case(1):
        RT[0:63] <- tree_index
        if hit_end then
            in_bits <- orig_in_bits
            ra_used <- 0b0
    case(2):
        RT[0:63] <- compressed_index
        if ¬found then
            in_bits <- orig_in_bits
            ra_used <- 0b0
            RT[0:63] <- tree_index
    default:
        RT[0:63] <- compressed_index
        if hit_end then
            in_bits <- orig_in_bits
            ra_used <- 0b0
RS <- in_bits
CR0 <- ra_used || (tree_index >= 64) || found || hit_end
The tree (I'm too lazy to indent it): Scope([Assign(lvalue=RangeSubscript(start=DecLiteral(0), end=DecLiteral(63), target=Symbol(tree)), rvalue=GPR(RB)), Assign(lvalue=Symbol(mode), rvalue=RangeSubscript(start=DecLiteral(62), end=DecLiteral(63), target=Symbol(tree))), Assign(lvalue=RangeSubscript(start=DecLiteral(62), end=DecLiteral(63), target=Symbol(tree)), rvalue=DecLiteral(0)), Assign(lvalue=Symbol(ra_used), rvalue=BinLiteral(0b0)), Assign(lvalue=RangeSubscript(start=DecLiteral(0), end=DecLiteral(63), target=Symbol(in_bits)), rvalue=GPR(RC0)), IfExpr(test=BinaryExpr(left=Symbol(in_bits), op=Eq(=), right=DecLiteral(0)), body=Scope([Assign(lvalue=RangeSubscript(start=DecLiteral(0), end=DecLiteral(63), target=Symbol(in_bits)), rvalue=DecLiteral(1))]), orelse=Scope([])), Assign(lvalue=Symbol(orig_in_bits), rvalue=Symbol(in_bits)), Assign(lvalue=Symbol(tree_index), rvalue=DecLiteral(1)), Assign(lvalue=Symbol(found), rvalue=BinLiteral(0b0)), Assign(lvalue=Symbol(hit_end), rvalue=BinLiteral(0b0)), ForExpr(target=Symbol(bit_length), start=DecLiteral(1), end=DecLiteral(6), body=Scope([Assign(lvalue=Symbol(in_bit), rvalue=Subscript(index=DecLiteral(63), target=Symbol(in_bits))), IfExpr(test=BinaryExpr(left=Symbol(in_bits), op=Eq(=), right=DecLiteral(1)), body=Scope([IfExpr(test=BinaryExpr(left=Symbol(ra_used), op=BitOr(|), right=BinaryExpr(left=Symbol(_RA), op=Eq(=), right=DecLiteral(0))), body=Scope([Assign(lvalue=Symbol(hit_end), rvalue=BinLiteral(0b1)), LeaveKeyword(leave)]), orelse=Scope([])), Assign(lvalue=Symbol(ra_used), rvalue=BinLiteral(0b1)), Assign(lvalue=Symbol(in_bit), rvalue=Subscript(index=DecLiteral(63), target=GPR(RA))), Assign(lvalue=Symbol(in_bits), rvalue=BinaryExpr(left=BinLiteral(0b1), op=BitConcat(||), right=RangeSubscript(start=DecLiteral(0), end=DecLiteral(62), target=GPR(RA))))]), orelse=Scope([Assign(lvalue=Symbol(in_bits), rvalue=BinaryExpr(left=BinLiteral(0b0), op=BitConcat(||), right=RangeSubscript(start=DecLiteral(0), end=DecLiteral(62), 
target=Symbol(in_bits))))])), Assign(lvalue=Symbol(tree_index), rvalue=BinaryExpr(left=Symbol(tree_index), op=Mul(*), right=DecLiteral(2))), IfExpr(test=BinaryExpr(left=Symbol(in_bit), op=Eq(=), right=DecLiteral(1)), body=Scope([Assign(lvalue=Symbol(tree_index), rvalue=BinaryExpr(left=Symbol(tree_index), op=Add(+), right=DecLiteral(1)))]), orelse=Scope([])), IfExpr(test=BinaryExpr(left=Symbol(tree_index), op=Lt(<), right=DecLiteral(64)), body=Scope([IfExpr(test=Subscript(index=BinaryExpr(left=DecLiteral(63), op=Sub(-), right=Symbol(tree_index)), target=Symbol(tree)), body=Scope([Assign(lvalue=Symbol(found), rvalue=BinLiteral(0b1)), LeaveKeyword(leave)]), orelse=Scope([]))]), orelse=Scope([]))])), Assign(lvalue=Symbol(compressed_index), rvalue=DecLiteral(0)), ForExpr(target=Symbol(i), start=DecLiteral(0), end=DecLiteral(127), body=Scope([Assign(lvalue=Symbol(possible), rvalue=DecLiteral(1)), Assign(lvalue=Symbol(j), rvalue=Symbol(i)), WhileExpr(test=BinaryExpr(left=Symbol(j), op=Ge(>=), right=DecLiteral(4)), body=Scope([Assign(lvalue=Symbol(j), rvalue=BinaryExpr(left=Symbol(j), op=Div(/), right=DecLiteral(2))), IfExpr(test=Subscript(index=BinaryExpr(left=DecLiteral(63), op=Sub(-), right=Symbol(j)), target=Symbol(tree)), body=Scope([Assign(lvalue=Symbol(possible), rvalue=DecLiteral(0)), LeaveKeyword(leave)]), orelse=Scope([]))]), orelse=Scope([])), IfExpr(test=BinaryExpr(left=Symbol(i), op=Eq(=), right=Symbol(tree_index)), body=Scope([LeaveKeyword(leave)]), orelse=Scope([IfExpr(test=BinaryExpr(left=Symbol(i), op=Ge(>=), right=DecLiteral(64)), body=Scope([Assign(lvalue=Symbol(compressed_index), rvalue=BinaryExpr(left=Symbol(compressed_index), op=Add(+), right=Symbol(possible)))]), orelse=Scope([IfExpr(test=BinaryExpr(left=Subscript(index=BinaryExpr(left=DecLiteral(63), op=Sub(-), right=Symbol(i)), target=Symbol(tree)), op=Eq(=), right=DecLiteral(1)), body=Scope([Assign(lvalue=Symbol(compressed_index), rvalue=BinaryExpr(left=Symbol(compressed_index), op=Add(+), 
right=Symbol(possible)))]), orelse=Scope([]))]))]))])), SwitchExpr(subject=Symbol(mode), cases=Cases([Case(labels=Labels([Label(0)]), body=Scope([Assign(lvalue=RangeSubscript(start=DecLiteral(0), end=DecLiteral(63), target=GPR(RT)), rvalue=Symbol(tree_index)), IfExpr(test=UnaryExpr(op=Not(¬), value=Symbol(found)), body=Scope([Assign(lvalue=Symbol(in_bits), rvalue=Symbol(orig_in_bits)), Assign(lvalue=Symbol(ra_used), rvalue=BinLiteral(0b0))]), orelse=Scope([]))])), Case(labels=Labels([Label(1)]), body=Scope([Assign(lvalue=RangeSubscript(start=DecLiteral(0), end=DecLiteral(63), target=GPR(RT)), rvalue=Symbol(tree_index)), IfExpr(test=Symbol(hit_end), body=Scope([Assign(lvalue=Symbol(in_bits), rvalue=Symbol(orig_in_bits)), Assign(lvalue=Symbol(ra_used), rvalue=BinLiteral(0b0))]), orelse=Scope([]))])), Case(labels=Labels([Label(2)]), body=Scope([Assign(lvalue=RangeSubscript(start=DecLiteral(0), end=DecLiteral(63), target=GPR(RT)), rvalue=Symbol(compressed_index)), IfExpr(test=UnaryExpr(op=Not(¬), value=Symbol(found)), body=Scope([Assign(lvalue=Symbol(in_bits), rvalue=Symbol(orig_in_bits)), Assign(lvalue=Symbol(ra_used), rvalue=BinLiteral(0b0)), Assign(lvalue=RangeSubscript(start=DecLiteral(0), end=DecLiteral(63), target=GPR(RT)), rvalue=Symbol(tree_index))]), orelse=Scope([]))])), Case(labels=Labels([DefaultLabel(default)]), body=Scope([Assign(lvalue=RangeSubscript(start=DecLiteral(0), end=DecLiteral(63), target=GPR(RT)), rvalue=Symbol(compressed_index)), IfExpr(test=Symbol(hit_end), body=Scope([Assign(lvalue=Symbol(in_bits), rvalue=Symbol(orig_in_bits)), Assign(lvalue=Symbol(ra_used), rvalue=BinLiteral(0b0))]), orelse=Scope([]))]))])), Assign(lvalue=GPR(RS), rvalue=Symbol(in_bits)), Assign(lvalue=Symbol(CR0), rvalue=BinaryExpr(left=Symbol(ra_used), op=BitConcat(||), right=BinaryExpr(left=BinaryExpr(left=Symbol(tree_index), op=Ge(>=), right=DecLiteral(64)), op=BitConcat(||), right=BinaryExpr(left=Symbol(found), op=BitConcat(||), right=Symbol(hit_end)))))])
The code we got from the tree via visiting the nodes:

    tree[0:63] <- (RB)
    mode <- tree[62:63]
    tree[62:63] <- 0
    ra_used <- 0b0
    in_bits[0:63] <- (RC|0)
    if (in_bits = 0) then
        in_bits[0:63] <- 1
    orig_in_bits <- in_bits
    tree_index <- 1
    found <- 0b0
    hit_end <- 0b0
    for bit_length = 1 to 6
        in_bit <- in_bits[63]
        if (in_bits = 1) then
            if (ra_used | (_RA = 0)) then
                hit_end <- 0b1
                leave
            ra_used <- 0b1
            in_bit <- (RA)[63]
            in_bits <- (0b1 || (RA)[0:62])
        else
            in_bits <- (0b0 || in_bits[0:62])
        tree_index <- (tree_index × 2)
        if (in_bit = 1) then
            tree_index <- (tree_index + 1)
        if (tree_index < 64) then
            if tree[(63 - tree_index)] then
                found <- 0b1
                leave
    compressed_index <- 0
    for i = 0 to 127
        possible <- 1
        j <- i
        do while (j >= 4)
            j <- (j / 2)
            if tree[(63 - j)] then
                possible <- 0
                leave
        if (i = tree_index) then
            leave
        else
            if (i >= 64) then
                compressed_index <- (compressed_index + possible)
            else
                if (tree[(63 - i)] = 1) then
                    compressed_index <- (compressed_index + possible)
    switch(mode)
        case (0):
            (RT)[0:63] <- tree_index
            if ¬(found) then
                in_bits <- orig_in_bits
                ra_used <- 0b0
        case (1):
            (RT)[0:63] <- tree_index
            if hit_end then
                in_bits <- orig_in_bits
                ra_used <- 0b0
        case (2):
            (RT)[0:63] <- compressed_index
            if ¬(found) then
                in_bits <- orig_in_bits
                ra_used <- 0b0
                (RT)[0:63] <- tree_index
        default:
            (RT)[0:63] <- compressed_index
            if hit_end then
                in_bits <- orig_in_bits
                ra_used <- 0b0
    RS <- in_bits
    CR0 <- (ra_used || ((tree_index >= 64) || (found || hit_end)))
I had to cheat a bit to force output of RA instead of (RA) for constructs like assignments and subscripts (the default is (RA), but subscripts and assignments are different). I dropped everything related to "read_regs", "uninit_regs" and other stateful bits from the copy of the parser; that's the job of the concrete visitor, not the parser. The pseudocode visitor hooks look like this:

    @Hook(pc_ast.IfExpr)
    def IfExpr(self, node):
        yield node
        stmt = " ".join([
            "if",
            str(self[node.test]),
            "then",
        ])
        self[node].emit(stmt=stmt)
        for (level, stmt) in self[node.body]:
            self[node].emit(stmt=stmt, level=level)
        if node.orelse:
            self[node].emit("else")
            for (level, stmt) in self[node.orelse]:
                self[node].emit(stmt=stmt, level=level)

    @Hook(pc_ast.WhileExpr)
    def WhileExpr(self, node):
        yield node
        stmt = " ".join([
            "do",
            "while",
            str(self[node.test]),
        ])
        self[node].emit(stmt=stmt)
        for (level, stmt) in self[node.body]:
            self[node].emit(stmt=stmt, level=level)
        if node.orelse:
            self[node].emit("else")
            for (level, stmt) in self[node.orelse]:
                self[node].emit(stmt=stmt, level=level)

Obviously there are a lot of other hooks, not shown here for brevity.
This is how ForExpr and WhileExpr look:

    class ForExpr(Dataclass):
        target: Node
        start: Node
        end: Node
        body: Scope

    class WhileExpr(Dataclass):
        test: Node
        body: Scope
        orelse: Scope
Whenever somebody attempts to instantiate a Dataclass, all its fields are __validated__ with isinstance. The same applies to typed sequences:

    class Label(Literal):
        pass

    class DefaultLabel(Label):
        def __new__(cls):
            return super().__new__(cls, "default")

    class Labels(Sequence, typeid=Label):
        pass

    class Case(Dataclass):
        labels: Labels
        body: Scope

    class Cases(Sequence, typeid=Case):
        pass

If some weirdo ever tries to do Case(labels=(1, 2), body="ABC"), or passes illegal nodes not matching the description, an exception is raised. This helps to diagnose what went wrong when attempting to establish the result of some rule. Example:

    def p_cases(self, p):
        """
        cases : switch_list switch_default
              | switch_default
        """
        if len(p) == 3:
            p[0] = pc_ast.Cases(p[1] + (p[2],))
        else:
            p[0] = pc_ast.Cases([p[1]])

This cuts off the cases where something in switch_list is NOT an instance of Case. The Cases class takes care of it, checking everything it holds, thus ensuring that the previous rules expanded correctly.
OK, it seems we're able to parse all the code we had so far EXCEPT functions. I used the following quick-and-dirty script:

    def dedent(line):
        if line.startswith("    "):
            return line[4:].rstrip()
        return line.rstrip()

    def ignore(line):
        line = line.strip()
        if line.startswith("<!--"):
            return True
        return False

    def parse(parser, origin):
        origin = tuple(itertools.filterfalse(ignore, origin))
        tree = parser.parse(code="\n".join(origin))
        stream = io.StringIO()
        for (level, line) in pc_util.pseudocode(tree):
            print(f"{' ' * 4 * level}{line}", file=stream)
        stream.seek(0)
        target = tuple(itertools.filterfalse(ignore, stream))
        return (origin, target)

    lexer = pc_lexer.IndentLexer(debug=False)
    parser = pc_parser.Parser(lexer=lexer)
    pattern = re.compile(r"Pseudo-code:(.*?)(?:Special Registers Altered|Description):",
        re.DOTALL)
    for path in glob.glob("openpower/isa/*.mdwn"):
        with open(path, "r", encoding="UTF-8") as stream:
            data = stream.read()
        for code in pattern.findall(data):
            (stage0, stage1) = parse(parser, map(dedent, code.split("\n")))
            (stage2, stage3) = parse(parser, map(dedent, stage1))
            stage1 = tuple(map(dedent, itertools.filterfalse(ignore, stage1)))
            stage3 = tuple(map(dedent, itertools.filterfalse(ignore, stage3)))
            assert stage1 == stage2 and stage2 == stage3

As I said, I don't have time to deal with all the creepy stuff around. The definition of "creepy stuff" includes "fallthrough" keywords, pseudo tokens like "colons", the custom "indent/dedent" filter (which should've been simpler by an order of magnitude), inept rule names, etc. This has already taken way more effort and time than it deserved, and that's only to make the real AST work. :-)
As discussed in the previous comments, I'll skip the functions (def, args, varargs, etc.). This mostly means that, after a cleanup and a commit into some branch, we can proceed with checking whether the code is correct. As I'm extremely lazy with manual checks, I have the following (crazy?) idea on how to test that the parsing is correct.

1. The new code can produce a valid AST tree after parsing.
2. The new code can convert an AST tree back into the pseudocode.
3. The old code can convert a pseudocode into Python code.
4. If we do pycode(pseudocode(parse(mdwn))) and things still work, it means our AST is correct.

So the next stage would be to do exactly this. I'll clean the code a bit, commit it into a branch, and modify the OLD parser so that it fetches an AST with the NEW parser, then converts this AST into pseudocode. The new pseudocode will be used in place of the old one, substituted at runtime. Then I'll check for discrepancies in the generation and see what went wrong. I think checking the generated Python code would be OK, then CI.
(In reply to Dmitry Selyutin from comment #48)
> I had to cheat a bit to force output of RA instead of (RA) for constructs
> like assignments and subscripts (the default is (RA), but subscripts and
> assignments are different).

ok, RA refers to the bits in the opcode field, but annoyingly, when assigned (RA <- xyz), it is actually GPR[RA] that is getting assigned to. then also there is (RA|0), which is shorthand for "if RA=0 then 0 otherwise GPR[RA]". sigh :)
(In reply to Luke Kenneth Casson Leighton from comment #53)
> ok RA refers to the bits in the opcode field

...unless it's surrounded by parentheses. Damn, this syntax is so annoying!
(In reply to Dmitry Selyutin from comment #54)
> ...unless it's surrounded by parentheses. Damn, this syntax is so annoying!

it was never intended to be machine-readable. i was literally the first person to ever consider it, in 30+ years of Power ISA history.
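The three forms under discussion could be captured as small helpers. This is a hedged sketch (the struct, helper names, and context layout are invented for illustration); only the RA field position, bits 11:15 in MSB0 numbering, comes from the Power ISA:

```c
#include <stdint.h>

struct ctx { uint64_t gpr[32]; };   /* hypothetical simulator context */

/* bare RA: the 5-bit register field taken from the instruction word */
static inline uint64_t field_RA(uint32_t insn) {
    return (insn >> 16) & 0x1f;     /* MSB0 bits 11:15 of a 32-bit word */
}

/* (RA): the contents of the GPR selected by that field */
static inline uint64_t read_RA(struct ctx *c, uint32_t insn) {
    return c->gpr[field_RA(insn)];
}

/* (RA|0): zero when the field is zero, otherwise GPR[RA] */
static inline uint64_t read_RA_or_0(struct ctx *c, uint32_t insn) {
    uint64_t ra = field_RA(insn);
    return ra ? c->gpr[ra] : 0;
}
```

The distinction matters exactly for instructions like addi, where RA=0 means "literal zero" rather than "GPR 0".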
(In reply to Dmitry Selyutin from comment #52)
> As I'm extremely lazy with the manual checks, I have the following (crazy?)
> idea on how to test that the parsing is correct.
> 1. The new code can produce a valid AST tree after parsing.
> 2. The new code can convert an AST tree back into the pseudocode.
> 3. The old code can convert a pseudocode into a Python code.
> 4. If we do pycode(pseudocode(parse(mdwn))) and things still work, it means
> our AST is correct.

Apart from bug #1247, the only difference we have so far after conversion is this:

     @inject()
     def op_setbc(self, CR, RT):
    -    RT = copy_assign_rhs(1 if eq(CR[BI + 32], 1) else 0)
    +    if eq(CR[BI + 32], 1):
    +        RT = copy_assign_rhs(1)
    +    else:
    +        RT = copy_assign_rhs(0)
         return (CR, RT,)

There are several cases of this behavior (3 in total). It could be "fixed" by introducing a custom tree => pseudocode conversion, but I don't consider this to be a real issue. Moreover, I think the updated version is cleaner and maps better to language-agnostic code generation. Any objections?
(In reply to Dmitry Selyutin from comment #56)
> Apart from bug #1247, the only difference we have so far after conversion is
> this:
>
>      @inject()
>      def op_setbc(self, CR, RT):
>     -    RT = copy_assign_rhs(1 if eq(CR[BI + 32], 1) else 0)
>     +    if eq(CR[BI + 32], 1):
>     +        RT = copy_assign_rhs(1)
>     +    else:
>     +        RT = copy_assign_rhs(0)
>          return (CR, RT,)

looks fine to me, though this makes me suspect that you might be mishandling the ? : operator... where is the code you've been committing? I don't see anything recently updated on git.libre-soc.org; maybe it's in a repo that luke forgot to set public?
(In reply to Jacob Lifshay from comment #57)
> updated on git.libre-soc.org, maybe it's in a repo that luke forgot to set
> public?

there are no private repositories except for gitolite admin and there never will be.
(In reply to Luke Kenneth Casson Leighton from comment #58)
> (In reply to Jacob Lifshay from comment #57)
> > updated on git.libre-soc.org, maybe it's in a repo that luke forgot to set
> > public?
>
> there are no private repositories except for gitolite admin and
> there never will be.

ok, i was mentioning that because iirc you've forgotten before. so, in that case, it looks like ghostmansd didn't git push, or is changing committed dates, or gitweb is misbehaving...
I haven't pushed the code yet, I'm operating on a local copy. I'll push once I ensure things work.
(In reply to Dmitry Selyutin from comment #60)
> I haven't pushed the code yet, I'm operating on a local copy. I'll push once
> I ensure things work.

our standard operating procedure is generally to push whatever WIP code we have when we are done for the day, or whenever we make commits. if it doesn't work, push to a separate branch, or at least ensure all other tests still work (though I prefer that people avoid pushing lots of new broken tests to branches others are using, since it clogs up CI, which is configured to only report the first 10 broken tests).
(In reply to Jacob Lifshay from comment #61)
> (In reply to Dmitry Selyutin from comment #60)
> > I haven't pushed the code yet, I'm operating on a local copy. I'll push
> > once I ensure things work.
>
> our standard operating procedure is generally to push whatever WIP code

pushing WIP code makes it much easier for others to spot bugs and/or understand how it works; it can also avoid the case where someone spends a lot of effort on a flawed design, and the flaw only becomes known once they complete the design and push their code.
(In reply to Jacob Lifshay from comment #61)
> (In reply to Dmitry Selyutin from comment #60)
> > I haven't pushed the code yet, I'm operating on a local copy. I'll push
> > once I ensure things work.
>
> our standard operating procedure is generally to push whatever WIP code we
> have when we are done for the day

I'm fine with this; do so if it doesn't break anything around.
(In reply to Dmitry Selyutin from comment #63)
> I'm fine with this, do so if it doesn't break anything around.

Thanks! this also makes it much easier for someone to take over if something happens and the original person can't work on it (e.g. their house burns down or their computer is stolen)
Pushed two commits (plus a change in walker):

https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;h=3b3b2189ed804ed311cdff3ba60c642bae663721
https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;h=6f6c363534b482b69e66a89ba0c53353aca5acff
https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;h=b1d7d93e54dd5698628eafd8012e243593122671

The last commit hacks pywriter so that it does the following: 1) converts the code to a tree; 2) converts the tree back into pseudocode. This is a temporary measure: I want to ensure that the code is converted successfully. I already compared artefacts from src/openpower/decoder/isa/generated and found no changes except for ForExpr and bug #1247. Once I see that this works not only for a local run but also for CI, I can drop that last commit (though the conversion doesn't cost much time, so we might keep it).
As for commits and pushes: I will commit and push as _I_ find it convenient. Including any number of interactive rebases, fixups, even force pushes to branches, even branch removals. Feel free to impose any further restrictions on your own modus operandi, though.
CI doesn't appear to be active (the last test ran a week ago). Plus, I don't see the oppc branch there. Jacob, could you check it, please?
(In reply to Dmitry Selyutin from comment #67)
> CI doesn't appear to be active (the last test was a week ago). Plus I don't
> see oppc branch there. Jacob, could you check it, please?

it should be active, however luke required i only mirror the git repos once every 24hr... so I'm going to have to manually git push again (luke, this is a problem that needs fixing)
(In reply to Jacob Lifshay from comment #68)
> luke required i only mirror the git repos once
> every 24hr...

Luke, any rationale on that?
(In reply to Dmitry Selyutin from comment #69)
> (In reply to Jacob Lifshay from comment #68)
> > luke required i only mirror the git repos once
> > every 24hr...
>
> Luke, any rationale on that?

for what luke said originally, see:
https://bugs.libre-soc.org/show_bug.cgi?id=1162

imo it is reasonable to run the git mirror every 15min; I calculated it would cost substantially less than 1 GBP per year, based on the data transferred for one no-op mirror of all the repos.
(In reply to Jacob Lifshay from comment #70)
> imo it is reasonable to run git mirror every 15min, I calculated it would
> cost substantially less than 1 GBP per year based on the data transferred
> for one no-op mirror of all the repos.

If so, can we subtract 1 GBP from any of the tasks we have around and reserve it for this purpose?
Many updates, too many to enumerate; briefly:
1) simplified the ternary condition operator pseudocode;
2) simplified special words parsing (GPR, FPR, CR, reserved, etc.);
3) minor hierarchy refactoring for the future C code generation.
With the local code I have now, we're already able to convert this pseudocode...

    src <- [0]*64
    src[64-XLEN:63] <- (RS)
    result <- [0]*64
    do i = 0 to 1
        n <- i * 32
        result[n+0:n+7] <- 0
        result[n+8:n+19] <- DPD_TO_BCD(src[n+12:n+21])
        result[n+20:n+31] <- DPD_TO_BCD(src[n+22:n+31])
    RA <- result[64-XLEN:63]

...into declarations:

    void
    oppc_cdtbcd(void) {
        uint64_t src;
        uint64_t result;
        uint64_t i;
        uint64_t n;
    }

I guess all such functions will take some context argument. Notice that "special" words and registers are not in the declarations (because they shouldn't be).
(In reply to Dmitry Selyutin from comment #66)
> As for commits and pushes. I will commit and push as _I_ find it convenient.

Dmitry, the project's requirements, as per the full Transparency and Audit Conditions, are to push immediately. please ensure that you use the server as if it were your local machine. the absolute maximum amount of time that work should be local-only is about 18 hours, and that is seriously on the edge; normally it should be well under 2 minutes between commit and push. Jacob explained it well. However, he did not mention that we have *actually* had people waste significant time by not publishing work, nor that NLnet got Audited by the EU a few months ago by an extremely technically-knowledgeable Auditor. we absolutely have to be watertight "above board" here. bottom line: no private local-only code. sorry.
(In reply to Dmitry Selyutin from comment #72)
> Many updates, too many to enumerate, briefly:
> 1) simplified the ternary condition operator pseudocode;
> 2) simplified special words parsing (GPR, FPR, CR, reserved, etc.);
> 3) minor hierarchy refactoring for the future C code generation.

fantastic. any joy on a python-cffi importer (if nothing else, to do unit tests) or is that just too much for the $?

(In reply to Dmitry Selyutin from comment #73)
> With a local code I have now,

(please do push it immediately to fulfil the Project Coding Standards and to avoid Audit and Transparency "red flags"; thank you for understanding and patience here, Dmitry)

> we're already able to convert this
> pseudocode...
>
>     src <- [0]*64
>     src[64-XLEN:63] <- (RS)
>     result <- [0]*64
>     do i = 0 to 1
>         n <- i * 32
>         result[n+0:n+7] <- 0
>         result[n+8:n+19] <- DPD_TO_BCD(src[n+12:n+21])
>         result[n+20:n+31] <- DPD_TO_BCD(src[n+22:n+31])
>     RA <- result[64-XLEN:63]

fantastic. calling functions, dealing with loops, local variables: nice.

> ...into declarations:
>
>     void
>     oppc_cdtbcd(void) {
>         uint64_t src;
>         uint64_t result;
>         uint64_t i;
>         uint64_t n;
>     }

one thing: all return results need a boolean "ok" flag which is set on write. so

    RA <- result[64-XLEN:63]

outputs:

    RA->value = result...; RA->ok = true;

or:

    RA = result...; ok_flags->RA = true;

or:

    #define RA_result_index 0
    #define CR0_result_index 1
    ok_flags |= 1 << RA_result_index;

(a separate ok bitfield or something). this actually reflects the hardware design, so is not surprising, but it turns out to be absolutely essential in the FP routines for flags, and in a few other obscure areas. basically, *some* pseudocode *optionally* writes to registers, based on *runtime* arbitrary decisions, not static deterministic ones based on known information going into the routine.
I'm extremely limited in time, and I'm spending it wisely. I commit when I consider it the right time to do so, w/o half-baked code or strange constraints like "commit every 18 hours". I do this in my free time, and it's up to me how often I commit and in how many commits. Luke, w/o further ado, I'm passing this to David and James.
And, obviously, to Andrey. Andrey, please solve it. It's 1.5 months till the deadline. This situation must be resolved fast; time is precious.
At this point, I'm stopping any further comments here until the situation is resolved, to avoid any escalation or unconstructive outcome. I seriously don't have time for spokes in the wheels. The mere fact that we started this discussion shows a severe underestimation of the task, of the work already done, and of the amount of work still to be done. Luke, I'm sorry, but I have no other choice.
while reading through the code, i noticed you wrote:

    import openpower.oppc.pc_util as pc_util

if you like, you can write that as:

    from openpower.oppc import pc_util
According to the recent discussion we had on the mailing list, I consider this case closed and can move on. :-) Thanks, chaps, for handling this!
(In reply to Dmitry Selyutin from comment #80)
> According to the recent discussion we had at mailing list, I consider this
> case closed and can move on. :-) Thanks chaps for handling this!

for future reference, that discussion is at:
https://lists.libre-soc.org/pipermail/libre-soc-dev/2024-January/005912.html
Many recent commits on the oppc branch (too many to list separately) bring support for more expressions and statements. With these, the following pseudocode...

    src <- [0]*64
    src[64-XLEN:63] <- (RS)
    result <- [0]*64
    do i = 0 to 1
        n <- i * 32
        result[n+0:n+7] <- 0
        result[n+8:n+19] <- DPD_TO_BCD(src[n+12:n+21])
        result[n+20:n+31] <- DPD_TO_BCD(src[n+22:n+31])
    RA <- result[64-XLEN:63]
    RA <- (C ? A : B)
    if a < b then
        RT <- (RA)
    else
        RT <- (RB)

...gets converted to this C code:

    void
    oppc_cdtbcd(void) {
        uint64_t src;
        uint64_t result;
        uint64_t i;
        uint64_t n;
        uint64_t C;
        uint64_t A;
        uint64_t B;
        uint64_t a;
        uint64_t b;

        src = oppc_repeat(UINT64_C(0x0), UINT64_C(0x40));
        oppc_range_subscript_assign(&src,
            (UINT64_C(0x40) - ctx->XLEN), UINT64_C(0x3f),
            ctx->gpr[OPPC_GPR_RS]);
        result = oppc_repeat(UINT64_C(0x0), UINT64_C(0x40));
        for (i = UINT64_C(0x0); i != (UINT64_C(0x1) + 1); ++i) {
            n = (i * UINT64_C(0x20));
            oppc_range_subscript_assign(&result,
                (n + UINT64_C(0x0)), (n + UINT64_C(0x7)),
                UINT64_C(0x0));
            oppc_range_subscript_assign(&result,
                (n + UINT64_C(0x8)), (n + UINT64_C(0x13)),
                DPD_TO_BCD(oppc_range_subscript(src,
                    (n + UINT64_C(0xc)), (n + UINT64_C(0x15)))));
            oppc_range_subscript_assign(&result,
                (n + UINT64_C(0x14)), (n + UINT64_C(0x1f)),
                DPD_TO_BCD(oppc_range_subscript(src,
                    (n + UINT64_C(0x16)), (n + UINT64_C(0x1f)))));
        }
        ctx->gpr[OPPC_GPR_RA] = oppc_range_subscript(result,
            (UINT64_C(0x40) - ctx->XLEN), UINT64_C(0x3f));
        ctx->gpr[OPPC_GPR_RA] = C ? A : B;
        if (a < b) {
            ctx->gpr[OPPC_GPR_RT] = ctx->gpr[OPPC_GPR_RA];
        } else {
            ctx->gpr[OPPC_GPR_RT] = ctx->gpr[OPPC_GPR_RB];
        }
    }

I haven't yet touched the function signature to make it return bool, or expanded what ctx is; I'd like to deal with the nodes first. But I think this should be sufficient to get the overall idea.
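The generated code relies on helpers such as oppc_range_subscript whose bodies are not shown; as a sketch of what an MSB0 range read over a plain 64-bit value might look like (an assumption, not the actual library code):

```c
#include <stdint.h>

/* Read bits [start:end] (inclusive) of v using MSB0 numbering, where
 * bit 0 is the most significant bit, as in the Power ISA pseudocode. */
static uint64_t oppc_range_subscript(uint64_t v, uint64_t start, uint64_t end) {
    uint64_t width = end - start + 1;   /* the range is inclusive */
    uint64_t shift = 63 - end;          /* distance of 'end' from the LSB */
    uint64_t mask = (width >= 64) ? ~UINT64_C(0)
                                  : ((UINT64_C(1) << width) - 1);
    return (v >> shift) & mask;
}
```

For example, bits [56:63] are the least-significant byte, and bit [0:0] is the sign bit of a 64-bit value.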
(In reply to Dmitry Selyutin from comment #82)

code looks nice!

> uint64_t result;

eventually we'll need to support things with >64 bits, but this should be mostly good for now.

> for (i = UINT64_C(0x0); i != (UINT64_C(0x1) + 1); ++i) {

I would write that as i <= UINT64_C(0x1), because that way you don't need the + 1, and because there may be for loops that need to loop zero times, e.g. they start at 5 and end at 3, tho I don't remember any particular examples.
(In reply to Jacob Lifshay from comment #83)
> code looks nice!

Thanks! It's funny, though, that the process is mostly the same as for the pseudocode; it's just that the representation is different. Now I know for sure that the idea with the custom tree format was correct, and that I did the right thing when I decided to check with the pseudocode first.

> eventually we'll need to support things with >64 bits, but this should be
> mostly good for now.

Hm. Perhaps all the operations should be calls then. Plus, after you wrote this, I recalled that we should probably also keep the number of bits an integer has... So probably all of these should deal with some kind of structure.

> I would write that i <= UINT64_C(0x1) because that way you don't need the +
> 1 and because there may be for loops that need to loop zero times e.g. they
> start at 5 and end at 3, tho I don't remember any particular examples.

Nice catch, I hadn't thought of that. I tend to use "not equal" automatically, but here, where we don't fully control the code, we should be extra careful.
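The point about loop bounds can be shown concretely: with an inclusive <= test, "for i = start to end" runs zero times when start > end, whereas i != end + 1 would spin until wraparound. A minimal illustration:

```c
#include <stdint.h>

/* Count how many times "for i = start to end" iterates when lowered
 * with an inclusive <= bound, mirroring the pseudocode's semantics. */
static uint64_t count_iterations(uint64_t start, uint64_t end) {
    uint64_t n = 0;
    for (uint64_t i = start; i <= end; ++i)
        ++n;
    return n;
}
```

One caveat worth remembering: if end is the maximum uint64_t value, i <= end is always true, so a fully general lowering would need an extra guard for that corner case.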
(In reply to Dmitry Selyutin from comment #84)
> Hm. Perhaps all the operations should be calls then.

this is absolutely essential, as a c++ template/wrapper allows for operator-overloading to provide SV semantics. this was the technique i used almost 5 years ago to do a full SV implementation in spike, in something mad like 6-8 weeks flat.
A huge update. Many operations are now supported. From now on, every integer should be represented by a structure which will look like this:

    struct oppc_int {
        uint64_t bits;
        uint64_t value[];
    };

The functions would then deal with pointers to such structures:

    struct oppc_int *
    oppc_add(struct oppc_int *rv,
             struct oppc_int const *lhs,
             struct oppc_int const *rhs);

    struct oppc_int *
    oppc_assign(struct oppc_int *dst, struct oppc_int const *src);

    bool
    oppc_lt(struct oppc_int const *lhs, struct oppc_int const *rhs);

This is needed mostly for stuff like concatenation to work on binary and hexadecimal literals, plus to make call chains simpler. Decimals and registers have bits equal to XLEN (which should be part of the context). Plus, from now on, everything is a call. I obviously had to cheat a bit and use compound literals. The code is too big to post here; you can check what's generated from __main__ via `python3 -m openpower.oppc`. I'll post just minor excerpts below for the record.

To clarify about function calls and the integer type: I have no intent to provide completely __working__ C code in the scope of this task. That would require implementing an analogue of SelectableInt, inventing a context with all the registers and states, etc. It will never fit the time constraints, and certainly doesn't fit the budget. The idea at this stage is:

1. Emit code which is valid C.
2. Check whether it's buildable, with some stubs instead of functions and context.

This is it. This is just a task "take pseudocode and emit C code", not "make a full-blown decoder and simulator". Any objections?
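One C detail worth flagging with this layout: a flexible array member must be the last field, sizeof excludes it, and standard C does not allow initializing it in a compound literal, so heap allocation with the element count folded in is the usual pattern. A hypothetical constructor (the name and allocation strategy are assumptions, not the project's decided design):

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct oppc_int {
    uint64_t bits;       /* width of the value in bits */
    uint64_t value[];    /* flexible array member: must be last */
};

/* Allocate an oppc_int wide enough to hold `bits` bits, zeroed. */
static struct oppc_int *oppc_int_new(uint64_t bits) {
    uint64_t words = (bits + 63) / 64;   /* round up to 64-bit words */
    struct oppc_int *v = malloc(sizeof *v + words * sizeof(uint64_t));
    if (v != NULL) {
        v->bits = bits;
        memset(v->value, 0, words * sizeof(uint64_t));
    }
    return v;
}
```

The transient compound literals shown in the excerpts below would only work with a fixed-size array field; with a true flexible array member they would have to go through a helper like this instead.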
This pseudocode...

    a[0:XLEN/2] <- [0]*(XLEN/2)
    do i = 0 to 1
        a[i] <- i

...gets converted to this (again, ignore the lack of ctx and the "leave" section which updates its state):

    void
    oppc_cocojamboo(void) {
        struct oppc_int a;
        struct oppc_int i;

        /* a[0:(XLEN / 2)] <- ([0] * (XLEN / 2)) */
        oppc_range_subscript_assign(
            &a,
            /* 0 */
            &(struct oppc_int){
                .count = ctx->XLEN,
                .array = {0},
            },
            /* (XLEN / 2) */
            oppc_div(
                /* transient */
                &(struct oppc_int){
                    .count = ctx->XLEN,
                    .array = {},
                },
                /* XLEN */
                &(struct oppc_int){
                    .count = ctx->XLEN,
                    .array = {ctx->XLEN},
                },
                /* 2 */
                &(struct oppc_int){
                    .count = ctx->XLEN,
                    .array = {2},
                }
            ),
            /* ([0] * (XLEN / 2)) */
            oppc_repeat(
                /* transient */
                &(struct oppc_int){
                    .count = ctx->XLEN,
                    .array = {},
                },
                /* 0 */
                &(struct oppc_int){
                    .count = ctx->XLEN,
                    .array = {0},
                },
                /* (XLEN / 2) */
                oppc_div(
                    /* transient */
                    &(struct oppc_int){
                        .count = ctx->XLEN,
                        .array = {},
                    },
                    /* XLEN */
                    &(struct oppc_int){
                        .count = ctx->XLEN,
                        .array = {ctx->XLEN},
                    },
                    /* 2 */
                    &(struct oppc_int){
                        .count = ctx->XLEN,
                        .array = {2},
                    }
                )
            )
        );
        /* for i = 0 to 1 */
        for (
            /* i <- 0 */
            oppc_assign(
                &i,
                /* 0 */
                &(struct oppc_int){
                    .count = ctx->XLEN,
                    .array = {0},
                }
            );
            /* (i <= 1) */
            oppc_le(
                &i,
                /* 1 */
                &(struct oppc_int){
                    .count = ctx->XLEN,
                    .array = {1},
                }
            );
            /* i <- (i + 1) */
            oppc_assign(
                &i,
                /* (i + 1) */
                oppc_add(
                    /* transient */
                    &(struct oppc_int){
                        .count = ctx->XLEN,
                        .array = {},
                    },
                    &i,
                    /* 1 */
                    &(struct oppc_int){
                        .count = ctx->XLEN,
                        .array = {1},
                    }
                )
            )
            /* for i = 0 to 1 */
        ) {
            /* a[i] <- i */
            oppc_subscript_assign(
                &a,
                &i,
                &i
            );
        }
    }
A minor remark on for: I basically had to create different nodes for this to work, since the ForExpr obtained from the parser looks like this:

    class ForExpr(Dataclass):
        subject: Node
        start: Node
        end: Node
        body: Scope

We, however, create two AssignExprs and one BinaryExpr instead. An alternative would be to output a C-like ForExpr from the parser, but that would be creepy: I'd like to keep the produced AST as close to the origins as possible.
Also, yes, I know that this `oppc_int` is a pile of crap. :-) If it's an MSB0 bit vector, this should've been initialized differently. As I mentioned, there's no room for bringing a full-blown selectable int equivalent into C, especially if it's bigger than 64 bits.
(In reply to Dmitry Selyutin from comment #86)
> A huge update. Many operations are now supported. From now on, every integer
> should be represented by a structure which will look like this:
>
> struct oppc_int {
>     uint64_t bits;
>     uint64_t value[];
> };

and an "ok" flag. this is absolutely essential.

struct oppc_int {
    uint64_t bits;
    bool ok; /* set when value is modified; must come before value[],
                since a flexible array member has to be the last field */
    uint64_t value[];
};

> To clarify about function calls and integer type. I have no intent to
> provide a completely __working__ C code in scope of this task. This would
> require to implement an analogue of SelectableInt, invent context with all
> the registers and states, etc. etc. It will never fit the time constraints,
> and certainly already doesn't fit the budget.

yyep, not a surprise at all. it took me 6-8 weeks to do SV for spike.

> The idea at this stage is:
> 1. Emit the code which is valid C code.
> 2. Check whether it's buildable with some stubs instead of functions and
> context.
>
> This is it. This is just a task "take pseudocode and emit C code", not "make
> a fully-blown decoder and simulator". Any objections?

none here as long as the data structure includes that "ok" flag, is initially clear, and is set by any oppc_* operation.

basically if *ALL* the functions oppc_add oppc_assign oppc_le set the ok flag on their return result then the desired behaviour is achieved with NO NEED for the compiler to even know that the ok flag exists.

do NOT attempt to get the compiler to emit code that sets the ok flag. let the oppc_{arithmetic*} functions handle it.
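A minimal sketch of the convention described above, assuming a simplified fixed-width layout (the real struct uses a flexible array member; field and function names here are illustrative only): the "ok" flag is set inside the oppc_* helper itself, so generated code never has to know the flag exists.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* simplified stand-in for the real struct: single 64-bit payload */
struct oppc_int {
    uint64_t bits;   /* width in bits */
    uint64_t value;  /* payload (<= 64 bits in this sketch) */
    bool ok;         /* set whenever the value is (re)written */
};

/* every oppc_* operation marks its destination as modified */
static struct oppc_int *oppc_add(struct oppc_int *rv,
                                 struct oppc_int const *lhs,
                                 struct oppc_int const *rhs)
{
    rv->bits = lhs->bits;
    rv->value = lhs->value + rhs->value;
    if (rv->bits < 64)  /* truncate to the declared width */
        rv->value &= (UINT64_C(1) << rv->bits) - 1;
    rv->ok = true;  /* set here, never by compiler-emitted code */
    return rv;      /* returned pointer enables call chaining */
}
```

Because the flag is maintained entirely inside the helpers, the code generator only ever emits nested calls; the caller inspects `result->ok` afterwards to decide whether a writeback happened.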
(In reply to Dmitry Selyutin from comment #87)
> /* a[0:(XLEN / 2)] <- ([0] * (XLEN / 2)) */
> oppc_range_subscript_assign(
>     &a,
>     /* 0 */
>     &(struct oppc_int){
>         .count = ctx->XLEN,
>         .array = {0},
>     },

blech! :) it's actually understandable / readable.

implementation:

oppc_range_subscript_assign(oppc_int *a, ...)
{
    a->ok = true;
}

and then the caller can check the result->ok flag and if true write to the regfile.

do you have time to write one function (oppc_add) or more as a demo / unit test?
(In reply to Dmitry Selyutin from comment #87)
> oppc_add(
>     /* transient */
>     &(struct oppc_int){
>         .count = ctx->XLEN,
>         .array = {},
>     },

this needs one element in array -- because array is not fixed-length, you can't just use {}, since that'll give a zero-length array.
(In reply to Luke Kenneth Casson Leighton from comment #90) > do NOT attempt to get the compiler to emit code that sets the ok flag. > > let the oppc_{arithmetic*} functions handle it. all assignments need to set ok too, in the simulator copy_assign_rhs does that by calling SelectableInt's constructor: https://git.libre-soc.org/?p=openpower-isa.git;a=blob;f=src/openpower/decoder/helpers.py;h=918a409eaf8e57762d7c67f692e6bd16d75dccba;hb=97f603213814159f6da349735f7a2a07d72c42cf#l78 so, for now, you likely only need to ensure anything that writes to an oppc_int is a function call, where ok = true can be put. later, not as part of this bug, we will need to support more than just oppc_int, e.g. the bfp_* code uses a rational number as the mantissa (SelectableMSB0Fraction, which likely needs more than 64 bits in both numerator and denominator), and a BFPState struct.
(In reply to Jacob Lifshay from comment #93)
> all assignments need to set ok too,

yes.

> in the simulator copy_assign_rhs does
> that by calling SelectableInt's constructor:
> https://git.libre-soc.org/?p=openpower-isa.git;a=blob;f=src/openpower/
> decoder/helpers.py;h=918a409eaf8e57762d7c67f692e6bd16d75dccba;
> hb=97f603213814159f6da349735f7a2a07d72c42cf#l78

thanks for finding that.

> so, for now, you likely only need to ensure anything that writes to an
> oppc_int is a function call, where ok = true can be put.

basically.

> later, not as part of this bug, we will need to support more than just
> oppc_int, e.g. the bfp_* code uses a rational number as the mantissa
> (SelectableMSB0Fraction, which likely needs more than 64 bits in both
> numerator and denominator), and a BFPState struct.

typecast and operator overloads provide globals and other state.

c++ wrapper #includes *the c code*, (#include "autogenerated_compiler_output.c" - this is a trick from spike) and the end result is total redirection even of ctx.

dmitry note from the above that global variables in the c code (such as XLEN) are *perfectly fine* because when "wrapped" in a "using namespace" c++ context, the exact same autogenerated c code gets redirected.

trying that with ctx->XLEN is much harder to do in a c++ template/wrapper whereas just XLEN (in the autogenerated output) can be "#defined to ctx->THE_REAL_XLEN" for plain c, and made a member of the #including wrapper/template class in c++
(In reply to Luke Kenneth Casson Leighton from comment #94)
> (In reply to Jacob Lifshay from comment #93)
>
> > all assignments need to set ok too,
>
> yes.
>
> > in the simulator copy_assign_rhs does
> > that by calling SelectableInt's constructor:
> > https://git.libre-soc.org/?p=openpower-isa.git;a=blob;f=src/openpower/
> > decoder/helpers.py;h=918a409eaf8e57762d7c67f692e6bd16d75dccba;
> > hb=97f603213814159f6da349735f7a2a07d72c42cf#l78
>
> thanks for finding that.
>
> > so, for now, you likely only need to ensure anything that writes to an
> > oppc_int is a function call, where ok = true can be put.
>
> basically.

Folks, could you please elaborate why it's done on a generic integer class? I've been thinking rather of this approach:

struct oppc_reg {
    struct oppc_int value;
    uint64_t flags;
};

...where assignments to registers (and similar stuff) have the flags updated. What's the point of having ok or flags in each and every integer? We have full control over what entities we work on, and can detect whether pseudocode writes into GPR or FPR or whatever.

> > later, not as part of this bug, we will need to support more than just
> > oppc_int, e.g. the bfp_* code uses a rational number as the mantissa
> > (SelectableMSB0Fraction, which likely needs more than 64 bits in both
> > numerator and denominator), and a BFPState struct.
>
> typecast and operator overloads provide globals and other state.
>
> c++ wrapper #includes *the c code*,
> (#include "autogenerated_compiler_output.c" - this is a trick from spike)
> and the end result is total redirection even of ctx.
>
> dmitry note from the above that global variables in the c code
> (such as XLEN) are *perfectly fine* because when "wrapped" in
> a "using namespace" c++ context on the exact same autogenerated
> c code gets redirected.
>
> trying that with ctx->XLEN is much harder to do in a c++ template/wrapper
> whereas just XLEN (in the autogenerated output) can be
> "#defined to ctx->THE_REAL_XLEN" for plain c, and made a
> member of the #including wrapper/template class in c++

Even omitting issues related to global context in general, this is a detail of a particular implementation; the low-level architecture should be kept out of such details. I don't want the implementation to be tied to a particular language; at some point we might opt to reuse this generated code not only from C++. We might have wrappers which fetch/store parts of the state, though, in order to keep the context opaque.
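The "opaque context" idea mentioned above could be sketched roughly as follows. This is a hedged illustration only: every name here (opsim_ctx, opsim_gpr_fetch, opsim_gpr_store) is hypothetical, and in a real build the struct definition would live on the embedder's side, leaving the generated code with only an incomplete type and the accessor prototypes.

```c
#include <stdint.h>

/* In the real layout the generated code would see only
 *     struct opsim_ctx;
 * as an incomplete type; the struct is defined here purely so
 * this sketch is self-contained and runnable. */
struct opsim_ctx {
    uint64_t gpr[32];
    uint64_t xlen;
};

/* accessors implemented by the embedder (C, C++, or anything
 * else with a C ABI); generated code calls only these */
static uint64_t opsim_gpr_fetch(struct opsim_ctx *ctx, unsigned regno)
{
    return ctx->gpr[regno];
}

static void opsim_gpr_store(struct opsim_ctx *ctx, unsigned regno,
                            uint64_t value)
{
    ctx->gpr[regno] = value;
}
```

Because all state traffic goes through calls, the same generated code can later be bound to a different host (a C++ wrapper, a test harness, etc.) without regeneration.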
Luke, could you please drop this?

warning, file not .mdwn, skipping lrsc.txt

Please don't store non-mdwn files there, or use glob instead so that only mdwn files are captured.
(In reply to Dmitry Selyutin from comment #95)
> Folks, could you, please, elaborate, why it's done on a generic integer
> class? I've been thinking rather of this approach:
>
> struct oppc_reg {
>     struct oppc_int value;
>     uint64_t flags;
> };
>
> ...where assignments to registers (and similar stuff) have the flags
> updated.

Like this:

if (a = b) then
    RA <- 0b0
else
    X <- 0b0

void oppc_cocojamboo(void)
{
    struct oppc_int a;
    struct oppc_int b;
    struct oppc_int X;

    if (
        /* (a = b) */
        oppc_eq(&a, &b)
    ) {
        /* RA <- 0b0 */
        oppc_assign(
            /* (RA) */
            &ctx->gpr[OPPC_GPR_RA]->value,
            /* 0b0 */
            &(struct oppc_int){
                .count = 1,
                .array = {0x0},
            }
        );
        ctx->gpr[OPPC_GPR_RA]->flags |= OPPC_GPR_UPDATED;
    } else {
        /* X <- 0b0 */
        oppc_assign(
            &X,
            /* 0b0 */
            &(struct oppc_int){
                .count = 1,
                .array = {0x0},
            }
        );
    }
}
(In reply to Dmitry Selyutin from comment #95)
> What's the point of having ok or flags in each and every integer?

mostly because that's how we did it in python...yeah I can see your point why we shouldn't do it that way.

how about:

#include <assert.h>
#include <stdbool.h> // so we can use bool regardless of C or C++
#include <stdint.h>

typedef enum oppc_type_tag {
    oppc_tt_sel_int,
    // add more stuff later
} oppc_type_tag;

typedef struct oppc_sel_int {
    uint32_t bits;
    union {
        uint64_t small; // for bits <= 64
        // add another field later for > 64 bits
    };
} oppc_sel_int;

typedef struct oppc_value {
    oppc_type_tag type : 4;
    bool ok : 1;
    int : 0; // terminate bitfield
    union {
        oppc_sel_int si;
        // add more stuff later
    };
} oppc_value;

static inline oppc_value oppc_si_small(uint64_t value, uint32_t bits)
{
    assert(bits <= 64);
    if (bits < 64)
        // note ULL is guaranteed to be >= 64-bits so this works.
        value &= (1ULL << bits) - 1ULL;
    return (oppc_value){
        .type = oppc_tt_sel_int,
        .ok = true,
        .si = {.bits = bits, .small = value},
    };
}

static inline oppc_value oppc_add(oppc_value a, oppc_value b)
{
    assert(a.type == oppc_tt_sel_int);
    assert(b.type == oppc_tt_sel_int);
    assert(a.si.bits == b.si.bits);
    assert(a.si.bits <= 64);
    return oppc_si_small(a.si.small + b.si.small, a.si.bits);
}
(In reply to Jacob Lifshay from comment #98) > typedef enum oppc_type_tag { > oppc_tt_sel_int, > // add more stuff later all of the "add more stuff later" notes are for things out of scope of this bug (unless necessary)
(In reply to Dmitry Selyutin from comment #95) > ...where assignments to registers (and similar stuff) have the flags > updated. What's the point of having ok i explained already, jacob can you explain it, it matches the hardware. but briefly so that time is not wasted. > Even omitting issues related to global context in general, this is a detail > of a particular implementation, the low-level architecture should be kept > out of these details. I don't want the implementation to be tied to a > particular language; that's unrealistic (i haven't time to explain, so sorry) and very much out of scope > at some point we might opt to reuse this generated code > not only from C++. well beyond scope and not appropriate for discussion right now due to time pressure. > We might have wrappers which fetch/store parts of the > state though in order to keep the context opaque. indeed. that's exactly how i managed to turn a scalar RISC-V simulator, spike, into a FULL Simple-V Vectorised simulator, in only 6-8 weeks. but the wrappers were done by #including a template around each scalar instruction, and all of the (equivalent of) oppc_* were class/template members of that class. no ctx pointer. no XLEN member variable in ctx. all done by c++ overloads. please, really, don't add extra programming languages to this code, even as a planned future consideration.
(In reply to Luke Kenneth Casson Leighton from comment #100)
> (In reply to Dmitry Selyutin from comment #95)
>
> > ...where assignments to registers (and similar stuff) have the flags
> > updated. What's the point of having ok
>
> i explained already, jacob can you explain it, it matches the hardware.
> but briefly so that time is not wasted.

ok is inspired by the pipeline/data classes for soc.git: simplified:

class Foo:
    data: Signal # n-bit value, or some other data type like Record
    ok: Signal # 1-bit value, true when `data` is valid, false when no data is supplied

since we didn't want to have to rewrite large portions of the simulator, we decided to add an `ok` field directly to SelectableInt, which tells the simulator which outputs were actually written (when ok=True) and which should retain their old values (when ok=False, indicating that output wasn't written).

knowing when outputs aren't written is necessary for a lot of FP operations that intentionally don't write to FRT if they'll trap, so the trap handler can see the original input values if FRT is the same register as an input.

since the C code generator doesn't have to worry about problems caused by rewriting a bunch of the output code, we can just use ok without having to mash it into SelectableInt, like in comment #98
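The retain-on-ok-false behaviour described above could be sketched like this on the consumer side (names and the simplified value layout are hypothetical): the writeback step commits only the outputs whose ok flag was set, so a trapping FP operation that never wrote FRT leaves the old register value visible to the trap handler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* simplified stand-in: single 64-bit payload plus the ok flag */
struct oppc_value {
    uint64_t value;
    bool ok;
};

/* commit an instruction output to the register file only if the
 * instruction body actually wrote it */
static void writeback(uint64_t *regfile, unsigned regno,
                      struct oppc_value const *out)
{
    if (out->ok)
        regfile[regno] = out->value;
    /* ok == false: retain the old value, e.g. FRT on a trapping FP op */
}
```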
(In reply to Jacob Lifshay from comment #101) > since the C code generator doesn't have to worry about problems caused by > rewriting a bunch of the output code, we can just use ok without having to > mash it into SelectableInt, like in comment #98 if that works well and is dead easy to do i don't have a problem with it. time is of the essence
Multiple updates all over again. The most important:

1. Generalized transient "allocation" (i.e. construction on-the-fly). Now the caller doesn't have to think about the layout of oppc_int and can simply rely on the function calls doing the right job.
2. Generalized C calls: both the calls from pseudocode and our own calls now use the same code.
3. Fixed ternary expressions.
4. Dropped the ctx argument. Now it's expected that the users define OPPC_XLEN/OPPC_GPR/OPPC_FPR to behave as they want.
5. Eliminated redundant pseudocode.
src[64-XLEN:63] <- (RS)
RA <- (C ? A : B)

{
    struct oppc_int src;
    struct oppc_int C;
    struct oppc_int A;
    struct oppc_int B;

    /* src[(64 - XLEN):63] <- (RS) */
    oppc_range_subscript_assign(
        /* src */
        &src,
        /* (64 - XLEN) */
        oppc_sub(
            oppc_transient(&(struct oppc_int){}, UINT64_C(0), (uint8_t)OPPC_XLEN),
            /* 64 */
            oppc_transient(&(struct oppc_int){}, UINT64_C(64), (uint8_t)OPPC_XLEN),
            /* XLEN */
            oppc_transient(&(struct oppc_int){}, OPPC_XLEN, (uint8_t)OPPC_XLEN)
        ),
        /* 63 */
        oppc_transient(&(struct oppc_int){}, UINT64_C(63), (uint8_t)OPPC_XLEN),
        /* (RS) */
        &OPPC_GPR[OPPC_GPR_RS]
    );
    /* RA <- C ? A : B */
    oppc_assign(
        /* (RA) */
        &OPPC_GPR[OPPC_GPR_RA],
        (
            /* C */
            &C ?
            /* A */
            &A :
            /* B */
            &B
        )
    );
}
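The oppc_transient(&(struct oppc_int){...}, value, bits) pattern above could plausibly be implemented along these lines. This is a hedged sketch: the real struct layout is different (the actual code uses a flexible array and, later, oppc_value), so the single-word payload here is purely an assumption for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* simplified stand-in for the real transient value */
struct oppc_int {
    uint64_t bits;
    uint64_t value;
};

/* The caller passes a compound literal as scratch storage; the helper
 * fills it in and hands the same pointer straight back, so the call can
 * be nested as an argument of a larger oppc_* call. */
static struct oppc_int *oppc_transient(struct oppc_int *self,
                                       uint64_t value, uint8_t bits)
{
    self->bits = bits;
    self->value = value;
    if (bits < 64)  /* truncate to the declared width */
        self->value &= (UINT64_C(1) << bits) - 1;
    return self;
}
```

Returning the pointer is what makes chains like oppc_sub(oppc_transient(...), oppc_transient(...), ...) possible without the generated code ever naming a temporary.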
As we now rely on function calls to update anything, and it might no longer be just a plain integer, I've taken the liberty of renaming oppc_int to oppc_value. https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;h=33babebc7156a8b547782d3fa39f5c11fdb237b2
There's an error for if expressions: we don't want to check an address, and should emit a call instead.
OK, with the recent commits, despite some glitches, we're able to produce _some_ C code for any pseudocode. By "any pseudocode" I mean the following:

for insn in db:
    tree = parser.parse(code="\n".join(insn.pcode))
    for (level, line) in pc_code.code(insn=insn, root=tree):
        print(f"{' ' * 4 * level}{line}")
...and here's the fields initialization (with the rest of the code cut):

void oppc_tdi(struct oppc_value const *insn)
{
    struct oppc_value TO;

    oppc_assign(
        &TO,
        oppc_transient(&(struct oppc_value){}, UINT64_C(0), 5)
    );
    oppc_subscript_assign(
        &TO,
        oppc_transient(&(struct oppc_value){}, 0, (uint8_t)OPPC_XLEN),
        oppc_subscript(
            insn,
            oppc_transient(&(struct oppc_value){}, 6, (uint8_t)OPPC_XLEN)
        )
    );
    oppc_subscript_assign(
        &TO,
        oppc_transient(&(struct oppc_value){}, 1, (uint8_t)OPPC_XLEN),
        oppc_subscript(
            insn,
            oppc_transient(&(struct oppc_value){}, 7, (uint8_t)OPPC_XLEN)
        )
    );
    oppc_subscript_assign(
        &TO,
        oppc_transient(&(struct oppc_value){}, 2, (uint8_t)OPPC_XLEN),
        oppc_subscript(
            insn,
            oppc_transient(&(struct oppc_value){}, 8, (uint8_t)OPPC_XLEN)
        )
    );
    oppc_subscript_assign(
        &TO,
        oppc_transient(&(struct oppc_value){}, 3, (uint8_t)OPPC_XLEN),
        oppc_subscript(
            insn,
            oppc_transient(&(struct oppc_value){}, 9, (uint8_t)OPPC_XLEN)
        )
    );
    oppc_subscript_assign(
        &TO,
        oppc_transient(&(struct oppc_value){}, 4, (uint8_t)OPPC_XLEN),
        oppc_subscript(
            insn,
            oppc_transient(&(struct oppc_value){}, 10, (uint8_t)OPPC_XLEN)
        )
    );
}
...now with (GPR|0) support!

x <- (RA|0)

{
    struct oppc_value x;
    struct oppc_value RA;

    oppc_assign(
        &RA,
        oppc_transient(&(struct oppc_value){}, UINT64_C(0), 5)
    );
    oppc_subscript_assign(
        &RA,
        oppc_transient(&(struct oppc_value){}, 0, (uint8_t)OPPC_XLEN),
        oppc_subscript(
            insn,
            oppc_transient(&(struct oppc_value){}, 11, (uint8_t)OPPC_XLEN)
        )
    );
    oppc_subscript_assign(
        &RA,
        oppc_transient(&(struct oppc_value){}, 1, (uint8_t)OPPC_XLEN),
        oppc_subscript(
            insn,
            oppc_transient(&(struct oppc_value){}, 12, (uint8_t)OPPC_XLEN)
        )
    );
    oppc_subscript_assign(
        &RA,
        oppc_transient(&(struct oppc_value){}, 2, (uint8_t)OPPC_XLEN),
        oppc_subscript(
            insn,
            oppc_transient(&(struct oppc_value){}, 13, (uint8_t)OPPC_XLEN)
        )
    );
    oppc_subscript_assign(
        &RA,
        oppc_transient(&(struct oppc_value){}, 3, (uint8_t)OPPC_XLEN),
        oppc_subscript(
            insn,
            oppc_transient(&(struct oppc_value){}, 14, (uint8_t)OPPC_XLEN)
        )
    );
    oppc_subscript_assign(
        &RA,
        oppc_transient(&(struct oppc_value){}, 4, (uint8_t)OPPC_XLEN),
        oppc_subscript(
            insn,
            oppc_transient(&(struct oppc_value){}, 15, (uint8_t)OPPC_XLEN)
        )
    );
    /* x <- (RA|0) */
    oppc_assign(
        /* x */
        &x,
        (
            oppc_bool(&RA) ?
            &OPPC_GPR[OPPC_GPR_RA] :
            oppc_transient(&(struct oppc_value){}, UINT64_C(0), (uint8_t)OPPC_XLEN)
        )
    );
}
(In reply to Dmitry Selyutin from comment #109) > ...now with (GPR|0) support! oooOooo :) https://www.youtube.com/watch?v=bVURsxuwy_E
(In reply to Jacob Lifshay from comment #98)

for adding strings, since there are only string constants, we can add them like so:

typedef enum oppc_type_tag {
    oppc_tt_sel_int,
    oppc_tt_str,
    // add more stuff later
} oppc_type_tag;

typedef enum oppc_str {
    // for every string constant, add an entry here
    oppc_str_abc, // "abc"
    oppc_str_foo, // "foo"
    oppc_str_bar, // "bar"
} oppc_str;

// for debugging and other places you need the actual text
const char * const oppc_strings[] = {
    // for every string constant, add an entry here
    [(int)oppc_str_abc] = "abc",
    [(int)oppc_str_foo] = "foo",
    [(int)oppc_str_bar] = "bar",
};

typedef struct oppc_value {
    oppc_type_tag type : 4;
    bool ok : 1;
    int : 0; // terminate bitfield
    union {
        oppc_sel_int si;
        oppc_str str;
        // add more stuff later
    };
} oppc_value;

in python, you'd generate this by having a hard-coded list of all string constants (since you're not always parsing everything and you need to emit consistent type definitions or that's an ODR violation -- aka. UB):

all_strings = (
    "abc",
    "foo",
    "bar",
)

def make_oppc_str_enumerant(v):
    name = ["oppc_str_"]
    # escape string into unambiguous C identifier
    for ch in v:
        if ch.isascii() and ch.isalnum():
            name.append(ch)
        elif ch == "_":
            # 2 underlines to not confuse with escaped char
            name.append("__")
        else:
            name.append("_%x_" % (ord(ch),))
    return "".join(name)

def translate_str_lit(v):
    assert v in all_strings, "%r is not in all_strings" % (v,)
    return "(oppc_value){.type = oppc_tt_str, .ok = true, .str = %s}" % (
        make_oppc_str_enumerant(v),)

def c_str(v):
    v = v.replace("\\", "\\\\")
    v = v.replace("\n", "\\n")
    v = v.replace("\t", "\\t")
    v = v.replace("\"", "\\\"")
    return '"' + v + '"'

def create_oppc_str_definition():
    yield "typedef enum oppc_str {"
    for i in all_strings:
        yield "    %s, // %s" % (make_oppc_str_enumerant(i), c_str(i))
    yield "} oppc_str;"
    yield ""
    yield "// for debugging and other places you need the actual text"
    yield "const char * const oppc_strings[] = {"
    for i in all_strings:
        yield "    [(int)%s] = %s," % (make_oppc_str_enumerant(i), c_str(i))
    yield "};"
I wrote my position on the ONLY string we have so far in 1250. I won't introduce a tagged union only due to stubbornness. In fact, I don't see a need for a tagged union at all; both the type and the ok field can simply be encoded into the flags of oppc_value.
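That flags-based encoding could look something like the following hedged sketch. All of the constants, the bit layout, and the helper names are made up for illustration; the point is only that a tag nibble plus an "ok" bit fit comfortably in one flags word without a tagged union.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* illustrative flags layout: bit 0 = ok, bits 1..4 = type tag */
#define OPPC_FLAG_OK    UINT64_C(0x1)
#define OPPC_TYPE_SHIFT 1
#define OPPC_TYPE_MASK  UINT64_C(0xf)
#define OPPC_TYPE_INT   UINT64_C(0x0)
#define OPPC_TYPE_STR   UINT64_C(0x1)

struct oppc_value {
    uint64_t flags;  /* ok bit and type tag packed together */
    uint64_t value;  /* simplified payload */
};

static void oppc_value_set_ok(struct oppc_value *v)
{
    v->flags |= OPPC_FLAG_OK;
}

static bool oppc_value_ok(struct oppc_value const *v)
{
    return (v->flags & OPPC_FLAG_OK) != 0;
}

static uint64_t oppc_value_type(struct oppc_value const *v)
{
    return (v->flags >> OPPC_TYPE_SHIFT) & OPPC_TYPE_MASK;
}
```

Setting the ok bit never disturbs the type tag (and vice versa), which is the property that makes the single-word encoding workable.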
Until then, I moving on assuming this function has a string argument, indeed.
(In reply to Dmitry Selyutin from comment #112) > both the type and ok field can be just > encoded into flags of oppc_value. that's true, but not why i proposed a union, i proposed a union because we later need to add more types (rational number for bfp_* code, several different structs for XER, SVSTATE, FPSCR, and BFP numbers). I proposed ok and type being bitfields because that makes the code clearer while producing the same assembly.
Function calls will need more work: in our realities, things like EXTS should resolve to this:

struct oppc_value *OPPC_FUNCTION_EXTS(struct oppc_value *ret,
                                      struct oppc_value const *value);

Meanwhile EXTS has only 1 argument, and we cannot update the input.
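A body for that prototype might look like the sketch below. The calling convention (separate output pointer, const input, pointer returned for chaining) is as described above; the single-word oppc_value layout and the fixed extend-to-64 target are simplifying assumptions for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* simplified stand-in for the real value type */
struct oppc_value {
    uint64_t bits;
    uint64_t value;
};

/* EXTS takes one argument at the pseudocode level; the C version gets
 * an extra output pointer so the input is never modified */
static struct oppc_value *
OPPC_FUNCTION_EXTS(struct oppc_value *ret, struct oppc_value const *value)
{
    /* classic branch-free sign extension: XOR with the sign bit,
     * then subtract it */
    uint64_t sign = UINT64_C(1) << (value->bits - 1);
    ret->bits = 64;
    ret->value = (value->value ^ sign) - sign;
    return ret;  /* returned pointer allows chaining into the next call */
}
```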
88582 lines of C code from the C generator for all the code we had in insndb. Not bad.
(In reply to Dmitry Selyutin from comment #112) > I wrote my position on the ONLY string we have so far in 1250. I won't > introduce a tagged union indeed. jacob has been told maaany many times to stop introducing types and unnecessary complexity (apologies talking about you in 3rd person there jacob). > In fact, I don't see a > need for a tagged union at all, both the type and ok field can be just > encoded into flags of oppc_value. yyyep. or not even flags, just an integer number in oppc_value. the constants can also be #defined and made global/static, which makes good readability if they are used in the compiler output (rather than putting the explicit value itself)
(In reply to Jacob Lifshay from comment #114) > that's true, but not why i proposed a union, i proposed a union because we > later need to add more types (rational number for bfp_* code, several > different structs for XER, SVSTATE, FPSCR, and BFP numbers). no. and we are not repeat not for the umpteenth time increasing the scope. please for goodness sake stop doing this, i have been literally asking you for years jacob.
28 commits since the last update. This was a tough journey!! The code can now be compiled successfully. This is valid C code. The most difficult commit was this:

https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;h=90c93ec3cf18b96a9e70464f2b549c9bace1bbe4

Believe it or not, this was done manually. A long and extremely boring series of "run Python and produce C code, compile it with C, see which function is broken due to lack of prototype, visit the Python code and see its arguments, add the corresponding C prototype, but don't forget to adopt the calling convention: our result must generate a transient and return a pointer to it". I cannot stress how annoying and boring it was, probably one of the most annoying tasks I ever did, but I decided that writing yet another parser would consume even more time.

This can be checked via the following incantation:

gcc -Os -Isrc/liboppc -Wall -Werror -Wextra test.c -c -o test.o

As you see, I took extra measures to ensure we pass some stricter checks. I'm planning to add src/liboppc/oppc.c autogeneration so that it's compiled as oppc.o, and be done with the task. What remains is the actual implementation of the 238 calls (this includes the implementation of SelectableInt-like semantics, filling all these defines, all functions we have in helpers and isafn, switching for FPSCR attributes and so on). Yes, this will be a damn lot of work; even all the prototypes and definitions took 1K+ lines of manually written code. Some functions can be generated as well, and I think this is the path we should follow, eventually discarding as many manually written helpers as we can.

So, I'm planning to add src/liboppc/oppc.c autogeneration as part of the build and be done with this task. I might probably agree to some smallish additions which hopefully won't be as annoying as that header above. I already feel this is well underestimated, considering the efforts, though. I'd like to listen to your thoughts and ideas.
Ah yes, I forgot the first part of the incantation. Without it, magic ain't gonna happen.

python3 -m openpower.oppc > test.c
gcc -Os -Isrc/liboppc -Wall -Werror -Wextra test.c -c -o test.o
Ah yes, -Os is not needed; I just checked how many bytes the object file has (2.2 MB with -Os, 4.6 IIRC with the default, 2.4 IIRC with -O3). Just curiosity. And yes, almost 148K lines for the code itself (obviously including the comments from the pseudocode, but I consider these essential; otherwise the code is barely readable).
(In reply to Dmitry Selyutin from comment #119)
> 28 commits since the last update. This was a tough journey!! The code can
> now be compiled successfully.

hooraay! that's brilliant

> This is a valid C code. The most difficult
> commit was this:
>
> https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;
> h=90c93ec3cf18b96a9e70464f2b549c9bace1bbe4
>
> Believe it or not, this was done manually.

daaamn... :)

> pointer to it". I cannot stress how annoying and boring it was,

sometimes i find it good to switch off if you know what i mean :)

> probably one
> of the most annoying tasks I ever did, but I decided that writing yet
> another parser would consume even more time.

ahh... yes :)

> This can be checked via the following incantation:
>
> gcc -Os -Isrc/liboppc -Wall -Werror -Wextra test.c -c -o test.o
>
> As you see I took extra measures to ensure we pass some stricter checks.
>
> I'm planning to add src/liboppc/oppc.c autogeneration

ah not to the repo, please, if that's the plan? (maybe misunderstanding) no matter if it takes a long time, the project's policy is never to add autogenerated output. the one piece of work that violated that rule took up 20% of available drive space and it is a permanent serious problem, the repo is 50 times bigger than all others.

> so that it's compiled
> as oppc.o, and be done with the task. The actual implementation of 238 calls
> (this includes the implementation of SelectableInt-like semantics, filling
> all these defines, all functions we have in helpers and isafn, switching for
> FPSCR attributes and so on). Yes, this will be a damn lot of work; even all
> the prototypes and definities took 1K+ lines of manually written code.

yep. welcome to 5 years development.

> Some
> functions can be generated as well, and I think this is the path we should
> follow, eventually discarding as many manually written helpers as we can.

yes.
hm if consulted beforehand i would have said "don't do it" but you needed the protos to test the output, ho hum it is what it is. doing introspection of the modules would, retrospectively, perhaps have been a better option? plus it would remove ongoing de-sync / maintenance headaches when python functions change (or new ones are added). they're done now, and do the job.

> So, I'm planning to add src/liboppc/oppc.c autogeneration as part of the
> build

build script and makefiles yes, but to repo no? am i understanding right?

> and be done with this task.

i am veeery happy for that to happen, you've done a hell of a lot.

> I might probably agree to some smallish
> additions which hopefully won't be as annoying as that header above. I
> already feel this is well underestimated, considering the efforts, though.

no, enough is enough, further work requires a bigger budget

> I'd like to listen to your thoughts and ideas.

if you can outline what it would take to do a python-cffi wrapper plus auto-introspection/generation of the c protos to ensure that the python helpers.py is the "canonical" source, plus a hefty budget with a bit extra for some unexpected things (which we know always happens sigh) and chuck it onto 1211 or 1212 as appropriate, that'd be fantastic.

correction: if you are *interested* to do that... :)
(In reply to Luke Kenneth Casson Leighton from comment #122)
> (In reply to Dmitry Selyutin from comment #119)
>
> build script and makefiles yes but to repo no? am i understanding right?

I've pushed a commit which adds a Makefile with simple commands to check it: https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;h=ab9c2cf0f935d6d97e319fbd4d177d7f8679a58f

I'll rebase libopid against the recent master in scope of bug #1254, put these changes atop, and then add this sweet pair to the build once I ensure it works in CI. But that's for 1254; my work here is done, it's ready and can be checked.

> i am veeery happy for that to happen, you've done a hell of a lot.

Thanks! It was a tough one. :-)

> if you can outline what it would take to do a python-cffi wrapper
> plus auto-introspection/generation of the c protos to ensure that
> the python helpers.py is the "canonical" source, plus a hefty budget
> with a bit extra for some unexpected things (which we know always
> haoppens sigh) and chuck it onto 1211 or 1212 as appropriate, that'd
> be fantastic.
>
> correction: if you are *interested* to do that... :)

Certainly, but I'll take some well-deserved rest first. :-)
Luke, based on your comment #122, I've submitted an RFP and marked this as RESOLVED FIXED, please let me know if I understood you incorrectly.
Oddly, I haven't received a single damn notification from the bank that the payment has arrived. Perhaps something changed again, sigh. But I confirm the payment's arrived.
(In reply to Dmitry Selyutin from comment #125) > payment has arrived. Perhaps something changed again, sigh. But I confirm > the payment's arrived. total included the other two you asked to be merged?
A bigger amount than solely for this task, so I'd assume yes. The bank I'm using now has a huge fee; they've considered keeping EUR unsafe since recently, plus the EU sender bank took their own fee. Welcome to the post-sanctions world. :-)
(In reply to Dmitry Selyutin from comment #127) > A bigger amount than solely for this task, so I'd assume yes. The bank I'm > using now has a huge fee, they consider keeping EUR is unsafe since recent, > plus EU sender bank took their own fee. Welcome to post-sanctions world. :-) worth asking if it can be transferred in a different currency?