Besides the actual decoding to understand which instruction we're going to simulate, we also need a way to specify the actual logic behind the instruction to be executed. We already have human-readable specification in Markdown format, and successfully used it for our reference Python-based simulator. We need to follow the same approach for cavatools, and generate the code for all PowerISA instructions.
An important remark: even though we're able to execute the instruction only after it's been decoded, the decoder-related task (#979) is not a show stopper here. These tasks can be completed in parallel.
IRC discussion https://libre-soc.org/irclog/%23libre-soc.2023-05-29.log.html#t2023-05-29T20:10:07
i recommend using memcpy instead of *(uint16_t)addr, since it avoids C language UB and generally gets compiled to a single load/store, not a function call: https://gcc.godbolt.org/z/sj9o1novn #include <stdint.h> #include <string.h> uint16_t load(uint8_t *mem, uint64_t EA) { uint16_t retval; memcpy(&retval, &mem[EA], 2); return retval; } load: # @load lhzx 3, 3, 4 blr .long 0 .quad 0
Dmitry, is this task doable? Again apologies, I don't remember if we agreed this task is actually possible within the time constraints. My chart on the discussion page proposed to cancel this task.
(In reply to Andrey Miroshnikov from comment #4) > Dmitry, is this task doable? yes. > Again apologies, I don't remember if we agreed this task is actually > possible within the time constraints. it is. > My chart on the discussion page > proposed to cancel this task. no.
Hi guys! Andrey, I'm sorry, I've missed your message. (In reply to Andrey Miroshnikov from comment #4) (In reply to Luke Kenneth Casson Leighton from comment #5) > (In reply to Andrey Miroshnikov from comment #4) > > Dmitry, is this task doable? > > yes. > > > Again apologies, I don't remember if we agreed this task is actually > > possible within the time constraints. > > it is. > > > My chart on the discussion page > > proposed to cancel this task. > > no. Luke, I agree with you it's both doable and can be kept, but we have to establish the criteria, otherwise we'll put it under risk. We'd better be damn careful! Here's what I suggest: 1. Stick to developing the key concepts for code generation. Likely only vanilla PPC at this stage. 2. Generating something to be reused as yet another library; this allows us to combine it with libopid. I'm thinking of generating a table of per-instruction callbacks along with their bodies and some context around, like this: /* * Intended to be called from the decoder. * Prefix is NOP for vanilla PPC. * Just a quick approximation, something more complex is needed. */ static void opsim_add(struct opsim_ctx *ctx, uint64_t prefix, uint64_t suffix) { /* can be optimized later, don't care for now */ uint64_t RT = opsim_reg_fetch(ctx, opsim_add_RT(suffix)); uint64_t RA = opsim_reg_fetch(ctx, opsim_add_RA(suffix)); uint64_t RB = opsim_reg_fetch(ctx, opsim_add_RB(suffix)); RT = (RA + RB); opsim_reg_store(ctx, opsim_add_RT(suffix), RT); } or even: static void opsim_add(struct opsim_ctx *ctx, uint64_t prefix, uint64_t suffix) { opsim_reg_store(ctx, opsim_add_RT(suffix), (opsim_reg_fetch(ctx, opsim_add_RA(suffix)) + opsim_reg_fetch(ctx, opsim_add_RB(suffix))) ); } 3. Nothing to be touched yet in the existing codegen, we'll just add C generation mode into the current generator, due to time constraints. Yes there's a room for improvement, some things might be done differently. But for now -- no changes unless lack of them stops us. We'll be under a huge risk otherwise.
(In reply to Dmitry Selyutin from comment #6) > Luke, I agree with you it's both doable and can be kept, but we have to > establish the criteria, otherwise we'll put it under risk. yes. i recommend a compatible drop-in replacement for PowerDecoder. (power_decoder.py) which is easier because the existing unit tests can be used (like, all of them: TestIssuer as well as ISACaller). this means a cffi interface to return PowerOp instances. do not attempt to rewrite PowerOp, it is too fundamental. a replacement class that does the exact same job, and is used *only* by the new c-based decoder: not a problem.
(In reply to Luke Kenneth Casson Leighton from comment #7) > yes. i recommend a compatible drop-in replacement for PowerDecoder. > (power_decoder.py) which is easier because the existing unit tests > can be used (like, all of them: TestIssuer as well as ISACaller). oh hang on sorry, pseudocode compiler not power decoder. (In reply to Dmitry Selyutin from comment #6) > 3. Nothing to be touched yet in the existing codegen, we'll just add C > generation mode into the current generator, due to time constraints. yes. cffi interface as well, to allow unit tests comparing against existing python-based simulator, ultimately i want it to *replace* the python-based compiler. skip functions such as EXTS and others (Power ISA Section 1.3) for now. > Yes > there's a room for improvement, some things might be done differently. But > for now -- no changes unless lack of them stops us. We'll be under a huge > risk otherwise. v. true
(In reply to Dmitry Selyutin from comment #6) > opsim_reg_store(ctx, opsim_add_RT(suffix), RT); opsim_reg_store must set an "ok" bit in a struct to indicate that the register has been modified. long story.
OK, first thing I did was replacing print() incantations with log(). I want to at least be able to observe MY debug instead of everything else. You should observe no differences unless you explicitly incantate SILENCELOG=true. https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;h=659f58e0144c0ddb59439501ed9c50ba993f92fe
(In reply to Dmitry Selyutin from comment #10) > OK, first thing I did was replacing print() incantations with log(). goood. i did some yesterday, already. > I want > to at least be able to observe MY debug instead of everything else. You > should observe no differences unless you explicitly incantate > SILENCELOG=true. fantastic. i have no problem with that at all. (as i am dealing all the time with *individual damn bits* i need to literlly see everything and i do mean everything. as long as that's not interfered with i'm good. i say good but what i mean is, "my mind is utterly twisted and melted by the insane level of detail *which i memorise and look out for") > https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff; > h=659f58e0144c0ddb59439501ed9c50ba993f92fe
I've spent a LOT of hours debugging and checking the opportunities. Here are some conclusions I came to so far. 1. I might appear to be a devil's advocate, but I actually think that python-ply is a good choice. Regardless of our implementation pros and contras, for me it already advocates python-ply as a sufficiently good instrument, and I'm going to stick with it. 2. What is bad from C code generation perspective is that we produce Python AST. It's not a suitable candidate for C code generation. What's worse, we generate multiple nodes for a single entity, and it gets too complicated even in Python, not even speaking of C. 3. It gets especially hairy when we produce Python calls and modify the code after it has been obtained with AST. That is, we get the AST result and patch it in place. 4. We cannot avoid modifying the nodes produced, but there are several issues. First, all our nodes are Python AST. Second, some nodes are built in place, e.g. concat and selectassign and many others (basically all which lead to ast.Call). Third, whenever I'll update this code, this would potentially bring a disaster. All in all, I don't believe that the current code is a good basis for C code generator. It works for Python, and I'd keep it so that the code WORKS. Here's what I suggest instead. 1. Keep the python-ply, but recreate the overall code so that it produces CUSTOM nodes instead. That is, where we have <u, there will be a `class LTUnsigned(Node)` construction, similarly for all other stuff. Basically the real language-agnostic AST. Language-agnostic is an absolutely critical item! 2. Once we have AST collected and working, we can have walkers and visitors for it. 3. This AST can be used for C code generation. Ideally, had we not time constraints, we'd build ANOTHER AST atop of that: PseudocodeAST => CAst. But for now likely we'll have to follow the simpler approach. 4. If we have time and opportunities, we'd just eventually follow the same approach for Python code. That is, just convert PseudocodeAST to ast.Module. I don't think it's doable in terms of the current approach mostly due to the fact that it produces a pretty complicated Python AST. I could try changing the current code, but I'm absolutely sure my changes will make all the code around completely nuts. I wish I had returned with a more encouraging news, but these are conclusions I've been able to reach so far. Basically my suggestion is to use python-ply again and output a completely different nodes. So the code eventually will look familiar but still build something completely different. I'd like to know your opinion and will meanwhile move on with checking other projects using ply to better understand how it can be used. Before we proceed, two important items: 1) python-ply is to stay, we don't have time for brand new frameworks, plus I'm mostly satisfied with it; 2) no there's no time (and there's a strong lack of desire) to teach the whole Python ast to produce Python code.
(In reply to Dmitry Selyutin from comment #12) > I'd like to know your opinion and will meanwhile move on with checking other > projects using ply to better understand how it can be used. Before we > proceed, two important items: 1) python-ply is to stay, we don't have time > for brand new frameworks, using python ply is fine with me, but please modify the parser rather than introduce kludges like pre-parsing steps to insert colons (unless you're just reusing existing parsing code unmodified). also, we need to change operator precedence to match the PowerISA specification: bug #1082 (it isn't a lot of work, just change the operator list order and run all the tests.)
(In reply to Jacob Lifshay from comment #13) > using python ply is fine with me, but please modify the parser rather than > introduce kludges like pre-parsing steps to insert colons (unless you're > just reusing existing parsing code unmodified). or copying.
(In reply to Dmitry Selyutin from comment #12) > 3. This AST can be used for C code generation. Ideally, had we not time > constraints, we'd build ANOTHER AST atop of that: PseudocodeAST => CAst. But > for now likely we'll have to follow the simpler approach. i'm good with that. if this was a "full" language (scopes exceptions globl/local variabes classes gotos) i would say NO hell NO. (btw please drop functions. they were not added and should not have beem added by jacob. please drop "def" from the AST and modify pyfnwriter and appendix pseudocode to not support functions. this means separating each "function" into an explicit separate pseudocode file) > 4. If we have time and opportunities, we'd just eventually follow the same > approach for Python code. That is, just convert PseudocodeAST to ast.Module. > > > I don't think it's doable in terms of the current approach mostly due to the > fact that it produces a pretty complicated Python AST. yes. if you have seen the notes i started from dabeaz's GardenSnake.py as *literally* the first commit, then fixed the lexer problem (with an additional lexer phase that inline-inserted the required TOKENs): https://git.libre-soc.org/?p=openpower-isa.git;a=blob;f=src/openpower/decoder/pseudo/lexer.py;h=c42f8330;hb=HEAD#l62 and then spent about FOUR MONTHS over the next 2 years performing incremental changes and fixes that got it to do the job. the only reason i could start from GardenSnake.py was because i spotted that fundamentally the underlying syntax (not the grammar) of python and Section 1.3 POWER Book I ISA are identical. *over time* the use of python-astor, which i used because some dickhead in the python community ripped out ast.py of python2.7 and replaced it with fucking c code *with no introspection or source* "bcuz faster", has become a serious impediment. i can barely edit parser.py without also creating a prerequisite code-fragment in power_pseuo.py and using astor.dump() https://git.libre-soc.org/?p=openpower-isa.git;a=blob;f=src/openpower/decoder/power_pseudo.py;hb=HEAD > I could try changing > the current code, but I'm absolutely sure my changes will make all the code > around completely nuts. yyep. it's reached the "mature complex but stable" phase that all software developed incrementally will get to. > I wish I had returned with a more encouraging news, but these are > conclusions I've been able to reach so far. Basically my suggestion is to > use python-ply again and output a completely different nodes. So the code > eventually will look familiar but still build something completely different. seriously and honestly, i was expecting that anyway. an astor-to-c converter (basically a replication of cython) was a "hope" rather than a requirement. > > I'd like to know your opinion and will meanwhile move on with checking other > projects using ply to better understand how it can be used. Before we > proceed, two important items: 1) python-ply is to stay, we don't have time > for brand new frameworks, no we do not. > plus I'm mostly satisfied with it; it's pretty damn good. it is sad that dabeaz abandoned it. > 2) no there's > no time (and there's a strong lack of desire) to teach the whole Python ast > to produce Python code. it *used* to do that. go right back to the original very first commit. (context: GardenSnake.py was - is - an interactive python interpreter written *in* python using python-ply. it worked perfectly for *very early* versions of python and i am going back 20+ years here but at some point the syntax changed and dabeaz never got round to fixing GardenSnake.py... that was my very first task)
(In reply to Luke Kenneth Casson Leighton from comment #15) > (btw please drop functions. they were not added and should not have beem > added by jacob. please drop "def" from the AST and modify pyfnwriter > and appendix pseudocode to not support functions. this means > separating each "function" into an explicit separate pseudocode file) PowerISA very much has functions (though they don't use the def keyword): Book I Appendix A Round Integer( sign, frac0:64, gbit, rbit, xbit, round_mode ): inc <- 0 if round_mode = 0b00 then do /* Round to Nearest */ if sign || frac64 || gbit || rbit || xbit = 0bU11UU then inc <- 1 if sign || frac64 || gbit || rbit || xbit = 0bU011U then inc <- 1 if sign || frac64 || gbit || rbit || xbit = 0bU01U1 then inc <- 1 end if round_mode = 0b10 then do /* Round toward +Infinity */ if sign || frac64 || gbit || rbit || xbit = 0b0U1UU then inc <- 1 if sign || frac64 || gbit || rbit || xbit = 0b0UU1U then inc <- 1 if sign || frac64 || gbit || rbit || xbit = 0b0UUU1 then inc <- 1 end if round_mode = 0b11 then do /* Round toward -Infinity */ if sign || frac64 || gbit || rbit || xbit = 0b1U1UU then inc <- 1 if sign || frac64 || gbit || rbit || xbit = 0b1UU1U then inc <- 1 if sign || frac64 || gbit || rbit || xbit = 0b1UUU1 then inc <- 1 end frac0:64 <- frac0:64 + inc FPSCRFR <- inc FPSCRFI <- gbit | rbit | xbit return
(In reply to Jacob Lifshay from comment #16) > (In reply to Luke Kenneth Casson Leighton from comment #15) > > (btw please drop functions. they were not added and should not have beem > > added by jacob. please drop "def" from the AST and modify pyfnwriter > > and appendix pseudocode to not support functions. this means > > separating each "function" into an explicit separate pseudocode file) > > PowerISA very much has functions (though they don't use the def keyword): see also bfp_ROUND_TO_BFP32 which in the spec PDF is basically a function: it has a function signature line, has calls to other functions, has return statements, etc.
(In reply to Jacob Lifshay from comment #16) separating each "function" into an explicit separate pseudocode file) > > PowerISA very much has functions (though they don't use the def keyword): > Book I Appendix A > > Round Integer( sign, frac0:64, gbit, rbit, xbit, round_mode ): i expect and anticipate the pseudocode *file* to be called round_integer.mdwn
(In reply to Luke Kenneth Casson Leighton from comment #18) > (In reply to Jacob Lifshay from comment #16) > separating each "function" into an explicit separate pseudocode file) > > > > PowerISA very much has functions (though they don't use the def keyword): > > Book I Appendix A > > > > Round Integer( sign, frac0:64, gbit, rbit, xbit, round_mode ): > > i expect and anticipate the pseudocode *file* to be called round_integer.mdwn so you're suggesting *deleting* the function signatures from the PowerISA specification? I don't think they'll like that. please don't do that. i'm fine if you want one function per file (but it'll be annoying), but we *need* the function signatures to stay. they will *still be functions*.
(In reply to Jacob Lifshay from comment #19) > so you're suggesting *deleting* the function signatures from the PowerISA > specification? I don't think they'll like that. please don't do that. > > i'm fine if you want one function per file (but it'll be annoying), but we > *need* the function signatures to stay. they will *still be functions*. function signatures are necessary so we know what function arguments we have and what to name them. icr where and couldn't find it from some cursory searching, but iirc we discussed pseudocode functions before and concluded we need them.
(edit: ignoring is fine too) (In reply to Jacob Lifshay from comment #20) > (In reply to Jacob Lifshay from comment #19) > > i'm fine if you want one function per file (but it'll be annoying), but we > > *need* the function signatures to stay. they will *still be functions*. dmitry, so it's clear, I'm fine if you decide it's too much work to implement pseudocode functions for now and just raise an error or ignore the parsed functions, but please don't delete them.
(In reply to Jacob Lifshay from comment #20) > icr where and couldn't find it from some cursory searching, but iirc we > discussed pseudocode functions before and concluded we need them. ok good enough. so many details.
(In reply to Jacob Lifshay from comment #19) > so you're suggesting *deleting* the function signatures from the PowerISA > specification? no. don't be absurd. if they are in section 1.3 book I i need to see the wording. i don't ever recall seeing them. if not there that needs raising with OPF ISA WG. just re-read v3.0C, sections 1.3.1 1.3.2 1.3.3 and 1.3.4 there is nothing i can see, no definition of "function" or parameters or the keyword "return"
(In reply to Luke Kenneth Casson Leighton from comment #23) > (In reply to Jacob Lifshay from comment #19) > > > so you're suggesting *deleting* the function signatures from the PowerISA > > specification? > > no. don't be absurd. > > if they are in section 1.3 book I i need to see the wording. afaict they're not there. functions are just implicitly defined. they are explicitly mentioned though (v3.1B): section title: 7.6.2.2 VSX Instruction RTL Function Calls and then it proceeds to define a bunch of functions, some in pseudo-code, some partially english description, many calling other defined functions. so, they have functions that they define and use lots of places, just they never define what a function is or how to interpret one -- they probably assume the reader has heard of function calls and returns before (which imo is a good assumption, but being explicit is almost always better). > i don't ever recall seeing them. if not there that needs raising > with OPF ISA WG. you want to do that?
(In reply to Jacob Lifshay from comment #24) > afaict they're not there. functions are just implicitly defined. which is a direct violation of their rules on the spec. > assume the reader has heard of function calls and returns before (which imo > is a good assumption, but being explicit is almost always better). it's actually required: the ISA TWG *requires* that under no circumstances can there be "assumptions" > > i don't ever recall seeing them. if not there that needs raising > > with OPF ISA WG. > > you want to do that? not really. i've had enough of dealing with them. when they agree to properly cooperate then yes.
I've started playing with this, and found some complexities in mdis module. I had to drop parent/path in walker otherwise this becomes too complex. Basically the issue is that yielding tuples inside tuple hook (unsurprisingly) leads to an infinite recursion. Since we don't use this functionality for now, I decided it'd be easier to drop it and return to this subject later. https://git.libre-soc.org/?p=mdis.git;a=commitdiff;h=835549b3d024ea03f6363f6131cce3268c9aa52a https://git.libre-soc.org/?p=mdis.git;a=commitdiff;h=84f9c40cff102be5de4a4063931764d74d7abbae
I'm currently writing the basic classes which represent pseudocode nodes, and also started implementing the first simple visitor which produces the pseudocode given the root node. Ignore the contents, they are a complete nonsense. tree = Module([ If( test=BinaryOp( left=Symbol("RA"), op=LtU(), right=Symbol("RB"), ), body=Block([ Assign( lvalue=Symbol("RT"), rvalue=BinaryOp( left=Symbol("RA"), op=Add(), right=Symbol("RB"), ), ), AssignIEA( lvalue=Symbol("NIA"), rvalue=Call([ Symbol("RA"), Call([Symbol("RB")]), ]), ), ]), orelse=Block([ AssignIEA( lvalue=Symbol("NIA"), rvalue=Symbol("CIA"), ), ]), ), ]) for (level, line) in util.pcode(tree): stmt = ((" " * (level * 4)) + line) print(stmt)
The result is: if RA <u RB then RT <- RA + RB NIA <-iea CALL(RA, CALL(RB)) else NIA <-iea CIA
I need the pseudocode printing for both the debugging and ensuring the overall approach are fine. The code goes like this (example handlers for particular node types): @Hook(If) def If(self, node): yield node stmt = " ".join(["if", str(self[node.test]), "then"]) self[node].emit(stmt=stmt) for (level, stmt) in self[node.body]: self[node].emit(stmt=stmt, level=level) if node.orelse: self[node].emit("else") stmt = str(self[node.orelse]) self[node].emit(stmt=stmt) @Hook(Symbol) def Symbol(self, node): yield node self[node].emit(stmt=str(node)) As you see, each node gets some code object associated with it. The code object has an emit method, which feeds yet another statement considering the current indentation level for it.
Obviously C will need more context to be collected and stored upon visiting, but you get the idea.
(In reply to Dmitry Selyutin from comment #28) > The result is: > > if RA <u RB then > RT <- RA + RB > NIA <-iea CALL(RA, CALL(RB)) > else > NIA <-iea CIA nice! already! that will come in handy for debug purposes during AST-Node-morphing/manipulation (why astor has a dump function) (In reply to Dmitry Selyutin from comment #29) > @Hook(If) > def If(self, node): > yield node > stmt = " ".join(["if", str(self[node.test]), "then"]) ... ... quite ridiculously clear and simple, isn't it? > As you see, each node gets some code object associated with it. The code > object has an emit method, which feeds yet another statement considering the > current indentation level for it. very cool.
Supported more nodes, I'm not saying all nodes are already implemented, but many are. tree = Module( description=String("Subtract From"), form=Form.XO, mnemonics=Mnemonics([ Mnemonic( name=String("subf"), dynamic_operands=DynamicOperands([ DynamicOperand("RT"), DynamicOperand("RA"), DynamicOperand("RB"), ]), static_operands=StaticOperands([ StaticOperand(key=String("OE"), value=Int(0)), StaticOperand(key=String("Rc"), value=Int(0)), ]), ), Mnemonic( name=String("subf."), dynamic_operands=DynamicOperands([ DynamicOperand("RT"), DynamicOperand("RA"), DynamicOperand("RB"), ]), static_operands=StaticOperands([ StaticOperand(key=String("OE"), value=Int(0)), StaticOperand(key=String("Rc"), value=Int(1)), ]), ), Mnemonic( name=String("subfo"), dynamic_operands=DynamicOperands([ DynamicOperand("RT"), DynamicOperand("RA"), DynamicOperand("RB"), ]), static_operands=StaticOperands([ StaticOperand(key=String("OE"), value=Int(1)), StaticOperand(key=String("Rc"), value=Int(0)), ]), ), Mnemonic( name=String("subfo."), dynamic_operands=DynamicOperands([ DynamicOperand("RT"), DynamicOperand("RA"), DynamicOperand("RB"), ]), static_operands=StaticOperands([ StaticOperand(key=String("OE"), value=Int(1)), StaticOperand(key=String("Rc"), value=Int(1)), ]), ), ]), alterations=Alterations([ Alteration( registers=Registers([ String("CR0"), ]), condition=Condition(key=String("Rc"), value=Int(1)), ), Alteration( registers=Registers([ String("SO"), String("OV"), String("OV32"), ]), condition=Condition(key=String("OE"), value=Int(1)), ), ]), code=Code([ Assign( lvalue=String("RT"), rvalue=BinaryOp( left=BinaryOp( left=UnaryOp( op=Not(), expr=String("RA"), ), op=Add(), right=String("RB"), ), op=Add(), right=Int(1), ), ), ]), ) for (level, line) in util.pcode(tree): stmt = ((" " * (level * 4)) + line) print(stmt)
# Subtract From XO-form * subf RT,RA,RB (OE=0 Rc=0) * subf. RT,RA,RB (OE=0 Rc=1) * subfo RT,RA,RB (OE=1 Rc=0) * subfo. RT,RA,RB (OE=1 Rc=1) Pseudo-code: RT <- ((¬(RA) + RB) + 1) Special Registers Altered: CR0 (if Rc=1) SO OV OV32 (if OE=1)