Besides the actual decoding needed to understand which instruction we're going to simulate, we also need a way to specify the actual logic behind the instruction to be executed. We already have a human-readable specification in Markdown format, and have successfully used it for our reference Python-based simulator. We need to follow the same approach for cavatools, and generate the code for all PowerISA instructions.
An important remark: even though we're able to execute an instruction only after it's been decoded, the decoder-related task (#979) is not a show-stopper here. These tasks can be completed in parallel.
IRC discussion https://libre-soc.org/irclog/%23libre-soc.2023-05-29.log.html#t2023-05-29T20:10:07
i recommend using memcpy instead of *(uint16_t *)addr, since it avoids C language UB and generally gets compiled to a single load/store, not a function call: https://gcc.godbolt.org/z/sj9o1novn

#include <stdint.h>
#include <string.h>

uint16_t load(uint8_t *mem, uint64_t EA) {
    uint16_t retval;
    memcpy(&retval, &mem[EA], 2);
    return retval;
}

load:                                   # @load
        lhzx 3, 3, 4
        blr
        .long 0
        .quad 0
Dmitry, is this task doable? Again apologies, I don't remember if we agreed this task is actually possible within the time constraints. My chart on the discussion page proposed to cancel this task.
(In reply to Andrey Miroshnikov from comment #4)
> Dmitry, is this task doable?

yes.

> Again apologies, I don't remember if we agreed this task is actually
> possible within the time constraints.

it is.

> My chart on the discussion page
> proposed to cancel this task.

no.
Hi guys! Andrey, I'm sorry, I've missed your message.

(In reply to Luke Kenneth Casson Leighton from comment #5)
> (In reply to Andrey Miroshnikov from comment #4)
> > Dmitry, is this task doable?
>
> yes.
>
> > Again apologies, I don't remember if we agreed this task is actually
> > possible within the time constraints.
>
> it is.
>
> > My chart on the discussion page
> > proposed to cancel this task.
>
> no.

Luke, I agree with you it's both doable and can be kept, but we have to establish the criteria, otherwise we'll put it at risk. We'd better be damn careful! Here's what I suggest:

1. Stick to developing the key concepts for code generation. Likely only vanilla PPC at this stage.

2. Generate something to be reused as yet another library; this allows us to combine it with libopid. I'm thinking of generating a table of per-instruction callbacks along with their bodies and some context around, like this:

/*
 * Intended to be called from the decoder.
 * Prefix is NOP for vanilla PPC.
 * Just a quick approximation, something more complex is needed.
 */
static void
opsim_add(struct opsim_ctx *ctx, uint64_t prefix, uint64_t suffix)
{
    /* can be optimized later, don't care for now */
    uint64_t RT = opsim_reg_fetch(ctx, opsim_add_RT(suffix));
    uint64_t RA = opsim_reg_fetch(ctx, opsim_add_RA(suffix));
    uint64_t RB = opsim_reg_fetch(ctx, opsim_add_RB(suffix));

    RT = (RA + RB);
    opsim_reg_store(ctx, opsim_add_RT(suffix), RT);
}

or even:

static void
opsim_add(struct opsim_ctx *ctx, uint64_t prefix, uint64_t suffix)
{
    opsim_reg_store(ctx, opsim_add_RT(suffix),
        (opsim_reg_fetch(ctx, opsim_add_RA(suffix)) +
         opsim_reg_fetch(ctx, opsim_add_RB(suffix))));
}

3. Nothing to be touched yet in the existing codegen; we'll just add a C generation mode into the current generator, due to time constraints. Yes, there's room for improvement, and some things might be done differently. But for now -- no changes unless the lack of them stops us.
We'll be under a huge risk otherwise.
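As an aside, the opsim_add_RT/RA/RB accessors in the sketch above would just slice operand fields out of the 32-bit suffix. A Python model of what such generated accessors would compute (the opsim_* names are the hypothetical ones from the C sketch; the bit positions are the standard XO-form RT/RA/RB fields, IBM bit numbering with bit 0 as MSB):

```python
# Model of generated field accessors: extract RT/RA/RB from a 32-bit
# PowerISA word. IBM bit numbering: bit 0 is the MSB of the 32-bit word.
def field(insn, lo, hi):
    """Return bits lo..hi (inclusive, IBM numbering) of a 32-bit word."""
    width = hi - lo + 1
    shift = 31 - hi
    return (insn >> shift) & ((1 << width) - 1)

# XO-form operand fields (hypothetical opsim_* naming as in the C sketch)
def opsim_add_RT(suffix): return field(suffix, 6, 10)
def opsim_add_RA(suffix): return field(suffix, 11, 15)
def opsim_add_RB(suffix): return field(suffix, 16, 20)
```

For `add 3,4,5` (encoded 0x7C642A14) these return 3, 4 and 5 respectively.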
(In reply to Dmitry Selyutin from comment #6) > Luke, I agree with you it's both doable and can be kept, but we have to > establish the criteria, otherwise we'll put it under risk. yes. i recommend a compatible drop-in replacement for PowerDecoder. (power_decoder.py) which is easier because the existing unit tests can be used (like, all of them: TestIssuer as well as ISACaller). this means a cffi interface to return PowerOp instances. do not attempt to rewrite PowerOp, it is too fundamental. a replacement class that does the exact same job, and is used *only* by the new c-based decoder: not a problem.
(In reply to Luke Kenneth Casson Leighton from comment #7) > yes. i recommend a compatible drop-in replacement for PowerDecoder. > (power_decoder.py) which is easier because the existing unit tests > can be used (like, all of them: TestIssuer as well as ISACaller). oh hang on sorry, pseudocode compiler not power decoder. (In reply to Dmitry Selyutin from comment #6) > 3. Nothing to be touched yet in the existing codegen, we'll just add C > generation mode into the current generator, due to time constraints. yes. cffi interface as well, to allow unit tests comparing against existing python-based simulator, ultimately i want it to *replace* the python-based compiler. skip functions such as EXTS and others (Power ISA Section 1.3) for now. > Yes > there's a room for improvement, some things might be done differently. But > for now -- no changes unless lack of them stops us. We'll be under a huge > risk otherwise. v. true
(In reply to Dmitry Selyutin from comment #6) > opsim_reg_store(ctx, opsim_add_RT(suffix), RT); opsim_reg_store must set an "ok" bit in a struct to indicate that the register has been modified. long story.
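A tiny Python sketch of that "ok" convention (names hypothetical, just modelling what opsim_reg_store would do in the generated C): the store routine flags every register it touches, so the caller can tell exactly which registers an instruction modified.

```python
# Hypothetical model of a register file whose store operation records an
# "ok" bit per register, marking it as modified by the instruction.
class RegFile:
    def __init__(self, nregs=32):
        self.regs = [0] * nregs
        self.ok = [False] * nregs   # "ok" bit: register was written

    def store(self, idx, value):
        self.regs[idx] = value & 0xFFFFFFFFFFFFFFFF  # keep to 64 bits
        self.ok[idx] = True

    def fetch(self, idx):
        return self.regs[idx]
```

After executing one instruction, only the registers it actually wrote have their "ok" bit set.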
OK, the first thing I did was replace print() incantations with log(). I want to at least be able to observe MY debug output instead of everything else. You should observe no differences unless you explicitly incantate SILENCELOG=true. https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;h=659f58e0144c0ddb59439501ed9c50ba993f92fe
(In reply to Dmitry Selyutin from comment #10)
> OK, first thing I did was replacing print() incantations with log().

goood. i did some yesterday, already.

> I want
> to at least be able to observe MY debug instead of everything else. You
> should observe no differences unless you explicitly incantate
> SILENCELOG=true.

fantastic. i have no problem with that at all. (as i am dealing all the time with *individual damn bits* i need to literally see everything and i do mean everything. as long as that's not interfered with i'm good. i say good but what i mean is, "my mind is utterly twisted and melted by the insane level of detail *which i memorise and look out for")

> https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;
> h=659f58e0144c0ddb59439501ed9c50ba993f92fe
I've spent a LOT of hours debugging and checking the opportunities. Here are some conclusions I've come to so far.

1. I might appear to be a devil's advocate, but I actually think that python-ply is a good choice. Regardless of our implementation's pros and cons, for me it already proves python-ply is a sufficiently good instrument, and I'm going to stick with it.

2. What is bad from the C code generation perspective is that we produce Python AST. It's not a suitable candidate for C code generation. What's worse, we generate multiple nodes for a single entity, and it gets too complicated even in Python, let alone C.

3. It gets especially hairy when we produce Python calls and modify the code after it has been obtained with AST. That is, we get the AST result and patch it in place.

4. We cannot avoid modifying the nodes produced, but there are several issues. First, all our nodes are Python AST. Second, some nodes are built in place, e.g. concat and selectassign and many others (basically all which lead to ast.Call). Third, whenever I update this code, it could potentially bring a disaster.

All in all, I don't believe that the current code is a good basis for a C code generator. It works for Python, and I'd keep it so that the code WORKS. Here's what I suggest instead.

1. Keep python-ply, but recreate the overall code so that it produces CUSTOM nodes instead. That is, where we have <u, there will be a `class LTUnsigned(Node)` construction, and similarly for all other stuff. Basically a real language-agnostic AST. Language-agnostic is an absolutely critical item!

2. Once we have the AST collected and working, we can have walkers and visitors for it.

3. This AST can be used for C code generation. Ideally, had we no time constraints, we'd build ANOTHER AST atop of that: PseudocodeAST => CAst. But for now we'll likely have to follow the simpler approach.

4. If we have time and opportunities, we'd eventually follow the same approach for Python code.
That is, just convert PseudocodeAST to ast.Module. I don't think it's doable in terms of the current approach, mostly due to the fact that it produces a pretty complicated Python AST. I could try changing the current code, but I'm absolutely sure my changes would drive all the code around it completely nuts.

I wish I had returned with more encouraging news, but these are the conclusions I've been able to reach so far. Basically my suggestion is to use python-ply again but output completely different nodes. So the code will eventually look familiar, but still build something completely different.

I'd like to know your opinion, and will meanwhile move on with checking other projects using ply to better understand how it can be used. Before we proceed, two important items: 1) python-ply is to stay, we don't have time for brand new frameworks, plus I'm mostly satisfied with it; 2) there's no time (and a strong lack of desire) to teach the whole Python ast to produce Python code.
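The custom-node idea above can be sketched in a few lines of Python (all class names here are hypothetical, chosen to echo the `class LTUnsigned(Node)` example): a language-agnostic tree, plus separate walkers that render the same tree back to pseudocode or to C.

```python
# Minimal sketch of language-agnostic AST nodes. Class names hypothetical;
# "<u" (unsigned less-than) becomes its own node kind instead of a
# patched-up Python ast.Compare.
class Node:
    pass

class Symbol(Node):
    def __init__(self, name):
        self.name = name

class LTUnsigned(Node):
    def __init__(self, left, right):
        self.left, self.right = left, right

# Separate walkers render the *same* tree into different languages.
def to_pseudocode(node):
    if isinstance(node, Symbol):
        return node.name
    if isinstance(node, LTUnsigned):
        return f"{to_pseudocode(node.left)} <u {to_pseudocode(node.right)}"
    raise NotImplementedError(type(node))

def to_c(node):
    if isinstance(node, Symbol):
        return node.name
    if isinstance(node, LTUnsigned):
        # real codegen would cast operands to an unsigned type first
        return f"({to_c(node.left)} < {to_c(node.right)})"
    raise NotImplementedError(type(node))
```

The point is that neither walker ever sees Python AST: the tree carries the pseudocode semantics, and each backend decides on its own spelling.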
(In reply to Dmitry Selyutin from comment #12) > I'd like to know your opinion and will meanwhile move on with checking other > projects using ply to better understand how it can be used. Before we > proceed, two important items: 1) python-ply is to stay, we don't have time > for brand new frameworks, using python ply is fine with me, but please modify the parser rather than introduce kludges like pre-parsing steps to insert colons (unless you're just reusing existing parsing code unmodified). also, we need to change operator precedence to match the PowerISA specification: bug #1082 (it isn't a lot of work, just change the operator list order and run all the tests.)
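For reference, the precedence fix amounts to reordering a table. A toy precedence-climbing evaluator (purely illustrative, not the actual ply grammar from bug #1082) shows that the binding-power table alone decides how "1 + 2 * 3" groups, which is exactly what reordering ply's precedence list changes:

```python
# Toy precedence-climbing evaluator: prec maps operator -> binding power,
# and changing the table changes the parse, with no grammar edits needed.
import re

def evaluate(source, prec):
    tokens = re.findall(r"\d+|[+*]", source)
    pos = 0

    def parse(min_bp):
        nonlocal pos
        lhs = int(tokens[pos]); pos += 1
        while pos < len(tokens) and prec[tokens[pos]] >= min_bp:
            op = tokens[pos]; pos += 1
            rhs = parse(prec[op] + 1)       # +1 => left-associative
            lhs = {"+": lhs + rhs, "*": lhs * rhs}[op]
        return lhs

    return parse(0)
```

With the conventional table `{"+": 1, "*": 2}` the expression "1+2*3" evaluates to 7; swap the two entries and the same input evaluates to 9.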
(In reply to Jacob Lifshay from comment #13) > using python ply is fine with me, but please modify the parser rather than > introduce kludges like pre-parsing steps to insert colons (unless you're > just reusing existing parsing code unmodified). or copying.
(In reply to Dmitry Selyutin from comment #12)
> 3. This AST can be used for C code generation. Ideally, had we not time
> constraints, we'd build ANOTHER AST atop of that: PseudocodeAST => CAst. But
> for now likely we'll have to follow the simpler approach.

i'm good with that. if this was a "full" language (scopes, exceptions, global/local variables, classes, gotos) i would say NO hell NO.

(btw please drop functions. they were not added and should not have been added by jacob. please drop "def" from the AST and modify pyfnwriter and appendix pseudocode to not support functions. this means separating each "function" into an explicit separate pseudocode file)

> 4. If we have time and opportunities, we'd just eventually follow the same
> approach for Python code. That is, just convert PseudocodeAST to ast.Module.
>
> I don't think it's doable in terms of the current approach mostly due to the
> fact that it produces a pretty complicated Python AST.

yes. if you have seen the notes, i started from dabeaz's GardenSnake.py as *literally* the first commit, then fixed the lexer problem (with an additional lexer phase that inline-inserted the required TOKENs):

https://git.libre-soc.org/?p=openpower-isa.git;a=blob;f=src/openpower/decoder/pseudo/lexer.py;h=c42f8330;hb=HEAD#l62

and then spent about FOUR MONTHS over the next 2 years performing incremental changes and fixes that got it to do the job. the only reason i could start from GardenSnake.py was because i spotted that fundamentally the underlying syntax (not the grammar) of python and Section 1.3 POWER Book I ISA are identical.

*over time* the use of python-astor, which i used because some dickhead in the python community ripped out ast.py of python2.7 and replaced it with fucking c code *with no introspection or source* "bcuz faster", has become a serious impediment.
i can barely edit parser.py without also creating a prerequisite code-fragment in power_pseudo.py and using astor.dump()

https://git.libre-soc.org/?p=openpower-isa.git;a=blob;f=src/openpower/decoder/power_pseudo.py;hb=HEAD

> I could try changing
> the current code, but I'm absolutely sure my changes will make all the code
> around completely nuts.

yyep. it's reached the "mature complex but stable" phase that all software developed incrementally will get to.

> I wish I had returned with a more encouraging news, but these are
> conclusions I've been able to reach so far. Basically my suggestion is to
> use python-ply again and output a completely different nodes. So the code
> eventually will look familiar but still build something completely different.

seriously and honestly, i was expecting that anyway. an astor-to-c converter (basically a replication of cython) was a "hope" rather than a requirement.

> I'd like to know your opinion and will meanwhile move on with checking other
> projects using ply to better understand how it can be used. Before we
> proceed, two important items: 1) python-ply is to stay, we don't have time
> for brand new frameworks,

no we do not.

> plus I'm mostly satisfied with it;

it's pretty damn good. it is sad that dabeaz abandoned it.

> 2) no there's
> no time (and there's a strong lack of desire) to teach the whole Python ast
> to produce Python code.

it *used* to do that. go right back to the original very first commit.

(context: GardenSnake.py was - is - an interactive python interpreter written *in* python using python-ply. it worked perfectly for *very early* versions of python and i am going back 20+ years here but at some point the syntax changed and dabeaz never got round to fixing GardenSnake.py... that was my very first task)
(In reply to Luke Kenneth Casson Leighton from comment #15)
> (btw please drop functions. they were not added and should not have been
> added by jacob. please drop "def" from the AST and modify pyfnwriter
> and appendix pseudocode to not support functions. this means
> separating each "function" into an explicit separate pseudocode file)

PowerISA very much has functions (though they don't use the def keyword): Book I Appendix A

Round Integer( sign, frac0:64, gbit, rbit, xbit, round_mode ):
    inc <- 0
    if round_mode = 0b00 then do /* Round to Nearest */
        if sign || frac64 || gbit || rbit || xbit = 0bU11UU then inc <- 1
        if sign || frac64 || gbit || rbit || xbit = 0bU011U then inc <- 1
        if sign || frac64 || gbit || rbit || xbit = 0bU01U1 then inc <- 1
    end
    if round_mode = 0b10 then do /* Round toward +Infinity */
        if sign || frac64 || gbit || rbit || xbit = 0b0U1UU then inc <- 1
        if sign || frac64 || gbit || rbit || xbit = 0b0UU1U then inc <- 1
        if sign || frac64 || gbit || rbit || xbit = 0b0UUU1 then inc <- 1
    end
    if round_mode = 0b11 then do /* Round toward -Infinity */
        if sign || frac64 || gbit || rbit || xbit = 0b1U1UU then inc <- 1
        if sign || frac64 || gbit || rbit || xbit = 0b1UU1U then inc <- 1
        if sign || frac64 || gbit || rbit || xbit = 0b1UUU1 then inc <- 1
    end
    frac0:64 <- frac0:64 + inc
    FPSCR[FR] <- inc
    FPSCR[FI] <- gbit | rbit | xbit
    return
(In reply to Jacob Lifshay from comment #16) > (In reply to Luke Kenneth Casson Leighton from comment #15) > > (btw please drop functions. they were not added and should not have beem > > added by jacob. please drop "def" from the AST and modify pyfnwriter > > and appendix pseudocode to not support functions. this means > > separating each "function" into an explicit separate pseudocode file) > > PowerISA very much has functions (though they don't use the def keyword): see also bfp_ROUND_TO_BFP32 which in the spec PDF is basically a function: it has a function signature line, has calls to other functions, has return statements, etc.
(In reply to Jacob Lifshay from comment #16)
> > separating each "function" into an explicit separate pseudocode file)
>
> PowerISA very much has functions (though they don't use the def keyword):
> Book I Appendix A
>
> Round Integer( sign, frac0:64, gbit, rbit, xbit, round_mode ):

i expect and anticipate the pseudocode *file* to be called round_integer.mdwn
(In reply to Luke Kenneth Casson Leighton from comment #18) > (In reply to Jacob Lifshay from comment #16) > separating each "function" into an explicit separate pseudocode file) > > > > PowerISA very much has functions (though they don't use the def keyword): > > Book I Appendix A > > > > Round Integer( sign, frac0:64, gbit, rbit, xbit, round_mode ): > > i expect and anticipate the pseudocode *file* to be called round_integer.mdwn so you're suggesting *deleting* the function signatures from the PowerISA specification? I don't think they'll like that. please don't do that. i'm fine if you want one function per file (but it'll be annoying), but we *need* the function signatures to stay. they will *still be functions*.
(In reply to Jacob Lifshay from comment #19) > so you're suggesting *deleting* the function signatures from the PowerISA > specification? I don't think they'll like that. please don't do that. > > i'm fine if you want one function per file (but it'll be annoying), but we > *need* the function signatures to stay. they will *still be functions*. function signatures are necessary so we know what function arguments we have and what to name them. icr where and couldn't find it from some cursory searching, but iirc we discussed pseudocode functions before and concluded we need them.
(edit: ignoring is fine too) (In reply to Jacob Lifshay from comment #20) > (In reply to Jacob Lifshay from comment #19) > > i'm fine if you want one function per file (but it'll be annoying), but we > > *need* the function signatures to stay. they will *still be functions*. dmitry, so it's clear, I'm fine if you decide it's too much work to implement pseudocode functions for now and just raise an error or ignore the parsed functions, but please don't delete them.
(In reply to Jacob Lifshay from comment #20) > icr where and couldn't find it from some cursory searching, but iirc we > discussed pseudocode functions before and concluded we need them. ok good enough. so many details.
(In reply to Jacob Lifshay from comment #19)
> so you're suggesting *deleting* the function signatures from the PowerISA
> specification?

no. don't be absurd.

if they are in section 1.3 book I i need to see the wording. i don't ever recall seeing them. if not there that needs raising with OPF ISA WG.

just re-read v3.0C, sections 1.3.1, 1.3.2, 1.3.3 and 1.3.4. there is nothing i can see, no definition of "function" or parameters or the keyword "return"
(In reply to Luke Kenneth Casson Leighton from comment #23) > (In reply to Jacob Lifshay from comment #19) > > > so you're suggesting *deleting* the function signatures from the PowerISA > > specification? > > no. don't be absurd. > > if they are in section 1.3 book I i need to see the wording. afaict they're not there. functions are just implicitly defined. they are explicitly mentioned though (v3.1B): section title: 7.6.2.2 VSX Instruction RTL Function Calls and then it proceeds to define a bunch of functions, some in pseudo-code, some partially english description, many calling other defined functions. so, they have functions that they define and use lots of places, just they never define what a function is or how to interpret one -- they probably assume the reader has heard of function calls and returns before (which imo is a good assumption, but being explicit is almost always better). > i don't ever recall seeing them. if not there that needs raising > with OPF ISA WG. you want to do that?
(In reply to Jacob Lifshay from comment #24) > afaict they're not there. functions are just implicitly defined. which is a direct violation of their rules on the spec. > assume the reader has heard of function calls and returns before (which imo > is a good assumption, but being explicit is almost always better). it's actually required: the ISA TWG *requires* that under no circumstances can there be "assumptions" > > i don't ever recall seeing them. if not there that needs raising > > with OPF ISA WG. > > you want to do that? not really. i've had enough of dealing with them. when they agree to properly cooperate then yes.
I've started playing with this, and found some complexities in mdis module. I had to drop parent/path in walker otherwise this becomes too complex. Basically the issue is that yielding tuples inside tuple hook (unsurprisingly) leads to an infinite recursion. Since we don't use this functionality for now, I decided it'd be easier to drop it and return to this subject later. https://git.libre-soc.org/?p=mdis.git;a=commitdiff;h=835549b3d024ea03f6363f6131cce3268c9aa52a https://git.libre-soc.org/?p=mdis.git;a=commitdiff;h=84f9c40cff102be5de4a4063931764d74d7abbae
I'm currently writing the basic classes which represent pseudocode nodes, and have also started implementing the first simple visitor, which produces the pseudocode given the root node. Ignore the contents, they are complete nonsense.

tree = Module([
    If(
        test=BinaryOp(
            left=Symbol("RA"),
            op=LtU(),
            right=Symbol("RB"),
        ),
        body=Block([
            Assign(
                lvalue=Symbol("RT"),
                rvalue=BinaryOp(
                    left=Symbol("RA"),
                    op=Add(),
                    right=Symbol("RB"),
                ),
            ),
            AssignIEA(
                lvalue=Symbol("NIA"),
                rvalue=Call([
                    Symbol("RA"),
                    Call([Symbol("RB")]),
                ]),
            ),
        ]),
        orelse=Block([
            AssignIEA(
                lvalue=Symbol("NIA"),
                rvalue=Symbol("CIA"),
            ),
        ]),
    ),
])

for (level, line) in util.pcode(tree):
    stmt = ((" " * (level * 4)) + line)
    print(stmt)
The result is:

if RA <u RB then
    RT <- RA + RB
    NIA <-iea CALL(RA, CALL(RB))
else
    NIA <-iea CIA
I need the pseudocode printing both for debugging and to ensure the overall approach is fine. The code goes like this (example handlers for particular node types):

@Hook(If)
def If(self, node):
    yield node
    stmt = " ".join(["if", str(self[node.test]), "then"])
    self[node].emit(stmt=stmt)
    for (level, stmt) in self[node.body]:
        self[node].emit(stmt=stmt, level=level)
    if node.orelse:
        self[node].emit("else")
        stmt = str(self[node.orelse])
        self[node].emit(stmt=stmt)

@Hook(Symbol)
def Symbol(self, node):
    yield node
    self[node].emit(stmt=str(node))

As you see, each node gets a code object associated with it. The code object has an emit method, which feeds yet another statement, taking the current indentation level into account.
Obviously C will need more context to be collected and stored upon visiting, but you get the idea.
(In reply to Dmitry Selyutin from comment #28) > The result is: > > if RA <u RB then > RT <- RA + RB > NIA <-iea CALL(RA, CALL(RB)) > else > NIA <-iea CIA nice! already! that will come in handy for debug purposes during AST-Node-morphing/manipulation (why astor has a dump function) (In reply to Dmitry Selyutin from comment #29) > @Hook(If) > def If(self, node): > yield node > stmt = " ".join(["if", str(self[node.test]), "then"]) ... ... quite ridiculously clear and simple, isn't it? > As you see, each node gets some code object associated with it. The code > object has an emit method, which feeds yet another statement considering the > current indentation level for it. very cool.
Supported more nodes; I'm not saying all nodes are already implemented, but many are.

tree = Module(
    description=String("Subtract From"),
    form=Form.XO,
    mnemonics=Mnemonics([
        Mnemonic(
            name=String("subf"),
            dynamic_operands=DynamicOperands([
                DynamicOperand("RT"),
                DynamicOperand("RA"),
                DynamicOperand("RB"),
            ]),
            static_operands=StaticOperands([
                StaticOperand(key=String("OE"), value=Int(0)),
                StaticOperand(key=String("Rc"), value=Int(0)),
            ]),
        ),
        Mnemonic(
            name=String("subf."),
            dynamic_operands=DynamicOperands([
                DynamicOperand("RT"),
                DynamicOperand("RA"),
                DynamicOperand("RB"),
            ]),
            static_operands=StaticOperands([
                StaticOperand(key=String("OE"), value=Int(0)),
                StaticOperand(key=String("Rc"), value=Int(1)),
            ]),
        ),
        Mnemonic(
            name=String("subfo"),
            dynamic_operands=DynamicOperands([
                DynamicOperand("RT"),
                DynamicOperand("RA"),
                DynamicOperand("RB"),
            ]),
            static_operands=StaticOperands([
                StaticOperand(key=String("OE"), value=Int(1)),
                StaticOperand(key=String("Rc"), value=Int(0)),
            ]),
        ),
        Mnemonic(
            name=String("subfo."),
            dynamic_operands=DynamicOperands([
                DynamicOperand("RT"),
                DynamicOperand("RA"),
                DynamicOperand("RB"),
            ]),
            static_operands=StaticOperands([
                StaticOperand(key=String("OE"), value=Int(1)),
                StaticOperand(key=String("Rc"), value=Int(1)),
            ]),
        ),
    ]),
    alterations=Alterations([
        Alteration(
            registers=Registers([
                String("CR0"),
            ]),
            condition=Condition(key=String("Rc"), value=Int(1)),
        ),
        Alteration(
            registers=Registers([
                String("SO"),
                String("OV"),
                String("OV32"),
            ]),
            condition=Condition(key=String("OE"), value=Int(1)),
        ),
    ]),
    code=Code([
        Assign(
            lvalue=String("RT"),
            rvalue=BinaryOp(
                left=BinaryOp(
                    left=UnaryOp(
                        op=Not(),
                        expr=String("RA"),
                    ),
                    op=Add(),
                    right=String("RB"),
                ),
                op=Add(),
                right=Int(1),
            ),
        ),
    ]),
)

for (level, line) in util.pcode(tree):
    stmt = ((" " * (level * 4)) + line)
    print(stmt)
# Subtract From

XO-form

* subf RT,RA,RB (OE=0 Rc=0)
* subf. RT,RA,RB (OE=0 Rc=1)
* subfo RT,RA,RB (OE=1 Rc=0)
* subfo. RT,RA,RB (OE=1 Rc=1)

Pseudo-code:

    RT <- ((¬(RA) + RB) + 1)

Special Registers Altered:

    CR0 (if Rc=1)
    SO OV OV32 (if OE=1)
(In reply to Dmitry Selyutin from comment #33)
> # Subtract From
>
> XO-form
>
> * subf RT,RA,RB (OE=0 Rc=0)
> * subf. RT,RA,RB (OE=0 Rc=1)
> * subfo RT,RA,RB (OE=1 Rc=0)
> * subfo. RT,RA,RB (OE=1 Rc=1)

ah - the focus and the scope is solely and exclusively the pseudocode. there is no money payable for work on anything other than the pseudocode, plus it is too risky to start adding to or changing the scope in an unplanned manner when there is so little time (cavatools and cryptoprimitives are on a HARD deadline of March 1st 2024 - only 9 weeks).

therefore don't, whatever you do, increase the scope *in any way*, but if you have ideas raise them as "Future" and we can discuss the possibility of adding them to one of the two upcoming grants when appropriate, if appropriate, ok?

really *really* don't increase the scope here, it is just too scary to contemplate missing the commitment made to NLnet because of some unplanned addition to the scope, no matter how good it is.
...pause... sorry :)
OK Luke, that's fair, I totally agree with you. I won't pursue this target in the scope of this task, and will abandon these (quite incomplete) changes now. Still, I think this would be the right choice, and we'll eventually do it. This largely relates to the binutils-related work as well; we need the possible forms there. I'll raise the relevant task later, OK?
(In reply to Dmitry Selyutin from comment #36)
> OK Luke, that's fair, I totally agree with you. I won't pursue this target
> in scope of this task,

they have quite a lot of additional implications for changes that make me nervous if not properly evaluated (under time-pressure at that)

> and will abandon these (quite incomplete) changes now.

suggest to keep the git commit around (no force-branch-push which loses it. or.. ah! could explicitly tag it?)

> Still I think this would be the right choice, and eventually will do it.

i like the very deliberate brutal simplicity of the current parser. however an actual AST (esp. in BNF form) is ultimately better but hm the lexer/parser split in python-ply leaves me drumming my fingers at the additional complexity.

> This largely relates to binutils-related works as well, we need possible
> forms there. I'll raise the relevant task later, OK?

no problem. parent: bug #1212
(In reply to Luke Kenneth Casson Leighton from comment #37)
> > Still I think this would be the right choice, and eventually will do it.
>
> i like the very deliberate brutal simplicity of the current parser. however
> an actual AST (esp. in BNF form) is ultimately better but hm the lexer/parser
> split in python-ply leaves me drumming my fingers at the additional
> complexity.

I think we should aim for an intermediate representation that captures the *semantics* (since those are what we're trying to translate) rather than trying to exactly match the *syntax*. So, no, I don't think an AST is appropriate; we should instead make a tree structure that may be similar to an AST but prioritizes capturing semantic differences over being in BNF form or meeting other syntax-focused requirements.

An example of what I mean: the difference between FPSCR[VXSNAN] (a named field access, not indexing, usually written FPSCR.VXSNAN) and v[i + 3] (bit indexing). If we were focused on syntax, those would both just be the indexing operator with subexpressions of FPSCR and VXSNAN, or v and i + 3, respectively. Focusing on semantics means those would instead be different node kinds, since they are completely different kinds of operations. FPSCR[VXSNAN] would probably be the same kind of node as FPSCR.VXSNAN, with a flag if necessary to tell the two syntactical forms apart.
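A small Python sketch of that distinction (node names hypothetical): the same surface syntax a[b] lowers to different IR node kinds depending on whether b names a field or computes a bit index.

```python
# Hypothetical IR nodes: FPSCR[VXSNAN] is a named-field access, while
# v[i + 3] is bit indexing -- different node kinds despite identical syntax.
from dataclasses import dataclass

@dataclass
class FieldAccess:
    target: str
    field: str

@dataclass
class BitIndex:
    target: str
    index: object   # an arbitrary expression subtree

def lower_subscript(target, sub, known_fields):
    """Choose the IR node kind from semantics rather than syntax."""
    if isinstance(sub, str) and sub in known_fields.get(target, ()):
        return FieldAccess(target, sub)
    return BitIndex(target, sub)
```

The lookup table of known register fields is the extra semantic information the parser alone doesn't have, which is exactly why this is an IR decision rather than a grammar one.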
(In reply to Jacob Lifshay from comment #38)
> the difference between FPSCR[VXSNAN] (a named field access, not indexing,
> usually written FPSCR.VXSNAN)

dmitry, in case that was confusing: note that we decided to use the a[b] syntax whenever the PDF uses (html-ified since bugzilla doesn't support formatting) a<sub>b</sub> (where b is a subscript right after a).
(In reply to Jacob Lifshay from comment #38) > I think we should aim more for an intermediate representation that aims to > capture the *semantics* a lot of the benefit of an IR is that it makes simple and obvious all of the semantic differences that are not obvious from just parsing the input syntax.
After thinking a lot, I've decided to take the current lexer and parser. Even though I had good success on this road, I have to reserve some time for the generation itself. I suggest we raise relevant tasks for the ideas I have in mind once this task is completed; from the code it'll be obvious why these ideas and improvements are good to have.
That said, whilst I'm keeping the current parser, it still has to produce different nodes, so all previous remarks are still valid.
(In reply to Dmitry Selyutin from comment #42) > keeping the current parser ...basically having its copy. No intents to modify what we have now.
I have to say that we'll have to reimplement bits of it eventually: I'd like to make the lexing/parsing routines part of the classes. I don't have time for this now (I'm basically hacking on a copy of the existing code), but this is inevitable. The names of tokens and rules are not sufficiently apt, and I wouldn't say the code is really maintainable. But I'm reserving this for the future due to time constraints. The next (long and boring) comments are the code we had, the tree we got, and the code we obtained from the tree. There still might be issues, but mostly the output meets my expectations.
The original pseudocode (simplev.mdwn):

tree[0:63] <- (RB)
mode <- tree[62:63]
tree[62:63] <- 0
ra_used <- 0b0
in_bits[0:63] <- (RC|0)
if in_bits = 0 then
    in_bits[0:63] <- 1
orig_in_bits <- in_bits
tree_index <- 1
found <- 0b0
hit_end <- 0b0
do bit_length = 1 to 6
    in_bit <- in_bits[63]
    if in_bits = 1 then
        if ra_used | (_RA = 0) then
            hit_end <- 0b1
            leave
        ra_used <- 0b1
        in_bit <- (RA)[63]
        in_bits <- 0b1 || (RA)[0:62]
    else
        in_bits <- 0b0 || in_bits[0:62]
    tree_index <- tree_index * 2
    if in_bit = 1 then
        tree_index <- tree_index + 1
    if tree_index < 64 then
        if tree[63 - tree_index] then
            found <- 0b1
            leave
compressed_index <- 0
do i = 0 to 127
    possible <- 1
    j <- i
    do while j >= 4
        j <- j / 2
        if tree[63 - j] then
            possible <- 0
            leave
    if i = tree_index then
        leave
    else
        if i >= 64 then
            compressed_index <- compressed_index + possible
        else
            if tree[63 - i] = 1 then
                compressed_index <- compressed_index + possible
switch(mode)
    case(0):
        RT[0:63] <- tree_index
        if ¬found then
            in_bits <- orig_in_bits
            ra_used <- 0b0
    case(1):
        RT[0:63] <- tree_index
        if hit_end then
            in_bits <- orig_in_bits
            ra_used <- 0b0
    case(2):
        RT[0:63] <- compressed_index
        if ¬found then
            in_bits <- orig_in_bits
            ra_used <- 0b0
            RT[0:63] <- tree_index
    default:
        RT[0:63] <- compressed_index
        if hit_end then
            in_bits <- orig_in_bits
            ra_used <- 0b0
RS <- in_bits
CR0 <- ra_used || (tree_index >= 64) || found || hit_end
The tree (I'm too lazy to indent it): Scope([Assign(lvalue=RangeSubscript(start=DecLiteral(0), end=DecLiteral(63), target=Symbol(tree)), rvalue=GPR(RB)), Assign(lvalue=Symbol(mode), rvalue=RangeSubscript(start=DecLiteral(62), end=DecLiteral(63), target=Symbol(tree))), Assign(lvalue=RangeSubscript(start=DecLiteral(62), end=DecLiteral(63), target=Symbol(tree)), rvalue=DecLiteral(0)), Assign(lvalue=Symbol(ra_used), rvalue=BinLiteral(0b0)), Assign(lvalue=RangeSubscript(start=DecLiteral(0), end=DecLiteral(63), target=Symbol(in_bits)), rvalue=GPR(RC0)), IfExpr(test=BinaryExpr(left=Symbol(in_bits), op=Eq(=), right=DecLiteral(0)), body=Scope([Assign(lvalue=RangeSubscript(start=DecLiteral(0), end=DecLiteral(63), target=Symbol(in_bits)), rvalue=DecLiteral(1))]), orelse=Scope([])), Assign(lvalue=Symbol(orig_in_bits), rvalue=Symbol(in_bits)), Assign(lvalue=Symbol(tree_index), rvalue=DecLiteral(1)), Assign(lvalue=Symbol(found), rvalue=BinLiteral(0b0)), Assign(lvalue=Symbol(hit_end), rvalue=BinLiteral(0b0)), ForExpr(target=Symbol(bit_length), start=DecLiteral(1), end=DecLiteral(6), body=Scope([Assign(lvalue=Symbol(in_bit), rvalue=Subscript(index=DecLiteral(63), target=Symbol(in_bits))), IfExpr(test=BinaryExpr(left=Symbol(in_bits), op=Eq(=), right=DecLiteral(1)), body=Scope([IfExpr(test=BinaryExpr(left=Symbol(ra_used), op=BitOr(|), right=BinaryExpr(left=Symbol(_RA), op=Eq(=), right=DecLiteral(0))), body=Scope([Assign(lvalue=Symbol(hit_end), rvalue=BinLiteral(0b1)), LeaveKeyword(leave)]), orelse=Scope([])), Assign(lvalue=Symbol(ra_used), rvalue=BinLiteral(0b1)), Assign(lvalue=Symbol(in_bit), rvalue=Subscript(index=DecLiteral(63), target=GPR(RA))), Assign(lvalue=Symbol(in_bits), rvalue=BinaryExpr(left=BinLiteral(0b1), op=BitConcat(||), right=RangeSubscript(start=DecLiteral(0), end=DecLiteral(62), target=GPR(RA))))]), orelse=Scope([Assign(lvalue=Symbol(in_bits), rvalue=BinaryExpr(left=BinLiteral(0b0), op=BitConcat(||), right=RangeSubscript(start=DecLiteral(0), end=DecLiteral(62), 
target=Symbol(in_bits))))])), Assign(lvalue=Symbol(tree_index), rvalue=BinaryExpr(left=Symbol(tree_index), op=Mul(*), right=DecLiteral(2))), IfExpr(test=BinaryExpr(left=Symbol(in_bit), op=Eq(=), right=DecLiteral(1)), body=Scope([Assign(lvalue=Symbol(tree_index), rvalue=BinaryExpr(left=Symbol(tree_index), op=Add(+), right=DecLiteral(1)))]), orelse=Scope([])), IfExpr(test=BinaryExpr(left=Symbol(tree_index), op=Lt(<), right=DecLiteral(64)), body=Scope([IfExpr(test=Subscript(index=BinaryExpr(left=DecLiteral(63), op=Sub(-), right=Symbol(tree_index)), target=Symbol(tree)), body=Scope([Assign(lvalue=Symbol(found), rvalue=BinLiteral(0b1)), LeaveKeyword(leave)]), orelse=Scope([]))]), orelse=Scope([]))])), Assign(lvalue=Symbol(compressed_index), rvalue=DecLiteral(0)), ForExpr(target=Symbol(i), start=DecLiteral(0), end=DecLiteral(127), body=Scope([Assign(lvalue=Symbol(possible), rvalue=DecLiteral(1)), Assign(lvalue=Symbol(j), rvalue=Symbol(i)), WhileExpr(test=BinaryExpr(left=Symbol(j), op=Ge(>=), right=DecLiteral(4)), body=Scope([Assign(lvalue=Symbol(j), rvalue=BinaryExpr(left=Symbol(j), op=Div(/), right=DecLiteral(2))), IfExpr(test=Subscript(index=BinaryExpr(left=DecLiteral(63), op=Sub(-), right=Symbol(j)), target=Symbol(tree)), body=Scope([Assign(lvalue=Symbol(possible), rvalue=DecLiteral(0)), LeaveKeyword(leave)]), orelse=Scope([]))]), orelse=Scope([])), IfExpr(test=BinaryExpr(left=Symbol(i), op=Eq(=), right=Symbol(tree_index)), body=Scope([LeaveKeyword(leave)]), orelse=Scope([IfExpr(test=BinaryExpr(left=Symbol(i), op=Ge(>=), right=DecLiteral(64)), body=Scope([Assign(lvalue=Symbol(compressed_index), rvalue=BinaryExpr(left=Symbol(compressed_index), op=Add(+), right=Symbol(possible)))]), orelse=Scope([IfExpr(test=BinaryExpr(left=Subscript(index=BinaryExpr(left=DecLiteral(63), op=Sub(-), right=Symbol(i)), target=Symbol(tree)), op=Eq(=), right=DecLiteral(1)), body=Scope([Assign(lvalue=Symbol(compressed_index), rvalue=BinaryExpr(left=Symbol(compressed_index), op=Add(+), 
right=Symbol(possible)))]), orelse=Scope([]))]))]))])), SwitchExpr(subject=Symbol(mode), cases=Cases([Case(labels=Labels([Label(0)]), body=Scope([Assign(lvalue=RangeSubscript(start=DecLiteral(0), end=DecLiteral(63), target=GPR(RT)), rvalue=Symbol(tree_index)), IfExpr(test=UnaryExpr(op=Not(¬), value=Symbol(found)), body=Scope([Assign(lvalue=Symbol(in_bits), rvalue=Symbol(orig_in_bits)), Assign(lvalue=Symbol(ra_used), rvalue=BinLiteral(0b0))]), orelse=Scope([]))])), Case(labels=Labels([Label(1)]), body=Scope([Assign(lvalue=RangeSubscript(start=DecLiteral(0), end=DecLiteral(63), target=GPR(RT)), rvalue=Symbol(tree_index)), IfExpr(test=Symbol(hit_end), body=Scope([Assign(lvalue=Symbol(in_bits), rvalue=Symbol(orig_in_bits)), Assign(lvalue=Symbol(ra_used), rvalue=BinLiteral(0b0))]), orelse=Scope([]))])), Case(labels=Labels([Label(2)]), body=Scope([Assign(lvalue=RangeSubscript(start=DecLiteral(0), end=DecLiteral(63), target=GPR(RT)), rvalue=Symbol(compressed_index)), IfExpr(test=UnaryExpr(op=Not(¬), value=Symbol(found)), body=Scope([Assign(lvalue=Symbol(in_bits), rvalue=Symbol(orig_in_bits)), Assign(lvalue=Symbol(ra_used), rvalue=BinLiteral(0b0)), Assign(lvalue=RangeSubscript(start=DecLiteral(0), end=DecLiteral(63), target=GPR(RT)), rvalue=Symbol(tree_index))]), orelse=Scope([]))])), Case(labels=Labels([DefaultLabel(default)]), body=Scope([Assign(lvalue=RangeSubscript(start=DecLiteral(0), end=DecLiteral(63), target=GPR(RT)), rvalue=Symbol(compressed_index)), IfExpr(test=Symbol(hit_end), body=Scope([Assign(lvalue=Symbol(in_bits), rvalue=Symbol(orig_in_bits)), Assign(lvalue=Symbol(ra_used), rvalue=BinLiteral(0b0))]), orelse=Scope([]))]))])), Assign(lvalue=GPR(RS), rvalue=Symbol(in_bits)), Assign(lvalue=Symbol(CR0), rvalue=BinaryExpr(left=Symbol(ra_used), op=BitConcat(||), right=BinaryExpr(left=BinaryExpr(left=Symbol(tree_index), op=Ge(>=), right=DecLiteral(64)), op=BitConcat(||), right=BinaryExpr(left=Symbol(found), op=BitConcat(||), right=Symbol(hit_end)))))])
The code we got from the tree via visiting the nodes:

    tree[0:63] <- (RB)
    mode <- tree[62:63]
    tree[62:63] <- 0
    ra_used <- 0b0
    in_bits[0:63] <- (RC|0)
    if (in_bits = 0) then
        in_bits[0:63] <- 1
    orig_in_bits <- in_bits
    tree_index <- 1
    found <- 0b0
    hit_end <- 0b0
    for bit_length = 1 to 6
        in_bit <- in_bits[63]
        if (in_bits = 1) then
            if (ra_used | (_RA = 0)) then
                hit_end <- 0b1
                leave
            ra_used <- 0b1
            in_bit <- (RA)[63]
            in_bits <- (0b1 || (RA)[0:62])
        else
            in_bits <- (0b0 || in_bits[0:62])
        tree_index <- (tree_index × 2)
        if (in_bit = 1) then
            tree_index <- (tree_index + 1)
        if (tree_index < 64) then
            if tree[(63 - tree_index)] then
                found <- 0b1
                leave
    compressed_index <- 0
    for i = 0 to 127
        possible <- 1
        j <- i
        do while (j >= 4)
            j <- (j / 2)
            if tree[(63 - j)] then
                possible <- 0
                leave
        if (i = tree_index) then
            leave
        else
            if (i >= 64) then
                compressed_index <- (compressed_index + possible)
            else
                if (tree[(63 - i)] = 1) then
                    compressed_index <- (compressed_index + possible)
    switch(mode)
        case (0):
            (RT)[0:63] <- tree_index
            if ¬(found) then
                in_bits <- orig_in_bits
                ra_used <- 0b0
        case (1):
            (RT)[0:63] <- tree_index
            if hit_end then
                in_bits <- orig_in_bits
                ra_used <- 0b0
        case (2):
            (RT)[0:63] <- compressed_index
            if ¬(found) then
                in_bits <- orig_in_bits
                ra_used <- 0b0
                (RT)[0:63] <- tree_index
        default:
            (RT)[0:63] <- compressed_index
            if hit_end then
                in_bits <- orig_in_bits
                ra_used <- 0b0
    RS <- in_bits
    CR0 <- (ra_used || ((tree_index >= 64) || (found || hit_end)))
I had to cheat a bit to force output of RA instead of (RA) for constructs like assignments and subscripts (the default is (RA), but subscripts and assignments are different). I dropped everything related to "read_regs", "uninit_regs" and other stateful bits from the copy of the parser; that's the job of the concrete visitor, not the parser. The pseudocode visitor hooks look like this:

    @Hook(pc_ast.IfExpr)
    def IfExpr(self, node):
        yield node
        stmt = " ".join([
            "if",
            str(self[node.test]),
            "then",
        ])
        self[node].emit(stmt=stmt)
        for (level, stmt) in self[node.body]:
            self[node].emit(stmt=stmt, level=level)
        if node.orelse:
            self[node].emit("else")
            for (level, stmt) in self[node.orelse]:
                self[node].emit(stmt=stmt, level=level)

    @Hook(pc_ast.WhileExpr)
    def WhileExpr(self, node):
        yield node
        stmt = " ".join([
            "do",
            "while",
            str(self[node.test]),
        ])
        self[node].emit(stmt=stmt)
        for (level, stmt) in self[node.body]:
            self[node].emit(stmt=stmt, level=level)
        if node.orelse:
            self[node].emit("else")
            for (level, stmt) in self[node.orelse]:
                self[node].emit(stmt=stmt, level=level)

Obviously there are a lot of other hooks, not shown here for brevity.
This is how ForExpr and WhileExpr look:

    class ForExpr(Dataclass):
        target: Node
        start: Node
        end: Node
        body: Scope

    class WhileExpr(Dataclass):
        test: Node
        body: Scope
        orelse: Scope
Whenever somebody attempts to instantiate a Dataclass, all its fields are __validated__ with isinstance. The same applies to typed sequences:

    class Label(Literal):
        pass

    class DefaultLabel(Label):
        def __new__(cls):
            return super().__new__(cls, "default")

    class Labels(Sequence, typeid=Label):
        pass

    class Case(Dataclass):
        labels: Labels
        body: Scope

    class Cases(Sequence, typeid=Case):
        pass

If some weirdo ever tries to do Case(labels=(1, 2), body="ABC"), or passes illegal nodes not matching the description, an exception is raised. This helps to diagnose what went wrong when attempting to establish the result of some rule. Example:

    def p_cases(self, p):
        """
        cases : switch_list switch_default
              | switch_default
        """
        if len(p) == 3:
            p[0] = pc_ast.Cases(p[1] + (p[2],))
        else:
            p[0] = pc_ast.Cases([p[1]])

This cuts off the cases where something in switch_list is NOT an instance of Case. The Cases class takes care of it, checking everything it holds, thus ensuring that the previous rules expanded correctly.
OK, it seems we're able to parse all the code we had so far EXCEPT functions. I used the following quick-and-dirty script:

    def dedent(line):
        if line.startswith("    "):
            return line[4:].rstrip()
        return line.rstrip()

    def ignore(line):
        line = line.strip()
        if line.startswith("<!--"):
            return True
        return False

    def parse(parser, origin):
        origin = tuple(itertools.filterfalse(ignore, origin))
        tree = parser.parse(code="\n".join(origin))
        stream = io.StringIO()
        for (level, line) in pc_util.pseudocode(tree):
            print(f"{' ' * 4 * level}{line}", file=stream)
        stream.seek(0)
        target = tuple(itertools.filterfalse(ignore, stream))
        return (origin, target)

    lexer = pc_lexer.IndentLexer(debug=False)
    parser = pc_parser.Parser(lexer=lexer)
    pattern = re.compile(r"Pseudo-code:(.*?)(?:Special Registers Altered|Description):",
        re.DOTALL)
    for path in glob.glob("openpower/isa/*.mdwn"):
        with open(path, "r", encoding="UTF-8") as stream:
            data = stream.read()
        for code in pattern.findall(data):
            (stage0, stage1) = parse(parser, map(dedent, code.split("\n")))
            (stage2, stage3) = parse(parser, map(dedent, stage1))
            stage1 = tuple(map(dedent, itertools.filterfalse(ignore, stage1)))
            stage3 = tuple(map(dedent, itertools.filterfalse(ignore, stage3)))
            assert stage1 == stage2 and stage2 == stage3

As I said, I don't have time to deal with all the creepy stuff around. The definition of "creepy stuff" includes "fallthrough" keywords, pseudo tokens like "colons", the custom "indent/dedent" filter (which should've been simpler by an order of magnitude), inept rule names, etc. This has already taken way more effort and time than it deserved, and that's only to make the real AST work. :-)
As discussed in the previous comments, I'll skip the functions (def, args, varargs, etc.). This mostly means that, after a cleanup and a commit into some branch, we can proceed with checking whether the code is correct. As I'm extremely lazy with manual checks, I have the following (crazy?) idea on how to test that the parsing is correct.

1. The new code can produce a valid AST tree after parsing.
2. The new code can convert an AST tree back into the pseudocode.
3. The old code can convert a pseudocode into Python code.
4. If we do pycode(pseudocode(parse(mdwn))) and things still work, it means our AST is correct.

So the next stage would be to do exactly this. I'll clean the code a bit, commit it into a branch, and modify the OLD parser so that it fetches an AST with the NEW parser, then converts this AST into pseudocode. The new pseudocode will be used in place of the old one, substituted at runtime. Then I'll check for discrepancies in the generation and see what went wrong. I think checking the generated Python code would be OK, then CI.
(In reply to Dmitry Selyutin from comment #48)
> I had to cheat a bit to force output of RA instead of (RA) for constructs
> like assignments and subscripts (the default is (RA), but subscripts and
> assignments are different).

ok, RA refers to the bits in the opcode field, but annoyingly, when assigned (RA <- xyz), it is actually GPR[RA] that is getting assigned to. then also there is (RA|0), which is shorthand for "if RA=0 then 0 otherwise GPR[RA]". sigh :)
(In reply to Luke Kenneth Casson Leighton from comment #53)
> ok RA refers to the bits in the opcode field

...unless it's surrounded by parentheses. Damn, this syntax is so annoying!
(In reply to Dmitry Selyutin from comment #54)
> ...unless it's surrounded by parentheses. Damn, this syntax is so annoying!

it was never intended to be machine-readable. i was literally the first person to ever consider it, in 30+ years of Power ISA history.
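The three forms under discussion could be captured as small helpers. This is a hedged sketch (the struct, helper names, and context layout are invented for illustration); only the RA field position, bits 11:15 in MSB0 numbering, comes from the Power ISA:

```c
#include <stdint.h>

struct ctx { uint64_t gpr[32]; };   /* hypothetical simulator context */

/* bare RA: the 5-bit register field taken from the instruction word */
static inline uint64_t field_RA(uint32_t insn) {
    return (insn >> 16) & 0x1f;     /* MSB0 bits 11:15 of a 32-bit word */
}

/* (RA): the contents of the GPR selected by that field */
static inline uint64_t read_RA(struct ctx *c, uint32_t insn) {
    return c->gpr[field_RA(insn)];
}

/* (RA|0): zero when the field is zero, otherwise GPR[RA] */
static inline uint64_t read_RA_or_0(struct ctx *c, uint32_t insn) {
    uint64_t ra = field_RA(insn);
    return ra ? c->gpr[ra] : 0;
}
```

The distinction matters exactly for instructions like addi, where RA=0 means "literal zero" rather than "GPR 0".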
(In reply to Dmitry Selyutin from comment #52)
> As I'm extremely lazy with the manual checks, I have the following (crazy?)
> idea on how to test that the parsing is correct.
> 1. The new code can produce a valid AST tree after parsing.
> 2. The new code can convert an AST tree back into the pseudocode.
> 3. The old code can convert a pseudocode into a Python code.
> 4. If we do pycode(pseudocode(parse(mdwn))) and things still work, it means
> our AST is correct.

Apart from bug #1247, the only difference we have so far after conversion is this:

     @inject()
     def op_setbc(self, CR, RT):
    -    RT = copy_assign_rhs(1 if eq(CR[BI + 32], 1) else 0)
    +    if eq(CR[BI + 32], 1):
    +        RT = copy_assign_rhs(1)
    +    else:
    +        RT = copy_assign_rhs(0)
         return (CR, RT,)

There are several cases of this behavior (3 in total). It could be "fixed" by introducing a custom tree => pseudocode conversion, but I don't consider this to be a real issue. Moreover, I think the updated version is cleaner and maps better to language-agnostic code generation. Any objections?
(In reply to Dmitry Selyutin from comment #56)
> Apart from bug #1247, the only difference we have so far after conversion is
> this:
>
>      @inject()
>      def op_setbc(self, CR, RT):
>     -    RT = copy_assign_rhs(1 if eq(CR[BI + 32], 1) else 0)
>     +    if eq(CR[BI + 32], 1):
>     +        RT = copy_assign_rhs(1)
>     +    else:
>     +        RT = copy_assign_rhs(0)
>          return (CR, RT,)

looks fine to me, though this makes me suspect that you might be mishandling the ? : operator... where is the code you've been committing? I don't see anything recently updated on git.libre-soc.org; maybe it's in a repo that luke forgot to set public?
(In reply to Jacob Lifshay from comment #57)
> updated on git.libre-soc.org, maybe it's in a repo that luke forgot to set
> public?

there are no private repositories except for gitolite admin and there never will be.
(In reply to Luke Kenneth Casson Leighton from comment #58)
> (In reply to Jacob Lifshay from comment #57)
> > updated on git.libre-soc.org, maybe it's in a repo that luke forgot to set
> > public?
>
> there are no private repositories except for gitolite admin and
> there never will be.

ok, i was mentioning that because iirc you've forgotten before. so, in that case, it looks like ghostmansd didn't git push, or is changing committed dates, or gitweb is misbehaving...
I haven't pushed the code yet, I'm operating on a local copy. I'll push once I ensure things work.
(In reply to Dmitry Selyutin from comment #60)
> I haven't pushed the code yet, I'm operating on a local copy. I'll push once
> I ensure things work.

our standard operating procedure is generally to push whatever WIP code we have when we are done for the day, or whenever we make commits. if it doesn't work, push to a separate branch, or at least ensure all other tests still work (though I prefer that people avoid pushing lots of new broken tests to branches others are using, since it clogs up CI, which is configured to only report the first 10 broken tests).
(In reply to Jacob Lifshay from comment #61)
> (In reply to Dmitry Selyutin from comment #60)
> > I haven't pushed the code yet, I'm operating on a local copy. I'll push
> > once I ensure things work.
>
> our standard operating procedure is generally to push whatever WIP code

pushing WIP code makes it much easier for others to spot bugs and/or understand how it works; it can also avoid the case where someone spends a lot of effort on a flawed design, and the flaw only becomes known once they complete the design and push their code.
(In reply to Jacob Lifshay from comment #61)
> (In reply to Dmitry Selyutin from comment #60)
> > I haven't pushed the code yet, I'm operating on a local copy. I'll push
> > once I ensure things work.
>
> our standard operating procedure is generally to push whatever WIP code we
> have when we are done for the day

I'm fine with this; do so if it doesn't break anything around.
(In reply to Dmitry Selyutin from comment #63)
> I'm fine with this, do so if it doesn't break anything around.

Thanks! this also makes it much easier for someone to take over if something happens and the original person can't work on it (e.g. their house burns down or their computer is stolen)
Pushed two commits (plus a change in walker):

https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;h=3b3b2189ed804ed311cdff3ba60c642bae663721
https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;h=6f6c363534b482b69e66a89ba0c53353aca5acff
https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;h=b1d7d93e54dd5698628eafd8012e243593122671

The last commit hacks pywriter so that it does the following: 1) converts the code to a tree; 2) converts the tree back into pseudocode. This is a temporary measure: I want to ensure that the code is converted successfully. I already compared artefacts from src/openpower/decoder/isa/generated and found no changes except for ForExpr and bug #1247. Once I see that this works not only for a local run but also for CI, I can drop that last commit (though the conversion doesn't cost much time, so we might keep it).
As for commits and pushes: I will commit and push as _I_ find it convenient. Including any number of interactive rebases, fixups, even force pushes to branches, even branch removals. Feel free to impose any further restrictions on your own modus operandi, though.
CI doesn't appear to be active (the last test ran a week ago). Plus, I don't see the oppc branch there. Jacob, could you check it, please?
(In reply to Dmitry Selyutin from comment #67)
> CI doesn't appear to be active (the last test was a week ago). Plus I don't
> see oppc branch there. Jacob, could you check it, please?

it should be active, however luke required i only mirror the git repos once every 24hr... so I'm going to have to manually git push again (luke, this is a problem that needs fixing)
(In reply to Jacob Lifshay from comment #68)
> luke required i only mirror the git repos once
> every 24hr...

Luke, any rationale on that?
(In reply to Dmitry Selyutin from comment #69)
> (In reply to Jacob Lifshay from comment #68)
> > luke required i only mirror the git repos once
> > every 24hr...
>
> Luke, any rationale on that?

for what luke said originally, see:
https://bugs.libre-soc.org/show_bug.cgi?id=1162

imo it is reasonable to run the git mirror every 15min; I calculated it would cost substantially less than 1 GBP per year, based on the data transferred for one no-op mirror of all the repos.
(In reply to Jacob Lifshay from comment #70)
> imo it is reasonable to run git mirror every 15min, I calculated it would
> cost substantially less than 1 GBP per year based on the data transferred
> for one no-op mirror of all the repos.

If so, can we subtract 1 GBP from any of the tasks we have around and reserve it for this purpose?
Many updates, too many to enumerate; briefly:
1) simplified the ternary condition operator pseudocode;
2) simplified special words parsing (GPR, FPR, CR, reserved, etc.);
3) minor hierarchy refactoring for the future C code generation.
With the local code I have now, we're already able to convert this pseudocode...

    src <- [0]*64
    src[64-XLEN:63] <- (RS)
    result <- [0]*64
    do i = 0 to 1
        n <- i * 32
        result[n+0:n+7] <- 0
        result[n+8:n+19] <- DPD_TO_BCD(src[n+12:n+21])
        result[n+20:n+31] <- DPD_TO_BCD(src[n+22:n+31])
    RA <- result[64-XLEN:63]

...into declarations:

    void
    oppc_cdtbcd(void) {
        uint64_t src;
        uint64_t result;
        uint64_t i;
        uint64_t n;
    }

I guess all such functions will take some context argument. Notice that "special" words and registers are not in the declarations (because they shouldn't be).
(In reply to Dmitry Selyutin from comment #66)
> As for commits and pushes. I will commit and push as _I_ find it convenient.

Dmitry, the project's requirements, as per the full Transparency and Audit Conditions, are to push immediately. please ensure that you use the server as if it were your local machine. the absolute maximum amount of time that work should be local-only is about 18 hours, and that is seriously on the edge; normally it should be well under 2 minutes between commit and push. Jacob explained it well. However, he did not mention that we have *actually* had people waste significant time by not publishing work, nor that NLnet got Audited by the EU a few months ago by an extremely technically-knowledgeable Auditor. we absolutely have to be watertight "above board" here. bottom line: no private local-only code. sorry.
(In reply to Dmitry Selyutin from comment #72)
> Many updates, too many to enumerate, briefly:
> 1) simplified the ternary condition operator pseudocode;
> 2) simplified special words parsing (GPR, FPR, CR, reserved, etc.);
> 3) minor hierarchy refactoring for the future C code generation.

fantastic. any joy on a python-cffi importer (if nothing else, to do unit tests) or is that just too much for the $?

(In reply to Dmitry Selyutin from comment #73)
> With a local code I have now,

(please do push it immediately to fulfil the Project Coding Standards and to avoid Audit and Transparency "red flags"; thank you for understanding and patience here, Dmitry)

> we're already able to convert this
> pseudocode...
>
>     src <- [0]*64
>     src[64-XLEN:63] <- (RS)
>     result <- [0]*64
>     do i = 0 to 1
>         n <- i * 32
>         result[n+0:n+7] <- 0
>         result[n+8:n+19] <- DPD_TO_BCD(src[n+12:n+21])
>         result[n+20:n+31] <- DPD_TO_BCD(src[n+22:n+31])
>     RA <- result[64-XLEN:63]

fantastic. calling functions, dealing with loops, local variables: nice.

> ...into declarations:
>
>     void
>     oppc_cdtbcd(void) {
>         uint64_t src;
>         uint64_t result;
>         uint64_t i;
>         uint64_t n;
>     }

one thing: all return results need a boolean "ok" flag which is set on write. so

    RA <- result[64-XLEN:63]

outputs:

    RA->value = result...; RA->ok = true;

or:

    RA = result...; ok_flags->RA = true;

or:

    #define RA_result_index 0
    #define CR0_result_index 1
    ok_flags |= 1 << RA_result_index;

(a separate ok bitfield or something). this actually reflects the hardware design, so is not surprising, but it turns out to be absolutely essential in the FP routines for flags, and in a few other obscure areas. basically, *some* pseudocode *optionally* writes to registers, based on *runtime* arbitrary decisions, not static deterministic ones based on known information going into the routine.
I'm extremely limited in time, and I'm spending it wisely. I commit when I consider it the right time to do so, w/o half-baked code or strange constraints like "commit every 18 hours". I do this in my free time, and it's up to me how often I commit and in how many commits. Luke, w/o further ado, I'm passing this to David and James.
And, obviously, to Andrey. Andrey, please solve it. It's 1.5 months till the deadline. This situation must be resolved fast; time is precious.
At this point, I'm stopping any further comments here until the situation is resolved, to avoid any escalation or unconstructive outcome. I seriously don't have time for spokes in the wheels. The mere fact that we started this discussion shows a severe underestimation of the task, of the work already done, and of the amount of work still to be done. Luke, I'm sorry, but I have no other choice.
while reading through the code, i noticed you wrote:

    import openpower.oppc.pc_util as pc_util

if you like, you can write that as:

    from openpower.oppc import pc_util
According to the recent discussion we had on the mailing list, I consider this case closed and can move on. :-) Thanks, chaps, for handling this!
(In reply to Dmitry Selyutin from comment #80)
> According to the recent discussion we had at mailing list, I consider this
> case closed and can move on. :-) Thanks chaps for handling this!

for future reference, that discussion is at:
https://lists.libre-soc.org/pipermail/libre-soc-dev/2024-January/005912.html
Many recent commits on the oppc branch (too many to list separately) bring support for more expressions and statements. With these, the following pseudocode...

    src <- [0]*64
    src[64-XLEN:63] <- (RS)
    result <- [0]*64
    do i = 0 to 1
        n <- i * 32
        result[n+0:n+7] <- 0
        result[n+8:n+19] <- DPD_TO_BCD(src[n+12:n+21])
        result[n+20:n+31] <- DPD_TO_BCD(src[n+22:n+31])
    RA <- result[64-XLEN:63]
    RA <- (C ? A : B)
    if a < b then
        RT <- (RA)
    else
        RT <- (RB)

...gets converted to this C code:

    void
    oppc_cdtbcd(void) {
        uint64_t src;
        uint64_t result;
        uint64_t i;
        uint64_t n;
        uint64_t C;
        uint64_t A;
        uint64_t B;
        uint64_t a;
        uint64_t b;

        src = oppc_repeat(UINT64_C(0x0), UINT64_C(0x40));
        oppc_range_subscript_assign(&src,
            (UINT64_C(0x40) - ctx->XLEN), UINT64_C(0x3f),
            ctx->gpr[OPPC_GPR_RS]);
        result = oppc_repeat(UINT64_C(0x0), UINT64_C(0x40));
        for (i = UINT64_C(0x0); i != (UINT64_C(0x1) + 1); ++i) {
            n = (i * UINT64_C(0x20));
            oppc_range_subscript_assign(&result,
                (n + UINT64_C(0x0)), (n + UINT64_C(0x7)),
                UINT64_C(0x0));
            oppc_range_subscript_assign(&result,
                (n + UINT64_C(0x8)), (n + UINT64_C(0x13)),
                DPD_TO_BCD(oppc_range_subscript(src,
                    (n + UINT64_C(0xc)), (n + UINT64_C(0x15)))));
            oppc_range_subscript_assign(&result,
                (n + UINT64_C(0x14)), (n + UINT64_C(0x1f)),
                DPD_TO_BCD(oppc_range_subscript(src,
                    (n + UINT64_C(0x16)), (n + UINT64_C(0x1f)))));
        }
        ctx->gpr[OPPC_GPR_RA] = oppc_range_subscript(result,
            (UINT64_C(0x40) - ctx->XLEN), UINT64_C(0x3f));
        ctx->gpr[OPPC_GPR_RA] = C ? A : B;
        if (a < b) {
            ctx->gpr[OPPC_GPR_RT] = ctx->gpr[OPPC_GPR_RA];
        } else {
            ctx->gpr[OPPC_GPR_RT] = ctx->gpr[OPPC_GPR_RB];
        }
    }

I haven't yet touched the function signature to make it return bool, or expanded what ctx is; I'd like to deal with the nodes first. But I think this should be sufficient to get the overall idea.
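The generated code relies on helpers such as oppc_range_subscript whose bodies are not shown; as a sketch of what an MSB0 range read over a plain 64-bit value might look like (an assumption, not the actual library code):

```c
#include <stdint.h>

/* Read bits [start:end] (inclusive) of v using MSB0 numbering, where
 * bit 0 is the most significant bit, as in the Power ISA pseudocode. */
static uint64_t oppc_range_subscript(uint64_t v, uint64_t start, uint64_t end) {
    uint64_t width = end - start + 1;   /* the range is inclusive */
    uint64_t shift = 63 - end;          /* distance of 'end' from the LSB */
    uint64_t mask = (width >= 64) ? ~UINT64_C(0)
                                  : ((UINT64_C(1) << width) - 1);
    return (v >> shift) & mask;
}
```

For example, bits [56:63] are the least-significant byte, and bit [0:0] is the sign bit of a 64-bit value.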
(In reply to Dmitry Selyutin from comment #82)

code looks nice!

> uint64_t result;

eventually we'll need to support things with >64 bits, but this should be mostly good for now.

> for (i = UINT64_C(0x0); i != (UINT64_C(0x1) + 1); ++i) {

I would write that as i <= UINT64_C(0x1), because that way you don't need the + 1, and because there may be for loops that need to loop zero times, e.g. they start at 5 and end at 3, tho I don't remember any particular examples.
(In reply to Jacob Lifshay from comment #83)
> code looks nice!

Thanks! It's funny, though, that the process is mostly the same as for the pseudocode; it's just that the representation is different. Now I know for sure that the idea with the custom tree format was correct, and that I did the right thing when I decided to check with the pseudocode first.

> eventually we'll need to support things with >64 bits, but this should be
> mostly good for now.

Hm. Perhaps all the operations should be calls then. Plus, after you wrote this, I recalled that we should probably also keep the number of bits an integer has... So probably all of these should deal with some kind of structure.

> I would write that i <= UINT64_C(0x1) because that way you don't need the +
> 1 and because there may be for loops that need to loop zero times e.g. they
> start at 5 and end at 3, tho I don't remember any particular examples.

Nice catch, I hadn't thought of that. I tend to use "not equal" automatically, but here, where we don't fully control the code, we should be extra careful.
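The point about loop bounds can be shown concretely: with an inclusive <= test, "for i = start to end" runs zero times when start > end, whereas i != end + 1 would spin until wraparound. A minimal illustration:

```c
#include <stdint.h>

/* Count how many times "for i = start to end" iterates when lowered
 * with an inclusive <= bound, mirroring the pseudocode's semantics. */
static uint64_t count_iterations(uint64_t start, uint64_t end) {
    uint64_t n = 0;
    for (uint64_t i = start; i <= end; ++i)
        ++n;
    return n;
}
```

One caveat worth remembering: if end is the maximum uint64_t value, i <= end is always true, so a fully general lowering would need an extra guard for that corner case.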
(In reply to Dmitry Selyutin from comment #84)
> Hm. Perhaps all the operations should be calls then.

this is absolutely essential, as a c++ template/wrapper allows for operator-overloading to provide SV semantics. this was the technique i used almost 5 years ago to do a full SV implementation in spike, in something mad like 6-8 weeks flat.
A huge update. Many operations are now supported. From now on, every integer should be represented by a structure which will look like this:

    struct oppc_int {
        uint64_t bits;
        uint64_t value[];
    };

The functions would then deal with pointers to such structures:

    struct oppc_int *
    oppc_add(struct oppc_int *rv,
             struct oppc_int const *lhs,
             struct oppc_int const *rhs);

    struct oppc_int *
    oppc_assign(struct oppc_int *dst, struct oppc_int const *src);

    bool
    oppc_lt(struct oppc_int const *lhs, struct oppc_int const *rhs);

This is needed mostly for stuff like concatenation to work on binary and hexadecimal literals, plus to make call chains simpler. Decimals and registers have bits equal to XLEN (which should be part of the context). Plus, from now on, everything is a call. I obviously had to cheat a bit and use compound literals. The code is too big to post here; you can check what's generated from __main__ via `python3 -m openpower.oppc`. I'll post just minor excerpts below for the record.

To clarify about function calls and the integer type: I have no intent to provide completely __working__ C code in the scope of this task. That would require implementing an analogue of SelectableInt, inventing a context with all the registers and states, etc. It will never fit the time constraints, and certainly doesn't fit the budget. The idea at this stage is:

1. Emit code which is valid C.
2. Check whether it's buildable, with some stubs instead of functions and context.

This is it. This is just a task "take pseudocode and emit C code", not "make a full-blown decoder and simulator". Any objections?
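One C detail worth flagging with this layout: a flexible array member must be the last field, sizeof excludes it, and standard C does not allow initializing it in a compound literal, so heap allocation with the element count folded in is the usual pattern. A hypothetical constructor (the name and allocation strategy are assumptions, not the project's decided design):

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct oppc_int {
    uint64_t bits;       /* width of the value in bits */
    uint64_t value[];    /* flexible array member: must be last */
};

/* Allocate an oppc_int wide enough to hold `bits` bits, zeroed. */
static struct oppc_int *oppc_int_new(uint64_t bits) {
    uint64_t words = (bits + 63) / 64;   /* round up to 64-bit words */
    struct oppc_int *v = malloc(sizeof *v + words * sizeof(uint64_t));
    if (v != NULL) {
        v->bits = bits;
        memset(v->value, 0, words * sizeof(uint64_t));
    }
    return v;
}
```

The transient compound literals shown in the excerpts below would only work with a fixed-size array field; with a true flexible array member they would have to go through a helper like this instead.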
This pseudocode...

    a[0:XLEN/2] <- [0]*(XLEN/2)
    do i = 0 to 1
        a[i] <- i

...gets converted to this (again, ignore the lack of ctx and the "leave" section which updates its state):

    void
    oppc_cocojamboo(void) {
        struct oppc_int a;
        struct oppc_int i;

        /* a[0:(XLEN / 2)] <- ([0] * (XLEN / 2)) */
        oppc_range_subscript_assign(
            &a,
            /* 0 */
            &(struct oppc_int){
                .count = ctx->XLEN,
                .array = {0},
            },
            /* (XLEN / 2) */
            oppc_div(
                /* transient */
                &(struct oppc_int){
                    .count = ctx->XLEN,
                    .array = {},
                },
                /* XLEN */
                &(struct oppc_int){
                    .count = ctx->XLEN,
                    .array = {ctx->XLEN},
                },
                /* 2 */
                &(struct oppc_int){
                    .count = ctx->XLEN,
                    .array = {2},
                }
            ),
            /* ([0] * (XLEN / 2)) */
            oppc_repeat(
                /* transient */
                &(struct oppc_int){
                    .count = ctx->XLEN,
                    .array = {},
                },
                /* 0 */
                &(struct oppc_int){
                    .count = ctx->XLEN,
                    .array = {0},
                },
                /* (XLEN / 2) */
                oppc_div(
                    /* transient */
                    &(struct oppc_int){
                        .count = ctx->XLEN,
                        .array = {},
                    },
                    /* XLEN */
                    &(struct oppc_int){
                        .count = ctx->XLEN,
                        .array = {ctx->XLEN},
                    },
                    /* 2 */
                    &(struct oppc_int){
                        .count = ctx->XLEN,
                        .array = {2},
                    }
                )
            )
        );
        /* for i = 0 to 1 */
        for (
            /* i <- 0 */
            oppc_assign(
                &i,
                /* 0 */
                &(struct oppc_int){
                    .count = ctx->XLEN,
                    .array = {0},
                }
            );
            /* (i <= 1) */
            oppc_le(
                &i,
                /* 1 */
                &(struct oppc_int){
                    .count = ctx->XLEN,
                    .array = {1},
                }
            );
            /* i <- (i + 1) */
            oppc_assign(
                &i,
                /* (i + 1) */
                oppc_add(
                    /* transient */
                    &(struct oppc_int){
                        .count = ctx->XLEN,
                        .array = {},
                    },
                    &i,
                    /* 1 */
                    &(struct oppc_int){
                        .count = ctx->XLEN,
                        .array = {1},
                    }
                )
            )
            /* for i = 0 to 1 */
        ) {
            /* a[i] <- i */
            oppc_subscript_assign(
                &a,
                &i,
                &i
            );
        }
    }
A minor remark on for: I basically had to create different nodes for this to work, since the ForExpr obtained from the parser looks like this:

    class ForExpr(Dataclass):
        subject: Node
        start: Node
        end: Node
        body: Scope

We, however, create two AssignExprs and one BinaryExpr instead. An alternative would be to output a C-like ForExpr from the parser, but that would be creepy: I'd like to keep the produced AST as close to the origins as possible.
Also, yes, I know that this `oppc_int` is a pile of crap. :-) If it's an MSB0 bit vector, this should've been initialized differently. As I mentioned, there's no room for bringing a full-blown selectable int equivalent into C, especially if it's bigger than 64 bits.
(In reply to Dmitry Selyutin from comment #86)
> A huge update. Many operations are now supported. From now on, every integer
> should be represented by a structure which will look like this:
>
> struct oppc_int {
>     uint64_t bits;
>     uint64_t value[];
> };

and an "ok" flag. this is absolutely essential.

struct oppc_int {
    uint64_t bits;
    bool ok; /* set when value is modified; must come before value[],
                since a flexible array member has to be the last field */
    uint64_t value[];
};

> To clarify about function calls and integer type. I have no intent to
> provide a completely __working__ C code in scope of this task. This would
> require to implement an analogue of SelectableInt, invent context with all
> the registers and states, etc. etc. It will never fit the time constraints,
> and certainly already doesn't fit the budget.

yyep, not a surprise at all. it took me 6-8 weeks to do SV for spike.

> The idea at this stage is:
> 1. Emit the code which is valid C code.
> 2. Check whether it's buildable with some stubs instead of functions and
> context.
>
> This is it. This is just a task "take pseudocode and emit C code", not "make
> a fully-blown decoder and simulator". Any objections?

none here as long as the data structure includes that "ok" flag, is initially clear, and is set by any oppc_* operation.

basically if *ALL* the functions oppc_add oppc_assign oppc_le set the ok flag on their return result then the desired behaviour is achieved with NO NEED for the compiler to even know that the ok flag exists.

do NOT attempt to get the compiler to emit code that sets the ok flag. let the oppc_{arithmetic*} functions handle it.
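A minimal sketch of the convention described above, assuming a simplified fixed-width layout (the real struct uses a flexible array member; field and function names here are illustrative only): the "ok" flag is set inside the oppc_* helper itself, so generated code never has to know the flag exists.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* simplified stand-in for the real struct: single 64-bit payload */
struct oppc_int {
    uint64_t bits;   /* width in bits */
    uint64_t value;  /* payload (<= 64 bits in this sketch) */
    bool ok;         /* set whenever the value is (re)written */
};

/* every oppc_* operation marks its destination as modified */
static struct oppc_int *oppc_add(struct oppc_int *rv,
                                 struct oppc_int const *lhs,
                                 struct oppc_int const *rhs)
{
    rv->bits = lhs->bits;
    rv->value = lhs->value + rhs->value;
    if (rv->bits < 64)  /* truncate to the declared width */
        rv->value &= (UINT64_C(1) << rv->bits) - 1;
    rv->ok = true;  /* set here, never by compiler-emitted code */
    return rv;      /* returned pointer enables call chaining */
}
```

Because the flag is maintained entirely inside the helpers, the code generator only ever emits nested calls; the caller inspects `result->ok` afterwards to decide whether a writeback happened.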
(In reply to Dmitry Selyutin from comment #87)
> /* a[0:(XLEN / 2)] <- ([0] * (XLEN / 2)) */
> oppc_range_subscript_assign(
>     &a,
>     /* 0 */
>     &(struct oppc_int){
>         .count = ctx->XLEN,
>         .array = {0},
>     },

blech! :) it's actually understandable / readable.

implementation:

oppc_range_subscript_assign(oppc_int *a, ...)
{
    a->ok = true;
}

and then the caller can check the result->ok flag and if true write to the regfile.

do you have time to write one function (oppc_add) or more as a demo / unit test?
(In reply to Dmitry Selyutin from comment #87)
> oppc_add(
>     /* transient */
>     &(struct oppc_int){
>         .count = ctx->XLEN,
>         .array = {},
>     },

this needs one element in array -- because array is not fixed-length, you can't just use {}, since that'll give a zero-length array.
(In reply to Luke Kenneth Casson Leighton from comment #90) > do NOT attempt to get the compiler to emit code that sets the ok flag. > > let the oppc_{arithmetic*} functions handle it. all assignments need to set ok too, in the simulator copy_assign_rhs does that by calling SelectableInt's constructor: https://git.libre-soc.org/?p=openpower-isa.git;a=blob;f=src/openpower/decoder/helpers.py;h=918a409eaf8e57762d7c67f692e6bd16d75dccba;hb=97f603213814159f6da349735f7a2a07d72c42cf#l78 so, for now, you likely only need to ensure anything that writes to an oppc_int is a function call, where ok = true can be put. later, not as part of this bug, we will need to support more than just oppc_int, e.g. the bfp_* code uses a rational number as the mantissa (SelectableMSB0Fraction, which likely needs more than 64 bits in both numerator and denominator), and a BFPState struct.
(In reply to Jacob Lifshay from comment #93)
> all assignments need to set ok too,

yes.

> in the simulator copy_assign_rhs does
> that by calling SelectableInt's constructor:
> https://git.libre-soc.org/?p=openpower-isa.git;a=blob;f=src/openpower/
> decoder/helpers.py;h=918a409eaf8e57762d7c67f692e6bd16d75dccba;
> hb=97f603213814159f6da349735f7a2a07d72c42cf#l78

thanks for finding that.

> so, for now, you likely only need to ensure anything that writes to an
> oppc_int is a function call, where ok = true can be put.

basically.

> later, not as part of this bug, we will need to support more than just
> oppc_int, e.g. the bfp_* code uses a rational number as the mantissa
> (SelectableMSB0Fraction, which likely needs more than 64 bits in both
> numerator and denominator), and a BFPState struct.

typecast and operator overloads provide globals and other state.

c++ wrapper #includes *the c code*, (#include "autogenerated_compiler_output.c" - this is a trick from spike) and the end result is total redirection even of ctx.

dmitry note from the above that global variables in the c code (such as XLEN) are *perfectly fine* because when "wrapped" in a "using namespace" c++ context, the exact same autogenerated c code gets redirected.

trying that with ctx->XLEN is much harder to do in a c++ template/wrapper whereas just XLEN (in the autogenerated output) can be "#defined to ctx->THE_REAL_XLEN" for plain c, and made a member of the #including wrapper/template class in c++
(In reply to Luke Kenneth Casson Leighton from comment #94)
> (In reply to Jacob Lifshay from comment #93)
>
> > all assignments need to set ok too,
>
> yes.
>
> > in the simulator copy_assign_rhs does
> > that by calling SelectableInt's constructor:
> > https://git.libre-soc.org/?p=openpower-isa.git;a=blob;f=src/openpower/
> > decoder/helpers.py;h=918a409eaf8e57762d7c67f692e6bd16d75dccba;
> > hb=97f603213814159f6da349735f7a2a07d72c42cf#l78
>
> thanks for finding that.
>
> > so, for now, you likely only need to ensure anything that writes to an
> > oppc_int is a function call, where ok = true can be put.
>
> basically.

Folks, could you please elaborate why it's done on a generic integer class? I've been thinking rather of this approach:

struct oppc_reg {
    struct oppc_int value;
    uint64_t flags;
};

...where assignments to registers (and similar stuff) have the flags updated. What's the point of having ok or flags in each and every integer? We have full control over what entities we work on, and can detect whether pseudocode writes into GPR or FPR or whatever.

> > later, not as part of this bug, we will need to support more than just
> > oppc_int, e.g. the bfp_* code uses a rational number as the mantissa
> > (SelectableMSB0Fraction, which likely needs more than 64 bits in both
> > numerator and denominator), and a BFPState struct.
>
> typecast and operator overloads provide globals and other state.
>
> c++ wrapper #includes *the c code*,
> (#include "autogenerated_compiler_output.c" - this is a trick from spike)
> and the end result is total redirection even of ctx.
>
> dmitry note from the above that global variables in the c code
> (such as XLEN) are *perfectly fine* because when "wrapped" in
> a "using namespace" c++ context on the exact same autogenerated
> c code gets redirected.
>
> trying that with ctx->XLEN is much harder to do in a c++ template/wrapper
> whereas just XLEN (in the autogenerated output) can be
> "#defined to ctx->THE_REAL_XLEN" for plain c, and made a
> member of the #including wrapper/template class in c++

Even omitting issues related to global context in general, this is a detail of a particular implementation; the low-level architecture should be kept out of such details. I don't want the implementation to be tied to a particular language; at some point we might opt to reuse this generated code not only from C++. We might have wrappers which fetch/store parts of the state, though, in order to keep the context opaque.
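The "opaque context" idea mentioned above could be sketched roughly as follows. This is a hedged illustration only: every name here (opsim_ctx, opsim_gpr_fetch, opsim_gpr_store) is hypothetical, and in a real build the struct definition would live on the embedder's side, leaving the generated code with only an incomplete type and the accessor prototypes.

```c
#include <stdint.h>

/* In the real layout the generated code would see only
 *     struct opsim_ctx;
 * as an incomplete type; the struct is defined here purely so
 * this sketch is self-contained and runnable. */
struct opsim_ctx {
    uint64_t gpr[32];
    uint64_t xlen;
};

/* accessors implemented by the embedder (C, C++, or anything
 * else with a C ABI); generated code calls only these */
static uint64_t opsim_gpr_fetch(struct opsim_ctx *ctx, unsigned regno)
{
    return ctx->gpr[regno];
}

static void opsim_gpr_store(struct opsim_ctx *ctx, unsigned regno,
                            uint64_t value)
{
    ctx->gpr[regno] = value;
}
```

Because all state traffic goes through calls, the same generated code can later be bound to a different host (a C++ wrapper, a test harness, etc.) without regeneration.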
Luke, could you please drop this?

warning, file not .mdwn, skipping lrsc.txt

Please don't store non-mdwn files there, or use glob instead so that only mdwn files are captured.
(In reply to Dmitry Selyutin from comment #95)
> Folks, could you, please, elaborate, why it's done on a generic integer
> class? I've been thinking rather of this approach:
>
> struct oppc_reg {
>     struct oppc_int value;
>     uint64_t flags;
> };
>
> ...where assignments to registers (and similar stuff) have the flags
> updated.

Like this:

if (a = b) then
    RA <- 0b0
else
    X <- 0b0

void oppc_cocojamboo(void)
{
    struct oppc_int a;
    struct oppc_int b;
    struct oppc_int X;

    if (
        /* (a = b) */
        oppc_eq(&a, &b)
    ) {
        /* RA <- 0b0 */
        oppc_assign(
            /* (RA) */
            &ctx->gpr[OPPC_GPR_RA]->value,
            /* 0b0 */
            &(struct oppc_int){
                .count = 1,
                .array = {0x0},
            }
        );
        ctx->gpr[OPPC_GPR_RA]->flags |= OPPC_GPR_UPDATED;
    } else {
        /* X <- 0b0 */
        oppc_assign(
            &X,
            /* 0b0 */
            &(struct oppc_int){
                .count = 1,
                .array = {0x0},
            }
        );
    }
}
(In reply to Dmitry Selyutin from comment #95)
> What's the point of having ok or flags in each and every integer?

mostly because that's how we did it in python...yeah I can see your point why we shouldn't do it that way.

how about:

#include <assert.h>
#include <stdbool.h> // so we can use bool regardless of C or C++
#include <stdint.h>

typedef enum oppc_type_tag {
    oppc_tt_sel_int,
    // add more stuff later
} oppc_type_tag;

typedef struct oppc_sel_int {
    uint32_t bits;
    union {
        uint64_t small; // for bits <= 64
        // add another field later for > 64 bits
    };
} oppc_sel_int;

typedef struct oppc_value {
    oppc_type_tag type : 4;
    bool ok : 1;
    int : 0; // terminate bitfield
    union {
        oppc_sel_int si;
        // add more stuff later
    };
} oppc_value;

static inline oppc_value oppc_si_small(uint64_t value, uint32_t bits)
{
    assert(bits <= 64);
    if (bits < 64)
        // note ULL is guaranteed to be >= 64-bits so this works.
        value &= (1ULL << bits) - 1ULL;
    return (oppc_value){
        .type = oppc_tt_sel_int,
        .ok = true,
        .si = {.bits = bits, .small = value},
    };
}

static inline oppc_value oppc_add(oppc_value a, oppc_value b)
{
    assert(a.type == oppc_tt_sel_int);
    assert(b.type == oppc_tt_sel_int);
    assert(a.si.bits == b.si.bits);
    assert(a.si.bits <= 64);
    return oppc_si_small(a.si.small + b.si.small, a.si.bits);
}
(In reply to Jacob Lifshay from comment #98) > typedef enum oppc_type_tag { > oppc_tt_sel_int, > // add more stuff later all of the "add more stuff later" notes are for things out of scope of this bug (unless necessary)
(In reply to Dmitry Selyutin from comment #95) > ...where assignments to registers (and similar stuff) have the flags > updated. What's the point of having ok i explained already, jacob can you explain it, it matches the hardware. but briefly so that time is not wasted. > Even omitting issues related to global context in general, this is a detail > of a particular implementation, the low-level architecture should be kept > out of these details. I don't want the implementation to be tied to a > particular language; that's unrealistic (i haven't time to explain, so sorry) and very much out of scope > at some point we might opt to reuse this generated code > not only from C++. well beyond scope and not appropriate for discussion right now due to time pressure. > We might have wrappers which fetch/store parts of the > state though in order to keep the context opaque. indeed. that's exactly how i managed to turn a scalar RISC-V simulator, spike, into a FULL Simple-V Vectorised simulator, in only 6-8 weeks. but the wrappers were done by #including a template around each scalar instruction, and all of the (equivalent of) oppc_* were class/template members of that class. no ctx pointer. no XLEN member variable in ctx. all done by c++ overloads. please, really, don't add extra programming languages to this code, even as a planned future consideration.
(In reply to Luke Kenneth Casson Leighton from comment #100)
> (In reply to Dmitry Selyutin from comment #95)
>
> > ...where assignments to registers (and similar stuff) have the flags
> > updated. What's the point of having ok
>
> i explained already, jacob can you explain it, it matches the hardware.
> but briefly so that time is not wasted.

ok is inspired by the pipeline/data classes for soc.git: simplified:

class Foo:
    data: Signal # n-bit value, or some other data type like Record
    ok: Signal # 1-bit value, true when `data` is valid, false when no data is supplied

since we didn't want to have to rewrite large portions of the simulator, we decided to add an `ok` field directly to SelectableInt, which tells the simulator which outputs were actually written (when ok=True) and which should retain their old values (when ok=False, indicating that output wasn't written).

knowing when outputs aren't written is necessary for a lot of FP operations that intentionally don't write to FRT if they'll trap, so the trap handler can see the original input values if FRT is the same register as an input.

since the C code generator doesn't have to worry about problems caused by rewriting a bunch of the output code, we can just use ok without having to mash it into SelectableInt, like in comment #98
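The retain-on-ok-false behaviour described above could be sketched like this on the consumer side (names and the simplified value layout are hypothetical): the writeback step commits only the outputs whose ok flag was set, so a trapping FP operation that never wrote FRT leaves the old register value visible to the trap handler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* simplified stand-in: single 64-bit payload plus the ok flag */
struct oppc_value {
    uint64_t value;
    bool ok;
};

/* commit an instruction output to the register file only if the
 * instruction body actually wrote it */
static void writeback(uint64_t *regfile, unsigned regno,
                      struct oppc_value const *out)
{
    if (out->ok)
        regfile[regno] = out->value;
    /* ok == false: retain the old value, e.g. FRT on a trapping FP op */
}
```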
(In reply to Jacob Lifshay from comment #101) > since the C code generator doesn't have to worry about problems caused by > rewriting a bunch of the output code, we can just use ok without having to > mash it into SelectableInt, like in comment #98 if that works well and is dead easy to do i don't have a problem with it. time is of the essence
Multiple updates all over again. The most important:

1. Generalized transient "allocation" (i.e. construction on-the-fly). Now the caller doesn't have to think about the layout of oppc_int and can simply rely on the function calls doing the right job.
2. Generalized C calls: both the calls from pseudocode and our own calls now use the same code.
3. Fixed ternary expressions.
4. Dropped the ctx argument. Now it's expected that the users define OPPC_XLEN/OPPC_GPR/OPPC_FPR to behave as they want.
5. Eliminated redundant pseudocode.
src[64-XLEN:63] <- (RS)
RA <- (C ? A : B)

{
    struct oppc_int src;
    struct oppc_int C;
    struct oppc_int A;
    struct oppc_int B;

    /* src[(64 - XLEN):63] <- (RS) */
    oppc_range_subscript_assign(
        /* src */
        &src,
        /* (64 - XLEN) */
        oppc_sub(
            oppc_transient(&(struct oppc_int){}, UINT64_C(0), (uint8_t)OPPC_XLEN),
            /* 64 */
            oppc_transient(&(struct oppc_int){}, UINT64_C(64), (uint8_t)OPPC_XLEN),
            /* XLEN */
            oppc_transient(&(struct oppc_int){}, OPPC_XLEN, (uint8_t)OPPC_XLEN)
        ),
        /* 63 */
        oppc_transient(&(struct oppc_int){}, UINT64_C(63), (uint8_t)OPPC_XLEN),
        /* (RS) */
        &OPPC_GPR[OPPC_GPR_RS]
    );
    /* RA <- C ? A : B */
    oppc_assign(
        /* (RA) */
        &OPPC_GPR[OPPC_GPR_RA],
        (
            /* C */
            &C ?
            /* A */
            &A :
            /* B */
            &B
        )
    );
}
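The oppc_transient(&(struct oppc_int){...}, value, bits) pattern above could plausibly be implemented along these lines. This is a hedged sketch: the real struct layout is different (the actual code uses a flexible array and, later, oppc_value), so the single-word payload here is purely an assumption for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* simplified stand-in for the real transient value */
struct oppc_int {
    uint64_t bits;
    uint64_t value;
};

/* The caller passes a compound literal as scratch storage; the helper
 * fills it in and hands the same pointer straight back, so the call can
 * be nested as an argument of a larger oppc_* call. */
static struct oppc_int *oppc_transient(struct oppc_int *self,
                                       uint64_t value, uint8_t bits)
{
    self->bits = bits;
    self->value = value;
    if (bits < 64)  /* truncate to the declared width */
        self->value &= (UINT64_C(1) << bits) - 1;
    return self;
}
```

Returning the pointer is what makes chains like oppc_sub(oppc_transient(...), oppc_transient(...), ...) possible without the generated code ever naming a temporary.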
As we now rely on function calls to update anything, and it might no longer be just a plain integer, I've taken the liberty of renaming oppc_int to oppc_value. https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;h=33babebc7156a8b547782d3fa39f5c11fdb237b2
There's an error for if expressions: we don't want to check an address, and should emit a call instead.
OK, with the recent commits, despite some glitches, we're able to produce _some_ C code for any pseudocode. By "any pseudocode" I mean the following:

for insn in db:
    tree = parser.parse(code="\n".join(insn.pcode))
    for (level, line) in pc_code.code(insn=insn, root=tree):
        print(f"{' ' * 4 * level}{line}")
...and here's the fields initialization (with the rest of the code cut):

void oppc_tdi(struct oppc_value const *insn)
{
    struct oppc_value TO;

    oppc_assign(
        &TO,
        oppc_transient(&(struct oppc_value){}, UINT64_C(0), 5)
    );
    oppc_subscript_assign(
        &TO,
        oppc_transient(&(struct oppc_value){}, 0, (uint8_t)OPPC_XLEN),
        oppc_subscript(
            insn,
            oppc_transient(&(struct oppc_value){}, 6, (uint8_t)OPPC_XLEN)
        )
    );
    oppc_subscript_assign(
        &TO,
        oppc_transient(&(struct oppc_value){}, 1, (uint8_t)OPPC_XLEN),
        oppc_subscript(
            insn,
            oppc_transient(&(struct oppc_value){}, 7, (uint8_t)OPPC_XLEN)
        )
    );
    oppc_subscript_assign(
        &TO,
        oppc_transient(&(struct oppc_value){}, 2, (uint8_t)OPPC_XLEN),
        oppc_subscript(
            insn,
            oppc_transient(&(struct oppc_value){}, 8, (uint8_t)OPPC_XLEN)
        )
    );
    oppc_subscript_assign(
        &TO,
        oppc_transient(&(struct oppc_value){}, 3, (uint8_t)OPPC_XLEN),
        oppc_subscript(
            insn,
            oppc_transient(&(struct oppc_value){}, 9, (uint8_t)OPPC_XLEN)
        )
    );
    oppc_subscript_assign(
        &TO,
        oppc_transient(&(struct oppc_value){}, 4, (uint8_t)OPPC_XLEN),
        oppc_subscript(
            insn,
            oppc_transient(&(struct oppc_value){}, 10, (uint8_t)OPPC_XLEN)
        )
    );
}
...now with (GPR|0) support!

x <- (RA|0)

{
    struct oppc_value x;
    struct oppc_value RA;

    oppc_assign(
        &RA,
        oppc_transient(&(struct oppc_value){}, UINT64_C(0), 5)
    );
    oppc_subscript_assign(
        &RA,
        oppc_transient(&(struct oppc_value){}, 0, (uint8_t)OPPC_XLEN),
        oppc_subscript(
            insn,
            oppc_transient(&(struct oppc_value){}, 11, (uint8_t)OPPC_XLEN)
        )
    );
    oppc_subscript_assign(
        &RA,
        oppc_transient(&(struct oppc_value){}, 1, (uint8_t)OPPC_XLEN),
        oppc_subscript(
            insn,
            oppc_transient(&(struct oppc_value){}, 12, (uint8_t)OPPC_XLEN)
        )
    );
    oppc_subscript_assign(
        &RA,
        oppc_transient(&(struct oppc_value){}, 2, (uint8_t)OPPC_XLEN),
        oppc_subscript(
            insn,
            oppc_transient(&(struct oppc_value){}, 13, (uint8_t)OPPC_XLEN)
        )
    );
    oppc_subscript_assign(
        &RA,
        oppc_transient(&(struct oppc_value){}, 3, (uint8_t)OPPC_XLEN),
        oppc_subscript(
            insn,
            oppc_transient(&(struct oppc_value){}, 14, (uint8_t)OPPC_XLEN)
        )
    );
    oppc_subscript_assign(
        &RA,
        oppc_transient(&(struct oppc_value){}, 4, (uint8_t)OPPC_XLEN),
        oppc_subscript(
            insn,
            oppc_transient(&(struct oppc_value){}, 15, (uint8_t)OPPC_XLEN)
        )
    );
    /* x <- (RA|0) */
    oppc_assign(
        /* x */
        &x,
        (
            oppc_bool(&RA) ?
            &OPPC_GPR[OPPC_GPR_RA] :
            oppc_transient(&(struct oppc_value){}, UINT64_C(0), (uint8_t)OPPC_XLEN)
        )
    );
}
(In reply to Dmitry Selyutin from comment #109) > ...now with (GPR|0) support! oooOooo :) https://www.youtube.com/watch?v=bVURsxuwy_E
(In reply to Jacob Lifshay from comment #98)

for adding strings, since there are only string constants, we can add them like so:

typedef enum oppc_type_tag {
    oppc_tt_sel_int,
    oppc_tt_str,
    // add more stuff later
} oppc_type_tag;

typedef enum oppc_str {
    // for every string constant, add an entry here
    oppc_str_abc, // "abc"
    oppc_str_foo, // "foo"
    oppc_str_bar, // "bar"
} oppc_str;

// for debugging and other places you need the actual text
const char * const oppc_strings[] = {
    // for every string constant, add an entry here
    [(int)oppc_str_abc] = "abc",
    [(int)oppc_str_foo] = "foo",
    [(int)oppc_str_bar] = "bar",
};

typedef struct oppc_value {
    oppc_type_tag type : 4;
    bool ok : 1;
    int : 0; // terminate bitfield
    union {
        oppc_sel_int si;
        oppc_str str;
        // add more stuff later
    };
} oppc_value;

in python, you'd generate this by having a hard-coded list of all string constants (since you're not always parsing everything and you need to emit consistent type definitions or that's an ODR violation -- aka. UB):

all_strings = (
    "abc",
    "foo",
    "bar",
)

def make_oppc_str_enumerant(v):
    name = ["oppc_str_"]
    # escape string into unambiguous C identifier
    for ch in v:
        if ch.isascii() and ch.isalnum():
            name.append(ch)
        elif ch == "_":
            # 2 underlines to not confuse with escaped char
            name.append("__")
        else:
            name.append("_%x_" % (ord(ch),))
    return "".join(name)

def translate_str_lit(v):
    assert v in all_strings, "%r is not in all_strings" % (v,)
    return "(oppc_value){.type = oppc_tt_str, .ok = true, .str = %s}" % (
        make_oppc_str_enumerant(v),)

def c_str(v):
    v = v.replace("\\", "\\\\")
    v = v.replace("\n", "\\n")
    v = v.replace("\t", "\\t")
    v = v.replace("\"", "\\\"")
    return '"' + v + '"'

def create_oppc_str_definition():
    yield "typedef enum oppc_str {"
    for i in all_strings:
        yield "    %s, // %s" % (make_oppc_str_enumerant(i), c_str(i))
    yield "} oppc_str;"
    yield ""
    yield "// for debugging and other places you need the actual text"
    yield "const char * const oppc_strings[] = {"
    for i in all_strings:
        yield "    [(int)%s] = %s," % (make_oppc_str_enumerant(i), c_str(i))
    yield "};"
I wrote my position on the ONLY string we have so far in 1250. I won't introduce a tagged union only due to stubbornness. In fact, I don't see a need for a tagged union at all; both the type and the ok field can simply be encoded into the flags of oppc_value.
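That flags-based encoding could look something like the following hedged sketch. All of the constants, the bit layout, and the helper names are made up for illustration; the point is only that a tag nibble plus an "ok" bit fit comfortably in one flags word without a tagged union.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* illustrative flags layout: bit 0 = ok, bits 1..4 = type tag */
#define OPPC_FLAG_OK    UINT64_C(0x1)
#define OPPC_TYPE_SHIFT 1
#define OPPC_TYPE_MASK  UINT64_C(0xf)
#define OPPC_TYPE_INT   UINT64_C(0x0)
#define OPPC_TYPE_STR   UINT64_C(0x1)

struct oppc_value {
    uint64_t flags;  /* ok bit and type tag packed together */
    uint64_t value;  /* simplified payload */
};

static void oppc_value_set_ok(struct oppc_value *v)
{
    v->flags |= OPPC_FLAG_OK;
}

static bool oppc_value_ok(struct oppc_value const *v)
{
    return (v->flags & OPPC_FLAG_OK) != 0;
}

static uint64_t oppc_value_type(struct oppc_value const *v)
{
    return (v->flags >> OPPC_TYPE_SHIFT) & OPPC_TYPE_MASK;
}
```

Setting the ok bit never disturbs the type tag (and vice versa), which is the property that makes the single-word encoding workable.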
Until then, I moving on assuming this function has a string argument, indeed.
(In reply to Dmitry Selyutin from comment #112) > both the type and ok field can be just > encoded into flags of oppc_value. that's true, but not why i proposed a union, i proposed a union because we later need to add more types (rational number for bfp_* code, several different structs for XER, SVSTATE, FPSCR, and BFP numbers). I proposed ok and type being bitfields because that makes the code clearer while producing the same assembly.
Function calls will need more work: in our realities, things like EXTS should resolve to this:

struct oppc_value *OPPC_FUNCTION_EXTS(struct oppc_value *ret,
                                      struct oppc_value const *value);

Meanwhile EXTS has only 1 argument, and we cannot update the input.
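A body for that prototype might look like the sketch below. The calling convention (separate output pointer, const input, pointer returned for chaining) is as described above; the single-word oppc_value layout and the fixed extend-to-64 target are simplifying assumptions for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* simplified stand-in for the real value type */
struct oppc_value {
    uint64_t bits;
    uint64_t value;
};

/* EXTS takes one argument at the pseudocode level; the C version gets
 * an extra output pointer so the input is never modified */
static struct oppc_value *
OPPC_FUNCTION_EXTS(struct oppc_value *ret, struct oppc_value const *value)
{
    /* classic branch-free sign extension: XOR with the sign bit,
     * then subtract it */
    uint64_t sign = UINT64_C(1) << (value->bits - 1);
    ret->bits = 64;
    ret->value = (value->value ^ sign) - sign;
    return ret;  /* returned pointer allows chaining into the next call */
}
```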
88582 lines of C code from the C generator for all the code we had in insndb. Not bad.
(In reply to Dmitry Selyutin from comment #112) > I wrote my position on the ONLY string we have so far in 1250. I won't > introduce a tagged union indeed. jacob has been told maaany many times to stop introducing types and unnecessary complexity (apologies talking about you in 3rd person there jacob). > In fact, I don't see a > need for a tagged union at all, both the type and ok field can be just > encoded into flags of oppc_value. yyyep. or not even flags, just an integer number in oppc_value. the constants can also be #defined and made global/static, which makes good readability if they are used in the compiler output (rather than putting the explicit value itself)
(In reply to Jacob Lifshay from comment #114) > that's true, but not why i proposed a union, i proposed a union because we > later need to add more types (rational number for bfp_* code, several > different structs for XER, SVSTATE, FPSCR, and BFP numbers). no. and we are not repeat not for the umpteenth time increasing the scope. please for goodness sake stop doing this, i have been literally asking you for years jacob.
28 commits since the last update. This was a tough journey!! The code can now be compiled successfully. This is valid C code. The most difficult commit was this:

https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;h=90c93ec3cf18b96a9e70464f2b549c9bace1bbe4

Believe it or not, this was done manually. A long and extremely boring series of "run Python and produce C code, compile it with C, see which function is broken due to lack of prototype, visit the Python code and see its arguments, add the corresponding C prototype, but don't forget to adopt the calling convention: our result must generate a transient and return a pointer to it". I cannot stress how annoying and boring it was, probably one of the most annoying tasks I ever did, but I decided that writing yet another parser would consume even more time.

This can be checked via the following incantation:

gcc -Os -Isrc/liboppc -Wall -Werror -Wextra test.c -c -o test.o

As you see, I took extra measures to ensure we pass some stricter checks. I'm planning to add src/liboppc/oppc.c autogeneration so that it's compiled as oppc.o, and be done with the task. What remains is the actual implementation of the 238 calls (this includes the implementation of SelectableInt-like semantics, filling all these defines, all functions we have in helpers and isafn, switching for FPSCR attributes and so on). Yes, this will be a damn lot of work; even all the prototypes and definitions took 1K+ lines of manually written code. Some functions can be generated as well, and I think this is the path we should follow, eventually discarding as many manually written helpers as we can.

So, I'm planning to add src/liboppc/oppc.c autogeneration as part of the build and be done with this task. I might probably agree to some smallish additions which hopefully won't be as annoying as that header above. I already feel this is well underestimated, considering the efforts, though. I'd like to listen to your thoughts and ideas.
Ah yes, I forgot the first part of the incantation. Without it, magic ain't gonna happen.

python3 -m openpower.oppc > test.c
gcc -Os -Isrc/liboppc -Wall -Werror -Wextra test.c -c -o test.o
Ah yes, -Os is not needed; I just checked how many bytes the object file has (2.2 MB with -Os, 4.6 IIRC with the default, 2.4 IIRC with -O3). Just curiosity. And yes, almost 148K lines for the code itself (obviously including the comments from the pseudocode, but I consider these essential; otherwise the code is barely readable).
(In reply to Dmitry Selyutin from comment #119)
> 28 commits since the last update. This was a tough journey!! The code can
> now be compiled successfully.

hooraay! that's brilliant

> This is a valid C code. The most difficult
> commit was this:
>
> https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;
> h=90c93ec3cf18b96a9e70464f2b549c9bace1bbe4
>
> Believe it or not, this was done manually.

daaamn... :)

> pointer to it". I cannot stress how annoying and boring it was,

sometimes i find it good to switch off if you know what i mean :)

> probably one
> of the most annoying tasks I ever did, but I decided that writing yet
> another parser would consume even more time.

ahh... yes :)

> This can be checked via the following incantation:
>
> gcc -Os -Isrc/liboppc -Wall -Werror -Wextra test.c -c -o test.o
>
> As you see I took extra measures to ensure we pass some stricter checks.
>
> I'm planning to add src/liboppc/oppc.c autogeneration

ah not to the repo, please, if that's the plan? (maybe misunderstanding) no matter if it takes a long time, the project's policy is never to add autogenerated output. the one piece of work that violated that rule took up 20% of available drive space and it is a permanent serious problem, the repo is 50 times bigger than all others.

> so that it's compiled
> as oppc.o, and be done with the task. The actual implementation of 238 calls
> (this includes the implementation of SelectableInt-like semantics, filling
> all these defines, all functions we have in helpers and isafn, switching for
> FPSCR attributes and so on). Yes, this will be a damn lot of work; even all
> the prototypes and definities took 1K+ lines of manually written code.

yep. welcome to 5 years development.

> Some
> functions can be generated as well, and I think this is the path we should
> follow, eventually discarding as many manually written helpers as we can.

yes.
hm if consulted beforehand i would have said "don't do it" but you needed the protos to test the output, ho hum it is what it is. doing introspection of the modules would, retrospectively, perhaps have been a better option? plus it would remove ongoing de-sync / maintenance headaches when python functions change (or new ones are added). they're done now, and do the job.

> So, I'm planning to add src/liboppc/oppc.c autogeneration as part of the
> build

build script and makefiles yes, but to repo no? am i understanding right?

> and be done with this task.

i am veeery happy for that to happen, you've done a hell of a lot.

> I might probably agree to some smallish
> additions which hopefully won't be as annoying as that header above. I
> already feel this is well underestimated, considering the efforts, though.

no, enough is enough, further work requires a bigger budget

> I'd like to listen to your thoughts and ideas.

if you can outline what it would take to do a python-cffi wrapper plus auto-introspection/generation of the c protos to ensure that the python helpers.py is the "canonical" source, plus a hefty budget with a bit extra for some unexpected things (which we know always happens sigh) and chuck it onto 1211 or 1212 as appropriate, that'd be fantastic.

correction: if you are *interested* to do that... :)
(In reply to Luke Kenneth Casson Leighton from comment #122)
> (In reply to Dmitry Selyutin from comment #119)
>
> build script and makefiles yes but to repo no? am i understanding right?

I've pushed a commit which adds a Makefile with simple commands to check it: https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;h=ab9c2cf0f935d6d97e319fbd4d177d7f8679a58f

I'll rebase libopid against the recent master in scope of bug #1254, put these changes atop, and then add this sweet pair to the build once I ensure it works in CI. But that's for 1254; my work here is done, it's ready and can be checked.

> i am veeery happy for that to happen, you've done a hell of a lot.

Thanks! It was a tough one. :-)

> if you can outline what it would take to do a python-cffi wrapper
> plus auto-introspection/generation of the c protos to ensure that
> the python helpers.py is the "canonical" source, plus a hefty budget
> with a bit extra for some unexpected things (which we know always
> haoppens sigh) and chuck it onto 1211 or 1212 as appropriate, that'd
> be fantastic.
>
> correction: if you are *interested* to do that... :)

Certainly, but I'll take some well-deserved rest first. :-)
Luke, based on your comment #122, I've submitted an RFP and marked this as RESOLVED FIXED, please let me know if I understood you incorrectly.
Oddly, I haven't received a single damn notification from the bank that the payment has arrived. Perhaps something changed again, sigh. But I confirm the payment's arrived.
(In reply to Dmitry Selyutin from comment #125) > payment has arrived. Perhaps something changed again, sigh. But I confirm > the payment's arrived. total included the other two you asked to be merged?
A bigger amount than solely for this task, so I'd assume yes. The bank I'm using now has a huge fee; they've considered keeping EUR unsafe since recently, plus the EU sender bank took their own fee. Welcome to the post-sanctions world. :-)
(In reply to Dmitry Selyutin from comment #127) > A bigger amount than solely for this task, so I'd assume yes. The bank I'm > using now has a huge fee, they consider keeping EUR is unsafe since recent, > plus EU sender bank took their own fee. Welcome to post-sanctions world. :-) worth asking if it can be transferred in a different currency?