see table 63 on v3.1B page 670(696) for a full list of how fp trap/exception handling should be done (at least for vsx, tbd if that matches sffs ignoring VSR vs. FPR) * DONE: adjust fcvt*/fmv* * TODO: adjust fadd/fsub/etc. existing PowerISA instructions without spec. pseudocode * TODO: adjust fptrans/fft/dct/etc. proposed instructions that can cause fp traps. original comment: https://libre-soc.org/irclog/%23libre-soc.2023-05-21.log.html#t2023-05-21T10:29:37 things like fcvtfg don't need to prevent writing the output since the output is in a different regfile and thus can't possibly overlap with the input reg. things like fsinpi f3, f3 need to prevent writing the output reg when they trap so the trap can see what the input was.
basically we need to add/remove the pseudocode that explicitly sets up register state (FRT/RT and FPSCR, not SRR0 or other trap machinery) for taking a fp trap. see xscvdpsxds for an example -- we need to add/remove the `if vex_flag`
(In reply to Jacob Lifshay from comment #0) > https://libre-soc.org/irclog/%23libre-soc.2023-05-21.log.html#t2023-05-21T10: > 29:37 > > things like fcvtfg don't need to prevent writing the output since the output > is in a different regfile and thus can't possibly overlap with the input reg. this is a misunderstanding of how the simulator works. a register that requires writing *is* a local variable, *is* a return result from the function. def op_fcvtfg(self, RB, FPSCR): if eq(IT[0], 0): FRT = copy_assign_rhs(result) return (FRT, FPSCR,) > things like fsinpi f3, f3 need to prevent writing the output reg when they > trap so the trap can see what the input was. this is automatic and inherent. the write of the results *is* automatically inherently 100% without fail absolute without fail prevented. there is no other choice. declaring this bugreport invalid as there is no change required (and no change ever going to happen) and it was raised based on a misunderstanding.
(In reply to Jacob Lifshay from comment #1) > basically we need to add/remove the pseudocode that explicitly sets up > register state (FRT/RT and FPSCR, not SRR0 or other trap machinery) for > taking a fp trap. > see xscvdpsxds for an example -- we need to add/remove the `if vex_flag` i have absolutely no idea why on earth that would be necessary. can you please confirm that you understand that there is not and will never be a change to action what you have recommended? and that you understand that FRT is a *local* variable returned as a *local* result, where ISACaller has control *after* the pseudocode function is called as to whether the result is written to the regfile? what you should *not* have done is bypass the mechanism that was established 2 years ago for passing in registers and returning them if name in ('XLEN', 'FPSCR') or name in BFP_FLAG_NAMES: attr = ast.Name("self", ast.Load()) p[0] = ast.Attribute(attr, name, ast.Load(), lineno=p.lineno(1)) that is *NOT* the correct behaviour. look at the pseudocode def op_fcvttg_(self, FRB, ****>>>>> FPSCR <<<<<<*****): if eq(vex_flag, 0): RT = copy_assign_rhs(result) >>> NO! self. <<<< FPSCR.FPRF = copy_assign_rhs(self.undefined( SelectableInt(value=0x0, bits=5))) return (RT, *****>>>>> FPSCR <<<<<*****, overflow,) you should *NOT* have attempted to bypass FPSCR being passed in as a local variable, bypassed FPSCR as a return result. updating of FPSCR goes into *check_write()*
> change pseudocode to prevent output register write only when > causing a fp trap and output is in same regfile as input whatever you do, do *not* do this. i will change the bugreport to the correct (required) action.
commit 46690bb38bf2534ea200503d753a422dc866a000 Author: Luke Kenneth Casson Leighton <lkcl@lkcl.net> Date: Sun May 21 12:01:45 2023 +0100 FPSCR should never have been added to "bypass" the incoming local parameter or its return result
done. + # write FPSCR + if name in ['FPSCR', ]: + log("write FPSCR 0x%x" % (output.value)) + self.FPSCR.eq(output) + return commit 20cf6adbcb6800f5d9e15c59a5f7e280ebef1352 (HEAD -> master) Author: Luke Kenneth Casson Leighton <lkcl@lkcl.net> Date: Sun May 21 12:17:39 2023 +0100 explicitly update FPSCR from list of return results
(In reply to Luke Kenneth Casson Leighton from comment #4) > > change pseudocode to prevent output register write only when > > causing a fp trap and output is in same regfile as input > > whatever you do, do *not* do this. i will change the bugreport > to the correct (required) action. so the bugreport is invalid (but should not be closed as such until it has been understood why). to "effect" a "no change", the variable that requires "not to be changed" must: 1) be passed in as an INPUT parameter 2) not be modified by the pseudocode 3) be passed OUT as a return result. the "no change" is achieved by simply not making any modifications to the local variable (parameter) the return result **WILL** get written (caveats: VLi=0) but the contents will **HAPPEN** to be indentical... resulting in "the appearance of no change having taken place" CPU efficiency of the Simulator is *not* a high priority here.
ok you'll need to pass in FPSCR into all helper routines, or better we use @inject(). i think that is safer/better, a lot less hassle - let me try it out
(In reply to Luke Kenneth Casson Leighton from comment #8) > ok you'll need to pass in FPSCR into all helper routines, > or better we use @inject(). i think that is safer/better, > a lot less hassle - let me try it out done - sorted commit 7df4bbfd7b87dde3d1c2b82e1e7ffbb58e075f02 (HEAD -> master, origin/master) Author: Luke Kenneth Casson Leighton <lkcl@lkcl.net> Date: Sun May 21 15:02:44 2023 +0100 hack-add @inject() into pyfnwriter, also take the opportunity to stop doing "import ISACallerFnHelper as something" by doing a hack-substitute on the class name
(In reply to Luke Kenneth Casson Leighton from comment #2) > (In reply to Jacob Lifshay from comment #0) > > https://libre-soc.org/irclog/%23libre-soc.2023-05-21.log.html#t2023-05-21T10: > > 29:37 > > > > things like fcvtfg don't need to prevent writing the output since the output > > is in a different regfile and thus can't possibly overlap with the input reg. > > this is a misunderstanding of how the simulator works. this bug is not motivated by how the simulator works, but by how the PowerISA spec has decided to setup state in preparation for fp traps. we need to match the PowerISA spec. the simulator must be adjusted to match pre-existing convention, not convention adjusted to match our arbitrary simulator internals. (In reply to Luke Kenneth Casson Leighton from comment #3) > (In reply to Jacob Lifshay from comment #1) > > see xscvdpsxds for an example -- we need to add/remove the `if vex_flag` > > i have absolutely no idea why on earth that would be necessary. because that's the convention PowerISA decided on and, unless you want to propose to the ISA WG that they rewrite the pseudocode for all their existing fp ops (both Appendix A (see SNaN Operand for examples) and all fp vmx/vsx ops and probably decimal fp too), we need to match their existing convention. please confirm you understand that we need to solve the issue of getting the right pseudocode for the spec to match existing conventions, *because those conventions will 99.9% likely not change to suit our existing simulator because of how massive the change would be*. (In reply to Luke Kenneth Casson Leighton from comment #4) > > change pseudocode to prevent output register write only when > > causing a fp trap and output is in same regfile as input > > whatever you do, do *not* do this. i will change the bugreport > to the correct (required) action. please *don't repurpose bugs* like this, open a new bug and close the old one instead. it is especially problematic when the original bug is still a bug that needs fixing (like in this case), but also just confusing for people who see an email for a new bug and click on the link but are lead to a completely different bug because the entry was usurped. please open a new bug for "change ISACaller and correct bug introduced in parser.py where it bypasses FPSCR as a local parameter and a return result" and change this bug's title back to "change pseudocode to prevent output register write only when causing a fp trap and output is in same regfile as input" or at least "add pseudocode to properly setup for fp traps according to PowerISA spec convention"
(In reply to Jacob Lifshay from comment #10) > this bug is not motivated by how the simulator works, but by how the > PowerISA spec has decided to setup state in preparation for fp traps. we > need to match the PowerISA spec. yes? and that's being done, yes? i see no difference between this (xscvdpsxds) if vex_flag=0 then do VSR[32xTX+T].dword[1] <- result VSR[32xTX+T].dword[2] <- 0x0000_0000_0000_0000 FPSCR.FPRF <- 0bUUUUU FPSCR.FR <- inc_flag FPSCR.FI <- xx_flag end else do FPSCR.FR <- 0b0 FPSCR.FI <- 0b0 end and this (fcvttg): vex_flag <- FPSCR.VE & vx_flag if vex_flag = 0 then RT <- result FPSCR.FPRF <- undefined(0b00000) FPSCR.FR <- inc_flag FPSCR.FI <- xx_flag else FPSCR.FR <- 0 FPSCR.FI <- 0 i see no difference between those in any way. > the simulator must be adjusted to match > pre-existing convention, yes? and i have not seen anything that needs to be changed *to* match? > not convention adjusted to match our arbitrary > simulator internals. agreed... so what precisely and exactly are you envisioning needs to "change"? you have not shown me where anything is that actually *needs to change* > please confirm you understand that we need to solve the issue of getting the > right pseudocode for the spec to match existing conventions, i see neither an error in the pseudocode nor anything that needs to change in the simulator to support a change that i cannot see is even needed. > please *don't repurpose bugs* like this, open a new bug and close the old > one instead. it is especially problematic when the original bug is still a > bug that needs fixing (like in this case), i see no bug. you need to tell me *precisely and exactly* what change is needed, and where. at present as i literally see no difference in the pseudocode i am completely unable to understand why there is any change in the simulator's behaviour needed. > change this bug's title back to "change pseudocode to prevent output > register write only when causing a fp trap and output is in same regfile as > input" or at least "add pseudocode to properly setup for fp traps according > to PowerISA spec convention" sorry, i'm not doing that until - unless - you can demonstrate that there is an actual change needed. i cannot see any functional difference between the two pieces of text, above. the only difference i see is the register target VSR instead of RT and that is immaterial.
(In reply to Luke Kenneth Casson Leighton from comment #11) > (In reply to Jacob Lifshay from comment #10) > > > this bug is not motivated by how the simulator works, but by how the > > PowerISA spec has decided to setup state in preparation for fp traps. we > > need to match the PowerISA spec. > > yes? and that's being done, yes? i see no difference between this > (xscvdpsxds) > <snip> > > and this (fcvttg): > <snip> > > i see no difference between those in any way. yes, because the places that we must change are all our other fp instructions (fptrans, fft ops, etc.). e.g. fatan2s needs to be changed to something like: https://git.libre-soc.org/?p=openpower-isa.git;a=blob;f=openpower/isa/fptrans.mdwn;h=e6799bb1a72767793330bfae70e5a17b6c6d6af5;hb=371d91b299c0e4bd7b23e660b9936ed40debb824#l14 result <- DOUBLE(bfp32_ATAN2(SINGLE((FRA)), SINGLE((FRB)))) if FPSCR.FEX = 0 then # computed as vex_flag in fcvt* FRT <- result else # don't write FRT # do other trap setup -- TBD ops like fmv don't need to change since they never cause fp traps. additionally, fcvt* might change to always write [F]RT (since that simplifies hardware and can't overwrite the input, unlike basically all other fp ops) but that change isn't technically necessary.
(In reply to Jacob Lifshay from comment #12) > e.g. fatan2s needs to be changed to something like: > https://git.libre-soc.org/?p=openpower-isa.git;a=blob;f=openpower/isa/ > fptrans.mdwn;h=e6799bb1a72767793330bfae70e5a17b6c6d6af5; > hb=371d91b299c0e4bd7b23e660b9936ed40debb824#l14 > > result <- DOUBLE(bfp32_ATAN2(SINGLE((FRA)), SINGLE((FRB)))) > if FPSCR.FEX = 0 then # computed as vex_flag in fcvt* actually, rather than testing FEX, we need to test if *this instruction* sets FEX (hence vex_flag), since FEX can be 1 on entry to the instruction (if MSR exceptions are disabled or in non-precise exception modes) > FRT <- result > else > # don't write FRT > # do other trap setup -- TBD > > ops like fmv don't need to change since they never cause fp traps. > > additionally, fcvt* might change to always write [F]RT (since that > simplifies hardware and can't overwrite the input, unlike basically all > other fp ops) but that change isn't technically necessary.
(In reply to Jacob Lifshay from comment #12) > yes, because the places that we must change are all our other fp > instructions (fptrans, fft ops, etc.). ah ok. so given that the places that *are* in the right pattern (the if FEX then FRT/FPSCR else FPSCR parts) require no change to the simulator, and that *works* right now (does the right thing and requires - required - no change), i still do not see why any change would be needed once all of those instructions are modified to follow the exact same pattern > e.g. fatan2s needs to be changed to something like: > > result <- DOUBLE(bfp32_ATAN2(SINGLE((FRA)), SINGLE((FRB)))) > if FPSCR.FEX = 0 then # computed as vex_flag in fcvt* > FRT <- result > else > # don't write FRT > # do other trap setup -- TBD where i assume by "do other trap setup" you mean "modify FPSCR flags just like in the other locations"?
(In reply to Luke Kenneth Casson Leighton from comment #14) > once all of those instructions > are modified to follow the exact same pattern this bug is for modifying all those instructions' pseudocode > > > e.g. fatan2s needs to be changed to something like: > > > > result <- DOUBLE(bfp32_ATAN2(SINGLE((FRA)), SINGLE((FRB)))) > > if FPSCR.FEX = 0 then # computed as vex_flag in fcvt* > > FRT <- result > > else > > # don't write FRT > > # do other trap setup -- TBD > > where i assume by "do other trap setup" you mean "modify FPSCR flags > just like in the other locations"? yes, basically.
(In reply to Jacob Lifshay from comment #15) > > where i assume by "do other trap setup" you mean "modify FPSCR flags > > just like in the other locations"? > > yes, basically. see table 63 on v3.1B page 670(696) for a full list of how fp trap/exception handling should be done
(In reply to Luke Kenneth Casson Leighton from comment #11) > > please *don't repurpose bugs* like this, open a new bug and close the old > > one instead. it is especially problematic when the original bug is still a > > bug that needs fixing (like in this case), > > i see no bug. even if you see no bug in the future, since I obviously think there is a bug and it's better to not mix mostly unrelated issues, please don't repurpose bugs like this. please just open a new one. > > > change this bug's title back to "change pseudocode to prevent output > > register write only when causing a fp trap and output is in same regfile as > > input" or at least "add pseudocode to properly setup for fp traps according > > to PowerISA spec convention" > > sorry, i'm not doing that until - unless - you can demonstrate that there > is an actual change needed. since that has been demonstrated in comment #12 and comment #15, please do make the corresponding title change
(In reply to Jacob Lifshay from comment #17) > since that has been demonstrated in comment #12 and comment #15, please do > make the corresponding title change willdo, tomorrow.
(In reply to Luke Kenneth Casson Leighton from comment #18) > (In reply to Jacob Lifshay from comment #17) > > > since that has been demonstrated in comment #12 and comment #15, please do > > make the corresponding title change > > willdo, tomorrow. it's been a few days, so I just made the change and opened a new bug (bug #1088) for "change ISACaller and correct bug introduced in parser.py where it bypasses FPSCR as a local parameter and a return result" since i have some comments on that.
other *fun* side-effects of fcvtfg not writing RT (if we decide to not accept my proposal) is that we either have to manually calculate CR0 in the pseudocode or not change CR0 (messes with data-dependent fail-first) or (probably the worst option) read RT and calculate CR0 from the read-in value. (CR1 doesn't have that problem since it's input always comes from FPSCR). I think the best option is to just have fcvtfg always write RT as I proposed, since suppressing writes (so the trap handler can see the input) is only necessary when targeting the same register file as inputs.
(In reply to Jacob Lifshay from comment #20) > other *fun* side-effects of fcvtfg whoops, I meant fcvttgo. > not writing RT (if we decide to not > accept my proposal) is that we either have to manually calculate CR0 in the > pseudocode or not change CR0 (messes with data-dependent fail-first) or > (probably the worst option) read RT and calculate CR0 from the read-in > value. (CR1 doesn't have that problem since it's input always comes from > FPSCR). > > I think the best option is to just have fcvtfg same here > always write RT as I > proposed, since suppressing writes (so the trap handler can see the input) > is only necessary when targeting the same register file as inputs.
(In reply to Jacob Lifshay from comment #20) jacob: i think you keep misunderstanding: RT *is* always written by the simulator. i am NEVER going to change the Simulator to stop the value returned by the pseudocode function from being written. *regardless* of what the contents are the contents of FRT are **ALWAYS** passed to the function ISACaller.do_outregs_nia(). whatever goes in to RT *is* written out back into FPR, period. this is NEVER going to change. it is therefore up to you to make a choice in the pseudocode whether to CHANGE the value of FRT or NOT change the value of FRT. but the value regardless of whether it is or is not changed by the pseudocode WILL ALWAYS BE WRITTEN BACK INTO THE REGISTER. now, what you *might* be assuming is that "there is a flag in the operator "<-" which if used on a return result it indicates to the Simulator to store the result in the register", this is false: you are likely thinking of *hardware* here (which is perfectly reasonable). comments in the unit test such as this are therefore misleading or at least ambiguous: + # FIXME: #1087 proposes to change pseudocode of fcvt* to + # always write output, this implements reading RT when output + # isn't written, which is terrible it's not terrible at all for the *Simulator*, it doesn't care. what you might be thinking of is *hardware* always writing output, and/or always reading input, which yes that would waste resources if it was unnecessary. when writing the HDL it would be one of the jobs of the implementation to set the "output-enable" bit, indicating that the register is to be written (or cancelled if not set). with the code-reuse from the unit tests it's probably a good idea to clarify that hardware should watch out for that. the most important bit there will be if it is extremely easy (based preferably on immediate-flags and nothing else, certainly not reading the contents of the register) to determine if RT needs to be read at all. also, i have to say, i don't fully understand what you mean "reading RT when output isn't written" - are you saying that the instruction has to become a READ-MODIFY-WRITE? or that in *hardware* it is necessary to read RT, perform some checks, and then decide "actually RT need not be written"? this makes no sense: there really should only be MODIFY-WRITE (no read) (there are only a few read-modify-write instructions, they're explicitly listed as such in the spec, hence why i ask)
(In reply to Luke Kenneth Casson Leighton from comment #22) > (In reply to Jacob Lifshay from comment #20) > > jacob: i think you keep misunderstanding: RT *is* always written by > the simulator. i understand that perfectly fine, my proposal has nothing to do with how the simulator works, it is to make the pseudocode always write to RT so hw doesn't have to mask off writes to CR0 (causes problems for ddff) or read RT from the regfile to compute CR0. > comments in the unit test such as this are therefore misleading > or at least ambiguous: > > + # FIXME: #1087 proposes to change pseudocode of fcvt* to > + # always write output, this implements reading RT when > output > + # isn't written, which is terrible this is terrible for hw, it means it neads to read another register just for the very rare case when it traps so it can properly calculate CR0 from the RT value it read. > > it's not terrible at all for the *Simulator*, it doesn't care. yup. > also, i have to say, i don't fully understand what you mean > "reading RT when output isn't written" - are you saying that > the instruction has to become a READ-MODIFY-WRITE? technically it'd be read-or-write in hardware > or that > in *hardware* it is necessary to read RT, perform some > checks, and then decide "actually RT need not be written"? > this makes no sense: there really should only be MODIFY-WRITE > (no read) read needed to calculate CR0 when trapping, since then RT is not written since we all agree that's terrible, I want to change the pseudocode to always write RT even if it traps, that way it'd never need to read RT or mask off writes (ignoring SVP64 predicates). since you had said to not make that change (before you understood what I was talking about) i'm trying to get confirmation that making fcvttg's pseudocode always write RT is fine with you.
wait... are you saying that, at the moment, a zero is written into the regfile when a TRAP is called? are you then saying "that is bad"? because if so don't believe it is: compilers have to expect that register contents to be (statically) allocated regardless of the exception. IBM declares the result register "UNDEFINED" under these circumstances
(In reply to Luke Kenneth Casson Leighton from comment #24) > wait... are you saying that, at the moment, a zero is written into > the regfile when a TRAP is called? no, at the moment, the GPR regfile is unmodified when a fp trap occurs. the value of RT must be read from the GPR regfile so Rc=1 can compute the correct value of CR0. > > are you then saying "that is bad"? yes, for hw. > because if so don't believe it is: > compilers have to expect that register contents to be (statically) > allocated regardless of the exception. > > IBM declares the result register "UNDEFINED" under these circumstances imho that's unacceptable here, RT must either be unmodified (not written; for consistency with fp ops even though the reason that's done for fp ops doesn't apply here) or have the correct output (following the defined conversion semantics).
current semantics of fcvttgo.: v = fptoint(FRB) if overflow and enabled: # do *not* modify RT CR0 = compute_rc(RT) trap() else: RT = v CR0 = compute_rc(RT) proposed semantics of fcvttgo.: v = fptoint(FRB) if overflow and enabled: RT = v # writing is fine since it can't overwrite FRB due to FPR vs. GPR CR0 = compute_rc(RT) trap() else: RT = v CR0 = compute_rc(RT)
(In reply to Jacob Lifshay from comment #23) > (In reply to Luke Kenneth Casson Leighton from comment #22) > > (In reply to Jacob Lifshay from comment #20) > > > > jacob: i think you keep misunderstanding: RT *is* always written by > > the simulator. > > i understand that perfectly fine, my proposal has nothing to do with how the > simulator works, it is to make the pseudocode always write to RT so hw > doesn't have to mask off writes to CR0 (causes problems for ddff) or read RT > from the regfile to compute CR0. no you cannot do that, it entirely breaks how CR-field co-results work. such a change "in some cases Rc=1 means test the previous value" would not go down well with the ISA WG. > this is terrible for hw, it means it neads to read another register just for > the very rare case when it traps so it can properly calculate CR0 from the > RT value it read. but it should not have read RT [in order to do that] as that is totally the wrong behaviour and a misunderstanding of what CRf co-results actually are. co-results are associated with the result being computed by the hardware (ok pseudocode). and in the Simulator, the *return* results are what are tested - even before they're put into the regfile. that's why handle_condition goes to the trouble of hunting inside the list of return results, not re-reading the contents of any registers. > read needed to calculate CR0 when trapping, since then RT is not written then perform the change to CR0 explicitly, and leave RT alone. (btw it's perfectly valid to have RT "undefined" and CR0 to be "defined"). parser.py is smart enough to spot that the pseudocode contains CR0, and tell ISACaller *not* attempt to perform the [usual] computation, because parser.py places CR0 into the list of return results. there is at least one place where this is done (hence why i had to add the feature). oh! yes - prefix_codes! openpower/isa/prefix_codes.mdwn: CR0 <- ra_used || (tree_index >= 64) || found || hit_end yes this would mean that in the *non*-exceptional case you have to duplicate the functionality of Rc=1. yes it's a pain.
(In reply to Jacob Lifshay from comment #26) > current semantics of fcvttgo.: > v = fptoint(FRB) > if overflow and enabled: > # do *not* modify RT > CR0 = compute_rc(RT) > trap() no, really don't do that. it should be this: v = fptoint(FRB) if overflow and enabled: CR0.SO = 1 CR.EQ/LT/GT = UNDEFINED trap() actually, when overflow-bit is set, ISACaller already spots that and already sets CR.SO. and i think it reasonable to simply leave the rest of the contents of CR0 completely alone (in the pseudocode) and have english-language text stating "if overflow all bits of CR0 except for CR0.SO are UNDEFINED"
(In reply to Luke Kenneth Casson Leighton from comment #27) > (In reply to Jacob Lifshay from comment #23) > > (In reply to Luke Kenneth Casson Leighton from comment #22) > > > (In reply to Jacob Lifshay from comment #20) > > > > > > jacob: i think you keep misunderstanding: RT *is* always written by > > > the simulator. > > > > i understand that perfectly fine, my proposal has nothing to do with how the > > simulator works, it is to make the pseudocode always write to RT so hw > > doesn't have to mask off writes to CR0 (causes problems for ddff) or read RT > > from the regfile to compute CR0. > > no you cannot do that, it entirely breaks how CR-field co-results work. which part can i not do? the part with the current semantics which i thoroughly dislike with reading RT? or the part with changing the pseudocode to always write RT which I think is the best option overall?
just leave it as this: v = fptoint(FRB) if overflow and enabled: trap() else: RT = v * overflow will be written into CR0 by ISACaller (post-pseudocode-execution) * CR0's remaining bits are undefined (to be mentioned in the english language spec section) * even if RT is analysed by ISACaller (post-pseudocode-execution) it is "Not Your Problem(tm)" - because RT is undefined. * software will need to be told "on exception the only CR0 bit you can trust is CR0.SO".
for comparison, fcmpo always writes to the destination since it is in a different register file and therefore can't be overwriting any inputs even when fcmpo causes an identical trap.
(In reply to Jacob Lifshay from comment #29) > which part can i not do? the part with the current semantics which i > thoroughly dislike with reading RT? or the part with changing the pseudocode > to always write RT which I think is the best option overall? neither. just mention in the english language of the spec page "on a FP exception, RT is UNDEFINED. Additionally, if Rc=1, CR0.SO is set to 1, all other bits of CR0 are UNDEFINED" and don't touch CR0 in the psuedocode at all.
(In reply to Luke Kenneth Casson Leighton from comment #30) > just leave it as this: > > v = fptoint(FRB) > if overflow and enabled: > trap() > else: > RT = v > > * overflow will be written into CR0 by ISACaller (post-pseudocode-execution) > * CR0's remaining bits are undefined (to be mentioned in the english language > spec section) > * even if RT is analysed by ISACaller (post-pseudocode-execution) it is > "Not Your Problem(tm)" - because RT is undefined. that's not how pseudocode works in the PowerISA spec, if it doesn't explicitly write `undefined` to RT, then RT is not undefined. in this case RT is unmodified. plus, this doesn't match fcmpo, the only other kind of inter-regfile fp op with traps i could find.
(In reply to Jacob Lifshay from comment #31) > for comparison, fcmpo always writes to the destination since it is in a > different register file and therefore can't be overwriting any inputs even > when fcmpo causes an identical trap. fcmpo does not provide a good precedent here to follow because it is not a "co-result" instruction (not an Rc=1) it is a 5-bit instruction targetting a specific CR-bit (not a CR-field bit) fcmpo: * reads two FPRs * writes one bit to CR these instructions are an *arithmetic* profile * read one FPR * write to one GPR *and a co-result CR0* in other words, the issue you think is an issue is actually invalid. this *is* doing the right thing: if vex_flag = 0 then RT <- result FPSCR.FPRF <- undefined(0b00000) FPSCR.FR <- inc_flag FPSCR.FI <- xx_flag else FPSCR.FR <- 0 FPSCR.FI <- 0 RT *must not* be written to [in the pseudocode] CR0 *must not* be written to [in the pseudocode] writing to overflow *is* correct there is no change required to the pseudocode.
(In reply to Jacob Lifshay from comment #33) > that's not how pseudocode works in the PowerISA spec, if it doesn't > explicitly write `undefined` to RT, then RT is not undefined. in this case > RT is unmodified. great. that's fine, good catch. CR0 should still be UNDEFINED in all its 3 bits except overflow, with words to that effect written into the english language part of the spec (not the pseudocode). > plus, this doesn't match fcmpo, the only other kind of inter-regfile fp op > with traps i could find. as i said that is not an Rc=1 co-result instruction and comparing with it is invalid. action required: * NONE (no change) on the pseudocode * English language text stating "on exception CR0 fields UNDEFINED except CR0.SO"
(In reply to Luke Kenneth Casson Leighton from comment #35) > action required: > > * NONE (no change) on the pseudocode > * English language text stating "on exception CR0 fields UNDEFINED except > CR0.SO" I phrased it differently since that's not exactly correct (since I was glossing over one case when discussing with you): > When `RT` is not written (`vex_flag = 1`), all CR0 bits > except SO are undefined. https://git.libre-soc.org/?p=libreriscv.git;a=commitdiff;h=72076851a4927c15c077db046d2ec0a234033fa0 commit 72076851a4927c15c077db046d2ec0a234033fa0 Author: Jacob Lifshay <programmerjake@gmail.com> Date: Thu May 25 22:09:31 2023 -0700 fcvttg CR0 fields (except SO) are undefined when RT is not written this can occur even when fp traps are disabled in MSR, so writing "on exceptions" is incorrect.
(In reply to Jacob Lifshay from comment #36) > > When `RT` is not written (`vex_flag = 1`), all CR0 bits > > except SO are undefined. > > https://git.libre-soc.org/?p=libreriscv.git;a=commitdiff; > h=72076851a4927c15c077db046d2ec0a234033fa0 ok that's really clear. good catch that the wording also needed fixing. interesting that it's not made clear in the Power ISA spec itself, i will have to do a quick search for vex_flag > commit 72076851a4927c15c077db046d2ec0a234033fa0 > Author: Jacob Lifshay <programmerjake@gmail.com> > Date: Thu May 25 22:09:31 2023 -0700 > > fcvttg CR0 fields (except SO) are undefined when RT is not written ok yes i forgot. been too long. ok so implementation-wise vex_flag needs to be done like overflow in ISACaller. see bug #1177.