illustrating the point of why the shift-and-add instruction is beneficial is quite important, by including a table (38) LD/ST instructions. perhaps this will not be considered costly: it just depends. **prior work on ls004 is NOT part of this milestone. ONLY the addition of the LD/ST table and the associated rationale is part of this milestone* -- https://git.libre-soc.org/?p=libreriscv.git;a=blob;f=openpower/sv/rfc/ls004.mdwn 128 **LD/ST-Shifted** 129 130 | 0-5 | 6-10 | 11-15 | 16-20 | 21-22 | 23-31 | Instruction | 131 |-------|------|-------|-------|-------|-------|----------------------| 132 | PO | RT | RA | RB | SH | XO | lbzsx RT,RA,RB,SH | 133 | PO | RT | RA | RB | SH | XO | lhzsx RT,RA,RB,SH | 134 | PO | RT | RA | RB | SH | XO | lhasx RT,RA,RB,SH | 135 | PO | RT | RA | RB | SH | XO | lwzsx RT,RA,RB,SH | 136 | PO | RT | RA | RB | SH | XO | lwasx RT,RA,RB,SH | 137 | PO | RT | RA | RB | SH | XO | ldsx RT,RA,RB,SH | 138 | PO | RT | RA | RB | SH | XO | lhbrsx RT,RA,RB,SH | 139 | PO | RT | RA | RB | SH | XO | lwbrsx RT,RA,RB,SH | 140 | PO | RT | RA | RB | SH | XO | ldbrsx RT,RA,RB,SH | 141 | PO | RS | RA | RB | SH | XO | stbsx RS,RA,RB,SH | 142 | PO | RS | RA | RB | SH | XO | sthsx RS,RA,RB,SH | 143 | PO | RS | RA | RB | SH | XO | stwsx RS,RA,RB,SH | 144 | PO | RS | RA | RB | SH | XO | stdsx RS,RA,RB,SH | 145 | PO | RS | RA | RB | SH | XO | sthbrsx RS,RA,RB,SH | 146 | PO | RS | RA | RB | SH | XO | stwbrsx RS,RA,RB,SH | 147 | PO | RS | RA | RB | SH | XO | stdbrsx RS,RA,RB,SH | 148 | PO | FRT | RA | RB | SH | XO | lfsxs FRT,RA,RB,SH | 149 | PO | FRT | RA | RB | SH | XO | lfdxs FRT,RA,RB,SH | 150 | PO | FRT | RA | RB | SH | XO | lfiwaxs FRT,RA,RB,SH | 151 | PO | FRT | RA | RB | SH | XO | lfiwzxs FRT,RA,RB,SH | 152 | PO | FRS | RA | RB | SH | XO | stfsxs FRS,RA,RB,SH | 153 | PO | FRS | RA | RB | SH | XO | stfdxs FRS,RA,RB,SH | 154 | PO | FRS | RA | RB | SH | XO | stfiwxs FRS,RA,RB,SH | 155 156 **LD/ST-Shifted-Update** 157 158 | 0-5 | 6-10 | 11-15 | 16-20 | 21-22 | 23-31 | Instruction | 159 |-------|------|-------|-------|-------|-------|----------------------| 160 | PO | RT | RA | RB | SH | XO | lbzusx RT,RA,RB,SH | 161 | PO | RT | RA | RB | SH | XO | lhzusx RT,RA,RB,SH | 162 | PO | RT | RA | RB | SH | XO | lhausx RT,RA,RB,SH | 163 | PO | RT | RA | RB | SH | XO | lwzusx RT,RA,RB,SH | 164 | PO | RT | RA | RB | SH | XO | lwausx RT,RA,RB,SH | 165 | PO | RT | RA | RB | SH | XO | ldusx RT,RA,RB,SH | 166 | PO | RS | RA | RB | SH | XO | stbusx RS,RA,RB,SH | 167 | PO | RS | RA | RB | SH | XO | sthusx RS,RA,RB,SH | 168 | PO | RS | RA | RB | SH | XO | stwusx RS,RA,RB,SH | 169 | PO | RS | RA | RB | SH | XO | stdusx RS,RA,RB,SH | 170 | PO | FRT | RA | RB | SH | XO | lfsuxs FRT,RA,RB,SH | 171 | PO | FRT | RA | RB | SH | XO | lfduxs FRT,RA,RB,SH | 172 | PO | FRS | RA | RB | SH | XO | stfsuxs FRS,RA,RB,SH | 173 | PO | FRS | RA | RB | SH | XO | stfduxs FRS,RA,RB,SH | 174 175 **Post-Increment-Update LD/ST-Shifted** 176 177 | 0-5 | 6-10 | 11-15 | 16-20 | 21-22 | 23-31 | Instruction | 178 |-------|------|-------|-------|-------|-------|----------------------| 179 | PO | RT | RA | RB | SH | XO | lbzuspx RT,RA,RB,SH | 180 | PO | RT | RA | RB | SH | XO | lhzuspx RT,RA,RB,SH | 181 | PO | RT | RA | RB | SH | XO | lhauspx RT,RA,RB,SH | 182 | PO | RT | RA | RB | SH | XO | lwzuspx RT,RA,RB,SH | 183 | PO | RT | RA | RB | SH | XO | lwauspx RT,RA,RB,SH | 184 | PO | RS | RA | RB | SH | XO | stbuspx RS,RA,RB,SH | 185 | PO | RS | RA | RB | SH | XO | sthuspx RS,RA,RB,SH | 186 | PO | RS | RA | RB | SH | XO | stwuspx RS,RA,RB,SH | 187 | PO | RS | RA | RB | SH | XO | stduspx RS,RA,RB,SH | 188 | PO | RT | RA | RB | SH | XO | lduspx RT,RA,RB,SH | 189 | PO | FRT | RA | RB | SH | XO | lfdupxs FRT,RA,RB,SH | 190 | PO | FRT | RA | RB | SH | XO | lfsupxs FRT,RA,RB,SH | 191 | PO | FRS | RA | RB | SH | XO | stfdupxs FRS,RA,RB,SH | 192 | PO | FRS | RA | RB | SH | XO | stfsupxs FRS,RA,RB,SH |
LD/ST table added to ls004, along with a brief explanation. needs expanding https://git.libre-soc.org/?p=libreriscv.git;a=commitdiff;h=9ca0101bb441ed8f81b1e2f0f86538942a73d0c0 also adding to v2 of ls012, see bug #1054
missing loads: signed 8bit stbus needs to be stbsx would be nice to have but icr if part of openpower: * both sign/zero extending byte reversed loads * byte reversed fp loads/stores
(In reply to Jacob Lifshay from comment #2) > missing loads: signed 8bit > stbus needs to be stbsx do make corrections directly but do NOT add yet more instructions. > would be nice to have but icr if part of openpower: > * both sign/zero extending byte reversed loads > * byte reversed fp loads/stores in combination with post-increment this becomes almost 80 instructions which is alarming. NO MORE NEW INSTRUCTIONS. please. byterev GPR was only added for RADIX MMU support which was punished severely in BE mode.
btw jacob the strategy for this one - shift-and-add - is to include *both* shift-and-add *and* the alternatives (the LD-ST-indexed-shifted group) and let the OPF ISA WG evaluate which is better. they might actually decide LD-ST-indexed-shifted is better, we just never know. given that both ARM and x86 have them, it's not actually as hard a sell as it might sound even though it's 37 (!) instructions (excluding ld-st-indexed-shifted-postincrement) https://azeria-labs.com/memory-instructions-load-and-store-part-4/
(In reply to Luke Kenneth Casson Leighton from comment #4) > they might actually decide LD-ST-indexed-shifted is better, we just > never know. well i'm hoping they decide they want *both* shift-add and LD-ST-indexed-shifted, x86 and arm have both: https://rust.godbolt.org/z/r8azxbW8f
(In reply to Luke Kenneth Casson Leighton from comment #3) > (In reply to Jacob Lifshay from comment #2) > > missing loads: signed 8bit turns out I thought PowerISA had lba*, it doesn't...
https://git.libre-soc.org/?p=libreriscv.git;a=shortlog;h=7881bad4de0c9b5abc3fe7db2ae65181f7369bd4 commit 7881bad4de0c9b5abc3fe7db2ae65181f7369bd4 Author: Jacob Lifshay <programmerjake@gmail.com> Date: Wed Apr 12 13:52:26 2023 -0700 rename typoed stbus -> stbsx commit 9a4dec12f0d7f5d47291925094f5dcb187fb965a Author: Jacob Lifshay <programmerjake@gmail.com> Date: Wed Apr 12 13:37:47 2023 -0700 shift-add is useful even with LD-ST-indexed-shifted
commit 6eeb6589a9efeb2b7b0e15972859c8445bb64302 (HEAD -> master) Author: Luke Kenneth Casson Leighton <lkcl@lkcl.net> Date: Thu Apr 13 03:51:33 2023 +0100 add shaddsw to optable.csv, part of ls004 actually part of ls012 (v2) bug #1051
(In reply to Luke Kenneth Casson Leighton from comment #8) > add shaddsw to optable.csv, part of ls004 > > actually part of ls012 (v2) bug #1051 fixed spelling: commit 021c3ab3a10ed8683b9bc8912ea5d08795c851e8 Author: Jacob Lifshay <programmerjake@gmail.com> Date: Wed Apr 12 19:58:07 2023 -0700 fix shaddw spelling
ARM register offset load/store word and unsigned byte instructions. https://developer.arm.com/documentation/ddi0406/c/Application-Level-Architecture/Instruction-Details/Shifts-applied-to-a-register/Constant-shifts LDRSB R0, [R5, R1, LSL #1] ; Read byte value from an address equal to ; sum of R5 and two times R1, sign extended it https://developer.arm.com/documentation/dui0552/a/the-cortex-m3-instruction-set/memory-access-instructions/ldr-and-str--register-offset added first cut of loadstoreshift.mdwn page https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;h=5990036b5
https://git.libre-soc.org/?p=libreriscv.git;a=blob;f=openpower/sv/rfc/ls004.mdwn
added first cut of fixedstoreshift.mdwn, TODO edit https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;h=106817db
commit 94a8b2c6b39947968ab421fbda1af4211d74846c (HEAD -> shriya_add_descriptions) Author: Luke Kenneth Casson Leighton <lkcl@lkcl.net> Date: Fri Nov 17 15:30:33 2023 +0000 add RS/FRT/FRS to Z-23 Form for ls004 https://bugs.libre-soc.org/show_bug.cgi?id=1055
commit 0a0d3b53721b96e82d140308215a870ebae2afdc (HEAD -> shriya_add_descriptions) Author: Luke Kenneth Casson Leighton <lkcl@lkcl.net> Date: Fri Nov 17 15:42:07 2023 +0000 add Z-23 to RT FRS FRT
commit 147835b50b87dc305ad844de872955878abcf903 (HEAD -> master, origin/master, origin/HEAD) Author: Luke Kenneth Casson Leighton <lkcl@lkcl.net> Date: Fri Nov 17 15:45:33 2023 +0000 add fixed/fload load/store shift instructions to ls004
Added descriptions/ pseudo-code for pifploadshift, pifpstoreshift, fploadshift, fpstoreshit, pifixedloadshit and pifixedstoreshift files. Also added them to RFC ls004 as inline includes.
I expect all instructions' pseudocode to be of the form: shifted: EA <- (RA|0) + ((RB) << (SH + 1)) load/store at address EA postinc: EA <- (RA) load/store at address EA RA <- (RA) + (RB) preinc-shifted: EA <- (RA) + ((RB) << (SH + 1)) load/store at address EA RA <- EA postinc-shifted: EA <- (RA) load/store at address EA RA <- (RA) + ((RB) << (SH + 1)) many of the recent pseudocode updates don't match that, e.g.: https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;h=fc695579c71a83592cfd2b91d41a60544dbd34a9 which is: * ldupsx RT,RA,RB,SH Pseudo-code: EA <- (RA)<<(SH+1) RT <- MEM(EA, 8) RA <- (RA) + (RB) this doesn't look right to me, RB should be shifted, not RA, and the shift should be on the increment, not EA
(In reply to Jacob Lifshay from comment #17) > > Pseudo-code: > > EA <- (RA)<<(SH+1) > RT <- MEM(EA, 8) > RA <- (RA) + (RB) > > this doesn't look right to me, RB should be shifted, not RA, and the shift > should be on the increment, not EA yep well spotted jacob thank you for the review (budget adapted). shriya can you handle this?
> yep well spotted jacob thank you for the review (budget adapted). > shriya can you handle this? Yes I will!
> > this doesn't look right to me, RB should be shifted, not RA, and the shift > should be on the increment, not EA Jacob can you give me an example of what do you mean by shift should be on increment?
i have this: Z23-Form stbupsx RS,RA,RB,SH Pseudo-code: EA <- (RA) + (RB)<<(SH+1) ea <- (RA) MEM(ea, 1) <- (RS)[XLEN-8:XLEN-1] RA <- EA you had this: EA <- (RA)<<(SH+1) + (RB) X-Form lbzupsx RT,RA,RB,SH Pseudo-code: EA <- (RA)<<(SH+1) RT <- ([0] * (XLEN-8)) || MEM(EA, 1) RA <- (RA) + (RB) shoud be: X-Form lbzupsx RT,RA,RB,SH Pseudo-code: EA <- (RA) RT <- ([0] * (XLEN-8)) || MEM(EA, 1) RA <- (RA) + (RB)<<(SH+1) etc etc
(In reply to Luke Kenneth Casson Leighton from comment #21) > EA <- (RA) + (RB)<<(SH+1) note the shift should be parenthesized like so: EA <- (RA) + ((RB) << (SH+1)) so you don't accidentally have it trying to do: EA <- ((RA) + (RB)) << (SH+1) this is necessary because our parser has the wrong precedence for some operations (idk if that includes shifts) and it hasn't yet been fixed: bug #1082
commit 70792a5b202a28869accc23c73ea6e260363cfb2 (HEAD -> shriya_add_descriptions, origin/shriya_add_descriptions) Author: Luke Kenneth Casson Leighton <lkcl@lkcl.net> Date: Mon Nov 20 15:03:10 2023 +0000 sum RA+RB<<sh not RA<<sh+RB https://bugs.libre-soc.org/show_bug.cgi?id=1055
(In reply to Jacob Lifshay from comment #22) > (In reply to Luke Kenneth Casson Leighton from comment #21) > > EA <- (RA) + (RB)<<(SH+1) > > note the shift should be parenthesized like so: > EA <- (RA) + ((RB) << (SH+1)) > so you don't accidentally have it trying to do: > EA <- ((RA) + (RB)) << (SH+1) I just checked, our parser currently has shifts as lower precedence than addition, so it does parse RA + RB << shift as (RA + RB) << shift, which is wrong here, so, shriya, luke, those shift expressions *need* to be parenthesized. https://git.libre-soc.org/?p=openpower-isa.git;a=blob;f=src/openpower/decoder/pseudo/parser.py;h=d0a2630acad98e5092fffd5ab771ef7460f1fefa;hb=f3f8aff4aefa36e0af4c9afe2d6e5913d2e78328#l184 shifts being lower precedence than addition seems to match PowerISA v3.1B from cursory examination: see the pseudocode of vcmpequb, they have: (all_true<<3) + (all_false<<1)
> > (In reply to Luke Kenneth Casson Leighton from comment #21) > > > EA <- (RA) + (RB)<<(SH+1) > > > > note the shift should be parenthesized like so: > > EA <- (RA) + ((RB) << (SH+1))) Okay Jacob! Thanks for noticing.
(In reply to Jacob Lifshay from comment #24) > I just checked, our parser currently has shifts as lower precedence than > addition, so it does parse RA + RB << shift as (RA + RB) << shift, which is > wrong here, so, shriya, luke, those shift expressions *need* to be > parenthesized. no: we fix the parser.
(In reply to Luke Kenneth Casson Leighton from comment #26) > (In reply to Jacob Lifshay from comment #24) > > > I just checked, our parser currently has shifts as lower precedence than > > addition, so it does parse RA + RB << shift as (RA + RB) << shift, which is > > wrong here, so, shriya, luke, those shift expressions *need* to be > > parenthesized. > > no: we fix the parser. no, the parser is correct in how it parses addition and shifts, it matches PowerISA v3.1B afaict. everywhere I could see in the specification PDF it parenthesizes shifts when adding (I went through every occurrence of << or >> in the PDF).
(In reply to Jacob Lifshay from comment #27) > no, the parser is correct in how it parses addition and shifts, it matches > PowerISA v3.1B afaict. everywhere I could see in the specification PDF it > parenthesizes shifts when adding (I went through every occurrence of << or > >> in the PDF). additionally, parenthesizing when operators whose relative precedence is less commonly remembered (such as shifts and adds) makes the specification clearer, so even if shift was higher precedence, I still think it would be wise to parenthesize the shift.
(In reply to Jacob Lifshay from comment #28) > additionally, parenthesizing when operators whose relative precedence is > less commonly remembered (such as shifts and adds) makes the specification > clearer, so even if shift was higher precedence, I still think it would be > wise to parenthesize the shift. no because prior authors do not. (edit: sorry, i misread. if other authors do it then we do it) follow the pattern of OTHER PEOPLE, not what YOU think should be "correct technical clarity". there is 35 years history in Power spec, we MUST respect that.
(In reply to Luke Kenneth Casson Leighton from comment #29) > no because prior authors do not. they do, iirc every single shift and add in the PDF is parenthesized. There's like 10-15 of them and I looked at all of them.
(In reply to Luke Kenneth Casson Leighton from comment #29) > (edit: sorry, i misread. if other authors do it then we do it) ok, i replied too early...sorry
(In reply to Jacob Lifshay from comment #30) > they do, iirc every single shift and add in the PDF is parenthesized. > There's like 10-15 of them and I looked at all of them. DONE
spotted an error: lfsupsx current pseudo-code: EA <- (RA) + ((RB)<<(SH+1)) FRT <- DOUBLE(MEM(RA, 4)) RA <- EA when reading from RA, it needs to be parenthesized, otherwise you get the register number instead of the contents of the register: EA <- (RA) + ((RB)<<(SH+1)) FRT <- DOUBLE(MEM((RA), 4)) <--- here RA <- EA lfdupsx also has that issue
(In reply to Jacob Lifshay from comment #33) > spotted an error: Corrected
shriya do you want to copy what andrey did for ls011 (bug #1048) and do exactly the same include raw=yes trick? then run "make ls004.pdf" in the openpower/sv/rfc directory
https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;h=3ca4bd25 https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;h=4561335e7 https://git.libre-soc.org/?p=libreriscv.git;a=commitdiff;h=bec140c5d missed out fploadshift, fpstoreshift. TODO fpstoreshift needs editing
(In reply to Luke Kenneth Casson Leighton from comment #37) > missed out fploadshift, fpstoreshift. TODO fpstoreshift needs editing done. submitted ls004 v2 at https://openpowerfoundation.org/isarfc/ radio button 2. further discussion (including Libre-SOC review) to take place at bug #1091