create/update spec and documentation page for shift-and-add instruction
andrey do you also want to have a go at creating ls004 for this one?
part of the justification for this instruction does come from twofish (as well as LD-ST-address-calculate-with-a-shift) #define ENCRYPT_RND( A,B,C,D, T0, T1, xkey, r ) \ T0 = g0(A,xkey); T1 = g1(B,xkey);\ C ^= T0+T1+xkey->K[8+2*(r)]; C = ROR32(C,1);\ D = ROL32(D,1); D ^= T0+2*T1+xkey->K[8+2*(r)+1] and there is additional shifting occuring in creation of a q-table: ae = i>>4; be = i&0xf; ao = ae ^ be; bo = ae ^ ROR4BY1(be) ^ ((ae<<3)&8); ae = t[0][ao]; be = t[1][bo]; ao = ae ^ be; bo = ae ^ ROR4BY1(be) ^ ((ae<<3)&8); ae = t[2][ao]; be = t[3][bo];
(In reply to Luke Kenneth Casson Leighton from comment #1) > andrey do you also want to have a go at creating ls004 for this one? Yes, will look into this. Been busy sorting out VAT returns this weekend.
Drafted the pseudo-code for shadd and shadduw, please give it a check: https://libre-soc.org/openpower/sv/bitmanip/#shift-add
What's left in the scope of this task? Should I update the code in the docs with the stuff we have in markdown?
(In reply to Dmitry Selyutin from comment #5) > What's left in the scope of this task? Should I update the code in the docs > with the stuff we have in markdown? make sure the description is accurate, pseudocode in URL accurate, any examples that need to go in ls004 RFC, links in wiki markdown are there, cross-referencing. a lot of small details.
(In reply to Luke Kenneth Casson Leighton from comment #1) > andrey do you also want to have a go at creating ls004 for this one? (In reply to Luke Kenneth Casson Leighton from comment #6) > any examples that need to go in ls004 RFC, links in wiki markdown > are there, cross-referencing. a lot of small details. Added initial draft for ls004, containing what I know: https://libre-soc.org/openpower/sv/rfc/ls004/
(In reply to Andrey Miroshnikov from comment #7) > (In reply to Luke Kenneth Casson Leighton from comment #1) > > andrey do you also want to have a go at creating ls004 for this one? > > (In reply to Luke Kenneth Casson Leighton from comment #6) > > any examples that need to go in ls004 RFC, links in wiki markdown > > are there, cross-referencing. a lot of small details. > > Added initial draft for ls004, containing what I know: > > https://libre-soc.org/openpower/sv/rfc/ls004/ the word descriptioj makes no sense at all `shift` is determined by the 2-bit bitfield `sm`+1. The minimum shift as 1, maximum 4. The result is shifted (RB) + (RA), and is stored in register RT. these are not "instructions", noone could follow them step by step. try reading them, and following them. do you know, looking at nothing else, what will happen? now try this: wben sm is zero, the contents of register RB are multiplied by two, added to the contents of register RA, and the result stored in RT. this is obvious instructions which can be followed, yes? shadduw is "unsigned word" btw
(In reply to Luke Kenneth Casson Leighton from comment #8) > the word descriptioj makes no sense at all Yes, true. > now try this: > > wben sm is zero, the contents of register RB are multiplied by two, > added to the contents of register RA, and the result stored in RT. Changed the descriptions: https://git.libre-soc.org/?p=libreriscv.git;a=commitdiff;h=2ee653d030e00ffd1e1a93644088e6f2fee59650 > shadduw is "unsigned word" btw This is not obvious to me (I'm probably not understanding). Not sure what the purpose of this instruction is. The pseudocode of shadd doesn't indicate that it is **signed**, therefore why have shadduw? Second line of shadduw pseudocode: n <- (RB)[XLEN/2:XLEN-1] # btw we agreed on not using XLEN in RFCs for now? With XLEN=64, this would be (RB)[32:63], or the lower half (MSB ordering?). Why ignore bits (RB)[0:31]?
(In reply to Andrey Miroshnikov from comment #9) > The pseudocode of shadd doesn't indicate that it is **signed**, 64-bit on both operands, it makes no difference whatsoever. > therefore why have shadduw? compare: ADD( RA, SIGNEXTEND( LOWERHALF(RB) ) with: ADD( RA, ZEROEXTEND( LOWERHALF(RB) ) then set RB=0x0000_0000_8000_0000 and compare the two. > Second line of shadduw pseudocode: > n <- (RB)[XLEN/2:XLEN-1] # btw we agreed on not using XLEN in RFCs for now? in RFCs markdown pseudocode yes in markdown pseudocode used by ISACaller *no*. > With XLEN=64, this would be (RB)[32:63], or the lower half (MSB ordering?). yep. > Why ignore bits (RB)[0:31]? because it is more compact when doing address offsets. takes up less space. RA base 64 bit, offsets into data structure need only be 32bit.
(In reply to Luke Kenneth Casson Leighton from comment #10) > 64-bit on both operands, it makes no difference whatsoever. Ah ok, although the operands are unsigned, the sign extension allows to do address offsetting. > > compare: > > ADD( RA, SIGNEXTEND( LOWERHALF(RB) ) > > with: > > ADD( RA, ZEROEXTEND( LOWERHALF(RB) ) > > then set RB=0x0000_0000_8000_0000 and compare the two. Thanks, that info was missing from bitmanip.mdwn. Where are these statements located? grep'ing for those I didn't find anything in openpower-isa repo. > in RFCs markdown pseudocode yes > in markdown pseudocode used by ISACaller *no*. Thanks for clearing that up. > because it is more compact when doing address offsets. takes up less > space. RA base 64 bit, offsets into data structure need only be 32bit. Does that mean shadduw pseudocode is missing a EXTS() statement? There is no explicit sign-extension in the pseudocode. Adjusted based on your explanation: https://git.libre-soc.org/?p=libreriscv.git;a=commitdiff;h=95a74ba3566333ac498bb1110781e563f86b4c06
Regarding the pseudocode: https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;h=0389a26f6d725335ee1e3d920dc9e87861bff1d6 Otherwise it'd overflow.
Andrey, it seems that you handle this task, right? I fixed the pseudocode on the openpower-isa side, and you update the docs in librerisv repository. It seems that I no longer have the write access to libreriscv repository; I vaguely recall I had it before, what happened?
(In reply to Andrey Miroshnikov from comment #11) > (In reply to Luke Kenneth Casson Leighton from comment #10) > > 64-bit on both operands, it makes no difference whatsoever. > Ah ok, although the operands are unsigned, the sign extension allows to do > address offsetting. > > > > > compare: > > > > ADD( RA, SIGNEXTEND( LOWERHALF(RB) ) > > > > with: > > > > ADD( RA, ZEROEXTEND( LOWERHALF(RB) ) > > > > then set RB=0x0000_0000_8000_0000 and compare the two. > > Thanks, that info was missing from bitmanip.mdwn. yes. you don't put massive explanations into pseudocode. it goes into notes (here) or a discussion page. > Where are these statements located? they are not. i wrote them for you, just now. > > because it is more compact when doing address offsets. takes up less > > space. RA base 64 bit, offsets into data structure need only be 32bit. > > Does that mean shadduw pseudocode is missing a EXTS() statement? no it does not. > There is no explicit sign-extension in the pseudocode. correct. if the desired behaviour was to perform a signed 32bit add then not only would EXTS be used, the instruction would be called "shaddsw" (or sigh shaddws) > Adjusted based on your explanation: > https://git.libre-soc.org/?p=libreriscv.git;a=commitdiff; > h=95a74ba3566333ac498bb1110781e563f86b4c06 read again. i said *no* on changing the pseudocode markdown for ISACaller. why did you then change the pseudocode markdown used by ISACaller? please revert @@ -122,17 +124,21 @@ shadd r4, r1, r2, 3 Pseudocode: shift <- sm + 1 # Shift is between 1-4 - n <- (RB)[XLEN/2:XLEN-1] # Limit RB to upper word (32-bits) + n <- (RB)[32:63] # Only use lower 32-bits of RB sum[0:63] <- (n << shift) + (RA) # Shift n, add RA RT <- sum # Result stored in RT
(In reply to Dmitry Selyutin from comment #12) > Regarding the pseudocode: > > https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff; > h=0389a26f6d725335ee1e3d920dc9e87861bff1d6 > > Otherwise it'd overflow. jacob pointed out that using a switch/case is better for static compilation (to c) which we will be doing shortly. (In reply to Dmitry Selyutin from comment #13) > It seems that I no longer have the write access to libreriscv repository; I > vaguely recall I had it before, what happened? jacob made unauthorised modifications for about the 3rd time, not following ptoject procedures, and i had to perform a force-master push (again). jacob was directly responsible for communicating with everyone to ensure that they had updated, but obviously missed you. i only re-add people once i know that no overwrite damage can occur. please do check that you are at latest master commit.
(In reply to Luke Kenneth Casson Leighton from comment #14) > @@ -122,17 +124,21 @@ shadd r4, r1, r2, 3 > Pseudocode: > > shift <- sm + 1 # Shift is between > 1-4 > - n <- (RB)[XLEN/2:XLEN-1] # Limit RB to upper word > (32-bits) > + n <- (RB)[32:63] # Only use lower 32-bits of > RB > sum[0:63] <- (n << shift) + (RA) # Shift n, add RA > RT <- sum # Result stored in RT no, sorry, sorry, you're right: that's ls004.mdwn, not openpower-isa bitmanip.mdwn. distracted, hard to pay attention from an apartment.
(In reply to Luke Kenneth Casson Leighton from comment #15) > (In reply to Dmitry Selyutin from comment #12) > > Otherwise it'd overflow. > > jacob pointed out that using a switch/case is better for > static compilation (to c) which we will be doing shortly. Any static compilation, if it's so complicated that it handles switch, should likely be able to deal with constants, shouldn't it? This switch/case is so ugly... so ugly that the only reason to keep it would be the way our integers overflow. :-) Anyway, this works now, I'm just somewhat unsatisfied that the code is explicit about widths, since this seems to be an implementation detail. > i only re-add people once i know > that no overwrite damage can occur. please do check that you are at latest > master commit. I am at latest commit, but I don't already have to push anything, since Andrey handles this task, if I got it correctly. :-)
(In reply to Luke Kenneth Casson Leighton from comment #14) > no it does not. > > > There is no explicit sign-extension in the pseudocode. > > correct. if the desired behaviour was to perform a signed 32bit add > then not only would EXTS be used, the instruction would be called "shaddsw" > (or sigh shaddws) Then why did you give me an example with SIGNEXTEND and ZEROEXTEND, if they are not relevant? A lot of my confusion comes from the naming of these two instructions. Is there a good reason as to why shadduw uses "uw" (Unsigned Word)? Since both shadd and shadduw are unsigned, you're putting emphasis on the shadduw being unsigned. Of course shadduw also deals with a "word" so unsigned is probably necessary here. While shadd does not imply the operation deals with unsigned or signed integers. (In reply to Dmitry Selyutin from comment #17) > Andrey handles this task, if I got it correctly. :-) Doing ls004, though I did add the initial pseudocode as well. If yours is working, where can I find it? I guess you just stored "sm" first "shift <- sm" to ensure there wouldn't be overflow problems? (Our IRC chat on the overflow problem: https://libre-soc.org/irclog/latest.log.html#t2022-11-01T19:36:54)
(In reply to Dmitry Selyutin from comment #17) > Any static compilation, if it's so complicated that it handles switch, > should likely be able to deal with constants, shouldn't it? a pattern-match would be needed "if variable used then output a shift mask using a variable else use a constant", which would likely need a 2-pass compiler. > This switch/case > is so ugly... i know, but it is braindead simple to compile to c, just output the equivalent c switch/case
(In reply to Andrey Miroshnikov from comment #18) > (In reply to Luke Kenneth Casson Leighton from comment #14) > > no it does not. > > > > > There is no explicit sign-extension in the pseudocode. > > > > correct. if the desired behaviour was to perform a signed 32bit add > > then not only would EXTS be used, the instruction would be called "shaddsw" > > (or sigh shaddws) > > Then why did you give me an example with SIGNEXTEND and ZEROEXTEND, if they > are not relevant? because the question you asked was, "why is it not sign-extended". the answer to that involves creating two sets of pseudocode, one sign-extending one not-sign-extending, then putting in an example value and noticing that the sign-extending one gives the WRONG result. sign-extending: RB=0x0000_0000_8000_000 SIGNEXTEND(RB) --> 0xFFFF_FFFF_8000_0000 this is the WRONG thing to do. shadduw is SPECIFICALLY DESIGNED to NOT perform this signextension on RB. > A lot of my confusion comes from the naming of these two instructions. > > Is there a good reason as to why shadduw uses "uw" (Unsigned Word)? because it explicitly makes clear that RB is **NOT** to be sign-extended before the addition. > Since both shadd and shadduw are unsigned, you're putting emphasis on the > shadduw being unsigned. > Of course shadduw also deals with a "word" so unsigned is probably necessary > here. yes. > While shadd does not imply the operation deals with unsigned or signed > integers. yes. because when both RA and RB are 64-bit, there *is* no difference. sign-extension only matters when one of the two numbers is a DIFFERENT bitwidth.
(In reply to Luke Kenneth Casson Leighton from comment #15) > (In reply to Dmitry Selyutin from comment #13) > > > It seems that I no longer have the write access to libreriscv repository; I > > vaguely recall I had it before, what happened? > > jacob made unauthorised modifications for about the 3rd time, not following > ptoject procedures, and i had to perform a force-master push (again). jacob > was directly responsible for communicating with everyone to ensure that they > had updated, but obviously missed you. afaict i sent an email to the mailing list asking everyone to get back to me ensuring their local repo was in good shape (this was about 6mo ago iirc), since you never responded, I never asked luke to re-add write permissions. > i only re-add people once i know > that no overwrite damage can occur. please do check that you are at latest > master commit.
Assigned to Andrey. I only handled the pseudocode; not sure if this is related to documentation.
Hopefully this is the right place to submit feedback about ls004; if not then Luke could add a pointer to the right place to the OPF gitea issue. Comments from IBM architects regarding the proposal: The ls004 instructions use "sm" as the name of the shift-amount field. sm is not a split field, so the field name should be capitalized. (I'm not sure what "sm" stands for. "shift minus (1)"? If yes, SHM1 might be a better name. (ls004 doesn't include a definition of the field for Section 1.6.3.)) An additional comment is that the architecture uses "s" to stand for "Shift" in Shift instruction mnemonics, not "sh" as in ls004. (ls003 uses "s".)