a shift-and-add instruction is needed, useful for twofish/serpent(?) as well as single-instruction for LD-ST-address-calculation-with-a-shift * TODO: add shaddsw or replace shadduw with shaddsw
first version of pseudocode https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;h=f43f0bf226745b30b40b7af2890adf911d5fe8a8 actual: https://git.libre-soc.org/?p=openpower-isa.git;a=blob;f=openpower/isa/bitmanip.mdwn;h=89515ec977385f3ffd7f91b253f795c7d55d69c8;hb=249e8881b01ca8209f6a9d9934072b654bfc7901#l106
I'm too lazy to raise a new issue, so I'll support these in binutils in scope of this task.
Binutils support: https://git.libre-soc.org/?p=binutils-gdb.git;a=commitdiff;h=b8953aca7868c2e61a014e2f2280c0c896bd50f7;hp=ec8f4eba3c9d1139e4cf9cdb980fd5a622c33ac4
I'll note the pseudocode is wrong, it should shift left and add, instead it replaces the lower bits with zero then adds. https://git.libre-soc.org/?p=openpower-isa.git;a=blob;f=openpower/isa/bitmanip.mdwn;h=89515ec977385f3ffd7f91b253f795c7d55d69c8;hb=HEAD#l115 115 case (0): sum[0:XLEN-1] = (n[0:XLEN-1-1] || [0]*1) + (RA) 116 case (1): sum[0:XLEN-1] = (n[0:XLEN-2-1] || [0]*2) + (RA) 117 case (2): sum[0:XLEN-1] = (n[0:XLEN-3-1] || [0]*3) + (RA) 118 default: sum[0:XLEN-1] = (n[0:XLEN-4-1] || [0]*4) + (RA) that should be: case (0): sum[0:XLEN-1] = (n[1:XLEN-1] || [0]*1) + (RA) and likewise for all other cases.
(In reply to Jacob Lifshay from comment #4) > I'll note the pseudocode is wrong, it should shift left and add, instead it > replaces the lower bits with zero then adds. ah well spotted. > that should be: > case (0): sum[0:XLEN-1] = (n[1:XLEN-1] || [0]*1) + (RA) > and likewise for all other cases. i just realised, it's perfectly fine to do this: n <- (RB) m <- sm + 1 RT <- (n[m:XLEN-1] || [0]*m) + (RA) https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;h=263a8fca0f7413e62cf74a0e559e8965b0951e6c
i've reverted the use of dynamic shift, instead it is an explicit switch/case again. ghostmansd, next time if you encounter difficulties just take the easiest route.
Reverted back. Please don't revert the code unless really necessary. Peeking the easiest route after you already see the end of the difficult but correct route doesn't seem to be the best option. :-) Details at https://bugs.libre-soc.org/show_bug.cgi?id=967#c1.
In the meeting today I pointed out that we probably also want shaddsw since C likes to use int for array indexing. shaddsw: RT = RA + ((uint64_t)(int32_t)RB << (sh + 1))
(In reply to Jacob Lifshay from comment #8) > In the meeting today I pointed out that we probably also want shaddsw since > C likes to use int for array indexing. > > shaddsw: > RT = RA + ((uint64_t)(int32_t)RB << (sh + 1)) mmm.... if added as "L", a 5th operand (shaddw RT, RA, RB, sh, L) it reduces the number of new instructions added. if L = 1 then n <- EXTS(RB[32:63) else n <- [0]*32 || RB[32:63) pseudoaliases will need to be created as well. and yet another encoding Form. sigh.
(In reply to Jacob Lifshay from comment #8) > In the meeting today I pointed out that we probably also want shaddsw since > C likes to use int for array indexing. > > shaddsw: > RT = RA + ((uint64_t)(int32_t)RB << (sh + 1)) the number of instructions being of some concern, this can be done with extsw+shadd. is there any compelling reason for an explicit 32bit shift-signed-add? how much usage will it get in actual compiled code?
Most of the cases I saw in the wild didn't really need sign. That is, many C programmers just use int since they simply unaware or are not used to size_t.
That said, adding a new form can be done. It will need more time, but for sure is doable. I just need some rationale that this is really required. With new form, we can have 1 instruction.
Is there any agreement and conclusion about this? If necessary I can create a patch which supports a new form to support shaddsw/shadduw simultaneously; this will need some tweaks and changes, so this task will take more time. Otherwise this task is done.
(In reply to Dmitry Selyutin from comment #13) > Is there any agreement and conclusion about this? If necessary I can create > a patch which supports a new form to support shaddsw/shadduw simultaneously; no. no justification. extsw is sufficient. we have over 100 instructions to add already > this will need some tweaks and changes, so this task will take more time. > Otherwise this task is done. agreed.
Luke, this needs the budget, as well as #968 and #967.
(In reply to Luke Kenneth Casson Leighton from comment #14) > (In reply to Dmitry Selyutin from comment #13) > > Is there any agreement and conclusion about this? If necessary I can create > > a patch which supports a new form to support shaddsw/shadduw simultaneously; > > no. no justification. extsw is sufficient. we have over 100 > instructions to add already realistically if we're adding only one of shaddsw/shadduw, we should add shaddsw because a C compiler can trivially use it with array[int_index], whereas shadduw requires the compiler to first prove the index isn't ever negative, which is often difficult (all the easy cases are already covered by ld/st's immediate or by shadd). u32 indexes are very uncommon -- basically everything uses a constant, size_t/ptrdiff_t, or int.
(In reply to Jacob Lifshay from comment #16) > realistically if we're adding only one of shaddsw/shadduw, we should add > shaddsw because a C compiler can trivially use it with array[int_index], > whereas shadduw requires the compiler to first prove the index isn't ever > negative, which is often difficult (all the easy cases are already covered > by ld/st's immediate or by shadd). those easy cases are: 1. the compiler can prove the index is a constant, in which case just using ld/st immediate is sufficient. 2. the compiler can promote all uses of the index variable to a 64-bit value throughout the code thereby avoiding needing any sign extensions (a common optimization compilers already do), in which case it can just use shadd.
comments #16 and #17 haven't been addressed, so i think it's premature to mark this bug as fixed.
Please refer to comment #14. It's explicitly stated by PL it's done long ago. If you need to add something else, please raise the relevant task.
(In reply to Dmitry Selyutin from comment #19) > Please refer to comment #14. It's explicitly stated by PL it's done long > ago. If you need to add something else, please raise the relevant task. ok, reported https://bugs.libre-soc.org/show_bug.cgi?id=996