https://libre-soc.org/irclog/%23libre-soc.2023-04-18.log.html#t2023-04-18T10:09:54
(fixed spelling)
> so, lkcl, what do you think of changing extsb/h/w to be:
> RA <- EXTSXL(RS, XLEN/N) for N = 8/4/2 respectively
> instead of: RA <- EXTSXL(RS, N) for N = 8/16/32 respectively?

this allows sign extending from 1/2/4-bit values while retaining the existing ability to sign extend from 8/16/32-bit values, all that changes is which of extsb/h/w you need to use for each case.

TODO: unit tests?
proposed:

XLEN=8:
extsb: 1-bit -> 8-bit sign extension
extsh: 2-bit -> 8-bit sign extension
extsw: 4-bit -> 8-bit sign extension

XLEN=16:
extsb: 2-bit -> 16-bit sign extension
extsh: 4-bit -> 16-bit sign extension
extsw: 8-bit -> 16-bit sign extension

XLEN=32:
extsb: 4-bit -> 32-bit sign extension
extsh: 8-bit -> 32-bit sign extension
extsw: 16-bit -> 32-bit sign extension

XLEN=64:
extsb: 8-bit -> 64-bit sign extension
extsh: 16-bit -> 64-bit sign extension
extsw: 32-bit -> 64-bit sign extension
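The table above follows mechanically from the XLEN/N rule. A minimal Python sketch, assuming EXTSXL(RS, n) means "sign-extend the low n bits of RS to XLEN bits"; the names `DIVISORS`, `source_width` and `exts_scaled` are hypothetical and are not taken from openpower-isa:

```python
# Hypothetical model of the proposed scaling; not the openpower-isa code.
DIVISORS = {"extsb": 8, "extsh": 4, "extsw": 2}


def source_width(insn: str, xlen: int) -> int:
    """Width in bits of the value that gets sign-extended to xlen bits."""
    return xlen // DIVISORS[insn]


def exts_scaled(insn: str, rs: int, xlen: int) -> int:
    """Sign-extend the low source_width(insn, xlen) bits of rs to xlen bits."""
    n = source_width(insn, xlen)
    low = rs & ((1 << n) - 1)          # keep only the n source bits
    if low >> (n - 1):                 # sign bit of the n-bit value is set
        low |= ((1 << xlen) - 1) & ~((1 << n) - 1)  # fill upper bits with 1s
    return low


# Regenerate the table from the rule.
for xlen in (8, 16, 32, 64):
    for insn in ("extsb", "extsh", "extsw"):
        print(f"XLEN={xlen}: {insn}: {source_width(insn, xlen)}-bit -> {xlen}-bit")
```

At XLEN=64 this reduces to the existing 8/16/32-bit behaviour, which is the point of the proposal.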
note 1-bit to 8-bit sign extension is particularly useful for generating traditional simd masks (generally used in code not specifically designed for SVP64) where elements are either -1 or 0
(In reply to Jacob Lifshay from comment #2)
> note 1-bit to 8-bit sign extension is particularly useful for generating
> traditional simd masks (generally used in code not specifically designed for
> SVP64) where elements are either -1 or 0

e.g. code to generate 32-bit traditional simd masks (possibly faster than shifting):

sv.extsb/elwid=8/subvl=4 *r3, *r3 # sign extend lsb bit of each byte
sv.extsh/elwid=32 *r3, *r3 # sign extend least significant byte of each 32-bit word to full word
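A rough Python model of what the two instructions above would do to a single 32-bit word under the proposed semantics. This is only a sketch: it assumes elwid=8 makes extsb a 1-bit -> 8-bit extension applied to each of the four bytes, and elwid=32 makes extsh an 8-bit -> 32-bit extension. `sign_extend` and `simd_mask32` are hypothetical names, and SVP64 details such as VL and predication are ignored:

```python
def sign_extend(value: int, from_bits: int, to_bits: int) -> int:
    """Sign-extend the low from_bits of value to to_bits (result is unsigned)."""
    value &= (1 << from_bits) - 1
    if value >> (from_bits - 1):       # sign bit set: make it negative
        value -= 1 << from_bits
    return value & ((1 << to_bits) - 1)


def simd_mask32(word: int) -> int:
    """Turn the lsb of a 32-bit word into an all-ones / all-zeros mask."""
    # step 1: models sv.extsb/elwid=8/subvl=4 (each byte: 1-bit -> 8-bit)
    stepped = 0
    for i in range(4):
        byte = (word >> (8 * i)) & 0xFF
        stepped |= sign_extend(byte, 1, 8) << (8 * i)
    # step 2: models sv.extsh/elwid=32 (low byte: 8-bit -> 32-bit)
    return sign_extend(stepped, 8, 32)


print(hex(simd_mask32(0x00000001)))
print(hex(simd_mask32(0x12345600)))
```

Only the lsb of the lowest byte decides the final mask; the other three bytes from step 1 are overwritten by step 2.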
(In reply to Jacob Lifshay from comment #0)
> https://libre-soc.org/irclog/%23libre-soc.2023-04-18.log.html#t2023-04-
> this allows sign extending from 1/2/4-bit values while retaining the
> existing ability to sign extend from 8/16/32-bit values, all that changes is
> which of extsb/h/w you need to use for each case.

let me think about it. seems ok on initial glance.
(In reply to Jacob Lifshay from comment #3)
> (In reply to Jacob Lifshay from comment #2)
> > note 1-bit to 8-bit sign extension is particularly useful for generating
> > traditional simd masks (generally used in code not specifically designed for
> > SVP64) where elements are either -1 or 0

grevlut covers this (as 1 of 1000s of others)

> e.g. code to generate 32-bit traditional simd masks (possibly faster than
> shifting):
>
> sv.extsb/elwid=8/subvl=4 *r3, *r3 # sign extend lsb bit of each byte

nice.

> sv.extsh/elwid=32 *r3, *r3 # sign extend least significant byte of each
> 32-bit word to full word

like it.
lkcl: note that imho you waay overcomplicated the extsb pseudocode, i think it should just be: RT <- EXTSXL((RA), XLEN/8)
(In reply to Jacob Lifshay from comment #6)
> lkcl: note that imho you waay overcomplicated the extsb pseudocode, i think
> it should just be: RT <- EXTSXL((RA), XLEN/8)

had to be illustrated clearly and explicitly. EXTSXL is neither clear nor in the Power ISA v3.1 spec.
(In reply to Jacob Lifshay from comment #6)
> lkcl: note that imho you waay overcomplicated the extsb pseudocode, i think
> it should just be: RT <- EXTSXL((RA), XLEN/8)

if you don't want to use EXTSXL, the following should work:
RT[0:XLEN-1] <- EXTS((RA)[XLEN*7/8:XLEN-1])
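To check that the two spellings agree, here is a small Python sketch. It assumes Power ISA MSB0 bit numbering (bit 0 is the most significant bit, so `[XLEN*7/8:XLEN-1]` selects the lowest XLEN/8 bits) and assumes EXTSXL(x, n) sign-extends the low n bits of x to XLEN. All function names are hypothetical and this is not the pseudocode used in the RFC:

```python
def msb0_slice(value: int, xlen: int, start: int, end: int):
    """Return (field, width) for bits start..end inclusive, MSB0 numbering."""
    width = end - start + 1
    shift = xlen - 1 - end             # distance of bit `end` from the lsb
    return (value >> shift) & ((1 << width) - 1), width


def exts(field: int, width: int, xlen: int) -> int:
    """Sign-extend a width-bit field to xlen bits (result is unsigned)."""
    if field >> (width - 1):
        field -= 1 << width
    return field & ((1 << xlen) - 1)


def extsb_slice(ra: int, xlen: int) -> int:
    """RT[0:XLEN-1] <- EXTS((RA)[XLEN*7/8:XLEN-1])"""
    field, width = msb0_slice(ra, xlen, xlen * 7 // 8, xlen - 1)
    return exts(field, width, xlen)


def extsb_extsxl(ra: int, xlen: int) -> int:
    """RT <- EXTSXL((RA), XLEN/8), under the assumed meaning of EXTSXL."""
    n = xlen // 8
    return exts(ra & ((1 << n) - 1), n, xlen)


for xlen in (8, 16, 32, 64):
    for ra in (0x00, 0x01, 0x7F, 0x80, 0xFF):
        assert extsb_slice(ra, xlen) == extsb_extsxl(ra, xlen)
```

Under those assumptions both forms give the same result for every XLEN in the table; which one reads more clearly in a spec is a separate question.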
(In reply to Jacob Lifshay from comment #8)
> if you don't want to use EXTSXL, the following should work:
> RT[0:XLEN-1] <- EXTS((RA)[XLEN*7/8:XLEN-1])

you misunderstand. the pseudocode i have written is there to illustrate, precisely and exactly at the bit level, the concept of XLEN within this RFC, and it achieves that purpose as intended. any function *at all* destroys that purpose and intention to EXPLICITLY show at the bitlevel what is going on. no change shall occur to this pseudocode.
https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;h=75d9fc62dc34efeba2b395564bd1579ed29a0c14

commit 75d9fc62dc34efeba2b395564bd1579ed29a0c14
Author: Jacob Lifshay <programmerjake@gmail.com>
Date: Wed Apr 19 17:58:19 2023 -0700

    change extsb/h/w to scale based on XLEN rather than extending from a fixed width