Bug 911 - svshape2 instruction (with offsets)
Summary: svshape2 instruction (with offsets)
Status: RESOLVED FIXED
Alias: None
Product: Libre-SOC's first SoC
Classification: Unclassified
Component: Source Code (show other bugs)
Version: unspecified
Hardware: PC Linux
: High enhancement
Assignee: Dmitry Selyutin
URL: https://libre-soc.org/openpower/sv/re...
Depends on:
Blocks:
 
Reported: 2022-08-24 20:11 BST by Luke Kenneth Casson Leighton
Modified: 2022-09-25 17:08 BST (History)
3 users (show)

See Also:
NLnet milestone: NLNet.2019.10.042.Vulkan
total budget (EUR) for completion of task and all subtasks: 3000
budget (EUR) for this task, excluding subtasks' budget: 3000
parent task for budget allocation: 254
child tasks for budget allocation:
The table of payments (in EUR) for this task; TOML format:
lkcl = { amount = 1500, submitted = 2022-09-15, paid = 2022-09-16 } ghostmansd = { amount = 1000, submitted = 2022-09-16, paid = 2022-09-23 } [jacob] amount = 500 submitted = 2022-09-16 paid = 2022-09-24


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Luke Kenneth Casson Leighton 2022-08-24 20:11:09 BST
as we found in bug #794 a way to easily set offsets is needed.
that will be similar to svindex, but fitting into SVRM of svshape
https://libre-soc.org/openpower/sv/remap/discussion/

binutils spec simulator unit tests, all needed.

* instruction spec: DONE
* pseudocode        DONE
* CSV file          DONE
* fields.txt        DONE
* sv/trans/svp64.py DONE
* unit tests        DONE
* binutils          TODO
Comment 1 Jacob Lifshay 2022-08-24 20:28:15 BST
notes: offset will need to be signed. the instruction should have a mode where it doesn't change vl or mvl. imho having it be able to enable remap in the same instruction is more important than being able to set matrix modes, it saves an instruction in the common code sequence (the arithmetic operation and elwid will vary):

# not actual proposed svshape2 syntax, just to tell you what gets set
svshape2 offset=-1, remap=rb, persistant=0
# merged vslideup and alu op, reads 1 byte from end of r63
sv.add/elwid=8 *32, *64, *32

equivalent to:
for i in range(vl):
    regs[32 + i // 8].byte[i % 8] += regs[64 + (i - 1) // 8].byte[(i - 1) % 8]
Comment 2 Jacob Lifshay 2022-08-24 20:36:31 BST
maybe have a 5-bit field for which remap to enable (in0, in1, in2, out0, out1), and a 2-bit field for which svshape register to overwrite.
Comment 3 Luke Kenneth Casson Leighton 2022-08-24 20:44:00 BST
there is a gotcha with the instruction database, the entries needed
for using two of the SVRM 4-bit field encodings for a new instruction
means that *14* lines in the minor_22.csv file are needed:

0000---00011,OP_SVSHAPE
0001---00011,OP_SVSHAPE
...
...

which is exactly what was just "fixed" for rlwimi etc.

the proper fix therefore is to allow an instruction to match
against a *list* of integer patterns, not just one.
i will track down the lines in openpower-isa in the next post
Comment 4 Luke Kenneth Casson Leighton 2022-08-24 20:54:25 BST
(In reply to Jacob Lifshay from comment #1)
> notes: offset will need to be signed.

this makes me jittery as it is an extra bit.  at least 3 bits are needed
to be able to offset to a byte within a 64 bit register.  now it is 4 bits,
they are very precious in SVSTATE. i think there is room.

and it makes sense because not all ops are... swap-args. A-B != B-A.

> the instruction should have a mode
> where it doesn't change vl or mvl. 

nowhere near enough space to do otherwise.

> imho having it be able to enable remap in
> the same instruction is more important than being able to set matrix modes,
> it saves an instruction in the common code sequence (the arithmetic
> operation and elwid will vary):

i already came up with something that fits into the existing
svshape instruction, and is similar to svindex.  both in HDL and
encoding
 
(In reply to Jacob Lifshay from comment #2)
> maybe have a 5-bit field for which remap to enable (in0, in1, in2, out0,
> out1), and a 2-bit field for which svshape register to overwrite.

i already have an encoding and implementation for svindex, which can
be copied / adapted to make svshape2. see svindex spec.

the more similar the instructions are the (a) less work the (b) less HDL
the (c) higher the chance of acceptance.

adding new highly-strategic instructions like this at such a late stage
is pretty risky in so many ways i don't even want to begin listing them.
the more they are identical to other REMAP instructions the less risk.
Comment 5 Jacob Lifshay 2022-08-24 21:23:05 BST
(In reply to Luke Kenneth Casson Leighton from comment #4)
> (In reply to Jacob Lifshay from comment #1)
> > notes: offset will need to be signed.
> 
> this makes me jittery as it is an extra bit.  at least 3 bits are needed
> to be able to offset to a byte within a 64 bit register.  now it is 4 bits,
> they are very precious in SVSTATE. i think there is room.

assuming you meant SVSHAPE0-3, not SVSTATE.

imho it'd make more sense to just expand SVSHAPE0-3 to 64-bits...we'd want more than i4 for offset, since many instructions can only access every other or every 4th register iirc, so offset should be able to cover the rest of the bytes.

also, it just occurred to me we don't need signed offset after all, e.g. if you want offset -2 for bytes, just use offset -2 % 8 = 6 instead and use the previous register in the sv.* instruction.

rather than:
svshape2 offset=-1, remap=ra
sv.add/elwid=8 *32, *64, *32

do:
svshape offset=6, remap=ra
sv.add/elwid=8 *32, *63, *32

or:
svshape offset=14, remap=ra
sv.add/elwid=8 *32, *62, *32 # if you can't address every register...
Comment 6 Luke Kenneth Casson Leighton 2022-08-24 21:33:08 BST
(In reply to Jacob Lifshay from comment #5)
> (In reply to Luke Kenneth Casson Leighton from comment #4)
> > (In reply to Jacob Lifshay from comment #1)
> > > notes: offset will need to be signed.
> > 
> > this makes me jittery as it is an extra bit.  at least 3 bits are needed
> > to be able to offset to a byte within a 64 bit register.  now it is 4 bits,
> > they are very precious in SVSTATE. i think there is room.
> 
> assuming you meant SVSHAPE0-3, not SVSTATE.

... yes.

> imho it'd make more sense to just expand SVSHAPE0-3 to 64-bits...

no, because i kept them very specifically to 32 bit so as to reduce
context switch latency.  on the TODO list is to merge into 1/2 the number
of 64 bit SPRs.

> we'd want
> more than i4 for offset, since many instructions can only access every other
> or every 4th register iirc, so offset should be able to cover the rest of
> the bytes.

future version.
 
> also, it just occurred to me we don't need signed offset after all, e.g. if
> you want offset -2 for bytes, just use offset -2 % 8 = 6 instead and use the
> previous register in the sv.* instruction.

face-palm moment, of course. i still have 4 bits anyway because of VSX 128-bit
registers (quad.prevision, not the SIMD ones)

> rather than:
> svshape2 offset=-1, remap=ra
> sv.add/elwid=8 *32, *64, *32
> 
> do:
> svshape offset=6, remap=ra
> sv.add/elwid=8 *32, *63, *32
> 
> or:
> svshape offset=14, remap=ra
> sv.add/elwid=8 *32, *62, *32 # if you can't address every register...

neat. like it.  that fixes the EXTRA2 problem. like it.
Comment 7 Luke Kenneth Casson Leighton 2022-09-02 15:44:43 BST
added CSV files, fields.txt entries, and stub pseudocode

https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;h=29ea1c07e1632b99c94371cc6da9b181283e7995
Comment 8 Luke Kenneth Casson Leighton 2022-09-02 18:54:07 BST
add first draft svshape2 pseudocode

https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;h=a448d0bb54913f1f894fb1032a32d159b80804e9
Comment 9 Luke Kenneth Casson Leighton 2022-09-03 13:00:19 BST
added an "inv" option which only does one dimension.
other bits can be added later with an "sv.svshape2"

https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;h=77a4f7104968859385e0b8117ea74041cbe6e436
Comment 10 Luke Kenneth Casson Leighton 2022-09-03 13:15:43 BST
nope.  decided to revert this as the cost is too great.

https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;h=0d7d7b0059cf5835461b5b2e1d9fa2352f8f6d4d
Comment 11 Luke Kenneth Casson Leighton 2022-09-03 13:46:06 BST
unit test showing how offset works, and explaining how it could be used with
operations restricted to EXTRA2

https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;h=0ae4b42d6f7bcb1fa83f8e62e46578eff2141f10
Comment 12 Jacob Lifshay 2022-09-15 23:52:12 BST
Since lkcl submitted his RFP, I'm assuming that means it's completed.