Bug 924 - potential major opcode allocation for SVP64
Summary: potential major opcode allocation for SVP64
Status: CONFIRMED
Alias: None
Product: Libre-SOC's first SoC
Classification: Unclassified
Component: Specification
Version: unspecified
Hardware: Other Linux
Importance: High enhancement
Assignee: Luke Kenneth Casson Leighton
URL: https://libre-soc.org/openpower/sv/rf...
Depends on:
Blocks: 952
Reported: 2022-09-09 12:39 BST by Luke Kenneth Casson Leighton
Modified: 2022-10-14 15:22 BST (History)
8 users

See Also:
NLnet milestone: NLnet.2022-08-051.OPF
total budget (EUR) for completion of task and all subtasks: 0
budget (EUR) for this task, excluding subtasks' budget: 0
parent task for budget allocation:
child tasks for budget allocation:
The table of payments (in EUR) for this task; TOML format:


Description Luke Kenneth Casson Leighton 2022-09-09 12:39:07 BST
there are discussions and reasons that cannot be disclosed,
there is an ISA WG meeting coming up tuesday 12th,
a need has come up to find ways to allocate SVP64 and the
80+ scalar opcodes without using 75% of 3 major 32 bit opcodes.

* 75% (50%?) needed for SVP64, SVP64-Single (and SVP64-Reserved?)
* 75% for grevluti, crternlogi, ternlogi
* 75% for xpermi, bmrevi, grevlut etc

this would require for example EXT005 EXT009 and 75% of EXT017

an alternative is needed.

submitted 12sep2022 - ticket number [[OPF][ISA] #984]
Comment 1 Jacob Lifshay 2022-09-09 16:06:07 BST
(In reply to Luke Kenneth Casson Leighton from comment #0)
> there are discussions and reasons that cannot be disclosed,
> there is an ISA WG meeting coming up tuesday 12th,
> a need has come up to find ways to allocate SVP64 and the
> 80+ scalar opcodes without using 75% of 3 major 32 bit opcodes.
> 
> * 75% (50%?) needed for SVP64, SVP64-Single (and SVP64-Reserved?)

Why are there 3 separate prefixes? I can understand reserving space for future extensions [1] but I've never heard of SVP64-Single before (it's not on the wiki), assuming it is for scalar ops, i'm inclined to say that it's unnecessary since we can already use svp64 with all EXTRA2/3 fields set to scalar. 

[1]: future extensions other than increasing reg file size -- note that the current svp64 prefix is 100% sufficient for future register file expansions, my proposed plan is that new bits on sv.setvl would set some state in SPRs that would change how the existing svp64 prefix's register fields are interpreted, allowing accessing more registers. this retains backward compatibility because programs have to opt-in to using the new reg field interpretations. this doesn't need any new encoding space for additional prefixes.

> * 75% for grevluti, crternlogi, ternlogi
> * 75% for xpermi, bmrevi, grevlut etc

imho ternlogi doesn't need Rc, it just takes up too much space.

imho grevluti/grevlut just takes up too much space for the benefit it gives, imho we should just have grevi/grev and if you really want repeating load immediates just use:
sv.li/elwid=16/subvl=4 rt, 0x5555

also bmrevi is unnecessary because bitreversed fields are very uncommon (i've never seen code that uses them iirc) and there's a simple 2-instruction sequence that does the same thing:
grevi rt, ra, 0b111111  # bit reverse
rldicl rt, rt, ...  # extract field, must use 63-x for bit positions

after cutting out all of the above stuff i mentioned, it should reduce the space requirements by around 50%
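
For illustration, the two-instruction sequence above can be modelled in Python. This is a minimal sketch, assuming grevi follows the usual butterfly-network definition of generalized bit reverse; `grev64` and `bitrev_field` are hypothetical helper names, and a plain shift-and-mask stands in for rldicl.

```python
MASK64 = (1 << 64) - 1

# Butterfly masks for swap distances 1, 2, 4, 8, 16, 32.
GREV_MASKS = [
    0x5555555555555555,
    0x3333333333333333,
    0x0F0F0F0F0F0F0F0F,
    0x00FF00FF00FF00FF,
    0x0000FFFF0000FFFF,
    0x00000000FFFFFFFF,
]

def grev64(x, k):
    """Generalized bit reverse: bit i of k enables swapping adjacent
    groups of 2**i bits. k = 0b111111 is a full 64-bit bit reversal."""
    for i, m in enumerate(GREV_MASKS):
        if k & (1 << i):
            s = 1 << i
            x = ((x & m) << s) | ((x >> s) & m)
    return x & MASK64

def bitrev_field(x, lo, width):
    """Extract the field of `width` bits whose lowest bit is `lo`
    (LSB0 numbering), with its bits in reversed order.
    After the full reversal, original bit p sits at 63 - p, hence the
    mirrored shift amount (the "63-x" remark above)."""
    rev = grev64(x, 0b111111)          # models: grevi rt, ra, 0b111111
    shift = 64 - lo - width            # mirrored position of the field
    return (rev >> shift) & ((1 << width) - 1)   # stand-in for rldicl

print(bin(bitrev_field(0b1101 << 8, 8, 4)))
```

The field 0b1101 placed at bits 8..11 comes back with its bit order flipped.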
Comment 2 djac 2022-09-09 16:47:11 BST
To all the active members of LibreSOC, let me explain what I can about what is going on.

LibreSOC has developed SVP64 based on public releases of the ISA.

Luke and I can now see and discuss under RED Semiconductor's membership the current state of affairs, and changes already made or under way to the ISA as yet unpublished.  We can also be told what resources are available to LibreSOC as an absolute statement.

We have to submit a workable means of integrating SVP64 into the POWER ISA that complies with the current state of affairs of the ISA.

Apologies but we are going to have to do this by a double blind process where Luke, Toshaan or I will be able to say "that cannot be done but I cannot tell you why".

We have had guidance and advice as to what would be accepted, most of it in the clear, and we need to convey this to the LibreSOC team as a way forward.

This was always going to happen at some point and today's the day.

Inevitably there is going to be some form of coming together with what we need and what we can have, and we all need to constructively work towards this.

I would ask everyone to bear with us in this process as the prize will substantially clear our way to getting the RFC for SVP64 accepted in its entirety, and remove many obstructions.
Comment 3 Luke Kenneth Casson Leighton 2022-09-09 17:19:53 BST
(In reply to Jacob Lifshay from comment #1)
 
> after cutting out all of the above stuff i mentioned, it should reduce the
> space requirements by around 50%

this is a tangent that still does not solve the problem.
as this is being closely examined can we keep it on topic?
Comment 4 Luke Kenneth Casson Leighton 2022-09-09 18:39:05 BST
can we do a jitsi meeting on Sunday around say UTC 17:00?
i'd like to go over options, Toshaan to Chair the meeting
and keep OPF confidential details out of the discussion
Comment 5 Toshaan Bharvani 2022-09-11 16:00:14 BST
Moved to 18:00 UTC
Calendar invite sent to all on this ticket
(If I forgot you, please comment here or on IRC)
Comment 6 Luke Kenneth Casson Leighton 2022-09-11 20:17:28 BST
\pagenumbering{fancy}
Comment 7 Luke Kenneth Casson Leighton 2022-09-15 14:01:03 BST
| width | assembler | prefix?      | suffix    | description   |
|-------|-----------|--------------|-----------|---------------|
| 64bit | fishmv    | 0x24000000   | 0x12345678| scalar EXT2nn |
| 64bit | ss.fishmv | 0x24!zero    | 0x12345678| scalar SVP64Single:EXT2nn |
| 64bit | sv.fishmv | 0x26nnnnnn   | 0x12345678| vector SVP64:EXT2nn |

second round of ideas needed, above shows how fishmv would be
allocated as scalar, svp64single and svp64vector, in the newly
proposed EXT2nn area

| width | assembler | prefix?      | suffix    | description   |
|-------|-----------|--------------|-----------|---------------|
| 32bit | fishmv    | none         | 0x12345678| scalar EXT0nn |
| 64bit | ss.fishmv | 0x26!zero    | 0x12345678| scalar SVP64Single:EXT0nn |
| 64bit | sv.fishmv | 0x27nnnnnn   | 0x12345678| vector SVP64:EXT0nn |

this one shows how it would be done if fishmv was in EXT063.

any of the 64bit areas if allocated to anything else it's Game Over
at the Decode Phase.

UNLESS..... this:

| width | assembler | prefix?      | suffix    | description   |
|-------|-----------|--------------|-----------|---------------|
| 64bit | fishmv    | 0x24000000   | PO2 345678| scalar EXT2nn |
| 64bit | ss.fishmv | 0x24!zero    | PO2 345678| scalar SVP64Single:EXT2nn |
| 64bit | sv.fishmv | 0x26nnnnnn   | PO2 345678| vector SVP64:EXT2nn |

where PO2 is:

| PO2      | Usage  |
|----------|--------|
| 0b11nnnn | SVP64  |
| 0b00nnnn | other  |
| 0b10nnnn | other  |
| 0b01nnnn | other  |
Comment 8 Luke Kenneth Casson Leighton 2022-09-15 21:31:29 BST
damn, i messed up, need help reviewing this:

| 0-5 | 6 | 7 | 8-31  | 32:33 |  Description               |
|-----|---|---|-------|-------|---------------------------|
| PO9?| 0 | 0 | 0000  | 11    | RESERVED (other)          |
| PO9?| 0 | 0 | !zero | 11    | SVP64 (current and future) |
| PO9?| 0 | 1 | xxxx  | 11    | SVP64 (current and future) |
| PO9?| 0 | x | xxxx  | 01    | RESERVED (other)          |
| PO9?| 0 | x | xxxx  | 10    | RESERVED (other)          |
| PO9?| 0 | x | xxxx  | 00    | RESERVED (other)          |
| PO9?| 1 | x | xxxx  | xx    | SVP64 (current and future) |

it is a hybrid scheme that allows priority to Vectorising
"old" (existing, called "Defined Words") EXT000-063.  which
is when bit6=1, this takes up 50% of EXT009.

the "new" setting (bit6=0) can be shared, and the idea there
is that by setting the top 2 bits of the Suffix NEW Primary
Opcode to 0b11 it only needs another 25% of EXT009.
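
To help with reviewing the table above, here is a minimal Python sketch of it as a decoder. Assumptions: MSB0 bit numbering over the full 64-bit prefix+suffix (bit 0 is the most significant bit), and the "PO9?" column taken as Primary Opcode 9. The names `field`, `encode` and `classify` are hypothetical and only for illustration.

```python
def field(insn, start, end, total=64):
    """Extract bits start..end (inclusive, MSB0 numbering) of insn."""
    width = end - start + 1
    return (insn >> (total - 1 - end)) & ((1 << width) - 1)

def encode(bit6, bit7, bits8_31, bits32_33, po=9):
    """Build a 64-bit word with only the fields from the table set."""
    return ((po << 58) | (bit6 << 57) | (bit7 << 56)
            | (bits8_31 << 32) | (bits32_33 << 30))

def classify(insn):
    """Return which row of the table a 64-bit instruction falls under."""
    if field(insn, 0, 5) != 9:
        return "not PO9"
    if field(insn, 6, 6) == 1:
        return "SVP64"        # "old" EXT000-063, vectorised: 50% of the PO
    if field(insn, 32, 33) != 0b11:
        return "RESERVED"     # "new" area, left for other uses
    if field(insn, 7, 7) == 0 and field(insn, 8, 31) == 0:
        return "RESERVED"     # all-zero 24-bit field with bit7=0
    return "SVP64"

for args in [(0, 0, 0, 0b11), (0, 0, 1, 0b11), (0, 1, 0, 0b11),
             (0, 1, 5, 0b01), (1, 0, 0, 0b00)]:
    print(args, classify(encode(*args)))
```

Each tuple is one representative of a table row, so the printout can be compared against the Description column line by line.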
Comment 9 Jacob Lifshay 2022-09-15 23:25:13 BST
(In reply to Luke Kenneth Casson Leighton from comment #8)
> damn, i messed up, need help reviewing this:
> 
> | 0-5 | 6 | 7 | 8-31  | 32:33 |  Description               |
> |-----|---|---|-------|-------|---------------------------|
> | PO9?| 0 | 0 | 0000  | 11    | RESERVED (other)          |
> | PO9?| 0 | 0 | !zero | 11    | SVP64 (current and future) |
> | PO9?| 0 | 1 | xxxx  | 11    | SVP64 (current and future) |
> | PO9?| 0 | x | xxxx  | 01    | RESERVED (other)          |
> | PO9?| 0 | x | xxxx  | 10    | RESERVED (other)          |
> | PO9?| 0 | x | xxxx  | 00    | RESERVED (other)          |
> | PO9?| 1 | x | xxxx  | xx    | SVP64 (current and future) |

may i suggest kinda reversing the order by having bit 6 = 1 mean new and bit 6 = 0 mean old, also for new have bit 32:33 = 0 mean svp64, and bit 32:33 != 0 mean reserved?
Comment 10 Luke Kenneth Casson Leighton 2022-09-16 00:36:38 BST
(In reply to Jacob Lifshay from comment #9)

> may i suggest kinda reversing the order by having bit 6 = 1 mean new and bit
> 6 = 0 mean old, also for new have bit 32:33 = 0 mean svp64, and bit 32:33 !=
> 0 mean reserved?

apparently it's based on what is in EXT001 Public v3.1 1.6.3,
so no, we can't, it would be too confusing.
Comment 11 Jacob Lifshay 2022-09-16 00:40:11 BST
(In reply to Luke Kenneth Casson Leighton from comment #10)
> (In reply to Jacob Lifshay from comment #9)
> 
> > may i suggest kinda reversing the order by having bit 6 = 1 mean new and bit
> > 6 = 0 mean old, also for new have bit 32:33 = 0 mean svp64, and bit 32:33 !=
> > 0 mean reserved?
> 
> apparently it's based on what is in EXT001 Public v3.1 1.6.3,
> so no, we can't, it would be too confusing.

should still be able to do the bit 32:33 part afaict...
Comment 12 Luke Kenneth Casson Leighton 2022-09-16 01:22:36 BST
(In reply to Jacob Lifshay from comment #11)

> should still be able to do the bit 32:33 part afaict...

yes that's new.

basically to 24-bit-prefix EXT000-063 and also include
SVP64Single and identify it that's:

* 24-bit prefix
* 1-bit vector/scalar
* 1-bit saying "old" encoding
* 6-bit PO EXT009

grand total 32-bits

thus in effect we are asking for a whopping 50% of EXT009-64bit

the next one - "new" - we can say "we don't want all of it"

* 24-bit prefix
* 1-bit vector/scalar
* 1-bit saying "new" encoding
* 6-bit PO EXT009
then
* 2-bit in 32-34 to say "we want 25% of this"

as that is in the same "PO" position (call it PO2) that corresponds to
EXT248-EXT263

which then leaves 75% of 50% left for future ISA WG needs, a grand
total of:

* 64 minus (
  * 6-bit PO EXT009 (taken up)
  * 1-bit "new" encoding (bit6)
  * 2-bit "not SVP64-Reserved" (bits 32-34)
 )

=

a massive 55 bits available for future encoding space of non-SVP64 instructions.
it's actually slightly more than the allocated space of EXT001 (ok so they
have 6-bits taken up with a "specifier")
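
The arithmetic above, worked through as a short Python sketch. The dictionary keys are just labels for the items in the list, and treating the "old" and "new" halves as exactly 1/2 each of EXT009 is an assumption drawn from the single bit-6 split.

```python
from fractions import Fraction

TOTAL_BITS = 64

# Bits that are pinned down before any non-SVP64 instruction can be encoded.
fixed = {
    "PO EXT009": 6,
    "new-encoding flag (bit 6)": 1,
    "2-bit not-SVP64-Reserved selector": 2,
}
free_bits = TOTAL_BITS - sum(fixed.values())

# Shares of the EXT009 64-bit space.
old_half = Fraction(1, 2)                   # bit6 selects "old": all SVP64
new_half = Fraction(1, 2)                   # bit6 selects "new": shared
svp64_new = Fraction(1, 4) * new_half       # one of four 2-bit patterns
left_for_wg = Fraction(3, 4) * new_half     # the other three patterns
svp64_total = old_half + svp64_new

print(free_bits, svp64_total, left_for_wg)  # 55 5/8 3/8
```

Under those assumptions the request totals 5/8 of EXT009, with 3/8 left over for future ISA WG use.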
Comment 13 Jacob Lifshay 2022-09-16 17:32:32 BST
(In reply to Luke Kenneth Casson Leighton from comment #12)
> (In reply to Jacob Lifshay from comment #11)
> 
> > should still be able to do the bit 32:33 part afaict...
> 
> yes that's new.
> <snip>
> which then leaves 75% of 50% left for future ISA WG needs, a grand
> total of:
> 
> * 64 minus (
>   * 6-bit PO EXT009 (taken up)
>   * 1-bit "new" encoding (bit6)
>   * 2-bit "not SVP64-Reserved" (bits 32-34)

what i'm saying is bits 32:33 (34 is a typo) should be changed to where zero indicates new svp64-scalar, and nonzero indicates reserved. right now you picked 11 indicates new svp64-scalar:
https://git.libre-soc.org/?p=libreriscv.git;a=blob;f=openpower/sv/rfc/ls001.mdwn;h=16bcb16307a163e0f5d2483d4a7df84eba926720;hb=7a316cb8c159ba507a6278dea2f0b29d2d4a0fe3#l668
Comment 14 Luke Kenneth Casson Leighton 2022-09-16 17:36:47 BST
(In reply to Jacob Lifshay from comment #13)

> what i'm saying is bits 32:33 (34 is a typo)

yes, good catch

> should be changed to where zero
> indicates new svp64-scalar, and nonzero indicates reserved. right now you
> picked 11 indicates new svp64-scalar:
> https://git.libre-soc.org/?p=libreriscv.git;a=blob;f=openpower/sv/rfc/ls001.
> mdwn;h=16bcb16307a163e0f5d2483d4a7df84eba926720;
> hb=7a316cb8c159ba507a6278dea2f0b29d2d4a0fe3#l668

ah no. you've very much misunderstood.

bits 32-33 deliberately overlap with Primary Opcodes numbered 48 thru 63
(0b**11**0000 to 0b**11**1111).

bit 7 is responsible for differentiating between SVP64(Vector) and
SVP64(Single).
Comment 15 Luke Kenneth Casson Leighton 2022-09-16 17:45:31 BST
bit 6 and 7 you have to decipher from EXT001 Power ISA Public v3.1 1.6.3

* bit 6: old / new
* bit 7: memory / register

Paul and Brad suggested using a similar encoding for SVP64(*)

* bit 6: old / new
* bit 7: scalar / vector

the problem is, that takes up 100% of a Primary Opcode and we need to
reduce that allocation a **LOT**.

there is nothing we can do about EXT000-063 (bit6=1) that just has to be 50% of
the PO.

but we *can* limit the amount of space needed in EXT2nn (bit6=0) by forcing
the top two bits to be 0b11xxxx, which has the effect of forcing
numbering to be in the range EXT248-263.
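
A small Python sketch of how pinning the top two bits maps onto the EXT numbering. It assumes EXT2nn means 200 plus the 6-bit suffix Primary Opcode, which is inferred from the naming used in this thread; `ext_numbers` is a hypothetical helper.

```python
def ext_numbers(top_bits, nbits=2, base=200, po_bits=6):
    """List the EXT numbers reachable when the top `nbits` bits of the
    6-bit Primary Opcode are forced to `top_bits`."""
    low_bits = po_bits - nbits
    first = top_bits << low_bits
    return [base + po for po in range(first, first + (1 << low_bits))]

svp64_range = ext_numbers(0b11)
print(svp64_range[0], svp64_range[-1], len(svp64_range))  # 248 263 16
```

Any other 2-bit pattern gives a different block of 16, for example `ext_numbers(0b00)` covers 200 through 215.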
Comment 16 Jacob Lifshay 2022-09-17 00:54:01 BST
(In reply to Luke Kenneth Casson Leighton from comment #14)
> (In reply to Jacob Lifshay from comment #13)
> 
> > what i'm saying is bits 32:33 (34 is a typo)
> 
> yes, good catch
> 
> > should be changed to where zero
> > indicates new svp64-scalar, and nonzero indicates reserved. right now you
> > picked 11 indicates new svp64-scalar:
> > https://git.libre-soc.org/?p=libreriscv.git;a=blob;f=openpower/sv/rfc/ls001.
> > mdwn;h=16bcb16307a163e0f5d2483d4a7df84eba926720;
> > hb=7a316cb8c159ba507a6278dea2f0b29d2d4a0fe3#l668
> 
> ah no. you've very much misunderstood.
> 
> bits 32-33 deliberately overlap with Primary Opcodes numbered 48 thru 63
> (0b**11**0000 to 0b**11**1111).

afaict those are *new* encodings, so no, it's not 48-63, it's 248-263 (in your weird prefix * 100 + suffix scheme). it would be equally valid to pick po 200-215 (as i'm suggesting) as what you currently picked -- po 248-263.

I'm suggesting picking zeros rather than 11 because that's what is conventional for quite a few places in the power isa spec.
Comment 17 Luke Kenneth Casson Leighton 2022-09-17 09:40:08 BST
(In reply to Jacob Lifshay from comment #16)

> afaict those are *new* encodings, so no, it's not 48-63, it's 248-263 (in
> your weird prefix * 100 + suffix scheme).

[the weird prefix+suffix scheme.  best to think of it in advance as not
 being "personally owned" given that it's to be transferred to the OPF]

>  it would be equally valid to pick
> po 200-215 (as i'm suggesting) as what you currently picked -- po 248-263.

ok, right, yes, that's much clearer
 
> I'm suggesting picking zeros rather than 11 because that's what is
> conventional for quite a few places in the power isa spec.

honestly it doesn't matter what it is. i'm happy for the OPF to pick the
digits.

two digits represents a 25% reservation of the new space.  i did say that
we could potentially go to three (12.5% - only 8 new POs) but it starts
to get dicey at that point, especially given that:

* at least one of those new POs is likely to be for large bitmanip
* 22 new POs (!!) are needed for a full suite of LD-ST-with-shift!
  (which has to be done *really* carefully - if at all - just
   sticking with add-with-shift might be better)
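
The trade-off between two and three fixed digits can be tabulated with a short Python sketch. It assumes a 6-bit Primary Opcode field; `reservation` is a hypothetical helper, and the figure of 22 is the LD-ST-with-shift estimate from the list above.

```python
from fractions import Fraction

def reservation(fixed_bits, po_bits=6):
    """Return (share of the new space, number of new POs) when
    `fixed_bits` leading bits of the Primary Opcode are pinned."""
    share = Fraction(1, 2 ** fixed_bits)
    count = 2 ** (po_bits - fixed_bits)
    return share, count

LDST_SHIFT_POS = 22  # POs quoted for a full LD-ST-with-shift suite

for n in (2, 3):
    share, count = reservation(n)
    fits = count >= LDST_SHIFT_POS
    print(f"{n} fixed bits: {share} of new space, {count} POs, "
          f"LD-ST-with-shift fits: {fits}")
```

On these numbers neither option leaves room for all 22 LD-ST-with-shift POs, which is consistent with the caution expressed about that suite.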