588 – add SVP64 to PowerDecoder2

Status:	RESOLVED FIXED

Alias:	None

Product:	Libre-SOC's first SoC
Classification:	Unclassified
Component:	Source Code
Version:	unspecified
Hardware:	PC Linux

Importance:	--- enhancement
Assignee:	Luke Kenneth Casson Leighton

URL:	https://libre-soc.org/openpower/sv/im...

Depends on:	703
Blocks:	617 583

Reported:	2021-01-30 00:20 GMT by Luke Kenneth Casson Leighton
Modified:	2022-09-10 12:40 BST
CC List:	2 users

See Also:	583
NLnet milestone:	NLNet.2019.10.046.Standards
total budget (EUR) for completion of task and all subtasks:	1800
budget (EUR) for this task, excluding subtasks' budget:	1000
parent task for budget allocation:	241
child tasks for budget allocation:	703
The table of payments (in EUR) for this task; TOML format:
cesar = { amount = 300, submitted = 2022-06-16, paid = 2022-07-21 }
lkcl = { amount = 700, submitted = 2022-06-25, paid = 2022-07-21 }

Description Luke Kenneth Casson Leighton 2021-01-30 00:20:14 GMT

PowerDecoder2 needs to be able to understand SVP64, particularly register
numbers (isvec).  also the "modes" need sub-decoding, and predicate
selection etc.

* Reg EXTRA: done except out2
* CR EXTRA: done
* SPR EXTRA: TODO
* Predicate selection: TODO
* Element-width overrides: TODO
* Mode decoding incl. LDST: done, testing TODO

Comment 1 Luke Kenneth Casson Leighton 2021-01-30 00:22:54 GMT

commit 63aeeaa31a60065b03421d3a5497327078d0b0e8 (HEAD -> master)
Author: Luke Kenneth Casson Leighton <lkcl@lkcl.net>
Date:   Sat Jan 30 00:17:20 2021 +0000

    add first SVP64 7-bit register context decoder to PowerDecoder2

Comment 2 Luke Kenneth Casson Leighton 2021-01-30 00:38:33 GMT

commit 982a3a872f8969ab61e9f1c42194e1522be38de9 (HEAD -> master)
Author: Luke Kenneth Casson Leighton <lkcl@lkcl.net>
Date:   Sat Jan 30 00:36:22 2021 +0000

    add SVP64 EXTRA decoding to RB, RC and RT (out) in PowerDecode2
    DecodeOut2 will have to wait because it is more complex

Cesar i have the INT registers in the 3 input columns done, and one
output, but not the 2nd output yet (LDST-with-update), or the CRs.
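
For illustration, here is a minimal sketch of the kind of register-number extension the EXTRA decoding performs. It is plain Python, not the nMigen code in PowerDecoder2. The function name `extend_reg_extra3` is hypothetical, and the bit layout is an assumption that should be checked against the SVP64 specification: the top bit of a 3-bit EXTRA field marks the register as a vector (isvec), and the remaining two bits become the high bits of a scalar register number or the low bits of a vector register number.

```python
from typing import Tuple


def extend_reg_extra3(reg5: int, extra3: int) -> Tuple[int, bool]:
    """Extend a 5-bit register field to 7 bits using a 3-bit EXTRA field.

    ASSUMED layout (not stated in this bug report):
      extra3 bit 2 (top bit) -> isvec flag
      extra3 bits 1..0       -> scalar: high bits of the 7-bit number
                                vector: low bits of the 7-bit number
    """
    if not 0 <= reg5 < 32:
        raise ValueError("reg5 must fit in 5 bits")
    if not 0 <= extra3 < 8:
        raise ValueError("extra3 must fit in 3 bits")

    isvec = bool(extra3 & 0b100)
    spec = extra3 & 0b011
    if isvec:
        # vector: the original field supplies the upper 5 bits
        reg7 = (reg5 << 2) | spec
    else:
        # scalar: the original field supplies the lower 5 bits
        reg7 = (spec << 5) | reg5
    return reg7, isvec


if __name__ == "__main__":
    for extra in (0b000, 0b001, 0b100, 0b111):
        print(f"{extra:03b}", extend_reg_extra3(3, extra))
```

Under that assumption a scalar EXTRA value selects one of four banks of 32 registers, while a vector EXTRA value selects a starting register in steps of four.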

Comment 3 Luke Kenneth Casson Leighton 2021-01-30 14:00:55 GMT

commit b90ce1976820244dbd710d2c612933db7d5eece9 (HEAD -> master, origin/master)
Author: Luke Kenneth Casson Leighton <lkcl@lkcl.net>
Date:   Sat Jan 30 13:55:55 2021 +0000

    add SVP64 CR EXTRA field-extension, from 3-bit to 7-bit (plus isvec)
    in PowerDecoder2

added CR incoming register extending, CR outgoing is next.  test_issuer.py
is still working fine.

Comment 4 Luke Kenneth Casson Leighton 2021-01-30 21:21:27 GMT

moved CR EXTRA into PowerDecoder2 so that satellite decoders do not have unnecessary copies of SVP64 decode modules.

Comment 5 Cesar Strauss 2021-02-03 10:37:19 GMT

The augmented decoder will stay stateless (purely combinatorial) right? So, it will need both the 32-bit prefix and the 32-bit suffix at the same time, correct?

Or, will it be split in two stages, so you first decode the prefix (if any), then you take the result and use it to post-process the result of the scalar decoder?

Comment 6 Luke Kenneth Casson Leighton 2021-02-03 12:31:38 GMT

(In reply to Cesar Strauss from comment #5)
> The augmented decoder will stay stateless (purely combinatorial) right? 

yes absolutely

> it will need both the 32-bit prefix and the 32-bit suffix at the same time,
> correct?

yes.  at the moment the only augmentation needed is EXTRA2/3 fields.

however later in the future certain combinations of vec2/3/4 will cause DIFFERENT sub-operations.

for example CROSSPRODUCT, CORDIC with complex numbers, also and especially the mapreduce modes.
 
> Or, will it be split in two stages, so you first decode the prefix (if any),

yes

> then you take the result and use it to post-process the result of the scalar
> decoder?

exactly.  you can see i have started this process in ISACaller

https://git.libre-soc.org/?p=soc.git;a=blob;f=src/soc/decoder/isa/caller.py;h=7730ce198d8d70a4db02a80ab54c0450d678b6b2;hb=9f19947c9887e61f66247ee1ce82ae60bedaf3c6#l611

i could have used PowerDecoder2 to do that task, by adding a CSV file (major1.csv) entry plus a NNN-Form plus some fields.

but, to be honest, when we get to multi-issue, PowerDecoder2 is total overkill, it is better to have a separate vastly simpler SVP64 prefix identifier system.

we discussed that a few months back on the Compressed bug and jacob came up with a carry-propagation algorithm for multi-issue

Comment 7 Luke Kenneth Casson Leighton 2021-02-03 21:33:43 GMT

(In reply to Cesar Strauss from comment #5)

> Or, will it be split in two stages, so you first decode the prefix (if any),
> then you take the result and use it to post-process the result of the scalar
> decoder?

first thing: identify the prefix using this:
https://git.libre-soc.org/?p=soc.git;a=commitdiff;h=9cc04f05fff07d38c685614190007e107ee8b891

then if that is successfully identified as an svp64 instruction, pass in
the next 32 bits *and* the 24-bit ReMap into PowerDecoder2.

https://git.libre-soc.org/?p=soc.git;a=blob;f=src/soc/decoder/power_decoder2.py;h=2f6c0bdec572db0ab605e83087ec7b72758e704c;hb=9cc04f05fff07d38c685614190007e107ee8b891#l793

now, in theory this could be done in 1 clock cycle, with some MUXes. but for the FSM it is perfectly fine to take more.

note however:

* the SVP64PowerDecoder2 is used in the *first* FSM (simply to identify
  "is this instruction 32 or 64 bit").

  - if it identifies an svp64 prefix it stores the 24-bit ReMap field
    in a latch, then reads *another* 32 bits

* PowerDecoder2 is used in the *second* FSM, receiving zero in the RM
  field if the *first* FSM identified a 32-bit operation.

first FSM reads from instruction fetch and identifies length.

second FSM does decode-and-execute *only*.


but, long before that is done, the split into two FSMs, and processing of 32-bit instructions *only*, must be carried out.  no involvement of svp64 at all.
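
The split described above can be sketched in plain Python as a behavioural model, not the actual FSM code. All function names here are hypothetical. The prefix test is an assumption to be verified against the SVP64 specification: major opcode 1 with MSB0 bits 7 and 9 set, the other 24 bits forming the RM field. The first stage latches RM when it sees a prefix and reads another 32 bits; the second stage always receives an (RM, instruction) pair, with RM zero for plain 32-bit operations.

```python
from typing import Iterable, Iterator, Tuple


def msb0_bit(word: int, pos: int) -> int:
    """Return bit `pos` of a 32-bit word in Power ISA MSB0 numbering."""
    return (word >> (31 - pos)) & 1


def is_svp64_prefix(word: int) -> bool:
    # ASSUMPTION: major opcode 1 with MSB0 bits 7 and 9 both set
    return (word >> 26) == 1 and msb0_bit(word, 7) == 1 and msb0_bit(word, 9) == 1


def extract_rm(word: int) -> int:
    # ASSUMPTION: RM is MSB0 bits 6, 8 and 10..31 (24 bits in total)
    rm = (msb0_bit(word, 6) << 1) | msb0_bit(word, 8)
    return (rm << 22) | (word & 0x3FFFFF)


def fetch_fsm(words: Iterable[int]) -> Iterator[Tuple[int, int]]:
    """First FSM: identify instruction length, yield (rm, insn) pairs."""
    rm_latch = 0
    have_prefix = False
    for word in words:
        if not have_prefix and is_svp64_prefix(word):
            rm_latch = extract_rm(word)  # latch RM, then read another 32 bits
            have_prefix = True
            continue
        yield (rm_latch if have_prefix else 0, word)
        rm_latch, have_prefix = 0, False


def decode_execute_fsm(pairs: Iterable[Tuple[int, int]]) -> None:
    """Second FSM: stand-in for decode-and-execute, here it only prints."""
    for rm, insn in pairs:
        print(f"rm={rm:06x} insn={insn:08x}")


if __name__ == "__main__":
    prefix = (1 << 26) | (1 << 24) | (1 << 22) | 0x5  # prefix carrying RM=5
    decode_execute_fsm(fetch_fsm([0x38600001, prefix, 0x7C642A14]))
```

Keeping the length identification in the first stage means the second stage never needs to know whether the instruction was 32 or 64 bits wide, which matches the point that the split can be done first with 32-bit instructions only.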

Comment 8 Luke Kenneth Casson Leighton 2021-03-07 16:15:05 GMT

mode decoder here:

https://git.libre-soc.org/?p=soc.git;a=blob;f=src/soc/decoder/power_svp64_rm.py;hb=HEAD

mostly recognises the differences between standard RM Mode, LDST-immediate and LDST-indexed.