936 – change the spec so RC1=1 fail-first instructions always write all outputs up to and including failing subvector

Status:	IN_PROGRESS

Alias:	None

Product:	Libre-SOC's first SoC
Classification:	Unclassified
Component:	Specification
Version:	unspecified
Hardware:	PC Linux

Importance:	--- normal
Assignee:	Luke Kenneth Casson Leighton

URL:

Depends on:
Blocks:	952

Reported:	2022-09-24 02:37 BST by Jacob Lifshay
Modified:	2023-04-20 11:58 BST
CC List:	1 user

See Also:	933
NLnet milestone:	NLnet.2022-08-051.OPF
total budget (EUR) for completion of task and all subtasks:	0
budget (EUR) for this task, excluding subtasks' budget:	0
parent task for budget allocation:
child tasks for budget allocation:
The table of payments (in EUR) for this task; TOML format:

Description Jacob Lifshay 2022-09-24 02:37:16 BST

* TODO: update wiki to describe always writing
  * TODO: partially undo:
    https://git.libre-soc.org/?p=libreriscv.git;a=commitdiff;h=4e1efb3f567faf947a765ab47ea51c0ab7ee74ce
  * TODO: look for other wiki changes that need to be made.
* WIP?: change openpower-isa.git to account for failing element writing.
* TODO: add unit tests for RC=1 writing failing subvector with subvl>1

https://libre-soc.org/irclog/latest.log.html#t2022-09-24T01:48:29

programmerjake (talking about using fail-first for pcdec.):
> uuh, RC1=1 can't be used, since the spec says the results are never stored,
> only the CR outputs...the whole point of pcdec. is the RT output, without
> it it's mostly useless
> either that or the spec is unclear
> > Note that when RC1=1 the result elements are never stored, only the CR Fields.
> https://libre-soc.org/openpower/sv/normal/#index5h1
> imho the spec should be changed to always write outputs for each element up
> to and including the first one that fails the data-dependent fail-first
> test, only elements after that one are not executed. VL being set to exclude
> the failing element should happen after.
> that way, an element is always fully-executed or not executed. not
> partially-only-writes-CR-executed
> all outputs, RT, CR, OV, etc.
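The proposed rule can be sketched as a toy Python model. This is only an illustration under stated assumptions, not the openpower-isa simulator code: the function name `ddff_always_write`, the `op`/`passes` callables and the single-bit `cr_eq` list are all hypothetical, and the `vli` flag models the inclusive-VL mode discussed in the comments below.

```python
from typing import Callable, List, Optional


def ddff_always_write(
    src: List[int],
    rt: List[Optional[int]],
    cr_eq: List[Optional[bool]],
    vl: int,
    op: Callable[[int], int],
    passes: Callable[[int], bool],
    vli: bool = False,
) -> int:
    """Toy model (hypothetical names): returns the new VL, updates rt/cr_eq in place."""
    for i in range(vl):
        result = op(src[i])
        # The element is fully executed: every output is written,
        # including for the element that fails the test.
        rt[i] = result
        cr_eq[i] = (result == 0)  # stand-in for the CR field
        if not passes(result):
            # VL is truncated only after the outputs are written.
            # vli=False excludes the failing element, vli=True includes it.
            return i + 1 if vli else i
    return vl


src = [3, 5, 0, 7]
rt: List[Optional[int]] = [None] * 4
cr_eq: List[Optional[bool]] = [None] * 4
new_vl = ddff_always_write(src, rt, cr_eq, 4, op=lambda x: x,
                           passes=lambda r: r != 0)
print(rt, cr_eq, new_vl)
```

In this run the failing element (index 2) has both its RT and CR outputs written, element 3 is never executed, and VL is truncated to 2.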

Comment 1 Luke Kenneth Casson Leighton 2022-09-24 09:43:01 BST

well, the purpose of VLi mode is simply to save an annoying
"VL += 1" in hot-loops, on detection of truncation.  if that
can at least be detected then it's fine, no need to modify
the spec.

my feeling is that it *may* be worthwhile thinking of a new
RM Mode for this type of instruction, one which uses CR.lt
and CR.gt as indicators whether to increase srcstep and
dststep.

Comment 2 Luke Kenneth Casson Leighton 2022-09-24 10:43:33 BST

yep agreed, RC=1 should write its results for DD FF mode.

btw if something more complex is needed (FFirst inv/crbit with
VLi i.e. not just eq/ne) then crand, cror can be used: even if
the cror is an apparent nop (sv.cror *0,*0,*0), it gives the chance
to walk (and test) the CRfield bits.

https://libre-soc.org/openpower/sv/cr_ops/

still have to properly code up crops: that they work at all in
sv/trans/svp64.py is a total coincidence, down to the similarity
between crops and normal mode.

Comment 3 Jacob Lifshay 2022-10-07 20:49:34 BST

added todo list in top comment

Comment 4 Luke Kenneth Casson Leighton 2022-10-13 08:17:40 BST

(In reply to Jacob Lifshay from comment #0)
> * TODO: add unit tests for RC=1 writing failing subvector with subvl>1

it's not that simple.

the "stop" point will be not only in the middle of an outer
loop, but in the middle of an *inner* loop as well.

do you carry on to the end of the inner loop, long past the initial fail?

where is the state information that records "fail after the end of
something that happened up to 3 instructions ago"?

what if there is an interrupt in the middle?

how do you know which subvector element was the one that failed,
when all the SVSTATE src/dst steps including substeps have been
reset to zero?

DDFF subvectors are basically not practical.
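The nested-loop problem raised above can be made concrete with a toy Python model. It is a sketch under assumptions, not simulator code: the name `ddff_subvector`, the packed layout `idx = step * subvl + sub` and the `first_fail` variable are all hypothetical. The point it illustrates is that "write up to and including the failing subvector" forces the inner loop to run on past the first failing sub-element, so the failure position has to be held in extra state that nothing in this thread says where to store.

```python
from typing import Callable, List, Optional, Tuple


def ddff_subvector(
    src: List[int],
    vl: int,
    subvl: int,
    op: Callable[[int], int],
    passes: Callable[[int], bool],
) -> Tuple[List[Optional[int]], int, Optional[Tuple[int, int]]]:
    """Toy model (hypothetical names): returns (rt, new_vl, first_fail)."""
    rt: List[Optional[int]] = [None] * (vl * subvl)
    first_fail: Optional[Tuple[int, int]] = None
    for step in range(vl):          # outer loop: element step
        for sub in range(subvl):    # inner loop: subvector substep
            idx = step * subvl + sub  # assumed packed subvector layout
            rt[idx] = op(src[idx])
            if first_fail is None and not passes(rt[idx]):
                # Extra state: it must survive the rest of the inner loop
                # (and, in hardware, any interrupt taken in the meantime).
                first_fail = (step, sub)
        if first_fail is not None:
            # Truncate only after the whole failing subvector is written.
            return rt, step, first_fail
    return rt, vl, None


rt, new_vl, first_fail = ddff_subvector(
    [1, 2, 0, 4, 5, 6], vl=3, subvl=2,
    op=lambda x: x, passes=lambda r: r != 0)
print(rt, new_vl, first_fail)
```

Here the failure is at sub-element 0 of element 1, yet sub-element 1 is still written before VL is truncated to 1.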

Comment 5 Luke Kenneth Casson Leighton 2023-04-20 11:58:41 BST

(In reply to Luke Kenneth Casson Leighton from comment #4)

> DDFF subvectors are basically not practical.

the reason is that truncating SUBVL is meaningless, and also
SUBVL is specified in the instruction, not in SVSTATE.