707 – PartitionedSignal limited Cat function needed

Summary: PartitionedSignal limited Cat function needed

Status:	RESOLVED FIXED

Alias:	None

Product:	Libre-SOC's first SoC
Classification:	Unclassified
Component:	Source Code
Version:	unspecified
Hardware:	Other Linux

Importance:	--- enhancement
Assignee:	Luke Kenneth Casson Leighton

URL:	https://libre-soc.org/3d_gpu/architec...

Depends on:
Blocks:	132

Reported:	2021-09-23 20:50 BST by Luke Kenneth Casson Leighton
Modified:	2022-06-16 15:49 BST
CC List:	1 user

See Also:	458 115
NLnet milestone:	NLnet.2019.02.012
total budget (EUR) for completion of task and all subtasks:	250
budget (EUR) for this task, excluding subtasks' budget:	250
parent task for budget allocation:	132
child tasks for budget allocation:
The table of payments (in EUR) for this task; TOML format:	[lkcl] amount = 250 submitted = 2021-12-09 paid = 2021-12-09

Description Luke Kenneth Casson Leighton 2021-09-23 20:50:31 BST

a SIMD-aware Cat function is needed which can concatenate
PartitionedSignals and still produce the right output
whatever the partition bits are set to at runtime

PartitionedSignal:

https://git.libre-soc.org/?p=ieee754fpu.git;a=blob;f=src/ieee754/part/partsig.py;hb=HEAD

Comment 1 Luke Kenneth Casson Leighton 2021-09-23 22:41:46 BST

i went over the cases (see the wiki page in the URL field)
and worked out that as long as the inputs are all
PartitionedSignals, a SIMD Cat() is possible.

what is *not* possible is to mix in non-Partitioned
with Partitioned Signals, because without subdivisions
the lengths vary in non-proportional ways.
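
one possible reading of the point above, not confirmed anywhere in this thread: a PartitionedSignal has a fixed total width however it is subdivided, whereas a plain Signal has no subdivisions and would have to appear once per lane. the sketch below just counts bits under that assumption. `lane_count` and `mixed_width` are hypothetical names, and "a set partition bit adds one lane" is also an assumption.

```python
def lane_count(pbits, nbits):
    """number of lanes for a given partition mask.

    assumption: each set bit in pbits closes one boundary and so adds a lane.
    """
    mask = (1 << nbits) - 1
    return bin(pbits & mask).count("1") + 1


def mixed_width(part_width, scalar_width, pbits, nbits):
    """total output bits if a scalar were concatenated into every lane.

    hypothetical: this models one way mixing might be attempted, it is
    not something the PartitionedSignal code does.
    """
    return part_width + scalar_width * lane_count(pbits, nbits)


# a 32-bit PartitionedSignal (3 partition bits) mixed with an 8-bit scalar
for pbits in (0b000, 0b010, 0b111):
    print(f"{pbits:03b}", mixed_width(32, 8, pbits, 3))
```

under these assumptions the total comes out as 40, 48 and 64 bits for the three masks, so the output length depends on the runtime partition bits rather than being a fixed sum of the input widths.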

Comment 2 Luke Kenneth Casson Leighton 2021-09-24 01:12:16 BST

looking at the tables on the wiki page (see URL field), the algorithm appears to be:

with m.Switch(pbits):
  for each pbits case, 0b000 to 0b111:
    output = []
    # set up some yielders which will retain where they each got to
    # then when called below in the inner nested loop they give
    # the relevant sequential chunk
    yielders = [Yielder(a), Yielder(b), ...]
    runlist = split pbits into runs of zeros
    for y in yielders: # for each signal a b c d ...
      for i in runlist: # for each partition
        for _ in range(i+1): # for the length of each partition
          thing = yield from y # grab sequential chunks
          output.append(thing)
    with m.Case(pbits):
      comb += out.eq(Cat(*output))

where Yielder() is a function that yields one partition
at a time from the PartitionedSignal.
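
a minimal plain-python sketch of the idea above, with no nmigen involved: chunks are just labels such as "a0", and the result is the order in which they would be handed to Cat() for one pbits case. `run_lengths`, `chunks` and `cat_order` are hypothetical names. two assumptions are made: a set partition bit closes a boundary (so a run of i zeros covers i+1 chunks), and the loop over runs is placed outermost so that each lane receives its own chunks from every input.

```python
def run_lengths(pbits, nbits):
    """split pbits (LSB first) into the lengths of its runs of zeros.

    assumption: a 1 bit closes a partition boundary, a 0 bit leaves the
    two neighbouring chunks joined, so a run of i zeros spans i+1 chunks.
    """
    runs = []
    count = 0
    for i in range(nbits):
        if pbits & (1 << i):
            runs.append(count)  # boundary closed: end the current run
            count = 0
        else:
            count += 1          # boundary open: the run keeps growing
    runs.append(count)          # the final run has no closing bit
    return runs


def chunks(name, nchunks):
    """hypothetical stand-in for Yielder(): one chunk label per next()."""
    for i in range(nchunks):
        yield f"{name}{i}"


def cat_order(names, nchunks, pbits):
    """return the chunk order that would be passed to Cat() for one case."""
    yielders = [chunks(name, nchunks) for name in names]
    output = []
    for run in run_lengths(pbits, nchunks - 1):  # for each lane
        for y in yielders:                       # for each signal a b ...
            for _ in range(run + 1):             # for each chunk in the lane
                output.append(next(y))           # advance with next()
    return output


for pbits in (0b000, 0b010, 0b111):
    print(f"{pbits:03b}", cat_order(["a", "b"], 4, pbits))
```

because each generator remembers where it got to, the inner loop only has to ask for "the next chunk"; with pbits 0b000 the order is all of a followed by all of b, and with 0b111 the chunks interleave as a0 b0 a1 b1 and so on.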

another way to do this is just to have a list of
indices which get incremented, and use them to
select the partition data explicitly.

Comment 3 Luke Kenneth Casson Leighton 2021-09-24 20:01:24 BST

(In reply to Luke Kenneth Casson Leighton from comment #2)

> where Yielder() is a function that yields one partition
> at a time from the PartitionedSignal.

drat. i may have gotten confused about how to use yield
 
> another way to do this is just to have a list of
> indices which get incremented and explicitly select
> the partition data explicitly.

i went this route instead.
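
a rough sketch of what the index-based route could look like, again in plain python with lists of chunk labels standing in for the partitions of each PartitionedSignal. this is not the code that was committed: `run_lengths` and `cat_by_index` are hypothetical names, and the meaning of the partition bits (a set bit closes a boundary) is an assumption.

```python
def run_lengths(pbits, nbits):
    """lengths of the runs of zeros in pbits, LSB first (assumed encoding)."""
    runs = []
    count = 0
    for i in range(nbits):
        if pbits & (1 << i):
            runs.append(count)
            count = 0
        else:
            count += 1
    runs.append(count)
    return runs


def cat_by_index(inputs, pbits):
    """order the chunks of several equally-subdivided inputs for Cat().

    inputs is a list of lists, one list of chunks per signal. one index
    per signal records how far into that signal we have got so far.
    """
    nchunks = len(inputs[0])
    idx = [0] * len(inputs)                # one running index per signal
    output = []
    for run in run_lengths(pbits, nchunks - 1):
        for k, sig in enumerate(inputs):
            start = idx[k]
            end = start + run + 1          # a run of i zeros spans i+1 chunks
            output.extend(sig[start:end])  # select this lane's chunks explicitly
            idx[k] = end                   # remember where this signal got to
    return output


a = ["a0", "a1", "a2", "a3"]
b = ["b0", "b1", "b2", "b3"]
print(cat_by_index([a, b], 0b001))
```

compared with generators, the explicit indices make it easy to take a whole lane in one slice, which would map onto a single bit-range selection per signal per lane.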