Bug 707 - PartitionedSignal limited Cat function needed
Status: RESOLVED FIXED
Alias: None
Product: Libre-SOC's first SoC
Classification: Unclassified
Component: Source Code
Version: unspecified
Hardware: Other Linux
Importance: --- enhancement
Assignee: Luke Kenneth Casson Leighton
URL: https://libre-soc.org/3d_gpu/architec...
Depends on:
Blocks: 132
Reported: 2021-09-23 20:50 BST by Luke Kenneth Casson Leighton
Modified: 2022-06-16 15:49 BST

See Also:
NLnet milestone: NLnet.2019.02.012
total budget (EUR) for completion of task and all subtasks: 250
budget (EUR) for this task, excluding subtasks' budget: 250
parent task for budget allocation: 132
child tasks for budget allocation:
The table of payments (in EUR) for this task; TOML format:
[lkcl]
amount = 250
submitted = 2021-12-09
paid = 2021-12-09


Description Luke Kenneth Casson Leighton 2021-09-23 20:50:31 BST
a SIMD-aware Cat function is needed which can cope with
concatenation of PartitionedSignals together yet creates
the right output regardless of partition bits at runtime

PartitionedSignal:

https://git.libre-soc.org/?p=ieee754fpu.git;a=blob;f=src/ieee754/part/partsig.py;hb=HEAD
Comment 1 Luke Kenneth Casson Leighton 2021-09-23 22:41:46 BST
i went over the cases (see the wiki page linked in the URL field)
and worked out that as long as the inputs are all
PartitionedSignals, a SIMD Cat() is possible.

what is *not* possible is to mix in non-Partitioned
with Partitioned Signals, because without subdivisions
the lengths vary in non-proportional ways.
Comment 2 Luke Kenneth Casson Leighton 2021-09-24 01:12:16 BST
looking at the tables created in the URL wiki page, the algorithm appears to be:

m.Switch()
for pbits cases: 0b000 to 0b111
  output = []
  # set up some yielders which will retain where they each got to
  # then when called below in the inner nested loop they give
  # the relevant sequential chunk
  yielders = [Yielder(a), Yielder(b), ....]
  runlist = split pbits into runs of zeros
  for y in yielders: # for each signal a b c d ...
     for i in runlist: # for each partition
        for _ in range(i+1): # for the length of each partition
            thing = yield from y # grab sequential chunks
            output.append(thing)
  with m.Case(pbits):
     comb += out.eq(Cat(*output))

where Yielder() is a function that yields one partition
at a time from the PartitionedSignal.

another way to do this is just to have a list of
indices which get incremented, and to select
the partition data explicitly.
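
The run-splitting and chunk-ordering steps described above can be tried out in plain Python, without nmigen. The following is only a sketch under stated assumptions: `run_lengths` and `cat_order` are hypothetical names, each signal is modelled as a list of chunk labels (lowest chunk first), and a set bit in pbits is assumed to mark a partition boundary. It uses `next()` on ordinary iterators to pull one chunk at a time. It also assumes that partitions form the outer loop, so that each output lane holds the concatenation of the matching input lanes; that loop order is an assumption and differs from the pseudocode above.

```python
def run_lengths(pbits, nbits):
    """Split an nbits-wide partition mask into runs of zeros.

    Assumption: a set bit marks a partition boundary. A run of r zeros
    corresponds to a partition that spans r + 1 chunks.
    """
    runs = []
    count = 0
    for bit in range(nbits):
        if (pbits >> bit) & 1:
            runs.append(count)   # boundary: close the current run
            count = 0
        else:
            count += 1
    runs.append(count)           # the final run has no closing boundary bit
    return runs


def cat_order(pbits, nbits, signals):
    """Return chunk labels in the order they would be concatenated.

    signals is a list of chunk lists (hypothetical stand-ins for the
    partitions of each PartitionedSignal), lowest chunk first.
    """
    iters = [iter(chunks) for chunks in signals]  # each remembers its position
    output = []
    for run in run_lengths(pbits, nbits):         # for each partition
        for it in iters:                          # for each signal a, b, ...
            for _ in range(run + 1):              # for the partition's length
                output.append(next(it))           # grab the next chunk
    return output


a = ["a0", "a1", "a2", "a3"]
b = ["b0", "b1", "b2", "b3"]
for pbits in range(8):
    print(f"{pbits:03b}", cat_order(pbits, 3, [a, b]))
```

With pbits = 0b010 the two partitions are each two chunks wide, so the order is a0 a1 b0 b1 a2 a3 b2 b3. In hardware, the list for each pbits value would presumably feed one `Cat(*output)` inside the matching `m.Case`.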
Comment 3 Luke Kenneth Casson Leighton 2021-09-24 20:01:24 BST
(In reply to Luke Kenneth Casson Leighton from comment #2)

> where Yielder() is a function that yields one partition
> at a time from the PartitionedSignal.

drat. i may have gotten confused about how to use yield
 
> another way to do this is just to have a list of
> indices which get incremented, and to select
> the partition data explicitly.

i went this route instead.
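
For illustration, the index-based route might look like the plain-Python sketch below. It is not the code that was committed: `cat_chunks_by_index` is a hypothetical name, signals are modelled as lists of chunk labels, a set bit in pbits is assumed to mark a partition boundary, and the output is assumed to interleave the inputs partition by partition.

```python
def cat_chunks_by_index(pbits, nbits, signals):
    """Order chunks for concatenation using one running index per signal.

    signals: list of chunk lists, lowest chunk first (hypothetical
    stand-ins for the partitions of each PartitionedSignal).
    """
    idx = [0] * len(signals)   # next unread chunk of each signal
    output = []
    width = 1                  # chunks in the partition being built
    for bit in range(nbits + 1):
        # a set mask bit, or running off the top of the mask, ends a partition
        boundary = bit == nbits or (pbits >> bit) & 1
        if not boundary:
            width += 1
            continue
        for s, chunks in enumerate(signals):
            # explicitly select this signal's chunks for the partition
            output.extend(chunks[idx[s]:idx[s] + width])
            idx[s] += width
        width = 1
    return output


a = ["a0", "a1", "a2", "a3"]
b = ["b0", "b1", "b2", "b3"]
print(cat_chunks_by_index(0b010, 3, [a, b]))
```

Keeping explicit indices avoids generators altogether, which makes the per-case selection easy to inspect for every pbits value.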