Bug 458 - PartitionedSignal needs nmigen constructs "m.If", Switch etc
Summary: PartitionedSignal needs nmigen constructs "m.If", Switch etc
Status: CONFIRMED
Alias: None
Product: Libre-SOC's first SoC
Classification: Unclassified
Component: Source Code (show other bugs)
Version: unspecified
Hardware: Other Linux
: --- enhancement
Assignee: Jacob Lifshay
URL:
Depends on: 708
Blocks: 132 362
  Show dependency treegraph
 
Reported: 2020-08-18 01:06 BST by Luke Kenneth Casson Leighton
Modified: 2021-09-24 11:30 BST (History)
4 users (show)

See Also:
NLnet milestone: NLNet.2019.10.Wishbone
total budget (EUR) for completion of task and all subtasks: 1250
budget (EUR) for this task, excluding subtasks' budget: 1250
parent task for budget allocation: 362
child tasks for budget allocation:
The table of payments (in EUR) for this task; TOML format:
# uncomment once the budget is decided on... # commrnted since sum of 0 causes an error #awygle=0 #lkcl=0 #jacob=0


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Luke Kenneth Casson Leighton 2020-08-18 01:06:55 BST
PartitionedSignal is a dynamic SIMD version of Signal.  it needs to work with the following constructs:

* m.If / Elif / Else
* m.Switch / Case

without PartitionedSignal, SIMD has to be done as follows:

with m.If(partition == 64):
    operations treating Signals at full 64 bit
with m.Elif(partition == 2x32bit):
    for i in range(2):
         exactly the same code as 64 bit
         except now it is repeated twice,
         first on the low 32 bits and then
         on the hi 32

and that repetition continues right the way down to 8 bit, complicating design massively.

PartitionedSignal hides that entirely, as far as arithmetic and logic operations are concerned (__lt__, __or__, see bug #132)

where things break down is this:

ppts = PartitionPoints(64, 8)
x = PartitionedSignal(64, partition=ppts)

with m.If(x == 5):
    ... do something.

the reason it breaks down is because PartitionedSignal.__eq__ does *not* return a single bit object, it returns a *dynamic multi-bit* object that dynamically reflects the current state of the partitioning.

example:

* partition points is set to 8 breaks
* this subdivides the 64 bit signal into 8 separate 8 bit values
* comparison against a constant (5) creates EIGHT separate 8-bit comparisons
* those 8 comparisons are bundled together

when the partitions are cleared to indicate that the 64 bits are to be treated as a single 64 bit value, the comparison "5" is done against the entire 64 bit, however the answer still goes into the same 8 bit object used when the partition was set to 8x8bit.

the m.If construct cannot cope with this.

the current workaround is to completely avoid using m.If entirely and to use a special PartitionedMux construct, PMux, instead.

this however is less than ideal.

similar logic applies to Switch/Case. Arrays need some discussion.
Comment 1 Luke Kenneth Casson Leighton 2020-12-06 01:06:06 GMT
https://github.com/nmigen/nmigen/blob/59ef6e6a1c4e389a41148554f2dd492328820ecd/nmigen/hdl/dsl.py#L447

ah ha!

this is the location where the code-fragments for m.If/Elif/Else are
generated, and it should be really pretty straightforward to drop the appropriate replacement (parallel, SIMD) statements in at this location, with a replacement (override) of Module._pop_ctrl
Comment 2 Luke Kenneth Casson Leighton 2020-12-06 17:15:06 GMT
more context:
https://freenode.irclog.whitequark.org/nmigen/2020-12-06#28537093;
Comment 3 Luke Kenneth Casson Leighton 2020-12-07 00:33:17 GMT
relevant page for describing the dynamic partitioned signal operators

https://libre-soc.org/3d_gpu/architecture/dynamic_simd/
Comment 4 Luke Kenneth Casson Leighton 2021-01-13 18:41:08 GMT
taking a look at this more closely, i realised that Value (and UserValue) will need some tweaks.
at present, the code does this:

class Value:
     def part(....)
         return Part(self, ...)

where Part is a global function.  in order to support override capability this needs to change to:

class Value:
     def part(....)
         return ValuePart(self, ...)

where Part is defined as:

def Part(lhs, ....):
    return lhs.part(....)

and the *current* Part function is *renamed* to ValuePart.

UserValue then does:

class UserValue:
     def part(....)
         return UserValuePart(self, ...)

or more likely also calls ValuePart but it is clearly documented that user-derived overrides are perfectly well permitted.

PartitionedSignal would do *exactly that*.

then, a crucial addition to Value and UserValue would be:

    def mux(self, choice1, choice2):
         return ValueMux(self, choice1, choice2)

and PartitionedSignal could override it to provide the correct dynamic SIMD functionality.
Comment 5 Luke Kenneth Casson Leighton 2021-09-21 18:48:41 BST
http://lists.libre-soc.org/pipermail/libre-soc-dev/2021-September/003750.html

carrying on from there, i am slowly recovering the analysis and state
from many months back.


line 437:
                if if_test is not None:
                    if_test = Value.cast(if_test)
                    if len(if_test) != 1:
                        if_test = if_test.bool()


that was what i was talking about might need a function "isconsideredbool"
instead of len iftest !=1

line 447:


            self._statements.append(Switch(Cat(tests), cases,
                src_loc=src_loc, case_src_locs=dict(zip(cases, if_src_locs))))

the Switch there can be Value.__Switch__ and that function
call standard Switch.

the PartitionedSignal.__Switch__ variant instead does:

    for i, pbit in enumerate(self.partition_bits):
        totest = Part(test, i)
        ...

and creates *multiple* switch/case statements, each of which takes
a chunk at a time rather than does the entire lot as one hit.

8 paritions means *EIGHT* completely separate switch/cases, one for
each partition.

the reason why this works is because the HDL gates are going to be
there anyway, and the rule for PartitionedBool is that ALL bits
have to be set across the whole partition.

therefore, each "column" in the partition receives a copy
of the same "if" decision.
Comment 6 Luke Kenneth Casson Leighton 2021-09-23 20:39:36 BST
there's a couple of phases here:

1) redirection of Part (similar to operator.add(x, y) calling x.__add__(y)

2) redirection of Mux (same)

3) redirection of Switch (same)

Part:

a) existing Part renamed to InternalPart

https://github.com/nmigen/nmigen/blob/59ef6e6a1c4e389a41148554f2dd492328820ecd/nmigen/hdl/ast.py#L785

any name will do. _InternalPart? _DefaultInternalPart?

the idea here is *not* to disrupt the codebase, minimise
changes, where actually moving the contents *of* Part into
UserValue would flag that we are "making more changes than
it actually is".

if whitequark then asks for the merge, that's great.

b) replacement for Part.

real simple:

def Part(lhs, *args): lhs.__part__(*args)

this is exactly what is done in python operator module,
follow the breadcrumbs.... :)

c) add overrideable function to Value

https://github.com/nmigen/nmigen/blob/59ef6e6a1c4e389a41148554f2dd492328820ecd/nmigen/hdl/ast.py#L133

a case can be made for calling it Value.__part__ because of the convention used by the python operator module.

d) (later, TODO) write a PartitionedSignal.__part__()

but this can be under a separate bugreport, separate budget.


all of the others (Switch, Mux, Cat as well) can follow the
same pattern.  Cat() needs some special consideration.



we have a nmigen repo copy, i suggest making a branch (sigh)
and using that.  means we also have to change all the documentation
and scripts (argh) or (argh) work with a separate branch for
anything depending on this... hmm, that might not be needed
because the ieee754 PartitionedSignal class isn't used anywhere
have to see how that goes.
Comment 7 Luke Kenneth Casson Leighton 2021-09-23 20:42:03 BST
separate comment about Cat()

Cat() is a pain.

what does it mean in a parallel / SIMD context to even attempt to
concatenate a set of Signals or PartitionedSignals together?
this is sufficiently complex i think it needs its own bugreport
(TODO, edit... bug #707)
Comment 8 Luke Kenneth Casson Leighton 2021-09-24 11:21:34 BST
(In reply to Luke Kenneth Casson Leighton from comment #7)

> this is sufficiently complex i think it needs its own bugreport
> (TODO, edit... bug #707)

worked it out.  as long as only PartitionedSignals are concatenated
it works perfectly fine, irrespective of the sizes of the PSes.
if a Signal or Const is needed it is reasonable to require copying
into a PartitionedSignal.