PartitionedSignal needs a way to select parts by standard python x[3:5] and x[4] syntax, but the results returned to respect SIMD (return a SIMD result)
like Value.abs and Value.implies, Value.__getitem__ is in terms of other Value.xxxx operations. a SimdSignal.__getitem__ is *not* required. once SimdSignal.__Slice__ and part are written simdsignal[start:end:step] should just magically "work" as intuitively expected. thank goodness Slice and Part take fixed constants only (unlike Array) the basic principle is that the submodule (call them slice.py and part.py) does all the heavy lifting, and a "helper" (see pcat.py, prepl.py) is responsible for auto-instantiating an on-demand instance of that submodule followed by returning the output *of* that module, which, back in the SimdSignal override, simply returns that submodule output as *its* return result. it's a pretty straightforward trick that only got one scary moment for the Assign override but i managed to work that out, thank goodness. you _should_ be able to just use e.g. repl.py as a base template, i did cat.py first then used it to create repl.py dont use assign.py as the starting point it operates differently (returns statements not a signal) you can check the git history (commits are in bug #458) for the sequence i did, cat.py first, pcat.py as an afterthought, integrate submodule into partsig.py third and finally unit test fourth (each as separate commits) should be very straightforward, and by starting from either cat.py or repl.py you will get to see how PartType works.
I think we also need something to handle stuff like: a[3:5].eq(b) since assigning to a slice is quite common.
(In reply to Jacob Lifshay from comment #2) > I think we also need something to handle stuff like: > a[3:5].eq(b) > since assigning to a slice is quite common. this should work automatically because (i forgot to emphasise) the return result from the submodule (which is then a return result of SimdSignal.Slice/Part, which in turn is then a return result of Value.__getitem__) must be *another SimdSignal*. that's very important. have a close look at repl.py and prepl.py for how it (very surprisingly) works. the submodule is responsible for computing the width of the output SimdSignal. in the case of repl.py it was dead easy, take the width of the input and multiply it by the count. cat.py likewise, just sum up the total width of all input args. for Slice and Part the width is easily calculated (statically) thank goodness by analysing the slice which only takes integers/Consts so can also be done in the constructor. in early revisions Michael and i forgot to make the submodule return result a SimdSignal, we accidentally returned a Signal(). if that mistake had not been corrected then yes, things like Cat(simd11, simd2).eq(simd3) would totally fail. even things like simd1+simd2+simd3 would totally fail because the first add would have returned a Signal not another SimdSignal, and the 2nd add of simd3 would have produced utterly the wrong result.
(In reply to Luke Kenneth Casson Leighton from comment #3) > (In reply to Jacob Lifshay from comment #2) > > I think we also need something to handle stuff like: > > a[3:5].eq(b) > > since assigning to a slice is quite common. > > this should work automatically because (i forgot to > emphasise) the return result from the submodule > (which is then a return result of SimdSignal.Slice/Part, > which in turn is then a return result of Value.__getitem__) > must be *another SimdSignal*. I think you're missing my point, which is that, for the following code: a = SimdSignal(...) b = SimdSignal(...) c = Cat(a, b) d = a[5] m.d.comb += [c.eq(0), d.eq(0)] c and d are the lhs of an assignment, which means that the signals *have to flow backwards* from c and d back to a and b. From reading the existing source for PartitionedCat, PartitionedAssign, and SimdSignal, that case isn't handled at all (unless I'm missing something).
(In reply to Luke Kenneth Casson Leighton from comment #3) > (In reply to Jacob Lifshay from comment #2) > > I think we also need something to handle stuff like: > > a[3:5].eq(b) > > since assigning to a slice is quite common. > > this should work automatically because (i forgot to > emphasise) the return result from the submodule > (which is then a return result of SimdSignal.Slice/Part, > which in turn is then a return result of Value.__getitem__) > must be *another SimdSignal*. That operators need to return SimdSignals is totally obvious to me...
(In reply to Jacob Lifshay from comment #5) > (In reply to Luke Kenneth Casson Leighton from comment #3) > > (In reply to Jacob Lifshay from comment #2) > > > I think we also need something to handle stuff like: > > > a[3:5].eq(b) > > > since assigning to a slice is quite common. > > > > this should work automatically because (i forgot to > > emphasise) the return result from the submodule > > (which is then a return result of SimdSignal.Slice/Part, > > which in turn is then a return result of Value.__getitem__) > > must be *another SimdSignal*. > > That operators need to return SimdSignals is totally obvious to me... actually, now that I think about it, Slice and Cat and Part have to return a SimdValue, not a SimdSignal, since otherwise a[5].eq(b) will assign to the signal that is a result of slicing, but the assignment won't propagate back to `a` since it won't know where to propagate to.
(In reply to Jacob Lifshay from comment #5) > That operators need to return SimdSignals is totally obvious to me... :) the non-obvious bit is, "but... butbut... is that it??" :) > actually, now that I think about it, Slice and Cat and Part have to return a > SimdValue, not a SimdSignal, since otherwise a[5].eq(b) will assign to the > signal that is a result of slicing, but the assignment won't propagate back > to `a` since it won't know where to propagate to. oo, oo, err... i need to think about that one. this was part of the mind-bending involving Assign. ... half-a-brainwave: i think it's ok... as long as the submodules entirely use the comb domain (which they do), the netlist from LHS and RHS become synonymous in effect, with the end result that an eq assignment should propagate. (this because unlike programming variables where assignment to LHS overwrites RHS but a second assignment preserves the first, HDL comb eq()s make the LHS and RHS "the same netlist" comb += x.eq(y) comb += y.eq(z) x now == y now == z which would not be true if that was software. i will have to think about that, or simply try a test Cat(x, y).eq(z) which is doable because Cat and Assign are available. ooo, it'd better work!
(In reply to Jacob Lifshay from comment #4) > I think you're missing my point, which is that, for the following code: > a = SimdSignal(...) > b = SimdSignal(...) > c = Cat(a, b) > d = a[5] > m.d.comb += [c.eq(0), d.eq(0)] > > c and d are the lhs of an assignment, which means that the signals *have to > flow backwards* from c and d back to a and b. > > From reading the existing source for PartitionedCat, PartitionedAssign, and > SimdSignal, that case isn't handled at all (unless I'm missing something). well, one way to find out: try it :) comb assignments should (LHS or RHS) result in netlist merging, so i believe with a say... 40% certainty it should work :] it's not intuitively obvious, i appreciate, and early versions of nmigen sim couldn't handle long assignment chains (whoops)
I added WIP code that will handle any combination of Slice and Cat, it can reduce it all down to a single layer of muxes. it also supports assignment to any combination of Slice and Cat of SimdSignals. I put it all in a swizzle.py file, that handles arbitrary bit-swizzles: https://git.libre-soc.org/?p=ieee754fpu.git;a=blob;f=src/ieee754/part_swizzle/swizzle.py;h=44c1922f4893b03a1aa58f2225426191deadf037;hb=8692c384d1bab5a4dc86ccad5110c3db597b35a7 The basic idea is there's SwizzledSimdValue, a subclass of SimdSignal, that tracks where each bit goes from the original Signals for every possible value of mask/elwid (we really need to get elwid integrated into SimdSignal...), then we have each SwizzledSimdValue instance independently construct it's .sig member on-demand from the *original* Signals' bits, *instead of* from any other SwizzledSimdValue.sig. When a SwizzledSimdValue is assigned to, it uses a completely different evaluation method: it constructs the full list of Signals that it's built out of, and returns the assignments that would reassign those full signals to the correct values.
(In reply to Jacob Lifshay from comment #9) > I added WIP code that will handle any combination of Slice and Cat, it can > reduce it all down to a single layer of muxes. it also supports assignment > to any combination of Slice and Cat of SimdSignals. I put it all in a > swizzle.py file, that handles arbitrary bit-swizzles: > https://git.libre-soc.org/?p=ieee754fpu.git;a=blob;f=src/ieee754/ > part_swizzle/swizzle.py;h=44c1922f4893b03a1aa58f2225426191deadf037; > hb=8692c384d1bab5a4dc86ccad5110c3db597b35a7 SwizzledSimdValue will effectively replace the existing Cat, since it's more powerful.
Assign for SwizzledSimdValue is 100% completed except for converting the rhs to the right SimdSignal shape.
(In reply to Luke Kenneth Casson Leighton from comment #8) > well, one way to find out: try it :) comb assignments should > (LHS or RHS) result in netlist merging, so i believe with a say... 40% > certainty it should work :] I'm 99% certain that won't work, since yosys tries to emulate verilog's semantics, and the following doesn't work in verilog: wire a, b, c; # this doesn't connect a with c always b = a; always b = c;
(In reply to Jacob Lifshay from comment #12) > (In reply to Luke Kenneth Casson Leighton from comment #8) > > well, one way to find out: try it :) comb assignments should > > (LHS or RHS) result in netlist merging, so i believe with a say... 40% > > certainty it should work :] > > I'm 99% certain that won't work, since yosys tries to emulate verilog's > semantics, and the following doesn't work in verilog: > wire a, b, c; > > # this doesn't connect a with c > always b = a; > always b = c; so you didn't check it, with a simple test comprising 15-20 lines, before writing nearly 400 and introducing new classes and concepts that haven't been demonstrated to be needed. you keep doing this. now please can you do what was asked, which was to write a simple test.
(jacob, i have to share this with you: i too went through similar experiences during my early career. i have done exactly these things. i had to learn, and am happy to share the results with you. for me, they were... hard knocks, and yet i actually ended up appreciating the fact that managers, other libre developers etc. actually took the time to explain things to me, because i understood that they were important. i trust that you understand this too)
carrying over from https://bugs.libre-soc.org/show_bug.cgi?id=731#c10 Slice is straightforward to do because it is fixed consts. *please do not use SimdSwizzle* it is too massive a jump and throws away far too much code. once Slice is done Part is (error checking aside) literally 1 line of code, using existing Simd shift operator and using straight slices: return (self >> offset)[:width] that's it. incredibly simple. yes agreed under no circumstances do we go anywhere near LHS Part right now.
inferring how Part works from the difference between bit_select and word_select: 413 if type(offset) is Const and isinstance(width, int): 414 return self[offset.value * width:(offset.value + 1) * width] 415 return Part(self, offset, width, stride=width, src_loc_at=1) the one-line (key part of) Part i believe would be more like: return (self >> (offset*stride))[:width] it is this aimple because under no circumstances are we doing LHS Part.
in the URL for this bugreport https://libre-soc.org/3d_gpu/architecture/dynamic_simd/slice/ the question was raised, how the heck do you connect between a Slice()d signal and a non-sliced one, even when they use the same elwidth, because the non-sliced one has no padding partitionpoints and the sliced one does. the answer is: an adapter submodule. the exact details of how that works will emerge from the examples, TBD. i will write a couple now.
(In reply to Luke Kenneth Casson Leighton from comment #17) > the answer is: an adapter submodule. > > the exact details of how that works will emerge from > the examples, TBD. i will write a couple now. oo. err and um. that's interesting. an add of a 3 bit sliced quantity with a 2 bit (element width=2bit) quantity requires sign (or zero) extension prior to partition expansion. this is going to be interesting (translation: hairy)
(In reply to Luke Kenneth Casson Leighton from comment #18) > (In reply to Luke Kenneth Casson Leighton from comment #17) > > > the answer is: an adapter submodule. > > > > the exact details of how that works will emerge from > > the examples, TBD. i will write a couple now. > > oo. err and um. that's interesting. an add of a 3 bit > sliced quantity with a 2 bit (element width=2bit) quantity > requires sign (or zero) extension prior to partition > expansion. > > this is going to be interesting (translation: hairy) The answer to all of those conversions is to have a SimdSignal.cast method that handles conversion of an input nmigen Value or SimdSignal to the desired SimdShape: Full conversion implementation: class SimdSignal(...): @staticmethod def cast(value, *, shape=None, scope): # scope injected by calling this method as # scope.Signal_cast(value, shape=shape), which is a # partial() of SimdSignal.cast if shape is not None: assert shape.scope is scope if isinstance(value, SimdSignal): assert value.shape().scope is scope if shape is None or shape == value.shape(): return value # I'm assuming SwizzleKey is already replaced by SimdScope. # from_value handles splatting a scalar nmigen Value if needed. src = SwizzledSimdValue.from_value(scope, value) swizzles = {} for src_el, dst_el in zip(src.shape().elements(), shape.elements()): elwid = src_el.elwid # SimdShape always iterates elements in the same order across all # SimdShapes for a particular scope. assert elwid == dst_el.elwid if elwid not in swizzles: swizzles[elwid] = Swizzle.from_const(0, shape.width) # get bits for current src_el el = src.swizzles[elwid][src_el.slice] if src_el.shape.signed: # *src* determines sign/zero extension # truncate/sign-extend el = el.convert_s_to(dst_el.shape) else: # truncate/zero-extend el = el.convert_u_to(dst_el.shape) # assign converted bits to dest swizzles[elwid].bits[dst_rl.slice] = el.bits # oops, I forgot shape argument for __init__ return SwizzledSimdValue(scope, swizzles, shape=shape) now, all ops can use: ops_input_arg = SimdSignal.cast(ops_input_arg, shape=desired_shape, scope=scope) e.g.: def __add__(self, other): l_shape = self.shape() scope = l_shape.scope # splat scalars to SimdSignal other = SimdSignal.cast(other, scope=scope) r_shape = other.shape() def get_el_shape(l, r): # let nmigen compute result element's shape return (Const(0, l) + Const(0, r)).shape() el_shapes = SimdMap.map(get_el_shape, l_shape.el_shapes, r_shape.el_shapes) shape = SimdShape(el_shapes, scope=scope) # could have a SimdShape.map method letting us write: # shape = SimdShape.map(get_el_shape, l_shape, r_shape) # convert to result's shape lhs = SimdSignal.cast(self, shape=shape, scope=scope) rhs = SimdSignal.cast(other, shape=shape, scope=scope) # do add return lhs.add_op(lhs, rhs, carry=0)
(In reply to Jacob Lifshay from comment #19) > (In reply to Luke Kenneth Casson Leighton from comment #18) > > (In reply to Luke Kenneth Casson Leighton from comment #17) > > > > > the answer is: an adapter submodule. > > > > > > the exact details of how that works will emerge from > > > the examples, TBD. i will write a couple now. > > > > oo. err and um. that's interesting. an add of a 3 bit > > sliced quantity with a 2 bit (element width=2bit) quantity > > requires sign (or zero) extension prior to partition > > expansion. > > > > this is going to be interesting (translation: hairy) > > The answer to all of those conversions is to have a SimdSignal.cast method > that handles conversion of an input nmigen Value or SimdSignal to the > desired SimdShape: ooo that makes sense. like it. Cat however still needs alteration: it can't cope with the fact that the input(s) and output are fundamentally tied together. take e.g. a b and c to be Cat()ed together. b creates holes right in the middle of the result's PartitionPoints, that critically depend on the size of a and c. only PartitionedCat itself can determine where to copy the bits of b into the result. bottom line is, SimdSignal.cast should work fine for one-to-one, but not for many-to-one. hopefully Cat() is just the weird one here.
(In reply to Luke Kenneth Casson Leighton from comment #20) > bottom line is, SimdSignal.cast should work fine for one-to-one, > but not for many-to-one. hopefully Cat() is just the weird > one here. See my other email for Cat's solution: https://bugs.libre-soc.org/show_bug.cgi?id=734#c9
(In reply to Jacob Lifshay from comment #21) > (In reply to Luke Kenneth Casson Leighton from comment #20) > > bottom line is, SimdSignal.cast should work fine for one-to-one, > > but not for many-to-one. hopefully Cat() is just the weird > > one here. > > See my other email for Cat's solution: > > https://bugs.libre-soc.org/show_bug.cgi?id=734#c9 please help to fix the existing code. please do not keep advocating disruptive regressions that have no unit tests. i have said this multiple times and yet you keep insisting on ignoring what i am saying. this is getting extremely tedious. when i give some reasons please LISTEN to and RESPECT those reasons. please do not keep making me waste time repeating things multiple times. you are making me angry at how much of my time you are wasting here.