Bug 64 - data handling / io control / data routing API needed
Summary: data handling / io control / data routing API needed
Status: CONFIRMED
Alias: None
Product: Libre-SOC's first SoC
Classification: Unclassified
Component: Source Code
Version: unspecified
Hardware: PC Linux
Importance: --- enhancement
Assignee: Luke Kenneth Casson Leighton
URL:
Depends on: 57 148
Blocks: 62
Reported: 2019-04-19 09:05 BST by Luke Kenneth Casson Leighton
Modified: 2020-12-04 12:39 GMT
2 users

See Also:
NLnet milestone: NLnet.2019.02
total budget (EUR) for completion of task and all subtasks: 4500
budget (EUR) for this task, excluding subtasks' budget: 1800
parent task for budget allocation: 62
child tasks for budget allocation: 57 148 538
The table of payments (in EUR) for this task; TOML format:


Description Luke Kenneth Casson Leighton 2019-04-19 09:05:20 BST
A data handling, routing and I/O control API is needed. this provides
the basic building blocks for creating pipelines.

required features (edit this comment and add as necessary)

* ready/valid signalling that matches wishbone and AXI4
* disconnected mode for FSMs (FSM accepts, multi-cycle processes, sends)
* full data separation from signalling, to allow combinatorial data block
  chaining without signalling being involved
* queue/buffering capability (with write-through)
* synchronous mode (to provide guaranteed clock delay characteristics)
* fan-in multiplexing (many-to-one data routing) with priority selection
* multiple fan-in multiplexing (many-to-several single-clock data routing)
* fan-out multiplexing (one-to-many data routing) with selection mechanism
* fan-out multiplexing (several-to-many data routing) with single-clock select

the multi-selection modes on fan-in and fan-out are particularly complex,
however they will likely be needed to support multi-issue instruction execution.
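
the fan-in priority-selection requirement can be sketched behaviourally
in plain python (not nmigen; the function name and encoding here are
illustrative only, not part of any API):

```python
def priority_fan_in(valids):
    """many-to-one routing: grant the lowest-numbered input whose
    valid signal is asserted (fixed priority); None if none valid."""
    for index, valid in enumerate(valids):
        if valid:
            return index
    return None

# inputs 1 and 2 both valid: input 1 wins on priority
print(priority_fan_in([False, True, True]))  # → 1
```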

* bug #148 - iocontrol, stageapi, singlepipe API
Comment 1 Jacob Lifshay 2019-04-19 09:16:46 BST
see https://salsa.debian.org/Kazan-team/simple-barrel-processor/blob/master/src/PipelineBuildingBlockAPIProposal.py for a WIP API proposal.

Which API design to use is not resolved yet.
Comment 2 Luke Kenneth Casson Leighton 2019-04-19 09:25:17 BST
(In reply to Jacob Lifshay from comment #1)
> see
> https://salsa.debian.org/Kazan-team/simple-barrel-processor/blob/master/src/
> PipelineBuildingBlockAPIProposal.py for a WIP API proposal.
> 
> Which API design to use is not resolved yet.

yes. and names need to be agreed.  there's been a lot of confusion
resulting from names that have multiple potential meanings.

the other design is here:
* https://git.libre-riscv.org/?p=ieee754fpu.git;a=blob;f=src/add/singlepipe.py;h=fe052be72a3e70fde85e56058cf708ea9e7cd341;hb=25a0ec563bd7837b43a1d04036b2a5945c97023b
* https://git.libre-riscv.org/?p=ieee754fpu.git;a=blob;f=src/add/multipipe.py;h=35da5c2ec741aafc66e97804e842be0395b8a863;hb=25a0ec563bd7837b43a1d04036b2a5945c97023b

this design is already in use (in the IEEE754 FPU) which implies that
changes need to be handled very carefully in an incremental fashion,
to keep all unit tests operational at all times.

review of each is needed (TBD separate comment).
Comment 3 Luke Kenneth Casson Leighton 2019-04-19 10:16:23 BST
Review of singlepipe.py
-----------------------

review of the following revision:

https://git.libre-riscv.org/?p=ieee754fpu.git;a=blob;f=src/add/singlepipe.py;h=fe052be72a3e70fde85e56058cf708ea9e7cd341;hb=25a0ec563bd7837b43a1d04036b2a5945c97023b

RecordObject
------------

not documented.  required because Record does not allow fields to be
modified.  should be moved to its own issue (#66).

PrevControl
-----------

self._o_ready and s_o_ready and use of i_valid_test not documented or
clear.  makes a bit of a mess of the code.

the intent is to allow data processing blocks to have a degree
of control over when they accept incoming data (even if upstream
is otherwise ready).

how this is achieved is really not clear from the code.


NextControl
-----------

self.d_valid not documented or clear.

the intent is to allow data processing blocks to control when
they *send* outgoing data (even if downstream is otherwise ready).

again, not clear from the code.
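
the intent in both PrevControl and NextControl can be modelled
behaviourally; this plain-python sketch (names hypothetical, not
the actual API) shows a stage vetoing a transfer even when both
neighbours are otherwise ready:

```python
def transfer_occurs(o_ready, i_valid, stage_permits=True):
    """a data transfer happens only when downstream is ready,
    upstream presents valid data, AND the stage itself permits
    the transfer (the d_ready/d_valid-style override)."""
    return o_ready and i_valid and stage_permits

# both sides ready, but the stage is mid-way through a
# multi-cycle computation and withholds permission:
assert transfer_occurs(True, True, stage_permits=False) is False
assert transfer_occurs(True, True) is True
```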


Visitor
-------

* a mess.  needs to be converted to an iterator/generator (yield)
* needs __iter__ and __next__ (etc).
* move to utils #68

Eq
--

* would be much simpler when Visitor is converted to a generator.
* move to utils #68


flatten
-------

needs to be converted to an iterator/generator, and for yield
part to be split out to a separate class/function (similar to
what has been done with eq/Eq/Visitor).

move to utils #68
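
as a sketch of the suggested conversion, a generator-based flatten
(assumed semantics: recursively yield leaf values from nested
sequences; the real flatten also has to handle nmigen objects):

```python
def flatten(obj):
    """recursively yield leaf values from arbitrarily-nested
    lists/tuples, lazily, instead of building a list up-front."""
    if isinstance(obj, (list, tuple)):
        for item in obj:
            yield from flatten(item)
    else:
        yield obj

print(list(flatten([1, [2, [3, 4]], 5])))  # → [1, 2, 3, 4, 5]
```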


StageCls
--------

name is unclear.  postprocess experiment not documented.  


Stage
-----

name is again unclear.


RecordBasedStage
----------------

name ("stage") unclear.  purpose (actual need) is also unclear.


StageChain
----------

name ("stage") unclear.  need for specallocate parameter is not documented


ControlBase
-----------

* name "stage" not clear
* ControlBase allocating i_data / o_data not clear
* connect() function could (does?) overwrite i_data / o_data
* role of set_input not documented (or clear)
* recursive ports() function's purpose not clear (or documented)
* _elaborate rather than just elaborate, reason not clear
* "dynamic" ready/valid (involving stage in signalling) not clear
* addition of _postprocess not been discussed (or evaluated)


BufferedHandshake
-----------------

* known bugs (#57) in interaction with other control-blocks
* *may* be possible to replace with FIFOControl (see below)
* really complex and not obvious what's going on
* name not clear


SimpleHandshake
---------------

* based on wishbone / AXI4 signalling however actual characteristics unknown
* involved in bug #57
* name not clear


UnbufferedPipeline
------------------

* based on "RegStage" from proposal, however actual characteristics unknown
* involved in bug #57
* name not clear


UnbufferedPipeline2
-------------------

* based on "BreakReadyStage" from proposal, actual characteristics unknown
* involved in bug #57
* name not clear


PassThroughStage
----------------

* very useful as a building-block
* use of a single iospecfn however stops any kind of data width processing
  (is this good?)


PassThroughHandshake
--------------------

* name not clear
* based on "BreakReadyStage" however actual characteristics unknown
* is not involved in bug #57

RegisterPipeline
----------------

* probably should not be based on UnbufferedPipeline due to known issues #57
* better name needed?


FIFOControl
-----------

* nice class!
* buffered argument has to go (use new Queue)
* use of flatten() on output causes problems for stage postprocess
* comment about "looking like PrevControl" redundant due to removal of code
* i_data requiring a "shape" function is problematic.  bug #67


Replacement classes
-------------------

all replacement classes need a proper audit and review of their own.
these are the classes with the exact same names, UnbufferedPipeline,
PassThroughHandshake, BufferedHandshake, SimpleHandshake.

should they even *be* replaced, due to the additional complexity
introduced through the use of Queue?  the graphviz layouts are a mess,
comparatively speaking.
Comment 4 Luke Kenneth Casson Leighton 2019-04-19 12:31:49 BST
review of https://salsa.debian.org/Kazan-team/simple-barrel-processor/blob/master/src/PipelineBuildingBlockAPIProposal.py

general comments (1)
----------------

hard dependence on typing is causing readability issues and
will make contributions from experienced python programmers
that much harder.  *nobody* uses typing in common-usage python,
because it gets in the way.  there is an actual known paradigm
which python conforms to, that the use of typing violates
(i just can't recall what it is, right now).

basically python is based on the Liskov Substitution Principle,
of *semantic* relationships rather than *syntactical* relationships,
and introducing syntactical relationships increases code size,
decreases readability, decreases flexibility and breaks python
developer expectations.

https://en.wikipedia.org/wiki/Liskov_substitution_principle

can the typing be moved to a .pyi file and mypy deployed *if
the user wants to use it*?
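
as an illustration of the suggested split (file and function names
hypothetical), the implementation stays untyped while a parallel
stub file carries the annotations for anyone who runs mypy:

```python
# combchain.py -- implementation remains plain, readable python
def chain(*children):
    """combine any number of segments into an ordered chain."""
    return list(children)

# combchain.pyi -- the stub file that mypy picks up automatically;
# the runtime never imports it:
#
#     from typing import Any, List
#     def chain(*children: Any) -> List[Any]: ...

assert chain(1, 2, 3) == [1, 2, 3]
```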

general comments (2)
----------------

as an "alternative" API, the job of morphing the existing code
(of which there is a *lot*) is made that much harder.  in
addition, several unit tests and use-cases will no longer work
as the API cannot support them.

there are over 40 unit tests dependent on singlepipe.py and
multipipe.py, including FP div, add, mul, Reservation Stations,
and much more.

all of this has to be kept operational, and an "incremental transition"
strategy developed that ensures that all unit tests are functional
at *all* times, with *every* (small) commit.

developing such a strategy is itself an awful lot of work!


Data, Shape, SignalShape, RecordShape, Empty, EmptyShape
--------------------------------------------------------

* addition of name extension (prefix) is a good idea.  however
  as a hard requirement (a hard-required parameter) this introduces
  a burden on developers.  if the same prefixing can be achieved in
  the same way as nmigen, that would be even better.

  nmigen uses this code (example, from Signal.__init__):

  tracer.get_var_name(depth=2 + src_loc_at, default="$signal")

* restricting data usage using a syntactical hard limit instead
  of defining a semantic relationship results in limitations
  that a normal python developer will rebel against / reject.

  the limitation of *all* data using *forcibly* being required
  to go through the "Data" class instance is also a burden that
  will result in significant user demands for "additional features"
  on the Data paradigm, to add support for unanticipated
  requirements.

* the Data paradigm needs to be much more in line with the
  nmigen "Value" object (i.e. the nmigen API, such as
  using __iter__ instead of members() (or perhaps ports()),
  having an eq (instead of "assign") function etc.
  also note that Record is going to have a "connect" function
  added (see nmigen-migen Record compat library)

  in discussions with whitequark it was agreed that at some
  point a "flattening" system would be provided such that
  the nmigen IR "Visitor" system could handle arbitrary objects.

  therefore it becomes even more important that Data conform
  to nmigen's API, to avoid the "Principle of Least Astonishment"
  when presented with the Data paradigm:

  https://en.wikipedia.org/wiki/Principle_of_least_astonishment
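
a minimal sketch of what "conforming to the nmigen Value API" could
look like for Data (the class and its members are purely illustrative,
following the suggestions above):

```python
class Data:
    """illustrative Data object shaped like nmigen's conventions:
    eq() rather than assign(), __iter__ rather than members()."""
    def __init__(self, *parts):
        self.parts = list(parts)

    def __iter__(self):
        # walk sub-elements, as nmigen's Visitor/ports() would
        return iter(self.parts)

    def eq(self, other):
        # return a list of (lhs, rhs) assignment pairs
        return [(lhs, rhs) for lhs, rhs in zip(self, other)]

a = Data("x", "y")
b = Data(1, 2)
print(a.eq(b))  # → [('x', 1), ('y', 2)]
```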


EntryPort (and ExitPort)
---------

* o_ready and
* i_valid names were selected due to their vertical alignment.
  this increases code clarity.

* Signals default to width 1.  the addition of the width 1 is not needed.

* the name "EntryPort" begins with the letter E, and so does "ExitPort".
  this is problematic from a clarity perspective in later usage
  (see below)

* use of Data.assign for ready_in/out and valid_in/out, they should just
  be an eq.

* connect_from_entry is not clear what it does, by way of it being
  an EntryPort connecting *to* an EntryPort.  PrevControl._connect_in
  is clear (by using the prefix underscore as a standard python convention
  equivalent to "protected" in c++) that it should not be used outside
  of the API, and its docstring warns against such.

  with the name connect_from_entry being so similar to connect_from_exit
  it is very confusing, particularly as they are the exact same names
  in ExitPort

similar logic applies to ExitPort

CombSeg
-------

* comment is good.  "side-effect-free data path", very clear.

* self.i and self.o (see above about Entry/Exit Port) get very confusing.
  the naming "prev" and "next" (p and n for short) give clarity about
  what shall be connected to what, particularly in the context of
  ready/valid signalling going *both* directions.

* the use of the function create_data *on* an object - rather than being
  a function itself - forces a requirement on chaining of segments to
  provide an *intermediary* data object from which *both* the input
  object *and* the output object are forced to inherit.

  by simply having a function, the location of that function is irrelevant.
  it could be a globally-imported function from a module.  it could be
  a lambda function.  it could be a member class function.  it could be
  a static class function.

  this is what is meant by *semantic* relationships (LSP) as opposed to
  *syntactical* relationships.

* the use of an object (shape.create_data) *may* make the use of
  static classes difficult. needs evaluation and investigation

* the hard requirement for all derivatives to be modules is unnecessary
  (particularly in simple cases)


CombSegFn
---------

* Data.eq not Data.assign is better, although just "eq" is preferred
  (which can be used as "from the library import data" then "data.eq(...)")

* not sure what this class is for.  the large amount of data typing
  information is interfering with being able to understand it.

* once removed, the typing information should result in such a small
  class that it may turn out to be unnecessary.


CombChain
---------

* data typing is interfering with readability

* first_child can be merged into children in the constructor,
  making children_tuple redundant

* use of a tuple and the forced requirement that the tuple be an array
  of CombSegs is a hard and unnecessary limitation: a design flaw.

  simply passing in *children and assigning them to __children will
  allow developers to pass in anything that is iterable: generators,
  __iter__-objects, tuples, lists, sets, dictionaries (iterating a
  dictionary returns its keys) and so on.

* in elaborate() the assignment of submodules is a good idea... however
  it assumes that the object *has* an elaborate function (i.e. is
  a nmigen module), which is a burdensome requirement in simple cases

* the functionality StageChain._specallocate is missing.  in some
  use-cases, if an intermediary is not created from the module, the
  entire module is "flattened" with a warning issued.
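
the *children suggestion above can be sketched as follows (a minimal
illustration, not the proposal's actual code):

```python
class CombChain:
    """accept any iterable of children via *args, rather than
    requiring a hard-typed tuple of CombSeg instances."""
    def __init__(self, *children):
        self.__children = list(children)

    def children(self):
        return list(self.__children)

# generators, lists, sets -- anything iterable can be splatted in:
squares = (n * n for n in range(3))
assert CombChain(*squares).children() == [0, 1, 4]
assert CombChain(*["a", "b"]).children() == ["a", "b"]
```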


Identity
--------

* not sure what this is for


Block
-----

* self.entry and self.exit are names of non-matching lengths, resulting
  in loss of vertical alignment when reading and debugging the code

* reducing these to self.e and self.e is clearly not going to work

* this is why "p" and "n" were chosen in ControlBase.

* self.p and self.n are clearly "previous" and "next" yet at the same
  time provide a base to put the "input" (i) and "output" (o) into.

  data is referred to as self.p.i_data and self.n.o_data which make
  it really really obvious that input data comes from the *previous*
  stage and output data goes to the *next* stage.

* why does Block.wrap exist? the need for Block and BlockLike is not
  clear, and is probably down to the (unnecessary) type-restricting


CombBlock
---------

* why does CombBlock exist?

* why is connect_from_entry not being used in elaborate()?


Chain
-----

* all the same arguments apply to first_child and children as to
  CombChain. can be greatly simplified.

* likewise same arguments apply to elaborate()

* naming convention connect_from_entry / connect_from_exit is causing
  the very confusion predicted to occur.  what's being connected to
  what?

* the exposure of the inner details of Module is unnecessary.  an
  example of the equivalent code for the singlepipe.py API is:

    def elaborate(self, platform):
        m = Module()

        pipe1 = ExampleBufPipe()
        pipe2 = ExampleBufPipe()

        m.submodules.pipe1 = pipe1
        m.submodules.pipe2 = pipe2

        m.d.comb += self.connect([pipe1, pipe2])

  in this code it is really clear, the pipes are declared, they're added
  with names *chosen by the developer* as submodules, and then connected.

  by contrast, in Chain.elaborate(), the developer has no choice about
  the names of the modules (they are forcibly called child_0/1/2/3).
  (likewise for the names of CombChain segs)


SimpleReg
---------

* Signal(1, reset=0), both width 1 and reset=0 are the defaults and may
  be removed

* super().__init__(shape, shape) is not clear

* shape = Shape.wrap(shape) is not clear

* shape.create_data() again, see explanation about forcing the use
  of an object to become part of the API rather than allowing freedom
  and flexibility by way of a function

* self.data_out does not vertically line up with self.valid_out.

  - self.o_data lines up with
  - self.o_valid

  at least at the prefix.  code that was sequentially as follows
  would also line up, improving readability:

  self.o_data
  self.i_data
  self.o_valid
  self.i_ready

  which would make a user consider tidying them up as follows:

  self.o_valid
  self.o_data
  self.i_data
  self.i_ready


Queue
-----

* like this class a lot, prefer that it be derived from FIFOInterface,
  to make it clear that it conforms to the nmigen FIFO API.

* Queue entries can actually be zero, as long as fwft (first-word
  fall-through) is enabled.

* using the name "fwft" as opposed to "allow_comb" makes it clear
  that it has the same purpose and effect as fwft in the nmigen FIFO API.
Comment 5 Luke Kenneth Casson Leighton 2019-04-20 11:06:39 BST
i thought about the FSM case, where the stb/ack is at a "disadvantage"
due to it being a 2-cycle acknowledgement... and it occurred to me
that this is just a natural consequence of an FSM within a ready/valid
prev/next pair.

in other words: even an FSM placed within a ready/valid prev/next pair
*has* to fall back to 2-cycle behaviour, and, in fact, the names stb/ack
*are* ready/valid.

this is down to the fact that the FSM must *only* permit a single clock
of "ready/valid" on its input, and must *only* permit a single clock
of "ready/valid" on its output.

so... i believe it will be safe to proceed with an example FSM
that's derived from ControlBase, that (unfortunately) has to handle
the ready/valid signalling itself.

if that is successful, it *may* be possible to then morph it to use
the stage-overloads d_ready and d_valid.  d_ready being raised on
the "incoming" (idle) state, and d_valid being raised on the "outgoing"
(final, data-delivery) state.

at that point, if *that* is successful, the FSM control/data block will
become *independent* of ControlBase, and may potentially be split out
to derive from Stage/StageCls and handed in as an argument to ControlBase
derivatives such as SimpleHandshake

once this is done it will be possible to place fpdiv into an ALU block,
and general development of the OoO design can proceed.
Comment 6 Luke Kenneth Casson Leighton 2019-04-20 13:00:34 BST
https://git.libre-riscv.org/?p=ieee754fpu.git;a=blob;f=src/add/test_fsm_experiment.py;h=51476ccf4be50d74b503c861085c1c67cc08fef7;hb=HEAD

success!  total shock!

        # see if connecting to stb/ack works
        m.d.comb += self.p.o_ready.eq(self.fpdiv.in_a.ack)
        m.d.comb += self.fpdiv.in_a.stb.eq(self.p.i_valid_test)

        m.d.comb += self.n.o_valid.eq(self.fpdiv.out_z.stb)
        m.d.comb += self.fpdiv.out_z.ack.eq(self.n.i_ready_test)

and... err... they do.

well, that was a lot easier than i was expecting it to be.
Comment 7 Luke Kenneth Casson Leighton 2019-04-20 16:33:35 BST
created an FPOpIn and FPOpOut which, instead of deriving
from Trigger (with ack/stb), derive from PrevControl
and NextControl: this seems to work well.

the two FSMs, fpdiv and fpadd, have been converted and
remain functional.

however, unit test code making them based on ControlBase
hasn't been done yet.
Comment 8 Jacob Lifshay 2019-04-26 02:26:36 BST
(In reply to Luke Kenneth Casson Leighton from comment #4)
> review of
> https://salsa.debian.org/Kazan-team/simple-barrel-processor/blob/master/src/PipelineBuildingBlockAPIProposal.py
>
> can the typing be moved to a .pyi file and mypi deployed *if
> the user wants to use it*?
yes.

> Data, Shape, SignalShape, RecordShape, Empty, EmptyShape
> ----------.....
> 
> * addition of name extension (prefix) is a good idea.  however
>   as a hard requirement (a hard-required parameter) this introduces
>   a burden on developers.  if the same prefixing can be achieved in
>   the same way as nmigen, that would be even better.
yeah, we can add that.

> * restricting data usage using a syntactical hard limit instead
>   of defining a semantic relationship results in limitations
>   that a normal python developer will rebel against / reject.
> 
>   the limitation of *all* data using *forcibly* being required
>   to go through the "Data" class instance is also a burden that
>   will result in significant user demands for "additional features"
>   on the Data paradigm, to add support for unanticipated
>   requirements.
We can override __subclasshook__ like in collections.abc.Hashable:
https://github.com/python/cpython/blob/1b5f9c9653f348b0aa8b7ca39f8a9361150f7dfc/Lib/_collections_abc.py#L84

This makes any class that implements the 2 abstract functions a subclass of Data automatically (except that Data's methods won't be visible unless it actually derives from Data).
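
a minimal sketch of that __subclasshook__ approach (the method names
eq/shape are illustrative stand-ins for Data's actual abstract
functions):

```python
from abc import ABCMeta, abstractmethod

class Data(metaclass=ABCMeta):
    @abstractmethod
    def eq(self, other): ...

    @abstractmethod
    def shape(self): ...

    @classmethod
    def __subclasshook__(cls, C):
        # structural check: any class whose MRO provides both
        # methods is treated as a subclass of Data
        if cls is Data:
            needed = ("eq", "shape")
            if all(any(name in B.__dict__ for B in C.__mro__)
                   for name in needed):
                return True
        return NotImplemented

class PlainSignal:              # never mentions Data at all
    def eq(self, other): return []
    def shape(self): return 1

assert issubclass(PlainSignal, Data)
assert isinstance(PlainSignal(), Data)
```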
> 
> * the Data paradigm needs to be much more in line with the
>   nmigen "Value" object (i.e. the nmigen API, such as
>   using __iter__ instead of members() (or perhaps ports()),
>   having an eq (instead of "assign") function etc.
>   also note that Record is going to have a "connect" function
>   added (see nmigen-migen Record compat library)
this will require every new class to actually derive from Data unless they want to implement the zillion functions themselves.

Note that, last I checked, nmigen doesn't support using classes derived from Signal or Record since nmigen uses `type(a) is type(Signal)` instead of `isinstance(a, Signal)` in a lot of its code.

> EntryPort (and ExitPort)
> ---------
> 
> * o_ready and
> * i_valid names were selected due to their vertical alignment.
>   this increases code clarity.
we can use _inp and _out or _i and _o. I think we should have the name first (direction second) because the name is the most important part.

> * Signals default to width 1.  the addition of the width 1 is not needed.
ok.

> * the name "EntryPort" begins with the letter E, and so does "ExitPort".
>   this is problematic from a clarity perspective in later usage
>   (see below)
I wanted to pick a name pair other than input/output to avoid confusion with the signal directions.
What about ingress/egress, from/to, upstream/downstream, or next/previous? Or do you think input/output will be fine?
> 
> * use of Data.assign for ready_in/out and valid_in/out, they should just
>   be an eq.
They can't be a member function because Record.eq doesn't work (the simulator doesn't handle it), though I guess we could reassign Record.eq to a custom function that does work. 

> * connect_from_entry is not clear what it does, by way of it being
>   an EntryPort connecting *to* an EntryPort.  PrevControl._connect_in
>   is clear (by using the prefix underscore as a standard python convention
>   equivalent to "protected" in c++) that it should not be used outside
>   of the API, and its docstring warns against such.
> 
>   with the name connect_from_entry being so similar to connect_from_exit
>   it is very confusing, particularly as they are the exact same names
>   in ExitPort
> 
> similar logic applies to ExitPort
They need better naming. connect_from_* connects such that data is transferred from `rhs` to `self`.

> CombSeg
> -------
> 
> * comment is good.  "side-effect-free data path", very clear.
> 
> * self.i and self.o (see above about Entry/Exit Port) get very confusing.
>   the naming "prev" and "next" (p and n for short) give clarity about
>   what shall be connected to what, particularly in the context of
>   ready/valid signalling going *both* directions.
CombSeg has no ready/valid signalling, it's just a data-path.

> * the use of the function create_data *on* an object - rather than being
>   a function itself - forces a requirement on chaining of segments to
>   provide an *intermediary* data object from which *both* the input
>   object *and* the output object are forced to inherit.
No, it doesn't actually. all it requires is that the output shape of the previous CombSeg == the input shape of the current CombSeg, which is a structural comparison, not identity.

Note that Shape.wrap is used which converts whatever shape you pass in to an instance of Shape. only the Shape instance has to support create_data, whatever value you pass in does not.

>   by simply having a function, the location of that function is irrelevant.
>   it could be a globally-imported function from a module.  it could be
>   a lambda function.  it could be a member class function.  it could be
>   a static class function.
> 
>   this is what is meant by *semantic* relationships (LSP) as opposed to
>   *syntactical* relationships.
> 
> * the use of an object (shape.create_data) *may* make the use of
>   static classes difficult. needs evaluation and investigation
> 
> * the hard requirement for all derivatives to be modules is unnecessary
>   (particularly in simple cases)
It doesn't add any extra logic gates to the final design and it helps when finding the generated code for the output to be split up.
I don't want to have to debug a module made from 20 stages, because I won't be able to easily locate anything.

> CombSegFn
> ---------
> 
> * Data.eq not Data.assign is better, although just "eq" is preferred
>   (which can be used as "from import library import data" then
> "data.eq(...)")
I'm using Data as a place to stash lots of Data-related static functions. We can switch to a module if you like.

> * not sure what this class is for.  the large amount of data typing
>   information is interfering with being able to understand it.
this is for wrapping functions into CombSeg instances.

> CombChain
> ---------
> 
> * data typing is interfering with readability
> 
> * first_child can be merged into children in the constructor,
>   making children_tuple redundant
ok. I had it to enforce at least 1 child.

> * use of a tuple and the forced requirement that the tuple be an array
>   of CombSegs is a design-flaw hard and unnecessary limitation.
the *children arg is already a tuple.

>   simply passing in *children and assigning them to __children will
>   allow developers to pass in anything that is iterable.  generators,
>   __iter__-objects, tuples, lists, sets, dictionaries (iterating a
>   dictionary returns its values) and so on.
you can do that already by calling CombChain(*iterable)

> * in elaborate() the assignment of submodules is a good idea... however
>   it assumes that the object *has* an elaborate function (i.e. is
>   a nmigen module), which is a burdensome requirement in simple cases
in simple usecases just use CombSegFn(lambda ...)

> * the functionality StageChain._specallocate is missing.  some use-cases,
>   if the creation of an intermediary from a module is not done, the
>   entire module is "flattened" with a warning issued.
not sure what that's supposed to do.

> Identity
> --------
> 
> * not sure what this is for
when you need a canonical pass-through stage. probably not particularly useful.
> 
> 
> Block
> -----
> 
> * why does Block.wrap exist? the need for Block and BlockLike is not
>   clear, and is probably down to the (unnecessary) type-restricting
Block.wrap is a function that takes an instance of Block or CombSeg and returns an instance of Block. It is needed because CombSeg does not have the ready/valid signalling that Block instances do.
> 
> 
> CombBlock
> ---------
> 
> * why does CombBlock exist?
the class used by Block.wrap to wrap a CombSeg
> 
> * why is connect_from_entry not being used in elaborate()?
connect_from_entry only works on instances of EntryPort. CombSeg doesn't contain any instances of EntryPort because you had suggested that we need a class that doesn't have accessible ready/valid signalling, hence why I had added CombSeg at all and not gone with my original plan of having only Block.

> Chain
> -----
> 
> * all the same arguments applies to first_child and children as to
>   CombChain. can be greatly simplified.
> 
> * likewise same arguments apply to elaborate()
> 
> * naming convention connect_from_entry / connect_from_exit is causing
>   the very confusion predicted to occur.  what's being connected to
>   what?
see above on renaming connect_from_*.
> 
> * the exposure of the inner details of Module is unnecessary.  an
>   example of the equivalent code for the singlepipe.py API is:
> 
>     def elaborate(self, platform):
>         m = Module()
> 
>         pipe1 = ExampleBufPipe()
>         pipe2 = ExampleBufPipe()
> 
>         m.submodules.pipe1 = pipe1
>         m.submodules.pipe2 = pipe2
> 
>         m.d.comb += self.connect([pipe1, pipe2])
> 
>   in this code it is really clear, the pipes are declared, they're added
>   with names *chosen by the developer* as submodules, and then connected.
> 
>   by contrast, in Chain.elaborate(), the developer has no choice about
>   the names of the modules (they are forcibly called child_0/1/2/3).
>   (likewise for the names of CombChain segs)
Chain is supposed to be like a tuple or list. We can add an API for named submodules either by modifying Chain or by adding a named tuple variant.

> SimpleReg
> ---------
> 
> * Signal(1, reset=0), both width 1 and reset=0 are the defaults and may
>   be removed
I'd like reset=0 to be left as it makes it explicit that the reset value actually matters whereas most other signals the reset value doesn't really matter.
so Signal(reset=0)

> * super().__init__(shape, shape) is not clear
both entry and exit data shapes are shape.

> *  shape = Shape.wrap(shape) is not clear
convert from arbitrary shape to Shape instance

> * shape.create_data() again, see explanation about forcing the use
>   of an object to become part of the API rather than allowing freedom
>   and flexibility by way of a function
see above

> * self.data_out does not vertically line up with self.valid_out.
> 
>   - self.o_data lines up with
>   - self.o_valid
> 
>   at least at the prefix.  code that was sequentially as follows
>   would also line up, improving readability:
> 
>   self.o_data
>   self.i_data
>   self.o_valid
>   self.i_ready
> 
>   which would make a user consider tidying them up as follows:
> 
>   self.o_valid
>   self.o_data
>   self.i_data
>   self.i_ready
I have no opinion about the assignment order, reorder as you please.

> Queue
> -----
> 
> * like this class a lot, prefer that it be derived from FIFOInterface,
>   to make it clear that it conforms to the nmigen FIFO API.
It doesn't necessarily conform to the nmigen FIFO API.

I want it to have the same interface as Block so it can be used in Block-based pipelines without friction.

> * Queue entries can actually be zero, as long as fwft (first-word
>   fall-through) is enabled.
that would actually make it identical to Identity, so I think that won't be useful.

> * using the name "fwft" as opposed to "allow_comb" makes it clear
>   that it has the same purpose and effect as fwft in the nmigen FIFO API.
allow_comb doesn't have the same purpose or effect as fwft. It specifically allows combinational forwarding, but also enables more than that. allow_comb=False specifically requires Queue to be a Moore FSM (outputs are produced only from flip-flop outputs) instead of a Mealy FSM (outputs are produced from a combinatorial combination of flip-flop outputs and input signals).
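
the Moore/Mealy distinction can be sketched with a single-slot
behavioural model (plain python, names hypothetical):

```python
def queue_output(stored, data_in, in_valid, allow_comb):
    """output of a one-entry queue within a single cycle.
    allow_comb=False -> Moore: output depends on stored state only.
    allow_comb=True  -> Mealy: an empty queue may forward its input
    combinatorially in the same cycle (first-word fall-through)."""
    if stored is not None:
        return stored
    if allow_comb and in_valid:
        return data_in
    return None

assert queue_output(None, 5, True, allow_comb=True) == 5
assert queue_output(None, 5, True, allow_comb=False) is None
assert queue_output(7, 5, True, allow_comb=False) == 7
```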
Comment 9 Jacob Lifshay 2019-04-26 02:43:51 BST
(In reply to Luke Kenneth Casson Leighton from comment #5)
> i thought about the FSM case, that the stb/ack is at a "disadvantage"
> due to it being 2-cycle acknowledgement... and it occurred to me
> that this is just a natural consequence of an FSM within a ready/valid
> prev/next pair.
> 
> in other words: even a FSM placed within a ready/valid prev/next pair
> *has* to fall back to 2-cycle behaviour, and, in fact the names stb/ack
> *are* ready/valid.
> 
> this is down to the fact that the FSM must *only* permit a single clock
> of "ready/valid" on its input, and must *only* permit a single clock
> of "ready/valid" on its output.
Actually, an FSM isn't limited to a data transfer every other clock, as evidenced by the fact that every digital logic circuit with only one clock and no combinatorial loops is an FSM, either a Mealy FSM or a Moore FSM.

The part that is limiting stb/ack (from what you told me) is that ack has to be de-asserted after every transfer.
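This can be made concrete with a toy cycle-by-cycle model (hypothetical function names, illustration only): a ready/valid handshake where both sides can hold their signals high completes a transfer every clock, whereas a protocol whose ack must be de-asserted after every transfer tops out at one transfer every two clocks.

```python
# Toy model: count completed transfers over N cycles for the two
# handshake disciplines described above.

def ready_valid_transfers(cycles):
    # both sides may hold ready and valid high, so a transfer
    # can complete on every single cycle
    transfers = 0
    for _ in range(cycles):
        ready, valid = True, True
        if ready and valid:
            transfers += 1
    return transfers

def stb_ack_transfers(cycles):
    # ack must be de-asserted for one cycle after every transfer,
    # so at best a transfer completes every other cycle
    transfers = 0
    ack = False
    for _ in range(cycles):
        stb = True
        if stb and ack:
            transfers += 1
            ack = False      # the forced de-assertion
        else:
            ack = True
    return transfers

print(ready_valid_transfers(10))  # 10
print(stb_ack_transfers(10))      # 5
```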
Comment 10 Luke Kenneth Casson Leighton 2019-04-26 11:03:46 BST
(In reply to Jacob Lifshay from comment #8)

> this will require every new class to actually derive from Data unless they
> want to implement the zillion functions themselves.

[more on this later, just want to make sure below isn't lost.  that
 "connect" function has been committed, and *one* of the uses of
 type() has been dealt with]

> Note that, last I checked, nmigen doesn't support using classes derived from
> Signal or Record since nmigen uses `type(a) is Signal` instead of
> `isinstance(a, Signal)` in a lot of its code.

this is extremely bad practice that was warned about well over 10 years
ago (and since everyone listened and stopped doing it the advice stopped
being so well known).  i am... alarmed that they're disregarding standard
practices in this way by continuing to use it.

it's sufficiently important to ask, can you raise a bugreport on github
about it, and cross-reference here and also this:
http://bugs.libre-riscv.org/show_bug.cgi?id=66

https://stackoverflow.com/a/1549854
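The practical difference is easy to demonstrate with stand-in classes (hypothetical names, not nmigen's actual code): an exact type() check rejects every subclass, while isinstance accepts them, which is what the Liskov substitution principle expects.

```python
# Stand-in classes: why `type(x) is Base` breaks subclassing
# while `isinstance(x, Base)` does not.

class Signal:                 # stand-in for nmigen's Signal
    pass

class NamedSignal(Signal):    # a user-defined subclass
    pass

s = NamedSignal()

# exact-type check: rejects the subclass
print(type(s) is Signal)      # False

# isinstance: accepts the subclass
print(isinstance(s, Signal))  # True
```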
Comment 11 Jacob Lifshay 2019-04-26 19:21:43 BST
turns out subclassing Record is supported, but not Signal.
Comment 12 Jacob Lifshay 2019-04-26 19:26:06 BST
(In reply to Luke Kenneth Casson Leighton from comment #10)
> it's sufficiently important to ask, can you raise a bugreport on github
> about it
https://github.com/m-labs/nmigen/issues/65
Comment 13 Luke Kenneth Casson Leighton 2019-04-26 20:15:58 BST
(In reply to Jacob Lifshay from comment #11)
> turns out subclassing Record is supported, but not Signal.

It is now, as of approximately last week, yes.
Whitequark and I discussed this in some detail, and I managed to convince wq
to fix a bug related to iteration that
was stopping eq, Cat and many other things
from working.

Basically Record is viewed by whitequark
as a contiguous sequence of bits, hence
why it derives from Value.

In the discussions that we held she acknowledged
that that is a conceptual perspective; in
reality, the internal structure (fields)
is clearly an ordered bank of signals.

Iteration in python is done as follows:

If obj has __iter__, call it in sequence,
accumulating its yielded results, one by one.

Else, call __getitem__ with an increasing
integer sequence until that raises an
IndexError.

The addition of __getattr__ was originally
intended to make it possible to refer to
fields by name: rec.src1 where the layout
has src1.

Unfortunately Record also inherits Value's
bit-level __getitem__ (for slicing), which
has the unexpected side effect of interfering
with iteration: the fallback yields individual
bits.
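The two iteration protocols can be demonstrated in plain Python (hypothetical demo classes, nothing nmigen-specific):

```python
# Demo of the two iteration protocols.

class WithIter:
    # has __iter__: iteration yields whatever it yields
    def __iter__(self):
        yield from (1, 2, 3)

class LegacySeq:
    # no __iter__: python falls back to calling __getitem__ with
    # 0, 1, 2, ... until IndexError is raised
    def __getitem__(self, i):
        if i >= 3:
            raise IndexError(i)
        return i * 10

print(list(WithIter()))   # [1, 2, 3]
print(list(LegacySeq()))  # [0, 10, 20]
```

A Record without __iter__ hits the second path, which is why iteration degenerates to bit-by-bit access.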

I asked whitequark to add __setattr__
to Record, she refused, citing that it
does not fit with how Record is supposed
to be used (as a sequence of bits)

I asked whitequark to add __iter__ because
not having it causes assignment on flattening
to go haywire, creating a massive array of
bit-level assignments even between two
Records.

With __iter__ added, there are no detrimental
side effects, and the assignment of a
Record to a sequence of Signals is done
with the expected minimum of eq statements.

This is still potentially open to debate
if we can create a suitably small demo
use case.

On __setattr__, unfortunately, wq remains
closed to feedback, viewing Record as a
sequence of bits, not a hierarchical sequence
of Signals-that-happen-to-be-bits.

Therefore we need to create a class that
derives from Record and adds both functions.

I considered monkey-patching Record, however
if deriving from Record is indeed supposed
to work, bugs can be raised if any uses of
type have been missed, with the monkey patch
to be removed once the bugs are fixed.

RecordObject has the two functions.
It works very well.
Comment 14 Luke Kenneth Casson Leighton 2019-04-26 20:26:31 BST
(In reply to Jacob Lifshay from comment #12)
> (In reply to Luke Kenneth Casson Leighton from comment #10)
> > it's sufficiently important to ask, can you raise a bugreport on github
> > about it
> https://github.com/m-labs/nmigen/issues/65

Sigh. The correct solution is to use ABCs or duck typing (the technique used in c++ RTTI), as can be found by googling the many, many discussions of why type() should be the absolute last resort.

The introduction of a function or property in the base class that each derivative sets to uniquely identify itself: that is the right solution.  An integer (enum) would be the fastest way, as it is right at the top of the python FORTH-like interpreter in the c code.

Although, an enum would not be extensible unless the enum itself was dynamic.

Or, ABCs apparently can be used to do the same thing, although I hazard a guess nowhere near as fast as using an integer ducktype would be.

class Value:
    def __init__(self, *args, **kwargs):
        self.nmtype = 1  # or an enum member
Comment 15 Luke Kenneth Casson Leighton 2019-04-26 20:53:26 BST
> > * the Data paradigm needs to be much more in line with the
> >   nmigen "Value" object (i.e. the nmigen API, such as
> >   using __iter__ instead of members() (or perhaps ports()),
> >   having an eq (instead of "assign") function etc.
> >   also note that Record is going to have a "connect" function
> >   added (see nmigen-migen Record compat library)
> this will require every new class to actually derive from Data unless they
> want to implement the zillion functions themselves.

Two. Not zillions.  One: it has to be iterable and (eventually, at some nested depth) start yielding Signals (or Records).

And the second, I added a ports() function due to a workaround associated with ArrayProxy. This may be possible to remove at some point if the bug in nmigen is fixed.

That is the full extent of the current data API.

nothing else.

So this is how tuples, generators, lists, iterators, objects that are iterable, *anything* can be the Data API.

As long as *eventually* the iterator starts returning Signals (or Records), it is perfectly fine.

Thus there is no *need* to restrict the API to a specific object.

Put another way: if such a Data object is created, the next recommendation is to allow *lists* of such objects... and we come full circle right back to why eq() exists.

And if Data object derives from Record, then because Record supports the "zillions" of functions, and derives from Value, you might as well allow that list to have Signals and anything else derived from Value in future.

The zillions it turns out aren't zillions, it is actually just iteration capability, eq, shape and.. there's one more, I forget the name.

Everything else basically is intended to make assignment of one type of Value derivative compatible transparently with another, including when slices and arrays are involved.  Bugs notwithstanding.

So the question is, then, with iterables being capable of *being* the Data API, do we want to cut off that capability?

Or, is it possible to make Data iterable, and for it to transparently be capable of handling assignment to an iterable (list, tuple, generator, sequence)?

I suspect that the latter will be quite a bit of work, whereas eq() now basically already exists.

and works.

with over 45 use cases.

Another way to put it: can you think of a good reason NOT to make the API be "an arbitrary nested generator that eventually yields derivatives of Value"?

Where one of those objects could be Data (as long as Data derives from Record)
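A minimal sketch of that convention, in plain Python (the real nmoperator.eq returns nmigen assignment statements rather than mutating values; the Sig stand-in here is hypothetical, purely to show the recursion shape):

```python
# Sketch of the data-API convention: eq() walks two arbitrarily
# nested iterables in lockstep until it reaches leaf objects that
# implement .eq() (stand-ins for Signal/Record here).

class Sig:                        # hypothetical stand-in for Signal
    def __init__(self, v=0):
        self.v = v
    def eq(self, other):
        self.v = other.v          # stands in for an nmigen assignment

def eq(o, i):
    if hasattr(o, "eq"):          # leaf: a Value-like object
        o.eq(i)
        return
    for ox, ix in zip(o, i):      # nested iterable: recurse
        eq(ox, ix)

dst = [Sig(), (Sig(), [Sig()])]
src = [Sig(1), (Sig(2), [Sig(3)])]
eq(dst, src)
print(dst[0].v, dst[1][0].v, dst[1][1][0].v)  # 1 2 3
```

Lists, tuples, generators and Records all satisfy the convention for free: the only requirement is that iteration eventually bottoms out at something with .eq().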
Comment 17 Luke Kenneth Casson Leighton 2019-04-27 14:07:46 BST
jacob let's go through these, in order, one at a time, it's the
only real sane way to get through them all, just with a lot of
patience.

on the "name=" for input data and output data spec, i added this wrapper:

def _spec(fn, name=None):
    if name is None:
        return fn()
    varnames = dict(inspect.getmembers(fn.__code__))['co_varnames']
    if 'name' in varnames:
        return fn(name=name)
    return fn()

basically, this keeps the API simple.  developers who do not *want*
to add a name prefix are not forced to do so... and do not require
adding "name=None" to the input and output spec function.

https://git.libre-riscv.org/?p=ieee754fpu.git;a=commitdiff;h=8d38dc3d0ead75ff5313a653a2194c82a1ca0572

i modified one of the unit tests and visually inspected the graphviz,
it works well.  i_data_INPUTNAME and o_data_OUTPUTNAME, and intermediaries,
all good.

also, chaining (both types: combinatorial and pipelining) have intermediary
names added where (if) ispec/ospec is deployed.

i know i suggested using nmigen-style name-None overriding: it'll get
too complicated.  adding name= to input spec and output spec as originally
suggested however _will_ work... i just prefer that it not be forced onto
developers that don't want it (hence the dynamic check)
Comment 18 Luke Kenneth Casson Leighton 2019-04-27 14:10:29 BST
(In reply to Jacob Lifshay from comment #16)
> Additional reference material on FSMs:
> http://web.archive.org/web/20180720071048/http://www2.elo.utfsm.cl/~lsb/
> elo211/aplicaciones/katz/chapter8/chapter08.doc4.html

good to know (so it's not lost), will get to comment #9 and #16 later,
let's break things down and go in sequence through the review, one thing
at a time, in order (as much as possible).
Comment 19 Luke Kenneth Casson Leighton 2019-04-27 15:15:36 BST
(In reply to Jacob Lifshay from comment #8)

> > * o_ready and
> > * i_valid names were selected due to their vertical alignment.
> >   this increases code clarity.
> we can use _inp and _out or _i and _o. I think we should have the name first
> (direction second) because the name is the most important part

 yep i like that, it's also a convention used in ariane, to put the
 _o and _i as suffixes on signal names.

 i did a massive system-wide global/search/replace set of commits,
this one was i_data/o_data --> data_i/data_o

https://git.libre-riscv.org/?p=ieee754fpu.git;a=commitdiff;h=527ca98dabd8fdfad8a5f58f2d3cd859bd0443b5

unit tests all still pass (must clean those up so can just run "make"...
*sigh*)
Comment 20 Luke Kenneth Casson Leighton 2019-04-27 15:25:58 BST
(In reply to Jacob Lifshay from comment #8)

> >   the limitation of *all* data using *forcibly* being required
> >   to go through the "Data" class instance is also a burden that
> >   will result in significant user demands for "additional features"
> >   on the Data paradigm, to add support for unanticipated
> >   requirements.
> We can override __subclasshook__ like in collections.abc.Hashable:
> https://github.com/python/cpython/blob/
> 1b5f9c9653f348b0aa8b7ca39f8a9361150f7dfc/Lib/_collections_abc.py#L84
> 
> This makes any class that implements the 2 abstract functions to be a
> subclass of Data automatically (except that Data's methods won't be visible
> unless actually deriving from Data).

 https://stackoverflow.com/questions/40764347/python-subclasscheck-subclasshook

 this looks alarmingly complex.  i have learned the hard way that complex
 code that solves particularly difficult python problems using advanced
 features of python is REJECTED BY THE USERS.

 technically it is correct... yet because the features deployed are so
 hard to understand they simply do not trust it, as it requires extraordinary
 skill and intelligence to debug if it goes wrong.

 i can honestly say that i detested it when this was quoted to me.

 crucially: if the data API is "an arbitrary nested generator that
 eventually yields derivatives of Value", i believe it would not
 be necessary to even use subclasscheck/hook.

 does the generator approach make sense?
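For concreteness, a minimal runnable sketch of the __subclasshook__ pattern under discussion (StageLike, MyStage and the method names are hypothetical, chosen for illustration): any class providing the named methods becomes a virtual subclass without inheriting.

```python
from abc import ABCMeta

# Minimal __subclasshook__ sketch: any class that provides both
# ospec and process counts as a StageLike, without deriving from it.

class StageLike(metaclass=ABCMeta):
    @classmethod
    def __subclasshook__(cls, C):
        if cls is StageLike:
            if all(any(m in B.__dict__ for B in C.__mro__)
                   for m in ("ospec", "process")):
                return True
        return NotImplemented

class MyStage:                    # never mentions StageLike
    def ospec(self):
        return None
    def process(self, i):
        return i

print(issubclass(MyStage, StageLike))    # True
print(isinstance(MyStage(), StageLike))  # True
```

This is exactly the machinery that "technically works" yet requires readers to understand ABC metaclass hooks to debug.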
Comment 21 Luke Kenneth Casson Leighton 2019-04-27 20:26:51 BST
(In reply to Jacob Lifshay from comment #8)

> > * the name "EntryPort" begins with the letter E, and so does "ExitPort".
> >   this is problematic from a clarity perspective in later usage
> >   (see below)
> I wanted to pick a name pair other than input/output to avoid confusion with
> the signal directions.
> What about ingress/egress, from/to, upstream/downstream, or next/previous?

 from/to - from is a python keyword. ingress/egress too pretentious .. :)
 upstream/downstream ok, next/previous is ok.

> Or do you think input/output will be fine?

 i am horribly confused by input/output, because the direction of
 each of ready/valid is different and opposite for input/ingress/from/upstream
 compared to output/egress/to/downstream/next.

 p and n are sufficiently radically different letters from o and i that the
 four combinations are abundantly clear.

 (basically, i have a very mild form of dyslexia)

 p.ready_i
 p.valid_o
 n.ready_o
 n.valid_i

 these are *really* clear to me.  the vertical alignment seems to do something
 that breaks the (mild) dyslexia.


 i.ready_i
 i.valid_o
 o.ready_o
 o.valid_i
 
 i have absolutely *no* idea what's going on.  which one's input? what's the
 output? which one is for ready to the output? which output?  is that output
 from the stage? or is it output from the signal...

 total utter confusion.

 similarly with connect_to_entry and connect_to_exit.  both beginning with "e"
 and both having "to" and "from", i can stare at the piece of code for ten
 to twenty minutes and not understand it.

 the use of a totally different word ("connect_in") and the inclusion of
 an underscore (on the one that's *not* supposed to be used to connect
 stage-to-stage) is sufficient to indicate "this is a private function"
 and the logic/letter-based dyslexia is broken.


> > * use of Data.assign for ready_in/out and valid_in/out, they should just
> >   be an eq.
> They can't be a member function because Record.eq doesn't work (the
> simulator doesn't handle it),

 i was referring only to ready_in/out and valid_in/out, not Data in/out.
 ready_in/out are pure signals (not Data), and therefore eq works fine.
 (i.e. it's just Signal.eq, not Data.eq and not Record.eq)


> though I guess we could reassign Record.eq to
> a custom function that does work. 

 whilst Record has been "fixed", i will be absolutely bluntly honest: i am
 not happy to be beholden to the way that nmigen's development is being
 handled.

 we are at the mercy of not just whether they *decide* to fix a problem,
 even more than that, the main developer has expressed a clear and
 unequivocal hatred of python, a disregard for standard python
 conventions that have taken over twenty years to establish, and expects
 *python* to fix its "problems" rather than understand that not following
 its long-established conventions has massive detrimental knock-on effects
 on its users.

 i was extremely shocked to be told, on mentioning the (many) dangers of
 use of python wildcard imports, "python should not have been designed to
 be problematic, they should go fix that, it's not my problem".


 the global eq(), shape() and cat() functions (now moved to nmobject.py
 pending a decision) are under *our* control.

 Record.eq is not.

 i have already had to add workarounds into both eq() and cat() that
 take into consideration both bugs in nmigen and intransigence in its
 key developer.

 we cannot have our entire project at the mercy of a team that is overloaded
 and does not understand or respect python conventions, and who have
 demonstrated an irrational and pathological unwillingness to listen.

 this said despite the fact that nmigen is otherwise extremely good,
 and, i believe, worth sticking with.
Comment 22 Luke Kenneth Casson Leighton 2019-04-27 20:38:01 BST
(In reply to Jacob Lifshay from comment #8)

> > * Data.eq not Data.assign is better, although just "eq" is preferred
> >   (which can be used as "from import library import data" then
> > "data.eq(...)")
> I'm using Data as a place to stash lots of Data-related static functions. We
> can switch to a module if you like.

 temporarily nmoperator.py (would like it to be e.g. nmutils.operator)

 http://bugs.libre-riscv.org/show_bug.cgi?id=68

 amazingly, only 3 or 4 actual functions are needed.  eq, shape and cat
 are three key functions:

 * eq is obvious [at least, if nmigen's eq is understood, it's obvious]

 * shape and cat are needed in order to flatten data down and back out
   to and from a straight signal.

 shape and cat get used inside the FIFOControl object (which in turn
 uses that excellent Queue module that you wrote).

 shape() is used to find out, from the *recursively-iterable*
 thing-of-Value-derivatives, what size of Signal() would be needed to store it
 (as a sequence of bits)

 cat() is used to flatten() and *de*-flatten() the incoming and outgoing
 hierarchy-of-Value-derivatives so that ONLY ONE Queue object is needed.

 without cat() and shape() it would be an awful mess-of-code, analysing
 the data and recursively allocating (*and connecting up*) a massive suite
 of Queue objects.

 this prospect was so dreadful that i added the two extra functions to the
 data-handling API.

 if it wasn't for that, the data-handling API would *literally* be one
 function:

 eq()!
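A pure-Python analogy (emphatically not the nmigen implementation; integer packing stands in for Signal concatenation) of what shape() and cat() buy: flatten a nested hierarchy of fixed-width fields into one flat word so a single Queue can store it, then split it back out.

```python
# Analogy for shape()/cat(): pack several fixed-width fields into
# one flat integer (LSB-first), and unpack them again.

def shape(widths):
    # total bit width needed for the flattened signal
    return sum(widths)

def cat(values, widths):
    # pack values LSB-first into one integer
    flat, shift = 0, 0
    for v, w in zip(values, widths):
        flat |= (v & ((1 << w) - 1)) << shift
        shift += w
    return flat

def uncat(flat, widths):
    # unpack the flat integer back into individual fields
    out = []
    for w in widths:
        out.append(flat & ((1 << w) - 1))
        flat >>= w
    return out

widths = [4, 8, 2]                  # e.g. three fields of a record
vals = [0xA, 0x5C, 0b01]
flat = cat(vals, widths)
print(hex(flat))                    # 0x15ca
print(uncat(flat, widths) == vals)  # True
```

One flat storage element instead of a recursively-allocated suite of Queues: that is the mess-of-code the two extra API functions avoid.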
Comment 23 Luke Kenneth Casson Leighton 2019-04-27 20:47:38 BST
(In reply to Jacob Lifshay from comment #8)
> > * Signal(1, reset=0), both width 1 and reset=0 are the defaults and may
> >   be removed
> I'd like reset=0 to be left as it makes it explicit that the reset value
> actually matters whereas most other signals the reset value doesn't really
> matter.
> so Signal(reset=0)

 ok - yes, the usual convention there is to add a comment, so that people
 are not caught out by surprise at the explicit mention of something that's
 known to be a default value.

 which is ok with me, if you feel it helps readability, to do that.

 however... this masks something that's really important: if the reset value
 doesn't matter, it's *really important* to put "reset_less=True" to not only
 make that absolutely clear, but to *STOP* nmigen from generating unnecessary
 gates that the optimiser will in *no way* be able to spot.

 which is why i mentioned in an earlier comment that reset_less=True should,
 really, be the preferred... what's the word... "pattern" / "distinguisher".


 in addition, one thing that i've found you really *really* have to watch
 out for: if the width is not equal to 1, sometimes setting an integer
 constant is NOT ACCURATE.

 it results in assignment of the WRONG NUMBER OF BITs.  i have seen only
 the LSB(s) get set due to incorrect (implicit) width detection that goes
 wrong.

 you can see evidence of having to work around this bug in the FPU exponent
 handling, where i have to create Const() objects of specific lengths
 to match what the exponent's bitwidth is *SUPPOSED* to be.

 basically what i am saying is: be careful here! :)
Comment 24 Luke Kenneth Casson Leighton 2019-04-27 21:26:09 BST
(In reply to Jacob Lifshay from comment #8)

> > * first_child can be merged into children in the constructor,
> >   making children_tuple redundant
> ok. I had it to enforce at least 1 child.

 [see below...]
 
> > * use of a tuple and the forced requirement that the tuple be an array
> >   of CombSegs is a design-flaw hard and unnecessary limitation.
> the *children arg is already a tuple.

 [...]
 
> >   simply passing in *children and assigning them to __children will
> >   allow developers to pass in anything that is iterable.  generators,
> >   __iter__-objects, tuples, lists, sets, dictionaries (iterating a
> >   dictionary returns its values) and so on.
> you can do that already by calling CombChain(*iterable)

 yes... except... that's... ok, there are several aspects to this.

(1) in some ways, abuse of a function is not really Your Problem
    (as the developer).

    think of it this way: if the function's called with the wrong
    number of arguments (zero), it's going to barf (throw an exception)
    if you don't have a check, and it's going to barf (throw an exception)
    if you *do* have a check.

    therefore... um... save yourself some typing... and don't bother
    checking!

    now, if this was a statically-typed language, this situation just
    wouldn't happen.

    however, this "bug" will in no way be detected by pylint or anything
    like it... so just accept it, roll with it, and save the typing.

    less code is better.

    however... see below (at end)

(2) if you can save yourself some typing by not coding "defensively"
    you can *also* save yourself some typing by not having to (unnecessarily)
    construct a tuple.

(3) if extra arguments are needed (which they are: see specallocate,
    i updated the docstring to explain why it may be needed), it gets...
    confusing.  specallocate (a boolean) *might* be accidentally included
    in the *args (either visually on reading or in actual code)

(4) *args is typically used to pass in *disparate* (variable numbers of)
    arguments, like this:

    def fn(*args, **kwargs):
        if len(args) == 2:
            x, y = args
            z = "a default string"
        elif len(args) == 3:
            x, y, z = args

    although, what you're supposed to do, obviously, is have z in the
    keywordargs.

    in this particular case, however, you *know* that the children is
    a variable-length list.

so... it's kiiinda abusing the *args paradigm, it's not really what
it was designed for, it costs CPU cycles, places limitations on what
the developer can (and has to) do, and could be confusing.

anyway.

you'll see, despite all that, i did add assertions on StageChain's
children list argument.  the reason is, the error (the exception)
may occur further into the development of the code, i.e. not *at*
the point at which StageChain is used.

this is a valid / legitimate reason for having the assert check.
however... overloading the *args capability of python... not so much.
Comment 25 Luke Kenneth Casson Leighton 2019-04-27 23:26:58 BST
(In reply to Luke Kenneth Casson Leighton from comment #22)
> (In reply to Jacob Lifshay from comment #8)
> 
> > > * Data.eq not Data.assign is better, although just "eq" is preferred
> > >   (which can be used as "from import library import data" then
> > > "data.eq(...)")
> > I'm using Data as a place to stash lots of Data-related static functions. We
> > can switch to a module if you like.
> 
>  temporarily nmoperator.py (would like it to be e.g. nmutils.operator)

replaced several uses of eq, and the (few) where shape() and cat() are used,
with explicit "nmoperator.eq()" etc.

https://git.libre-riscv.org/?p=ieee754fpu.git;a=commitdiff;h=67759929b06f91493c93a533018f9b4653e54b5f

there may be more of those to catch.  unit tests still operational...
Comment 26 Luke Kenneth Casson Leighton 2019-04-28 12:25:32 BST
(In reply to Jacob Lifshay from comment #8)

> > * the use of the function create_data *on* an object - rather than being
> >   a function itself - forces a requirement on chaining of segments to
> >   provide an *intermediary* data object from which *both* the input
> >   object *and* the output object are forced to inherit.
> No, it doesn't actually. all it requires is that the output shape of the
> previous CombSeg == the input shape of the current CombSeg, which is a
> structural comparison, not identity.

 you may have misunderstood as i didn't give an example.  let me try
 to write one.  the issue is:

 * that there is only one name for the function that creates data.
 * that in creating a Chain of Blocks, each Block can no longer
   tell the chaining function which data is its input and which its output

 so a Stage (Block) is declared... except to be honest i cannot for the
 life of me see how to even start to do one Stage, let alone chain two
 together!

 which brings me on to something very important - a flaw in the proposed
 setup of EntryPort and ExitPort - that i missed before.

 i have encountered situations - usually involving modules - where it is
 NOT POSSIBLE to determine, at constructor time, what data is supposed to
 go into the Stage.

 the only way to know is during elaboration.

 yet, the elaborate function is clearly inappropriate for overloading as
 a constructor.

 in addition, i also encountered situations - again, involving modules -
 where constructing the data ***FAILS*** because the nmigen Modules on
 which it critically depends have not yet been set up.  only if those
 modules are correctly initialised during *elaboration* (and the data
 constructed at that time as well) will it work.


 *this is one of the many reasons i separated Data out from the I/O control*.

 class EntryPort:
    def __init__(self, name: str):
--->    self.data_in = None # CAN ONLY BE SAFELY SET UP LATER  <----
        self.ready_out = Signal(1, name=name + "_ready_out")
        self.valid_in = Signal(1, reset=0, name=name + "_valid_in")


 so the subdivision of "labour" so to speak is:

 * A "convention" for Data [a generator with eventual yield of something
   that derives from Value.  yes that includes Consts!]

 * PrevControl for IO signalling to previous stage (no data involved)

 * NextControl for IO signalling to next stage (no data involved)

 * Stage API with ispec, ospec and process function for combinatorial block
   and specifying both the input data format and output data format

   NO IO SIGNALLING IS INVOLVED.  clean separation of data from IO

 * StageChain conforming to the same DATA-HANDLING-ONLY Stage API to
   chain (potentially hierarchically) Stages together

   NO IO SIGNALLING IS INVOLVED.  clean separation of data from IO

 * ControlBase which brings Stage (data) together with Prev and Next
   IO Control

   *THIS* is where IO signalling meets Data... yet they are still separate

 * ControlBase.connect for connecting chains of ControlBase instances together
   (creating actual multi-stage pipelines in the process)

 * Derivatives of ControlBase actually manage the prev/next sync, and
   pass data around, taking care of when it is "processed", according to
   the design of that *derivative*... *NOT* when the *STAGE* decides data
   shall be processed.


 you are proposing:

 * Data for mandatory control of data formats (module or class)

 * Shape for mandatory restriction of data allocation through one and only
   one specifically-named member function of Data (create_data).

 * EntryPort for IO signalling to previous stage *AND* allocation of DATA
   (which will fail under certain circumstances)
 
 * ExitPort for IO signalling to next stage *AND* allocation of DATA
   (which will fail under certain circumstances)

 * CombSeg for bringing previous and next data together (despite the
   fact that Block *ALSO* brings data together).

   NO PROCESSING IS INVOLVED OR POSSIBLE (that's CombSegFn)

 * CombSegFn for providing a means of (combinatorially) linking input data
   to processing to output.

   This is basically functionally directly equivalent to the Stage API
   (except that it is mandatory that it be a nmigen module, where the
    Stage API permits non-module scenarios, hence the difference between
    Stage.process and Stage.setup)

 * Block for bringing EntryPort and ExitPort together (despite CombSeg
   already *having* Data)

 * CombBlock as a really unclear (and unnecessary) workaround for CombSeg
   already having in/out Data...


the proposed API goes awry in several places:

* certain Blocks may wish to pre-process or post-process, depending on
  design.  CombSegFn prevents and prohibits that decision-making process
  by *BLOCKS THEMSELVES* - by *design*

* Entry/ExitPort creating data in the constructor runs into difficulties
  with nmigen module elaboration

* several design restrictions which limit the scope of possibilities.

* a workaround (Block.wrap) for double-allocation (by design) of data,
  where only one class is strictly necessary to take responsibility for
  data emplacement.
Comment 27 Luke Kenneth Casson Leighton 2019-04-28 13:11:24 BST
Hendrik Boom via lists.libre-riscv.org 
12:21 PM (47 minutes ago)

> Even without dyslexia, I find it useful to have significant visual 
> differences between words of the same kind.  If one reads text with an 
> expectation of its meaning (the usual way) it is very easy to misread 
> the word that's present as the word you expect.

exactly.  turns out that most humans can easily read sentences where the
middle letters of all words are jumbled up.  the brain skips all but
the first and last letter for context / meaning.
Comment 28 Luke Kenneth Casson Leighton 2019-04-28 18:38:50 BST
jacob i'm going to think for a few days about CombSegFn as despite
the potential limitations i like the idea of data being transparently
transformed from an input format to an output format.
Comment 29 Luke Kenneth Casson Leighton 2019-04-28 21:22:46 BST
(In reply to Luke Kenneth Casson Leighton from comment #28)
> jacob i'm going to think for a few days about CombSegFn as despite
> the potential limitations i like the idea of data being transparently
> transformed from an input format to an output format.

Turns out that in every single instance where data_i is used, it goes directly through the Stage.process() function.

Therefore, fascinatingly, the StageCombFn would work well, removing the processing from ControlBase and making it the Stage block's responsibility.

If needed.

I wonder if this can be done with decorators in some way.
Comment 30 Luke Kenneth Casson Leighton 2019-04-28 23:01:44 BST
(In reply to Luke Kenneth Casson Leighton from comment #29)
> (In reply to Luke Kenneth Casson Leighton from comment #28)
> > jacob i'm going to think for a few days about CombSegFn as despite
> > the potential limitations i like the idea of data being transparently
> > transformed from an input format to an output format.
> 
> Turns out that every single instance where data_i is used it goes directly
> through the Stage.process() function.

 damn.  except in MultiOut (which can be fixed) and MultiIn (which can't).

 MultiIn needs to select the input source *before* processing, otherwise
 it will result in multiple processing blocks being created (one for
 every muxed-in data source) if the data source itself is considered
 to be a transparent self-processing combinatorial block (aka "CombStageFn").

 if however *post* processing was permitted, that would solve the problem.
Comment 31 Luke Kenneth Casson Leighton 2019-04-29 01:19:45 BST
https://git.libre-riscv.org/?p=ieee754fpu.git;a=commitdiff;h=2b26be5b974d43047bf096a555408a62bab2c4eb

i hate it when this happens!

all that succeeded in doing was: putting a totally useless wrapper
around Stage.

greaaat.

am going to back out these last few commits.

what *did* come out of the exercise was: it would actually be much
more practical, i believe, to make "Stage.process()" optional.

as in, if it does not exist, no processing occurs.

as in: if there is no process() function, the Stage instance is
basically a pass-through.
Comment 32 Luke Kenneth Casson Leighton 2019-04-29 04:19:52 BST
ok so now Stage API process() function is optional, which has a few side-effects:

(1) the Stage API is now semi-equivalent to CombSeg (key difference being
    that the Stage API may be complied with by a static class)

(2) if a process() function is not provided, it is effectively a declaration
    of "pass-through" capability

(3) if a process() function *is* provided, it is effectively directly
    equivalent to CombSegFn, the key difference being that CombSegFn
    is a nmigen module, which presents some awkwardness in setting up
    (that Stage API setup() was specifically added to avoid)

i also found that i quite liked the wrapper class... so despite reverting
the changes, i ended up adding it sort-of back in, just in a different way

also i kept the ControlBase.set_data function as its use simplifies
ControlBase.connect (makes it look cleaner)
Comment 33 Luke Kenneth Casson Leighton 2019-04-29 05:16:31 BST
fascinating.  i just split out everything associated with the Stage API
into its own separate module... and there is not a single import from
nmigen anywhere in existence.

this i would consider to be a really good sign.  yes, really: Stage API,
which is *handling* Combinatorial Block Processing and *defining* data
formats, has *no imports of or anything to do with nmigen*

the only evidence that it "touches" nmigen in any way is through
use of nmoperator.eq and in the case of setup() it assigns some
eq()s to m.d.comb.

the Stage API *does not "Do"*, it *Defines*.
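to illustrate, a plain-python sketch of that shape, with ints standing in for nmigen Signals (the names ispec/ospec/process follow the discussion; the class itself is hypothetical): the Stage defines data formats and a transform, holds no data, and needs no nmigen import.

```python
# A static Stage: no instance state, pure definition.  The int-based
# "specs" are stand-ins for real (Signal(16), Signal(16)) specs.

class ExampleAddStage:
    @staticmethod
    def ispec():
        return [0, 0]          # defines the input data format

    @staticmethod
    def ospec():
        return 0               # defines the output data format

    @staticmethod
    def process(i):
        return i[0] + i[1]     # defines the combinatorial transform

result = ExampleAddStage.process([3, 4])
```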
Comment 34 Luke Kenneth Casson Leighton 2019-04-30 00:03:22 BST
(In reply to Jacob Lifshay from comment #8)

> > * why does Block.wrap exist? the need for Block and BlockLike is not
> >   clear, and is probably down to the (unnecessary) type-restricting
> Block.wrap is a function that takes an instance of Block or CombSeg and
> returns an instance of Block. It is needed because CombSeg does not have the
> ready/valid signalling that Block instances do.

interestingly this still didn't help me to understand.  it was only
when i realised that Block holds duplicate (redundant) data-specs to
CombSeg that i finally *deduced* that Block.wrap must exist as a means
to make CombSeg look like a Block, by connecting CombSeg's in/out data
to Block's in/out-data.

that this took me several attempts to understand, when i have been looking
at the code for some considerable time, has me concerned (as does the
duplication).

i am much more comfortable with Block not having data *until* it is
passed a Stage, where Stage says what the data format *is*.  not
*contains* the data: contains (specifies) what the *format* is.

Stage API specifies the format of the data and specifies how it is processed.

connecting Stage API compliant "things" together creates a combinatorial
chain that *STILL DOES NOT CONTAIN ANY DATA*.

when calling StageChain with "nospecallocate" it results in a *python*
function-chain which is in effect:

def process(i):
    return chain[0].process(chain[1].process(chain[2].process(i)))

this *still* does not allocate any data.

however due to some issues with nmigen modules, i added a variant on that
which can actually allocate *intermediary* ispec/ospec objects.  the python
intermediary ispec/ospec objects are not actually stored anywhere, and it's
important to appreciate that they do not have to be.  they are still part
of the nmigen HDL/AST.

in this way, it is possible to use and chain together *even static classes*,
because by virtue of the static classes not containing any actual data,
it's perfectly fine to chain their process() functions together.
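as a sketch of that chaining (chain_process is a hypothetical helper, not the real StageChain), composing the process() functions of even static classes allocates no data:

```python
from functools import reduce

# Hedged sketch: feed the input through each stage's process() in turn.
def chain_process(stages, i):
    return reduce(lambda x, stage: stage.process(x), stages, i)

class Inc:
    @staticmethod
    def process(i):
        return i + 1

class Double:
    @staticmethod
    def process(i):
        return i * 2

# Static classes: no instances, no stored data, just a function chain.
chained = chain_process([Inc, Double], 3)   # (3+1)*2
```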
Comment 35 Luke Kenneth Casson Leighton 2019-04-30 00:28:10 BST
(In reply to Jacob Lifshay from comment #8)

> >   self.o_valid
> >   self.o_data
> >   self.i_data
> >   self.i_ready
> I have no opinion about the assignment order, reorder as you please.

 it's not about me, it's about other people (not necessarily me)
 not being visually confused because there are clear and clean
 patterns in the code that make "differences" easier to spot.

> > Queue
> > -----
> > 
> > * like this class a lot, prefer that it be derived from FIFOInterface,
> >   to make it clear that it conforms to the nmigen FIFO API.
> It doesn't necessarily conform to the nmigen FIFO API.
> 
> I want it to have the same interface as Block so it can be used in
> Block-based pipelines without friction.

 that's what FIFOControl is for: to provide a transition between
 the API-of-Queue and the API-of-ControlBase (Block-esque)
 
> > * Queue entries can actually be zero, as long as fwft (first-word
> >   fall-through) is enabled.
> that would actually make it identical to Identity, so I think that won't be
> useful.

from a programmatic perspective, the way i see it, it would be very useful.
the alternative would have to be this:

class FIFOControl:
    def __init__(self, depth....):
        if depth == 0:
            # ARGH!  we actually needed to *derive* from Identity!
            # TODO: some sort of god-awful meta-classing
        else:
            # ARGH!  we need to meta-derive from Queue here!

or... *shudder*... provide wrapper-functions or worse have logic that
says, throughout the *entirety* of FIFOControl, "if depth == 0 do something
involving Identity else do something involving Queue"

it may actually turn out that that's what's needed anyway, so as to skip
some of the more complex parts of Queue when depth == 0.

it creates zero-width Signals to do so!

by being able to dynamically set the depth argument even to zero, 
it becomes possible to experiment *without* major code rewrites.
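one alternative to the meta-classing horror sketched above would be a plain factory function that picks the implementation by depth. Identity, Queue and make_fifo here are hypothetical stand-ins, purely to illustrate the shape:

```python
# Hedged sketch: select Identity vs Queue at construction time,
# instead of deriving FIFOControl from one or the other.

class Identity:
    def process(self, i):
        return i               # depth == 0: pure pass-through

class Queue:
    def __init__(self, depth):
        self.depth = depth
        self.buf = []          # stand-in for real FIFO storage

def make_fifo(depth):
    """Return an Identity when depth == 0, a Queue otherwise."""
    return Identity() if depth == 0 else Queue(depth)

zero = make_fifo(0)
four = make_fifo(4)
```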


> > * using the name "fwft" as opposed to "allow_comb" makes it clear
> >   that it has the same purpose and effect as fwft in the nmigen FIFO API.
> allow_comb doesn't have the same purpose or effect as fwft. It specifically
> allows combinational forwarding, but also enables more than that.
> allow_comb=False specifically requires Queue to be a Moore FSM (outputs are
> produced only from flip-flop outputs) instead of a Mealy FSM (outputs are
> produced from a combinatorial combination of flip-flop outputs and input
> signals).

okay! this is very interesting.  i've been looking for these definitions
for some time.

can you please check that i have these comments correct:

https://git.libre-riscv.org/?p=ieee754fpu.git;a=commitdiff;h=ab40d12129532bb5da02f2e8b9d10107069f00a0

if not, can you please make the necessary modifications, or explain it
clearly enough, or provide a very simple wrapper class around FIFOControl
that performs the conversion of fwft/pipe semantics into what you mean?
Comment 36 Luke Kenneth Casson Leighton 2019-04-30 01:05:42 BST
(In reply to Jacob Lifshay from comment #9)

> > this down to the fact that the FSM must *only* permit a single clock
> > of "ready/valid" on its input, and must *only* permit a single clock
> > of "ready/valid" on its output.
> Actually, a FSM isn't limited to a data-transfer every other clock, as is
> evidenced by the fact that every digital logic circuit with only one clock
> and no combinatorial loops is a FSM, either a Mealy FSM or a Moore FSM.

 ... so... these are useful definitions... the FPDIV code is a...
 hang on let me check the wikipedia page, damn their names both begin with M.
 Mealy.  the FPDIV code is a Mealy FSM.  not pure combinatorial.
 
> The part that is limiting stb/ack (from what you told me) is that ack has to
> be de-asserted after every transfer.

 ah this may be a.. ok, so read again here:
 https://zipcpu.com/blog/2017/08/14/strategies-for-pipelining.html

 the data is transferred on every clock where READY=TRUE and VALID=TRUE.
 or, more to the point, it is *required* that data be accepted when those
 conditions are true.

 so if there are two clocks in which those conditions are true, we can
 logically deduce that *TWO* pieces of data have been transferred.

 thus, if ACK is *NOT* de-asserted, and the user happens accidentally to
 leave STB asserted, *TWO* pieces of data will be transferred into the FSM...

 ...one of which will be DESTROYED.

 if the FSM was not an FSM, it would be capable of receiving data on
 each and every clock, letting data in by leaving ACK high.

 now, we could indeed have two separate FSMs, one for input and one for
 output, however that's just not how this *particular* example has been
 designed.

 that does not mean that a twin-FSM could not be designed: it's just that
 this particular one has not.

 i am going to see if further topological morphing can make the FPDIV
 example conform to the Stage API.  i believe it can.
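a tiny model of the ready/valid rule discussed above: data transfers on *every* clock where READY and VALID are both true, so leaving both asserted for two clocks moves TWO pieces of data, which is exactly the hazard. (count_transfers is an illustrative helper, not project code.)

```python
# Hedged sketch: count transfers across a trace of per-clock
# ready/valid values.

def count_transfers(ready_trace, valid_trace):
    return sum(1 for r, v in zip(ready_trace, valid_trace) if r and v)

# ACK (ready) correctly de-asserted after the transfer: one item moves.
safe = count_transfers([1, 0, 0], [1, 1, 0])

# ACK accidentally left high while STB stays asserted: two items move,
# and an FSM that can only hold one of them destroys the other.
hazard = count_transfers([1, 1, 0], [1, 1, 0])
```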
Comment 37 Jacob Lifshay 2019-04-30 02:58:20 BST
(In reply to Luke Kenneth Casson Leighton from comment #20)
> (In reply to Jacob Lifshay from comment #8)
> 
> > >   the limitation of *all* data using *forcibly* being required
> > >   to go through the "Data" class instance is also a burden that
> > >   will result in significant user demands for "additional features"
> > >   on the Data paradigm, to add support for unanticipated
> > >   requirements.
> > We can override __subclasshook__ like in collections.abc.Hashable:
> > https://github.com/python/cpython/blob/
> > 1b5f9c9653f348b0aa8b7ca39f8a9361150f7dfc/Lib/_collections_abc.py#L84
> > 
> > This makes any class that implements the 2 abstract functions to be a
> > subclass of Data automatically (except that Data's methods won't be visible
> > unless actually deriving from Data).
> 
>  https://stackoverflow.com/questions/40764347/python-subclasscheck-
> subclasshook
> 
>  this looks alarmingly complex.  i have learned the hard way that complex
>  code that solves particularly difficult python problems using advanced
>  features of python is REJECTED BY THE USERS.
> 
>  technically it is correct... yet because the features deployed are so
>  hard to understand they simply do not trust it, as it requires extraordinary
>  skill and intelligence to debug if it goes wrong.
> 
>  i can honestly say that i detested it when this was quoted to me.
> 
>  crucially: if the data API is "an arbitrary nested generator that
>  eventually yields derivatives of Value", i believe it would not
>  be necessary to even use subclasscheck/hook.
> 
>  does the generator approach make sense?
I had been planning on the Data API being a tree where the links were named, similar to directories and files in a filesystem. This is similar to a subset of C that has only structs and integers.

I think the nested tree of iterables makes more sense, though we will have to be careful about member variable ordering and that dict-like objects return the list of keys rather than values.
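a sketch of that "nested tree of iterables" idea, assuming a hypothetical `leaves` walker: note the care needed with dict-like objects, where iteration must be over a deterministic key ordering and must descend into the *values*, not yield the keys.

```python
# Hedged sketch: recursive walk yielding leaf values.  Plain ints
# stand in for Value/Signal derivatives.

def leaves(data):
    if isinstance(data, dict):
        for k in sorted(data):       # keys give ordering...
            yield from leaves(data[k])  # ...values give the data
    elif isinstance(data, (list, tuple)):
        for item in data:
            yield from leaves(item)
    else:
        yield data                   # leaf node

tree = {"op": [1, 2], "ctl": {"b": 4, "a": 3}}
flat = list(leaves(tree))            # "ctl" sorts before "op"
```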
Comment 38 Jacob Lifshay 2019-04-30 03:27:08 BST
(In reply to Luke Kenneth Casson Leighton from comment #21)
> (In reply to Jacob Lifshay from comment #8)
> 
> > > * the name "EntryPort" begins with the letter E, and so does "ExitPort".
> > >   this is problematic from a clarity perspective in later usage
> > >   (see below)
> > I wanted to pick a name pair other than input/output to avoid confusion with
> > the signal directions.
> > What about ingress/egress, from/to, upstream/downstream, or next/previous?
> 
>  from/to - from is a python keyword. ingress/egress too pretentious .. :)
>  upstream/downstream ok, next/previous is ok.
> 
> > Or do you think input/output will be fine?
>  ...
>  total utter confusion.
I actually think we should use something more like input/output, but not literally input/output: I wanted to emphasize that the pipeline building blocks API should support arbitrary directed graphs (multiple entries/exits per block), not just linear graphs, hence why I picked entry/exit.
If I have to choose from upstream/downstream and next/previous, I have a mild preference for upstream/downstream, though I think entry/exit could also work if abbreviated to e/x.

>  similarly with connect_to_entry and connect_to_exit.  both beginning with
> "e"
>  and both having "to" and "from", i can stare at the piece of code for ten
>  to twenty minutes and not understand it.
> 
>  the use of a totally different word ("connect_in") and the inclusion of
>  an underscore (on the one that's *not* supposed to be used to connect
>  stage-to-stage) is sufficient to indicate "this is a private function"
>  and the logic/letter-based dyslexia is broken.
The connect_from_* functions are not supposed to be private functions. I don't currently have any better suggestions, except for .eq (where the direction is how data is assigned) or something like that.

> > > * use of Data.assign for ready_in/out and valid_in/out, they should just
> > >   be an eq.
> > They can't be a member function because Record.eq doesn't work (the
> > simulator doesn't handle it),
> 
>  i was referring only to ready_in/out and valid_in/out, not Data in/out.
>  ready_in/out are pure signals (not Data), and therefore eq works fine.
>  (i.e. it's just Signal.eq, not Data.eq and not Record.eq)
I was using Data.assign for consistency's sake, am perfectly fine switching to Signal.eq.

> > though I guess we could reassign Record.eq to
> > a custom function that does work. 
> 
>  whilst Record has been "fixed", i will be absolutely bluntly honest: i am
>  not happy to be beholden to the way that nmigen's development is being
>  handled.
> 
>  ...
> 
>  the global eq(), shape() and cat() functions (now moved to nmobject.py
>  pending a decision) are under *our* control.
> 
>  Record.eq is not.
> 
>  i have already had to add workarounds into both eq() and cat() that
>  take into consideration both bugs in nmigen and intransigence in its
>  key developer.
> 
>  we cannot have our entire project at the mercy of a team that is overloaded
>  and does not understand or respect python conventions, and who have
>  demonstrated an irrational and pathological unwillingness to listen.
> 
>  this said despite the fact that nmigen is otherwise extremely good,
>  and, i believe, worth sticking with.

In my opinion, Record should not be a Value, since a Value behaves like a Signal, and the only shared operations are assignment (though that technically is different because Record members have a direction and .connect is named differently). This is another instance of abusing inheritance to share implementations, rather than because Record is a Value.

I guess this is (since there is no documentation to state otherwise) because I see a Record as a tree of named Signals rather than a Signal that has parts that happen to be named.

I guess this is more of Rust rubbing off on me, since in Rust most things are done through traits (interfaces) and not through specific types. This is less restrictive than class inheritance in that a type implementing some trait is independent of implementing some other trait unless the traits are related.

This is similar to how types work in python except that, in python, the traits required are stated in the documentation (we hope) rather than in the actual code.
Comment 39 Jacob Lifshay 2019-04-30 03:42:09 BST
(In reply to Luke Kenneth Casson Leighton from comment #22)
> (In reply to Jacob Lifshay from comment #8)
> 
> > > * Data.eq not Data.assign is better, although just "eq" is preferred
> > >   (which can be used as "from import library import data" then
> > > "data.eq(...)")
> > I'm using Data as a place to stash lots of Data-related static functions. We
> > can switch to a module if you like.
> 
>  temporarily nmoperator.py (would like it to be e.g. nmutils.operator)
> 
>  http://bugs.libre-riscv.org/show_bug.cgi?id=68
> 
>  amazingly, only 3 or 4 actual functions are needed.  eq, shape and cat
>  are three key functions:
> 
>  * eq is obvious [at least, if nmigen's eq is understood, it's obvious]
> 
>  * shape and cat are needed in order to flatten data down and back out
>    to and from a straight signal.
> 
>  shape and cat get used inside the FIFOControl object (which in turn
>  uses that excellent Queue module that you wrote).
I don't think we should use nmigen's FIFO interface. Having an interface like the one in the original Chisel code, with separate enq (entry) and deq (exit) ports, means that Queue is a Block, rather than changing Queue to fit nmigen's FIFO interface only to then need to adapt back to the 2-port interface everywhere.

> 
>  shape() is used to find out, from the *recursively-iterable*
>  thing-of-Value-derivatives, what size of Signal() would be needed to store
> it
>  (as a sequence of bits)
I think we should call it bitlen or something like that, to me shape implies returning the type of the Data rather than the number of bits inside it. In C++ it would be sizeof vs. decltype.
> 
>  cat() is used to flatten() and *de*-flatten() the incoming and outgoing
>  hierarchy-of-Value-derivatives so that ONLY ONE Queue object is needed.
I think we should do that inside the Queue implementation (or better yet, inside Memory, which is what the original Chisel code does).
> 
>  without cat() and shape() it would be an awful mess-of-code, analysing
>  the data and recursively allocating (*and connecting up*) a massive suite
>  of Queue objects.
> 
>  this prospect was so dreadful that i added the two extra functions to the
>  data-handling API.
> 
>  if it wasn't for that, the data-handling API would *literally* be one
>  function:
> 
>  eq()!

We need at least one more function: one that creates a new Data sharing the type of an existing one. I did that using Shape.get(data).create_data(), similar to python type(value)(). This is used to create Data instances for internal registers and what-not.
Comment 40 Jacob Lifshay 2019-04-30 03:50:07 BST
(In reply to Luke Kenneth Casson Leighton from comment #23)
> (In reply to Jacob Lifshay from comment #8)
> > > * Signal(1, reset=0), both width 1 and reset=0 are the defaults and may
> > >   be removed
> > I'd like reset=0 to be left as it makes it explicit that the reset value
> > actually matters whereas most other signals the reset value doesn't really
> > matter.
> > so Signal(reset=0)
> 
>  ok - yes, the usual convention there is to add a comment, so that people
>  are not caught out by surprise at the explicit mention of something that's
>  known to be a default value.
> 
>  which is ok with me, if you feel it helps readability, to do that.
> 
>  however... this masks something that's really important: if the reset value
>  doesn't matter, it's *really important* to put "reset_less=True" to not only
>  make that absolutely clear, but to *STOP* nmigen from generating unnecessary
>  gates that the optimiser will in *no way* be able to spot.
> 
>  which is why i mentioned in an earlier comment that reset_less=True should,
>  really, be the preferred... what's the word... "pattern" / "distinguisher".
I didn't use reset_less=True because that way we can prevent "optimizations" that depend on the values being X and going to town removing critical logic. If nmigen can somehow guarantee that those undesired optimizations aren't occurring, then I agree that reset_less=True should be the default.
> 
> 
>  in addition, one thing that i've found you really *really* have to watch
>  out for: if the width is not equal to 1, sometimes setting an integer
>  constant is NOT ACCURATE.
that's an nmigen bug then. If you find any particular cases, report them to the nmigen project.
Comment 41 Luke Kenneth Casson Leighton 2019-04-30 10:35:34 BST
(In reply to Jacob Lifshay from comment #37)

> >  does the generator approach make sense?
> I had been planning on the Data API being a tree where the links were named,
> similar to directories and files in a filesystem. This is similar to a
> subset of C that has only structs and integers.

 yyehh, the names can be flattened at the IR (verilog) level because the
 IR (verilog) does not support objects.  Record uses underscores to
 concatenate the name of the parent object (if given) to the child
 object (if given, one being auto-created if it isn't)

 _spec now exists and it actually dynamically analyses the function it's
 calling to find out if it has a name or not:

https://git.libre-riscv.org/?p=ieee754fpu.git;a=commitdiff;h=bc7bbd60c80c44fc6d675c727938c289e2b9a631

 so this gives developers the *option* to explicitly give more complex
 structures a bit more information.

 the reason for doing so is that i looked at the code that *requires*
 name, and went, "ugh, ugly".

    def ispec(self, name):
        return (Signal(16, name="%s_sig1" % name),
                Signal(16, name="%s_sig2" % name))

 that's ugly....

 class Example2OpClass:
     def __init__(self):
         self.op1 = Signal(16)
         self.op2 = Signal(16)

     def ispec(self):
         return Example2OpClass()

 ... that's not.


> I think the nested tree of iterables makes more sense, though we will have
> to be careful about member variable ordering and that dict-like objects
> return the list of keys rather than values.

yes, absolutely: the only reason that Visitor2 accepts a dict as an option
is because when the input is viewed as free-form "things that happen to have a
__getitem__ function", the actual ordering of the application of the
function on the matched pair doesn't actually matter.

reason: the application of the function to a matched-pair (like zip)
is *independent* of the application of the function to any *other*
matched-pair.

where it really really would start to matter is in the single-visitor
case (where i've not added support for dicts for precisely this reason)
where typically that's used to concatenate the leaf-nodes together, i.e.
the ordering is critically important.

mandating use of OrderedDict makes that problem go away, with the penalty
that the freedom within Visitor2 (where it really doesn't matter) is
(unnecessarily) curtailed.

needs thought.  and/or properly documenting.
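to sketch the distinction (visit2 and visit1_concat are illustrative, not the real Visitor code): applying a function to matched *pairs* is order-independent, whereas concatenating leaves in the single-visitor case is order-critical.

```python
# Hedged sketch of the two visitor styles described above.

def visit2(fn, a, b):
    """Apply fn to matched pairs; each pair is independent of the
    others, so dict iteration order does not affect the result."""
    return {k: fn(a[k], b[k]) for k in a}

def visit1_concat(seq):
    """Concatenate leaves; here the ordering is critically important,
    which is why free-form dicts are not accepted in this case."""
    out = []
    for item in seq:
        out.extend(item)
    return out

left, right = {"x": 1, "y": 2}, {"x": 10, "y": 20}
paired = visit2(lambda p, q: p + q, left, right)
joined = visit1_concat([[1, 2], [3]])
```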
Comment 42 Luke Kenneth Casson Leighton 2019-04-30 13:12:11 BST
(In reply to Jacob Lifshay from comment #38)
> (In reply to Luke Kenneth Casson Leighton from comment #21)
> > (In reply to Jacob Lifshay from comment #8)
> > 
> > > > * the name "EntryPort" begins with the letter E, and so does "ExitPort".
> > > >   this is problematic from a clarity perspective in later usage
> > > >   (see below)
> > > I wanted to pick a name pair other than input/output to avoid confusion with
> > > the signal directions.
> > > What about ingress/egress, from/to, upstream/downstream, or next/previous?
> > 
> >  from/to - from is a python keyword. ingress/egress too pretentious .. :)
> >  upstream/downstream ok, next/previous is ok.
> > 
> > > Or do you think input/output will be fine?
> >  ...
> >  total utter confusion.
> I actually think we should use something more like input/output, but not
> literally input/output, because I wanted to emphasize that the pipeline
> building blocks API should support arbitrary directed graphs (multiple
> entries/exits per block), 

 see multipipe.py.  the choice of names works without confusion for me.
 p becomes an array of PrevControl instances in the fan-in case.
 n becomes an array of NextControl instances in the fan-out case.

 no confusion in my mind arises over naming.

 p and n become plural, yet the PrevControl and NextControl objects
 within their (respective) arrays remain singular.


> not just linear graphs, hence why I picked
> entry/exit.
> If I have to choose from upstream/downstream and next/previous, I have a
> mild preference for upstream/downstream, though I think entry/exit could
> also work if abbreviated to e/x.

 see below.  it's not just about the objects, it's about the function
 names that result during their use.


> >  similarly with connect_to_entry and connect_to_exit.  both beginning with
> > "e"
> >  and both having "to" and "from", i can stare at the piece of code for ten
> >  to twenty minutes and not understand it.
> > 
> >  the use of a totally different word ("connect_in") and the inclusion of
> >  an underscore (on the one that's *not* supposed to be used to connect
> >  stage-to-stage) is sufficient to indicate "this is a private function"
> >  and the logic/letter-based dyslexia is broken.
> The connect_from_* functions are not supposed to be private functions. 

 
 which one gets connected to which, again?

 what does connect_from_entry connect to again?

 and which way round is connect_from_exit?

 which entry is connected to what?

 what's the difference between connecting to an entry and connecting to
 an exit?

 which one does what again?

 ... no need to answer those questions: i'm illustrating that from the
 name *i can't tell*.

 that's enough to highlight that the choices - both entry/exit *and*
 connect_from_e/e - are not good choices.


> I
> don't currently have any better suggestions, except for .eq (where the
> direction is how data is assigned) or something like that.

 eq is actually different _again_... because those are *definitely*
 straight (non-direction-respecting) assignments.  this is PrevControl's
 eq function (which, actually, can now be removed, given that it
 has an __iter__)

    def eq(self, i):
        return [nmoperator.eq(self.data_i, i.data_i),
                self.ready_o.eq(i.ready_o),
                self.valid_i.eq(i.valid_i)]

 that is *NOT*, repeat *NOT* the same thing as _connect_in, which does this:

    def _connect_in(self, prev):
        return [self.valid_i.eq(prev.valid_i),
                prev.ready_o.eq(self.ready_o),
                nmoperator.eq(self.data_i, prev.data_i)]

 you see how, in _connect_in, the ready_o assignment is inverted in its
 direction?  this distinction is critically important enough to
 warrant a different function.

 it's basically the distinction between Record.eq and Record.connect.
 *both are needed*, and (see below) it's something that really, really
 needs to be extended down to the whole of the nmigen API.

 if the nmigen API (all of it) supported Direction, eq would *be* connect,
 and we would not be having this discussion.

 *sigh* :)
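a plain-python model of the eq-vs-_connect_in distinction above (the dicts stand in for PrevControl-like signal bundles; this is a sketch, not the real code): eq copies every field in the same direction, while _connect_in flips the ready line, because ready flows *backwards* along the pipeline.

```python
# Hedged sketch of the two connection styles.

def straight_eq(dst, src):
    """Straight, non-direction-respecting copy of all three fields."""
    dst["data_i"] = src["data_i"]
    dst["ready_o"] = src["ready_o"]
    dst["valid_i"] = src["valid_i"]

def connect_in(self_, prev):
    """Direction-respecting connect: ready goes the *other* way."""
    self_["valid_i"] = prev["valid_i"]
    prev["ready_o"] = self_["ready_o"]   # note: inverted direction
    self_["data_i"] = prev["data_i"]

a = {"data_i": 7, "ready_o": 1, "valid_i": 1}
b = {"data_i": 0, "ready_o": 0, "valid_i": 0}
connect_in(b, a)   # b's ready drives a's ready, not the reverse
```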


> In my opinion, Record should not be a Value, since a Value behaves like a
> Signal, and the only shared operations are assignment (though that
> technically is different because Record members have a direction and
> .connect is named differently). This is another instance of abusing
> inheritance to share implementations, rather than because Record is a Value.

 Signal also derives from Value, so it's not quite true, however i do
 know what you mean.

 take a deep breath... once things have stabilised and become a bit
 more established, i'm going to recommend a *massive* fundamental
 design paradigm shift to the nmigen developers: that the *entire*
 API consider "Direction" to be associated with *EVERY* bit on EVERY
 derivative from Value.

 and that connect be propagated down to Value as part of the API,
 and that a similar "direction-flipping" function (like in Chisel3)
 be added.

 this is the only way that the concept introduced in Record() can
 be properly supported.

 Const obviously would always be an OUT.

 the default would obviously always be INOUT (for backwards compatibility)

 extreme care will have to be taken in recommending it, as whitequark
 is not coping well with python, or the responsibility of being the
 maintainer of nmigen.

 anyway i digress.


> I guess this is (since there is no documentation to state otherwise) because
> I see a Record as a tree of named Signals rather than a Signal that has
> parts that happen to be named.

 [*clenched teeth and trying to smile at the same time*] iii kknoooow.

 it would help enormously if you could be a second person to have that
 conversation with whitequark, independently of me.


> I guess this is more of Rust rubbing off on me, 

 nooo, it's not just you, it's how whitequark is not accepting the
 reality of how people view Record, and wants to forcibly impose
 a particular mental perspective on how it must be used.

 the more people that can independently provide an alternative perspective,
 the easier things will become.

 or, we just fork nmigen.


> since in Rust most things
> are done through traits (interfaces) and not through specific types.

 yes - you'll need to be careful about that, as the technique results
 in an above-average increase in code size.


> This is similar to how types work in python except that, in python, the
> traits required are stated in the documentation (we hope) rather than in the
> actual code.

 now you know why i insisted on the software engineering practice of
 well-documented code [and why writing unit tests is so utterly,
 utterly critical].

 you _can_ write code that doesn't have docstrings or unit tests... it'll
 just be a hopeless waste of time, as not only will other users not understand
 it, neither will you in about... mmm.... 4-5 months' time.

 i'm serious! :)
Comment 43 Luke Kenneth Casson Leighton 2019-04-30 13:23:01 BST
(In reply to Jacob Lifshay from comment #39)
> (In reply to Luke Kenneth Casson Leighton from comment #22)
> > (In reply to Jacob Lifshay from comment #8)
> > 
> > > > * Data.eq not Data.assign is better, although just "eq" is preferred
> > > >   (which can be used as "from import library import data" then
> > > > "data.eq(...)")
> > > I'm using Data as a place to stash lots of Data-related static functions. We
> > > can switch to a module if you like.
> > 
> >  temporarily nmoperator.py (would like it to be e.g. nmutils.operator)
> > 
> >  http://bugs.libre-riscv.org/show_bug.cgi?id=68
> > 
> >  amazingly, only 3 or 4 actual functions are needed.  eq, shape and cat
> >  are three key functions:
> > 
> >  * eq is obvious [at least, if nmigen's eq is understood, it's obvious]
> > 
> >  * shape and cat are needed in order to flatten data down and back out
> >    to and from a straight signal.
> > 
> >  shape and cat get used inside the FIFOControl object (which in turn
> >  uses that excellent Queue module that you wrote).
> I don't think we should use nmigen's FIFO interface, having an interface
> like was in the original Chisel code with a separate enq (entry) and deq
> (exit) port means that Queue is a Block rather than changing Queue to fit
> nmigen's FIFO interface only to then need to adapt back to the 2-port
> interface everywhere.

 they're all 2-port interfaces, all of them.  Queue, Block, ControlBase,
 Chisel's code, FIFOInterface, they all have exactly the same incoming
 ready/valid/data and outgoing ready/valid/data ports.

 where did you get the impression from that they don't have the exact
 same port-semantics?

 Queue-to-FIFOControl (a derivative of ControlBase aka Block-esque):

         connections = [self.n.valid_o.eq(fifo.readable),
                       fifo.re.eq(self.n.ready_i_test),

 valid_o EQUALs readable.  fifo.re EQUALs ready_i.

 there's literally no change in the logic.  there's no change made.  there's
 no "if ready == True and valid != something set readable True".

 it's a DDDDIIIIRRREEEECCCCCT eq assignment, on *ALL SIX* signal-sets
 (ok except the data goes through process() and happens to get flattened
  and de-flattened).

 can you see that FIFOControl is *LITERALLY* just a wrapper around Queue?

 can you see also that Queue has, from the original Chisel3 code, ONLY
 had the names of the variables changed, *NOT THEIR FUNCTION IN ANY WAY
 SHAPE OR FORM*?

 i *literally* transcribed the code from the original conversion that you
 did, and *ONLY* did a global/search/replace of the names, to match
 FIFOInterface.

 FIFOInterface ***IS*** Queue.  Queue ***IS*** FIFOInterface.

 i can't think of a way to emphasise clearly enough how these are literally
 identical except for the change in names.
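a sketch of the point being made: the wrapper is pure renaming, with valid_o *being* readable and ready *being* writable, no extra logic between them. MiniQueue/MiniFIFOControl are hypothetical stand-ins for Queue and FIFOControl.

```python
from collections import deque

# Hedged sketch: a 2-port queue and a thin rename-only wrapper.

class MiniQueue:
    def __init__(self, depth):
        self.buf = deque(maxlen=depth)
    @property
    def readable(self):
        return len(self.buf) > 0
    @property
    def writable(self):
        return len(self.buf) < self.buf.maxlen

class MiniFIFOControl:
    """ControlBase-style names wired straight onto Queue names."""
    def __init__(self, depth):
        self.fifo = MiniQueue(depth)
    @property
    def n_valid_o(self):          # n.valid_o.eq(fifo.readable)
        return self.fifo.readable
    @property
    def p_ready_o(self):          # p.ready_o.eq(fifo.writable)
        return self.fifo.writable

fc = MiniFIFOControl(2)           # empty: nothing to read, room to write
```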
Comment 44 Luke Kenneth Casson Leighton 2019-04-30 13:34:53 BST
(In reply to Jacob Lifshay from comment #40)
> (In reply to Luke Kenneth Casson Leighton from comment #23)
> > (In reply to Jacob Lifshay from comment #8)
> > > > * Signal(1, reset=0), both width 1 and reset=0 are the defaults and may
> > > >   be removed
> > > I'd like reset=0 to be left as it makes it explicit that the reset value
> > > actually matters whereas most other signals the reset value doesn't really
> > > matter.
> > > so Signal(reset=0)
> > 
> >  ok - yes, the usual convention there is to add a comment, so that people
> >  are not caught out by surprise at the explicit mention of something that's
> >  known to be a default value.
> > 
> >  which is ok with me, if you feel it helps readability, to do that.
> > 
> >  however... this masks something that's really important: if the reset value
> >  doesn't matter, it's *really important* to put "reset_less=True" to not only
> >  make that absolutely clear, but to *STOP* nmigen from generating unnecessary
> >  gates that the optimiser will in *no way* be able to spot.
> > 
> >  which is why i mentioned in an earlier comment that reset_less=True should,
> >  really, be the preferred... what's the word... "pattern" / "distinguisher".
> I didn't use reset_less=True because that way we can prevent "optimizations"
> that depend on the values being X and going to town removing critical logic.

 it doesn't work that way.  all that happens is, at the top of the
 IR output, an assignment to the reset value is left out.

 if reset_less=False, that assignment is near-guaranteed (there are
 apparently some guesses made at situations where a reset is *not* needed)

 whereas if reset_less=True, those guesses are DISABLED, and the reset is
 GUARANTEED not to be output into the IR/verilog.

 so it's the other way round: setting reset_less=True GUARANTEES that the
 clock-based clk/rst code will NOT be generated.

 which is exactly what is needed when you KNOW that you are using a Signal
 that you do NOT want clk/rst code to be generated.



> If nmigen can somehow guarantee that those undesired optimizations aren't
> occurring, then I agree that reset_less=True should be the default.

 that's not nmigen's responsibility, it's *our* responsibility.
 it can't be.  *we* need to be the ones that make the declaration,
 "this variable is an intermediary as part of a combinatorial
  block, no state or reset is required".

 that's precisely what reset_less=True is for, and there is absolutely
 no way for nmigen to correctly - and safely - determine that.

 (note: the OTHER WAY ROUND however - when reset_less=False - *IS*
  sometimes possible).


 it requires careful case-by-case analysis, and is why i am doing
 the graphviz analysis on every single module.  really, though, we
 do need formal proofs.

 nmigen takes the "safe" route by making reset_less=False the default
 because most novice hardware engineers get this hopelessly wrong,
 forgetting to put in a reset and costing hundreds to millions of
 dollars in NREs, through one simple mistake.

 basically, reset_less=True needs to be given the same fundamental
 level of priority as something as simple as "getting the bitwidth
 right" or "getting the algorithm right"
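the difference being described can be sketched with a toy model (this is NOT nmigen's actual code-generator; the function name and output format are invented purely for illustration): with reset_less=False a sync assignment to the reset value appears in the emitted IR, with reset_less=True it is guaranteed absent.

```python
# toy model of IR emission for a single Signal -- invented names,
# NOT nmigen's real back-end, just illustrating the claim above.
def emit_signal_ir(name, width=1, reset=0, reset_less=False):
    lines = ["wire width %d \\%s" % (width, name)]
    if not reset_less:
        # reset_less=False: clk/rst logic assigning the reset value
        # gets emitted (minus some guessed-at exceptions)
        lines.append("sync: \\%s <= %d (on reset)" % (name, reset))
    return lines
```

calling emit_signal_ir("tmp", reset_less=True) in this toy model yields only the wire declaration, with no reset assignment: that is the gate-count saving being argued for.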


> >  in addition, one thing that i've found you really *really* have to watch
> >  out for: if the width is not equal to 1, sometimes setting an integer
> >  constant is NOT ACCURATE.
> that's an nmigen bug then. If you find any particular cases, report them to
> the nmigen project.

 i can't: they require anyone wishing to interact with them
 to have a github account.
Comment 45 Luke Kenneth Casson Leighton 2019-04-30 14:43:47 BST
(In reply to Jacob Lifshay from comment #39)

> >  shape() is used to find out, from the *recursively-iterable*
> >  thing-of-Value-derivatives, what size of Signal() would be needed to store
> > it
> >  (as a sequence of bits)
> I think we should call it bitlen or something like that, to me shape implies
> returning the type of the Data rather than the number of bits inside it.

 yes.  agreed.  however...

> In C++ it would be sizeof vs. decltype.

 ...that needs to be a recommendation made to the nmigen developers, to change
 the nmigen API from "shape" to "bitlen".

 the nmoperator.shape function simply follows the nmigen convention: not
 doing so is... well... inadvisable: it will result in "surprises":
 https://en.wikipedia.org/wiki/Least_surprise

 so... i agree that yes, shape is a bad name... unfortunately, the name's
 already been chosen, and it would be required to be documented that the
 functionality is identical despite the name being different.

 the better choice, i feel, is to stick with the [poor] name-choice, "shape()"
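as a sketch of the job shape() does (stand-in Sig class here, not real nmigen Signals or the real nmoperator code): recursively walk the thing-of-Value-derivatives and total up the bits needed to store it.

```python
# hypothetical sketch of nmoperator.shape's job: total bit-length
# of a recursively-iterable hierarchy of Signal-like leaves.
class Sig:                             # minimal Signal stand-in
    def __init__(self, width):
        self.width = width

def shape(x):
    if hasattr(x, "width"):            # leaf: Signal-like, has a width
        return x.width
    return sum(shape(f) for f in x)    # branch: recurse over the iterable
```

shape([Sig(16), [Sig(8), Sig(1)]]) totals 25 bits, which is exactly why "bitlen" describes the result better than "shape" does.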


> >  cat() is used to flatten() and *de*-flatten() the incoming and outgoing
> >  hierarchy-of-Value-derivatives so that ONLY ONE Queue object is needed.
> I think we should do that inside the Queue implementation (or better yet,
> inside Memory, which is what the original Chisel code does).

 mmmm this is an idea which, initially, i like, however it has knock-on
 consequences.

 the first is: Memory2 (or whatever it is called) now has a critical
 dependence on the nmoperator API, which would be the first "base"
 utility class outside of the Stage/Pipe API to do so.

 i'm... reluctant to set that precedent, particularly given that the API
 would change to no longer specify a width, it would need to specify
 the "spec".  i.e. it would need to be passed an "iospecfn".

 that's a really drastic change in the API, going beyond my comfort level
 for a long-term API.


 second: the use of the flattening actually results in some truly
 dreadful graphviz representations, as the data gets jammed into a
 box of bits, then extracted on the other side in a truly awful messy
 way.
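the flatten/de-flatten round-trip being described (data jammed into a box of bits, then extracted on the other side) looks roughly like this, as a pure-python sketch with made-up function names:

```python
# pack a list of values (with known bit-widths) into one integer,
# low bits first, then extract them again on the far side.
def flatten(vals, widths):
    acc, shift = 0, 0
    for v, w in zip(vals, widths):
        acc |= (v & ((1 << w) - 1)) << shift
        shift += w
    return acc

def deflatten(packed, widths):
    out = []
    for w in widths:
        out.append(packed & ((1 << w) - 1))
        packed >>= w
    return out
```

the round-trip is lossless, but in graphviz every field becomes an anonymous bit-slice of one big wire, which is where the "truly awful messy" representation comes from.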

 
 third: the same logic that applies to Memory also applies to Queue.
 Queue's API would need to be changed, and i'm really not comfortable
 with that.

 right now, it's pretty damn obvious.  read the chisel3 code, read
 the comment about the name-changes, set the two pieces of code
 side-by-side and it's damn obvious (if you can read chisel3 that is)
 that the two are directly equivalent.

 that equivalence is DESTROYED by embedding a hard critical dependence
 on the iospecfn (data format) API.


 ... or...

 very simple...

 use what i called FIFOControl... because FIFOControl *is* Queue [with
 the flattening and processing added].

 in an earlier iteration of FIFOControl, this direct one-for-one equivalence
 was much more obvious, because i was able to make the input port DIRECTLY
 equal to a PrevControl, and the output port DIRECTLY equal to a NextControl.

 unfortunately due to the need to process the in/out data, that equivalence
 was broken, and i had to extract the contents of NextControl._connect_in
 and use those contents *in* FIFOControl.

 *grumble*.

 so basically:

 * if you don't want the nmoperator API to be involved, don't want
   the Shape API involved, don't want the pipeline API involved...

   ... use Queue

 * if you *do* want the data to be flattened...

   ... use FIFOControl.  they are EXACTLY and PRECISELY equivalent.
   ABSOLUTELY NO CHANGE IN THE ARGUMENTS OR THE LOGIC HAS BEEN MADE.  

 if this is not understood, see the comments in queue.py:

            din  = enq_data, writable  = enq_ready, we = enq_valid
            dout = deq_data, re = deq_ready, readable = deq_valid


 Memory has problems of its own (can't do length-1 arrays), which
 makes a Memory2 worth considering (for different reasons)

 SyncFIFOBuffered should not exist, basically.


> We need at least one more function, create a new Data that shares the type
> of an existing one. I did that using Shape.get(data).create_data(), similar
> to python type(value)(). This is used to create Data instances for internal
> registers and what-not.

 you're thinking of Value.like.  Signal.like.  etc.

 yes, and the nice thing is: whilst Record.like does not exist (and it should),
 nmoperator.like() can substitute for that lack by auto-detecting if it is
 ever passed a Record.

 the only thing is: we'll need to actually write a recursive-Record-walker
 "like" function (as a non-member function).

 basically Record.like is really really simple: grab the fields member,
 and create a new Record instance using that Record's fields.

 that's it.

 the only thing we have to watch out for is: if we want to support something
 that *derives* from Record, it should have its own "like" function so
 that we can test for it (if hasattr(x, "like")) and call it.
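a minimal sketch of that recursive-Record-walker "like" (stand-in Record/Signal classes here, not nmigen's): grab the fields member, build a new Record from them recursively, and defer to any object that supplies its own "like".

```python
class Signal:                  # minimal stand-ins, for illustration only
    def __init__(self, width=1):
        self.width = width

class Record:
    def __init__(self, fields):
        self.fields = dict(fields)

def like(x):
    if hasattr(x, "like"):              # derived classes get first call
        return x.like()
    if isinstance(x, Record):           # recurse over the fields member
        return Record({k: like(v) for k, v in x.fields.items()})
    return Signal(x.width)              # leaf: new Signal, same width
```

that really is it: a new, independent instance with the same shape, the python equivalent of Signal.like / Value.like extended to recursive Records.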
Comment 46 Luke Kenneth Casson Leighton 2019-04-30 14:52:37 BST
(In reply to Jacob Lifshay from comment #39)

> I don't think we should use nmigen's FIFO interface, having an interface
> like was in the original Chisel code with a separate enq (entry) and deq
> (exit) port means that Queue is a Block rather than changing Queue to fit
> nmigen's FIFO interface only to then need to adapt back to the 2-port
> interface everywhere.

 just to be absolutely clear: i changed ONLY the NAMES.  i did NOT alter
 the FUNCTIONALITY in ANY WAY, SHAPE OR FORM.

 the change in name is documented at the top of the file:

            din  = enq_data, writable  = enq_ready, we = enq_valid
            dout = deq_data, re = deq_ready, readable = deq_valid

 thus, if you are considering that Queue stopped being a Block simply
 because some characters of variable names were changed using
 global search/replace substitution, this is clearly false.

 from this clearly being false, it is just as equally clear that Queue
 *EQUALS* block *EQUALS* FIFOControl *EQUALS* ControlBase in terms of
 their API equivalence, as far as the ready/valid/data entry and exit
 is concerned (and that, pure and simple, only the names are changed).

 Block valid/ready EQUALS FIFOInterface writable/we - readable/re.

 fact.

 other parameters are added, yes - these have NOTHING TO DO WITH the plain
 and simple fact that the ready/valid/data IS EXACTLY THE SAME.

 does that make sense?

 all of these objects *really do have exactly the same fundamental API*
Comment 47 Luke Kenneth Casson Leighton 2019-04-30 14:53:06 BST
even the FSM example with the stb/ack has exactly the same fundamental API!
Comment 48 Luke Kenneth Casson Leighton 2019-04-30 20:51:51 BST
http://git.libre-riscv.org/?p=ieee754fpu.git;a=blob;f=src/add/singlepipe.py;h=8296bf2d4ecd9702cd2d32c20dc345734ed2a641;hb=55f59ae36e2e29b383a97074e59a0e0b4d1010f2#l1043

Found it in the commit history, the point where I made Queue look exactly like a ControlBase as far as the enq and deq ports are concerned and literally connected them back to back just like any other object conforming to the ControlBase API.

All it took: faking up PrevControl and NextControl with some name-changing assignments.

NO logic involved AT ALL.

And if you look at Queue you will see that there are literally just variable assignments, where not even the names from the original port that you did have been changed.

Again, zero change to the logic.

Does this help make it clear that these objects are all literally the same concept?

It really should not be a surprise because ready/valid is such a fundamental part of computer science / digital data handling. Want to write data, you need to say "i want to write data", and you need an acknowledgment back.
Comment 49 Jacob Lifshay 2019-04-30 20:58:14 BST
(In reply to Luke Kenneth Casson Leighton from comment #48)
> http://git.libre-riscv.org/?p=ieee754fpu.git;a=blob;f=src/add/singlepipe.py;
> h=8296bf2d4ecd9702cd2d32c20dc345734ed2a641;
> hb=55f59ae36e2e29b383a97074e59a0e0b4d1010f2#l1043
> 
> Found it in the commit history, the point where I made Queue look exactly
> like a ControlBase as far as the enq and deq ports are concerned and
> literally connected them back to back just like any other object conforming
> to the ControlBase API.
> 
> All it took: faking up PrevControl and NextControl with some name-changing
> assignments.
> 
> NO logic involved AT ALL.
> 
> And if you look at Queue you will see that there are literally just
> variable assignments, where not even the names from the original port
> that you did have been changed.
> 
> Again, zero change to the logic.
> 
> Does this help make it clear that these objects are all literally the
> same concept?
> 
> It really should not be a surprise because ready/valid is such a fundamental
> part of computer science / digital data handling. Want to write data, you
> need to say "i want to write data", and you need an acknowledgment back.

to clarify, I had meant having the enq and deq signals grouped into separate port objects, similar to how entry/exit signals are grouped in Block. I totally get that the signals are identical, I was just objecting to how they had been changed to be ungrouped.
Comment 50 Luke Kenneth Casson Leighton 2019-04-30 21:54:34 BST
(In reply to Jacob Lifshay from comment #49)

> to clarify, I had meant having the enq and deq signals grouped into separate
> port objects, similar to how entry/exit signals are grouped in Block. I
> totally get that the signals are identical, I was just objecting to how they
> had been changed to be ungrouped.

they effectively still are; it's just perhaps not clear/obvious.
i'm restoring the original code (managed to work out how to do it):

        # prev: make the FIFO (Queue object) "look" like a PrevControl...
        m.submodules.fp = fp = PrevControl()
        fp.valid_i, fp._ready_o, fp.data_i = fifo.we, fifo.writable, fifo.din
        m.d.comb += fp._connect_in(self.p, fn=processfn)

        # next: make the FIFO (Queue object) "look" like a NextControl...
        m.submodules.fn = fn = NextControl()
        fn.valid_o, fn.ready_i, fn.data_o  = fifo.readable, fifo.re, fifo.dout
        connections = fn._connect_out(self.n, fn=nmoperator.cat)

so, in Block:

* EntryPort is the "grouping" that is absolutely identical in purpose and
  function to PrevControl.

  EntryPort has grouping of ready/valid/data

  PrevControl has grouping of ready/valid/data

* ExitPort is the "grouping" that is absolutely identical to NextControl.

  ditto

so, i do not believe you could be referring to that, as the grouping is
absolutely identical.

(yes i admit that it is a bit weird that PrevControl and NextControl are
 *not allowed* to set data_i and data_o in their constructor, that job
  is the responsibility of *ControlBase* - through the function now known
  as ControlBase.set_data, however that's fully and carefully documented)


in Queue, it's not clear that grouping exists explicitly
by way of separate classes/objects.  however, by deriving from a known
interface (FIFOInterface), the grouping becomes clear from looking
at the code:

class FIFOInterface:

        self.din      = Signal(width, reset_less=True)
        self.writable = Signal() # not full
        self.we       = Signal()

        self.dout     = Signal(width, reset_less=True)
        self.readable = Signal() # not empty
        self.re       = Signal()

from there, in Queue, the "convenience names" preserve the groupings:

        # convenience names
        p_ready_o = self.writable
        p_valid_i = self.we
        enq_data = self.din

        n_valid_o = self.readable
        n_ready_i = self.re
        deq_data = self.dout

and from there, the above changes (which explicitly drop those named-vars
into PrevControl and NextControl objects) make it, i believe, clear.

that in turn has the side-effect that:

* Queue is made to *look* like a PrevControl / NextControl pair

* Queue is used *inside* FIFOControl

* FIFOControl derives from ControlBase

* ControlBase is conceptually equivalent to Block

* Therefore, logically, Queue *is* directly equivalent to Block,
  only with the slight awkwardness that its members are separated.

now, we _could_ go one stage further and actually make Queue actually
*have* PrevControl and NextControl instances... however, again, the
"dependence" of the Queue class on the iocontrol.py module is something
i'm not totally comfortable with, as explained above.

point being: we could conceivably morph Queue so that it *becomes*
Block-like... however that's precisely what FIFOControl is supposed
to do... and does do.

more to the point: by morphing Queue to look absolutely nothing like
the nmigen FIFOInterface, we lose several opportunities, such as
being able to talk to the nmigen developers and say "hey, you know
those bugs and limitations of SyncFIFO / SyncFIFOBuffered...."