Make a layout of ethmac as a standalone block to evaluate multiple clock-tree strategies. ethmac is taken from here: https://github.com/freecores/ethmac.git
The Verilog from Freecores/ethmac does not seem to be readable by Yosys. It hangs here:
    Yosys 0.12+23 (git sha1 UNKNOWN, gcc 11.2.1 -fPIC -Os)
    1. Executing Verilog-2005 frontend: ethmac.v
    Parsing Verilog input from `ethmac.v' to AST representation.
    make: *** [mk/synthesis-yosys.mk:53: ethmac.blif] Error 247
I'm setting it up as a standalone example in alliance-check-toolkit. Do you want me to commit it right now?
There will also likely be questions about the implementation of the FIFOs/SRAMs.
(In reply to Jean-Paul Chaput from comment #1)
With the following script, inspired by:
https://git.libre-soc.org/?p=ls2.git;a=blob;f=simsoc.ys;h=a4adcefdcd7103aa29cc6578bdaf470b89e0f845;hb=0ed190756075447abdf96cb7e508e7ed92118236#l33
That is:
    yosys read_verilog eth_clockgen.v
    yosys read_verilog eth_cop.v
    yosys read_verilog eth_crc.v
    yosys read_verilog eth_fifo.v
    yosys read_verilog eth_maccontrol.v
    yosys read_verilog ethmac_defines.v
    yosys read_verilog eth_macstatus.v
    yosys read_verilog ethmac.v
    yosys read_verilog eth_miim.v
    yosys read_verilog eth_outputcontrol.v
    yosys read_verilog eth_random.v
    yosys read_verilog eth_receivecontrol.v
    yosys read_verilog eth_registers.v
    yosys read_verilog eth_register.v
    yosys read_verilog eth_rxaddrcheck.v
    yosys read_verilog eth_rxcounters.v
    yosys read_verilog eth_rxethmac.v
    yosys read_verilog eth_rxstatem.v
    yosys read_verilog eth_shiftreg.v
    yosys read_verilog eth_spram_256x32.v
    yosys read_verilog eth_top.v
    yosys read_verilog eth_transmitcontrol.v
    yosys read_verilog eth_txcounters.v
    yosys read_verilog eth_txethmac.v
    yosys read_verilog eth_txstatem.v
    yosys read_verilog eth_wishbone.v
    yosys read_verilog timescale.v
    yosys hierarchy -check -top ethmac
    yosys synth -top ethmac
    yosys memory
    yosys dfflibmap -liberty FlexLib.lib
    yosys abc -liberty FlexLib.lib
    yosys clean
    yosys write_blif ethmac.blif
I can go a little further:
    Yosys 0.12+23 (git sha1 UNKNOWN, gcc 11.2.1 -fPIC -Os)
    1. Executing Verilog-2005 frontend: eth_clockgen.v
    Parsing Verilog input from `eth_clockgen.v' to AST representation.
    Generating RTLIL representation for module `\eth_clockgen'.
    Successfully finished Verilog frontend.
    2. Executing Verilog-2005 frontend: eth_cop.v
    Parsing Verilog input from `eth_cop.v' to AST representation.
    Generating RTLIL representation for module `\eth_cop'.
    eth_cop.v:0: Warning: System task `$display' outside initial block is unsupported.
    eth_cop.v:0: Warning: System task `$display' outside initial block is unsupported.
    eth_cop.v:0: Warning: System task `$display' outside initial block is unsupported.
    eth_cop.v:0: Warning: System task `$display' outside initial block is unsupported.
    eth_cop.v:0: Warning: System task `$display' outside initial block is unsupported.
    eth_cop.v:0: Warning: System task `$display' outside initial block is unsupported.
    eth_cop.v:0: Warning: System task `$display' outside initial block is unsupported.
    eth_cop.v:0: Warning: System task `$display' outside initial block is unsupported.
    eth_cop.v:0: Warning: System task `$display' outside initial block is unsupported.
    eth_cop.v:0: Warning: System task `$display' outside initial block is unsupported.
    eth_cop.v:0: Warning: System task `$display' outside initial block is unsupported.
    eth_cop.v:0: ERROR: System task `$stop' outside initial block is unsupported.
As I'm not fluent in Verilog, I cannot tell if it's an unsupported Yosys feature or an outright Verilog error.
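Since the script is mostly one `yosys read_verilog` line per source file, it could be generated rather than typed by hand. The following is only a sketch: the helper name `yosys_script` and its parameters are hypothetical, and the command sequence is simply the one quoted in this comment, not a recommended flow.

```python
def yosys_script(sources, top, liberty, blif):
    """Return the Tcl-style Yosys command list as a single string.

    Hypothetical helper: it reproduces the command sequence quoted in
    this comment, one `read_verilog` per source file, in the given order.
    """
    lines = [f"yosys read_verilog {src}" for src in sources]
    lines += [
        f"yosys hierarchy -check -top {top}",
        f"yosys synth -top {top}",
        "yosys memory",
        f"yosys dfflibmap -liberty {liberty}",
        f"yosys abc -liberty {liberty}",
        "yosys clean",
        f"yosys write_blif {blif}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Shortened file list, for illustration only.
    print(yosys_script(["eth_cop.v", "ethmac.v"],
                       "ethmac", "FlexLib.lib", "ethmac.blif"), end="")
```

The file order is left to the caller, so the order used in the original script can be kept as-is.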
(In reply to Jean-Paul Chaput from comment #2)
> eth_cop.v:0: ERROR: System task `$stop' outside initial block is
> unsupported.
>
> As I'm not fluent in Verilog, I cannot tell if it's a Yosys unsupported
> feature or an outright Verilog error.
https://github.com/freecores/ethmac/blob/master/rtl/verilog/eth_cop.v
it is for simulation purposes (icarus, verilator). $display and $stop clearly will not work in an ASIC! if you remove $stop you will get further.
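One way to try this without editing the upstream file by hand is a small pre-processing pass before synthesis. This is a minimal sketch, not a tested fix for eth_cop.v: the function name `strip_sim_tasks` is hypothetical, and it assumes each `$display`/`$stop` call is a standalone statement inside a begin...end block and that no `;` appears inside the `$display` arguments.

```python
import re

# Matches `$display(...);`, `$stop;` and `$stop(n);` statements.
# Assumption: the argument list contains no ';' character.
_SIM_TASK = re.compile(r"\$(?:display|stop)\b\s*(?:\([^;]*?\))?\s*;")


def strip_sim_tasks(verilog_text):
    """Replace simulation-only system task calls with a comment.

    Assumption: every call is its own statement inside begin...end, so
    removing it leaves syntactically valid Verilog. A call that is the
    sole body of an `if` without begin...end would need manual care.
    """
    return _SIM_TASK.sub("/* simulation-only task removed */", verilog_text)


if __name__ == "__main__":
    # Illustrative fragment, not copied from eth_cop.v.
    sample = (
        "if (cnt == 1000) begin\n"
        '  $display("(%0t) ERROR: no activity");\n'
        "  $stop;\n"
        "end\n"
    )
    print(strip_sim_tasks(sample), end="")
```

The filtered text would then be written to a scratch copy and given to `read_verilog` in place of the original file.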
(In reply to Jean-Paul Chaput from comment #1)
> I'm setting it up as a standalone example in alliance-check-toolkit.
> Do you want me to commit it right now?
sure, let's get it up and running.
Took a while to get it up and running. It triggered some annoying bugs that I wanted to be completely solved before going any further. All known errors in the router should now have been cleared. Committed in Coriolis #c877d7e9 and alliance-check-toolkit #5fb4f50: the ethmac base example. It is provided for both TSMC 180nm (private use only) and SkyWater 130nm, for the general public. This is the starting point from which I will start optimizing the P&R of the block.
(In reply to Jean-Paul Chaput from comment #5)
> Took a while to get it up and running. It triggered some annoying bugs that
> I wanted to be completely solved before going any further.
interesting.
> This is the starting point from which I will start optimizing the
> P&R of the block.
adhoc clock tree, localisation of the parts connected to it? be interesting to hear, also it occurs to me that maybe jtag_tck could be treated similarly on ls180 as bigger test?
(In reply to Luke Kenneth Casson Leighton from comment #6)
> adhoc clock tree, localisation of the parts connected to it?
> be interesting to hear, also it occurs to me that maybe jtag_tck
> could be treated similarly on ls180 as bigger test?
Yes. I will analyse which blocks the clocks are connected to, and see if a manual placement of those blocks can help.
I will also look at the data flow, as we have huge buses and a clearly bi-directional data flow.
Concerning the jtag_tck, that will depend on how many DFFs it is connected to and how widespread they are in the rest of the design.
(In reply to Jean-Paul Chaput from comment #7)
> Yes. I will analyse to what block the clocks are connected.
> See if a manual placement of said block can help.
>
> Also will look at the data flow as we have huge buses and
> clearly bi-directional data-flow.
yes. all IO Pads. these are combinatorial muxes to re-route IO for testing.
> Concerning the jtag_tck, that will depend on how many DFFs
> is it connected to and how widespread in the rest of the
> design they are.
there will be a lot of Muxes onto the wishbone bus, i set that to cut off the core in case things go wrong, but they should not involve DFFs there.
basically, whilst information on the JTAG side comes from or into ASync DFFs to cross over between tck and sysclk, signal interception goes through *combinatorial* muxes.
you will see this (Clock-Domain-Crossing):
    jtag side signal -> DFF(tck) -> DFF(clk) -> clk controlled signal
you will NEVER see this:
    jtag side signal -> DFF(clk) -> clk controlled signal
or this:
    jtag side signal -> DFF(tck) -> clk controlled signal
this is something you will also see on eth_mac around the FIFOs, a pair of DFFs chained together.
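To make the chained-DFF crossing concrete, here is a purely functional sketch of the chain "jtag side signal -> DFF(tck) -> DFF(clk) -> clk controlled signal". The class names `DFF` and `TckToClkCrossing` are made up for illustration, and the model only shows which clock edge moves the value forward; it does not model metastability or timing.

```python
from dataclasses import dataclass


@dataclass
class DFF:
    """Idealised flip-flop: q takes the value of d on each clock edge."""
    q: int = 0

    def rising_edge(self, d: int) -> None:
        self.q = d


class TckToClkCrossing:
    """jtag side signal -> DFF(tck) -> DFF(clk) -> clk controlled signal."""

    def __init__(self) -> None:
        self.dff_tck = DFF()  # clocked by tck (JTAG side)
        self.dff_clk = DFF()  # clocked by clk (system side)

    def tck_edge(self, jtag_signal: int) -> None:
        # The JTAG-side value is captured in the tck domain only.
        self.dff_tck.rising_edge(jtag_signal)

    def clk_edge(self) -> int:
        # The clk-domain flop samples the output of the tck-domain flop.
        self.dff_clk.rising_edge(self.dff_tck.q)
        return self.dff_clk.q


if __name__ == "__main__":
    crossing = TckToClkCrossing()
    crossing.tck_edge(1)        # value enters the tck domain
    print(crossing.clk_edge())  # value becomes visible in the clk domain
```

The point of the structure is that the clk-controlled logic only ever sees the output of a flop clocked by clk, never the raw JTAG-side signal.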
(In reply to Luke Kenneth Casson Leighton from comment #6)
> also it occurs to me that maybe jtag_tck
> could be treated similarly on ls180 as bigger test?
The jtag_tck is indeed an interesting case, as the boundary scan goes over all the input signals. So although jtag_tck is not used in the core, it needs to be distributed close to all the IO cells. The placer is also involved here, as it determines where exactly the logic is placed.
As the boundary scan is basically a big shift register, one could also have a strategy for jtag_tck that distributes it in a circular way rather than as a tree. Typically this is done with the clock going in the opposite direction of the shift register, to help avoid hold violations.
From a timing point of view, the max. operating frequency for jtag_tck can also be made lower than the core max. clock frequency.
After a first basic run with the Yosys generated SRAM, it appears that the SRAM takes up 42% of the area for the DFFs alone. If all the paraphernalia of address decoding and output muxing is added, we should be close to 60%.
So, would it be possible to have an SRAM of 256 words of 32 bits, conforming to the following interface:
    entity cmpt_eth_spram_256x32 is
      port ( ce   : in  bit
           ; clk  : in  bit
           ; oe   : in  bit
           ; rst  : in  bit
           ; we   : in  bit_vector(3 downto 0)
           ; addr : in  bit_vector(7 downto 0)
           ; di   : in  bit_vector(31 downto 0)
           ; dato : out bit_vector(31 downto 0)
           ; vdd  : in  bit
           ; vss  : in  bit
           );
    end cmpt_eth_spram_256x32;
It would ensure a drastic area reduction.
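For reference, 256 words of 32 bits is 8192 storage bits, which matches the large DFF count when the memory is mapped to flip-flops. The sketch below is a behavioural model of the requested interface under several assumptions that are not confirmed in this thread: `we` is treated as a per-byte write enable (bit i enables byte i), writes happen on the clock edge when `ce` is set, and `dato` shows the word at `addr` after the edge. `oe`, `rst` and the supplies are left out. The class name `Spram256x32` is hypothetical.

```python
class Spram256x32:
    """Behavioural sketch of a 256 x 32-bit single-port RAM (assumed semantics)."""

    WORDS = 256
    WIDTH = 32

    def __init__(self):
        self.mem = [0] * self.WORDS
        self.dato = 0

    def clock(self, ce, we, addr, di):
        """Model one rising clk edge.

        Assumptions: `we` is a 4-bit mask, bit i enabling a write to
        byte i of the word; nothing happens unless `ce` is set.
        """
        if not ce:
            return
        addr &= self.WORDS - 1
        word = self.mem[addr]
        for byte in range(self.WIDTH // 8):
            if (we >> byte) & 1:
                mask = 0xFF << (8 * byte)
                word = (word & ~mask) | (di & mask)
        self.mem[addr] = word
        self.dato = word  # read value after the write: an assumption


if __name__ == "__main__":
    ram = Spram256x32()
    ram.clock(ce=1, we=0b1111, addr=3, di=0xDEADBEEF)
    ram.clock(ce=1, we=0b0001, addr=3, di=0x00000011)
    print(hex(ram.dato))
    print(Spram256x32.WORDS * Spram256x32.WIDTH, "storage bits")
```

A model like this is only useful for checking that a replacement block behaves like the Yosys-generated one at the interface level; the real semantics should be taken from eth_spram_256x32.v.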
removed from bug #850 and copied here to the appropriate bugreport
> * 256x32 SRAM for eth_mac
If that's supposed to hold a full ethernet packet, it's too small for the standard ethernet frame size: it needs to be at least 1522 bytes if we don't want to support jumbo frames (the ethernet fields -- not just the payload -- are needed for full packet capture like for wireshark):
https://en.wikipedia.org/wiki/Ethernet_frame
if we want to support jumbo frames we'll need 9022 bytes:
https://en.wikipedia.org/wiki/Jumbo_frame
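The arithmetic behind this remark can be checked quickly. The sketch below only uses the numbers quoted in the comment (256 words of 32 bits, 1522 and 9022 bytes); the function names are made up for illustration.

```python
SRAM_WORDS = 256
SRAM_WORD_BITS = 32
STANDARD_FRAME_BYTES = 1522  # figure quoted in the comment
JUMBO_FRAME_BYTES = 9022     # figure quoted in the comment


def sram_bytes(words=SRAM_WORDS, bits=SRAM_WORD_BITS):
    """Capacity of the SRAM in bytes."""
    return words * bits // 8


def words_needed(frame_bytes, bits=SRAM_WORD_BITS):
    """Number of words needed to hold frame_bytes (ceiling division)."""
    bytes_per_word = bits // 8
    return -(-frame_bytes // bytes_per_word)


if __name__ == "__main__":
    print("SRAM capacity:", sram_bytes(), "bytes")
    print("words for a standard frame:", words_needed(STANDARD_FRAME_BYTES))
    print("words for a jumbo frame:", words_needed(JUMBO_FRAME_BYTES))
```

A 256x32 SRAM holds 1024 bytes, so it is indeed smaller than a 1522-byte frame, which would need 381 words of 32 bits.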
(In reply to Luke Kenneth Casson Leighton from comment #11)
> If that's supposed to hold a full ethernet packet,
no. registers (and something called "BD", Buffer Descriptor, whatever that is). packets are transferred directly to/from FIFOs from/to memory using a Wishbone Master interface.
in theory the SRAM could be made larger.
(In reply to Jean-Paul Chaput from comment #10)
> After a first basic run with the Yosys generated SRAM, it appears that the
> SRAM takes up 42% of the area for the DFF only. If all the paraphernalia of
> address decoding and output muxing is added we should be close to 60%.
as there is not an actual ASIC being manufactured this is not such a big concern.
> So, would it be possible to have a SRAM of 256 words of 32 bits,
> conforming to the following interface:
we are out of budget to do so, everything has been allocated.
(In reply to Luke Kenneth Casson Leighton from comment #13)
> (In reply to Jean-Paul Chaput from comment #10)
> > After a first basic run with the Yosys generated SRAM, it appears that the
> > SRAM takes up 42% of the area for the DFF only. If all the paraphernalia of
> > address decoding and output muxing is added we should be close to 60%.
>
> as there is not an actual ASIC being manufactured this is not such a big
> concern.
Yes and no... We won't do the ASIC, but we still plan to submit a mini-design to the Google/SkyWater MPW program. To preemptively answer your question: yes, the SkyWater I/O pads are too slow to run the ethmac at nominal speed. But we will try to run it slower, just to check the whole design.
On a more general note, I think that some people may not have access to optimized SRAM blocks and still rely on Yosys generated ones, so having a dedicated placer should be beneficial for the community at large.
> > So, would it be possible to have a SRAM of 256 words of 32 bits,
> > conforming to the following interface:
I leave that up to Staf if he still wants to do it.
https://gitlab.lip6.fr/vlsi-eda/alliance-check-toolkit/-/commit/48f10117d5fbeb4a2c2bf54b2ff711477e8c34bf