DONE: PriorityPicker DONE: MultiPriorityPicker DONE: BetterMultiPriorityPicker
I completed the formal proof for PriorityPicker, which turned out to be quite broken: the lsb_mode parameter when set to False (the default) actually changed it to LSB-priority. When set to True, it was just plain broken. Since nothing actually uses lsb_mode=True or even sets it (except for dumping PriorityPicker as rtlil for user inspection), I renamed it to msb_mode and fixed its operation when set to True. https://git.libre-soc.org/?p=nmutil.git;a=blobdiff;f=src/nmutil/picker.py;h=7cf7f7bd03001f920ce898ab765211d7689832ad;hp=8df01e783445f53e623e61eac7e1ef3698ed660b;hb=406a59ee96b948810f274151808c1551517d28f2;hpb=8aa377705c96f365606950c0fa142e40546e566b
I added the formal proof for MultiPriorityPicker. While I was reading the implementation, I thought of a more efficient implementation with O(log N) latency instead of the > O(N) latency that MultiPriorityPicker has, so I wrote BetterMultiPriorityPicker: it uses a parallel prefix sum of the input bits to compute which level each input bit should be sent to, rather than needing a O(N) stack of PriorityPickers. Testing with width=16, levels=16: The new algorithm uses a bit less gates and has about 1/4 the latency: > yosys <<<$'read_rtlil sim_test_out/nmutil.formal.test_picker.TestBetterMultiPriorityPicker.test_16_levels_16_idxs_f_mi_f/0.il\nflatten\nsynth\nltp -noff' Number of cells: 1171 Longest topological path in top (length=21): > yosys <<<$'read_rtlil sim_test_out/nmutil.formal.test_picker.TestMultiPriorityPicker.test_16_levels_16_idxs_f_mi_f/0.il\nflatten\nsynth\nltp -noff' Number of cells: 1469 Longest topological path in top (length=85):
yes brilliant, i did a proof-of-concept version based on PPicker as it is easier to think through but nowhere neae gate-efficient. nicely done.