The first NLnet grant allowed us to create IEEE754 FP pipelines, which now need integration into the Libre-SOC Core, and suitable unit tests created. It will add approximately 60 new instructions for Power ISA 3.0 SFFS compliance. Second priority is Power ISA v3.1.

----------------------------------------------------------------

To-do list:
(this doesn't include formal proofs, many of which will be complex enough to blow out our budget and/or not even run in a reasonable amount of time)

cost estimates:
* NOT COMPLETED AT ALL: make educated guess how much code will be needed
  * TODO: comment #6 and other comments (which you need to track down and re-read, jacob) regarding getting FP into TestIssuer. without TestIssuer supporting the FP pipeline there *is* no FP pipeline. this set of tasks *includes documenting* TestIssuer and how it fits together: finding, checking and then updating the existing page.
  * TODO: plan out modifications to LDSTCompUnit, integrating DOUBLE/SINGLE into them, coordinating with Cesar ensuring that the Formal Proof plans are not disrupted (TODO: find the bugreport, link it here, link it to this bugreport and increase cesar's budget up again to cover the additional work)
  * DONE: done for everything but:
    * fptrans
      * huge, needs to be split into much smaller subtasks. it's enough work that, except for the tasks already split out (recip/[r]sqrt/mod/rem and fminmax) we should probably eliminate it entirely from this grant due to being lower priority and difficult.
      * will figure out only if we have budget left over from other tasks
* DONE: translate code size estimates to monetary amounts.
  using conversion factor of 27 days per 1000 lines of code, estimated in:
  https://lists.libre-soc.org/pipermail/libre-soc-dev/2023-August/005607.html
  using EUR 3000/mo from:
  https://lists.libre-soc.org/pipermail/libre-soc-dev/2023-August/005592.html
  this comes out as EUR 2.70 per line of code, assuming a month is 30 days.
  I then rounded budgets to the nearest EUR 100

====================
subtasks we're doing
====================

Needed for v3.0/v3.1-without-PO1:
* DONE: add FPSCR and rounding-mode definitions to ieee754fpu
  https://bugs.libre-soc.org/show_bug.cgi?id=1135
  * 265 lines of code with 70 lines of tests
  * rounded to EUR 900
* TODO: add FPSCR and FP registers and dependency-tracking to soc.git
  * probably 100 lines of code with 100 lines of tests
  * rounded to EUR 1400
* TODO: add PowerISA SINGLE/DOUBLE conversion functions
  * (not the same as ieee 754 conversion)
  * 200 lines of code with 100 lines of tests
  * rounded to EUR 1600
* TODO: add FP loads to soc.git
  * 200 lines of code with 200 lines of tests
  * lf[s/d][u][x]
  * lfiwax/lfiwzx
  * rounded to EUR 1900
* TODO: add FP stores to soc.git
  * 200 lines of code with 200 lines of tests
  * stf[s/d][u][x]
  * stfiwx
  * rounded to EUR 1900
* TODO: FP move/sign-bit-manipulation
  * fmr[.]
  * fneg[.]/fabs[.]/fnabs[.]
  * fcpsgn[.]
  * probably 100 lines of code with 50 lines of tests
  * rounded to EUR 500
* TODO: FP add/sub
  * fadd[s][.]/fsub[s][.] using existing pipelines
    * probably 100 lines with 100 lines of tests
  * add FPSCR/rounding support to add/sub pipeline
    * probably 200 lines with 100 lines of tests (due to rounding modes)
  * rounded to EUR 2600
* TODO: FP select
  * probably 100 lines with 50 lines of tests
  * fsel[.]
  * simple enough that I don't think we even need an ieee754fpu pipeline
  * doesn't interact with FPSCR except for writing CR1 when Rc=1
  * rounded to EUR 400
* TODO: FPSCR manipulation
  * probably 300 lines with 200 lines of tests
  * mffs[.]
  * mffsce[.]
  * mffsc[d]rn[i]
  * mffsl
  * mcrfs
  * mtfsf[i][.]
  * mtfsb[0/1][.]
  * rounded to EUR 1400

=============================
not doing as part of this bug
=============================
(just here since I already did the budgeting work)
(overlaps with https://bugs.libre-soc.org/show_bug.cgi?id=1026#c0, we need to resolve what goes where)

Needed for v3.0/v3.1-without-PO1:
* TODO: FP store double pair
  * 300-500 lines of code with 400 lines of tests
  * stfdp[x]
  * stores are part of SFFS subset, loads are not according to v3.1B Appendix H
  * rounded to EUR 2200
* TODO: FP merge words
  * fmrgew
  * fmrgow
  * probably 100-200 lines of code with 50 lines of tests
  * rounded to EUR 500
* TODO: FP mul/fma
  * fmul[s][.]/f[n]madd[s][.]/f[n]msub[s][.] using existing pipelines
    * probably 200-300 lines with 200 lines of tests
  * add FPSCR/rounding support to mul/fma pipeline
    * probably 200 lines with 100 lines of tests (due to rounding modes)
  * rounded to EUR 3100
* TODO: FP div/sqrt/fre/frsqrte
  * fdiv[s][.]/fsqrt[s][.]/fre[s][.]/frsqrte[s][.] using existing giant div core
    (fre[s][.]/frsqrte[s][.] are full accuracy ops since that's easiest: we already need the full accuracy ops for fptrans and we already have a pipeline that does the core operation.)
    * probably 300-400 lines with 300 lines of tests
  * create new much smaller FSM pipeline suitable for small cores (the giant pipeline won't fit on most of our FPGAs), basically taking all the giant div core and converting each stage to an FSM state. This includes fmod*/fremainder*/frecip*/frsqrt* since those are simple on top of the existing div/sqrt/rsqrt logic.
    * probably 500 lines with 200 lines of tests
  * add FPSCR/rounding support to div/sqrt/rsqrt pipeline
    * probably 200 lines with 100 lines of tests
  * rounded to EUR 4500, 1800 for just using existing giant div core, 3600 without full rounding support
* TODO: FP test for SW div/sqrt
  * ftdiv/ftsqrt
  * probably 150 lines with 50 lines of tests
  * rounded to EUR 800
* TODO: FP round to single
  * frsp[.]
  * add FPSCR/rounding support
  * probably 200-300 lines with 300 lines of tests
  * rounded to EUR 1500
* TODO: FP convert from integer
  * https://bugs.libre-soc.org/show_bug.cgi?id=1137
  * needs fcfi* in ISACaller -- 500 lines with 200 lines of tests
    * mostly copy-paste, so only counting 300 lines total
  * probably 200-400 lines for pipelines/insns
  * create ctfpr pipeline, it can be used (perhaps with slight modifications) for all the fcfi* insns too.
  * this covers all of:
    * fcfid[u][s][.]
    * ctfpr*
  * rounded to EUR 1900
* TODO: FP convert to integer
  * https://bugs.libre-soc.org/show_bug.cgi?id=1136
  * needs fcti* in ISACaller -- 500 lines with 200 lines of tests
    * mostly copy-paste, so only counting 300 lines total
  * probably 400-500 lines for pipelines/insns
  * create cffpr pipeline, it can be used (perhaps with slight modifications) for all the fcti* insns too.
  * this covers all of:
    * fcti[d/w][u][z][.]
    * cffpr*
  * rounded to EUR 2000
* TODO: FP round to integer (FP to FP, not FP to int)
  * probably 200 lines with 100 lines of tests
  * fri[n/p/z/m][.]
  * add FPSCR support
  * rounded to EUR 800
* TODO: FP compare
  * probably 200 lines with 100 lines of tests
  * fcmp[u/o]
  * add FPSCR support
  * rounded to EUR 800

Needed for Libre-SOC extensions (stuff that's in openpower-isa):
* TODO: fmvis/fishmv
  * probably 100 lines with 50 lines of tests
  * rounded to EUR 400
* SEE ABOVE: c[ft]fpr* (covered by tasks above for FP convert from/to integer)
* TODO: m[ft]fpr*
  * probably 150 lines with 50 lines of tests
  * rounded to EUR 500
* TODO: fptrans insns (fminmax*, fsqrt*/frsqrt*, frecip*, fmod*/fremainder* are split out into separate tasks)
  * easily multiple thousands of lines of code, should be split up further into sub-sub-tasks
  * SEE ABOVE: fsqrt*/frsqrt*, frecip*, fmod*/fremainder* (covered by task above for FP div/sqrt/fre/frsqrte)
  * TODO: fminmax*
    * probably 300 lines with 100 lines of tests
    * https://bugs.libre-soc.org/show_bug.cgi?id=1134
    * rounded to EUR 1100
* TODO: FFT/DCT FP ops
  * fdmadds/ffadd/ffdiv[s]/ffmadds/ffmsubs/ffmul[s]/ffnmadds/ffnmsubs/ffsub[s] in openpower-isa
  * probably 500 lines with 300 lines of tests
  * rounded to EUR 2200
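
For reference, the conversion quoted in the description (27 days per 1000 lines, EUR 3000/mo, 30-day month, rounded to the nearest EUR 100) can be sketched as below. The function names are made up for illustration, and counting test lines at the same rate as code lines is an assumption. Not every figure in the lists follows from this raw formula, since comment #4 mentions the costs were adjusted afterwards.

```python
# Conversion factors quoted in the description.
DAYS_PER_1000_LINES = 27
EUR_PER_MONTH = 3000
DAYS_PER_MONTH = 30  # the description assumes a 30-day month


def eur_per_line() -> float:
    """EUR cost of one line under the quoted conversion factors."""
    eur_per_day = EUR_PER_MONTH / DAYS_PER_MONTH
    return DAYS_PER_1000_LINES * eur_per_day / 1000


def estimate(code_lines: int, test_lines: int) -> int:
    """Raw estimate rounded to the nearest EUR 100 (halves round up).

    Assumption: test lines cost the same as code lines.
    """
    raw = (code_lines + test_lines) * eur_per_line()
    return int((raw + 50) // 100) * 100


# FPSCR definitions task: 265 lines of code, 70 lines of tests.
print(estimate(265, 70))
# FP select task: 100 lines of code, 50 lines of tests.
print(estimate(100, 50))
```

With these inputs the sketch gives 900 and 400, matching the listed figures for those two tasks.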
(In reply to Luke Kenneth Casson Leighton from bug #1072 comment #8)
> (In reply to Jacob Lifshay from bug #1072 comment #7)
>
> > all I need is just the register with accessible fields,
>
> copy the style here
> https://git.libre-soc.org/?p=openpower-isa.git;a=blob;f=src/openpower/decoder/isa/svstate.py;hb=HEAD

Done, though I may have gone a little overboard -- I added a parser that parses the fields from the doc comment to ensure they always match. One benefit is it can probably be easily adapted to whatever LaTeX table thing has the field definitions whenever the OpenPower Foundation releases the spec source.

https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;h=bb7069cee706f7fbc75dc7cafec3afe19cd87e02;hp=c8b2c5d2c984cc444880187679c2a9589bae0526

commit bb7069cee706f7fbc75dc7cafec3afe19cd87e02
Author: Jacob Lifshay <programmerjake@gmail.com>
Date:   Fri May 5 20:38:22 2023 -0700

    fix fpscr table parser error reporting

commit bef52a38023bfce850174a3156d61170f767dd01
Author: Jacob Lifshay <programmerjake@gmail.com>
Date:   Fri May 5 20:34:18 2023 -0700

    add FPSCRState and FPSCRRecord and a FPSCR smoke-test

>
> i will followup by adding it to ISACaller.

Thanks!
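
To illustrate the "parse the fields from the doc comment so they always match" idea, here is a rough sketch. It is not the parser from the commit above: the table format, FIELD_DOC, parse_fields and check_matches are all hypothetical, and only three FPSCR fields are shown.

```python
import re

# Hypothetical doc-comment table: "| bits | name | description |".
# The real FPSCR table format in openpower-isa may differ.
FIELD_DOC = """
| 29:31 | DRN | Decimal rounding mode |
| 32    | FX  | FP exception summary  |
| 62:63 | RN  | Binary rounding mode  |
"""

# Matches "| start[:end] | NAME |" at the start of a row.
ROW = re.compile(r"^\|\s*(\d+)(?::(\d+))?\s*\|\s*(\w+)\s*\|")


def parse_fields(doc: str) -> dict:
    """Return {name: (msb0_start, msb0_end)} parsed from a doc table."""
    fields = {}
    for lineno, line in enumerate(doc.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue
        m = ROW.match(line)
        if m is None:
            raise ValueError(f"line {lineno}: cannot parse {line!r}")
        start = int(m.group(1))
        end = int(m.group(2)) if m.group(2) else start
        name = m.group(3)
        if name in fields:
            raise ValueError(f"line {lineno}: duplicate field {name}")
        fields[name] = (start, end)
    return fields


def check_matches(doc: str, code_fields: dict) -> None:
    """Raise if the layout used by the code drifts from the documented one."""
    parsed = parse_fields(doc)
    if parsed != code_fields:
        raise AssertionError(f"doc/code mismatch: {parsed} != {code_fields}")
```

Calling check_matches() once at import time or in a smoke-test is enough to catch a field being edited in one place but not the other.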
(In reply to Jacob Lifshay from comment #1)
> https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;h=bb7069cee706f7fbc75dc7cafec3afe19cd87e02;hp=c8b2c5d2c984cc444880187679c2a9589bae0526

+
+    @property
+    def DRN(self):
+        return self.fsi['DRN'].asint(msb0=True)
+
+    @DRN.setter
+    def DRN(self, value):
+        self.fsi['DRN'].eq(value)
+

please remove all of those and replace them with dynamic functions added within the for-loop. there are far too many and you should have immediately red-flagged the fact that they are a regular pattern (60+ identical duplications of the above lines with different 'DRN')

python is not rust or java. use a simple override on __getattr__ and __setattr__

do not waste time creating metaclasses (or override __dir__)

https://amir.rachum.com/python-dynamic-attributes/

we do not want complex concepts nor unintelligible code nor vast quantities of unmaintainable code.
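
A minimal sketch of the __getattr__/__setattr__ approach requested above, for comparison with the per-field property pairs. This is an illustration only: FieldRegister is a hypothetical stand-in for FPSCRState, a plain int replaces SelectableInt, and only three fields are listed.

```python
class FieldRegister:
    """Toy register exposing named bit-fields as attributes.

    Hypothetical stand-in for FPSCRState: fields are given as
    (msb0_start, msb0_end) pairs over a 64-bit value.
    """

    WIDTH = 64
    # Illustrative subset of FPSCR fields, MSB0 numbering.
    FIELDS = {"DRN": (29, 31), "FX": (32, 32), "RN": (62, 63)}

    def __init__(self, value=0):
        # Bypass our own __setattr__ for the backing store.
        object.__setattr__(self, "value", value)

    def _shift_mask(self, name):
        start, end = self.FIELDS[name]
        width = end - start + 1
        shift = self.WIDTH - 1 - end  # convert MSB0 position to a shift
        return shift, (1 << width) - 1

    def __getattr__(self, name):
        # Only called when normal lookup fails, i.e. for field names.
        if name not in type(self).FIELDS:
            raise AttributeError(name)
        shift, mask = self._shift_mask(name)
        return (self.value >> shift) & mask

    def __setattr__(self, name, new):
        if name not in self.FIELDS:
            raise AttributeError(f"no such field: {name}")
        shift, mask = self._shift_mask(name)
        cleared = self.value & ~(mask << shift)
        object.__setattr__(self, "value", cleared | ((new & mask) << shift))


fpscr = FieldRegister()
fpscr.RN = 0b11
fpscr.DRN = 0b101
print(hex(fpscr.value), fpscr.RN, fpscr.DRN, fpscr.FX)
```

Adding a field is then a one-line change to FIELDS, at the cost of the attributes no longer showing up in dir() or static analysis unless __dir__ is also overridden.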
(In reply to Luke Kenneth Casson Leighton from comment #2)
> please remove all of those and replace them with
> dynamic functions added within the for-loop.

I put them in there because that's what you did for SVSHAPE
https://git.libre-soc.org/?p=openpower-isa.git;a=blob;f=src/openpower/decoder/isa/svshape.py;hb=d40763cd6e186ad9b17ce6f974a38b4c4877965e

I can remove them if you like...
I adjusted the costs for different tasks, and figured out what would fit in our EUR 8000 budget (it ended up excluding all Libre-SOC instructions, so I think some of those should just be moved to #1026 where we explicitly have a budget for Libre-SOC instructions)

I picked:
* add FPSCR and rounding-mode definitions to ieee754fpu
  already done
* add FPSCR and FP registers and dependency-tracking to soc.git
  needed for everything else
* add PowerISA SINGLE/DOUBLE conversion functions
  needed for nearly everything else
* add FP loads to soc.git
  we need a way to get data in/out of FP registers and loads/stores are what compilers currently use (they don't support m[ft]fpr yet)
* add FP stores to soc.git
  same as FP loads reasoning
* FP move/sign-bit-manipulation
  data in registers needs to move around
* FP add/sub
  common and fits in budget
* FP select
  exactly fits in EUR 400 left over from everything else
* FPSCR manipulation
  we need a way to get data in/out of FPSCR
(In reply to Jacob Lifshay from comment #3)
> (In reply to Luke Kenneth Casson Leighton from comment #2)
> > please remove all of those and replace them with
> > dynamic functions added within the for-loop.
>
> I put them in there because that's what you did for SVSHAPE

yes but not 64 individual bits resulting in 250+ lines of duplicated code easily autogenerated.

> I can remove them if you like...

no it's too late now, done and a waste of time removing them, despite being a maintenance nightmare.

to be honest these should all be done by insndb, which is a big enough task to justify its own budget, and autogenerating nmigen Record-derivatives as well.
(In reply to Jacob Lifshay from comment #4)
> I adjusted the costs for different tasks, and figured out what would fit in
> our EUR 8000 budget (it ended up excluding all Libre-SOC instructions, so I
> think some of those should just be moved to #1026 where we explicitly have a
> budget for Libre-SOC instructions)

not sure yet. see how it goes.

> I picked:
....

it is a good start, gives the general idea of cost, but is missing preparation (please find where i described it already) and regfile profile analysis and documentation.

also adding regfiles, and prep code to initialise then extract them in the TestAPI. and adding config params, and adding TestIssuer options, all the way to Makefile to compile with or without FP.

> * add FP loads to soc.git
> we need a way to get data in/out of FP registers

ignore LD/ST initially and have the unit tests pre-arrange values in regs. i had to add that for GPRs, and so adding the ability to upload into FPRs prior to starting the nmigen sim is *another* task on the list.

please follow the following incremental strategy as top-level bugs:

* add FPR regfile **ONLY** plus do the pipeline reg allocation analysis.
  grep all code for "IntRegs" and duplicate all sections, then document the proposed pipeline reg allocation.
  - fneg will be in 1R1W (plus FPSCR plus CR1, read and write)
  - fmac in 3R1W (plus FPSCR CR1)
  - fadd *also* consider in 3R1W but with mul as "no input".
  - fld/fst as another (bear in mind these are special-case)
* do the 1in1out as one bug "group"
* do 3r1w as another
* do LD/ST as another
* add FPSCR and CR1 to all pipelines *as a totally separate set of tasks* with their own toplevel budget.
  reason: look in common_input and output_stage.py, the same can be done for CR1 / FPSCR. FPSCR regfile can copy XERRegs style.

this will give a coarse granularity that NLnet will be happier with, and is a safer incremental approach.
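
One possible way to write down the proposed pipeline reg allocation before touching any HDL is a small table, roughly as sketched below. PortProfile, PROFILES and INSN_PROFILE are hypothetical names for illustration, not existing soc.git structures; only the groupings themselves come from the list above, and fld/fst is left out because it is flagged as a special case.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PortProfile:
    """FPR port counts for one pipeline group (hypothetical structure)."""
    fpr_reads: int
    fpr_writes: int
    uses_fpscr: bool = True  # FPSCR read and write
    uses_cr1: bool = True    # CR1 written when Rc=1


# Profiles named in the comment above.
PROFILES = {
    "1R1W": PortProfile(fpr_reads=1, fpr_writes=1),
    "3R1W": PortProfile(fpr_reads=3, fpr_writes=1),
}

# Instruction-to-profile assignment as proposed in the comment above.
INSN_PROFILE = {
    "fneg": "1R1W",
    "fmac": "3R1W",
    "fadd": "3R1W",  # shares the fmac pipeline, mul operand is "no input"
}


def group_by_profile(insn_profile):
    """Return {profile_name: [insns]} so each group can become one bug."""
    groups = {}
    for insn, prof in sorted(insn_profile.items()):
        groups.setdefault(prof, []).append(insn)
    return groups


for name, insns in group_by_profile(INSN_PROFILE).items():
    print(name, PROFILES[name], insns)
```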
i did think of another way (another subdivision) but i can't remember it this morning.

oh yes!

* pipeline (+tests)
* CompUnit (+tests using classes as used in test_caller_*)
* FunctionUnit (+tests using same classes)
* Core integration (test_core.py - needs FP option)
* TestIssuer integration (test_issuer.py - needs option(s) plural)

the *exact* same unit tests - many of which have already been written for use by test_caller*.py - can if generalised be used the entire way, *four* new (additional) times.

i got test_core.py up and running again last time i looked at this (when doing the InOrder core).
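
The "same unit tests at every layer" idea could look roughly like the sketch below. Everything here is hypothetical (FPTestCase, Runner, ModelRunner and run_everywhere are not the existing test_caller classes), and ModelRunner is a pure-Python stand-in that only understands fmr, where a real runner would drive the nmigen simulation for its layer.

```python
from dataclasses import dataclass, field


@dataclass
class FPTestCase:
    """One test: pre-arranged FPRs, a program, and expected FPRs."""
    name: str
    program: list
    initial_fprs: dict = field(default_factory=dict)
    expected_fprs: dict = field(default_factory=dict)


class Runner:
    """Hypothetical base class: each layer supplies its own execute()."""
    layer = "base"

    def execute(self, case):
        raise NotImplementedError

    def check(self, case):
        got = self.execute(case)
        for reg, want in case.expected_fprs.items():
            assert got.get(reg) == want, (self.layer, case.name, reg)


class ModelRunner(Runner):
    """Stand-in for a real layer; understands only ('fmr', FRT, FRB)."""

    def __init__(self, layer):
        self.layer = layer

    def execute(self, case):
        fprs = dict(case.initial_fprs)  # values pre-arranged by the test
        for op, frt, frb in case.program:
            if op != "fmr":
                raise ValueError(f"unsupported op: {op}")
            fprs[frt] = fprs.get(frb, 0)
        return fprs


LAYERS = ["pipeline", "CompUnit", "FunctionUnit", "Core", "TestIssuer"]


def run_everywhere(cases):
    """Run the same cases at every layer; return the (layer, name) pairs."""
    done = []
    for layer in LAYERS:
        runner = ModelRunner(layer)
        for case in cases:
            runner.check(case)
            done.append((layer, case.name))
    return done
```

The test case carries its own initial register values, which fits the suggestion in the previous comment to pre-arrange FPRs rather than depend on FP LD/ST being implemented first.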