initially some simulator support for transcendentals were added for mp3 support, more are needed for 3D. atan/atan2, log1p expm1. also needs adding to sv/trans/svp64.py disasm and binutils * move trans. to fptrans DONE jacob * reallocate to not overlap DONE jacob take into account email below and replace all A-FORM by X-FORM: https://lists.libre-soc.org/pipermail/libre-soc-dev/2022-September/005253.html https://git.libre-soc.org/?p=libreriscv.git;a=blob;f=openpower/power_trans_ops.mdwn;h=a3d4a9d5fdc12dc46695a803bf7191f26e5c151c;hb=e6130994542942a629b8d1e723f746079e870581 * add CSV files matching URL DONE jacob https://libre-soc.org/openpower/transcendentals/ https://git.libre-soc.org/?p=openpower-isa.git;a=blobdiff;f=openpower/isatables/minor_59.csv;h=cdb66e40781d29a953511371e1e5ff33eed24577;hp=04a07b8a4183fbfd61fa80135a4ca67ee9fed12d;hb=4ca5b36dd4a72243a433e6371993667d4eedf916;hpb=76c6dd37fcb2c86ddf5458a9b1bf2284cacef7a1 https://git.libre-soc.org/?p=openpower-isa.git;a=blobdiff;f=openpower/isatables/minor_63.csv;h=493c6f02d3e081263c9208a4a322156a8cedbc07;hp=fd94ad3b056c5080b72b8382b957ef309f254c6a;hb=HEAD;hpb=9878bd52a8af9d504a84f1eef5b81f3ce5afcdff * allocate opcodes for new ops from #923 DONE jacob * add new ops from #923 to csvs DONE jacob * add stub ops to fptrans SKIPPED jacob * check sv_analysis DONE jacob * add to sv/trans.py DONE jacob * add binutils TODO ghostmansd * implement pseudocode DONE jacob * implement helpers.py DONE jacob * implement unit tests DONE jacob (list linked below) * impl. *some* svp64 tests DONE jacob there is no need *at all* to be "comprehensive" in the unit tests, even a single example is sufficient. exceptions and rounding are NOT needed at this stage. list of ops: https://libre-soc.org/openpower/power_trans_ops/
just started working on this: moved fsins/fcoss to their own file fptrans.mdwn
(In reply to Jacob Lifshay from comment #1) > just started working on this: > moved fsins/fcoss to their own file fptrans.mdwn brilliant, good idea, remember i have already done the opcode allocation https://libre-soc.org/openpower/transcendentals/ that is for the CSV files ghostmansd can do the binutils the moment the reg allocation and CSV is available, so if you can create stubs for every op he can do that straight away. i will edit comment 0 with a plan
| flog2p1(s)| log2 plus 1 | 10101 01110 | | fexp2(s) | power-of-2 | 10110 01110 | | flog2(s) | log2 | 10111 01111 | | fexpm1(s)| exponential minus 1 | 11000 01110 | suggest a helper.py function pair FPLOG(self, val, pow, offset=1.0) FPEXP(self, val, exp, offset=-1.0)
affected by outcome of https://lists.libre-soc.org/pipermail/libre-soc-dev/2022-September/005251.html reproduced for convenience: On Thu, Sep 1, 2022, 02:01 Luke Kenneth Casson Leighton via Libre-soc-dev < libre-soc-dev at lists.libre-soc.org> wrote: > but it is *not the whole story* because the GF ops have > not been allocated, nor rounding converts, nor 3D textures, > and additionally there are the Transcendentals > https://libre-soc.org/openpower/transcendentals/ > > fortunately the Ts can fit into EXT063 and EXT059 naturally. > looking over the transcendentals encodings, they both have 5 bits set to `///`, which iirc is don't care bits, this seems like a waste, we should have those be an `EO` expanded opcode field (where RA would go for single-input instructions) or just part of XO (for 2-in instructions), reducing the number of XO values used by 32x. iirc `EO` is already used like that for some X-form instructions, e.g.: VSX Scalar Negate Quad-Precision X-form xsnegqp VRT,VRB fields: PO RT EO RB XO 63 VRT 16 VRB 804 / 0 6 11 16 21 31
looking through the encodings some more, some of the encodings appear to be duplicates: 1000001110,fcos[s] 1000001110,frsqrt[s] 1000101110,facos[s] 1000101110,fcbrt[s] 1001001111,frecip[s] 1001001111,ftanpi[s] 1010001110,fcosh[s] 1010001110,fexp2m1[s] 1010101111,facosh[s] 1010101111,fasinh[s] 1010101111,fatanh[s] 1010001101,fsinh -----01101,fhypot
(In reply to Jacob Lifshay from comment #4) > looking over the transcendentals encodings, they both have 5 bits set to ha. drop the entire 2-op table into an X-Form, pick one area of 058 . 063 that has free identical but also partly already used. around fcfid and fcfids would do. only one is needed. 11000 01110 and below right columns are precious.
found an empty spot with enough space, once fdmadds is moved. https://git.libre-soc.org/?p=libreriscv.git;a=commitdiff;h=dbc0e373942ecab6298832da9c0ed925aa7fa4c7 ran out of time to actually reallocate today, sorry. will do that soon. added a bunch of TODOs, so it's obvious it's a WIP.
(In reply to Jacob Lifshay from comment #7) > found an empty spot with enough space, once fdmadds is moved. ffadds can share the space with fptrans, once it's converted to X-FORM. TODO: look for other A-FORM instructions we're proposing that don't need 3-inputs, convert to X-FORM. > > https://git.libre-soc.org/?p=libreriscv.git;a=commitdiff; > h=dbc0e373942ecab6298832da9c0ed925aa7fa4c7 > > ran out of time to actually reallocate today, sorry. will do that soon. > > added a bunch of TODOs, so it's obvious it's a WIP.
21 -----01101,FPU,OP_FPOP,FRA,FRB,NONE,FRT,NONE,CR1,0,0,ZERO,0,NONE,0,0,0,0,1,0,RC_ONLY,0,0,ffadds,A,,1,unoff ahh maaaahn, we need a version of the opcode tables. one place. can't track all these otherwise. best done autogenerated by reading csv files i feel. option to parse idea_minor_NN.csv as well so as to have something to test without committing.
(In reply to Luke Kenneth Casson Leighton from comment #9) > 21 > -----01101,FPU,OP_FPOP,FRA,FRB,NONE,FRT,NONE,CR1,0,0,ZERO,0,NONE,0,0,0,0,1,0, > RC_ONLY,0,0,ffadds,A,,1,unoff > > ahh maaaahn, we need a version of the opcode tables. > one place. can't track all these otherwise. > > best done autogenerated by reading csv files i feel. > option to parse idea_minor_NN.csv as well so as to > have something to test without committing. if/when we write that, we need to detect overlap and not silently pick one instruction for each cell.
(In reply to Jacob Lifshay from comment #10) > if/when we write that, we need to detect overlap and not silently pick one > instruction for each cell. easily done.
(In reply to Jacob Lifshay from comment #8) > (In reply to Jacob Lifshay from comment #7) > > found an empty spot with enough space, once fdmadds is moved. > > ffadds can share the space with fptrans, once it's converted to X-FORM. I moved ffadds (converted to X-FORM) and fdmadds: https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;h=6507ee86c014b70fac45578500fc87612403d664 I found and updated the hack to detect if an instruction is SVP64 FFT or not, converting it to: # main PowerDecoder2 determines if different SVP64 modes enabled # detect if SVP64 FFT mode enabled (really bad hack), # exclude fcfids and others # XXX this is a REALLY bad hack, REALLY has to be done better. # likely with a sub-decoder. major = Signal(6) comb += major.eq(self.dec.opcode_in[26:32]) xo = Signal(10) comb += xo.eq(self.dec.opcode_in[1:11]) comb += self.use_svp64_fft.eq((major == 59) & xo.matches( '-----00100', # ffmsubs '-----00101', # ffmadds '-----00110', # ffnmsubs '-----00111', # ffnmadds '1000001100', # ffadds '-----11011', # fdmadds )) All tests in openpower-isa.git pass (though I had to do a git diff on the output logs to find that hack, `setarch -R` makes it easier since addresses don't change as much). TODO(lkcl or whoever): fix that annoying hack!
(In reply to Jacob Lifshay from comment #12) > I found and updated the hack to detect if an instruction is SVP64 FFT or > not, converting it to: none of that should be in there. like ld-st-with-shift it's now banned. > TODO(lkcl or whoever): fix that annoying hack! it needs to go entirely. there *cannot* be re-uses of opcodes specific to svp64 prefixing, it will be catastrophic. ld-st-with-shift showed why.
(In reply to Luke Kenneth Casson Leighton from comment #13) > (In reply to Jacob Lifshay from comment #12) > > > I found and updated the hack to detect if an instruction is SVP64 FFT or > > not, converting it to: rright. ok. so those detections use_svp64_fft mode go through to DecodeOut2 which has to know about all 3-in 2-out instructions having an (implicit) RS operand it *used* to be the case that the same use_svp64_fft was going to be for different 64-bit op meanings from 32-bit but that was dropped. but the implicit-RS detection has to stay.
partially through reallocating: got very distracted by building a program to copy XO values between tables. We'll need more than one 32-entry table per PO, since there are more instructions than I expected (my approximated count was waay off). 64 should be good though. https://git.libre-soc.org/?p=libreriscv.git;a=commitdiff;h=d07de16dd70b5d7a8f0cb75c7d510002c16c2628
that was a mess! i had to hand-edit the table to get it to insert the correct line-breaking in pandoc, as well as add a sequence of unicode symbols. it would be much better not to do that, they look a mess. no, converting to latex is not ok. the trick to getting the table to work was to add "\ space space" and it creates a line-break. no idea how or why.
I finished allocating opcodes for fptrans, I fixed a bunch of formatting bugs while I was at it. https://git.libre-soc.org/?p=libreriscv.git;a=blob;f=openpower/power_trans_ops.mdwn;h=a3d4a9d5fdc12dc46695a803bf7191f26e5c151c;hb=e6130994542942a629b8d1e723f746079e870581 I added those opcodes to the CSVs in openpower-isa.git. https://git.libre-soc.org/?p=openpower-isa.git;a=blobdiff;f=openpower/isatables/minor_59.csv;h=cdb66e40781d29a953511371e1e5ff33eed24577;hp=04a07b8a4183fbfd61fa80135a4ca67ee9fed12d;hb=4ca5b36dd4a72243a433e6371993667d4eedf916;hpb=76c6dd37fcb2c86ddf5458a9b1bf2284cacef7a1 https://git.libre-soc.org/?p=openpower-isa.git;a=blobdiff;f=openpower/isatables/minor_63.csv;h=493c6f02d3e081263c9208a4a322156a8cedbc07;hp=fd94ad3b056c5080b72b8382b957ef309f254c6a;hb=HEAD;hpb=9878bd52a8af9d504a84f1eef5b81f3ce5afcdff
(In reply to Jacob Lifshay from comment #17) > I finished allocating opcodes for fptrans, I fixed a bunch of formatting > bugs while I was at it. these look really good. fast progress.
reallocated fptrans to account for new min/max/fmod/remainder ops, had to move ffadds and fcbrt(s) to XO 1111100000 and 1000001100 respectively. didn't have time to fix all the test errors, afaict none of them are caused by my commits, they also failed in CI on master before I committed. allocated all new ops in power_trans_ops.mdwn, but didn't yet add the new ones to the csvs. I did change the csvs to account for moving ffadds and fcbrt(s) though.
I added the rest of the fptrans ops to the CSVs, https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;h=501f817ababe9f00ae0aa7cd31ffe9ca0b50683c then added them all to sv/trans/svp64.py, https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;h=893928fa12fe0d13b578d59075623194f17f3c00 then added all the needed helpers, renaming to match the Power ISA spec's bfp32_ADD, https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;h=3b33561320233b68eec014f111016ab8527045a6 then added all the pseudocode. https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;h=ecdbf851e42481ea43fff829458f0668f8d4076b giant regex to the rescue for all three, just take the appropriate lines from mdwn or csv and paste, then regex replace-all a few times -- done!
I implemented ultra-simple scalar unit tests for all fptrans operations. I had to adjust the inputs to avoid the simulator raising ValueError (fixing that is left for later, when we change the implementations to actually be correctly rounded and set flags correctly and stuff). https://git.libre-soc.org/?p=openpower-isa.git;a=blob;f=src/openpower/test/fptrans/fptrans_cases.py;h=0d05876c5cf252073a0c523d8472734d64098741;hb=71ab6a4049cd3771ab05651463f5274afa68f843 I added fp support to TestRunnerBase: https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;h=747aa80444f10d33507ff064c2e3479e14f1a176
(In reply to Jacob Lifshay from comment #21) > https://git.libre-soc.org/?p=openpower-isa.git;a=blob;f=src/openpower/test/ > fptrans/fptrans_cases.py;h=0d05876c5cf252073a0c523d8472734d64098741; > hb=71ab6a4049cd3771ab05651463f5274afa68f843 have a look at the test_caller dct and matrix tests, it is not necessary to put binary values in, there is an __float__ method and a helper instance can be set up and the conversion performed automatically.
(In reply to Luke Kenneth Casson Leighton from comment #22) > (In reply to Jacob Lifshay from comment #21) > > > https://git.libre-soc.org/?p=openpower-isa.git;a=blob;f=src/openpower/test/fptrans/fptrans_cases.py;h=0d05876c5cf252073a0c523d8472734d64098741;hb=71ab6a4049cd3771ab05651463f5274afa68f843 > > have a look at the test_caller dct and matrix tests, it is not necessary > to put binary values in, there is an __float__ method and a helper > instance can be set up and the conversion performed automatically. that may be true, however, putting binary values in was easier, so I did it.
not if the unit test computes the expected value another way using a subroutine or math library. direct explicit precomputed values like this are useless for understanding and manual doublechecking. please look at the style used in the dct and fft tests which have actual routines (understandable ones, from project nayuki) computing the expected values. i have no problem with non-exact values at this early stage, i.e doing the usual subtract abs less-than-tiny-value more accuracy can be added later iirc the comparison on dct is to 1e-6, it's such a huge accumulated error. it's fine as things stand, we need to move on, but do give some thought to that, esp. if it's quick to fix. and also gives the possibility of checking a few more values. if it's a lot of work let's move on and plan improvements later.
(In reply to Luke Kenneth Casson Leighton from comment #24) > if it's a lot of work let's move on and plan improvements > later. i'd say it isn't a giant amount of work, but it is enough work that imho we shouldn't replace expected values with actual code generating them as part of this bug. i generated the expected values using the simulator and a one-off bash script reading from /tmp/expected (i improved it to print fprs), so it wasn't that difficult.
(In reply to Jacob Lifshay from comment #25) > i'd say it isn't a giant amount of work, but it is enough work that imho we > shouldn't replace expected values with actual code generating them as part > of this bug. yep agreed. > i generated the expected values using the simulator and a one-off bash > script reading from /tmp/expected (i improved it to print fprs), so it > wasn't that difficult. ha cool, i'd forgotten about that :)
I added svp64 tests for fptrans ops https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;h=e3ebeaafbc0fc1864f05746c49c1b6b98b3e12ad I also extended SimState to read all fprs/gprs/cr-fields so ExpectedState checks all of them, and outputs all of them to /tmp/expected. https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;h=6a3b3f88a962b72a7d833102888bbf72aa71d441 I also added the FRT, FRA, RB case to sv_analysis and checked all fptrans instructions in sv_analysis's output (I checked some instructions in each of the 3 different groups of fptrans instructions and assumed that all other instructions in each group (just changing the mnemonic and XO) won't make sv_analysis treat them differently). https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;h=021bebb754c66c0b0ea0719a79a82f2e803cfe95 Now, the only stuff left to do is the binutils stuff, so I'm reassigning this to ghostmansd.
(In reply to Jacob Lifshay from comment #27) > I added svp64 tests for fptrans ops > > https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff; > h=e3ebeaafbc0fc1864f05746c49c1b6b98b3e12ad great! not too many i hope - running them all takes 15 mins already > I also extended SimState to read all fprs/gprs/cr-fields so ExpectedState > checks all of them, and outputs all of them to /tmp/expected. > > https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff; > h=6a3b3f88a962b72a7d833102888bbf72aa71d441 niice > I also added the FRT, FRA, RB case to sv_analysis and checked all fptrans > instructions in sv_analysis's output (I checked some instructions in each of > the 3 different groups of fptrans instructions and assumed that all other > instructions in each group (just changing the mnemonic and XO) won't make > sv_analysis treat them differently). no everything really is qualified by its register profile, how many src, how many dest. what the actual op is or its Form (X, D, MDS) is utterly irrelevant. XO likewise. the only reason sone of the EXTRA allocations are done through looking at the instruction name is, it's sometimes easier and there are special cases (LDST-with-update) > https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff; > h=021bebb754c66c0b0ea0719a79a82f2e803cfe95 yyeah "TODO" in the CSV files, not so hot :) it's relatively straightforward and obvious what needs to be done, there. > Now, the only stuff left to do is the binutils stuff, so I'm reassigning > this to ghostmansd. brilliant.
I followed Jacob's suggestion, except that I used insndb to generate binutils opcodes, as well as assembly and disassembly for tests. Please refer to the recent commits in dis branch. The result is sv_binutils_fptrans.py, which can be called with either opcodes or asm or dis argument; these generate the needed files respectively. I have yet to check it with binutils, but the output looks reasonable on the first glance. https://git.libre-soc.org/?p=openpower-isa.git;a=commit;h=ba64cb8d185380b854fdeb26cd1aeb07679dab96
Jacob, the most important thing is to check whether all of these are X form with Rc support. Examples are below. #define OP(x) ((((uint64_t)(x)) & 0x3f) << 26) #define X(op, xop) (OP (op) | ((((uint64_t)(xop)) & 0x3ff) << 1)) #define XRC(op, xop, rc) (X ((op), (xop)) | ((rc) & 1)) #define X_MASK XRC (0x3f, 0x3ff, 1) {"fpowns", XRC(59,876,0), X_MASK, SVP64, PPCVLE, {FRT, FRA, RB}} {"fminmagnum19", XRC(63,910,0), X_MASK, SVP64, PPCVLE, {FRT, FRA, FRB}}, {"flog10.", XRC(63,973,1), X_MASK, SVP64, PPCVLE, {FRT, FRB}},
P.S. Note that the code assumes 1 section opcode per instruction for now.
(In reply to Dmitry Selyutin from comment #29) > I followed Jacob's suggestion, except that I used insndb to generate > binutils opcodes, as well as assembly and disassembly for tests. Please > refer to the recent commits in dis branch. The result is > sv_binutils_fptrans.py, which can be called with either opcodes or asm or > dis argument; these generate the needed files respectively. I have yet to > check it with binutils, but the output looks reasonable on the first glance. > > https://git.libre-soc.org/?p=openpower-isa.git;a=commit; > h=ba64cb8d185380b854fdeb26cd1aeb07679dab96 looks fine to me, except PPCVLE looks wrong...isn't PPCVLE the flag for the VLE extension? also it's apparently deprecated: https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=opcodes/ChangeLog-2016;hb=fe39ffdc202f04397f31557f17170b40bc42b77a#l1485
(In reply to Dmitry Selyutin from comment #30) > Jacob, the most important thing is to check whether all of these are X form > with Rc support. all of fptrans should be that way -- sv/trans/svp64.py would be broken if there were any exceptions: https://git.libre-soc.org/?p=openpower-isa.git;a=blob;f=src/openpower/sv/trans/svp64.py;h=c9436e9c1498adf797af91a3ee04140500c902a2;hb=e3ebeaafbc0fc1864f05746c49c1b6b98b3e12ad#l317
(In reply to Jacob Lifshay from comment #32) > looks fine to me, except PPCVLE looks wrong...isn't PPCVLE the flag for the > VLE extension? also it's apparently deprecated: > https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=opcodes/ChangeLog- This PPCVLE there actually _disables_ PPCVLE support. And in enabled flags we have SVP64. Unless PPCVLE flag is disabled, one has to play with another VLE-specific table for any instruction declared.
The code needs more changes, stay tuned.
OK, done. I'll publish the patch for binutils soon.
(In reply to Dmitry Selyutin from comment #29) > I followed Jacob's suggestion, except that I used insndb to generate > binutils opcodes, as well as assembly and disassembly for tests. Please > refer to the recent commits in dis branch. The result is > sv_binutils_fptrans.py, which can be called with either opcodes or asm or > dis argument; brilliant. > these generate the needed files respectively. I have yet to > check it with binutils, but the output looks reasonable on the first glance. > > https://git.libre-soc.org/?p=openpower-isa.git;a=commit; > h=ba64cb8d185380b854fdeb26cd1aeb07679dab96 ahh can we please not do this: + return f"{entry.name} {','.join(map(str, values))}" i critically rely on colour syntax highlighting (left/right brain) and PHP-style embedded programming within strings is far harder to read and understand. if you absolutely must use f"" which is itself confusing please create a (short) temporary variable which is then inserted. this is infinitely better as it separates the layout from the variables/expressions being substituted and allows colour syntax highlighting to reduce neural brain activity required to identify "%s %s" % (entry.name, ','.join(map(str, values)))
(In reply to Luke Kenneth Casson Leighton from comment #37) > ahh can we please not do this: > > + return f"{entry.name} {','.join(map(str, values))}" > > i critically rely on colour syntax highlighting (left/right brain) > and PHP-style embedded programming within strings is far harder > to read and understand. well, if your syntax highlighter doesn't highlight the expressions in a f-string, you need a better syntax highlighter. for vim, apparently this one works: https://github.com/vim-python/python-syntax
It's just personal preference. This piece of code is not critical, and it's opinion-based part. I suggest we stick to discussing real technical stuff, not style preferences.
(In reply to Dmitry Selyutin from comment #39) > It's just personal preference. This piece of code is not critical, and it's > opinion-based part. I suggest we stick to discussing real technical stuff, > not style preferences. it's going to be a lot of work to remove them all rather than have them written in a readable style initially. if nothing else please do minimise the use of PHP-style embedding of actual code fragments inside strings, they really do take more neural effort to parse. (In reply to Jacob Lifshay from comment #38) > well, if your syntax highlighter doesn't highlight the expressions in a > f-string, you need a better syntax highlighter. gitweb does not cope and i do not want to be spending time writing improved gitweb syntax highlighting
Again: please do not operate with such wording like "readable". To me things like "%s%d%08x" (a, b, c) is less readable. It's like discussing these idiotic indents like I did. You're forcing to use your own favorite convention, and, whilst it's reasonable for critical components or code you wrote personally, it is a time waste for other parts.
This discussion has non-technical direction. Again: this is discussing personal preferences, like that indentation debate. Yes there might be some technical arguments, on both sides, and still, overall, this is just a major distraction.