The current unit tests are high-level, cumbersome and time-consuming to run (and can never give 100% coverage). Formal correctness proofs are needed which at least cover the low-level components, and move up to the higher levels - if practical.

https://git.libre-riscv.org/?p=ieee754fpu.git
https://smtlib.cs.uiowa.edu/papers/BTRW15.pdf
https://github.com/cvc5/cvc5/blob/main/INSTALL.rst
https://hal.inria.fr/hal-01522770/document
http://toccata.lri.fr/gallery/fp.en.html
https://www.lri.fr/~sboldo/research.html
https://www.cl.cam.ac.uk/~jrh13/hol-light/
https://www.cl.cam.ac.uk/~jrh13/fpv/index.html
https://www.cl.cam.ac.uk/~jrh13/papers/sfm.html

http://lists.libre-riscv.org/pipermail/libre-riscv-dev/2020-January/003658.html

make sure this is not lost. james, thank you, i think this went to spam for most people.

https://www.lri.fr/~filliatr/ftp/publis/caduceus-floats.pdf

some interesting leads and papers; it looks like there have been quite a few Formal Correctness Proof attempts on IEEE754 FP already. if true, that would make this more a task of applying those rather than attempting to create a new one from scratch. that approach is IMPORTANT because those pre-existing Proofs would be a "known-good". attempting to reinvent the wheel unnecessarily actually introduces the serious risk of not having provenance behind the newly-invented Proof.

(In reply to Luke Kenneth Casson Leighton from comment #2)
> https://www.lri.fr/~filliatr/ftp/publis/caduceus-floats.pdf

afaict from reading the first few pages, this particular paper isn't what we want, since it's focused on formal proofs of C programs based on pre-existing known-correct fp operations. what we want is instead formal proofs of those fp operations themselves. that said, its list of references could be useful.

> some interesting leads and papers, it looks like there
> have been quite a few Formal Correctness Proof attempts
> on IEEE754 FP already.

yes.

> which if true would make this more a task of applying
> those rather than attempting to create a new one from
> scratch.

the problem is that all the formal proofs of floating-point that I've seen are based on comparing the floating-point operation to doing the operation on infinite-precision real numbers and then rounding that to the correct floating-point number. The problem is that, because our formal proofs are written in nmigen, they have to go through yosys translation before getting to the smtlib2 language (the actual input language for the smt solvers), which supports real numbers (actually real algebraic numbers; it lacks transcendental ops/constants). yosys doesn't support anything but bitvectors and arrays of bitvectors -- no real numbers (don't confuse that with yosys's partial support for verilog reals, which works only in parameters and only as literal numbers -- verilog real is a double, not a real number).

> that approach is IMPORTANT because those pre-existing
> Proofs would be a "known-good".
>
> attempting to reinvent the wheel unnecessarily actually
> introduces the serious risk of not having provenance
> behind the newly-invented Proof.

from email with lkcl -- in response to concern about yosys not supporting real numbers (i corrected some spelling errors):

basically there are a number of potential approaches:

1) cheat. proofs are done on submodules in a verbatim fashion. "this module intends to do X, therefore test for X", with very little thought as to the bigger picture.

2) reimplement (effectively) the algorithms in "simple" (obvious) reviewable fashion and use straight "equals".

3) compute error-bounds (erf, error-function) and check results.

4) find someone else's papers or formal verification technique, e.g. https://www.lri.fr/~filliatr/ftp/publis/caduceus-floats.pdf

this should go on the mailing list. my guess is (3) will have lots of papers online about error-bounds of IEEE754 FP.

https://www.cl.cam.ac.uk/~jrh13/papers/tang.html

although quite old and archaic this looks very good; there must be more.

(In reply to Jacob Lifshay from comment #4)
> from email with lkcl -- in response to concern about yosys not supporting
> real numbers (i corrected some spelling errors):
> basically there are a number of potential approaches:
>
> 1) cheat. proofs are done on submodules in a verbatim fashion.
> "this module intends to do X therefore test for X with very little
> thought as to the bigger picture"

if we just rehashed the implementation verbatim in the formal correctness proof, then we'd be lying if we said the formal proof was any better than the original unknown-correctness nmigen code: we'd only be checking that the original code gives the same answers when run in a formal proof, not that those answers are actually correct.

> 2) reimplement (effectively) the algorithms in "simple" (obvious)
> reviewable fashion and use straight "equals"

that requires the fp code we have to implement at least one of the rounding modes correctly, otherwise it'll give different results than the formal proof's correct rounding, causing the equality to fail. this also kinda requires real numbers, to compute the correctly rounded result to compare against.

> 3) compute error-bounds (erf, error-function) and check results

that generally requires real numbers of some sort... and no, the error function isn't what you'd want here... that has to do with statistics of normal distributions, not with bounding the mathematical error in fp operations (unless that fp operation is an implementation of the erf function).

> 4) find someone else's papers or formal verification technique
> e.g. https://www.lri.fr/~filliatr/ftp/publis/caduceus-floats.pdf
>
> this should go on mailing list. my guess is (3) will have lots
> of papers online about error-bounds of IEEE754 FP.

(In reply to Jacob Lifshay from comment #3)
> to the smtlib2 language (the actual input language for the smt solvers --
> which supports real numbers (actually real algebraic numbers, it lacks
> transcendental ops/constants)). yosys doesn't support anything but
> bitvectors and arrays of bitvectors -- no real numbers (don't confuse it
> with yosys's partial support for verilog reals but only in parameters and
> only as literal numbers -- verilog real is a double, not a real number).

yeah, even if it existed we could in no way rely on the machine's FP number representation: it would be machine-dependent and there is just no way it could be trusted. even if soft-implemented (j hauser softfloat) that's still dicey.

(In reply to Luke Kenneth Casson Leighton from comment #5)
> https://www.cl.cam.ac.uk/~jrh13/papers/tang.html
>
> although quite old and archaic this looks very good,
> there must be more.

yup. unfortunately, afaict from reading the first few pages, this paper's basic idea is "let's use a formal proof program that lets us use real numbers, making it easier to check if our fp is correct, since we can just round those real numbers to the correct fp answers."

(In reply to Jacob Lifshay from comment #6)
> if we just rehashed the implementation verbatim in the formal correctness
> proof, then we'd be lying if we said the formal proof was any better than
> the original unknown-correctness nmigen code, we'd only be checking that the
> original code gives the same answers when run in a formal proof, not that
> those answers are actually correct.

the idea was even worse than that :) just do a real basic proof of "the normalisation class" and so on. it was not a serious suggestion, but one that is easy to do and that perhaps might work if the proofs are chained together then tested against other implementations.

> > 2) reimplement (effectively) the algorithms in "simple" (obvious)
> > reviewable fashion and use straight "equals"
>
> that requires the fp code we have to implement at least one of the rounding
> modes correctly, otherwise it'll give different results than the formal
> proof's correct rounding, causing the equality to fail. this also kinda
> requires real numbers to compute the correctly rounded result to compare
> against.

we discussed the flexible solution here: do not implement rounding yet, and come back to it later during a second phase. the proof would also not have rounding, and would also have it added later.

> > 3) compute error-bounds (erf, error-function) and check results
>
> that generally requires real numbers of some sort...

ah no. i mean in digital form. express the expected difference in digital form. similar to that Goldberg iteration checking thing.
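to make "expected difference in digital form" concrete: an error bound between two f32 results can be expressed as an integer distance in units-in-the-last-place (ulps), computed purely on bit patterns with no real numbers involved. a minimal stdlib-only sketch (these helper names are illustrative, not from the repo):

```python
import struct

def f32_bits(x: float) -> int:
    """IEEE754 binary32 bit pattern of a Python float."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def ordered(bits: int) -> int:
    """Map an f32 bit pattern onto a monotonically increasing integer
    line, so subtraction gives a distance in ulps. Negative floats
    (sign bit set) are folded below zero; +0.0 and -0.0 both map to 0."""
    if bits & 0x80000000:
        return 0x80000000 - bits
    return bits

def ulp_diff(a_bits: int, b_bits: int) -> int:
    """Distance between two f32 bit patterns, in ulps."""
    return abs(ordered(a_bits) - ordered(b_bits))
```

this kind of purely-integer formulation is exactly what survives translation to bitvector-only tools like yosys/smtlib2 bitvectors.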

(In reply to Luke Kenneth Casson Leighton from comment #7)
> (In reply to Jacob Lifshay from comment #3)
> > to the smtlib2 language (the actual input language for the smt solvers --
> > which supports real numbers (actually real algebraic numbers, it lacks
> > transcendental ops/constants)). yosys doesn't support anything but
> > bitvectors and arrays of bitvectors -- no real numbers (don't confuse it
> > with yosys's partial support for verilog reals but only in parameters and
> > only as literal numbers -- verilog real is a double, not a real number).
>
> yeah even if it existed we could in no way rely on FP number
> representation, it would be machine-dependent and there is
> just no way it could be trusted. even if soft-implemented
> (j hauser softfloat) that's still dicey.

smtlib2's fp types are machine-independent.

(In reply to Luke Kenneth Casson Leighton from comment #9)
> (In reply to Jacob Lifshay from comment #6)
> > > 2) reimplement (effectively) the algorithms in "simple" (obvious)
> > > reviewable fashion and use straight "equals"
> >
> > that requires the fp code we have to implement at least one of the
> > rounding modes correctly, otherwise it'll give different results than
> > the formal proof's correct rounding, causing the equality to fail. this
> > also kinda requires real numbers to compute the correctly rounded
> > result to compare against.
>
> we discussed the flexible solution here to not implement rounding yet and
> come back to it later during a second phase.

the problem is that the mere act of generating an answer encoded as a fp number *is rounding* (even if not following any defined rounding mode).

> the proof would also not have rounding and would also have it added later.

unless you can make the proof round in exactly the same way as the algorithm you're testing against, the proof will always have assertion failures where the rounding has a mismatch.

> > > 3) compute error-bounds (erf, error-function) and check results
> >
> > that generally requires real numbers of some sort...
>
> ah no. i mean in digital form. express the expected difference in
> digital form. similar to that Goldberg iteration checking thing.

that's done using arbitrary-precision rational numbers and/or bigints... another set of things smtlib2 supports that nmigen/yosys doesn't.
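one way to get a correctly-rounded RNE f32 multiply reference without any arbitrary-precision machinery: the exact product of two binary32 significands needs at most 48 bits, which fits in binary64's 53-bit significand, so a double multiply is exact and the single binary64-to-binary32 conversion is the one and only rounding step. a stdlib-only sketch (helper names are illustrative):

```python
import struct

def f32(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value
    (round-to-nearest-ties-to-even, which is what struct's 'f'
    conversion performs), returned widened back to a Python float."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def f32_mul_rne_ref(a: float, b: float) -> float:
    """Reference RNE f32 multiply. Inputs must already be exact f32
    values; their binary64 product is then exact (24+24 significand
    bits <= 53), so the only rounding is the final f64->f32 step --
    no double-rounding hazard."""
    assert f32(a) == a or a != a  # exact f32 (or NaN)
    assert f32(b) == b or b != b
    return f32(a * b)
```

a reference like this is useful for unit-testing, though as noted above it cannot be expressed in yosys's bitvector-only world.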

(In reply to Jacob Lifshay from comment #11)
> the problem is that the mere act of generating an answer encoded as a fp
> number *is rounding* (even if not following any defined rounding mode).

then that would default, temporarily, to the exact same rounding mode implemented in the current HDL. be flexible here. work with what is.

I created a demo f32 fmul round-to-nearest-ties-to-even formal proof in smtlib2 format, to see if current solvers actually have good enough fp support to be usable. I tried running it in z3 and it just kept running for 5-10min (after about 15 rounds of fixing bugs in my fmul implementation)... I canceled it for now and committed my work; I'll try running it longer tomorrow.

https://git.libre-soc.org/?p=ieee754fpu.git;a=commit;h=eb55e2592d0dba8565beff291aeee45bceffe9b8

    Author: Jacob Lifshay <programmerjake@gmail.com>
    Date:   Fri May 13 01:23:32 2022 -0700

        add fpmul_test.smt2 as a test to see if we should bother trying
        to wire-up smtlib2 real and fp support in yosys and nmigen

It's probably correct; z3 ran longer than I bothered to wait, and it usually stops relatively quickly if there's an error in the logic.

(In reply to Jacob Lifshay from comment #10)
> (In reply to Luke Kenneth Casson Leighton from comment #7)
> > yeah even if it existed we could in no way rely on FP number
> > representation, it would be machine-dependent and there is
> > just no way it could be trusted. even if soft-implemented
> > (j hauser softfloat) that's still dicey.
>
> smtlib2's fp types are machine independent.

Note smtlib2 is a specification for the format in which smt solvers read their inputs.

elaborating more: smtlib2's fp support is specifically designed to be machine-independent. for example, there is no way to determine which NaN a particular fp op returns; smtlib2 treats all NaNs identically. all you can do is see that you got a NaN, not which NaN. quoting the spec:

https://smtlib.cs.uiowa.edu/theories-FloatingPoint.shtml

> There is no function for converting from (_ FloatingPoint eb sb) to the
> corresponding IEEE 754-2008 binary format, as a bit vector (_ BitVec m) with
> m = eb + sb, because (_ NaN eb sb) has multiple, well-defined representations.
> Instead, an encoding of the kind below is recommended, where f is a term
> of sort (_ FloatingPoint eb sb):
>
>     (declare-fun b () (_ BitVec m))
>     (assert (= ((_ to_fp eb sb) b) f))

also, smtlib2 supports fp formats with arbitrary user-specified exponent and mantissa sizes, so it can also easily be used to test bf16 ops.

If you're worried the fp implementation they use might be buggy, the solution is utterly simple -- just run your tests with more than one sat solver; they shouldn't all have the exact same bugs. therefore, I think smtlib2's fp can be relied upon to be correct.
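the "multiple, well-defined NaN representations" point is easy to see at the bit level: any binary32 pattern whose exponent field is all ones and whose mantissa field is nonzero is a NaN, so millions of distinct patterns decode to the one logical NaN -- which is exactly why smtlib2 provides no fp-to-bitvector conversion. a stdlib-only sketch:

```python
def is_f32_nan(bits: int) -> bool:
    """True iff the binary32 bit pattern is a NaN: exponent field all
    ones (0xFF) and mantissa field nonzero. Sign bit is irrelevant --
    many patterns, one logical NaN."""
    exponent = (bits >> 23) & 0xFF
    mantissa = bits & 0x7FFFFF
    return exponent == 0xFF and mantissa != 0
```

per sign there are 2**23 - 1 such patterns; an all-ones exponent with a zero mantissa is an infinity, not a NaN.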

i researched how smtlib2 real and fp support could be added to yosys: it'll likely be a huge amount of work if we try to add signals of real/fp types, since yosys assumes everything is only ever a list of bits.

therefore, I think the best route is to instead add a new built-in cell type, where all inputs/outputs are still bitvectors/bools but internally it can do real/fp ops. it would have a parameter with the expression to evaluate (probably as a string). all of yosys would treat it as a black box, except for the smtlib2 backend, which would put the expression in the appropriate place in the generated output.

nmigen could be extended with helper classes/functions to make it easier to generate those expression cells, though that's not technically necessary.
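to make the "expression cell" idea a little more concrete, here is a purely hypothetical sketch (nothing here exists in yosys or nmigen; the function name, parameter shape and port names `a`/`b` are all invented for illustration) of generating the smtlib2 expression string such a cell's parameter might carry: ports remain bitvectors, and the expression reinterprets them as smtlib2 fp terms.

```python
def smtlib2_fp_expr_param(op: str, rm: str, eb: int, sb: int,
                          a: str = "a", b: str = "b") -> str:
    """Build the smtlib2 expression a hypothetical yosys 'expression
    cell' parameter could hold: bitvector ports a/b are reinterpreted
    as (_ FloatingPoint eb sb) via to_fp, then combined with fp op
    `op` under rounding mode `rm`. Sketch only -- the real cell/port
    naming would be decided during implementation."""
    to_fp = f"(_ to_fp {eb} {sb})"
    return f"({op} {rm} ({to_fp} {a}) ({to_fp} {b}))"
```

the smtlib2 backend would then splice this string into the generated model, while every other yosys pass sees only an opaque black-box cell with bitvector ports.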

(In reply to Jacob Lifshay from comment #15)
> therefore, I think the best route is to instead add a new built-in cell
> type,

to be clear, this is only if we decide we want to extend yosys with smtlib2's real/fp support.

(In reply to Jacob Lifshay from comment #13)
> It's probably correct, z3 ran longer than I bothered to wait, it usually
> stops relatively quickly if there's an error in the logic.

yes, mul is awful. fpadd would have been much faster.

(In reply to Jacob Lifshay from comment #14)
> Note smtlib2 is a specification for a format that sat solvers can read their
> inputs in.

oo, that's a really good sign.

> also, smtlib2 supports fp formats with arbitrary user-specified exponent and
> mantissa sizes, so it can also easily be used to test bf16 ops.

ah. and FP8, which someone from the IBM India Education Course wants to investigate.

(In reply to Jacob Lifshay from comment #15)
> therefore, I think the best route is to instead add a new built-in cell
> type, where all inputs/outputs are still bitvectors/bools but internally it
> can do real/fp ops. it would have a parameter with the expression to
> evaluate (probably as a string). all of yosys would treat it as a black box,
> except for the smtlib2 backend which would put the expression in the
> appropriate place in the generated output.

nice. elegant hack. means we need a yosys git repo but that's ok.

> nmigen could be extended with helper classes/functions to make it easier to
> generate those expression cells, though that's not technically necessary.

well, anything that makes life easier is good: an IT sysadmin rule i heard, if you do something more than even just once, script it :)

it's Friday and i have a meeting with the IBM India team today, i will see what the Professor thinks. he has students who could help.

https://theory.stanford.edu/~nikolaj/programmingz3.html#sec-floating-point-arithmetic

z3 apparently likewise has FPSort(). looking positive.

https://git.libre-soc.org/?p=ieee754fpu.git;a=commitdiff;h=eb55e2592d0dba8565beff291aeee45bceffe9b8

looks really interesting. do remember to keep commit message lines to below 80 chars, though: a single long line rolls scrollbars off the page, as the 1st line is by convention "special" (treated as a heading).

there is an advantage to using nmigen rather than explicit yosys: nmigen acts as a "cut-off" point by which we can represent the FP numbers in a stable format. if the proofs are written explicitly in the representation that yosys has (after patching), and that representation gets rejected upstream in favour of an alternative implementation, we're screwed: hundreds to thousands of one-line changes would be needed to conform to the new yosys representation. if however it's hidden behind a nmigen abstraction, then there's just one place that would need updating: nmigen.

(In reply to Luke Kenneth Casson Leighton from comment #17)
> (In reply to Jacob Lifshay from comment #13)
> > It's probably correct, z3 ran longer than I bothered to wait, it usually
> > stops relatively quickly if there's an error in the logic.
>
> yes, mul is awful. fpadd would have been much faster

I specifically chose fmul because it's slower... the idea is that if fmul is feasible, nearly everything we'd want to implement will be feasible.

> > nmigen could be extended with helper classes/functions to make it easier
> > to generate those expression cells, though that's not technically
> > necessary.
>
> well anything that makes life easier is good: an IT sysadmin rule i heard,
> if you do something more than even just once, script it :)
>
> it's Friday and i have a meeting with the IBM India team today, i will
> see what the Professor thinks. he has students who could help.

ok, though once we have the yosys parts working (I expect the only hard part to be modifying the smtlib2 backend), i expect the nmigen support to be very quick and easy -- around as fast as I can type.

(In reply to Jacob Lifshay from comment #21)
> (In reply to Luke Kenneth Casson Leighton from comment #17)
> > yes, mul is awful. fpadd would have been much faster
>
> I specifically chose fmul because it's slower...the idea is that if fmul is
> feasible, nearly everything we'd want to implement will be feasible.

I ran z3 for 84min on my phone; it still didn't finish. imho it's not feasible, at least for z3 4.8.17. imho we should try it on other smt engines; they may work faster.

one other option worth trying is figuring out how to export the hw into coq format, since coq allows interactive proofs -- aka it'll ask a human (or a file we write containing the expected answers) for help when it gets stuck. coq should work much better than most smt engines, since they tend to end up brute-forcing it when they're stuck, which only works for simple cases.

21 seconds.

$ z3 -version
Z3 version 4.8.5 - 64 bit

lkcl@fizzy:~/src/libresoc/ieee754fpu$ time z3 -smt2 fpmul_test.smt2
should return unsat:
sat
dumping values in case it returned sat:
((a #x80000000)
 (b #xff800000)
 (p #x7f800002)
 ((f32_to_fp a) (_ -zero 8 24))
 ((f32_to_fp b) (_ -oo 8 24))
 ((fp.mul RNE (f32_to_fp a) (f32_to_fp b)) (_ NaN 8 24))
 ((f32_to_fp (f32_mul_rne a b)) (_ +oo 8 24))
 ((f32_mul_nonzero_finite_rne a b) #x40800000)
 ((f32_mantissa_field a) #b00000000000000000000000)
 ((f32_mantissa_value a) #x800000)
 ((f32_mantissa_field b) #b00000000000000000000000)
 ((f32_mantissa_value b) #x800000)
 (product #x400000000000)
 (sign #b0)
 (exponent #x002)
 ((bv48_clz product) #x000000000001)
 ((ite (bvult #x00000000FFFF product) #x000000000000 #x000000000010) #x000000000000)
 (norm_product #x800000000000)
 (norm_exponent #x002)
 (subnormal_shift #xf80)
 (product_subnormal #x000000000001)
 (half_way_subnormal false)
 (is_even_subnormal true)
 (rounded_up_subnormal #x000000800001)
 (round_up_overflows_subnormal false)
 (do_round_up_subnormal false)
 (exponent_field_normal #x81)
 (half_way_normal false)
 (is_even_normal true)
 (rounded_up_normal #x800000800000)
 (round_up_overflows_normal false)
 (do_round_up_normal false))

real 0m21.179s
user 0m21.147s
sys 0m0.024s
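the `(bv48_clz product)` entry in the dump is a count-leading-zeros over the 48-bit significand product, used for normalisation. as a sanity check on the counterexample values, a plain-Python equivalent (a sketch of what bv48_clz plausibly computes; the actual smtlib2 definition lives in fpmul_test.smt2) would be:

```python
def bv48_clz(x: int) -> int:
    """Count leading zeros of a 48-bit value: how far the significand
    product must be shifted left to normalise it (bit 47 set)."""
    assert 0 <= x < 1 << 48
    if x == 0:
        return 48
    return 48 - x.bit_length()
```

applied to the dump's `product` value `#x400000000000` (bit 46 set) this gives 1, matching `((bv48_clz product) #x000000000001)`, and shifting left by 1 yields the dump's `norm_product` `#x800000000000`.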

(In reply to Luke Kenneth Casson Leighton from comment #23)
> 21 seconds.
>
> $ z3 -version
> Z3 version 4.8.5 - 64 bit
>
> lkcl@fizzy:~/src/libresoc/ieee754fpu$ time z3 -smt2 fpmul_test.smt2
> should return unsat:
> sat
> dumping values in case it returned sat:
> ((a #x80000000)
>  (b #xff800000)
>  (p #x7f800002)
>  ((f32_to_fp a) (_ -zero 8 24))
>  ((f32_to_fp b) (_ -oo 8 24))
>  ((fp.mul RNE (f32_to_fp a) (f32_to_fp b)) (_ NaN 8 24))
>  ((f32_to_fp (f32_mul_rne a b)) (_ +oo 8 24))

ahh, i forgot to add the check for inf * 0 -> NaN.

(In reply to Luke Kenneth Casson Leighton from comment #23)
> dumping values in case it returned sat:
> ((a #x80000000)

minus zero

> (b #xff800000)

times minus infinity

> (p #x7f800002)

would be plus infinity if not for the 2

(In reply to Luke Kenneth Casson Leighton from comment #25)
> (In reply to Luke Kenneth Casson Leighton from comment #23)
> > dumping values in case it returned sat:
> > ((a #x80000000)
>
> minus zero
>
> > (b #xff800000)
>
> times minus infinity
>
> > (p #x7f800002)
>
> would be plus infinity if not for the 2

yup, p is a NaN. p is the output of smtlib2's fp.mul. what isn't correct is the output of f32_mul_rne, which is the fmul that I wrote using bitvectors.
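the counterexample can be checked directly with the stdlib: #x80000000 decodes to -0.0, #xff800000 to -inf, IEEE754 requires 0 × inf = NaN, and #x7f800002 (all-ones exponent, nonzero mantissa) is indeed a NaN rather than +inf. a quick sketch:

```python
import math
import struct

def bits_to_f32(bits: int) -> float:
    """Decode a 32-bit IEEE754 binary32 bit pattern to a Python float."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]

a = bits_to_f32(0x80000000)  # -0.0
b = bits_to_f32(0xFF800000)  # -inf
p = bits_to_f32(0x7F800002)  # all-ones exponent, mantissa 2: a NaN
```

so z3's fp.mul correctly produced NaN for a * b, and the hand-written bitvector f32_mul_rne was the side giving the wrong +inf answer.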

(In reply to Luke Kenneth Casson Leighton from comment #23)
> 21 seconds.
>
> $ z3 -version
> Z3 version 4.8.5 - 64 bit

I compiled that version of z3 and it took 13.6s to run...

After that, I added the missing inf * 0 cases and am currently running it.

https://libre-soc.org/irclog/%23libre-soc.2022-05-14.log.html

(In reply to Jacob Lifshay from comment #27)
> I compiled that version of z3 and it took 13.6s to run...
>
> After that, I added the missing inf * 0 cases and am currently running it.

I ran both z3 and cvc5; I stopped both of them after 3hr... that's definitely too slow to be readily useful.

(In reply to Jacob Lifshay from comment #29)
> I ran both z3 and cvc5, I stopped both of them after 3hr...that's definitely
> too slow to be readily useful.

still running here (overnight). so: there is not yet a sense of "scale", i.e. this is still "unknown". to get a sense of scale of time, a "success" must be achieved.

the next test then would be to try FP16, BF16, or even FP8. FP8 would be so tiny (assume 4-bit exponent, 3-bit mantissa) that it should complete even brute-force. comparing then to FP16 and then BF16 should give a sense of whether the scale of completion time is exponential or otherwise.

it does not matter - at all - that brute-force "might be better". it also does not strictly matter that the ultimate desire is to prove FP32 and FP64: given that the FP classes are parameterised and it is identical code, i would say that if the FP classes *in general* succeed at lower bitwidths, then there is a reasonable chance they are correct at the larger ones. [this is not entirely true due to the number of pipeline/FSM stages varying, but their chain and sequence does not vary]
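the parameterisation argument works because an IEEE754-style format is fully determined by its exponent and significand widths, exactly as in smtlib2's (_ FloatingPoint eb sb). a stdlib-only sketch of a width-generic decoder (exact values via Fraction, so no host-fp rounding sneaks in; sb includes the hidden bit, as in smtlib2) -- the FP8 layout below is the assumed 4-bit-exponent/3-bit-mantissa one, not any particular FP8 standard:

```python
from fractions import Fraction

def decode_fp(bits: int, eb: int, sb: int):
    """Decode a bit pattern of an IEEE754-style (_ FloatingPoint eb sb)
    format into an exact Fraction, or 'inf'/'-inf'/'nan'. Handles
    normals, subnormals, zeros, infinities and NaNs."""
    mb = sb - 1                      # stored mantissa bits
    mant = bits & ((1 << mb) - 1)
    exp = (bits >> mb) & ((1 << eb) - 1)
    sign = -1 if (bits >> (eb + mb)) & 1 else 1
    bias = (1 << (eb - 1)) - 1
    if exp == (1 << eb) - 1:         # all-ones exponent
        if mant:
            return "nan"
        return "inf" if sign > 0 else "-inf"
    if exp == 0:                     # zero or subnormal
        return sign * Fraction(mant, 1 << mb) * Fraction(2) ** (1 - bias)
    return sign * (1 + Fraction(mant, 1 << mb)) * Fraction(2) ** (exp - bias)
```

with eb=8, sb=24 this is binary32; with eb=5, sb=11 it's fp16; with eb=8, sb=8 it's bf16; with eb=4, sb=4 it's the tiny FP8 above -- the same code, just as with the parameterised FP pipeline classes.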

(In reply to Luke Kenneth Casson Leighton from comment #30)
> the next test then would be to try FP16

I created a fp16 variant:
https://git.libre-soc.org/?p=ieee754fpu.git;a=blob;f=fp16mul_test.smt2;h=6ad9dc798eb34c72357c3d94d953de01493a5795;hb=c17c88ffba83c546ae439238fdd43adce3992f53

currently running z3 and cvc5, though cvc5's support for stuff other than f32/f64 is experimental and may be broken.

(In reply to Jacob Lifshay from comment #31)
> I created a fp16 variant:
> https://git.libre-soc.org/?p=ieee754fpu.git;a=blob;f=fp16mul_test.smt2;h=6ad9dc798eb34c72357c3d94d953de01493a5795;hb=c17c88ffba83c546ae439238fdd43adce3992f53
>
> currently running z3 and cvc5, though cvc5's support for stuff other than
> f32/f64 is experimental and may be broken.

success on both z3 and cvc5!

log-cvc5.txt:fp16mul_test.smt2:12.12: No set-logic command was given before this point.
log-cvc5.txt:fp16mul_test.smt2:12.12: cvc5 will make all theories available.
log-cvc5.txt:fp16mul_test.smt2:12.12: Consider setting a stricter logic for (likely) better performance.
log-cvc5.txt:fp16mul_test.smt2:12.12: To suppress this warning in the future use (set-logic ALL).
log-cvc5.txt:"should return unsat:"
log-cvc5.txt:unsat
log-cvc5.txt:"dumping values in case it returned sat:"
log-cvc5.txt:(error "Cannot get value unless after a SAT or UNKNOWN response.")
log-cvc5.txt:
log-cvc5.txt:real 106m35.297s
log-cvc5.txt:user 106m31.478s
log-cvc5.txt:sys 0m0.916s

log-z3.txt:should return unsat:
log-z3.txt:unsat
log-z3.txt:dumping values in case it returned sat:
log-z3.txt:(error "line 410 column 1: model is not available")
log-z3.txt:
log-z3.txt:real 77m36.487s
log-z3.txt:user 77m35.130s
log-z3.txt:sys 0m0.236s

so, i think smtlib2 fp will be useful, though it probably won't be able to prove everything; for the rest we'll probably have to rely on fp16 being proven correct, with fp32/fp64 assumed to be correct because we ran lots of tests and the same code is known to work for fp16.

I'd like to start working on adding support for it to yosys and nmigen, does that sound good? (should be a separate bug)

(In reply to Jacob Lifshay from comment #33)
> so, i think smtlib2 fp will be useful, though probably won't be able to
> prove everything, so there we'll probably have to rely on fp16 being proven
> correct and fp32/fp64 assumed to be correct because we ran lots of tests and
> the same code is known to work for fp16.

basically yes.

> I'd like to start working on adding support for it to yosys and nmigen, does
> that sound good? (should be a separate bug)

yeah, absolutely great. if you're happy to create a gitlab account then pull-requests can be submitted there. for yosys please base it off of 0.13; i'll create a repo on git.libre-soc.org shortly.

$ yosys --version
Yosys 0.13 (git sha1 8b1eafc3a, clang 9.0.1-12 -fPIC -Os)