650 – write rfc for OpenPower fpr <-> gpr moves/conversions

Status:	RESOLVED FIXED

Alias:	None

Product:	Libre-SOC's first SoC
Classification:	Unclassified
Component:	Specification
Version:	unspecified
Hardware:	All All

Importance:	--- enhancement
Assignee:	Jacob Lifshay

URL:	https://libre-soc.org/openpower/sv/in...

Depends on:	881
Blocks:

Reported:	2021-06-02 23:14 BST by Jacob Lifshay
Modified:	2023-04-27 16:24 BST
CC List:	4 users

See Also:	230 550 945 1015
NLnet milestone:	NLNet.2019.10.031.Video
total budget (EUR) for completion of task and all subtasks:	2000
budget (EUR) for this task, excluding subtasks' budget:	0
parent task for budget allocation:	230
child tasks for budget allocation:	881

Description Jacob Lifshay 2021-06-02 23:14:31 BST

Comment 1 Jacob Lifshay 2021-06-02 23:38:21 BST

See https://bugs.libre-soc.org/show_bug.cgi?id=230#c74
and https://bugs.libre-soc.org/show_bug.cgi?id=230#c76

Comment 2 Jacob Lifshay 2021-06-03 02:31:47 BST

Finished initial draft:
https://libre-soc.org/openpower/sv/int_fp_mv/

Comment 3 Luke Kenneth Casson Leighton 2021-06-03 10:27:57 BST

(In reply to Jacob Lifshay from comment #2)
> Finished initial draft:
> https://libre-soc.org/openpower/sv/int_fp_mv/

looks really good, Jacob.

comments: float-load-immediate.  hmm. i like it, the only issue
being it needs its own major opcode (or, a massive part of one).
the cost-benefit therefore had better be really, *really* good.

one possibility is that it's added as a 64-bit prefixed version,
where the constant is made full 32-bit, mostly from the prefix.

the other possibility is, i notice the constants have zero in
the last 4 bits, and yet still cover a pretty large useful range

fmvis f4, 0x800 # writes -0.0 to f4
fmvis f4, 0x3F8 # writes +1.0 to f4
fmvis f4, 0xBF8 # writes -1.0 to f4
fmvis f4, 0xBFC # writes -1.5 to f4
fmvis f4, 0x7FC # writes +qNaN to f4
fmvis f4, 0x7F8 # writes +Infinity to f4
fmvis f4, 0xFF8 # writes -Infinity to f4

12 bits dedicated to an immediate is still pretty large but with
there being only one destination this is doable:

    0-5   | 6-10 | 11-23 | 24-30 | 31
    Major | FRT  | UI    | XO    | Rc

this would then fit into one column of a Minor 19 (similar to addpcis)
with one bit spare (bit 24, set to either zero or 1)
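
The immediates in the fmvis examples above read as the top 12 bits of an IEEE 754 binary32 value (sign, 8 exponent bits, 3 mantissa bits) with the remaining 20 bits zero. A minimal Python sketch under that assumption; the helper name `fmvis_value` is made up for illustration and is not part of the draft:

```python
import math
import struct

def fmvis_value(imm12: int) -> float:
    """Expand a 12-bit immediate into the binary32 value it denotes.

    Assumption: the immediate supplies the top 12 bits of a binary32
    (sign, 8 exponent bits, 3 mantissa bits); the low 20 bits are zero.
    """
    if not 0 <= imm12 <= 0xFFF:
        raise ValueError("immediate must fit in 12 bits")
    bits = imm12 << 20
    # Reinterpret the 32-bit pattern as a float, without any conversion.
    return struct.unpack(">f", struct.pack(">I", bits))[0]

for imm in (0x800, 0x3F8, 0xBF8, 0xBFC, 0x7FC, 0x7F8, 0xFF8):
    print(f"fmvis f4, {imm:#05x} -> {fmvis_value(imm)!r}")
```

Each immediate listed in the comment decodes to the value given next to it, which is what makes a short immediate attractive here.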

Comment 4 Jacob Lifshay 2021-06-03 10:46:36 BST

(In reply to Luke Kenneth Casson Leighton from comment #3)
> (In reply to Jacob Lifshay from comment #2)
> > Finished initial draft:
> > https://libre-soc.org/openpower/sv/int_fp_mv/
> 
> looks really good, Jacob.

:)

> comments: float-load-immediate.  hmm. i like it, the only issue
> being it needs its own major opcode (or, a massive part of one).
> the cost-benefit therefore had better be really, *really* good.
> 
> one possibility is that it's added as a 64-bit prefixed version,
> where the constant is made full 32-bit, mostly from the prefix.

that could also work... we need to beat fusing an int load immediate
with a GPR -> FPR move though, which is not easy. Also, 64-bit instructions can't be SVP64-prefixed.
> 
> the other possibility is, i notice the constants have zero in
> the last 4 bits, and yet still cover a pretty large useful range

We really should cover at least all ints -16.0 to 16.0 so that needs at least 3 mantissa bits, more bits are preferred. We don't need all 8 exponent bits -- we could probably just use a 7-bit exponent and have the instruction re-expand it if we need space -- it's about as expensive as a sign-extension. I don't want to drop down to 6 exponent bits though -- I'd like to be able to represent 2^31, 2^32, and 2^63 since those are somewhat common constants.
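
A quick way to sanity-check both claims. This is only a sketch: it assumes an IEEE 754 style layout (hidden leading 1, bias of 2**(k-1) - 1, all-ones exponent reserved for Infinity/NaN), which the draft does not fix for a compressed exponent. Both helper names are hypothetical.

```python
def needs_mantissa_bits(n: int) -> int:
    """Fraction bits needed to hold integer n exactly (hidden leading 1)."""
    n = abs(n)
    if n == 0:
        return 0
    # Trailing zeros are absorbed by the exponent, so strip them first.
    while n % 2 == 0:
        n //= 2
    # Everything below the leading 1 has to be stored in the fraction.
    return n.bit_length() - 1

def max_power_of_two(exp_bits: int) -> int:
    """Largest e such that 2**e is a normal number with this exponent width.

    Assumes IEEE 754 style: bias = 2**(exp_bits - 1) - 1, and the
    all-ones exponent is reserved for Infinity/NaN.
    """
    bias = 2 ** (exp_bits - 1) - 1
    return (2 ** exp_bits - 2) - bias

print(max(needs_mantissa_bits(n) for n in range(-16, 17)))
print({k: max_power_of_two(k) for k in (6, 7, 8)})
```

Under those assumptions the integers -16 to 16 need at most 3 fraction bits, a 6-bit exponent tops out at 2^31 (so 2^32 and 2^63 are lost), and a 7-bit exponent reaches exactly 2^63.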
> 
> fmvis f4, 0x800 # writes -0.0 to f4
> fmvis f4, 0x3F8 # writes +1.0 to f4
> fmvis f4, 0xBF8 # writes -1.0 to f4
> fmvis f4, 0xBFC # writes -1.5 to f4
> fmvis f4, 0x7FC # writes +qNaN to f4
> fmvis f4, 0x7F8 # writes +Infinity to f4
> fmvis f4, 0xFF8 # writes -Infinity to f4
> 
> 12 bits dedicated to an immediate is still pretty large but with
> there being only one destination this is doable:
> 
>     0-5   | 6-10 | 11-23 | 24-30 | 31
>     Major | FRT  | UI    | XO    | Rc

Rc?! who needs that on a load immediate?! -- one more bit of mantissa...

Comment 5 Luke Kenneth Casson Leighton 2021-06-03 11:22:48 BST

(In reply to Jacob Lifshay from comment #4)
> (In reply to Luke Kenneth Casson Leighton from comment #3)

> > one possibility is that it's added as a 64-bit prefixed version,
> > where the constant is made full 32-bit, mostly from the prefix.
> 
> that could also work... we need to beat fusing an int load immediate
> with a GPR -> FPR move though, which is not easy.

the LD itself is a cost, however yes a double whammy of less space
is better.

> Also, 64-bit instructions can't be SVP64-prefixed.

they can... with the shift register that is shared with REMAP. it's just a pain.


> > 
> > the other possibility is, i notice the constants have zero in
> > the last 4 bits, and yet still cover a pretty large useful range
> 
> We really should cover at least all ints -16.0 to 16.0 so that needs at
> least 3 mantissa bits, more bits are preferred. We don't need all 8 exponent
> bits -- we could probably just use a 7-bit exponent and have the instruction
> re-expand it if we need space -- it's about as expensive as a
> sign-extension. I don't want to drop down to 6 exponent bits though -- I'd
> like to be able to represent 2^31, 2^32, and 2^63 since those are somewhat
> common constants.

ok.  hm.  i was thinking FP16 up until you mentioned that.

> > 
> > fmvis f4, 0x800 # writes -0.0 to f4
> > fmvis f4, 0x3F8 # writes +1.0 to f4
> > fmvis f4, 0xBF8 # writes -1.0 to f4
> > fmvis f4, 0xBFC # writes -1.5 to f4
> > fmvis f4, 0x7FC # writes +qNaN to f4
> > fmvis f4, 0x7F8 # writes +Infinity to f4
> > fmvis f4, 0xFF8 # writes -Infinity to f4
> > 
> > 12 bits dedicated to an immediate is still pretty large but with
> > there being only one destination this is doable:
> > 
> >     0-5   | 6-10 | 11-23 | 24-30 | 31
> >     Major | FRT  | UI    | XO    | Rc
> 
> Rc?! who needs that on a load immediate?! -- one more bit of mantissa...

:) sets CR1 as well... is that useful? probably not because the compiler
can statically determine what to do.

and i realised, bit 24 can also be used, so that's 14 bits.

Comment 6 Luke Kenneth Casson Leighton 2021-06-03 12:23:59 BST

    0-5   | 6-10 | 11-24 | 25-30 | 31
    Major | FRT  | UI    | XO    | UI0

if two columns of Minor 19 are used it becomes possible
to fit the entire BF16

Comment 7 Luke Kenneth Casson Leighton 2021-06-03 12:26:58 BST

(In reply to Luke Kenneth Casson Leighton from comment #6)
>     0-5   | 6-10 | 11-24 | 25-30 | 31
>     Major | FRT  | UI    | XO    | UI0
> 
> if two columns of Minor 19 are used it becomes possible
> to fit the entire BF16

nope. wrong.

    0-5   | 6-10 | 11-25 | 26-30 | 31
    Major | FRT  | UI    | XO    | UI0

* 26-30 is five. top half of XO like in addpcis
* 11-25 is fifteen
* plus bit 31 is sixteen bits.

no need for 2 columns, just one will do
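
The field layout above can be sketched as a pack/unpack pair, using Power ISA bit numbering (bit 0 is the most significant bit of the 32-bit word). Everything beyond the field boundaries is an assumption for illustration: the function names, the choice to put the least-significant immediate bit in UI0, major opcode 19 (taken from the "Minor 19" remark), and the placeholder XO value.

```python
def field(value: int, first: int, last: int) -> int:
    """Place value into bits first..last of a 32-bit word (bit 0 = MSB)."""
    width = last - first + 1
    if not 0 <= value < (1 << width):
        raise ValueError(f"value does not fit in bits {first}-{last}")
    return value << (31 - last)

def encode_fmvis(frt: int, imm16: int, xo: int, major: int = 19) -> int:
    """Pack Major | FRT | UI | XO | UI0 into one instruction word.

    Assumption: the low bit of imm16 goes into UI0 (bit 31) and the
    upper 15 bits go into UI (bits 11-25).
    """
    return (field(major, 0, 5) | field(frt, 6, 10)
            | field(imm16 >> 1, 11, 25) | field(xo, 26, 30)
            | field(imm16 & 1, 31, 31))

def decode_imm16(insn: int) -> int:
    """Recover the 16-bit immediate from UI (bits 11-25) and UI0 (bit 31)."""
    ui = (insn >> (31 - 25)) & 0x7FFF
    return (ui << 1) | (insn & 1)

# 0x3F80 is the BF16 pattern for +1.0; the XO value is a placeholder.
word = encode_fmvis(frt=4, imm16=0x3F80, xo=0b00011)
print(f"{word:#010x}", hex(decode_imm16(word)))
```

The five field widths (6 + 5 + 15 + 5 + 1) add up to 32, which is the point of the correction in this comment: a full 16-bit immediate fits in a single column.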

Comment 8 Luke Kenneth Casson Leighton 2021-06-03 12:38:15 BST

|  0-5   | 6-10 | 11-15  | 16-25 | 26-30 | 31 |
|--------|------|--------|-------|-------|----|
|  Major | RT   | //Mode | FRA   | XO    | Rc |
|  Major | FRT  | //Mode | RA    | XO    | Rc |

ha, that looks all right for FP/int cvt.  meshes with the existing
format, with the Mode field fitting into RB.

unfortunately, i bet you that no illegal instruction exception is
raised in the existing fcfids (etc) which means we can't fit into
the existing XO with bits 11-15 being all zero.

if that's the case an entirely new XO set will be needed.
i'd strongly suggest reserved encodings for all 5 bits
11-15 throw an illegal instruction exception. this allows
software emulation of future Mode variants if there are any.

Comment 9 Luke Kenneth Casson Leighton 2021-06-03 18:35:49 BST

i started adding in crossreferences to the equivalent Power ISA operations
fcfids etc then got totally confused.

jacob can you add a full table listing modes on one axis, operation on
the other, and place the equivalent v3.0B operation(s) in each relevant cell?

perhaps no need to put the VSX or LD/ST in it as well which would get FRT or FRS into the right INT reg, leave that implicit?

Comment 10 Jacob Lifshay 2021-06-03 19:05:42 BST

fix formatting: luke you had mistakenly unindented the list of reasons to have each FP -> Int conversion mode, I reindented those so they are correctly grouped under the appropriate list entry.

Also:
commit bb9a2dcc50b656e07accaf1036edb3607ea82f6c (HEAD -> master, origin/master, origin/HEAD)
Author: Jacob Lifshay <programmerjake@gmail.com>
Date:   Thu Jun 3 10:58:53 2021 -0700

    bitwise moves never set exceptions or mess with FPSCR

diff --git a/openpower/sv/int_fp_mv.mdwn b/openpower/sv/int_fp_mv.mdwn
index f28da1fe..dfb93c1d 100644
--- a/openpower/sv/int_fp_mv.mdwn
+++ b/openpower/sv/int_fp_mv.mdwn
@@ -129,7 +129,7 @@ move a 32-bit float from a GPR to a FPR, just copying bits. Converts the
 32-bit float in `RA` to a 64-bit float, then writes the 64-bit float to
 `FRT`.

-TODO: Rc=1 variants? also, any exceptions or FPSCR bits set?
+TODO: Rc=1 variants?

 ### Float load immediate (kinda a variant of `fmvfg`)

Comment 11 Luke Kenneth Casson Leighton 2021-06-03 20:45:37 BST

(In reply to Jacob Lifshay from comment #10)
> fix formatting: luke you had mistakenly unindented the list of reasons to
> have each FP -> Int conversion mode, I reindented those so they are
> correctly grouped under the appropriate list entry.

hmm, i did that because it wasn't clear, because of the spaces in between.
this would indicate that some descriptive text or headings, or other reorganisation might be in order.
 
> Also:
> commit bb9a2dcc50b656e07accaf1036edb3607ea82f6c (HEAD -> master,
> origin/master, origin/HEAD)
> Author: Jacob Lifshay <programmerjake@gmail.com>
> Date:   Thu Jun 3 10:58:53 2021 -0700
> 
>     bitwise moves never set exceptions or mess with FPSCR

are we absolutely certain of that? because if so it needs to be explicitly
stated, rather than leaving it "unstated".

otherwise people will ask during the review.

something like:

"this bitwise move does not raise exceptions nor alter FPSCR or other status flags"

Comment 12 Jacob Lifshay 2021-06-03 20:56:49 BST

(In reply to Luke Kenneth Casson Leighton from comment #11)
> (In reply to Jacob Lifshay from comment #10)
> > fix formatting: luke you had mistakenly unindented the list of reasons to
> > have each FP -> Int conversion mode, I reindented those so they are
> > correctly grouped under the appropriate list entry.
> 
> hmm, i did that because it wasn't clear, because of the spaces in between.

Well, it's perfectly clear to me in the rendered markdown (which is the part we should care about). The blank lines are just so it doesn't get all crammed onto one line when rendered.

> this would indicate that some descriptive text or headings, or other
> reorganisation might be in order.
>  
> > Also:
> > commit bb9a2dcc50b656e07accaf1036edb3607ea82f6c (HEAD -> master,
> > origin/master, origin/HEAD)
> > Author: Jacob Lifshay <programmerjake@gmail.com>
> > Date:   Thu Jun 3 10:58:53 2021 -0700
> > 
> >     bitwise moves never set exceptions or mess with FPSCR
> 
> are we absolutely certain of that?

Yes. It seems obvious to me, since it's just a bitwise copy (like fmv). FP exceptions and FPSCR only affect things where actual arithmetic/comparison/etc. operations are performed.

> because if so it needs to be explicitly
> stated, rather than leaving it "unstated".

Go ahead and add that if you like.
> 
> otherwise people will ask during the review.

Ok, though I'd guess they probably won't, since it's just a bitwise move.

> something like:
> 
> "this bitwise move does not raise exceptions nor alter FPSCR or other status
> flags"

Comment 13 Jacob Lifshay 2021-06-03 22:00:34 BST

(In reply to Luke Kenneth Casson Leighton from comment #9)
> i started adding in crossreferences to the equivalent Power ISA operations
> fcfids etc then got totally confused.
> 
> jacob can you add a full table listing modes on one axis, operation on
> the other, and place the equivalent v3.0B operation(s) in each relevant cell?
> 
> perhaps no need to put the VSX or LD/ST in it as well which would get FRT or
> FRS into the right INT reg, leave that implicit?

I added the code needed to emulate Rust and JavaScript conversion semantics

Comment 14 Luke Kenneth Casson Leighton 2021-06-03 22:20:40 BST

(In reply to Jacob Lifshay from comment #12)
> (In reply to Luke Kenneth Casson Leighton from comment #11)
> > (In reply to Jacob Lifshay from comment #10)
> > > fix formatting: luke you had mistakenly unindented the list of reasons to
> > > have each FP -> Int conversion mode, I reindented those so they are
> > > correctly grouped under the appropriate list entry.
> > 
> > hmm, i did that because it wasn't clear, because of the spaces in between.
> 
> Well, it's perfectly clear to me in the rendered markdown (which is the part
> we should care about). The blank lines are just so it doesn't get all
> crammed onto one line when rendered.
> 
> > this would indicate that some descriptive text or headings, or other
> > reorganisation might be in order.
> >  
> > > Also:
> > > commit bb9a2dcc50b656e07accaf1036edb3607ea82f6c (HEAD -> master,
> > > origin/master, origin/HEAD)
> > > Author: Jacob Lifshay <programmerjake@gmail.com>
> > > Date:   Thu Jun 3 10:58:53 2021 -0700
> > > 
> > >     bitwise moves never set exceptions or mess with FPSCR
> > 
> > are we absolutely certain of that?
> 
> Yes. It seems obvious to me, since it's just a bitwise copy (like fmv). FP
> exceptions and FPSCR only affect things where actual
> arithmetic/comparison/etc. operations are performed.
> 
> > because if so it needs to be explicitly
> > stated, rather than leaving it "unstated".
> 
> Go ahead and add that if you like.
> > 
> > otherwise people will ask during the review.
> 
> Ok, though I'd guess they probably won't, since it's just a bitwise move.

ohh, they'll ask. specifications are... excruciating.  all possible
ambiguities or potential questions have to be explicitly answered,
leaving no possibility that people *might* misunderstand or, because
it's not spelled out, do exactly what they are not supposed to do but
*you* thought it wasn't "necessary" to prohibit.

sigh

> 
> > something like:
> > 
> > "this bitwise move does not raise exceptions nor alter FPSCR or other status
> > flags"

(In reply to Jacob Lifshay from comment #13)

> I added the code needed to emulate Rust and JavaScript conversion semantics

brilliant, that helps enormously. 
https://git.libre-soc.org/?p=libreriscv.git;a=commitdiff;h=b67fe6025a25da9d467213444f623d4f8f6d4abd

EEK!  that's AWFUL!  and that's replaceable by ONE instruction?
holy cow.  i'm adding the c++ code as well, that should really
hit home.

Comment 15 Luke Kenneth Casson Leighton 2021-06-03 22:34:44 BST

nuts, hit send too soon.

https://git.libre-soc.org/?p=libreriscv.git;a=commitdiff;h=caa35bce868af3880119aa4cacbd520b23eaae34

what we think is obvious, is not.  the list is "note-form", i.e.
it's a map - a reminder - to *you* - of what is obvious *to you*.

unfortunately, with no explicit context, the indented bullet-list
makes no sense (to me, who does not know the context), and
consequently it's *definitely* not going to make sense to someone
who has even less context.

in particular, the "standard Integer -> FP conversion": which standard
is that? it needs to be explicitly listed.

then, now that each is moved to their own sub-section, each piece
which was previously a statement is no longer a complete sentence:
a paragraph is needed which introduces them each.

writing standards is hard!

Comment 16 Jacob Lifshay 2021-06-03 22:40:47 BST

(In reply to Luke Kenneth Casson Leighton from comment #15)
> nuts, hit send too soon.
> 
> https://git.libre-soc.org/?p=libreriscv.git;a=commitdiff;
> h=caa35bce868af3880119aa4cacbd520b23eaae34
> 
> what we think is obvious, is not.  the list is "note-form", i.e.
> it's a map - a reminder - to *you* - of what is obvious *to you*.

it's intended to be a short summary/motivating explanation. The full semantics are in the instruction descriptions.
> 
> unfortunately, with no explicit context, the indented bullet-list
> makes no sense (to me, who does not know the context), and
> consequently it's *definitely* not going to make sense to someone
> who has even less context.
> 
> in particular, the "standard Integer -> FP conversion": which standard
> is that? it needs to be explicitly listed.

The standard used by nearly all programming languages and cpus: IEEE 754.

> then, now that each is moved to their own sub-section, each piece
> which was previously a statement is no longer a complete sentence:
> a paragraph is needed which introduces them each.
> 
> writing standards is hard!

:)

Comment 17 Jacob Lifshay 2021-06-03 22:49:43 BST

one other formatting thing:
text should not go on the same line as the opening triple-backtick, that is reserved for the language the code block should be syntax-highlighted in:
https://www.markdownguide.org/extended-syntax/#syntax-highlighting

https://git.libre-soc.org/?p=libreriscv.git;a=blob;f=openpower/sv/int_fp_mv/appendix.mdwn;h=407a92bc6abfefaad2fd72ee55184ebe4afd3ba2;hb=a3f50713bca3f5c4b0544d90921df1b56f1f84b1#l5

Comment 18 Luke Kenneth Casson Leighton 2021-06-03 22:50:51 BST

(In reply to Jacob Lifshay from comment #16)
> (In reply to Luke Kenneth Casson Leighton from comment #15)
> > nuts, hit send too soon.
> > 
> > https://git.libre-soc.org/?p=libreriscv.git;a=commitdiff;
> > h=caa35bce868af3880119aa4cacbd520b23eaae34
> > 
> > what we think is obvious, is not.  the list is "note-form", i.e.
> > it's a map - a reminder - to *you* - of what is obvious *to you*.
> 
> it's intended to be a short summary/motivating explanation. The full
> semantics are in the instruction descriptions.

they need words as well... sigh

> > 
> > unfortunately, with no explicit context, the indented bullet-list
> > makes no sense (to me, who does not know the context), and
> > consequently it's *definitely* not going to make sense to someone
> > who has even less context.
> > 
> > in particular, the "standard Integer -> FP conversion": which standard
> > is that? it needs to be explicitly listed.
> 
> The standard used by nearly all programming languages and cpus: IEEE 754.

okaay, now i have context.  so i am adding this:

 index c497f6f7..5d33ef43 100644
--- a/openpower/sv/int_fp_mv.mdwn
+++ b/openpower/sv/int_fp_mv.mdwn
@@ -34,8 +34,7 @@ If we're adding new Integer <-> FP conversion instructions, we may
 as well take this opportunity to modernise the instructions and make them
 well suited for common/important conversion sequences:
 
-* standard Integer -> FP conversion (**TODO, which standard?** can it
-  be described in words? how does it differ from the other "standards"?)
+* standard Integer -> FP IEEE754 conversion 
 * standard OpenPower FP -> Integer conversion (saturation with NaN
   converted to minimum valid integer)
 * Rust FP -> Integer conversion (saturation with NaN converted to 0)
@@ -54,9 +53,9 @@ the feature being proposed.
 
 ## standard Integer -> FP conversion
 
-TODO, explain this further
-
-- rounding mode read from FPSCR
+This conversion is outlined in the IEEE754 specification.  It is used
+by nearly all programming languages and CPUs.  In the case of OpenPOWER,
+the rounding mode is read from FPSCR
 
 # standard OpenPower FP -> Integer conversion

Comment 19 Jacob Lifshay 2021-06-03 23:21:44 BST

note, for FP -> Int, IEEE 754 only specifies the result for finite inputs that round to an integer in-range for the result integer type. The results for NaN/Infinite/out-of-range inputs are unspecified.

C/C++ specifies NaN/Infinite/out-of-range FP -> Int is Undefined Behavior.
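
For comparison, the two saturating modes listed in the page diff in comment 18 do define these cases. A rough Python model for a 32-bit signed result, assuming round-toward-zero (as `fctiwz` does); the function names are illustrative only:

```python
import math

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1

def fp_to_i32_saturating(x: float, nan_result: int) -> int:
    """Truncate toward zero, clamp to the int32 range, map NaN to nan_result."""
    if math.isnan(x):
        return nan_result
    if x >= INT32_MAX:      # also catches +Infinity
        return INT32_MAX
    if x <= INT32_MIN:      # also catches -Infinity
        return INT32_MIN
    return math.trunc(x)

def fp_to_i32_openpower(x: float) -> int:
    """Saturation, with NaN converted to the minimum valid integer."""
    return fp_to_i32_saturating(x, INT32_MIN)

def fp_to_i32_rust(x: float) -> int:
    """Saturation, with NaN converted to 0."""
    return fp_to_i32_saturating(x, 0)

for x in (3.7, -3.7, 1e10, -math.inf, math.nan):
    print(x, fp_to_i32_openpower(x), fp_to_i32_rust(x))
```

The two only differ for NaN inputs; every in-range input gives the result IEEE 754 specifies.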

Comment 20 Luke Kenneth Casson Leighton 2021-06-03 23:22:06 BST

hmmm, given that the "standard" way of doing fp conversion is not
equal to how OpenPOWER does it, i think it would be highly illustrative
to also include c as well, in the appendix
https://libre-soc.org/openpower/sv/int_fp_mv/appendix/

i'm assuming there that c implements "standard" IEEE754.

added:


toInt32(double):
        fctiwz 1,1
        addi 9,1,-16
        stfiwx 1,0,9
        lwz 3,-16(1)
        extsw 3,3
        blr
        .long 0
        .byte 0,9,0,0,0,0,0,0

which is still pretty awful.  5 instructions including a LD/ST, which
gets 2 other registers involved plus some local stack.

Comment 21 Luke Kenneth Casson Leighton 2021-06-04 02:51:35 BST

jacob what does rust have to do with java, llvm or SPIR-V? i cannot see the connection between these 4 languages in the words used in the page.

Comment 22 Jacob Lifshay 2021-06-04 02:54:03 BST

(In reply to Luke Kenneth Casson Leighton from comment #21)
> jacob what does rust have to do with java, llvm or SPIR-V? i cannot see the
> connection between these 4 languages in the words used in the page.

They all need the fp -> int semantics I labeled "Rust conversion semantics"
https://libre-soc.org/openpower/sv/int_fp_mv/#fp-to-int-rust-conversion-semantics

Comment 23 Luke Kenneth Casson Leighton 2021-06-04 02:55:10 BST

(In reply to Jacob Lifshay from comment #19)
> note, for FP -> Int, IEEE 754 only specifies the result for finite inputs
> that round to an integer in-range for the result integer type. The results
> for NaN/Infinite/out-of-range inputs is unspecified.
> 
> C/C++ specifies NaN/Infinite/out-of-range FP -> Int is Undefined Behavior

which is as it should be.  it means, "don't do it! and if you do, you get to sort it out". i.e. c/c++ matches IEEE754. makes flags / exceptions for detecting when that happens pretty important though.

Comment 24 Jacob Lifshay 2021-06-04 02:57:48 BST

(In reply to Jacob Lifshay from comment #22)
> (In reply to Luke Kenneth Casson Leighton from comment #21)
> > jacob what does rust have to do with java, llvm or SPIR-V? i cannot see the
> > connection between these 4 languages in the words used in the page.
> 
> They all need the fp -> int semantics I labeled "Rust conversion semantics"
> https://libre-soc.org/openpower/sv/int_fp_mv/#fp-to-int-rust-conversion-
> semantics

I fixed the section headers to clarify the grouping

Comment 25 Luke Kenneth Casson Leighton 2021-06-04 02:59:01 BST

(In reply to Jacob Lifshay from comment #22)
> (In reply to Luke Kenneth Casson Leighton from comment #21)
> > jacob what does rust have to do with java, llvm or SPIR-V? i cannot see the
> > connection between these 4 languages in the words used in the page.
> 
> They all need the fp -> int semantics I labeled "Rust conversion semantics"
> https://libre-soc.org/openpower/sv/int_fp_mv/#fp-to-int-rust-conversion-
> semantics

ok, then that should be spelled out in the paragraph.  can you please add that, something like, "all of these languages listed below have the exact same conversion behaviour"

then it is clear.

otherwise it looks like rust is *implemented* in or by llvm, java, and SPIRV.

see how things have to be absolutely spelled out? no possibility for
english language ambiguity.

Comment 26 Jacob Lifshay 2021-06-04 03:47:17 BST

(In reply to Luke Kenneth Casson Leighton from comment #23)
> (In reply to Jacob Lifshay from comment #19)
> > note, for FP -> Int, IEEE 754 only specifies the result for finite inputs
> > that round to an integer in-range for the result integer type. The results
> > for NaN/Infinite/out-of-range inputs is unspecified.
> > 
> > C/C++ specifies NaN/Infinite/out-of-range FP -> Int is Undefined Behavior
> 
> which is as it should be.  it means, "don't do it! and if you do, you get to
> sort it out".

nah, more like the compiler assumes that out-of-range fp -> int conversions will never occur. if they do occur, the compiler has license to format your hard drive or publish your ssh private keys or (more likely) delete that entire branch of your program.

> i.e. c/c++ matches IEEE754.
C/C++ explicitly says it's Undefined Behavior (which has a specific meaning). IEEE 754 just doesn't say.

> makes flags / exceptions for
> detecting when that happens pretty important though.

Comment 27 Jacob Lifshay 2021-06-04 04:10:13 BST

(In reply to Luke Kenneth Casson Leighton from comment #25)
> (In reply to Jacob Lifshay from comment #22)
> > (In reply to Luke Kenneth Casson Leighton from comment #21)
> > > jacob what does rust have to do with java, llvm or SPIR-V? i cannot see the
> > > connection between these 4 languages in the words used in the page.
> > 
> > They all need the fp -> int semantics I labeled "Rust conversion semantics"
> > https://libre-soc.org/openpower/sv/int_fp_mv/#fp-to-int-rust-conversion-
> > semantics
> 
> ok, then that should be spelled out in the paragraph.  can you please add
> that, something like, "all of these languages listed below have the exact
> same conversion behaviour"
> 
> then it is clear.

clarified.

> otherwise it looks like rust is *implemented* in or by llvm, java, and SPIRV.
> 
> see how things have to be absolutely spelled out? no possibility for
> english language ambiguity.

If the section is just explaining why we want instructions, and is not actually part of the specification of how implementations should or should not behave, I see no problem with some ambiguity as part of making it short enough to be easy to read.

Comment 28 Jacob Lifshay 2021-06-04 04:29:04 BST

(In reply to Jacob Lifshay from comment #26)
> (In reply to Luke Kenneth Casson Leighton from comment #23)
> > (In reply to Jacob Lifshay from comment #19)
> > > note, for FP -> Int, IEEE 754 only specifies the result for finite inputs
> > > that round to an integer in-range for the result integer type. The results
> > > for NaN/Infinite/out-of-range inputs is unspecified.
> > > 
> > > C/C++ specifies NaN/Infinite/out-of-range FP -> Int is Undefined Behavior
> > 
> > which is as it should be.  it means, "don't do it! and if you do, you get to
> > sort it out".
> 
> nah, more like the compiler assumes that out-of-range fp -> int conversions
> will never occur. if they do occur, the compiler has license to format your
> hard drive or publish your ssh private keys or (more likely) delete that
> entire branch of your program.

A contrived example of how UB from out-of-range fp -> int makes it format your hard drive

https://gcc.godbolt.org/z/eM1s78vTG

Comment 29 Luke Kenneth Casson Leighton 2021-06-04 12:49:24 BST

(In reply to Jacob Lifshay from comment #28)

> A contrived example of how UB from out-of-range fp -> int makes it format
> your hard drive
> 
> https://gcc.godbolt.org/z/eM1s78vTG

it's got to be better than INTERCAL
http://www.catb.org/~esr/intercal/ick.htm

COME FROM UNLESS, the program is permitted to delete all
evidence of its failure to undo the damage caused if
"unless" clause is true.

Comment 30 Luke Kenneth Casson Leighton 2021-06-09 23:39:12 BST

javascript rounding, being based on FP modulo, has me concerned it's CISC. any thoughts anyone?

Comment 31 Jacob Lifshay 2021-06-10 00:12:27 BST

(In reply to Luke Kenneth Casson Leighton from comment #30)
> javascript rounding, being based on FP modulo, has me concerned it's CISC.
> any thoughts anyone?

In hardware, it's just a barrel shifter and an optional negation. This is because the modulus is a power of 2.

Also, being CISC is not usually a good reason to leave out a specific operation: if it makes a common use-case run way faster, we'd be stupid not to include it.
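
To make the power-of-2 point concrete, here is a small Python model of the ECMAScript ToInt32 conversion (the function name is mine). Because the modulus is 2**32, only the low 32 bits of the truncated integer matter:

```python
import math

def js_to_int32(x: float) -> int:
    """ECMAScript ToInt32: truncate, wrap modulo 2**32, reinterpret as signed."""
    if math.isnan(x) or math.isinf(x):
        return 0
    # Python's % always returns a non-negative result for a positive modulus.
    wrapped = math.trunc(x) % 2**32
    # Values with the top bit set are negative when read as int32.
    return wrapped - 2**32 if wrapped >= 2**31 else wrapped

for x in (3.7, -3.7, 2.0**32 + 5, 2147483648.0, 1e300, math.nan):
    print(x, js_to_int32(x))
```

For very large inputs such as 1e300 the integer value is a multiple of 2**32, so the result is 0; in hardware that corresponds to the mantissa being shifted entirely out of the 32-bit window.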

Comment 32 Luke Kenneth Casson Leighton 2021-06-10 00:24:10 BST

(In reply to Jacob Lifshay from comment #31)

> In hardware, it's just a barrel shifter and an optional negation. This is
> because the modulus is a power of 2.

ok whew, power-2 is fine.  that's a relief.

l.

Comment 33 Jacob Lifshay 2022-05-26 08:39:09 BST

Lkcl, I noticed you renamed the Rust conversion semantics to Java conversion semantics. Imho that makes it more confusing, since those semantics only match Java for fp -> 32/64-bit integers. Java fp -> 8/16-bit integer conversions instead convert to a 32-bit integer and then truncate that 32-bit integer to 16/8 bits.

Rust saturates for all integer sizes, so is more consistent with what I want those instructions to do.

For example, Java converts 257.0 to the unsigned 8-bit value 1 (because 257 fits in a 32-bit integer, then the top 24 bits are removed to leave the value 1),
whereas Rust correctly saturates to 255.

Also, WebAssembly recently introduced saturating fp -> int conversions:
https://webassembly.github.io/spec/core/exec/numerics.html#op-trunc-sat-u
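
The 257.0 example above, as a short Python sketch. The helper names are made up; the Java-style path is modelled as "saturate to 32 bits, then keep the low 8 bits", which is how the comment describes it:

```python
import math

def saturate(x: float, lo: int, hi: int) -> int:
    """Truncate toward zero and clamp to [lo, hi]; NaN becomes 0."""
    if math.isnan(x):
        return 0
    if x >= hi:
        return hi
    if x <= lo:
        return lo
    return math.trunc(x)

def java_style_to_8bit(x: float) -> int:
    """Saturate to int32 first, then keep only the low 8 bits."""
    return saturate(x, -2**31, 2**31 - 1) & 0xFF

def rust_style_to_u8(x: float) -> int:
    """Saturate directly to the unsigned 8-bit range."""
    return saturate(x, 0, 255)

print(java_style_to_8bit(257.0), rust_style_to_u8(257.0))  # prints "1 255"
```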

Comment 34 Jacob Lifshay 2022-05-26 09:21:59 BST

one other idea: have a variant of fp->int that sets OV/OV32 (and SO) if the rounded fp value doesn't fit in the destination type or the input is NaN...would be very useful for the other webassembly fp->int conversions that are specified to trap in those cases. the proposed instruction wouldn't trap, it'd just set OV and return the saturated result. wasm engines can just branch on overflow if they want.

this would likely also be useful for fp->bigint conversions (like in python float->int) or fp->128-bit integer conversions so they can try the simple 64-bit conversion and if it sets overflow, then branch to the slow path.
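
A sketch of how software might use such a variant, in Python. `fp_to_i64_with_ov` stands in for the proposed instruction and is purely hypothetical (including the result value returned for NaN); the slow path here uses `math.frexp` just to have something concrete:

```python
import math
from typing import Tuple

INT64_MIN, INT64_MAX = -2**63, 2**63 - 1

def fp_to_i64_with_ov(x: float) -> Tuple[int, bool]:
    """Hypothetical OV-setting conversion: (saturated result, overflow flag)."""
    if math.isnan(x):
        return 0, True              # result value for NaN is an assumption
    if x >= 2.0**63:
        return INT64_MAX, True
    if x < -2.0**63:
        return INT64_MIN, True
    return math.trunc(x), False

def float_to_bigint(x: float) -> int:
    """Try the 64-bit conversion first; take the slow path only if OV is set."""
    result, ov = fp_to_i64_with_ov(x)
    if not ov:
        return result
    if math.isnan(x) or math.isinf(x):
        raise OverflowError("cannot convert NaN or Infinity to an integer")
    # Slow path: x == mantissa * 2**exponent with 0.5 <= |mantissa| < 1,
    # so mantissa * 2**53 is an exact integer and exponent - 53 is positive here.
    mantissa, exponent = math.frexp(x)
    return int(mantissa * 2.0**53) << (exponent - 53)

print(float_to_bigint(12345.9), float_to_bigint(2.0**100))
```

The common case costs one conversion plus one branch on overflow, which is the attraction of setting OV instead of trapping.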

Comment 35 Luke Kenneth Casson Leighton 2022-05-26 10:33:43 BST

(In reply to Jacob Lifshay from comment #33)
> Lkcl, I noticed you renamed the Rust conversion semantics to Java conversion
> semantics, 

yes: i find rust to be that irritating (that much hyped and obsessed over)
i'd rather we didn't mention it except in passing, where absolutely necessary.
also, if there's a language that has been around for longer and is more
commonly used i'd prefer it to be the one that's used as the reference.

also, when putting this to the OPF ISA WG they *may* flag up that java
and javascript are trademarks of oracle (javascript: 75026640),
although in these particular cases "fair use" is likely to cut it.
as in: we literally have no other means of referring to
"the semantics of javascript", and this is considered
acceptable under Trademark Law.

howeverrr... we maay have to invent terms (phrases) in order to refer
to these Trademarked languages to an absolute bare-minimum degree
(i.e. once, i.e. not as headings: "javascript semantics".)

turns out that rust is also trademarked
https://internals.rust-lang.org/t/rust-s-freedom-flaws/11533

and they're causing problems with it.

nope. i don't think it's safe to refer to *any* of these Trademarked
languages except once and only once.

how about:

* IEEE754 semantics as-is
* OpenPOWER semantics as-is
* saturated semantics replace java/rust
* modulo semantics (or, wrapping) replace javascript

> imho that makes it more confusing since those semantics only
> match Java for fp -> 32/64-bit integers, Java fp -> 8/16-bit integers
> instead convert to 32-bit integer and then truncate that 32-bit integer to
> 16/8-bit.

then that should have been mentioned! :)

it sounds to me like it's completely different language semantics,
which would justify a separate instruction.  or, at the very least,
an explicit mention of the difference.

 
> For example, Java converts 257.0 to the unsigned 8-bit value 1 (because 257
> fits in a 32-bit integer, then the top 24 bits are removed to leave the
> value 1),
> whereas Rust correctly saturates to 255.

that's *definitely* something to put into the page as being completely
different semantics
 
> Also, WebAssembly recently introduced saturating fp -> int conversions:
> https://webassembly.github.io/spec/core/exec/numerics.html#op-trunc-sat-u

urrr... another bandwagon. good as a reference to justify adding the
instructions though.


(In reply to Jacob Lifshay from comment #34)
> one other idea: have a variant of fp->int that sets OV/OV32 (and SO) if the
> rounded fp value doesn't fit in the destination type or the input is
> NaN...would be very useful for the other webassembly fp->int conversions
> that are specified to trap in those cases. the proposed instruction wouldn't
> trap, it'd just set OV and return the saturated result. wasm engines can
> just branch on overflow if they want.

yes, good idea.
 
> this would likely also be useful for fp->bigint conversions (like in python
> float->int) or fp->128-bit integer conversions so they can try the simple
> 64-bit conversion and if it sets overflow, then branch to the slow path.

yep, an exception's a bit overkill although i'm tempted to suggest it.

Comment 36 Luke Kenneth Casson Leighton 2022-05-26 11:18:45 BST

https://libre-soc.org/irclog/%23libre-soc.2022-05-26.log.html

Comment 37 Jacob Lifshay 2022-05-26 11:26:17 BST

Also:
> Effectively, fmvtgs is a macro-fusion of frsp fmvtg and therefore has the
> exact same exception and flags behaviour of frsp
(edit: word wrapping)

that's not correct since imho we want the semantics of stfs not frsp. stfs expects the input to already be f32 but in f64 form, and stores that f32 to memory. it never sets flags. frsp converts f64 to f32, then back to f64. fmvtg moves a f64 to a gpr. fusing frsp and fmvtg would produce a f64 in the gpr.

the f64 -> f32 conversion in fp stores is described in the spec v3.1 section 4.6.3. it isn't how frsp does it, it's designed to be simpler because it doesn't need to round correctly since the input should already be a valid f32 value as a f64.
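
To illustrate the difference, here is a Python sketch of the store-style conversion next to a properly rounded one. The bit selection is my reading of the single-precision store conversion (keep the sign and top exponent bit, then the low 7 exponent bits and top 23 fraction bits) and should be checked against the spec section cited above; only the path that needs no denormalization is modelled, and the function names are made up.

```python
import struct

def f64_bits(x: float) -> int:
    """Raw 64-bit pattern of a Python float."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]

def stfs_style_word(x: float) -> int:
    """Store-style f64 -> f32 bit selection (normal-range path only).

    Pure bit selection: nothing is rounded and no flags are involved.
    Inputs that would need denormalization are not handled here.
    """
    bits = f64_bits(x)
    return (((bits >> 62) & 0x3) << 30) | ((bits >> 29) & 0x3FFF_FFFF)

def rounded_word(x: float) -> int:
    """Round-to-nearest f64 -> f32 conversion, for comparison."""
    return struct.unpack(">I", struct.pack(">f", x))[0]

for value in (1.0, -1.5, 0.1):
    print(value, hex(stfs_style_word(value)), hex(rounded_word(value)))
```

For inputs that are already valid f32 values (1.0, -1.5) the two agree. For 0.1, which is not exactly representable as f32, the bit selection simply drops the low fraction bits while the rounded conversion rounds up, which is why the store-style path can be simpler.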

Comment 38 Luke Kenneth Casson Leighton 2022-05-26 12:22:13 BST

(In reply to Jacob Lifshay from comment #37)
> Also:
> > Effectively, fmvtgs is a macro-fusion of frsp fmvtg and therefore has the
> > exact same exception and flags behaviour of frsp
> (edit: word wrapping)

> the f64 -> f32 conversion in fp stores is described in the spec v3.1 section
> 4.6.3. it isn't how frsp does it, it's designed to be simpler because it
> doesn't need to round correctly since the input should already be a valid
> f32 value as a f64.

do make sure to update the page on that.

Comment 39 Luke Kenneth Casson Leighton 2022-05-26 12:27:31 BST

ah interesting, another standard popped up, v3.1 p265 book I section 6.6.1
http://www.open-std.org/jtc1/sc22/wg14/docs/c9x/floating-point/floating-point.ps.gz

C9X

Comment 40 Luke Kenneth Casson Leighton 2022-05-27 20:56:14 BST

alain i'm adding you cc on this one as we have some questions to ask of the
ISA WG. the most logical first instructions to propose, with a lot of
impact, are the two float-immediate ones
https://libre-soc.org/openpower/sv/int_fp_mv/#fmvis

we are not interested right now in v3.1 prefixes, but IBM might be.
could you ask at the next ISA WG meeting if anyone would like to
collaborate (publicly) on that?