Bug 550 - binutils support needed for svp64
Summary: binutils support needed for svp64
Status: CONFIRMED
Alias: None
Product: Libre-SOC's first SoC
Classification: Unclassified
Component: Source Code
Version: unspecified
Hardware: Other Linux
Importance: --- enhancement
Assignee: dmitry.selyutin
URL:
Depends on:
Blocks:
 
Reported: 2020-12-18 20:33 GMT by Luke Kenneth Casson Leighton
Modified: 2021-11-30 20:51 GMT

See Also:
NLnet milestone: NLNet.2019.10.Formal
total budget (EUR) for completion of task and all subtasks: 0
budget (EUR) for this task, excluding subtasks' budget: 0
parent task for budget allocation: 577
child tasks for budget allocation:
The table of payments (in EUR) for this task; TOML format:


Attachments
Ho ho ho! (4.67 KB, patch)
2020-12-26 02:14 GMT, Alexandre Oliva

Description Luke Kenneth Casson Leighton 2020-12-18 20:33:41 GMT
gnu as and objdump support is needed for the new svp64 encoding.

first recommended approach (simplest): a new "instruction"

     svp64 0xNNNNNN (or binary)

encoding would be as described in "Prefix Fields" starting with Major Opcode EXT01

https://libre-soc.org/openpower/sv/svp_rewrite/svp64/

subsequent encoding (TBD) would be:

    svp64 SUBVL=2,ew=8,rt=v,mask=r3

and finally allow that to be a "prefix" of instructions as (TBD)

     addi[r3] r64.w.vec2.v, 5

w: elwidth=32
v: RT is vector
vec2: SUBVL=2
[r3]: mask=r3

the opcodes are listed here, along with their prefixing info (autogenerated):
https://libre-soc.org/openpower/opcode_regs_deduped/
Comment 1 Alexandre Oliva 2020-12-21 19:08:46 GMT
I'm a little puzzled (not just because I can hardly make head from tail of the svp64 web page :-)

why bother with "svp64 0x..." syntax, if we already have .long?


as for making sense of the page.  I guess it must all make some sense if you have some vague notion of what the prefixes are supposed to accomplish, but that's not me.  I could use some examples, or pointers to earlier, more complete and self-contained docs that would give me some sense of what's supposed to be going on there.

not that I really need to be able to make sense of it before I can implement binutils changes, mind you; it just helps avoid silly mistakes, and wrong assumptions, and I figured I might be able to help validate the proposed design, if only I had the required background.  alas, I suppose I'm missing background on GPUs, ppc 3.1 opcodes, and the earlier simd design for risc-v
Comment 2 Luke Kenneth Casson Leighton 2020-12-21 20:09:51 GMT
(In reply to Alexandre Oliva from comment #1)
> I'm a little puzzled (not just because I can hardly make head from tail of
> the svp64 web page :-)
> 
> why bother with "svp64 0x..." syntax, if we already have .long?

yes jacob pointed that out... although... an "svp64 0xNNNNNN" instruction would help you to understand the "first phase": where the RM field fits.


> 
> as for making sense of the page.  I guess it must all make some sense if you
> have some vague notion of what the prefixes are supposed to accomplish, but
> that's not me.

SV - aka SimpleV - is a hardware for-loop around instructions.

that's it.  full stop.

here is some pseudocode that shows what that looks like, using ADD as an example:

https://git.libre-soc.org/?p=libreriscv.git;a=blob;f=simple_v_extension/simple_v_chennai_2018.tex;hb=HEAD#l190


>  I could use some examples, or pointers to earlier, more
> complete and self-contained docs that would give me some sense of what's
> supposed to be going on there.

this paragraph puts the above one-liner and the pseudocode into context:

https://git.libre-soc.org/?p=libreriscv.git;a=blob;f=simple_v_extension/specification.mdwn;hb=HEAD#l38


> not that I really need to be able to make sense of it before I can implement
> binutils changes, mind you; it just helps avoid silly mistakes, and wrong
> assumptions, and I figured I might be able to help validate the proposed
> design, if only I had the required background. 

appreciated.

> alas, I suppose I'm missing
> background on GPUs, ppc 3.1 opcodes, and the earlier simd design for risc-v

there was no SIMD ISA: SV is *categorically* and very specifically diametrically opposed to SIMD.

SIMD is considered harmful:
https://www.sigarch.org/simd-instructions-considered-harmful/

x86 expanded from 70 to *1400* instructions since 1978, thanks to SIMD (far, far more since adding AVX512).  SIMD is an O(N^6) opcode proliferation nightmare.

also we are not adding v3.1B opcodes (that is a separate discussion which requires OPF permission). the sole exclusive reason for using EXT01 is to get the "fitting in" with v3.1B 64 bit prefixing in a nondisruptive fashion that the OPF ISA WG should not have any objection to.


the sigarch article shows how RVV works.  SV is based on the exact same underlying principle: you have an instruction, you have a vector loop on that instruction, elements are computed based on that instruction.

full stop.

it's real simple.

VL in our case can be anywhere from 1 to 64.  *very rarely* it is permitted to be zero.

so how do we set this "VL" or vector length?

well, with an instruction of course.
https://libre-soc.org/openpower/sv/setvl/

and... err... then what?  well, none of the standard 32 bit scalar instructions do anything with it: they don't "understand" VL.

so we "Prefix" them.  this says, "hey you know that VL for-loop you want applied? well the next 32 bits contains the instruction to be smashed into that for-loop, oh and by the way here's some other random trash to chuck at the loop, such as predication, blah blah".

therefore, ultimately, we want this kind of syntax:

    setvl r3, r5, VL=4
    SUBVL=2, ELWIDTH=8 { add r5, r5, r2 }

the output will be:

* 32 bits containing an instruction for setvl
* 32 bits starting with EXT01 as its Major Opcode and continuing with the pattern that drops SUBVL=2 and ELWIDTH=8 somewhere into the RM field bits
* 32 bits containing an add instruction

this will get us that hardware for-loop activated 4 times (0-3) on that add instruction.

actually 8 because SUBVL=2

and, actually, it will be 8bit adds not 64bit adds because ELWIDTH=8.
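
As a rough illustration of the loop counts above, here is a minimal Python sketch. It is not the SV specification: registers are modelled as plain lists of elements, and predication, scalar/vector marking and register-file packing are all left out. The name sv_loop_add and its parameters are made up for this example.

```python
def sv_loop_add(rt, ra, rb, vl, subvl=1, elwidth=64):
    """Apply an element-wise add under a VL x SUBVL hardware-style loop.

    rt, ra, rb are lists of elements (a deliberate simplification).
    Returns the number of element operations performed.
    """
    mask = (1 << elwidth) - 1          # results wrap at the element width
    count = 0
    for i in range(vl):                # outer loop: vector length
        for j in range(subvl):         # inner loop: sub-vector (vec2, vec3, ...)
            idx = i * subvl + j
            rt[idx] = (ra[idx] + rb[idx]) & mask
            count += 1
    return count
```

With vl=4, subvl=2 and elwidth=8 the function performs 8 element adds, each wrapping at 8 bits, which is the behaviour described for the setvl example.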

does that provide you with a quick crash-course in how SV works?
Comment 3 Luke Kenneth Casson Leighton 2020-12-21 21:50:27 GMT
i've added a rapid prototype "Assembly Annotation" to the appendix,
and also updated the "Prefix Fields".

unfortunately, i just realised that, actually, working out which
of the "Remapped Encodings" to apply will need to be done on a
per-instruction basis, for everything but MASK_KIND, MASK, ELWIDTH,
SUBVL and MODE.  these are always in the same place: everything else
(EXTRAs, ELWIDTH_SRC, MASK_SRC) critically depends on what instruction
is used.

we can "get away with this" by specifying the mode-type as part of the
svp64 encoding... for now.
Comment 4 Alexandre Oliva 2020-12-26 02:14:34 GMT
Created attachment 123 [details]
Ho ho ho!

Here's a patch that introduces in GNU binutils an svp64 pseudo-instruction, that takes a single 24-bit operand, and encodes it as a 32-bit insn with EXT01 as the major opcode, and MSB0 bits 7 and 9 also set, shuffling the top two bits of the 24-bit operand, RM[0] and RM[1], into bits 6 and 8 of the insn.
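
The bit placement described above can be sketched in Python as follows. This is not the attached patch, only an illustration of the description. The helper names are hypothetical, and placing the remaining 22 bits of the operand in insn bits 10-31 is an assumption that the comment does not state explicitly.

```python
EXT01 = 1  # major opcode value carried in MSB0 bits 0-5


def msb0(bit, width=32):
    """Return the mask for MSB0-numbered `bit` of a `width`-bit word."""
    return 1 << (width - 1 - bit)


def svp64_prefix(rm):
    """Encode a 24-bit RM value into a 32-bit prefix word."""
    if not 0 <= rm < (1 << 24):
        raise ValueError("RM must fit in 24 bits")
    insn = EXT01 << 26              # major opcode in MSB0 bits 0-5
    insn |= msb0(7) | msb0(9)       # the two bits that are always set
    if rm & msb0(0, 24):            # RM[0] goes to insn bit 6
        insn |= msb0(6)
    if rm & msb0(1, 24):            # RM[1] goes to insn bit 8
        insn |= msb0(8)
    insn |= rm & ((1 << 22) - 1)    # assumption: RM[2:23] fill insn bits 10-31
    return insn
```

Under these assumptions svp64_prefix(0) is 0x05400000: only the major opcode and the two fixed bits are set.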
Comment 5 Luke Kenneth Casson Leighton 2020-12-26 02:38:08 GMT
(In reply to Alexandre Oliva from comment #4)
> Created attachment 123 [details]
> Ho ho ho!

:)
 
> Here's a patch that introduces in GNU binutils an svp64 pseudo-instruction,
> that takes a single 24-bit operand, and encodes it as a 32-bit insn with
> EXT01 as the major opcode, and MSB0 bits 7 and 9 also set, shuffling the top
> two bits of the 24-bit operand, RM[0] and RM[1], into bits 6 and 8 of the
> insn.

cool!  this is fantastic, it means that the next stages open up as well, for adding basic SV capability to ISACaller (the simulator).

alexandre, i will create a binutils git clone tomorrow, to make sure this gets tracked properly.
Comment 6 Alexandre Oliva 2020-12-26 02:51:30 GMT
thanks for the crash course.  as I said in the call, it was very useful.
it's all beginning to make sense.

> we can "get away with this" by specifying the mode-type as part of the
> svp64 encoding... for now.

I was going to ask about that.  it seems that there's nothing in the svp64 prefix instruction itself that tells how to decode its fields, you have to look at the actual insn that follows to know.

Once we get to a stage in which we'll want to specify svp64 fields separately, rather than combined into a 24-bit immediate, an explicit specification of mode may help the assembler, to some extent, but the disassembler (and the assembler, if it's to detect inconsistencies) will have to look at prefix+insn as a single thing to be able to do its job.
Comment 7 Luke Kenneth Casson Leighton 2020-12-26 08:03:59 GMT
(In reply to Alexandre Oliva from comment #6)
> thanks for the crash course.  as I said in the call, it was very useful.
> it's all beginning to make sense.

ah good.  it's kinda surprising that nobody has thought of this before.

> > we can "get away with this" by specifying the mode-type as part of the
> > svp64 encoding... for now.
> 
> I was going to ask about that.  it seems that there's nothing in the svp64
> prefix instruction itself that tells how to decode its fields, you have to
> look at the actual insn that follows to know.

correct.  bit (haha) annoying however with bits so precious it's how it goes. the alternative is that we request a Major Opcode then use the 2 extra bits, one for 1/2 Predication, the other for 2/3 EXTRA (although to be honest, 2 more bits means 4 more modes/features...)

the sv_analysis.py program is generating tables already
https://libre-soc.org/openpower/opcode_regs_deduped/

the idea is to create CSV files which give those 2 missing bits.  it is not outside the realm of possibility to autogenerate a header file for inclusion in binutils.
 
> Once we get to a stage in which we'll want to specify svp64 fields
> separately, rather than combined into a 24-bit immediate, an explicit
> specification of mode may help the assembler, to some extent, 

autogenerated.  otherwise it's too much work (200 insns) and you get transcription errors.  dunno bout you but i don't want to have to check that.

> but the
> disassembler (and the assembler, if it's to detect inconsistencies) will
> have to look at prefix+insn as a single thing to be able to do its job.

indeed.  and PowerDecoder2 as well.  this is how it goes.

i'm not happy about it because normally RISC is not supposed to have lots of gates in the decoder.

if we were doing our own ISA from scratch these two bits, saying whether 1P/2P was set and whether EXTRA2/3 was set would definitely be part of the opcode.
Comment 8 Luke Kenneth Casson Leighton 2021-09-07 00:22:55 BST
relevant links:
http://lists.libre-soc.org/pipermail/libre-soc-dev/2021-August/003590.html

the only thing: it is *not* a good idea to hand-create the tables needed by binutils.  these should be *auto-generated*, teaching sv_analysis.py how to do that.

https://git.libre-soc.org/?p=openpower-isa.git;a=blob;f=src/openpower/sv/sv_analysis.py;hb=HEAD

there's nothing particularly sophisticated or clever about that program: it's written in a bland, non-OO "Get It Done" style.  it:
* reads OpenPOWER ISA v3.0B CSV files containing micro-code-style instruction format information (exactly like the tables in binutils)
* identifies and groups v3.0B instructions by identical register file profile (number of Read regs, number of Write regs, number of CR regs read etc)
* assigns an SVP64 "Style" to each (Twin/Single-predicate, 2 or 3 EXTRA bits for reg extension)
* spits out *more* CSV files with that grouping information in it, to assist in decoding

thus rather than hand-create the SVP64 decoding information in binutils it should be trivial to autogenerate c header files and c structs.
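
The grouping step in the list above can be sketched like this. It is not taken from sv_analysis.py: the column names, the sample rows and the function name are assumptions chosen for illustration, and "example_op" is a made-up instruction that shares the register profile of add.

```python
import csv
import io
from collections import defaultdict

# Illustrative input only: column names and rows are assumptions.
SAMPLE_CSV = """\
insn,in1,in2,in3,out,CR in,CR out
add,RA,RB,NONE,RT,NONE,CR0
example_op,RA,RB,NONE,RT,NONE,CR0
mcrf,NONE,NONE,NONE,NONE,BFA,BF
"""

# The columns that together make up the "register file profile".
PROFILE_COLUMNS = ("in1", "in2", "in3", "out", "CR in", "CR out")


def group_by_profile(csv_text):
    """Group instruction names by identical register profile."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        profile = tuple(row[col] for col in PROFILE_COLUMNS)
        groups[profile].append(row["insn"])
    return dict(groups)
```

Each resulting group could then be assigned one SVP64 "Style" and written out once, instead of maintaining one hand-written entry per instruction.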

http://lists.libre-soc.org/pipermail/libre-soc-dev/2021-August/003592.html


no deadlines given that i am using the python class, which has a mode where it can do .S processing.  i actually had to add gas macro recognition to get that to work.

so there is a temporary workaround.  however it will become increasingly more of a priority particularly for Lauri who is working at assembler level for Video/Audio CODECs, and later for compilers.

the function entrypoint is asm_process()

https://git.libre-soc.org/?p=openpower-isa.git;a=blob;f=src/openpower/sv/trans/svp64.py;h=45b292b4c4c32bbff548f2bf299235633d31db6c;hb=HEAD#l1052

you can see it looks for ".set" macros of the utmost basic form, example where this is used:

https://git.libre-soc.org/?p=openpower-isa.git;a=blob;f=media/Makefile;h=4dd904b6ba48f3fcae3b1ab04e1b0479e460abd4;hb=HEAD#l34

and some actual assembler containing sv.xxx opcodes, which get translated by asm_process() line by line into ".long xxxxx; some_v3.0b_asmopcode"

https://git.libre-soc.org/?p=openpower-isa.git;a=blob;f=media/audio/mp3/mp3_0_apply_window_float_basicsv.s;hb=HEAD

you've seen the spec page which contains the format?

 https://libre-soc.org/openpower/sv/svp64/

it's very deliberately only describing the format, not why it is what it is, or how to *use* that format (how to implement hardware etc i mean).
Comment 9 lechenko 2021-09-19 20:33:01 BST
(In reply to Luke Kenneth Casson Leighton from comment #8)

Slowly but surely, I figured out, what https://git.libre-soc.org/?p=openpower-isa.git;a=blob;f=src/openpower/sv/trans/svp64.py;h=45b292b4c4c32bbff548f2bf299235633d31db6c;hb=HEAD#l1052 does. 

As I understand it, it translates svp64 asm mnemonics into a prefix, emitted as a 32-bit literal, followed by the corresponding scalar OpenPOWER asm mnemonic. After that, the translated .S file is fed to binutils to produce a binary file.

So, now we want to support the same svp64 asm mnemonics directly in binutils. My guess is that the scalar OpenPOWER instructions are already there. Thus, a few questions.

Does it mean that we can try to implement the same two-step translation logic inside binutils? Or reuse OpenPOWER-related header files, at least?

Another one, about svp64 asm syntax. As far as I understand, svp64.py already supports the current version of the asm syntax, but is there a spec for it? Could you share a link, please?

And the last one, for now. I guess I can reuse/refactor both sv_analysis.py and svp64.py to generate a header for binutils. Was that your intention for how to do this task?
Comment 10 Luke Kenneth Casson Leighton 2021-09-19 22:25:30 BST
(In reply to lechenko from comment #9)
> (In reply to Luke Kenneth Casson Leighton from comment #8)
> 
> Slowly but surely, I figured out, what
> https://git.libre-soc.org/?p=openpower-isa.git;a=blob;f=src/openpower/sv/
> trans/svp64.py;h=45b292b4c4c32bbff548f2bf299235633d31db6c;hb=HEAD#l1052
> does. 
> 
> As I understand it, it translates svp64 asm mnemonics into a prefix,
> emitted as a 32-bit literal, followed by the corresponding scalar
> OpenPOWER asm mnemonic.

because there is no binutils support for svp64, yes.  when the .long
and the 32 bit mnemonic get passed to binutils, they get converted to
binary *without* binutils ever needing to know about svp64 assembler.

the exercise is therefore to merge the EXACT functionality of svp64.py
*into* binutils.


> After that, the translated .S file is fed to binutils to produce a
> binary file.

yes.

> So, now we want to support the same svp64 asm mnemonics directly in
> binutils. My guess is that the scalar OpenPOWER instructions are already
> there. 

yes, and yes.

> Thus, a few questions.
> 
> Does it mean that we can try to implement the same two-step translation
> logic inside binutils?

yes, exactly. or, more to the point: after "conversion" to ".long xxxxx; {equivalent v3.0B}" pass that *again* to the relevant function and get it to convert those to the appropriate binary output.

it is important to do that conversion pass *after* all the macro renaming
and expansion of registers.  gas has a builtin macro system, you cannot
process SVP64 registers until you know the actual number, 0-127.


> Or reuse OpenPOWER-related header files, at least?

yes absolutely, for goodness sake don't duplicate the entirety of power isa headers for scalar operations.


> Another one, about svp64 asm syntax. As far as I understand, svp64.py
> already supports the current version of the asm syntax, but is there a
> spec for it? Could you share a link, please?

there is a spec, http://libre-soc.org/openpower/sv/svp64 however
i literally made up the syntax as i went along.

"need a way to indicate mapreduce, err "mr" is short, that'll do"

no kidding! :)

svp64.py is pretty much it, alongside the consts.py and other data structures,
power_enums.py and so on.

svp64.py and the Decoder are the canonical sources at the moment until
such time as there *is* time for someone *to* write documentation like
this.


> And the last one, for now. I guess I can reuse/refactor both sv_analysis.py
> and svp64.py to generate a header for binutils. Was that your intention for
> how to do this task?

yyyep!  or, err refactor svp64.py? ahh more, "use svp64.py as a reference
to create the exact same thing in c", and *add* to sv_analysis.py to get
it to output autogenerated headers for use in binutils.  if you look
at how the microwatt vhdl struct is autogenerated that is pretty much
exactly what is needed.

cut, paste, substitute right magic constants, done.

you will see there are some data structures in binutils headers, that list
instructions.

if you add *one* extra field to the end of that struct (a pointer to a
binutils-ppc-svp64-struct) then, simply by leaving it out of the existing
initialisers, it will default to NULL.

you can then autogenerate a binutils svp64 header full of ppc-svp64-struct entries then have a function which, before use, *fills in*, *at runtime*, the pointers.

btw you got the message about copyright assignment to the FSF? this is *really* important.  binutils code that has not had an FSF copyright assignment *cannot be accepted upstream* and whatever you did would have to be thrown away and duplicated by someone who has.
Comment 11 Luke Kenneth Casson Leighton 2021-09-20 13:24:49 BST
https://git.libre-soc.org/?p=openpower-isa.git;a=blob;f=src/openpower/decoder/power_svp64_rm.py;hb=HEAD

that's the reverse: disassembly of binary to internal data structure
for use in the simulator and in HDL.  it will give some insight.

did you receive the email with the contact details of the copyright
clerk at the FSF?
Comment 12 lechenko 2021-09-20 22:42:24 BST
(In reply to Luke Kenneth Casson Leighton from comment #10)
> (In reply to lechenko from comment #9)
> > (In reply to Luke Kenneth Casson Leighton from comment #8)
> > Thus, a few questions.
> > 
> > Does it mean that we can try to implement the same two-step translation
> > logic inside binutils?
> 
> yes, exactly. or, more to the point: after "conversion" to ".long xxxxx;
> {equivalent v3.0B}" pass that *again* to the relevant function and get it to
> convert those to the appropriate binary output.

Okay, I shall find that function then.

> it is important to do that conversion pass *after* all the macro renaming
> and expansion of registers.  gas has a builtin macro system, you cannot
> process SVP64 registers until you know the actual number, 0-127.
> 

You mean, the conversion of 'sv.*' instructions to '.long xxxxxx; \1'? Will this macro/expansion machinery work correctly on 'sv.*' instructions?


> there is a spec, http://libre-soc.org/openpower/sv/svp64 however
> i literally made up the syntax as i went along.

Aha. This is a binary format spec and there is no spec for mnemonics per se. My guess is that I have to dig the format out of svp64.py and the tests.

Also, I forgot to mention last time: there is some macro processing in svp64.py. Where can I read about it?

> 
> btw you got the message about copyright assignment to the FSF? this is
> *really* important.  binutils code that has not had an FSF copyright
> assignment *cannot be accepted upstream* and whatever you did would have to
> be thrown away and duplicated by someone who has.

I noticed the email, but have had no time to fill it in and send it. I'll deal with the paperwork as soon as I start hacking on binutils.
Comment 13 Luke Kenneth Casson Leighton 2021-09-20 23:27:59 BST
(In reply to lechenko from comment #12)

> > it is important to do that conversion pass *after* all the macro renaming
> > and expansion of registers.  gas has a builtin macro system, you cannot
> > process SVP64 registers until you know the actual number, 0-127.
> > 
> 
> You mean, the conversion of 'sv.*' instructions to '.long xxxxxx; \1'? Will
> this macro/expansion machinery work correctly on 'sv.*' instructions?

yes, a macro-expansion-like system should work perfectly.

SVP64 has a hard rule: you cannot do this:

     sv.X   ==>  .long NNNN; Y

it MUST be this:

     sv.X   ==>  .long NNNN; X

in other words there is not ONE SINGLE 64 bit instruction that does not
map to its corresponding 32 bit one.

therefore you can perfectly well do a runtime substitution.
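
A minimal sketch of that substitution, assuming the "/"-separated modifier syntax used elsewhere in this thread. It is not the code in svp64.py: the function name, the regular expression and the encode_prefix callback are hypothetical, and the rewriting of vector register operands that a real translator needs is left out entirely.

```python
import re

# sv.<mnemonic>[/<modifier>...] <operands>
SV_LINE = re.compile(r"^(\s*)sv\.([a-z0-9_.]+)((?:/[^\s/]+)*)\s*(.*)$")


def translate(line, encode_prefix):
    """Rewrite 'sv.X ...' as '.long NNNN; X ...'; pass other lines through.

    encode_prefix(mnemonic, modifiers, operands) is a caller-supplied
    (hypothetical) function returning the 32-bit prefix as an int.
    """
    m = SV_LINE.match(line)
    if m is None:
        return line
    indent, mnemonic, modifiers, operands = m.groups()
    mods = [mod for mod in modifiers.split("/") if mod]
    prefix = encode_prefix(mnemonic, mods, operands)
    # The same mnemonic X is always re-emitted, never a different one.
    return ("%s.long 0x%08x; %s %s" % (indent, prefix, mnemonic, operands)).rstrip()
```

The sketch only ever re-emits the mnemonic it was given, which is the hard rule stated above: the suffix instruction after the prefix is always the scalar instruction of the same name.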

> > there is a spec, http://libre-soc.org/openpower/sv/svp64 however
> > i literally made up the syntax as i went along.
> 
> Aha. This is a binary format spec and there is no spec for mnemonics per se.
> My guess is that I have to dig the format out of svp64.py and the tests.

yes, sorry.  we can put a documentation budget if you would like to write
it, even if it is very sparse working notes for yourself.

> Also, I forgot to mention last time: there is some macro processing in
> svp64.py. Where can I read about it?

binutils docs describe the macro system, it is ".set X Y"
i simply copied that at its most basic simplest level so that
Lauri could do some very basic macros, ".set counter r3" and
so on.  this was enough.
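
A very small sketch of that kind of ".set" handling follows. It is an illustration only, not the implementation in svp64.py: the function name is hypothetical, the definition line is simply dropped from the output, and both the comma form that gas documents (".set symbol, expression") and the space-separated form written above are accepted.

```python
import re

# ".set NAME, VALUE" or ".set NAME VALUE"
SET_RE = re.compile(r"^\s*\.set\s+(\w+)\s*[,\s]\s*(\S+)\s*$")


def expand_set_macros(lines):
    """Record .set definitions and substitute them into later lines."""
    macros = {}
    out = []
    for line in lines:
        m = SET_RE.match(line)
        if m:
            macros[m.group(1)] = m.group(2)   # remember NAME -> VALUE
            continue                          # definition line is dropped

        def substitute(match):
            word = match.group(0)
            return macros.get(word, word)

        # Replace whole words only, so "counter" never matches "counter2".
        out.append(re.sub(r"\b\w+\b", substitute, line))
    return out
```

Running this before the sv.X translation means register names are already resolved to their real values by the time the prefix has to be computed.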

 
> > 
> > btw you got the message about copyright assignment to the FSF? this is
> > *really* important.  binutils code that has not had an FSF copyright
> > assignment *cannot be accepted upstream* and whatever you did would have to
> > be thrown away and duplicated by someone who has.
> 
> I noticed the email, but have had no time to fill it in and send it. I'll
> deal with the paperwork as soon as I start hacking on binutils.

ok cool. i must send it as well.
Comment 14 lechenko 2021-10-03 17:54:56 BST
I have found a task https://bugs.libre-soc.org/show_bug.cgi?id=615 and it makes me wonder. Have the results of that discussion been implemented in svp64.py?
Comment 15 Luke Kenneth Casson Leighton 2021-10-03 22:21:20 BST
(In reply to lechenko from comment #14)
> I have found a task https://bugs.libre-soc.org/show_bug.cgi?id=615 and it
> makes me wonder. Have the results of that discussion been implemented in
> svp64.py?

not yet.  i thought about it (use of "?") and it makes the entire syntax
well... not understandable.

so i meant to ask about an alternative separator and haven't got round
to it yet.

probably sensible to make it a #define for now.
Comment 16 Luke Kenneth Casson Leighton 2021-11-21 18:24:10 GMT
anton, dmitry has a couple of weeks spare very shortly (tuesday), this
is precious time: what progress has been made, and do you mind if he
helps out?
Comment 17 lechenko 2021-11-22 12:44:21 GMT
Hi! Well, I didn't do much: I studied the code base and outlined a coding plan. Unfortunately, I have no time to work on it until January, so I can brief Dmitry on my findings and pass this task to him for now.
Comment 18 dmitry.selyutin 2021-11-24 05:07:33 GMT
Re-assigned the task to me. For now, I briefly checked the comments and patch by Alexandre; I'll try to take a look at https://libre-soc.org/openpower/sv/svp64/ and src/openpower/sv/trans/svp64.py today.
Comment 19 Luke Kenneth Casson Leighton 2021-11-24 11:00:15 GMT
(In reply to dmitry.selyutin from comment #18)
> Re-assigned the task to me. For now, I briefly checked the comments and
> patch by Alexandre; I'll try to take a look at
> https://libre-soc.org/openpower/sv/svp64/ and
> src/openpower/sv/trans/svp64.py today.

an entire new table will be needed (an extra field in the "normal"
table which defaults to NULL and is optionally filled in for any
svp64 instruction)

it is *not safe* to write that SVP64-information table by hand.

it should be entirely auto-generated, just like this table for
microwatt is auto-generated:

https://git.libre-soc.org/?p=openpower-isa.git;a=blob;f=src/openpower/sv/sv_analysis.py;h=d62b587521db0303189f88b57873bb6a725d7ec0;hb=e5d2a21bd25720f9267c7c8045df83163bc63a20#l653

    constant sv_minor_19_decode_rom_array :
             sv_minor_19_rom_array_t := (
        -- insn  Ptype  Etype  in1  in2  in3  out  out2  CR in  CR out  sv_in1  sv_in2  sv_in3  sv_out  sv_out2  sv_cr_in  sv_cr_out
    2#0000000000# => (P2, EXTRA3, NONE, NONE, NONE, NONE, NONE, BFA, BF, NONE, NONE, NONE, NONE, NONE, Idx1, Idx0), -- mcrf


etc etc. etc. etc.

those consts - BFA, BF, Idx1, Idx0 etc etc. - heck even those can be
auto-generated through enumeration of the corresponding magic enums
in power_enums.py:

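from enum import Enum, unique

@unique
class SVEXTRA(Enum):
    NONE = 0
    Idx0 = 1
    Idx1 = 2
    Idx2 = 3
    Idx3 = 4
    Idx_1_2 = 5  # due to weird BA/BB for crops

==>

# f is the already-open output header file.
# Enum members expose .name and .value (there is no .field attribute).
for member in SVEXTRA:
    f.write("#define SVP64_SVEXTRA_%s %d\n" % (member.name, member.value))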

something like that.

almost nothing in the entire table should be written by hand.
Comment 20 Luke Kenneth Casson Leighton 2021-11-24 13:46:07 GMT
here:

an extra pointer in powerpc_opcode:

https://git.libre-soc.org/?p=binutils-gdb.git;a=blob;f=include/opcode/ppc.h;h=8c356b1ff4c0e82d7dd8b367cb27559707fd14dd;hb=84629a61ee0f459a78e245e5aa41bec73f30c4d1#l64

  35 struct powerpc_opcode
  36 {
  38   const char *name;
  42   uint64_t opcode;
  48   uint64_t mask;
  53   ppc_cpu_t flags;
  58   ppc_cpu_t deprecated;
  63   unsigned char operands[8];
       svp64_opcode_table* svp64;
  64 };

where here, the *AUTO-GENERATED* svp64_opcode_table would be "merged"
by this function:

https://git.libre-soc.org/?p=binutils-gdb.git;a=blob;f=gas/config/tc-ppc.c;h=cd0ba56ddffa728f90f2a9b1d299990b435645ad;hb=84629a61ee0f459a78e245e5aa41bec73f30c4d1#l1646

into this massive table of epic proportions:

https://git.libre-soc.org/?p=binutils-gdb.git;a=blob;f=opcodes/ppc-opc.c;h=bbbadffad8f62f867c53630b9bf67cfe72ecfed6;hb=84629a61ee0f459a78e245e5aa41bec73f30c4d1
Comment 21 dmitry.selyutin 2021-11-24 17:43:00 GMT
We discussed the current state of things with Anton today. I'll dig into binutils and svp64.
Comment 22 dmitry.selyutin 2021-11-24 19:40:40 GMT
What's the current state of 615? Is there an established format?
Comment 23 dmitry.selyutin 2021-11-24 19:44:35 GMT
(In reply to Luke Kenneth Casson Leighton from comment #20)
>   35 struct powerpc_opcode
>   36 {
>   63   unsigned char operands[8];
>        svp64_opcode_table* svp64;
>   64 };

Did you mean a standalone table, or would you like to bind each and every Power ISA instruction to its own SVP64 table? It really looks like you meant to add one more opcodes[]/num_opcodes pair, but the structure layout you posted suggests otherwise.

I'd really start with checking for "sv." prefix in mnemonics, and then splitting by space or slash. I, however, do not yet understand how SVP64 works; it'd be nice to have some examples like "extsw from vanilla Power ISA vs sv.extsw 5, 3 vs sv.extsw. vs sv.extsw./satu/sz/dz/sm=r3/dm=r3 5, 31" and so on.
Comment 24 Luke Kenneth Casson Leighton 2021-11-24 19:59:03 GMT
use a #definable char.

we were advised that "?" is available, but it looks.. well.. s***

   sv.addi?vec2?ew=8?sm=r3

vs

   sv.addi/vec2/ew=8/sm=r3

if you use a #defined SVP64SEP it will not hold things up.
Comment 25 Luke Kenneth Casson Leighton 2021-11-24 20:24:58 GMT
(In reply to dmitry.selyutin from comment #23)
> (In reply to Luke Kenneth Casson Leighton from comment #20)
> >   35 struct powerpc_opcode
> >   36 {
> >   63   unsigned char operands[8];
> >        svp64_opcode_table* svp64;
> >   64 };
> 
> Did you mean a standalone table, 

YES.  one that is autogenerated by sv_analysis.py.  do not even
remotely consider creating it by hand, this will be absolutely
disastrous.

> or would you like to bind each and every Power
> ISA instruction to its own SVP64 table?

the SVP64_table to powerpc_opcode table is a many-to-one relationship.
it is not a many-to-many.

it would be absolutely disastrous to try to modify powerpc_opcode
by hand (10,000 instructions).

therefore:

1) autogenerate svp64_opcode_table
2) in ppc_setup_opcodes hunt through powerpc_opcodes looking
   for name matches powerpc_opcode->name == svp64_opcode

oh i see, yes:

a) call it "struct svp64_opcode_augmentation" (or something)
b) autogenerate a table named svp64_table[135] (whatever)
c) add a pointer to the struct svp64_opcode_augmentation

35 struct powerpc_opcode
36 {
63   unsigned char operands[8];
     struct svp64_opcode_augmentation* svp64;
64 };
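
The name-matching step can be sketched in Python before it is written in C. Everything here is illustrative: the class and function names are hypothetical stand-ins for struct powerpc_opcode and struct svp64_opcode_augmentation, and the fields shown are a small assumed subset.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Svp64Augmentation:
    name: str      # mnemonic of the scalar instruction being augmented
    ptype: str     # "P1" or "P2"
    etype: str     # "EXTRA2" or "EXTRA3"


@dataclass
class PowerpcOpcode:
    name: str
    opcode: int
    svp64: Optional[Svp64Augmentation] = None   # stays None unless linked


def link_svp64(opcodes: List[PowerpcOpcode],
               table: List[Svp64Augmentation]) -> int:
    """Point every opcode entry at the augmentation with the same name.

    Returns how many opcode entries were linked.
    """
    by_name = {entry.name: entry for entry in table}
    linked = 0
    for op in opcodes:
        entry = by_name.get(op.name)
        if entry is not None:
            op.svp64 = entry
            linked += 1
    return linked
```

Several opcode entries that share a mnemonic all end up pointing at the same augmentation entry, and entries with no SVP64 counterpart keep the default of None, mirroring the NULL default of the extra C pointer.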


> It really looks like you meant
> to add one more opcodes[]/num_opcodes pair, but the structure layout you
> posted suggests otherwise.

yes name of table not name of struct.

> I'd really start with checking for "sv." prefix in mnemonics, and then
> splitting by space or slash. I, however, do not yet understand how SVP64
> works; it'd be nice to have some examples like "extsw from vanilla Power ISA
> vs sv.extsw 5, 3 vs sv.extsw. vs sv.extsw./satu/sz/dz/sm=r3/dm=r3 5, 31" and
> so on.

ironically - interestingly - understanding svp64 is not actually necessary
in order to translate it!  the job is that simple and self-contained.

a higher priority item is, funnily enough, to add the *32 bit* Draft opcodes
such as the FFT and DCT butterfly muladd, and the new ternlogi, all of
which need to go under a binutils --experimental-svp64 option (something
like that)

those can be done under a separate bugreport, btw.

SVP64 overview including FOSDEM video is here:
https://libre-soc.org/openpower/sv/overview/

bear in mind, there really is nothing actually useful or
relevant to this task, in there.  if working on the *simulator*
however, that would be a different matter: it would be absolutely
essential to understand SVP64.

however the data format, that is in links you will already find in
earlier comments, as well as in svp64.py etc which you should consider
to be the canonical reference guide.
Comment 26 dmitry.selyutin 2021-11-27 20:41:04 GMT
Pushed a pair of smallish commits into sv_analysis.py so that it'll be capable of generating code for binutils. These are mostly preparations, but I think it's inevitable: I have neither desire nor time to copy&paste sv_analysis.py customizations (all these if/else branches, continues in loops and so on). I intend to decouple the format-specific parts into, well, format-specific entities. It'd be great if you could take a look and tell me if it's OK or not at this early stage. :-)
Comment 27 Luke Kenneth Casson Leighton 2021-11-27 21:06:34 GMT
(In reply to dmitry.selyutin from comment #26)
> Pushed a pair of smallish commits into sv_analysis.py so that it'll be
> capable of generating code for binutils. These are mostly preparations, but
> I think it's inevitable: I have neither desire nor time to copy&paste
> sv_analysis.py customizations (all these if/else branches, continues in
> loops and so on). I intend to decouple the format-specific parts into, well,
> format-specific entities. It'd be great if you could take a look and tell me
> if it's OK or not at this early stage. :-)

yyeah go for it.  try not to go too overboard (a la jinja2) it's just
not worth it.  see how it goes: if there turns out to be a hell of a lot
of "if format == " statements then some sort of *really small* API would
be worth considering, i do tend to find however that the readability of
such APIs is absolutely awful and after about the 5th one you deeply
regret having created them, and go back to code-duplication, where at
least the damn code is linear, shorter, and bleedin obvious.

see how you get on.  this could be short enough to work out okay: it's
only 50 lines within the cvs.svp64.items() loop.

what might be handy would be to drop in some namedtuples. or perhaps
if i'd put in comment it would help, ehn?
Comment 28 dmitry.selyutin 2021-11-27 21:25:34 GMT
I actually prefer dataclasses to namedtuples, these tend to be more flexible, especially for code generation. And it's also one of rare cases where type annotations _are_ useful (though, as usual, in a limited way).
Comment 29 Luke Kenneth Casson Leighton 2021-11-28 03:20:15 GMT
(In reply to dmitry.selyutin from comment #28)
> I actually prefer dataclasses to namedtuples, these tend to be more
> flexible, especially for code generation.

if this was even 300 lines i would agree with you no question.

however this is not 300 lines: it is one single for-loop
on one single dictionary, with a scant 30 lines inside
that one single for-loop.

it is under absolutely no circumstances whatsoever going
to get more complicated than that one single dictionary.
there will NEVER be anything else added, ever.

consequently attempting to add complex infrastructure normally
suited to large sophisticated code generation is in danger of creating massive
code bloat.

to give you some idea of how inappropriate code generator
technology is here: there is so little to output here that
it could well be covered entirely by a single 64 bit uint64_t
per row, comprising the ORing of a series of #defines:
not even a struct need be created (unlike in the VHDL case).
Comment 30 Luke Kenneth Casson Leighton 2021-11-28 15:25:37 GMT
(In reply to Luke Kenneth Casson Leighton from comment #29)

> it could well be covered entirely by a single 64 bit uint64_t
> per row, comprising the ORing of a series of #defines:
> not even a struct need be created (unlike in the VHDL case).

... possibly.  there's actually quite a lot.  looking at the VHDL:

        -- insn  Ptype  Etype  in1  in2  in3  out  out2  CR in  CR out  sv_in1  sv_in2  sv_in3  sv_out  sv_out2  sv_cr_in  sv_cr_out
2#0100001010# => (P1, EXTRA3, RA, RB, NONE, RT, NONE, NONE, CR0, Idx1, Idx2, NONE, Idx0, NONE, NONE, Idx0), -- add

so:

* sv_* fields are all 3-bit (5 possible values):  (None, Idx0, Idx1, Idx2, Idx3)
* there are 7 of those (in1, in2, in3, out, out2, cr_in, cr_out)
* that totals 21 bits
* Ptype (predicate type) is 1 bit: PTYPE_P1 or PTYPE_P2
* Etype (EXTRA fields type) is 1 bit: EXTRA2 or EXTRA3

(total 23 so far)

but, ahh, where there is sv_in1 .... etc. there are also 7 fields
in1, in2, in3, out, out2, CRin, CRout which tell you *which register*
those 7 sv_* fields are associated with.  in the example above:

* in1 = RA, but sv_in1 = Idx1.  therefore:
  - RA is an input
  - RA EXTRA3 encoding is in position Idx1
* in2 = RB, but sv_in2 = Idx2 therefore:
  - RB is an input
  - RB EXTRA3 encoding is in position Idx2
...
* CRout = CR0, but sv_cr_out = Idx0.  therefore:
  - CR0 is an **OUTPUT**
  - CR0 EXTRA3 encoding is in position Idx0

now, the number of possible entries for in1/in2 etc. is:

* RA
* RB
* RC
* RT
* EA
* FRA
* FRB
* FRC
* FRS
* FRT
* CR0
* CR1

so 12 possible values, requiring 4 bits.  there are 7 of them,
therefore another 4*7=28 bits

28 + 23 == 51

ooooo it's still achievable to fit all the SVP64 table
encoding information into a single uint64_t
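
The bit budget above can be checked with a short Python sketch that packs the "add" row into one integer. The field order and bit positions are assumptions made for this example, not a layout agreed in this thread. A NONE entry is added to the register list so that unused fields are representable (13 values, still 4 bits), and names such as BFA and BF that appear in other rows are left out.

```python
PTYPE = ["P1", "P2"]                                   # 1 bit
ETYPE = ["EXTRA2", "EXTRA3"]                           # 1 bit
SVEXTRA = ["NONE", "Idx0", "Idx1", "Idx2", "Idx3"]     # 3 bits each
REGS = ["NONE", "RA", "RB", "RC", "RT", "EA", "FRA", "FRB",
        "FRC", "FRS", "FRT", "CR0", "CR1"]             # 4 bits each


def pack_row(ptype, etype, regs, extras):
    """Pack one table row into an int: 1 + 1 + 7*3 + 7*4 = 51 bits."""
    if len(regs) != 7 or len(extras) != 7:
        raise ValueError("expected 7 register fields and 7 sv_* fields")
    word = PTYPE.index(ptype) | (ETYPE.index(etype) << 1)
    shift = 2
    for extra in extras:                 # sv_in1 .. sv_cr_out
        word |= SVEXTRA.index(extra) << shift
        shift += 3
    for reg in regs:                     # in1 .. CRout
        word |= REGS.index(reg) << shift
        shift += 4
    return word


def unpack_row(word):
    """Inverse of pack_row."""
    ptype = PTYPE[word & 1]
    etype = ETYPE[(word >> 1) & 1]
    shift = 2
    extras = []
    for _ in range(7):
        extras.append(SVEXTRA[(word >> shift) & 0b111])
        shift += 3
    regs = []
    for _ in range(7):
        regs.append(REGS[(word >> shift) & 0b1111])
        shift += 4
    return ptype, etype, regs, extras


# The "add" row from the VHDL table quoted above.
ADD_REGS = ["RA", "RB", "NONE", "RT", "NONE", "NONE", "CR0"]
ADD_EXTRAS = ["Idx1", "Idx2", "NONE", "Idx0", "NONE", "NONE", "Idx0"]
ADD_WORD = pack_row("P1", "EXTRA3", ADD_REGS, ADD_EXTRAS)
```

The packed value stays below 2**51 and unpacks to the original fields, so under this layout a uint64_t per row leaves 13 bits spare.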
Comment 31 dmitry.selyutin 2021-11-30 20:49:53 GMT
Pushed a few more commits, now it looks like the whole code generation is format-based. I don't quite like the part where we convert a CSV entry into (op, insn, row) stuff; this should likely be in some function or whatever, but maybe later. Anyway, I think the code is ready to output what we need for binutils.
Comment 32 dmitry.selyutin 2021-11-30 20:51:22 GMT
For the record, I completely dislike that we (ab)use print for logging. I'd rather prefer using print to output the file. But I guess that's one of really "way too late" remarks. :-)