The reference Python-based disassembler is required.
Not much progress today: most of the time was spent hunting the opcode bug (revealed upon power_table usage). Refactored some portions; also, SVP64 assembly can now provide information about the mode used.
I wanted to start 911, but hey, I got carried away by the world of disassembly, and started checking into extra stuff (no pun intended, I'm really speaking of EXTRA). I've augmented DynamicOperandGPR and DynamicOperandFPR with information on these. 40 0a 40 05 sv.add 14 6a e2 7e spec sv.add RT,RA,RB (OE=0 Rc=0) binary [0:8] 00000101 [8:16] 01000000 [16:24] 00001010 [24:32] 01000000 [32:40] 01111110 [40:48] 11100010 [48:56] 01101010 [56:64] 00010100 opcode 0x7c000214 mask 0xfc0007ff RT 01010 [6, 7, 8, 9, 10] EXTRA[0] RA 00000 [11, 12, 13, 14, 15] EXTRA[1] RB 00001 [16, 17, 18, 19, 20] EXTRA[2] OE 0 [21] Rc 0 [31] mode normal: simple
(In reply to Dmitry Selyutin from comment #2) > I wanted to start 911, but hey, I got carried by the world of disassembly, > and started checking into extra stuff (no pun intended, I'm really speaking > of EXTRA). I've augmented DynamicOperandGPR and DynamicOperandFPR with > information on these. ah brilliant because the next step to add the extra bits should be straightforward. > RT > 01010 > [6, 7, 8, 9, 10] > EXTRA[0] suggest putting EXTRA3, (see Etype) so EXTRA3[0] is RM.EXTRA3[0] bits 0-2 > RA > 00000 > [11, 12, 13, 14, 15] > EXTRA[1] RM.EXTRA3[1] is RM.EXTRA bits 3-5 > RB > 00001 > [16, 17, 18, 19, 20] > EXTRA[2] and this is RM.EXTRA3[2] which is RM.EXTRA bits 6-8 from there checking bit 0 of each tells you "scalar or vector". then appending the extra bits should be easy enough to make the right regnum.
(In reply to Luke Kenneth Casson Leighton from comment #3)
> suggest putting EXTRA3, (see Etype) so EXTRA3[0] is RM.EXTRA3[0] bits 0-2

Ah yeah, awesome idea! I thought about putting it on a separate line, but your suggestion is much better; it expresses the intent in a much cleaner way. I'll do it now.

> from there checking bit 0 of each tells you "scalar or vector".
> then appending the extra bits should be easy enough to make the
> right regnum.

I really thought that "if self.vector: prepend_asterisk(name)" is clear and obvious. So yeah, this is largely the reason why this was placed in the operand classes. :-)
OK, I started playing around SVP64 EXTRA2/EXTRA3 concepts. Here's what we have for now. Note the updated ranges for operands, the disassembly itself (first lines), and extra modes printed. Yes I remember about target_addr. I just don't have time to do everything at once. 00 00 40 05 sv.bc 2,9,0x8 08 00 49 40 spec sv.bc BO,BI,BD (AA=0 LK=0) binary [0:8] 00000101 [8:16] 01000000 [16:24] 00000000 [24:32] 00000000 [32:40] 01000000 [40:48] 01001001 [48:56] 00000000 [56:64] 00001000 opcode 0x40000000 mask 0xfc000003 BO 00010 (38, 39, 40, 41, 42) BI 01001 (43, 44, 45, 46, 47) BD 00000000000010 (48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61) target_addr = EXTS(BD || 0b00)) AA 0 (62,) LK 0 (63,) mode branch 40 18 40 05 sv.add r127,r31,r65 14 0a ff 7f spec sv.add RT,RA,RB (OE=0 Rc=0) binary [0:8] 00000101 [8:16] 01000000 [16:24] 00011000 [24:32] 01000000 [32:40] 01111111 [40:48] 11111111 [48:56] 00001010 [56:64] 00010100 opcode 0x7c000214 mask 0xfc0007ff RT 01111111 (38, 39, 40, 41, 42, 18, 19, 20) extra3[0] RA 11111 (43, 44, 45, 46, 47) extra3[1] RB 01000001 (48, 49, 50, 51, 52, 24, 25, 26) extra3[2] OE 0 (53,) Rc 0 (63,) mode normal: simple 00 00 40 05 sv.lwzu r3,16,r1 10 00 61 84 spec sv.lwzu RT,D(RA) binary [0:8] 00000101 [8:16] 01000000 [16:24] 00000000 [24:32] 00000000 [32:40] 10000100 [40:48] 01100001 [48:56] 00000000 [56:64] 00010000 opcode 0x84000000 mask 0xfc000000 RT 00011 (38, 39, 40, 41, 42) extra2[0] D 0000000000010000 (48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63) RA 00001 (43, 44, 45, 46, 47) extra2[2] mode ld/st imm: simple
OK, as I suspected, the recent changes broke stuff which relied on indexing extras. However, to my surprise, it was quite straightforward to fix it. I think we can try rebasing the branch. The next target is CRs.
static inline uint64_t
svp64_insn_get_prefix_rm_extra3_idx0(const struct svp64_insn *insn)
{
    uint64_t value = insn->value;

    return (
        (((value >> UINT64_C(43)) & UINT64_C(1)) << UINT64_C(0)) |
        (((value >> UINT64_C(44)) & UINT64_C(1)) << UINT64_C(1)) |
        (((value >> UINT64_C(45)) & UINT64_C(1)) << UINT64_C(2))
    );
}

static inline uint64_t
svp64_insn_get_prefix_rm_extra3_idx1(const struct svp64_insn *insn)
{
    uint64_t value = insn->value;

    return (
        (((value >> UINT64_C(40)) & UINT64_C(1)) << UINT64_C(0)) |
        (((value >> UINT64_C(41)) & UINT64_C(1)) << UINT64_C(1)) |
        (((value >> UINT64_C(42)) & UINT64_C(1)) << UINT64_C(2))
    );
}

Perfect. I guess we have an opportunity to introduce a nice table in binutils when we're done with disasm (and same for setters).
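For readers following along: the pattern these generated getters implement is simply "gather a handful of scattered bits, LSB-first". A throwaway Python equivalent of the same pattern (the helper name is made up, purely for illustration):

    def gather_bits(value, positions):
        # Pack the given bit positions of `value` LSB-first; "position" here is a
        # plain right-shift count, exactly as in the C getters above.
        result = 0
        for shift, position in enumerate(positions):
            result |= ((value >> position) & 1) << shift
        return result

    # extra3_idx0 = gather_bits(insn_value, (43, 44, 45))   # same as the first C getter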
(In reply to Dmitry Selyutin from comment #5) > OK, I started playing around SVP64 EXTRA2/EXTRA3 concepts. Here's what we > have for now. > Note the updated ranges for operands, the disassembly itself (first lines), > and extra modes printed. ack. > 40 18 40 05 sv.add r127,r31,r65 > 14 0a ff 7f > RB > 01000001 > (48, 49, 50, 51, 52, 24, 25, 26) > extra3[2] ah! yeah, that looks right. vector would be: [48,49,50,51,52,24,25] # (or maybe 25,24) scalar would be: [24,25,48,49,50,51,52,24,25] # (or maybe 25,24) > 00 00 40 05 sv.lwzu r3,16,r1 > 10 00 61 84 > spec > sv.lwzu RT,D(RA) > RA > 00001 > (43, 44, 45, 46, 47) > extra2[2] and as this is EXTRA2, it would be best as: RA {0}000001 [25,43,44,45,46,47] # or maybe 24 where vector would be: 0000001{0} [43,44,45,46,47,25] # or maybe 24
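To spell out the register-number rule being sketched here, a Python illustration of my reading of the EXTRA2/EXTRA3 GPR/FPR mapping, cross-checked against the worked examples later in this thread (function and variable names are made up; if it disagrees with the spec, trust the spec):

    def decode_gpr(field, extra, etype):
        # field: the 5-bit RT/RA/RB value from the 32-bit suffix
        # extra: the raw EXTRA2 (2-bit) or EXTRA3 (3-bit) value, MSB first,
        #        where the MSB is the scalar(0)/vector(1) flag
        if etype == "EXTRA3":
            vector, ext = bool(extra & 0b100), extra & 0b11
            regnum = ((field << 2) | ext) if vector else ((ext << 5) | field)
        else:  # EXTRA2
            vector, ext = bool(extra & 0b10), extra & 0b1
            regnum = ((field << 2) | (ext << 1)) if vector else ((ext << 5) | field)
        return ("*r%d" if vector else "r%d") % regnum

    assert decode_gpr(0b11111, 0b011, "EXTRA3") == "r127"   # sv.add r127,... below
    assert decode_gpr(0b11000, 0b101, "EXTRA3") == "*r97"   # sv.add *r97,... later on
    assert decode_gpr(0b01111, 0b11, "EXTRA2") == "*r62"    # sv.lwzu *r62 later on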
(In reply to Dmitry Selyutin from comment #7) > static inline uint64_t > svp64_insn_get_prefix_rm_extra3_idx0(const struct svp64_insn *insn) > { > uint64_t value = insn->value; > return ( > (((value >> UINT64_C(43)) & UINT64_C(1)) << UINT64_C(0)) | these look really easy/straightforward, i suspected they would be. bear in mind that their use in spec *might* not be in the same order, we established that with the spec_aug variable from SVP64RMExtra. > Perfect. I guess we have an opportunity to introduce a nice table in > binutils when we're done with disasm (and same for setters). huzzah.
I've played a bit with the representation, here's what we have now: 00 00 40 05 sv.bc 2,9, 08 00 49 40 spec sv.bc BO,BI,target_addr (AA=0 LK=0) binary [0:8] 00000101 [8:16] 01000000 [16:24] 00000000 [24:32] 00000000 [32:40] 01000000 [40:48] 01001001 [48:56] 00000000 [56:64] 00001000 opcode 0x40000000 mask 0xfc000003 BO 00010 38, 39, 40, 41, 42 BI 01001 43, 44, 45, 46, 47 target_addr 00000000000010{00} 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61 target_addr = EXTS(BD || 0b00)) AA 0 62 LK 0 63 mode branch 40 18 40 05 sv.add r127,r31,r65 14 0a ff 7f spec sv.add RT,RA,RB (OE=0 Rc=0) binary [0:8] 00000101 [8:16] 01000000 [16:24] 00011000 [24:32] 01000000 [32:40] 01111111 [40:48] 11111111 [48:56] 00001010 [56:64] 00010100 opcode 0x7c000214 mask 0xfc0007ff RT 01111111 38, 39, 40, 41, 42, 18, 19, 20 extra3[0] RA 11111 43, 44, 45, 46, 47 extra3[1] RB 01000001 48, 49, 50, 51, 52, 24, 25, 26 extra3[2] OE 0 53 Rc 0 63 mode normal: simple 14 02 ef 7f add r31,r15,r0 spec add RT,RA,RB (OE=0 Rc=0) binary [0:8] 01111111 [8:16] 11101111 [16:24] 00000010 [24:32] 00010100 opcode 0x7c000214 mask 0xfc0007ff RT 11111 6, 7, 8, 9, 10 RA 01111 11, 12, 13, 14, 15 RB 00000 16, 17, 18, 19, 20 OE 0 21 Rc 0 31 00 15 40 05 sv.lwzu r30,16,r15 10 00 cf 87 spec sv.lwzu RT,D(RA) binary [0:8] 00000101 [8:16] 01000000 [16:24] 00010101 [24:32] 00000000 [32:40] 10000111 [40:48] 11001111 [48:56] 00000000 [56:64] 00010000 opcode 0x84000000 mask 0xfc000000 RT 011110{0} 38, 39, 40, 41, 42, 18 extra2[0] D 0000000000010000 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63 RA 001111{0} 43, 44, 45, 46, 47, 22 extra2[2] mode ld/st imm: simple
Oops, one of the operands in sv.bc got lost... :-)
(In reply to Dmitry Selyutin from comment #11)
> Oops, one of the operands in sv.bc got lost... :-)

Damn, I used return instead of yield. Should be fine now: sv.bc 2,9,0x8
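Side note for anyone hitting the same symptom: a stray return inside a generator silently ends the iteration rather than producing a value, which is exactly how an operand goes missing. A tiny illustration (not the actual operand-walking code):

    def operands_broken():
        yield "BO"
        yield "BI"
        return "target_addr"    # ends the generator; the value is discarded

    def operands_fixed():
        yield "BO"
        yield "BI"
        yield "target_addr"

    assert list(operands_broken()) == ["BO", "BI"]                    # third operand lost
    assert list(operands_fixed()) == ["BO", "BI", "target_addr"]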
(In reply to Dmitry Selyutin from comment #10) > I've played a bit with the representation, here's what we have now: > > 00 00 40 05 sv.bc 2,9, wark-wark :) > 08 00 49 40 > spec > sv.bc BO,BI,target_addr (AA=0 LK=0) niiice > BI > 01001 > 43, 44, 45, 46, 47 > target_addr > 00000000000010{00} > 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61 > target_addr = EXTS(BD || 0b00)) nice. BD seems missing entirely but hey > 40 18 40 05 sv.add r127,r31,r65 > 14 0a ff 7f > spec > sv.add RT,RA,RB (OE=0 Rc=0) > RT > 01111111 > 38, 39, 40, 41, 42, 18, 19, 20 > extra3[0] this i assume is numbering based on 64-bit MSB0. it must be. * 38-42 minus 32 (to get them down to 32-bit MSB0) is 6-10 ah ha! that matches with RT in a 32-bit instruction, so yes. it also looks like it's vector numbering rather than scalar numbering because 18,19,20 are at the end (in LSB positions) if it's scalar it should be [19,20, 38, 39, 40, 41, 42] # or maybe 20,19,38..42 > RA > 11111 > 43, 44, 45, 46, 47 > extra3[1] this one is missing the 7-bit extension. no GPR/FPR/CR operands should miss EXTRA extension > 14 02 ef 7f add r31,r15,r0 > spec > add RT,RA,RB (OE=0 Rc=0) > RT > 11111 > 6, 7, 8, 9, 10 yeah there we go. 6-10 (so 38-42 minus 32 was also 6-10) > 00 15 40 05 sv.lwzu r30,16,r15 > 10 00 cf 87 > spec > sv.lwzu RT,D(RA) > RT > 011110{0} > 38, 39, 40, 41, 42, 18 > extra2[0] if that was vector it's correct. if it is scalar it should be {0}011110 [18,38,39....42]
> 00 15 40 05 sv.lwzu r30,16,r15 > 10 00 cf 87 repro with echo -n -e '\x00\x15\x40\x05\x10\x00\xcf\x87' | pysvp64dis -v
(In reply to Luke Kenneth Casson Leighton from comment #13) > > BI > > 01001 > > 43, 44, 45, 46, 47 > > target_addr > > 00000000000010{00} > > 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61 > > target_addr = EXTS(BD || 0b00)) > > nice. BD seems missing entirely but hey Not quite entirely, cf. last line. :-) Also the range corresponds to BD and formula explains this mystic {00}. > this i assume is numbering based on 64-bit MSB0. it must be. > if it's scalar it should be > if that was vector it's correct. > if it is scalar it should be {0}011110 [18,38,39....42] Yeah this needs tuning then. Stay tuned. :-) > this one is missing the 7-bit extension. no GPR/FPR/CR > operands should miss EXTRA extension This is caused by ambiguous wording "If EXTRA3 is zero, maps to "scalar identity" (scalar Power ISA field naming)." (same for EXTRA2). So I printed these as they were in SVP64-less world. But I can add these bits, not a big deal.
(In reply to Dmitry Selyutin from comment #15) > This is caused by ambiguous wording "If EXTRA3 is zero, maps to "scalar > identity" (scalar Power ISA field naming)." (same for EXTRA2). So I printed > these as they were in SVP64-less world. But I can add these bits, not a big > deal. ahh interesting. technically-speaking, that's actually true. so yes, that would work out fine. had me confused for a minute.
(In reply to Dmitry Selyutin from comment #10) > 00 15 40 05 sv.lwzu r30,16,r15 > 10 00 cf 87 > spec > sv.lwzu RT,D(RA) no rush, i just noticed that should be: 00 15 40 05 sv.lwzu r30,16(r15)
OK yet another take on these, never surrender! Note that I refactored the representation and {0} appears for the fields instead, this looks much clearer and explains what happens better (especially with extras). 00 00 40 05 sv.bc 2,9,0x8 08 00 49 40 spec sv.bc BO,BI,target_addr (AA=0 LK=0) binary [0:8] 00000101 [8:16] 01000000 [16:24] 00000000 [24:32] 00000000 [32:40] 01000000 [40:48] 01001001 [48:56] 00000000 [56:64] 00001000 opcode 0x40000000 mask 0xfc000003 BO 00010 38, 39, 40, 41, 42 BI 01001 43, 44, 45, 46, 47 target_addr 0000000000001000 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, {0}, {0} target_addr = EXTS(BD || 0b00)) AA 0 62 LK 0 63 mode branch 40 18 40 05 sv.add r127,r31,r65 14 0a ff 7f spec sv.add RT,RA,RB (OE=0 Rc=0) binary [0:8] 00000101 [8:16] 01000000 [16:24] 00011000 [24:32] 01000000 [32:40] 01111111 [40:48] 11111111 [48:56] 00001010 [56:64] 00010100 opcode 0x7c000214 mask 0xfc0007ff RT 1111111 19, 20, 38, 39, 40, 41, 42 extra3[0] RA 11111 43, 44, 45, 46, 47 extra3[1] RB 1000001 25, 26, 48, 49, 50, 51, 52 extra3[2] OE 0 53 Rc 0 63 mode normal: simple 14 02 ef 7f add r31,r15,r0 spec add RT,RA,RB (OE=0 Rc=0) binary [0:8] 01111111 [8:16] 11101111 [16:24] 00000010 [24:32] 00010100 opcode 0x7c000214 mask 0xfc0007ff RT 11111 6, 7, 8, 9, 10 RA 01111 11, 12, 13, 14, 15 RB 00000 16, 17, 18, 19, 20 OE 0 21 Rc 0 31 00 15 40 05 sv.lwzu r62,16,r63 10 00 df 87 spec sv.lwzu RT,D(RA) binary [0:8] 00000101 [8:16] 01000000 [16:24] 00010101 [24:32] 00000000 [32:40] 10000111 [40:48] 11011111 [48:56] 00000000 [56:64] 00010000 opcode 0x84000000 mask 0xfc000000 RT 0111110 19, {0}, 38, 39, 40, 41, 42 extra2[0] D 0000000000010000 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63 RA 0111111 23, {0}, 43, 44, 45, 46, 47 extra2[2] mode ld/st imm: simple 00 25 40 05 sv.lwzu *r0,16,r63 10 00 1f 84 spec sv.lwzu RT,D(RA) binary [0:8] 00000101 [8:16] 01000000 [16:24] 00100101 [24:32] 00000000 [32:40] 10000100 [40:48] 00011111 [48:56] 00000000 [56:64] 00010000 opcode 0x84000000 mask 0xfc000000 RT 0000000 38, 39, 40, 41, 42, 19, {0} extra2[0] D 0000000000010000 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63 RA 0111111 23, {0}, 43, 44, 45, 46, 47 extra2[2] mode ld/st imm: simple 00 35 40 05 sv.lwzu *r2,16,r63 10 00 1f 84 spec sv.lwzu RT,D(RA) binary [0:8] 00000101 [8:16] 01000000 [16:24] 00110101 [24:32] 00000000 [32:40] 10000100 [40:48] 00011111 [48:56] 00000000 [56:64] 00010000 opcode 0x84000000 mask 0xfc000000 RT 0000010 38, 39, 40, 41, 42, 19, {0} extra2[0] D 0000000000010000 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63 RA 0111111 23, {0}, 43, 44, 45, 46, 47 extra2[2] mode ld/st imm: simple There's one quirk still: `sv.lwzu *r2,16,r63` should have rather been `sv.lwzu *r2,16(r63)`. I'll update it.
Sigh, spans are still wrong wrt vectors. One more iteration.
confirmed vector EXTRA2 works. repro: $ echo "sv.lwzu *62,16(*48)" >> lwzu.tst.s $ powerpc64le-linux-gnu-as lwzu.tst.s $ powerpc64le-linux-gnu-objdump -D ./a.out 0000000000000000 <.text>: 0: 00 3a 40 05 .long 0x5403a00 4: 10 00 ec 85 lwzu r15,16(r12) $ echo -n -e '\x00\x3a\x40\x05\x10\x00\xec\x85' | pysvp64dis -v 00 3a 40 05 sv.lwzu *r62,16,*r48 10 00 ec 85 spec sv.lwzu RT,D(RA) binary [0:8] 00000101 [8:16] 01000000 [16:24] 00111010 [24:32] 00000000 [32:40] 10000101 [40:48] 11101100 [48:56] 00000000 [56:64] 00010000 opcode 0x84000000 mask 0xfc000000 RT 0111110 <- 62 (correct) 38, 39, 40, 41, 42, 19, {0} <- good extra2[0] D 0000000000010000 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63 RA 0110000 <- 48 (correct) 43, 44, 45, 46, 47, 23, {0} <- good extra2[2] mode ld/st imm: simple
$ echp 'sv.add *97,*23,*63' > svadd.tst.s $ powerpc64le-linux-gnu-as svadd.tst.s $ powerpc64le-linux-gnu-objdump -D ./a.out 0000000000000000 <.text>: 0: e0 2f 40 05 .long 0x5402fe0 4: 14 7a 05 7f add r24,r5,r15 $ echo -n -e '\xe0\x2f\x40\x05\x14\x7a\x05\x7f' | pysvp64dis -v e0 2f 40 05 sv.add *r97,*r23,*r63 14 7a 05 7f spec sv.add RT,RA,RB (OE=0 Rc=0) binary [0:8] 00000101 [8:16] 01000000 [16:24] 00101111 [24:32] 11100000 [32:40] 01111111 [40:48] 00000101 [48:56] 01111010 [56:64] 00010100 opcode 0x7c000214 mask 0xfc0007ff RT 1100001 <- correct 38, 39, 40, 41, 42, 19, 20 <- good extra3[0] RA 0010111 <- correct 43, 44, 45, 46, 47, 22, 23 <- good extra3[1] RB 0111111 <- correct 48, 49, 50, 51, 52, 25, 26 <- good extra3[2] OE 0 53 Rc 0 63 mode normal: simple
Fixed immediates. 00 00 40 05 bc 2,9,0x8 08 00 49 40 spec sv.bc 10,0,0x0 (AA=0 LK=0) binary [0:8] 00000101 [8:16] 01000000 [16:24] 00000000 [24:32] 00000000 [32:40] 01000000 [40:48] 01001001 [48:56] 00000000 [56:64] 00001000 opcode 0x40000000 mask 0xfc000003 BO 00010 38, 39, 40, 41, 42 BI 01001 43, 44, 45, 46, 47 target_addr 0000000000001000 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, {0}, {0} target_addr = EXTS(BD || 0b00)) AA 0 62 LK 0 63 mode branch 40 18 40 05 add r127,r31,r65 14 0a ff 7f spec sv.add r10,r0,r3 (OE=0 Rc=0) binary [0:8] 00000101 [8:16] 01000000 [16:24] 00011000 [24:32] 01000000 [32:40] 01111111 [40:48] 11111111 [48:56] 00001010 [56:64] 00010100 opcode 0x7c000214 mask 0xfc0007ff RT 1111111 19, 20, 38, 39, 40, 41, 42 extra3[0] RA 11111 43, 44, 45, 46, 47 extra3[1] RB 1000001 25, 26, 48, 49, 50, 51, 52 extra3[2] OE 0 53 Rc 0 63 mode normal: simple 14 02 ef 7f add r31,r15,r0 spec add r31,r15,r0 (OE=0 Rc=0) binary [0:8] 01111111 [8:16] 11101111 [16:24] 00000010 [24:32] 00010100 opcode 0x7c000214 mask 0xfc0007ff RT 11111 6, 7, 8, 9, 10 RA 01111 11, 12, 13, 14, 15 RB 00000 16, 17, 18, 19, 20 OE 0 21 Rc 0 31 00 15 40 05 lwzu r62,16(r63) 10 00 df 87 spec sv.lwzu r10,5376(r0) binary [0:8] 00000101 [8:16] 01000000 [16:24] 00010101 [24:32] 00000000 [32:40] 10000111 [40:48] 11011111 [48:56] 00000000 [56:64] 00010000 opcode 0x84000000 mask 0xfc000000 RT 0111110 {0}, 19, 38, 39, 40, 41, 42 extra2[0] D 0000000000010000 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63 RA 0111111 {0}, 23, 43, 44, 45, 46, 47 extra2[2] mode ld/st imm: simple 00 25 40 05 lwzu *r0,16(r63) 10 00 1f 84 spec sv.lwzu r10,9472(r0) binary [0:8] 00000101 [8:16] 01000000 [16:24] 00100101 [24:32] 00000000 [32:40] 10000100 [40:48] 00011111 [48:56] 00000000 [56:64] 00010000 opcode 0x84000000 mask 0xfc000000 RT 0000000 38, 39, 40, 41, 42, 19, {0} extra2[0] D 0000000000010000 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63 RA 0111111 {0}, 23, 43, 44, 45, 46, 47 extra2[2] mode ld/st imm: simple 00 35 40 05 lwzu *r2,16(r63) 10 00 1f 84 spec sv.lwzu r10,13568(r0) binary [0:8] 00000101 [8:16] 01000000 [16:24] 00110101 [24:32] 00000000 [32:40] 10000100 [40:48] 00011111 [48:56] 00000000 [56:64] 00010000 opcode 0x84000000 mask 0xfc000000 RT 0000010 38, 39, 40, 41, 42, 19, {0} extra2[0] D 0000000000010000 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63 RA 0111111 {0}, 23, 43, 44, 45, 46, 47 extra2[2] mode ld/st imm: simple
00 00 40 05 sv.bc 2,9,0x8 08 00 49 40 spec sv.bc BO,BI,target_addr (AA=0 LK=0) binary [0:8] 00000101 [8:16] 01000000 [16:24] 00000000 [24:32] 00000000 [32:40] 01000000 [40:48] 01001001 [48:56] 00000000 [56:64] 00001000 opcode 0x40000000 mask 0xfc000003 BO 00010 38, 39, 40, 41, 42 BI 01001 43, 44, 45, 46, 47 target_addr 0000000000001000 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, {0}, {0} target_addr = EXTS(BD || 0b00)) AA 0 62 LK 0 63 mode branch 40 18 40 05 sv.add r127,r31,r65 14 0a ff 7f spec sv.add RT,RA,RB (OE=0 Rc=0) binary [0:8] 00000101 [8:16] 01000000 [16:24] 00011000 [24:32] 01000000 [32:40] 01111111 [40:48] 11111111 [48:56] 00001010 [56:64] 00010100 opcode 0x7c000214 mask 0xfc0007ff RT 1111111 19, 20, 38, 39, 40, 41, 42 extra3[0] RA 11111 43, 44, 45, 46, 47 extra3[1] RB 1000001 25, 26, 48, 49, 50, 51, 52 extra3[2] OE 0 53 Rc 0 63 mode normal: simple 14 02 ef 7f add r31,r15,r0 spec add RT,RA,RB (OE=0 Rc=0) binary [0:8] 01111111 [8:16] 11101111 [16:24] 00000010 [24:32] 00010100 opcode 0x7c000214 mask 0xfc0007ff RT 11111 6, 7, 8, 9, 10 RA 01111 11, 12, 13, 14, 15 RB 00000 16, 17, 18, 19, 20 OE 0 21 Rc 0 31 00 15 40 05 sv.lwzu r62,16(r63) 10 00 df 87 spec sv.lwzu RT,D(RA) binary [0:8] 00000101 [8:16] 01000000 [16:24] 00010101 [24:32] 00000000 [32:40] 10000111 [40:48] 11011111 [48:56] 00000000 [56:64] 00010000 opcode 0x84000000 mask 0xfc000000 RT 0111110 {0}, 19, 38, 39, 40, 41, 42 extra2[0] D 0000000000010000 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63 RA 0111111 {0}, 23, 43, 44, 45, 46, 47 extra2[2] mode ld/st imm: simple 00 25 40 05 sv.lwzu *r0,16(r63) 10 00 1f 84 spec sv.lwzu RT,D(RA) binary [0:8] 00000101 [8:16] 01000000 [16:24] 00100101 [24:32] 00000000 [32:40] 10000100 [40:48] 00011111 [48:56] 00000000 [56:64] 00010000 opcode 0x84000000 mask 0xfc000000 RT 0000000 38, 39, 40, 41, 42, 19, {0} extra2[0] D 0000000000010000 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63 RA 0111111 {0}, 23, 43, 44, 45, 46, 47 extra2[2] mode ld/st imm: simple 00 35 40 05 sv.lwzu *r2,16(r63) 10 00 1f 84 spec sv.lwzu RT,D(RA) binary [0:8] 00000101 [8:16] 01000000 [16:24] 00110101 [24:32] 00000000 [32:40] 10000100 [40:48] 00011111 [48:56] 00000000 [56:64] 00010000 opcode 0x84000000 mask 0xfc000000 RT 0000010 38, 39, 40, 41, 42, 19, {0} extra2[0] D 0000000000010000 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63 RA 0111111 {0}, 23, 43, 44, 45, 46, 47 extra2[2] mode ld/st imm: simple
(In reply to Dmitry Selyutin from comment #22) > Fixed immediates. briiilliant. > 00 00 40 05 bc 2,9,0x8 > 08 00 49 40 > spec > sv.bc 10,0,0x0 (AA=0 LK=0) oh whoops, should be sv.bc BO,BI,target_addr (AA....) ah yes comment #23 has it 00 40 05 sv.bc 2,9,0x8 08 00 49 40 spec sv.bc BO,BI,target_addr (AA=0 LK=0) fantastic.
Today I had to mess around opcode matching again. This is still tricky, but at least it matches svshape2. Below is the code for media/audio/mp3/mp3_0_apply_window_float_basicsv.s.sv, with one change: I removed .rodata section, because it messed with the disassembler. I checked only some of the instructions, but mostly it looks like the expected result. addis 2,12,0 addi 2,2,0 addis 9,2,0 addi 9,9,0 rlwinm 7,7,2,0,29 mulli 0,7,31 add 10,6,0 setvl 0,0,8,1,1,0 addi 16,4,124 lfiwax 0,0,5 addi 5,3,64 sv.lfs *32,256(4) sv.lfs *40,256(5) sv.fmuls *32,*32,*40 sv.fadds 0,*32,0 addi 5,3,192 addi 4,4,128 sv.lfs *32,256(4) sv.lfs *40,256(5) sv.fmuls *32,*32,*40 sv.fsubs 0,0,*32 addi 4,4,65408 stfs 0,0(6) add 6,6,7 addi 4,4,4 addi 0,0,15 mtspr 288,0 addi 8,0,4 lfiwax 0,0,9 lfiwax 1,0,9 addi 5,3,64 add 5,5,8 sv.lfs *32,256(5) sv.lfs *40,256(4) sv.lfs *48,256(16) sv.fmuls *40,*32,*40 sv.fadds 0,0,*40 sv.fmuls *32,*32,*48 sv.fsubs 1,1,*32 addi 5,3,192 subf 5,8,5 addi 4,4,128 addi 16,16,128 sv.lfs *32,256(5) sv.lfs *40,256(4) sv.lfs *48,256(16) sv.fmuls *40,*32,*40 sv.fsubs 0,0,*40 sv.fmuls *32,*32,*48 sv.fsubs 1,1,*32 addi 4,4,65408 addi 16,16,65408 stfs 0,0(6) add 6,6,7 stfs 1,0(10) subf 10,7,10 addi 8,8,4 addi 4,4,4 addi 16,16,65532 bc 16,0,0xff4c addi 5,3,128 addi 4,4,128 lfiwax 0,0,9 sv.lfs *32,256(4) sv.lfs *40,256(5) sv.fmuls *32,*32,*40 sv.fsubs 0,0,*32 stfs 0,0(6) bclr 20,0,0
As for CRs support, these still don't work; Luke, it'd be great if you could take a look at 73f9c4c65cb886bcf0c242ca48fc9dc339fe5ba3.
Refactored D operand so that it shows the verbose information in a more obvious and explicit form. 47 00 90 59 fmvis f12,97 spec fmvis FRS,D binary [0:8] 01011001 [8:16] 10010000 [16:24] 00000000 [24:32] 01000111 opcode 0x58000006 mask 0x5800003e FRS 01100 6, 7, 8, 9, 10 D d0 = D[0:9] 0000000001 16, 17, 18, 19, 20, 21, 22, 23, 24, 25 d1 = D[10:15] 10000 11, 12, 13, 14, 15 d2 = D[16] 1 31
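To make the d0/d1/d2 breakdown above concrete: the 16-bit D immediate of fmvis is stored as three non-contiguous chunks, and the disassembler glues them back together as D = d0 || d1 || d2. A minimal sketch of that reassembly (chunk widths taken from the dump above; the helper name is made up):

    def reassemble(chunks):
        # Concatenate (value, width) chunks, most-significant chunk first.
        value = 0
        for chunk, width in chunks:
            value = (value << width) | (chunk & ((1 << width) - 1))
        return value

    # fmvis f12,97: d0=0b0000000001, d1=0b10000, d2=0b1  ->  0b0000000001100001 == 97
    assert reassemble([(0b0000000001, 10), (0b10000, 5), (0b1, 1)]) == 97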
After some thought, I decided to also slightly tune target_addr; hopefully it's clearer now. 00 00 40 05 sv.bc 12,2,0x1c 1c 00 82 41 spec sv.bc BO,BI,target_addr (AA=0 LK=0) binary [0:8] 00000101 [8:16] 01000000 [16:24] 00000000 [24:32] 00000000 [32:40] 01000001 [40:48] 10000010 [48:56] 00000000 [56:64] 00011100 opcode 0x40000000 mask 0x40000003 BO 01100 38, 39, 40, 41, 42 BI 00010 43, 44, 45, 46, 47 target_addr = EXTS(BD || 0b00)) BD 0000000000011100 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, {0}, {0} AA 0 62 LK 0 63 mode branch
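For reference, the target_addr = EXTS(BD || 0b00) line boils down to: append the two implicit zero bits to the 14-bit BD field and sign-extend the result. A small sketch of how the printed offset can be derived (illustrative only; the real code presumably operates on the bit spans directly):

    def branch_target(bd, bd_bits=14):
        value = bd << 2                      # BD || 0b00
        width = bd_bits + 2
        if value & (1 << (width - 1)):       # EXTS: negative displacement
            value -= 1 << width
        return value

    assert branch_target(0b00000000000111) == 0x1c   # the sv.bc 12,2,0x1c example above
    assert branch_target(0x3fd3) == -0xb4            # a backwards branch, printed as -0xb4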
From now on, we support pcode in extended mode, too. 00 38 40 05 sv.add *r3,r2,r1 14 0a 02 7c spec sv.add RT,RA,RB (OE=0 Rc=0) pcode RT <- (RA) + (RB) binary [0:8] 00000101 [8:16] 01000000 [16:24] 00111000 [24:32] 00000000 [32:40] 01111100 [40:48] 00000010 [48:56] 00001010 [56:64] 00010100 opcode 0x7c000214 mask 0x7c000615 RT (vector) 0000011 38, 39, 40, 41, 42, 19, 20 extra3[0] RA (scalar) 00010 43, 44, 45, 46, 47 extra3[1] RB (scalar) 00001 48, 49, 50, 51, 52 extra3[2] OE 0 53 Rc 0 63 mode normal: simple
(In reply to Dmitry Selyutin from comment #29) > From now on, we support pcode in extended mode, too. nice idea for some ops - setvl and svshape on the other hand are massive. suggest it being an option. > 00 38 40 05 sv.add *r3,r2,r1 > 14 0a 02 7c > spec > sv.add RT,RA,RB (OE=0 Rc=0) > pcode > RT <- (RA) + (RB) > RT (vector) > RA (scalar) wha-hey!
(In reply to Dmitry Selyutin from comment #26) > As for CRs support, these still don't work; Luke, it'd be great if you could > take a look at 73f9c4c65cb886bcf0c242ca48fc9dc339fe5ba3. willdo - looks like you forgot that some can be 5-bit others 3-bit. BA/BB/BC is 5-bit, BF/BFA (Bit Field) is 3-bit
Yet another day wasted on opcodes, but now we can finally recognize tricky cases like svshape2 and isel. Returning to CRs.
Since I'm unsure about CRs, I took this opportunity to cleanup opcodes even more. Here's brand new representation: 59 00 00 58 svshape 1,1,1,0,1 spec svshape SVxd,SVyd,SVzd,SVrm,vf opcodes 010110---------------0000-011001 010110---------------0001-011001 010110---------------0010-011001 010110---------------0011-011001 010110---------------0100-011001 010110---------------0101-011001 010110---------------0110-011001 010110---------------0111-011001 010110---------------1010-011001 010110---------------1011-011001 010110---------------1100-011001 010110---------------1101-011001 010110---------------1110-011001 010110---------------1111-011001 19 04 c0 5b svshape2 15,0,0,1,0,0 spec svshape2 SVo,SVyx,rmm,SVd,sk,mm opcodes 010110---------------100--011001
Established dictionaries for opcodes and names (the opcodes approach I frankly stole from binutils: I simply hash by PO). Before:

real 0m1.227s
user 0m1.210s
sys 0m0.015s

After:

real 0m0.986s
user 0m0.976s
sys 0m0.008s

I won't continue these experiments now due to time constraints, unless another obvious optimization comes to mind.
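For the record, the hashing amounts to: bucket the records by PO (the top 6 bits of the suffix) and only run the value/mask comparison within that bucket. A simplified sketch (the record layout here is stripped down to name/value/mask; the real tables carry far more, as the C structures further down show):

    from collections import defaultdict

    def build_po_index(records):
        index = defaultdict(list)
        for name, value, mask in records:      # value/mask over the 32-bit suffix
            index[(value >> 26) & 0x3f].append((name, value, mask))
        return index

    def lookup(index, insn):
        for name, value, mask in index[(insn >> 26) & 0x3f]:
            if (insn & mask) == value:
                return name
        return None

    index = build_po_index([("add", 0x7c000214, 0xfc0007ff)])
    assert lookup(index, 0x7f057a14) == "add"   # the add r24,r5,r15 word seen earlier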
$ echo "sv.isel 10,20,30,33" | pysvp64asm > svisel.tst.s $ powerpc64le-linux-gnu-as svisel.tst.s $ powerpc64le-linux-gnu-objdump -D ./a.out $ echo -n -e '\x40\x00\x40\x05\x5e\xf0\x54\x7d' | pysvp64dis -v 40 00 40 05 sv.isel r10,r20,r30,33 5e f0 54 7d BC (scalar) 000100001 {0}, {0}, {0}, 25, 53, 54, 55, 56, 57 extra2[3] $ echo "sv.isel 10,20,30,*33" | pysvp64asm > svisel.tst.s echo -n -e '\xc0\x00\x40\x05\x5e\xf0\x54\x7d' | pysvp64dis -v c0 00 40 05 sv.isel r10,r20,r30,*33 5e f0 54 7d BC (vector) 000100001 53, 54, 55, 25, {0}, {0}, {0}, 56, 57 extra2[3]
all good https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;h=b49d42520dbba44d6fc5421b57ea1202ed47252d echo "sv.isel 10,20,30,*483" | pysvp64asm > svisel.tst.s echo -n -e '\xc0\x00\x40\x05\xde\xf7\x54\x7d' | pysvp64dis -v c0 00 40 05 sv.isel r10,r20,r30,*483 de f7 54 7d BC (vector) 111100011 53, 54, 55, 25, {0}, {0}, {0}, 56, 57 extra2[3] echo "sv.isel 10,20,30,63" | pysvp64asm > svisel.tst.s echo -n -e '\x40\x00\x40\x05\xde\xf7\x54\x7d' | pysvp64dis -v 40 00 40 05 sv.isel r10,r20,r30,63 de f7 54 7d BC (scalar) 000111111 {0}, {0}, {0}, 25, 53, 54, 55, 56, 57 extra2[3]
Today's achievements:

1. Fixed Rc-enabled instructions (the matching algorithm was incorrect).
2. Refactored RM mappings so that these can be different depending on the mode (needed for CR ops and branches, probably others).
3. Supported multiple opcodes for binutils (not yet tested, but at first glance it looks fine).

Below is how it looks in C.

struct svp64_opcode {
    uint32_t value;
    uint32_t mask;
};

struct svp64_record {
    const char *name;
    struct svp64_desc desc;
    const struct svp64_opcode *opcodes;
    size_t nr_opcodes;
};

const struct svp64_opcode svp64_opcodes[] = \
{
    {
        .value = 0x7c0004ae,
        .mask = 0xfc0007fe,
    },
    /* snip */
}

const struct svp64_record svp64_records[] = \
{
    /* snip */
    {
        .name = "sv.isel",
        .desc = {
            .function = SVP64_FUNCTION_CR,
            .in1 = SVP64_IN1_SEL_RA_OR_ZERO,
            .in2 = SVP64_IN2_SEL_RB,
            .in3 = SVP64_IN3_SEL_NONE,
            .out = SVP64_OUT_SEL_RT,
            .out2 = SVP64_OUT_SEL_RT,
            .cr_in = SVP64_CR_IN_SEL_BC,
            .cr_in2 = SVP64_CR_IN2_SEL_NONE,
            .cr_out = SVP64_CR_OUT_SEL_NONE,
            .ptype = SVP64_PTYPE_P1,
            .etype = SVP64_ETYPE_EXTRA2,
            .extra_idx_in1 = SVP64_EXTRA_IDX1,
            .extra_idx_in2 = SVP64_EXTRA_IDX2,
            .extra_idx_in3 = SVP64_EXTRA_NONE,
            .extra_idx_out = SVP64_EXTRA_IDX0,
            .extra_idx_out2 = SVP64_EXTRA_NONE,
            .extra_idx_cr_in = SVP64_EXTRA_IDX3,
            .extra_idx_cr_out = SVP64_EXTRA_NONE,
        },
        .opcodes = &svp64_opcodes[288],
        .nr_opcodes = 32,
    },
    /* snip */
}
Oh, and, by the way:

{
    .name = "sv.crand",
    .desc = {
        .function = SVP64_FUNCTION_CR,
        .in1 = SVP64_IN1_SEL_NONE,
        .in2 = SVP64_IN2_SEL_NONE,
        .in3 = SVP64_IN3_SEL_NONE,
        .out = SVP64_OUT_SEL_NONE,
        .out2 = SVP64_OUT_SEL_NONE,
        .cr_in = SVP64_CR_IN_SEL_BA,
        .cr_in2 = SVP64_CR_IN2_SEL_BB,
        .cr_out = SVP64_CR_OUT_SEL_BT,
        .ptype = SVP64_PTYPE_P1,
        .etype = SVP64_ETYPE_EXTRA3,
        .extra_idx_in1 = SVP64_EXTRA_NONE,
        .extra_idx_in2 = SVP64_EXTRA_NONE,
        .extra_idx_in3 = SVP64_EXTRA_NONE,
        .extra_idx_out = SVP64_EXTRA_NONE,
        .extra_idx_out2 = SVP64_EXTRA_NONE,
        .extra_idx_cr_in = SVP64_EXTRA_IDX1,
        .extra_idx_cr_out = SVP64_EXTRA_IDX0,
    },
    .opcodes = &svp64_opcodes[226],
    .nr_opcodes = 1,
},
Refactored RM modes, since we're approaching to the stage where some modes reuse existing fields for their needs (e.g. cr_ops reuse ewsrc). Setting these in ewsrc is not obvious; however, when you do `insn.prefix.rm.cr_ops.sz = 1`, it is way clearer. Below is what we have now. Note that some fields are present in base RM class: I didn't want to duplicate all these fields in all children subclasses. This leads to the fact that say insn.prefix.rm.cr_ops will also have ewsrc inherited, it just won't be used. RM.mmode RM.mask RM.elwidth RM.ewsrc RM.subvl RM.mode RM.smask RM.extra RM.extra2 RM.extra2.idx0 RM.extra2.idx1 RM.extra2.idx2 RM.extra2.idx3 RM.extra3 RM.extra3.idx0 RM.extra3.idx1 RM.extra3.idx2 RM.normal RM.normal.mmode RM.normal.mask RM.normal.elwidth RM.normal.ewsrc RM.normal.subvl RM.normal.mode RM.normal.smask RM.normal.extra RM.normal.extra2 RM.normal.extra2.idx0 RM.normal.extra2.idx1 RM.normal.extra2.idx2 RM.normal.extra2.idx3 RM.normal.extra3 RM.normal.extra3.idx0 RM.normal.extra3.idx1 RM.normal.extra3.idx2 RM.normal.simple RM.normal.simple.dz RM.normal.simple.sz RM.normal.smr RM.normal.smr.RG RM.normal.pmr RM.normal.svmr RM.normal.svmr.SVM RM.normal.pu RM.normal.pu.SVM RM.normal.ffrc1 RM.normal.ffrc1.inv RM.normal.ffrc1.CR RM.normal.ffrc0 RM.normal.ffrc0.inv RM.normal.ffrc0.VLi RM.normal.ffrc0.RC1 RM.normal.sat RM.normal.sat.N RM.normal.sat.dz RM.normal.sat.sz RM.normal.satx RM.normal.satx.N RM.normal.satx.zz RM.normal.satx.dz RM.normal.satx.sz RM.normal.satpu RM.normal.satpu.N RM.normal.satpu.zz RM.normal.satpu.dz RM.normal.satpu.sz RM.normal.prrc1 RM.normal.prrc1.inv RM.normal.prrc1.CR RM.normal.prrc0 RM.normal.prrc0.inv RM.normal.prrc0.zz RM.normal.prrc0.RC1 RM.normal.prrc0.dz RM.normal.prrc0.sz RM.ldst_imm RM.ldst_imm.mmode RM.ldst_imm.mask RM.ldst_imm.elwidth RM.ldst_imm.ewsrc RM.ldst_imm.subvl RM.ldst_imm.mode RM.ldst_imm.smask RM.ldst_imm.extra RM.ldst_imm.extra2 RM.ldst_imm.extra2.idx0 RM.ldst_imm.extra2.idx1 RM.ldst_imm.extra2.idx2 RM.ldst_imm.extra2.idx3 RM.ldst_imm.extra3 RM.ldst_imm.extra3.idx0 RM.ldst_imm.extra3.idx1 RM.ldst_imm.extra3.idx2 RM.ldst_imm.simple RM.ldst_imm.simple.zz RM.ldst_imm.simple.els RM.ldst_imm.simple.dz RM.ldst_imm.simple.sz RM.ldst_imm.spu RM.ldst_imm.spu.zz RM.ldst_imm.spu.els RM.ldst_imm.spu.dz RM.ldst_imm.spu.sz RM.ldst_imm.ffrc1 RM.ldst_imm.ffrc1.inv RM.ldst_imm.ffrc1.CR RM.ldst_imm.ffrc0 RM.ldst_imm.ffrc0.inv RM.ldst_imm.ffrc0.els RM.ldst_imm.ffrc0.RC1 RM.ldst_imm.sat RM.ldst_imm.sat.N RM.ldst_imm.sat.zz RM.ldst_imm.sat.els RM.ldst_imm.sat.dz RM.ldst_imm.sat.sz RM.ldst_imm.prrc1 RM.ldst_imm.prrc1.inv RM.ldst_imm.prrc1.CR RM.ldst_imm.prrc0 RM.ldst_imm.prrc0.inv RM.ldst_imm.prrc0.els RM.ldst_imm.prrc0.RC1 RM.ldst_idx RM.ldst_idx.mmode RM.ldst_idx.mask RM.ldst_idx.elwidth RM.ldst_idx.ewsrc RM.ldst_idx.subvl RM.ldst_idx.mode RM.ldst_idx.smask RM.ldst_idx.extra RM.ldst_idx.extra2 RM.ldst_idx.extra2.idx0 RM.ldst_idx.extra2.idx1 RM.ldst_idx.extra2.idx2 RM.ldst_idx.extra2.idx3 RM.ldst_idx.extra3 RM.ldst_idx.extra3.idx0 RM.ldst_idx.extra3.idx1 RM.ldst_idx.extra3.idx2 RM.ldst_idx.simple RM.ldst_idx.simple.SEA RM.ldst_idx.simple.sz RM.ldst_idx.simple.dz RM.ldst_idx.stride RM.ldst_idx.stride.SEA RM.ldst_idx.stride.dz RM.ldst_idx.stride.sz RM.ldst_idx.sat RM.ldst_idx.sat.N RM.ldst_idx.sat.dz RM.ldst_idx.sat.sz RM.ldst_idx.prrc1 RM.ldst_idx.prrc1.inv RM.ldst_idx.prrc1.CR RM.ldst_idx.prrc0 RM.ldst_idx.prrc0.inv RM.ldst_idx.prrc0.zz RM.ldst_idx.prrc0.RC1 RM.ldst_idx.prrc0.dz RM.ldst_idx.prrc0.sz
(In reply to Dmitry Selyutin from comment #39) > now. Note that some fields are present in base RM class: I didn't want to > duplicate all these fields in all children subclasses. This leads to the > fact that say insn.prefix.rm.cr_ops will also have ewsrc inherited, it just > won't be used. ack. ok apologies, i had to *yet again* redo pack/unpack, the temporary hack is an overload of RM.elwidth. the reason is, pack/unpack has to go into SVSTATE and be specially updated by setvl. https://bugs.libre-soc.org/show_bug.cgi?id=871#c4 > RM.elwidth overloaded 2 bits, pack and unpack, and only in normal mode. this will move AT SOME time not now to a new VL mode > RM.normal.elwidth here is temporarily joined by pack/unpack bits > RM.normal.pu > RM.normal.pu.SVM removed. > RM.normal.satpu > RM.normal.satpu.N > RM.normal.satpu.zz > RM.normal.satpu.dz > RM.normal.satpu.sz all removed. > RM.ldst_imm.elwidth NOT joined by packunpack. > RM.ldst_imm.spu > RM.ldst_imm.spu.zz > RM.ldst_imm.spu.els > RM.ldst_imm.spu.dz > RM.ldst_imm.spu.sz all removed.
(In reply to Dmitry Selyutin from comment #25) > Today I had to mess around opcode matching again. This is still tricky, but > at least it matches svshape2. Below is the code for > media/audio/mp3/mp3_0_apply_window_float_basicsv.s.sv, interesting. i dropped them all into test_pysvp64dis.py, it gets a compile error, v. strange. constants not in range. > addis 2,12,0 > addi 2,2,0
I'll check later, I might have broken something or run the tests on a different branch...
I copied and pasted the last test into pysvp64asm (notorious for its longs). Here's what I got after calling binutils:

/tmp/test.s: Assembler messages:
/tmp/test.s:22: Error: operand out of range (0xff80 is not between 0xffffffffffff8000 and 0x7fff)
/tmp/test.s:51: Error: operand out of range (0xff80 is not between 0xffffffffffff8000 and 0x7fff)
/tmp/test.s:52: Error: operand out of range (0xff80 is not between 0xffffffffffff8000 and 0x7fff)
/tmp/test.s:59: Error: operand out of range (0xfffc is not between 0xffffffffffff8000 and 0x7fff)
/tmp/test.s:60: Error: operand out of range (0xff4c is not between 0xffffffffffff8000 and 0x7ffc)

addi 4,4,65408
addi 4,4,65408
addi 16,16,65408
addi 16,16,65532
bc 16,0,0xff4c

These are binutils internal checks. They have no relation to the disassembler itself.
I'll check the assembly one more time. Perhaps pysvp64dis was wrong, perhaps pysvp64asm, perhaps both.
OK, here's the reproducer:

pysvp64asm ./media/audio/mp3/mp3_0_apply_window_float_basicsv.s /tmp/py.s
powerpc64le-linux-gnu-as /tmp/py.s -o /tmp/py.o && powerpc-linux-gnu-objcopy -Obinary /tmp/py.o /tmp/bin.o
python3 src/openpower/sv/trans/pysvp64dis.py /tmp/bin.o -s
(In reply to Dmitry Selyutin from comment #43) > I copied and pasted the last test into pysvp64asm (notorious for its longs). > Here's what I got after calling binutils: > > addi 4,4,65408 these look like constants have been mis-converted, not recognised as sign-extended negative numbers, truncated instead to 16-bit > addi 4,4,65408 > addi 16,16,65408 > addi 16,16,65532 > bc 16,0,0xff4c
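In other words, the fix amounts to interpreting those 16-bit D/SI-style fields as two's-complement before printing. A minimal sketch of that interpretation (not the actual patch):

    def to_signed(value, bits=16):
        if value & (1 << (bits - 1)):
            value -= 1 << bits
        return value

    assert to_signed(65408) == -128    # addi 4,4,65408   ->  addi 4,4,-128
    assert to_signed(65532) == -4      # addi 16,16,65532 ->  addi 16,16,-4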
(In reply to Luke Kenneth Casson Leighton from comment #46) > these look like constants have been mis-converted, not recognised > as sign-extended negative numbers, truncated instead to 16-bit Yeah you missed the IRC conversation, I already found it. https://libre-soc.org/irclog/%23libre-soc.2022-09-13.log.html#t2022-09-13T18:31:25
Added support for signed operands (as many as I could find in fields.text; I could miss something, though). addi 2,2,0 addis 9,2,0 addi 9,9,0 rlwinm 7,7,2,0,29 mulli 0,7,31 add 10,6,0 setvl 0,0,8,1,1,0 addi 16,4,124 lfiwax 0,0,5 addi 5,3,64 sv.lfs *32,256(4) sv.lfs *40,256(5) sv.fmuls *32,*32,*40 sv.fadds 0,*32,0 addi 5,3,192 addi 4,4,128 sv.lfs *32,256(4) sv.lfs *40,256(5) sv.fmuls *32,*32,*40 sv.fsubs 0,0,*32 addi 4,4,-128 stfs 0,0(6) add 6,6,7 addi 4,4,4 addi 0,0,15 mtspr 288,0 addi 8,0,4 lfiwax 0,0,9 lfiwax 1,0,9 addi 5,3,64 add 5,5,8 sv.lfs *32,256(5) sv.lfs *40,256(4) sv.lfs *48,256(16) sv.fmuls *40,*32,*40 sv.fadds 0,0,*40 sv.fmuls *32,*32,*48 sv.fsubs 1,1,*32 addi 5,3,192 subf 5,8,5 addi 4,4,128 addi 16,16,128 sv.lfs *32,256(5) sv.lfs *40,256(4) sv.lfs *48,256(16) sv.fmuls *40,*32,*40 sv.fsubs 0,0,*40 sv.fmuls *32,*32,*48 sv.fsubs 1,1,*32 addi 4,4,-128 addi 16,16,-128 stfs 0,0(6) add 6,6,7 stfs 1,0(10) subf 10,7,10 addi 8,8,4 addi 4,4,4 addi 16,16,-4 bc 16,0,-0xb4 addi 5,3,128 addi 4,4,128 lfiwax 0,0,9 sv.lfs *32,256(4) sv.lfs *40,256(5) sv.fmuls *32,*32,*40 sv.fsubs 0,0,*32 stfs 0,0(6) bclr 20,0,0
====================================================================== FAIL: test_7_batch (__main__.SVSTATETestCase) [58:bc] these come from https://bugs.libre-soc.org/show_bug.cgi?id=917#c25 ---------------------------------------------------------------------- Traceback (most recent call last): File "openpower/sv/trans/test_pysvp64dis.py", line 27, in _do_tst "'%s' expected '%s'" % (line, expected[i])) AssertionError: 'bc 16,0,-0xb4' != 'bc 16,0,0xff4c' - bc 16,0,-0xb4 ? - ^ + bc 16,0,0xff4c ? ^^ + : instruction does not match 'bc 16,0,0xff4c' expected 'bc 16,0,-0xb4' 58 means instruction 58 in the list tested. https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;h=6b844da793861c21b8221db4c17370502b740601
Is it the recent version? ====================================================================== FAIL: test_7_batch (__main__.SVSTATETestCase) [26:mtspr] these come from https://bugs.libre-soc.org/show_bug.cgi?id=917#c25 ---------------------------------------------------------------------- Traceback (most recent call last): File "src/openpower/sv/trans/test_pysvp64dis.py", line 27, in _do_tst "'%s' expected '%s'" % (line, expected[i])) AssertionError: 'mtspr 288,0' != 'mtspr 9,0' - mtspr 288,0 ? ^^^ + mtspr 9,0 ? ^ : instruction does not match 'mtspr 9,0' expected 'mtspr 288,0' ---------------------------------------------------------------------- Ran 8 tests in 18.859s FAILED (failures=1, errors=1)
cf. commit b14eaa812791aa6c089bffcfa726a467c175ed8b, should be on dis branch
$ git checkout b14eaa812791aa6c089bffcfa726a46
Note: switching to 'b14eaa812791aa6c089bffcfa726a46'.

$ python3 src/openpower/sv/trans/test_pysvp64dis.py > /tmp/f
ERROR: test_2_d_custom_op (__main__.SVSTATETestCase)
FAIL: test_7_batch (__main__.SVSTATETestCase) [26:mtspr]
: instruction does not match 'mtspr 9,0' expected 'mtspr 288,0'
(In reply to Dmitry Selyutin from comment #50) > Is it the recent version? no. https://git.libre-soc.org/?p=openpower-isa.git;a=commit;h=6b844da793861c21b8221db4c17370502b740601 git checkout 6b844da7938
(In reply to Luke Kenneth Casson Leighton from comment #52)
> FAIL: test_7_batch (__main__.SVSTATETestCase) [26:mtspr]
> : instruction does not match 'mtspr 9,0' expected 'mtspr 288,0'

got it.

https://libre-soc.org/openpower/isa/sprset/

    n <- spr[5:9] || spr[0:4]

sigh. solvable by adding a definition to fields.txt for spr

https://git.libre-soc.org/?p=openpower-isa.git;a=blob;f=openpower/isatables/fields.text;h=95c76fb77fbcbe99c45c0a41d39a7cf50a0636c9;hb=e3ebeaafbc0fc1864f05746c49c1b6b98b3e12ad

    827 SPR (11:20)

wark-wark replace with

    827 SPR (16:20,11:15)

and altering the pseudocode to just "n <- spr"
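For anyone puzzled by the 9-vs-288 pair: the architected SPR number is the two 5-bit halves of the instruction field swapped, so one rendering reads the raw 10-bit field as a plain integer (288) while the other gives the architected number (9, i.e. CTR). In Python terms (illustrative only; the actual fix is the fields.text change quoted above):

    def spr_number(spr_field):
        spr_0_4 = (spr_field >> 5) & 0x1f    # first half of the raw field
        spr_5_9 = spr_field & 0x1f           # second half of the raw field
        return (spr_5_9 << 5) | spr_0_4      # n <- spr[5:9] || spr[0:4]

    assert spr_number(288) == 9    # the 'mtspr 288,0' vs 'mtspr 9,0' pair above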
(In reply to Luke Kenneth Casson Leighton from comment #54) > solvable by adding a definition to fields.txt for spr > and altering the pseudocode to just "n <- spr" sorted. works great.
    def test_2_d_custom_op(self):
        expected = [
            'fishmv 12,2',
            'fmvis 12,97',
            'addpcis 12,5',
        ]

stopped working. signed operand. probably addpcis.

  File "/home/lkcl/src/libresoc/openpower-isa/src/openpower/decoder/power_insn.py", line 1095, in dynamic_operands
    value = " ".join(dis)
  File "/home/lkcl/src/libresoc/openpower-isa/src/openpower/decoder/power_insn.py", line 551, in disassemble
    value = insn[span]
  File "/home/lkcl/src/libresoc/openpower-isa/src/openpower/decoder/power_fields.py", line 286, in __getitem__
    return self.__members[key]
KeyError: None
Refactored RM handling again, fixed multiple issues in both power_insn and power_fields modules in scope of these works. Also now RM types inherit the docstring, so we no longer need some table between RM classes and descriptions. Below is an example of what we have for now. As usual, this resides in dis branch; the work is experimental, but I checked pysvp64dis test and it still works. ff ff ff 07 sv.add. *r3,*r7,*r11 15 12 01 7c spec sv.add. RT,RA,RB (OE=0 Rc=1) pcode RT <- (RA) + (RB) binary [0:8] 00000111 [8:16] 11111111 [16:24] 11111111 [24:32] 11111111 [32:40] 01111100 [40:48] 00000001 [48:56] 00010010 [56:64] 00010101 opcodes 011111---------------01000010101 RT (vector) 0000011 38, 39, 40, 41, 42, 19, 20 extra3[0] RA (vector) 0000111 43, 44, 45, 46, 47, 22, 23 extra3[1] RB (vector) 0001011 48, 49, 50, 51, 52, 25, 26 extra3[2] OE 0 53 Rc 1 63 RM normal: Rc=1: pred-result CR sel RM 111111111111111111111111 6, 8, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31 RM.mmode 1 6 RM.mask 111 8, 10, 11 RM.elwidth 11 12, 13 RM.ewsrc 11 14, 15 RM.subvl 11 16, 17 RM.mode 11111 27, 28, 29, 30, 31 RM.smask 111 24, 25, 26 RM.extra 111111111 18, 19, 20, 21, 22, 23, 24, 25, 26 RM.extra2 111111111 18, 19, 20, 21, 22, 23, 24, 25, 26 RM.extra2.idx0 11 18, 19 RM.extra2.idx1 11 20, 21 RM.extra2.idx2 11 22, 23 RM.extra2.idx3 11 24, 25 RM.extra3 111111111 18, 19, 20, 21, 22, 23, 24, 25, 26 RM.extra3.idx0 111 18, 19, 20 RM.extra3.idx1 111 21, 22, 23 RM.extra3.idx2 111 24, 25, 26 RM.inv 1 10 RM.CR 11 11, 12
OK we have the first specifiers: vec2, vec3, vec4. ff ff ff 07 sv.add./vec4 *r3,*r7,*r11 15 12 01 7c
(In reply to Dmitry Selyutin from comment #58) > OK we have the first specifiers: vec2, vec3, vec4. > > ff ff ff 07 sv.add./vec4 *r3,*r7,*r11 > 15 12 01 7c added unit test test_10_vec https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;h=2fbdc7f20190afaa98a504708ae2114107f765e4
OK we only have BRANCH left. The recent changes in dis branch consider common BRANCH bits. It works with the disassembly and tests I've written, and to my understanding matches the tables. However, I cannot make `src/openpower/decoder/isa/test_caller_svp64_bc.py` work. Let's consider `sv.bc/all 12,*1,0xc` instruction. Here's what I get: 00 24 48 05 sv.bc/all 12,*1,0xc 0c 00 81 41 spec sv.bc BO,BI,target_addr AA=0 LK=0 pcode if (mode_is_64bit) then M <- 0 else M <- 32 if ¬BO[2] then CTR <- CTR - 1 ctr_ok <- BO[2] | ((CTR[M:63] != 0) ^ BO[3]) cond_ok <- BO[0] | ¬(CR[BI+32] ^ BO[1]) if ctr_ok & cond_ok then if AA then NIA <-iea EXTS(BD || 0b00) else NIA <-iea CIA + EXTS(BD || 0b00) if LK then LR <-iea CIA + 4 binary [0:8] 00000101 [8:16] 01001000 [16:24] 00100100 [24:32] 00000000 [32:40] 01000001 [40:48] 10000001 [48:56] 00000000 [56:64] 00001100 opcodes 010000------------------------00 BO 01100 38, 39, 40, 41, 42 BI (vector) 000000001 43, 44, 45, 46, 47, 22, 23, {0}, {0}, 43, 44, 45, 46, 47 extra3[1] target_addr = EXTS(BD || 0b00)) BD 0000000000001100 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, {0}, {0} AA 0 62 LK 0 63 RM branch: simple mode RM 000010000010010000000000 6, 8, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31 RM.mmode 0 6 RM.mask 000 8, 10, 11 RM.elwidth 10 12, 13 RM.ewsrc 00 14, 15 RM.subvl 00 16, 17 RM.mode 00000 27, 28, 29, 30, 31 RM.smask 000 24, 25, 26 RM.extra 100100000 18, 19, 20, 21, 22, 23, 24, 25, 26 RM.extra2 100100000 18, 19, 20, 21, 22, 23, 24, 25, 26 RM.extra2.idx0 10 18, 19 RM.extra2.idx1 01 20, 21 RM.extra2.idx2 00 22, 23 RM.extra2.idx3 00 24, 25 RM.extra3 100100000 18, 19, 20, 21, 22, 23, 24, 25, 26 RM.extra3.idx0 100 18, 19, 20 RM.extra3.idx1 100 21, 22, 23 RM.extra3.idx2 000 24, 25, 26 RM.ALL 1 12 RM.SNZ 0 13 RM.SL 0 25 RM.SLu 0 26 RM.LRu 0 30 RM.sz 0 31 RM.CTR 0 27 RM.VLS 0 28 This produces RM 0b000010000010010000000000. However, the original code did this: - sv_mode = ((bc_svstep << SVP64MODE.MOD2_MSB) | - (bc_vlset << SVP64MODE.MOD2_LSB) | - (bc_snz << SVP64MODE.BC_SNZ)) - srcwid = (bc_vsb << 1) | bc_lru - destwid = (bc_lru << 1) | bc_all This doesn't look like the correct thing (or at least it doesn't matches the table, where LRu goes to bit 22 and ALL goes to bit 4; same with VSb and (again??) LRu. For now it looks that the original code was incorrect, but I'd like to have a confirmation on this guess.
(In reply to Dmitry Selyutin from comment #60)
> This produces RM 0b000010000010010000000000. However, the original code did
> this:
>
> - sv_mode = ((bc_svstep << SVP64MODE.MOD2_MSB) |
> - (bc_vlset << SVP64MODE.MOD2_LSB) |
> - (bc_snz << SVP64MODE.BC_SNZ))
> - srcwid = (bc_vsb << 1) | bc_lru
> - destwid = (bc_lru << 1) | bc_all
>
> This doesn't look like the correct thing (or at least it doesn't matches the
> table, where LRu goes to bit 22 and ALL goes to bit 4; same with VSb and
> (again??) LRu. For now it looks that the original code was incorrect, but
> I'd like to have a confirmation on this guess.

chances are high that if it worked before and you made changes that did not match what worked because of assumptions "it must be wrong therefore it must be changed", that the assumptions are wrong.

that, or the only thing ever tested was the "all" mode (which it is, in the unit test).

consts.py. 19-23 = 0-4 in SVP64MODEb.

SVP64MODE.MOD2_MSB=0 ==> 19 => CTR mode (formerly bc_svstep)
SVP64MODE.MOD2_LSB=1 ==> 20 => VLset (correct)
SVP64MODE.BC_SNZ=3 ==> 22 => LRu not SNZ - never tested but hey
srcwid => 6/7 => CTi/VSB not VSB/LRu - never tested
destwid -> 4/5 => LRU/ALL not SNZ/all - probably wrong way.

so yes it's just "mostly borked" rather than "totally borked".

look in power_svp64_rm.py you will probably find i have the bit-order on these the wrong way round:

    # Counter-Test Mode.
    with m.If(mode[SVP64MODE.BC_CTRTEST]):
        with m.If(self.rm_in.ewsrc[0]):
            comb += self.bc_ctrtest.eq(SVP64BCCTRMode.TEST_INV)
        with m.Else():
            comb += self.bc_ctrtest.eq(SVP64BCCTRMode.TEST)
    # BC Mode ALL or ANY (Great-Big-AND-gate or Great-Big-OR-gate)
    comb += self.bc_gate.eq(self.rm_in.elwidth[0])
    # Link-Register Update
    comb += self.bc_lru.eq(self.rm_in.elwidth[1])
    comb += self.bc_vsb.eq(self.rm_in.ewsrc[1])

swap these to:

    # Counter-Test Mode.
    with m.If(mode[SVP64MODE.BC_CTRTEST]):
        with m.If(self.rm_in.ewsrc[1]):
            comb += self.bc_ctrtest.eq(SVP64BCCTRMode.TEST_INV)
        with m.Else():
            comb += self.bc_ctrtest.eq(SVP64BCCTRMode.TEST)
    # BC Mode ALL or ANY (Great-Big-AND-gate or Great-Big-OR-gate)
    comb += self.bc_gate.eq(self.rm_in.elwidth[1])
    # Link-Register Update
    comb += self.bc_lru.eq(self.rm_in.elwidth[0])
    comb += self.bc_vsb.eq(self.rm_in.ewsrc[0])

and you'll likely find it "works"
(In reply to Luke Kenneth Casson Leighton from comment #61) > the assumptions are wrong. This is not some "assumption". I see mismatches between the tables and the code. You look at the code and make an "assumption" this is correct, but it contradicts the spec. https://libre-soc.org/openpower/sv/branches/ > that, or the only thing ever tested was the "all" mode (which it is, > in the unit test). And the position of it is wrong. > destwid -> 4/5 => LRU/ALL not SNZ/all - probably wrong way. ALL/SNZ in spec. In this exact order. > so yes it's just "mostly borked" rather than "totally borked". Mostly or totally depends on the spec. I consider the spec, since it was changed recently along with power_insn. > look in power_svp64_rm.py you will probably find i have the bit-order > on these the wrong way round: Again: the "wrong" way depends on which resource we consider first, the code or the spec. If I recall correctly, spec edits were recent. > and you'll likely find it "works" Either it works (without quotes), or the spec is "correct". And the point of that long post was to find out which of these should be considered first.
(In reply to Dmitry Selyutin from comment #62) > Mostly or totally depends on the spec. I consider the spec, since it was > changed recently not RM.branches. only crops and ldst. spec correct. v. late. brainmelt. do power_svp64_rm.py swap.
I adapted the code to the spec, cf. the dis branch. But it's up to _you_ to make the decision on what's correct, the spec or the code. Otherwise I'm bound to "assumptions".

https://git.libre-soc.org/?p=openpower-isa.git;a=commit;h=d7c072fd0c1b1cd1db12e617c3c4e7c3243eb220
https://git.libre-soc.org/?p=openpower-isa.git;a=commit;h=fd84e2b00f65a16518b45023c6b6a8693599d8fe
(In reply to Luke Kenneth Casson Leighton from comment #63) > v. late. brainmelt. Same here. > do power_svp64_rm.py swap. Done.
All branch modes are completed, including complex stuff like the following:

sv.bc/vs/all/snz/sl/slu/lru 12,*1,0xc
sv.bc/vsi/all/snz/sl/slu/lru 12,*1,0xc
sv.bc/vsb/all/snz/sl/slu/lru 12,*1,0xc
sv.bc/vsbi/all/snz/sl/slu/lru 12,*1,0xc
sv.bc/ctr/all/snz/sl/slu/lru 12,*1,0xc
sv.bc/cti/all/snz/sl/slu/lru 12,*1,0xc
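Context for readers: the /xxx pieces are specifiers parsed off the mnemonic before the operands are handled; conceptually something like the following (a simplified sketch, not the actual pysvp64asm parser):

    def split_mnemonic(mnemonic):
        # 'sv.bc/all/snz' -> ('sv.bc', ['all', 'snz'])
        opcode, *specifiers = mnemonic.split("/")
        return opcode, specifiers

    assert split_mnemonic("sv.bc/ctr/all/snz/sl/slu/lru 12,*1,0xc".split()[0]) == \
        ("sv.bc", ["ctr", "all", "snz", "sl", "slu", "lru"])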
BTW see how nice this BranchCTRVLSRM class is:

class BranchVLSRM(BranchBaseRM):
    """branch: VLSET mode"""
    VSb: BaseRM[7]
    VLI: BaseRM[21]

class BranchCTRRM(BranchBaseRM):
    """branch: CTR-test mode"""
    CTi: BaseRM[6]

class BranchCTRVLSRM(BranchVLSRM, BranchCTRRM):
    """branch: CTR-test+VLSET mode"""
    pass

I've omitted the specifiers section, but these are inherited too. So we really have CTR-test+VLSET, even in code. It inherits first from CTR-test RM, then from VLS RM. To me this new hierarchy looks really cool; it literally matches the spec.
Ah yeah one note on these patches. In pysvp64asm, there are some sections with `if not is_bc`. Eventually these should go down the drain, as well as hacking around srcwid, dstwid et al., and be replaced by RM-specific fields. I started with branches, because I had to change this code to support all modes, but all modes should eventually be switched to this fields-based method, since it closely follows the spec, setting the individual fields. This is exactly how I want to do it in binutils, both assembly and disassembly.
(In reply to Dmitry Selyutin from comment #67)
> BTW see how nice this BranchCTRVLSRM class is:
>
> class BranchVLSRM(BranchBaseRM):
>     """branch: VLSET mode"""
>     VSb: BaseRM[7]
>     VLI: BaseRM[21]
>
> class BranchCTRRM(BranchBaseRM):
>     """branch: CTR-test mode"""
>     CTi: BaseRM[6]
>
> class BranchCTRVLSRM(BranchVLSRM, BranchCTRRM):
>     """branch: CTR-test+VLSET mode"""
>     pass

love it. "duh" level of simplicity.

(In reply to Dmitry Selyutin from comment #66)
> All branch modes are completed, including complex stuff like the following:
>
> sv.bc/vs/all/snz/sl/slu/lru 12,*1,0xc
> sv.bc/vsi/all/snz/sl/slu/lru 12,*1,0xc
> sv.bc/vsb/all/snz/sl/slu/lru 12,*1,0xc
> sv.bc/vsbi/all/snz/sl/slu/lru 12,*1,0xc
> sv.bc/ctr/all/snz/sl/slu/lru 12,*1,0xc
> sv.bc/cti/all/snz/sl/slu/lru 12,*1,0xc

almost-unavoidably-scarily-long! :) you can - ha ha - also have vs/ctr vsbi/cti etc. etc.

should be able to get it down on "sl/slu" by combining those into a 2-3-letter acronym: sl,slu,SLu or something.

also, /sz is another one to combine: /snz also means "/sz"

* /sz = bit 23=1, bit 5=0
* /snz = bit 23+5=1
* ILLEGAL bit 23=0, bit5=1
* {nothing} bit 23=0, bit5=0

(In reply to Dmitry Selyutin from comment #68)
> Ah yeah one note on these patches. In pysvp64asm, there are some sections
> with `if not is_bc`. Eventually these should go down the drain,

goooood.
+ elif encmode == 'ctr':
+     svp64_rm.branch.CTR = 1
+     svp64_rm.branch.VLS = 0    <--- needs taking out
+     svp64_rm.branch.ctr.CTi = 1
+ elif encmode == 'cti':
+     svp64_rm.branch.CTR = 1
+     svp64_rm.branch.ctr.CTi = 1

CTR and VLset mode can be combined!

- elif encmode == 'snz': # sz (only) already set above
-     src_zero = 1
-     bc_snz = 1
+ elif encmode == 'snz':
+     svp64_rm.branch.SNZ = 1

notice how src_zero=1 was removed? that's a bug. src_zero=1 *must* be enabled if snz is requested. this makes disasm "/sz/snz" redundant: it's either "/sz" or "/snz" or neither.
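Putting the /sz-vs-/snz rule from the last two comments into code form (a sketch using the RM.sz / RM.SNZ field names from the dumps; the dict standing in for the RM object is made up):

    def encode_sz_snz(specifiers, rm):
        # /snz implies sz as well; SNZ=1 with sz=0 is the ILLEGAL combination.
        if "snz" in specifiers:
            rm["sz"] = 1
            rm["SNZ"] = 1
        elif "sz" in specifiers:
            rm["sz"] = 1
        return rm

    assert encode_sz_snz(["all", "snz"], {"sz": 0, "SNZ": 0}) == {"sz": 1, "SNZ": 1}
    assert encode_sz_snz(["sz"], {"sz": 0, "SNZ": 0}) == {"sz": 1, "SNZ": 0}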
this is the pseudocode for sv.bc: lr_ok <- LK svlr_ok <- SVRMmode.SL if ctr_ok & cond_ok then if SVRMmode.LRu then lr_ok <- ¬lr_ok if SVRMmode.SLu then svlr_ok <- ¬svlr_ok if lr_ok then LR <-iea CIA + 4 if svlr_ok then SVLR <- SVSTATE that means *four* permutations *each* for: * LK=0/1 and LRu=0/1 * SL=0/1 and SLu=0/1 /lru we can do nothing about, it combines with the mnemonic name, "bc" or "bcl". in theory it would be possible to chuck in the letter "u" on that. "sv.bcu", "sv.bclu". svlr could be done similarly but keeping it to 3 letters max. "/sl" and "/slu" and "/sll" or "/s" "/slu "/sl" or "/sl" "/slu" "sli" where this last is for SL=0,SLu=1 /s is fine because it will not be used elsewhere.
I don't think modifying the names of instructions is a good idea, unless there are unprefixed instructions which match these names. If you have sv.blu, I'd expect blu. And, well, I don't really think we should introduce something like this at all; I don't even like these permutations on VSb/VLI and would rather choose explicit /vsb/vli.
(In reply to Dmitry Selyutin from comment #72) > I don't think modifying the names of instruction is a good idea, unless > there're unprefixed instructions which match these names. If you have > sv.blu, I'd expect blu. true, and there is a hard rule about not doing exactly that. best to stick with it. > And, well, I don't really think we should introduce > something like this at all, I don't even like these permutations on VSb/VLI > and would rather choose explicit /vsb/vli. hum ok. it's going to be mad-length instructions but then again that is normal for 3D GPU ISAs.
(In reply to Luke Kenneth Casson Leighton from comment #70) > CTR and VLset mode can be combined! ah, just saw the followup patch correcting that. just leaves src_zero (and catching /sz/snz in pysvp64dis)
Ah wait I missed sz/snz. Will update.
I've just pushed the changes for /sz and /snz.
3037 echo 'sv.subf/ff=eq 0,0,0' | pysvp64asm > sv.subf.ffirst.tst.s 3038 echo 'sv.subf./ff=eq 0,0,0' | pysvp64asm > sv.subf.ffirst.tst.s 3039 vi sv.subf.ffirst.tst.s 3040 powerpc64le-linux-gnu-as sv.subf.ffirst.tst.s 3041 powerpc64le-linux-gnu-objdump -D ./a.out 3042 echo -n -e '\x0c\x00\x40\x05\x51\x00\x00\x7c' | pysvp64dis -v 0c 00 40 05 sv.subf./ff=eq r0,r0,r0 51 00 00 7c spec sv.subf. RT,RA,RB OE=0 Rc=1 RM normal: Rc=1: ffirst CR sel RM 000000000000000000001100 RM.mode 01100 27, 28, 29, 30, 31 RM.inv 1 29 RM.CR 00 30, 31
cat /tmp/test.s sv.cmp/ff=RC1 *0,1,*4,3 cat /tmp/test.s && \ SILENCELOG=true pysvp64asm /tmp/test.s /tmp/test.py.s && \ powerpc64le-linux-gnu-as /tmp/test.py.s -o /tmp/test.o && \ powerpc64le-linux-gnu-objcopy -Obinary /tmp/test.o /tmp/bin.o && \ pysvp64dis -v /tmp/bin.o sv.cmp/ff=RC1 *0,1,*4,3 09 24 40 05 sv.cmp *0,1,*r4,r3 00 18 21 7c spec sv.cmp BF,L,RA,RB pcode if L = 0 then a <- EXTS((RA)[XLEN/2:XLEN-1]) b <- EXTS((RB)[XLEN/2:XLEN-1]) else a <- (RA) b <- (RB) if a < b then c <- 0b100 else if a > b then c <- 0b010 else c <- 0b001 CR[4*BF+32:4*BF+35] <- c || XER[SO] binary [0:8] 00000101 [8:16] 01000000 [16:24] 00100100 [24:32] 00001001 [32:40] 01111100 [40:48] 00100001 [48:56] 00011000 [56:64] 00000000 opcodes 011111---------------0000000000- BF (vector) 00000 38, 39, 40, 19, 20, {0}, {0} extra3[0] L 1 42 RA (vector) 0000100 43, 44, 45, 46, 47, 22, 23 extra3[1] RB (scalar) 00011 48, 49, 50, 51, 52 extra3[2] RM None RM 000000000010010000001001 6, 8, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31 RM.mmode 0 6 RM.mask 000 8, 10, 11 RM.elwidth 00 12, 13 RM.ewsrc 00 14, 15 RM.subvl 00 16, 17 RM.mode 01001 27, 28, 29, 30, 31 RM.smask 000 24, 25, 26 RM.extra 100100000 18, 19, 20, 21, 22, 23, 24, 25, 26 RM.extra2 100100000 18, 19, 20, 21, 22, 23, 24, 25, 26 RM.extra2.idx0 10 18, 19 RM.extra2.idx1 01 20, 21 RM.extra2.idx2 00 22, 23 RM.extra2.idx3 00 24, 25 RM.extra3 100100000 18, 19, 20, 21, 22, 23, 24, 25, 26 RM.extra3.idx0 100 18, 19, 20 RM.extra3.idx1 100 21, 22, 23 RM.extra3.idx2 000 24, 25, 26 RM.SNZ 0 15 RM.simple 000000000010010000001001 6, 8, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31 RM.simple.mmode 0 6 RM.simple.mask 000 8, 10, 11 RM.simple.elwidth 00 12, 13 RM.simple.ewsrc 00 14, 15 RM.simple.subvl 00 16, 17 RM.simple.mode 01001 27, 28, 29, 30, 31 RM.simple.smask 000 24, 25, 26 RM.simple.extra 100100000 18, 19, 20, 21, 22, 23, 24, 25, 26 RM.simple.extra2 100100000 18, 19, 20, 21, 22, 23, 24, 25, 26 RM.simple.extra2.idx0 10 18, 19 RM.simple.extra2.idx1 01 20, 21 RM.simple.extra2.idx2 00 22, 23 RM.simple.extra2.idx3 00 24, 25 RM.simple.extra3 100100000 18, 19, 20, 21, 22, 23, 24, 25, 26 RM.simple.extra3.idx0 100 18, 19, 20 RM.simple.extra3.idx1 100 21, 22, 23 RM.simple.extra3.idx2 000 24, 25, 26 RM.simple.SNZ 0 15 RM.simple.RG 1 28 RM.simple.dz 0 30 RM.simple.sz 1 31 RM.mr 000000000010010000001001 6, 8, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31 RM.mr.mmode 0 6 RM.mr.mask 000 8, 10, 11 RM.mr.elwidth 00 12, 13 RM.mr.ewsrc 00 14, 15 RM.mr.subvl 00 16, 17 RM.mr.mode 01001 27, 28, 29, 30, 31 RM.mr.smask 000 24, 25, 26 RM.mr.extra 100100000 18, 19, 20, 21, 22, 23, 24, 25, 26 RM.mr.extra2 100100000 18, 19, 20, 21, 22, 23, 24, 25, 26 RM.mr.extra2.idx0 10 18, 19 RM.mr.extra2.idx1 01 20, 21 RM.mr.extra2.idx2 00 22, 23 RM.mr.extra2.idx3 00 24, 25 RM.mr.extra3 100100000 18, 19, 20, 21, 22, 23, 24, 25, 26 RM.mr.extra3.idx0 100 18, 19, 20 RM.mr.extra3.idx1 100 21, 22, 23 RM.mr.extra3.idx2 000 24, 25, 26 RM.mr.SNZ 0 15 RM.mr.RG 1 28 RM.mr.dz 0 30 RM.mr.sz 1 31 RM.ff3 000000000010010000001001 6, 8, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31 RM.ff3.mmode 0 6 RM.ff3.mask 000 8, 10, 11 RM.ff3.elwidth 00 12, 13 RM.ff3.ewsrc 00 14, 15 RM.ff3.subvl 00 16, 17 RM.ff3.mode 01001 27, 28, 29, 30, 31 RM.ff3.smask 000 24, 25, 26 RM.ff3.extra 100100000 18, 19, 20, 21, 22, 23, 24, 25, 26 RM.ff3.extra2 100100000 18, 19, 20, 21, 22, 23, 24, 25, 26 
RM.ff3.extra2.idx0 10 18, 19 RM.ff3.extra2.idx1 01 20, 21 RM.ff3.extra2.idx2 00 22, 23 RM.ff3.extra2.idx3 00 24, 25 RM.ff3.extra3 100100000 18, 19, 20, 21, 22, 23, 24, 25, 26 RM.ff3.extra3.idx0 100 18, 19, 20 RM.ff3.extra3.idx1 100 21, 22, 23 RM.ff3.extra3.idx2 000 24, 25, 26 RM.ff3.SNZ 0 15 RM.ff3.VLi 1 28 RM.ff3.inv 0 29 RM.ff3.CR 01 30, 31 RM.ff3.zz 0 14 RM.ff3.sz 0 14 RM.ff3.dz 0 14 RM.ff5 000000000010010000001001 6, 8, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31 RM.ff5.mmode 0 6 RM.ff5.mask 000 8, 10, 11 RM.ff5.elwidth 00 12, 13 RM.ff5.ewsrc 00 14, 15 RM.ff5.subvl 00 16, 17 RM.ff5.mode 01001 27, 28, 29, 30, 31 RM.ff5.smask 000 24, 25, 26 RM.ff5.extra 100100000 18, 19, 20, 21, 22, 23, 24, 25, 26 RM.ff5.extra2 100100000 18, 19, 20, 21, 22, 23, 24, 25, 26 RM.ff5.extra2.idx0 10 18, 19 RM.ff5.extra2.idx1 01 20, 21 RM.ff5.extra2.idx2 00 22, 23 RM.ff5.extra2.idx3 00 24, 25 RM.ff5.extra3 100100000 18, 19, 20, 21, 22, 23, 24, 25, 26 RM.ff5.extra3.idx0 100 18, 19, 20 RM.ff5.extra3.idx1 100 21, 22, 23 RM.ff5.extra3.idx2 000 24, 25, 26 RM.ff5.SNZ 0 15 RM.ff5.VLi 1 28 RM.ff5.inv 0 29 RM.ff5.dz 0 30 RM.ff5.sz 1 31