The reference Python-based disassembler is required.
Not much progress today: most of the time was spent hunting the opcode bug (revealed upon power_table usage). Refactored some portions; also, SVP64 assembly can now provide information about the mode used.
I wanted to start 911, but hey, I got carried away by the world of disassembly, and started checking into extra stuff (no pun intended, I'm really speaking of EXTRA). I've augmented DynamicOperandGPR and DynamicOperandFPR with information on these. 40 0a 40 05 sv.add 14 6a e2 7e spec sv.add RT,RA,RB (OE=0 Rc=0) binary [0:8] 00000101 [8:16] 01000000 [16:24] 00001010 [24:32] 01000000 [32:40] 01111110 [40:48] 11100010 [48:56] 01101010 [56:64] 00010100 opcode 0x7c000214 mask 0xfc0007ff RT 01010 [6, 7, 8, 9, 10] EXTRA[0] RA 00000 [11, 12, 13, 14, 15] EXTRA[1] RB 00001 [16, 17, 18, 19, 20] EXTRA[2] OE 0 [21] Rc 0 [31] mode normal: simple
(In reply to Dmitry Selyutin from comment #2) > I wanted to start 911, but hey, I got carried by the world of disassembly, > and started checking into extra stuff (no pun intended, I'm really speaking > of EXTRA). I've augmented DynamicOperandGPR and DynamicOperandFPR with > information on these. ah brilliant because the next step to add the extra bits should be straightforward. > RT > 01010 > [6, 7, 8, 9, 10] > EXTRA[0] suggest putting EXTRA3, (see Etype) so EXTRA3[0] is RM.EXTRA3[0] bits 0-2 > RA > 00000 > [11, 12, 13, 14, 15] > EXTRA[1] RM.EXTRA3[1] is RM.EXTRA bits 3-5 > RB > 00001 > [16, 17, 18, 19, 20] > EXTRA[2] and this is RM.EXTRA3[2] which is RM.EXTRA bits 6-8 from there checking bit 0 of each tells you "scalar or vector". then appending the extra bits should be easy enough to make the right regnum.
(In reply to Luke Kenneth Casson Leighton from comment #3)
> suggest putting EXTRA3, (see Etype) so EXTRA3[0] is RM.EXTRA3[0] bits 0-2

Ah yeah, awesome idea! I thought about putting it on a separate line, but your suggestion is much better; it expresses the intent in a much cleaner way. I'll do it now.

> from there checking bit 0 of each tells you "scalar or vector".
> then appending the extra bits should be easy enough to make the
> right regnum.

I really thought that "if self.vector: prepend_asterisk(name)" is clear and obvious. So yeah, this is largely the reason why this was placed in the operand classes. :-)
OK, I started playing around SVP64 EXTRA2/EXTRA3 concepts. Here's what we have for now. Note the updated ranges for operands, the disassembly itself (first lines), and extra modes printed. Yes I remember about target_addr. I just don't have time to do everything at once. 00 00 40 05 sv.bc 2,9,0x8 08 00 49 40 spec sv.bc BO,BI,BD (AA=0 LK=0) binary [0:8] 00000101 [8:16] 01000000 [16:24] 00000000 [24:32] 00000000 [32:40] 01000000 [40:48] 01001001 [48:56] 00000000 [56:64] 00001000 opcode 0x40000000 mask 0xfc000003 BO 00010 (38, 39, 40, 41, 42) BI 01001 (43, 44, 45, 46, 47) BD 00000000000010 (48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61) target_addr = EXTS(BD || 0b00)) AA 0 (62,) LK 0 (63,) mode branch 40 18 40 05 sv.add r127,r31,r65 14 0a ff 7f spec sv.add RT,RA,RB (OE=0 Rc=0) binary [0:8] 00000101 [8:16] 01000000 [16:24] 00011000 [24:32] 01000000 [32:40] 01111111 [40:48] 11111111 [48:56] 00001010 [56:64] 00010100 opcode 0x7c000214 mask 0xfc0007ff RT 01111111 (38, 39, 40, 41, 42, 18, 19, 20) extra3[0] RA 11111 (43, 44, 45, 46, 47) extra3[1] RB 01000001 (48, 49, 50, 51, 52, 24, 25, 26) extra3[2] OE 0 (53,) Rc 0 (63,) mode normal: simple 00 00 40 05 sv.lwzu r3,16,r1 10 00 61 84 spec sv.lwzu RT,D(RA) binary [0:8] 00000101 [8:16] 01000000 [16:24] 00000000 [24:32] 00000000 [32:40] 10000100 [40:48] 01100001 [48:56] 00000000 [56:64] 00010000 opcode 0x84000000 mask 0xfc000000 RT 00011 (38, 39, 40, 41, 42) extra2[0] D 0000000000010000 (48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63) RA 00001 (43, 44, 45, 46, 47) extra2[2] mode ld/st imm: simple
OK, as I suspected, the recent changes broke stuff which relied on indexing extras. However, to my surprise, it was quite straightforward to fix it. I think we can try rebasing the branch. The next target is CRs.
static inline uint64_t
svp64_insn_get_prefix_rm_extra3_idx0(const struct svp64_insn *insn)
{
    uint64_t value = insn->value;

    return (
        (((value >> UINT64_C(43)) & UINT64_C(1)) << UINT64_C(0)) |
        (((value >> UINT64_C(44)) & UINT64_C(1)) << UINT64_C(1)) |
        (((value >> UINT64_C(45)) & UINT64_C(1)) << UINT64_C(2))
    );
}

static inline uint64_t
svp64_insn_get_prefix_rm_extra3_idx1(const struct svp64_insn *insn)
{
    uint64_t value = insn->value;

    return (
        (((value >> UINT64_C(40)) & UINT64_C(1)) << UINT64_C(0)) |
        (((value >> UINT64_C(41)) & UINT64_C(1)) << UINT64_C(1)) |
        (((value >> UINT64_C(42)) & UINT64_C(1)) << UINT64_C(2))
    );
}

Perfect. I guess we have an opportunity to introduce a nice table in binutils when we're done with disasm (and same for setters).
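For readers following along: the pattern these generated getters implement is simply "gather a handful of scattered bits, LSB-first". A throwaway Python equivalent of the same pattern (the helper name is made up, purely for illustration):

    def gather_bits(value, positions):
        # Pack the given bit positions of `value` LSB-first; "position" here is a
        # plain right-shift count, exactly as in the C getters above.
        result = 0
        for shift, position in enumerate(positions):
            result |= ((value >> position) & 1) << shift
        return result

    # extra3_idx0 = gather_bits(insn_value, (43, 44, 45))   # same as the first C getter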
(In reply to Dmitry Selyutin from comment #5) > OK, I started playing around SVP64 EXTRA2/EXTRA3 concepts. Here's what we > have for now. > Note the updated ranges for operands, the disassembly itself (first lines), > and extra modes printed. ack. > 40 18 40 05 sv.add r127,r31,r65 > 14 0a ff 7f > RB > 01000001 > (48, 49, 50, 51, 52, 24, 25, 26) > extra3[2] ah! yeah, that looks right. vector would be: [48,49,50,51,52,24,25] # (or maybe 25,24) scalar would be: [24,25,48,49,50,51,52,24,25] # (or maybe 25,24) > 00 00 40 05 sv.lwzu r3,16,r1 > 10 00 61 84 > spec > sv.lwzu RT,D(RA) > RA > 00001 > (43, 44, 45, 46, 47) > extra2[2] and as this is EXTRA2, it would be best as: RA {0}000001 [25,43,44,45,46,47] # or maybe 24 where vector would be: 0000001{0} [43,44,45,46,47,25] # or maybe 24
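To spell out the register-number rule being sketched here, a Python illustration of my reading of the EXTRA2/EXTRA3 GPR/FPR mapping, cross-checked against the worked examples later in this thread (function and variable names are made up; if it disagrees with the spec, trust the spec):

    def decode_gpr(field, extra, etype):
        # field: the 5-bit RT/RA/RB value from the 32-bit suffix
        # extra: the raw EXTRA2 (2-bit) or EXTRA3 (3-bit) value, MSB first,
        #        where the MSB is the scalar(0)/vector(1) flag
        if etype == "EXTRA3":
            vector, ext = bool(extra & 0b100), extra & 0b11
            regnum = ((field << 2) | ext) if vector else ((ext << 5) | field)
        else:  # EXTRA2
            vector, ext = bool(extra & 0b10), extra & 0b1
            regnum = ((field << 2) | (ext << 1)) if vector else ((ext << 5) | field)
        return ("*r%d" if vector else "r%d") % regnum

    assert decode_gpr(0b11111, 0b011, "EXTRA3") == "r127"   # sv.add r127,... below
    assert decode_gpr(0b11000, 0b101, "EXTRA3") == "*r97"   # sv.add *r97,... later on
    assert decode_gpr(0b01111, 0b11, "EXTRA2") == "*r62"    # sv.lwzu *r62 later on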
(In reply to Dmitry Selyutin from comment #7) > static inline uint64_t > svp64_insn_get_prefix_rm_extra3_idx0(const struct svp64_insn *insn) > { > uint64_t value = insn->value; > return ( > (((value >> UINT64_C(43)) & UINT64_C(1)) << UINT64_C(0)) | these look really easy/straightforward, i suspected they would be. bear in mind that their use in spec *might* not be in the same order, we established that with the spec_aug variable from SVP64RMExtra. > Perfect. I guess we have an opportunity to introduce a nice table in > binutils when we're done with disasm (and same for setters). huzzah.
I've played a bit with the representation, here's what we have now: 00 00 40 05 sv.bc 2,9, 08 00 49 40 spec sv.bc BO,BI,target_addr (AA=0 LK=0) binary [0:8] 00000101 [8:16] 01000000 [16:24] 00000000 [24:32] 00000000 [32:40] 01000000 [40:48] 01001001 [48:56] 00000000 [56:64] 00001000 opcode 0x40000000 mask 0xfc000003 BO 00010 38, 39, 40, 41, 42 BI 01001 43, 44, 45, 46, 47 target_addr 00000000000010{00} 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61 target_addr = EXTS(BD || 0b00)) AA 0 62 LK 0 63 mode branch 40 18 40 05 sv.add r127,r31,r65 14 0a ff 7f spec sv.add RT,RA,RB (OE=0 Rc=0) binary [0:8] 00000101 [8:16] 01000000 [16:24] 00011000 [24:32] 01000000 [32:40] 01111111 [40:48] 11111111 [48:56] 00001010 [56:64] 00010100 opcode 0x7c000214 mask 0xfc0007ff RT 01111111 38, 39, 40, 41, 42, 18, 19, 20 extra3[0] RA 11111 43, 44, 45, 46, 47 extra3[1] RB 01000001 48, 49, 50, 51, 52, 24, 25, 26 extra3[2] OE 0 53 Rc 0 63 mode normal: simple 14 02 ef 7f add r31,r15,r0 spec add RT,RA,RB (OE=0 Rc=0) binary [0:8] 01111111 [8:16] 11101111 [16:24] 00000010 [24:32] 00010100 opcode 0x7c000214 mask 0xfc0007ff RT 11111 6, 7, 8, 9, 10 RA 01111 11, 12, 13, 14, 15 RB 00000 16, 17, 18, 19, 20 OE 0 21 Rc 0 31 00 15 40 05 sv.lwzu r30,16,r15 10 00 cf 87 spec sv.lwzu RT,D(RA) binary [0:8] 00000101 [8:16] 01000000 [16:24] 00010101 [24:32] 00000000 [32:40] 10000111 [40:48] 11001111 [48:56] 00000000 [56:64] 00010000 opcode 0x84000000 mask 0xfc000000 RT 011110{0} 38, 39, 40, 41, 42, 18 extra2[0] D 0000000000010000 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63 RA 001111{0} 43, 44, 45, 46, 47, 22 extra2[2] mode ld/st imm: simple
Oops, one of the operands in sv.bc got lost... :-)
(In reply to Dmitry Selyutin from comment #11)
> Oops, one of the operands in sv.bc got lost... :-)

Damn, I used return instead of yield. Should be fine now: sv.bc 2,9,0x8
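Side note for anyone hitting the same symptom: a stray return inside a generator silently ends the iteration rather than producing a value, which is exactly how an operand goes missing. A tiny illustration (not the actual operand-walking code):

    def operands_broken():
        yield "BO"
        yield "BI"
        return "target_addr"    # ends the generator; the value is discarded

    def operands_fixed():
        yield "BO"
        yield "BI"
        yield "target_addr"

    assert list(operands_broken()) == ["BO", "BI"]                    # third operand lost
    assert list(operands_fixed()) == ["BO", "BI", "target_addr"]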
(In reply to Dmitry Selyutin from comment #10) > I've played a bit with the representation, here's what we have now: > > 00 00 40 05 sv.bc 2,9, wark-wark :) > 08 00 49 40 > spec > sv.bc BO,BI,target_addr (AA=0 LK=0) niiice > BI > 01001 > 43, 44, 45, 46, 47 > target_addr > 00000000000010{00} > 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61 > target_addr = EXTS(BD || 0b00)) nice. BD seems missing entirely but hey > 40 18 40 05 sv.add r127,r31,r65 > 14 0a ff 7f > spec > sv.add RT,RA,RB (OE=0 Rc=0) > RT > 01111111 > 38, 39, 40, 41, 42, 18, 19, 20 > extra3[0] this i assume is numbering based on 64-bit MSB0. it must be. * 38-42 minus 32 (to get them down to 32-bit MSB0) is 6-10 ah ha! that matches with RT in a 32-bit instruction, so yes. it also looks like it's vector numbering rather than scalar numbering because 18,19,20 are at the end (in LSB positions) if it's scalar it should be [19,20, 38, 39, 40, 41, 42] # or maybe 20,19,38..42 > RA > 11111 > 43, 44, 45, 46, 47 > extra3[1] this one is missing the 7-bit extension. no GPR/FPR/CR operands should miss EXTRA extension > 14 02 ef 7f add r31,r15,r0 > spec > add RT,RA,RB (OE=0 Rc=0) > RT > 11111 > 6, 7, 8, 9, 10 yeah there we go. 6-10 (so 38-42 minus 32 was also 6-10) > 00 15 40 05 sv.lwzu r30,16,r15 > 10 00 cf 87 > spec > sv.lwzu RT,D(RA) > RT > 011110{0} > 38, 39, 40, 41, 42, 18 > extra2[0] if that was vector it's correct. if it is scalar it should be {0}011110 [18,38,39....42]
> 00 15 40 05 sv.lwzu r30,16,r15 > 10 00 cf 87 repro with echo -n -e '\x00\x15\x40\x05\x10\x00\xcf\x87' | pysvp64dis -v
(In reply to Luke Kenneth Casson Leighton from comment #13) > > BI > > 01001 > > 43, 44, 45, 46, 47 > > target_addr > > 00000000000010{00} > > 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61 > > target_addr = EXTS(BD || 0b00)) > > nice. BD seems missing entirely but hey Not quite entirely, cf. last line. :-) Also the range corresponds to BD and formula explains this mystic {00}. > this i assume is numbering based on 64-bit MSB0. it must be. > if it's scalar it should be > if that was vector it's correct. > if it is scalar it should be {0}011110 [18,38,39....42] Yeah this needs tuning then. Stay tuned. :-) > this one is missing the 7-bit extension. no GPR/FPR/CR > operands should miss EXTRA extension This is caused by ambiguous wording "If EXTRA3 is zero, maps to "scalar identity" (scalar Power ISA field naming)." (same for EXTRA2). So I printed these as they were in SVP64-less world. But I can add these bits, not a big deal.
(In reply to Dmitry Selyutin from comment #15) > This is caused by ambiguous wording "If EXTRA3 is zero, maps to "scalar > identity" (scalar Power ISA field naming)." (same for EXTRA2). So I printed > these as they were in SVP64-less world. But I can add these bits, not a big > deal. ahh interesting. technically-speaking, that's actually true. so yes, that would work out fine. had me confused for a minute.
(In reply to Dmitry Selyutin from comment #10) > 00 15 40 05 sv.lwzu r30,16,r15 > 10 00 cf 87 > spec > sv.lwzu RT,D(RA) no rush, i just noticed that should be: 00 15 40 05 sv.lwzu r30,16(r15)
OK yet another take on these, never surrender! Note that I refactored the representation and {0} appears for the fields instead, this looks much clearer and explains what happens better (especially with extras). 00 00 40 05 sv.bc 2,9,0x8 08 00 49 40 spec sv.bc BO,BI,target_addr (AA=0 LK=0) binary [0:8] 00000101 [8:16] 01000000 [16:24] 00000000 [24:32] 00000000 [32:40] 01000000 [40:48] 01001001 [48:56] 00000000 [56:64] 00001000 opcode 0x40000000 mask 0xfc000003 BO 00010 38, 39, 40, 41, 42 BI 01001 43, 44, 45, 46, 47 target_addr 0000000000001000 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, {0}, {0} target_addr = EXTS(BD || 0b00)) AA 0 62 LK 0 63 mode branch 40 18 40 05 sv.add r127,r31,r65 14 0a ff 7f spec sv.add RT,RA,RB (OE=0 Rc=0) binary [0:8] 00000101 [8:16] 01000000 [16:24] 00011000 [24:32] 01000000 [32:40] 01111111 [40:48] 11111111 [48:56] 00001010 [56:64] 00010100 opcode 0x7c000214 mask 0xfc0007ff RT 1111111 19, 20, 38, 39, 40, 41, 42 extra3[0] RA 11111 43, 44, 45, 46, 47 extra3[1] RB 1000001 25, 26, 48, 49, 50, 51, 52 extra3[2] OE 0 53 Rc 0 63 mode normal: simple 14 02 ef 7f add r31,r15,r0 spec add RT,RA,RB (OE=0 Rc=0) binary [0:8] 01111111 [8:16] 11101111 [16:24] 00000010 [24:32] 00010100 opcode 0x7c000214 mask 0xfc0007ff RT 11111 6, 7, 8, 9, 10 RA 01111 11, 12, 13, 14, 15 RB 00000 16, 17, 18, 19, 20 OE 0 21 Rc 0 31 00 15 40 05 sv.lwzu r62,16,r63 10 00 df 87 spec sv.lwzu RT,D(RA) binary [0:8] 00000101 [8:16] 01000000 [16:24] 00010101 [24:32] 00000000 [32:40] 10000111 [40:48] 11011111 [48:56] 00000000 [56:64] 00010000 opcode 0x84000000 mask 0xfc000000 RT 0111110 19, {0}, 38, 39, 40, 41, 42 extra2[0] D 0000000000010000 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63 RA 0111111 23, {0}, 43, 44, 45, 46, 47 extra2[2] mode ld/st imm: simple 00 25 40 05 sv.lwzu *r0,16,r63 10 00 1f 84 spec sv.lwzu RT,D(RA) binary [0:8] 00000101 [8:16] 01000000 [16:24] 00100101 [24:32] 00000000 [32:40] 10000100 [40:48] 00011111 [48:56] 00000000 [56:64] 00010000 opcode 0x84000000 mask 0xfc000000 RT 0000000 38, 39, 40, 41, 42, 19, {0} extra2[0] D 0000000000010000 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63 RA 0111111 23, {0}, 43, 44, 45, 46, 47 extra2[2] mode ld/st imm: simple 00 35 40 05 sv.lwzu *r2,16,r63 10 00 1f 84 spec sv.lwzu RT,D(RA) binary [0:8] 00000101 [8:16] 01000000 [16:24] 00110101 [24:32] 00000000 [32:40] 10000100 [40:48] 00011111 [48:56] 00000000 [56:64] 00010000 opcode 0x84000000 mask 0xfc000000 RT 0000010 38, 39, 40, 41, 42, 19, {0} extra2[0] D 0000000000010000 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63 RA 0111111 23, {0}, 43, 44, 45, 46, 47 extra2[2] mode ld/st imm: simple There's one quirk still: `sv.lwzu *r2,16,r63` should have rather been `sv.lwzu *r2,16(r63)`. I'll update it.
Sigh, spans are still wrong wrt vectors. One more iteration.
confirmed vector EXTRA2 works. repro: $ echo "sv.lwzu *62,16(*48)" >> lwzu.tst.s $ powerpc64le-linux-gnu-as lwzu.tst.s $ powerpc64le-linux-gnu-objdump -D ./a.out 0000000000000000 <.text>: 0: 00 3a 40 05 .long 0x5403a00 4: 10 00 ec 85 lwzu r15,16(r12) $ echo -n -e '\x00\x3a\x40\x05\x10\x00\xec\x85' | pysvp64dis -v 00 3a 40 05 sv.lwzu *r62,16,*r48 10 00 ec 85 spec sv.lwzu RT,D(RA) binary [0:8] 00000101 [8:16] 01000000 [16:24] 00111010 [24:32] 00000000 [32:40] 10000101 [40:48] 11101100 [48:56] 00000000 [56:64] 00010000 opcode 0x84000000 mask 0xfc000000 RT 0111110 <- 62 (correct) 38, 39, 40, 41, 42, 19, {0} <- good extra2[0] D 0000000000010000 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63 RA 0110000 <- 48 (correct) 43, 44, 45, 46, 47, 23, {0} <- good extra2[2] mode ld/st imm: simple
$ echp 'sv.add *97,*23,*63' > svadd.tst.s $ powerpc64le-linux-gnu-as svadd.tst.s $ powerpc64le-linux-gnu-objdump -D ./a.out 0000000000000000 <.text>: 0: e0 2f 40 05 .long 0x5402fe0 4: 14 7a 05 7f add r24,r5,r15 $ echo -n -e '\xe0\x2f\x40\x05\x14\x7a\x05\x7f' | pysvp64dis -v e0 2f 40 05 sv.add *r97,*r23,*r63 14 7a 05 7f spec sv.add RT,RA,RB (OE=0 Rc=0) binary [0:8] 00000101 [8:16] 01000000 [16:24] 00101111 [24:32] 11100000 [32:40] 01111111 [40:48] 00000101 [48:56] 01111010 [56:64] 00010100 opcode 0x7c000214 mask 0xfc0007ff RT 1100001 <- correct 38, 39, 40, 41, 42, 19, 20 <- good extra3[0] RA 0010111 <- correct 43, 44, 45, 46, 47, 22, 23 <- good extra3[1] RB 0111111 <- correct 48, 49, 50, 51, 52, 25, 26 <- good extra3[2] OE 0 53 Rc 0 63 mode normal: simple
Fixed immediates. 00 00 40 05 bc 2,9,0x8 08 00 49 40 spec sv.bc 10,0,0x0 (AA=0 LK=0) binary [0:8] 00000101 [8:16] 01000000 [16:24] 00000000 [24:32] 00000000 [32:40] 01000000 [40:48] 01001001 [48:56] 00000000 [56:64] 00001000 opcode 0x40000000 mask 0xfc000003 BO 00010 38, 39, 40, 41, 42 BI 01001 43, 44, 45, 46, 47 target_addr 0000000000001000 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, {0}, {0} target_addr = EXTS(BD || 0b00)) AA 0 62 LK 0 63 mode branch 40 18 40 05 add r127,r31,r65 14 0a ff 7f spec sv.add r10,r0,r3 (OE=0 Rc=0) binary [0:8] 00000101 [8:16] 01000000 [16:24] 00011000 [24:32] 01000000 [32:40] 01111111 [40:48] 11111111 [48:56] 00001010 [56:64] 00010100 opcode 0x7c000214 mask 0xfc0007ff RT 1111111 19, 20, 38, 39, 40, 41, 42 extra3[0] RA 11111 43, 44, 45, 46, 47 extra3[1] RB 1000001 25, 26, 48, 49, 50, 51, 52 extra3[2] OE 0 53 Rc 0 63 mode normal: simple 14 02 ef 7f add r31,r15,r0 spec add r31,r15,r0 (OE=0 Rc=0) binary [0:8] 01111111 [8:16] 11101111 [16:24] 00000010 [24:32] 00010100 opcode 0x7c000214 mask 0xfc0007ff RT 11111 6, 7, 8, 9, 10 RA 01111 11, 12, 13, 14, 15 RB 00000 16, 17, 18, 19, 20 OE 0 21 Rc 0 31 00 15 40 05 lwzu r62,16(r63) 10 00 df 87 spec sv.lwzu r10,5376(r0) binary [0:8] 00000101 [8:16] 01000000 [16:24] 00010101 [24:32] 00000000 [32:40] 10000111 [40:48] 11011111 [48:56] 00000000 [56:64] 00010000 opcode 0x84000000 mask 0xfc000000 RT 0111110 {0}, 19, 38, 39, 40, 41, 42 extra2[0] D 0000000000010000 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63 RA 0111111 {0}, 23, 43, 44, 45, 46, 47 extra2[2] mode ld/st imm: simple 00 25 40 05 lwzu *r0,16(r63) 10 00 1f 84 spec sv.lwzu r10,9472(r0) binary [0:8] 00000101 [8:16] 01000000 [16:24] 00100101 [24:32] 00000000 [32:40] 10000100 [40:48] 00011111 [48:56] 00000000 [56:64] 00010000 opcode 0x84000000 mask 0xfc000000 RT 0000000 38, 39, 40, 41, 42, 19, {0} extra2[0] D 0000000000010000 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63 RA 0111111 {0}, 23, 43, 44, 45, 46, 47 extra2[2] mode ld/st imm: simple 00 35 40 05 lwzu *r2,16(r63) 10 00 1f 84 spec sv.lwzu r10,13568(r0) binary [0:8] 00000101 [8:16] 01000000 [16:24] 00110101 [24:32] 00000000 [32:40] 10000100 [40:48] 00011111 [48:56] 00000000 [56:64] 00010000 opcode 0x84000000 mask 0xfc000000 RT 0000010 38, 39, 40, 41, 42, 19, {0} extra2[0] D 0000000000010000 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63 RA 0111111 {0}, 23, 43, 44, 45, 46, 47 extra2[2] mode ld/st imm: simple
00 00 40 05 sv.bc 2,9,0x8 08 00 49 40 spec sv.bc BO,BI,target_addr (AA=0 LK=0) binary [0:8] 00000101 [8:16] 01000000 [16:24] 00000000 [24:32] 00000000 [32:40] 01000000 [40:48] 01001001 [48:56] 00000000 [56:64] 00001000 opcode 0x40000000 mask 0xfc000003 BO 00010 38, 39, 40, 41, 42 BI 01001 43, 44, 45, 46, 47 target_addr 0000000000001000 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, {0}, {0} target_addr = EXTS(BD || 0b00)) AA 0 62 LK 0 63 mode branch 40 18 40 05 sv.add r127,r31,r65 14 0a ff 7f spec sv.add RT,RA,RB (OE=0 Rc=0) binary [0:8] 00000101 [8:16] 01000000 [16:24] 00011000 [24:32] 01000000 [32:40] 01111111 [40:48] 11111111 [48:56] 00001010 [56:64] 00010100 opcode 0x7c000214 mask 0xfc0007ff RT 1111111 19, 20, 38, 39, 40, 41, 42 extra3[0] RA 11111 43, 44, 45, 46, 47 extra3[1] RB 1000001 25, 26, 48, 49, 50, 51, 52 extra3[2] OE 0 53 Rc 0 63 mode normal: simple 14 02 ef 7f add r31,r15,r0 spec add RT,RA,RB (OE=0 Rc=0) binary [0:8] 01111111 [8:16] 11101111 [16:24] 00000010 [24:32] 00010100 opcode 0x7c000214 mask 0xfc0007ff RT 11111 6, 7, 8, 9, 10 RA 01111 11, 12, 13, 14, 15 RB 00000 16, 17, 18, 19, 20 OE 0 21 Rc 0 31 00 15 40 05 sv.lwzu r62,16(r63) 10 00 df 87 spec sv.lwzu RT,D(RA) binary [0:8] 00000101 [8:16] 01000000 [16:24] 00010101 [24:32] 00000000 [32:40] 10000111 [40:48] 11011111 [48:56] 00000000 [56:64] 00010000 opcode 0x84000000 mask 0xfc000000 RT 0111110 {0}, 19, 38, 39, 40, 41, 42 extra2[0] D 0000000000010000 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63 RA 0111111 {0}, 23, 43, 44, 45, 46, 47 extra2[2] mode ld/st imm: simple 00 25 40 05 sv.lwzu *r0,16(r63) 10 00 1f 84 spec sv.lwzu RT,D(RA) binary [0:8] 00000101 [8:16] 01000000 [16:24] 00100101 [24:32] 00000000 [32:40] 10000100 [40:48] 00011111 [48:56] 00000000 [56:64] 00010000 opcode 0x84000000 mask 0xfc000000 RT 0000000 38, 39, 40, 41, 42, 19, {0} extra2[0] D 0000000000010000 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63 RA 0111111 {0}, 23, 43, 44, 45, 46, 47 extra2[2] mode ld/st imm: simple 00 35 40 05 sv.lwzu *r2,16(r63) 10 00 1f 84 spec sv.lwzu RT,D(RA) binary [0:8] 00000101 [8:16] 01000000 [16:24] 00110101 [24:32] 00000000 [32:40] 10000100 [40:48] 00011111 [48:56] 00000000 [56:64] 00010000 opcode 0x84000000 mask 0xfc000000 RT 0000010 38, 39, 40, 41, 42, 19, {0} extra2[0] D 0000000000010000 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63 RA 0111111 {0}, 23, 43, 44, 45, 46, 47 extra2[2] mode ld/st imm: simple
(In reply to Dmitry Selyutin from comment #22) > Fixed immediates. briiilliant. > 00 00 40 05 bc 2,9,0x8 > 08 00 49 40 > spec > sv.bc 10,0,0x0 (AA=0 LK=0) oh whoops, should be sv.bc BO,BI,target_addr (AA....) ah yes comment #23 has it 00 40 05 sv.bc 2,9,0x8 08 00 49 40 spec sv.bc BO,BI,target_addr (AA=0 LK=0) fantastic.
Today I had to mess around opcode matching again. This is still tricky, but at least it matches svshape2. Below is the code for media/audio/mp3/mp3_0_apply_window_float_basicsv.s.sv, with one change: I removed .rodata section, because it messed with the disassembler. I checked only some of the instructions, but mostly it looks like the expected result. addis 2,12,0 addi 2,2,0 addis 9,2,0 addi 9,9,0 rlwinm 7,7,2,0,29 mulli 0,7,31 add 10,6,0 setvl 0,0,8,1,1,0 addi 16,4,124 lfiwax 0,0,5 addi 5,3,64 sv.lfs *32,256(4) sv.lfs *40,256(5) sv.fmuls *32,*32,*40 sv.fadds 0,*32,0 addi 5,3,192 addi 4,4,128 sv.lfs *32,256(4) sv.lfs *40,256(5) sv.fmuls *32,*32,*40 sv.fsubs 0,0,*32 addi 4,4,65408 stfs 0,0(6) add 6,6,7 addi 4,4,4 addi 0,0,15 mtspr 288,0 addi 8,0,4 lfiwax 0,0,9 lfiwax 1,0,9 addi 5,3,64 add 5,5,8 sv.lfs *32,256(5) sv.lfs *40,256(4) sv.lfs *48,256(16) sv.fmuls *40,*32,*40 sv.fadds 0,0,*40 sv.fmuls *32,*32,*48 sv.fsubs 1,1,*32 addi 5,3,192 subf 5,8,5 addi 4,4,128 addi 16,16,128 sv.lfs *32,256(5) sv.lfs *40,256(4) sv.lfs *48,256(16) sv.fmuls *40,*32,*40 sv.fsubs 0,0,*40 sv.fmuls *32,*32,*48 sv.fsubs 1,1,*32 addi 4,4,65408 addi 16,16,65408 stfs 0,0(6) add 6,6,7 stfs 1,0(10) subf 10,7,10 addi 8,8,4 addi 4,4,4 addi 16,16,65532 bc 16,0,0xff4c addi 5,3,128 addi 4,4,128 lfiwax 0,0,9 sv.lfs *32,256(4) sv.lfs *40,256(5) sv.fmuls *32,*32,*40 sv.fsubs 0,0,*32 stfs 0,0(6) bclr 20,0,0
As for CRs support, these still don't work; Luke, it'd be great if you could take a look at 73f9c4c65cb886bcf0c242ca48fc9dc339fe5ba3.
Refactored D operand so that it shows the verbose information in a more obvious and explicit form. 47 00 90 59 fmvis f12,97 spec fmvis FRS,D binary [0:8] 01011001 [8:16] 10010000 [16:24] 00000000 [24:32] 01000111 opcode 0x58000006 mask 0x5800003e FRS 01100 6, 7, 8, 9, 10 D d0 = D[0:9] 0000000001 16, 17, 18, 19, 20, 21, 22, 23, 24, 25 d1 = D[10:15] 10000 11, 12, 13, 14, 15 d2 = D[16] 1 31
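To make the d0/d1/d2 breakdown above concrete: the 16-bit D immediate of fmvis is stored as three non-contiguous chunks, and the disassembler glues them back together as D = d0 || d1 || d2. A minimal sketch of that reassembly (chunk widths taken from the dump above; the helper name is made up):

    def reassemble(chunks):
        # Concatenate (value, width) chunks, most-significant chunk first.
        value = 0
        for chunk, width in chunks:
            value = (value << width) | (chunk & ((1 << width) - 1))
        return value

    # fmvis f12,97: d0=0b0000000001, d1=0b10000, d2=0b1  ->  0b0000000001100001 == 97
    assert reassemble([(0b0000000001, 10), (0b10000, 5), (0b1, 1)]) == 97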
After some thought, I decided to also slightly tune target_addr; hopefully it's clearer now. 00 00 40 05 sv.bc 12,2,0x1c 1c 00 82 41 spec sv.bc BO,BI,target_addr (AA=0 LK=0) binary [0:8] 00000101 [8:16] 01000000 [16:24] 00000000 [24:32] 00000000 [32:40] 01000001 [40:48] 10000010 [48:56] 00000000 [56:64] 00011100 opcode 0x40000000 mask 0x40000003 BO 01100 38, 39, 40, 41, 42 BI 00010 43, 44, 45, 46, 47 target_addr = EXTS(BD || 0b00)) BD 0000000000011100 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, {0}, {0} AA 0 62 LK 0 63 mode branch
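For reference, the target_addr = EXTS(BD || 0b00) line boils down to: append the two implicit zero bits to the 14-bit BD field and sign-extend the result. A small sketch of how the printed offset can be derived (illustrative only; the real code presumably operates on the bit spans directly):

    def branch_target(bd, bd_bits=14):
        value = bd << 2                      # BD || 0b00
        width = bd_bits + 2
        if value & (1 << (width - 1)):       # EXTS: negative displacement
            value -= 1 << width
        return value

    assert branch_target(0b00000000000111) == 0x1c   # the sv.bc 12,2,0x1c example above
    assert branch_target(0x3fd3) == -0xb4            # a backwards branch, printed as -0xb4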
From now on, we support pcode in extended mode, too. 00 38 40 05 sv.add *r3,r2,r1 14 0a 02 7c spec sv.add RT,RA,RB (OE=0 Rc=0) pcode RT <- (RA) + (RB) binary [0:8] 00000101 [8:16] 01000000 [16:24] 00111000 [24:32] 00000000 [32:40] 01111100 [40:48] 00000010 [48:56] 00001010 [56:64] 00010100 opcode 0x7c000214 mask 0x7c000615 RT (vector) 0000011 38, 39, 40, 41, 42, 19, 20 extra3[0] RA (scalar) 00010 43, 44, 45, 46, 47 extra3[1] RB (scalar) 00001 48, 49, 50, 51, 52 extra3[2] OE 0 53 Rc 0 63 mode normal: simple
(In reply to Dmitry Selyutin from comment #29) > From now on, we support pcode in extended mode, too. nice idea for some ops - setvl and svshape on the other hand are massive. suggest it being an option. > 00 38 40 05 sv.add *r3,r2,r1 > 14 0a 02 7c > spec > sv.add RT,RA,RB (OE=0 Rc=0) > pcode > RT <- (RA) + (RB) > RT (vector) > RA (scalar) wha-hey!
(In reply to Dmitry Selyutin from comment #26) > As for CRs support, these still don't work; Luke, it'd be great if you could > take a look at 73f9c4c65cb886bcf0c242ca48fc9dc339fe5ba3. willdo - looks like you forgot that some can be 5-bit others 3-bit. BA/BB/BC is 5-bit, BF/BFA (Bit Field) is 3-bit
Yet another day wasted on opcodes, but now we can finally recognize tricky cases like svshape2 and isel. Returning to CRs.
Since I'm unsure about CRs, I took this opportunity to cleanup opcodes even more. Here's brand new representation: 59 00 00 58 svshape 1,1,1,0,1 spec svshape SVxd,SVyd,SVzd,SVrm,vf opcodes 010110---------------0000-011001 010110---------------0001-011001 010110---------------0010-011001 010110---------------0011-011001 010110---------------0100-011001 010110---------------0101-011001 010110---------------0110-011001 010110---------------0111-011001 010110---------------1010-011001 010110---------------1011-011001 010110---------------1100-011001 010110---------------1101-011001 010110---------------1110-011001 010110---------------1111-011001 19 04 c0 5b svshape2 15,0,0,1,0,0 spec svshape2 SVo,SVyx,rmm,SVd,sk,mm opcodes 010110---------------100--011001
Established dictionaries for opcodes and names (the opcodes approach I frankly stole from binutils: I simply hash by PO). Before:

real 0m1.227s
user 0m1.210s
sys 0m0.015s

After:

real 0m0.986s
user 0m0.976s
sys 0m0.008s

I won't continue these experiments now due to time constraints, unless another obvious optimization comes to mind.
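For the record, the hashing amounts to: bucket the records by PO (the top 6 bits of the suffix) and only run the value/mask comparison within that bucket. A simplified sketch (the record layout here is stripped down to name/value/mask; the real tables carry far more, as the C structures further down show):

    from collections import defaultdict

    def build_po_index(records):
        index = defaultdict(list)
        for name, value, mask in records:      # value/mask over the 32-bit suffix
            index[(value >> 26) & 0x3f].append((name, value, mask))
        return index

    def lookup(index, insn):
        for name, value, mask in index[(insn >> 26) & 0x3f]:
            if (insn & mask) == value:
                return name
        return None

    index = build_po_index([("add", 0x7c000214, 0xfc0007ff)])
    assert lookup(index, 0x7f057a14) == "add"   # the add r24,r5,r15 word seen earlier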
$ echo "sv.isel 10,20,30,33" | pysvp64asm > svisel.tst.s $ powerpc64le-linux-gnu-as svisel.tst.s $ powerpc64le-linux-gnu-objdump -D ./a.out $ echo -n -e '\x40\x00\x40\x05\x5e\xf0\x54\x7d' | pysvp64dis -v 40 00 40 05 sv.isel r10,r20,r30,33 5e f0 54 7d BC (scalar) 000100001 {0}, {0}, {0}, 25, 53, 54, 55, 56, 57 extra2[3] $ echo "sv.isel 10,20,30,*33" | pysvp64asm > svisel.tst.s echo -n -e '\xc0\x00\x40\x05\x5e\xf0\x54\x7d' | pysvp64dis -v c0 00 40 05 sv.isel r10,r20,r30,*33 5e f0 54 7d BC (vector) 000100001 53, 54, 55, 25, {0}, {0}, {0}, 56, 57 extra2[3]
all good https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;h=b49d42520dbba44d6fc5421b57ea1202ed47252d echo "sv.isel 10,20,30,*483" | pysvp64asm > svisel.tst.s echo -n -e '\xc0\x00\x40\x05\xde\xf7\x54\x7d' | pysvp64dis -v c0 00 40 05 sv.isel r10,r20,r30,*483 de f7 54 7d BC (vector) 111100011 53, 54, 55, 25, {0}, {0}, {0}, 56, 57 extra2[3] echo "sv.isel 10,20,30,63" | pysvp64asm > svisel.tst.s echo -n -e '\x40\x00\x40\x05\xde\xf7\x54\x7d' | pysvp64dis -v 40 00 40 05 sv.isel r10,r20,r30,63 de f7 54 7d BC (scalar) 000111111 {0}, {0}, {0}, 25, 53, 54, 55, 56, 57 extra2[3]
Today's achievements:

1. Fixed Rc-enabled instructions (the matching algorithm was incorrect).
2. Refactored RM mappings so that these can be different depending on the mode (needed for CR ops and branches, probably others).
3. Supported multiple opcodes for binutils (not yet tested, but at first glance it looks fine).

Below is how it looks in C.

struct svp64_opcode {
    uint32_t value;
    uint32_t mask;
};

struct svp64_record {
    const char *name;
    struct svp64_desc desc;
    const struct svp64_opcode *opcodes;
    size_t nr_opcodes;
};

const struct svp64_opcode svp64_opcodes[] = \
{
    {
        .value = 0x7c0004ae,
        .mask = 0xfc0007fe,
    },
    /* snip */
}

const struct svp64_record svp64_records[] = \
{
    /* snip */
    {
        .name = "sv.isel",
        .desc = {
            .function = SVP64_FUNCTION_CR,
            .in1 = SVP64_IN1_SEL_RA_OR_ZERO,
            .in2 = SVP64_IN2_SEL_RB,
            .in3 = SVP64_IN3_SEL_NONE,
            .out = SVP64_OUT_SEL_RT,
            .out2 = SVP64_OUT_SEL_RT,
            .cr_in = SVP64_CR_IN_SEL_BC,
            .cr_in2 = SVP64_CR_IN2_SEL_NONE,
            .cr_out = SVP64_CR_OUT_SEL_NONE,
            .ptype = SVP64_PTYPE_P1,
            .etype = SVP64_ETYPE_EXTRA2,
            .extra_idx_in1 = SVP64_EXTRA_IDX1,
            .extra_idx_in2 = SVP64_EXTRA_IDX2,
            .extra_idx_in3 = SVP64_EXTRA_NONE,
            .extra_idx_out = SVP64_EXTRA_IDX0,
            .extra_idx_out2 = SVP64_EXTRA_NONE,
            .extra_idx_cr_in = SVP64_EXTRA_IDX3,
            .extra_idx_cr_out = SVP64_EXTRA_NONE,
        },
        .opcodes = &svp64_opcodes[288],
        .nr_opcodes = 32,
    },
    /* snip */
}
Oh, and, by the way:

{
    .name = "sv.crand",
    .desc = {
        .function = SVP64_FUNCTION_CR,
        .in1 = SVP64_IN1_SEL_NONE,
        .in2 = SVP64_IN2_SEL_NONE,
        .in3 = SVP64_IN3_SEL_NONE,
        .out = SVP64_OUT_SEL_NONE,
        .out2 = SVP64_OUT_SEL_NONE,
        .cr_in = SVP64_CR_IN_SEL_BA,
        .cr_in2 = SVP64_CR_IN2_SEL_BB,
        .cr_out = SVP64_CR_OUT_SEL_BT,
        .ptype = SVP64_PTYPE_P1,
        .etype = SVP64_ETYPE_EXTRA3,
        .extra_idx_in1 = SVP64_EXTRA_NONE,
        .extra_idx_in2 = SVP64_EXTRA_NONE,
        .extra_idx_in3 = SVP64_EXTRA_NONE,
        .extra_idx_out = SVP64_EXTRA_NONE,
        .extra_idx_out2 = SVP64_EXTRA_NONE,
        .extra_idx_cr_in = SVP64_EXTRA_IDX1,
        .extra_idx_cr_out = SVP64_EXTRA_IDX0,
    },
    .opcodes = &svp64_opcodes[226],
    .nr_opcodes = 1,
},
Refactored RM modes, since we're approaching to the stage where some modes reuse existing fields for their needs (e.g. cr_ops reuse ewsrc). Setting these in ewsrc is not obvious; however, when you do `insn.prefix.rm.cr_ops.sz = 1`, it is way clearer. Below is what we have now. Note that some fields are present in base RM class: I didn't want to duplicate all these fields in all children subclasses. This leads to the fact that say insn.prefix.rm.cr_ops will also have ewsrc inherited, it just won't be used. RM.mmode RM.mask RM.elwidth RM.ewsrc RM.subvl RM.mode RM.smask RM.extra RM.extra2 RM.extra2.idx0 RM.extra2.idx1 RM.extra2.idx2 RM.extra2.idx3 RM.extra3 RM.extra3.idx0 RM.extra3.idx1 RM.extra3.idx2 RM.normal RM.normal.mmode RM.normal.mask RM.normal.elwidth RM.normal.ewsrc RM.normal.subvl RM.normal.mode RM.normal.smask RM.normal.extra RM.normal.extra2 RM.normal.extra2.idx0 RM.normal.extra2.idx1 RM.normal.extra2.idx2 RM.normal.extra2.idx3 RM.normal.extra3 RM.normal.extra3.idx0 RM.normal.extra3.idx1 RM.normal.extra3.idx2 RM.normal.simple RM.normal.simple.dz RM.normal.simple.sz RM.normal.smr RM.normal.smr.RG RM.normal.pmr RM.normal.svmr RM.normal.svmr.SVM RM.normal.pu RM.normal.pu.SVM RM.normal.ffrc1 RM.normal.ffrc1.inv RM.normal.ffrc1.CR RM.normal.ffrc0 RM.normal.ffrc0.inv RM.normal.ffrc0.VLi RM.normal.ffrc0.RC1 RM.normal.sat RM.normal.sat.N RM.normal.sat.dz RM.normal.sat.sz RM.normal.satx RM.normal.satx.N RM.normal.satx.zz RM.normal.satx.dz RM.normal.satx.sz RM.normal.satpu RM.normal.satpu.N RM.normal.satpu.zz RM.normal.satpu.dz RM.normal.satpu.sz RM.normal.prrc1 RM.normal.prrc1.inv RM.normal.prrc1.CR RM.normal.prrc0 RM.normal.prrc0.inv RM.normal.prrc0.zz RM.normal.prrc0.RC1 RM.normal.prrc0.dz RM.normal.prrc0.sz RM.ldst_imm RM.ldst_imm.mmode RM.ldst_imm.mask RM.ldst_imm.elwidth RM.ldst_imm.ewsrc RM.ldst_imm.subvl RM.ldst_imm.mode RM.ldst_imm.smask RM.ldst_imm.extra RM.ldst_imm.extra2 RM.ldst_imm.extra2.idx0 RM.ldst_imm.extra2.idx1 RM.ldst_imm.extra2.idx2 RM.ldst_imm.extra2.idx3 RM.ldst_imm.extra3 RM.ldst_imm.extra3.idx0 RM.ldst_imm.extra3.idx1 RM.ldst_imm.extra3.idx2 RM.ldst_imm.simple RM.ldst_imm.simple.zz RM.ldst_imm.simple.els RM.ldst_imm.simple.dz RM.ldst_imm.simple.sz RM.ldst_imm.spu RM.ldst_imm.spu.zz RM.ldst_imm.spu.els RM.ldst_imm.spu.dz RM.ldst_imm.spu.sz RM.ldst_imm.ffrc1 RM.ldst_imm.ffrc1.inv RM.ldst_imm.ffrc1.CR RM.ldst_imm.ffrc0 RM.ldst_imm.ffrc0.inv RM.ldst_imm.ffrc0.els RM.ldst_imm.ffrc0.RC1 RM.ldst_imm.sat RM.ldst_imm.sat.N RM.ldst_imm.sat.zz RM.ldst_imm.sat.els RM.ldst_imm.sat.dz RM.ldst_imm.sat.sz RM.ldst_imm.prrc1 RM.ldst_imm.prrc1.inv RM.ldst_imm.prrc1.CR RM.ldst_imm.prrc0 RM.ldst_imm.prrc0.inv RM.ldst_imm.prrc0.els RM.ldst_imm.prrc0.RC1 RM.ldst_idx RM.ldst_idx.mmode RM.ldst_idx.mask RM.ldst_idx.elwidth RM.ldst_idx.ewsrc RM.ldst_idx.subvl RM.ldst_idx.mode RM.ldst_idx.smask RM.ldst_idx.extra RM.ldst_idx.extra2 RM.ldst_idx.extra2.idx0 RM.ldst_idx.extra2.idx1 RM.ldst_idx.extra2.idx2 RM.ldst_idx.extra2.idx3 RM.ldst_idx.extra3 RM.ldst_idx.extra3.idx0 RM.ldst_idx.extra3.idx1 RM.ldst_idx.extra3.idx2 RM.ldst_idx.simple RM.ldst_idx.simple.SEA RM.ldst_idx.simple.sz RM.ldst_idx.simple.dz RM.ldst_idx.stride RM.ldst_idx.stride.SEA RM.ldst_idx.stride.dz RM.ldst_idx.stride.sz RM.ldst_idx.sat RM.ldst_idx.sat.N RM.ldst_idx.sat.dz RM.ldst_idx.sat.sz RM.ldst_idx.prrc1 RM.ldst_idx.prrc1.inv RM.ldst_idx.prrc1.CR RM.ldst_idx.prrc0 RM.ldst_idx.prrc0.inv RM.ldst_idx.prrc0.zz RM.ldst_idx.prrc0.RC1 RM.ldst_idx.prrc0.dz RM.ldst_idx.prrc0.sz
(In reply to Dmitry Selyutin from comment #39) > now. Note that some fields are present in base RM class: I didn't want to > duplicate all these fields in all children subclasses. This leads to the > fact that say insn.prefix.rm.cr_ops will also have ewsrc inherited, it just > won't be used. ack. ok apologies, i had to *yet again* redo pack/unpack, the temporary hack is an overload of RM.elwidth. the reason is, pack/unpack has to go into SVSTATE and be specially updated by setvl. https://bugs.libre-soc.org/show_bug.cgi?id=871#c4 > RM.elwidth overloaded 2 bits, pack and unpack, and only in normal mode. this will move AT SOME time not now to a new VL mode > RM.normal.elwidth here is temporarily joined by pack/unpack bits > RM.normal.pu > RM.normal.pu.SVM removed. > RM.normal.satpu > RM.normal.satpu.N > RM.normal.satpu.zz > RM.normal.satpu.dz > RM.normal.satpu.sz all removed. > RM.ldst_imm.elwidth NOT joined by packunpack. > RM.ldst_imm.spu > RM.ldst_imm.spu.zz > RM.ldst_imm.spu.els > RM.ldst_imm.spu.dz > RM.ldst_imm.spu.sz all removed.
(In reply to Dmitry Selyutin from comment #25) > Today I had to mess around opcode matching again. This is still tricky, but > at least it matches svshape2. Below is the code for > media/audio/mp3/mp3_0_apply_window_float_basicsv.s.sv, interesting. i dropped them all into test_pysvp64dis.py, it gets a compile error, v. strange. constants not in range. > addis 2,12,0 > addi 2,2,0
I'll check later, I might have broken something or run the tests on a different branch...
I copied and pasted the last test into pysvp64asm (notorious for its longs). Here's what I got after calling binutils:

/tmp/test.s: Assembler messages:
/tmp/test.s:22: Error: operand out of range (0xff80 is not between 0xffffffffffff8000 and 0x7fff)
/tmp/test.s:51: Error: operand out of range (0xff80 is not between 0xffffffffffff8000 and 0x7fff)
/tmp/test.s:52: Error: operand out of range (0xff80 is not between 0xffffffffffff8000 and 0x7fff)
/tmp/test.s:59: Error: operand out of range (0xfffc is not between 0xffffffffffff8000 and 0x7fff)
/tmp/test.s:60: Error: operand out of range (0xff4c is not between 0xffffffffffff8000 and 0x7ffc)

addi 4,4,65408
addi 4,4,65408
addi 16,16,65408
addi 16,16,65532
bc 16,0,0xff4c

These are binutils internal checks. They have no relation to the disassembler itself.
I'll check the assembly one more time. Perhaps pysvp64dis was wrong, perhaps pysvp64asm, perhaps both.
OK, here's the reproducer:

pysvp64asm ./media/audio/mp3/mp3_0_apply_window_float_basicsv.s /tmp/py.s
powerpc64le-linux-gnu-as /tmp/py.s -o /tmp/py.o && powerpc-linux-gnu-objcopy -Obinary /tmp/py.o /tmp/bin.o
python3 src/openpower/sv/trans/pysvp64dis.py /tmp/bin.o -s
(In reply to Dmitry Selyutin from comment #43) > I copied and pasted the last test into pysvp64asm (notorious for its longs). > Here's what I got after calling binutils: > > addi 4,4,65408 these look like constants have been mis-converted, not recognised as sign-extended negative numbers, truncated instead to 16-bit > addi 4,4,65408 > addi 16,16,65408 > addi 16,16,65532 > bc 16,0,0xff4c
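In other words, the fix amounts to interpreting those 16-bit D/SI-style fields as two's-complement before printing. A minimal sketch of that interpretation (not the actual patch):

    def to_signed(value, bits=16):
        if value & (1 << (bits - 1)):
            value -= 1 << bits
        return value

    assert to_signed(65408) == -128    # addi 4,4,65408   ->  addi 4,4,-128
    assert to_signed(65532) == -4      # addi 16,16,65532 ->  addi 16,16,-4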
(In reply to Luke Kenneth Casson Leighton from comment #46) > these look like constants have been mis-converted, not recognised > as sign-extended negative numbers, truncated instead to 16-bit Yeah you missed the IRC conversation, I already found it. https://libre-soc.org/irclog/%23libre-soc.2022-09-13.log.html#t2022-09-13T18:31:25
Added support for signed operands (as many as I could find in fields.text; I could miss something, though). addi 2,2,0 addis 9,2,0 addi 9,9,0 rlwinm 7,7,2,0,29 mulli 0,7,31 add 10,6,0 setvl 0,0,8,1,1,0 addi 16,4,124 lfiwax 0,0,5 addi 5,3,64 sv.lfs *32,256(4) sv.lfs *40,256(5) sv.fmuls *32,*32,*40 sv.fadds 0,*32,0 addi 5,3,192 addi 4,4,128 sv.lfs *32,256(4) sv.lfs *40,256(5) sv.fmuls *32,*32,*40 sv.fsubs 0,0,*32 addi 4,4,-128 stfs 0,0(6) add 6,6,7 addi 4,4,4 addi 0,0,15 mtspr 288,0 addi 8,0,4 lfiwax 0,0,9 lfiwax 1,0,9 addi 5,3,64 add 5,5,8 sv.lfs *32,256(5) sv.lfs *40,256(4) sv.lfs *48,256(16) sv.fmuls *40,*32,*40 sv.fadds 0,0,*40 sv.fmuls *32,*32,*48 sv.fsubs 1,1,*32 addi 5,3,192 subf 5,8,5 addi 4,4,128 addi 16,16,128 sv.lfs *32,256(5) sv.lfs *40,256(4) sv.lfs *48,256(16) sv.fmuls *40,*32,*40 sv.fsubs 0,0,*40 sv.fmuls *32,*32,*48 sv.fsubs 1,1,*32 addi 4,4,-128 addi 16,16,-128 stfs 0,0(6) add 6,6,7 stfs 1,0(10) subf 10,7,10 addi 8,8,4 addi 4,4,4 addi 16,16,-4 bc 16,0,-0xb4 addi 5,3,128 addi 4,4,128 lfiwax 0,0,9 sv.lfs *32,256(4) sv.lfs *40,256(5) sv.fmuls *32,*32,*40 sv.fsubs 0,0,*32 stfs 0,0(6) bclr 20,0,0
====================================================================== FAIL: test_7_batch (__main__.SVSTATETestCase) [58:bc] these come from https://bugs.libre-soc.org/show_bug.cgi?id=917#c25 ---------------------------------------------------------------------- Traceback (most recent call last): File "openpower/sv/trans/test_pysvp64dis.py", line 27, in _do_tst "'%s' expected '%s'" % (line, expected[i])) AssertionError: 'bc 16,0,-0xb4' != 'bc 16,0,0xff4c' - bc 16,0,-0xb4 ? - ^ + bc 16,0,0xff4c ? ^^ + : instruction does not match 'bc 16,0,0xff4c' expected 'bc 16,0,-0xb4' 58 means instruction 58 in the list tested. https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;h=6b844da793861c21b8221db4c17370502b740601
Is it the recent version? ====================================================================== FAIL: test_7_batch (__main__.SVSTATETestCase) [26:mtspr] these come from https://bugs.libre-soc.org/show_bug.cgi?id=917#c25 ---------------------------------------------------------------------- Traceback (most recent call last): File "src/openpower/sv/trans/test_pysvp64dis.py", line 27, in _do_tst "'%s' expected '%s'" % (line, expected[i])) AssertionError: 'mtspr 288,0' != 'mtspr 9,0' - mtspr 288,0 ? ^^^ + mtspr 9,0 ? ^ : instruction does not match 'mtspr 9,0' expected 'mtspr 288,0' ---------------------------------------------------------------------- Ran 8 tests in 18.859s FAILED (failures=1, errors=1)
cf. commit b14eaa812791aa6c089bffcfa726a467c175ed8b, should be on dis branch
$ git checkout b14eaa812791aa6c089bffcfa726a46
Note: switching to 'b14eaa812791aa6c089bffcfa726a46'.

$ python3 src/openpower/sv/trans/test_pysvp64dis.py > /tmp/f
ERROR: test_2_d_custom_op (__main__.SVSTATETestCase)
FAIL: test_7_batch (__main__.SVSTATETestCase) [26:mtspr]
: instruction does not match 'mtspr 9,0' expected 'mtspr 288,0'
(In reply to Dmitry Selyutin from comment #50) > Is it the recent version? no. https://git.libre-soc.org/?p=openpower-isa.git;a=commit;h=6b844da793861c21b8221db4c17370502b740601 git checkout 6b844da7938
(In reply to Luke Kenneth Casson Leighton from comment #52)
> FAIL: test_7_batch (__main__.SVSTATETestCase) [26:mtspr]
> : instruction does not match 'mtspr 9,0' expected 'mtspr 288,0'

got it.

https://libre-soc.org/openpower/isa/sprset/

    n <- spr[5:9] || spr[0:4]

sigh. solvable by adding a definition to fields.txt for spr

https://git.libre-soc.org/?p=openpower-isa.git;a=blob;f=openpower/isatables/fields.text;h=95c76fb77fbcbe99c45c0a41d39a7cf50a0636c9;hb=e3ebeaafbc0fc1864f05746c49c1b6b98b3e12ad

    827 SPR (11:20)

wark-wark replace with

    827 SPR (16:20,11:15)

and altering the pseudocode to just "n <- spr"
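For anyone puzzled by the 9-vs-288 pair: the architected SPR number is the two 5-bit halves of the instruction field swapped, so one rendering reads the raw 10-bit field as a plain integer (288) while the other gives the architected number (9, i.e. CTR). In Python terms (illustrative only; the actual fix is the fields.text change quoted above):

    def spr_number(spr_field):
        spr_0_4 = (spr_field >> 5) & 0x1f    # first half of the raw field
        spr_5_9 = spr_field & 0x1f           # second half of the raw field
        return (spr_5_9 << 5) | spr_0_4      # n <- spr[5:9] || spr[0:4]

    assert spr_number(288) == 9    # the 'mtspr 288,0' vs 'mtspr 9,0' pair above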
(In reply to Luke Kenneth Casson Leighton from comment #54) > solvable by adding a definition to fields.txt for spr > and altering the pseudocode to just "n <- spr" sorted. works great.
    def test_2_d_custom_op(self):
        expected = [
            'fishmv 12,2',
            'fmvis 12,97',
            'addpcis 12,5',
        ]

stopped working. signed operand. probably addpcis.

  File "/home/lkcl/src/libresoc/openpower-isa/src/openpower/decoder/power_insn.py", line 1095, in dynamic_operands
    value = " ".join(dis)
  File "/home/lkcl/src/libresoc/openpower-isa/src/openpower/decoder/power_insn.py", line 551, in disassemble
    value = insn[span]
  File "/home/lkcl/src/libresoc/openpower-isa/src/openpower/decoder/power_fields.py", line 286, in __getitem__
    return self.__members[key]
KeyError: None
Refactored RM handling again, fixed multiple issues in both power_insn and power_fields modules in scope of these works. Also now RM types inherit the docstring, so we no longer need some table between RM classes and descriptions. Below is an example of what we have for now. As usual, this resides in dis branch; the work is experimental, but I checked pysvp64dis test and it still works. ff ff ff 07 sv.add. *r3,*r7,*r11 15 12 01 7c spec sv.add. RT,RA,RB (OE=0 Rc=1) pcode RT <- (RA) + (RB) binary [0:8] 00000111 [8:16] 11111111 [16:24] 11111111 [24:32] 11111111 [32:40] 01111100 [40:48] 00000001 [48:56] 00010010 [56:64] 00010101 opcodes 011111---------------01000010101 RT (vector) 0000011 38, 39, 40, 41, 42, 19, 20 extra3[0] RA (vector) 0000111 43, 44, 45, 46, 47, 22, 23 extra3[1] RB (vector) 0001011 48, 49, 50, 51, 52, 25, 26 extra3[2] OE 0 53 Rc 1 63 RM normal: Rc=1: pred-result CR sel RM 111111111111111111111111 6, 8, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31 RM.mmode 1 6 RM.mask 111 8, 10, 11 RM.elwidth 11 12, 13 RM.ewsrc 11 14, 15 RM.subvl 11 16, 17 RM.mode 11111 27, 28, 29, 30, 31 RM.smask 111 24, 25, 26 RM.extra 111111111 18, 19, 20, 21, 22, 23, 24, 25, 26 RM.extra2 111111111 18, 19, 20, 21, 22, 23, 24, 25, 26 RM.extra2.idx0 11 18, 19 RM.extra2.idx1 11 20, 21 RM.extra2.idx2 11 22, 23 RM.extra2.idx3 11 24, 25 RM.extra3 111111111 18, 19, 20, 21, 22, 23, 24, 25, 26 RM.extra3.idx0 111 18, 19, 20 RM.extra3.idx1 111 21, 22, 23 RM.extra3.idx2 111 24, 25, 26 RM.inv 1 10 RM.CR 11 11, 12
OK we have the first specifiers: vec2, vec3, vec4. ff ff ff 07 sv.add./vec4 *r3,*r7,*r11 15 12 01 7c
(In reply to Dmitry Selyutin from comment #58) > OK we have the first specifiers: vec2, vec3, vec4. > > ff ff ff 07 sv.add./vec4 *r3,*r7,*r11 > 15 12 01 7c added unit test test_10_vec https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;h=2fbdc7f20190afaa98a504708ae2114107f765e4
OK we only have BRANCH left. The recent changes in dis branch consider common BRANCH bits. It works with the disassembly and tests I've written, and to my understanding matches the tables. However, I cannot make `src/openpower/decoder/isa/test_caller_svp64_bc.py` work. Let's consider `sv.bc/all 12,*1,0xc` instruction. Here's what I get: 00 24 48 05 sv.bc/all 12,*1,0xc 0c 00 81 41 spec sv.bc BO,BI,target_addr AA=0 LK=0 pcode if (mode_is_64bit) then M <- 0 else M <- 32 if ¬BO[2] then CTR <- CTR - 1 ctr_ok <- BO[2] | ((CTR[M:63] != 0) ^ BO[3]) cond_ok <- BO[0] | ¬(CR[BI+32] ^ BO[1]) if ctr_ok & cond_ok then if AA then NIA <-iea EXTS(BD || 0b00) else NIA <-iea CIA + EXTS(BD || 0b00) if LK then LR <-iea CIA + 4 binary [0:8] 00000101 [8:16] 01001000 [16:24] 00100100 [24:32] 00000000 [32:40] 01000001 [40:48] 10000001 [48:56] 00000000 [56:64] 00001100 opcodes 010000------------------------00 BO 01100 38, 39, 40, 41, 42 BI (vector) 000000001 43, 44, 45, 46, 47, 22, 23, {0}, {0}, 43, 44, 45, 46, 47 extra3[1] target_addr = EXTS(BD || 0b00)) BD 0000000000001100 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, {0}, {0} AA 0 62 LK 0 63 RM branch: simple mode RM 000010000010010000000000 6, 8, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31 RM.mmode 0 6 RM.mask 000 8, 10, 11 RM.elwidth 10 12, 13 RM.ewsrc 00 14, 15 RM.subvl 00 16, 17 RM.mode 00000 27, 28, 29, 30, 31 RM.smask 000 24, 25, 26 RM.extra 100100000 18, 19, 20, 21, 22, 23, 24, 25, 26 RM.extra2 100100000 18, 19, 20, 21, 22, 23, 24, 25, 26 RM.extra2.idx0 10 18, 19 RM.extra2.idx1 01 20, 21 RM.extra2.idx2 00 22, 23 RM.extra2.idx3 00 24, 25 RM.extra3 100100000 18, 19, 20, 21, 22, 23, 24, 25, 26 RM.extra3.idx0 100 18, 19, 20 RM.extra3.idx1 100 21, 22, 23 RM.extra3.idx2 000 24, 25, 26 RM.ALL 1 12 RM.SNZ 0 13 RM.SL 0 25 RM.SLu 0 26 RM.LRu 0 30 RM.sz 0 31 RM.CTR 0 27 RM.VLS 0 28 This produces RM 0b000010000010010000000000. However, the original code did this: - sv_mode = ((bc_svstep << SVP64MODE.MOD2_MSB) | - (bc_vlset << SVP64MODE.MOD2_LSB) | - (bc_snz << SVP64MODE.BC_SNZ)) - srcwid = (bc_vsb << 1) | bc_lru - destwid = (bc_lru << 1) | bc_all This doesn't look like the correct thing (or at least it doesn't matches the table, where LRu goes to bit 22 and ALL goes to bit 4; same with VSb and (again??) LRu. For now it looks that the original code was incorrect, but I'd like to have a confirmation on this guess.
(In reply to Dmitry Selyutin from comment #60)
> This produces RM 0b000010000010010000000000. However, the original code did
> this:
>
> - sv_mode = ((bc_svstep << SVP64MODE.MOD2_MSB) |
> - (bc_vlset << SVP64MODE.MOD2_LSB) |
> - (bc_snz << SVP64MODE.BC_SNZ))
> - srcwid = (bc_vsb << 1) | bc_lru
> - destwid = (bc_lru << 1) | bc_all
>
> This doesn't look like the correct thing (or at least it doesn't matches the
> table, where LRu goes to bit 22 and ALL goes to bit 4; same with VSb and
> (again??) LRu. For now it looks that the original code was incorrect, but
> I'd like to have a confirmation on this guess.

chances are high that if it worked before and you made changes that did not match what worked because of assumptions "it must be wrong therefore it must be changed", that the assumptions are wrong.

that, or the only thing ever tested was the "all" mode (which it is, in the unit test).

consts.py. 19-23 = 0-4 in SVP64MODEb.

SVP64MODE.MOD2_MSB=0 ==> 19 => CTR mode (formerly bc_svstep)
SVP64MODE.MOD2_LSB=1 ==> 20 => VLset (correct)
SVP64MODE.BC_SNZ=3 ==> 22 => LRu not SNZ - never tested but hey
srcwid => 6/7 => CTi/VSB not VSB/LRu - never tested
destwid -> 4/5 => LRU/ALL not SNZ/all - probably wrong way.

so yes it's just "mostly borked" rather than "totally borked".

look in power_svp64_rm.py you will probably find i have the bit-order on these the wrong way round:

    # Counter-Test Mode.
    with m.If(mode[SVP64MODE.BC_CTRTEST]):
        with m.If(self.rm_in.ewsrc[0]):
            comb += self.bc_ctrtest.eq(SVP64BCCTRMode.TEST_INV)
        with m.Else():
            comb += self.bc_ctrtest.eq(SVP64BCCTRMode.TEST)
    # BC Mode ALL or ANY (Great-Big-AND-gate or Great-Big-OR-gate)
    comb += self.bc_gate.eq(self.rm_in.elwidth[0])
    # Link-Register Update
    comb += self.bc_lru.eq(self.rm_in.elwidth[1])
    comb += self.bc_vsb.eq(self.rm_in.ewsrc[1])

swap these to:

    # Counter-Test Mode.
    with m.If(mode[SVP64MODE.BC_CTRTEST]):
        with m.If(self.rm_in.ewsrc[1]):
            comb += self.bc_ctrtest.eq(SVP64BCCTRMode.TEST_INV)
        with m.Else():
            comb += self.bc_ctrtest.eq(SVP64BCCTRMode.TEST)
    # BC Mode ALL or ANY (Great-Big-AND-gate or Great-Big-OR-gate)
    comb += self.bc_gate.eq(self.rm_in.elwidth[1])
    # Link-Register Update
    comb += self.bc_lru.eq(self.rm_in.elwidth[0])
    comb += self.bc_vsb.eq(self.rm_in.ewsrc[0])

and you'll likely find it "works"
(In reply to Luke Kenneth Casson Leighton from comment #61) > the assumptions are wrong. This is not some "assumption". I see mismatches between the tables and the code. You look at the code and make an "assumption" this is correct, but it contradicts the spec. https://libre-soc.org/openpower/sv/branches/ > that, or the only thing ever tested was the "all" mode (which it is, > in the unit test). And the position of it is wrong. > destwid -> 4/5 => LRU/ALL not SNZ/all - probably wrong way. ALL/SNZ in spec. In this exact order. > so yes it's just "mostly borked" rather than "totally borked". Mostly or totally depends on the spec. I consider the spec, since it was changed recently along with power_insn. > look in power_svp64_rm.py you will probably find i have the bit-order > on these the wrong way round: Again: the "wrong" way depends on which resource we consider first, the code or the spec. If I recall correctly, spec edits were recent. > and you'll likely find it "works" Either it works (without quotes), or the spec is "correct". And the point of that long post was to find out which of these should be considered first.
(In reply to Dmitry Selyutin from comment #62) > Mostly or totally depends on the spec. I consider the spec, since it was > changed recently not RM.branches. only crops and ldst. spec correct. v. late. brainmelt. do power_svp64_rm.py swap.
I adapted the code to the spec, cf. the dis branch. But it's up to _you_ to make the decision on what's correct, the spec or the code. Otherwise I'm bound to "assumptions".

https://git.libre-soc.org/?p=openpower-isa.git;a=commit;h=d7c072fd0c1b1cd1db12e617c3c4e7c3243eb220
https://git.libre-soc.org/?p=openpower-isa.git;a=commit;h=fd84e2b00f65a16518b45023c6b6a8693599d8fe
(In reply to Luke Kenneth Casson Leighton from comment #63) > v. late. brainmelt. Same here. > do power_svp64_rm.py swap. Done.
All branch modes are completed, including complex stuff like the following:

sv.bc/vs/all/snz/sl/slu/lru 12,*1,0xc
sv.bc/vsi/all/snz/sl/slu/lru 12,*1,0xc
sv.bc/vsb/all/snz/sl/slu/lru 12,*1,0xc
sv.bc/vsbi/all/snz/sl/slu/lru 12,*1,0xc
sv.bc/ctr/all/snz/sl/slu/lru 12,*1,0xc
sv.bc/cti/all/snz/sl/slu/lru 12,*1,0xc
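Context for readers: the /xxx pieces are specifiers parsed off the mnemonic before the operands are handled; conceptually something like the following (a simplified sketch, not the actual pysvp64asm parser):

    def split_mnemonic(mnemonic):
        # 'sv.bc/all/snz' -> ('sv.bc', ['all', 'snz'])
        opcode, *specifiers = mnemonic.split("/")
        return opcode, specifiers

    assert split_mnemonic("sv.bc/ctr/all/snz/sl/slu/lru 12,*1,0xc".split()[0]) == \
        ("sv.bc", ["ctr", "all", "snz", "sl", "slu", "lru"])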
BTW see how nice this BranchCTRVLSRM class is:

class BranchVLSRM(BranchBaseRM):
    """branch: VLSET mode"""
    VSb: BaseRM[7]
    VLI: BaseRM[21]

class BranchCTRRM(BranchBaseRM):
    """branch: CTR-test mode"""
    CTi: BaseRM[6]

class BranchCTRVLSRM(BranchVLSRM, BranchCTRRM):
    """branch: CTR-test+VLSET mode"""
    pass

I've omitted the specifiers section, but these are inherited too. So we really have CTR-test+VLSET, even in code. It inherits first from CTR-test RM, then from VLS RM. To me this new hierarchy looks really cool; it literally matches the spec.
Ah yeah one note on these patches. In pysvp64asm, there are some sections with `if not is_bc`. Eventually these should go down the drain, as well as hacking around srcwid, dstwid et al., and be replaced by RM-specific fields. I started with branches, because I had to change this code to support all modes, but all modes should eventually be switched to this fields-based method, since it closely follows the spec, setting the individual fields. This is exactly how I want to do it in binutils, both assembly and disassembly.
(In reply to Dmitry Selyutin from comment #67)
> BTW see how nice this BranchCTRVLSRM class is:
>
> class BranchVLSRM(BranchBaseRM):
>     """branch: VLSET mode"""
>     VSb: BaseRM[7]
>     VLI: BaseRM[21]
>
> class BranchCTRRM(BranchBaseRM):
>     """branch: CTR-test mode"""
>     CTi: BaseRM[6]
>
> class BranchCTRVLSRM(BranchVLSRM, BranchCTRRM):
>     """branch: CTR-test+VLSET mode"""
>     pass

love it. "duh" level of simplicity.

(In reply to Dmitry Selyutin from comment #66)
> All branch modes are completed, including complex stuff like the following:
>
> sv.bc/vs/all/snz/sl/slu/lru 12,*1,0xc
> sv.bc/vsi/all/snz/sl/slu/lru 12,*1,0xc
> sv.bc/vsb/all/snz/sl/slu/lru 12,*1,0xc
> sv.bc/vsbi/all/snz/sl/slu/lru 12,*1,0xc
> sv.bc/ctr/all/snz/sl/slu/lru 12,*1,0xc
> sv.bc/cti/all/snz/sl/slu/lru 12,*1,0xc

almost-unavoidably-scarily-long! :) you can - ha ha - also have vs/ctr vsbi/cti etc. etc.

should be able to get it down on "sl/slu" by combining those into a 2-3-letter acronym: sl,slu,SLu or something.

also, /sz is another one to combine: /snz also means "/sz"

* /sz = bit 23=1, bit 5=0
* /snz = bit 23+5=1
* ILLEGAL bit 23=0, bit5=1
* {nothing} bit 23=0, bit5=0

(In reply to Dmitry Selyutin from comment #68)
> Ah yeah one note on these patches. In pysvp64asm, there are some sections
> with `if not is_bc`. Eventually these should go down the drain,

goooood.
+ elif encmode == 'ctr':
+     svp64_rm.branch.CTR = 1
+     svp64_rm.branch.VLS = 0    <--- needs taking out
+     svp64_rm.branch.ctr.CTi = 1
+ elif encmode == 'cti':
+     svp64_rm.branch.CTR = 1
+     svp64_rm.branch.ctr.CTi = 1

CTR and VLset mode can be combined!

- elif encmode == 'snz': # sz (only) already set above
-     src_zero = 1
-     bc_snz = 1
+ elif encmode == 'snz':
+     svp64_rm.branch.SNZ = 1

notice how src_zero=1 was removed? that's a bug. src_zero=1 *must* be enabled if snz is requested. this makes disasm "/sz/snz" redundant: it's either "/sz" or "/snz" or neither.
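Putting the /sz-vs-/snz rule from the last two comments into code form (a sketch using the RM.sz / RM.SNZ field names from the dumps; the dict standing in for the RM object is made up):

    def encode_sz_snz(specifiers, rm):
        # /snz implies sz as well; SNZ=1 with sz=0 is the ILLEGAL combination.
        if "snz" in specifiers:
            rm["sz"] = 1
            rm["SNZ"] = 1
        elif "sz" in specifiers:
            rm["sz"] = 1
        return rm

    assert encode_sz_snz(["all", "snz"], {"sz": 0, "SNZ": 0}) == {"sz": 1, "SNZ": 1}
    assert encode_sz_snz(["sz"], {"sz": 0, "SNZ": 0}) == {"sz": 1, "SNZ": 0}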
this is the pseudocode for sv.bc: lr_ok <- LK svlr_ok <- SVRMmode.SL if ctr_ok & cond_ok then if SVRMmode.LRu then lr_ok <- ¬lr_ok if SVRMmode.SLu then svlr_ok <- ¬svlr_ok if lr_ok then LR <-iea CIA + 4 if svlr_ok then SVLR <- SVSTATE that means *four* permutations *each* for: * LK=0/1 and LRu=0/1 * SL=0/1 and SLu=0/1 /lru we can do nothing about, it combines with the mnemonic name, "bc" or "bcl". in theory it would be possible to chuck in the letter "u" on that. "sv.bcu", "sv.bclu". svlr could be done similarly but keeping it to 3 letters max. "/sl" and "/slu" and "/sll" or "/s" "/slu "/sl" or "/sl" "/slu" "sli" where this last is for SL=0,SLu=1 /s is fine because it will not be used elsewhere.
I don't think modifying the names of instructions is a good idea, unless there are unprefixed instructions which match these names. If you have sv.blu, I'd expect blu. And, well, I don't really think we should introduce something like this at all; I don't even like these permutations on VSb/VLI and would rather choose explicit /vsb/vli.
(In reply to Dmitry Selyutin from comment #72) > I don't think modifying the names of instruction is a good idea, unless > there're unprefixed instructions which match these names. If you have > sv.blu, I'd expect blu. true, and there is a hard rule about not doing exactly that. best to stick with it. > And, well, I don't really think we should introduce > something like this at all, I don't even like these permutations on VSb/VLI > and would rather choose explicit /vsb/vli. hum ok. it's going to be mad-length instructions but then again that is normal for 3D GPU ISAs.
(In reply to Luke Kenneth Casson Leighton from comment #70) > CTR and VLset mode can be combined! ah, just saw the followup patch correcting that. just leaves src_zero (and catching /sz/snz in pysvp64dis)
Ah wait I missed sz/snz. Will update.
I've just pushed the changes for /sz and /snz.
3037 echo 'sv.subf/ff=eq 0,0,0' | pysvp64asm > sv.subf.ffirst.tst.s 3038 echo 'sv.subf./ff=eq 0,0,0' | pysvp64asm > sv.subf.ffirst.tst.s 3039 vi sv.subf.ffirst.tst.s 3040 powerpc64le-linux-gnu-as sv.subf.ffirst.tst.s 3041 powerpc64le-linux-gnu-objdump -D ./a.out 3042 echo -n -e '\x0c\x00\x40\x05\x51\x00\x00\x7c' | pysvp64dis -v 0c 00 40 05 sv.subf./ff=eq r0,r0,r0 51 00 00 7c spec sv.subf. RT,RA,RB OE=0 Rc=1 RM normal: Rc=1: ffirst CR sel RM 000000000000000000001100 RM.mode 01100 27, 28, 29, 30, 31 RM.inv 1 29 RM.CR 00 30, 31
cat /tmp/test.s sv.cmp/ff=RC1 *0,1,*4,3 cat /tmp/test.s && \ SILENCELOG=true pysvp64asm /tmp/test.s /tmp/test.py.s && \ powerpc64le-linux-gnu-as /tmp/test.py.s -o /tmp/test.o && \ powerpc64le-linux-gnu-objcopy -Obinary /tmp/test.o /tmp/bin.o && \ pysvp64dis -v /tmp/bin.o sv.cmp/ff=RC1 *0,1,*4,3 09 24 40 05 sv.cmp *0,1,*r4,r3 00 18 21 7c spec sv.cmp BF,L,RA,RB pcode if L = 0 then a <- EXTS((RA)[XLEN/2:XLEN-1]) b <- EXTS((RB)[XLEN/2:XLEN-1]) else a <- (RA) b <- (RB) if a < b then c <- 0b100 else if a > b then c <- 0b010 else c <- 0b001 CR[4*BF+32:4*BF+35] <- c || XER[SO] binary [0:8] 00000101 [8:16] 01000000 [16:24] 00100100 [24:32] 00001001 [32:40] 01111100 [40:48] 00100001 [48:56] 00011000 [56:64] 00000000 opcodes 011111---------------0000000000- BF (vector) 00000 38, 39, 40, 19, 20, {0}, {0} extra3[0] L 1 42 RA (vector) 0000100 43, 44, 45, 46, 47, 22, 23 extra3[1] RB (scalar) 00011 48, 49, 50, 51, 52 extra3[2] RM None RM 000000000010010000001001 6, 8, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31 RM.mmode 0 6 RM.mask 000 8, 10, 11 RM.elwidth 00 12, 13 RM.ewsrc 00 14, 15 RM.subvl 00 16, 17 RM.mode 01001 27, 28, 29, 30, 31 RM.smask 000 24, 25, 26 RM.extra 100100000 18, 19, 20, 21, 22, 23, 24, 25, 26 RM.extra2 100100000 18, 19, 20, 21, 22, 23, 24, 25, 26 RM.extra2.idx0 10 18, 19 RM.extra2.idx1 01 20, 21 RM.extra2.idx2 00 22, 23 RM.extra2.idx3 00 24, 25 RM.extra3 100100000 18, 19, 20, 21, 22, 23, 24, 25, 26 RM.extra3.idx0 100 18, 19, 20 RM.extra3.idx1 100 21, 22, 23 RM.extra3.idx2 000 24, 25, 26 RM.SNZ 0 15 RM.simple 000000000010010000001001 6, 8, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31 RM.simple.mmode 0 6 RM.simple.mask 000 8, 10, 11 RM.simple.elwidth 00 12, 13 RM.simple.ewsrc 00 14, 15 RM.simple.subvl 00 16, 17 RM.simple.mode 01001 27, 28, 29, 30, 31 RM.simple.smask 000 24, 25, 26 RM.simple.extra 100100000 18, 19, 20, 21, 22, 23, 24, 25, 26 RM.simple.extra2 100100000 18, 19, 20, 21, 22, 23, 24, 25, 26 RM.simple.extra2.idx0 10 18, 19 RM.simple.extra2.idx1 01 20, 21 RM.simple.extra2.idx2 00 22, 23 RM.simple.extra2.idx3 00 24, 25 RM.simple.extra3 100100000 18, 19, 20, 21, 22, 23, 24, 25, 26 RM.simple.extra3.idx0 100 18, 19, 20 RM.simple.extra3.idx1 100 21, 22, 23 RM.simple.extra3.idx2 000 24, 25, 26 RM.simple.SNZ 0 15 RM.simple.RG 1 28 RM.simple.dz 0 30 RM.simple.sz 1 31 RM.mr 000000000010010000001001 6, 8, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31 RM.mr.mmode 0 6 RM.mr.mask 000 8, 10, 11 RM.mr.elwidth 00 12, 13 RM.mr.ewsrc 00 14, 15 RM.mr.subvl 00 16, 17 RM.mr.mode 01001 27, 28, 29, 30, 31 RM.mr.smask 000 24, 25, 26 RM.mr.extra 100100000 18, 19, 20, 21, 22, 23, 24, 25, 26 RM.mr.extra2 100100000 18, 19, 20, 21, 22, 23, 24, 25, 26 RM.mr.extra2.idx0 10 18, 19 RM.mr.extra2.idx1 01 20, 21 RM.mr.extra2.idx2 00 22, 23 RM.mr.extra2.idx3 00 24, 25 RM.mr.extra3 100100000 18, 19, 20, 21, 22, 23, 24, 25, 26 RM.mr.extra3.idx0 100 18, 19, 20 RM.mr.extra3.idx1 100 21, 22, 23 RM.mr.extra3.idx2 000 24, 25, 26 RM.mr.SNZ 0 15 RM.mr.RG 1 28 RM.mr.dz 0 30 RM.mr.sz 1 31 RM.ff3 000000000010010000001001 6, 8, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31 RM.ff3.mmode 0 6 RM.ff3.mask 000 8, 10, 11 RM.ff3.elwidth 00 12, 13 RM.ff3.ewsrc 00 14, 15 RM.ff3.subvl 00 16, 17 RM.ff3.mode 01001 27, 28, 29, 30, 31 RM.ff3.smask 000 24, 25, 26 RM.ff3.extra 100100000 18, 19, 20, 21, 22, 23, 24, 25, 26 RM.ff3.extra2 100100000 18, 19, 20, 21, 22, 23, 24, 25, 26 
RM.ff3.extra2.idx0 10 18, 19 RM.ff3.extra2.idx1 01 20, 21 RM.ff3.extra2.idx2 00 22, 23 RM.ff3.extra2.idx3 00 24, 25 RM.ff3.extra3 100100000 18, 19, 20, 21, 22, 23, 24, 25, 26 RM.ff3.extra3.idx0 100 18, 19, 20 RM.ff3.extra3.idx1 100 21, 22, 23 RM.ff3.extra3.idx2 000 24, 25, 26 RM.ff3.SNZ 0 15 RM.ff3.VLi 1 28 RM.ff3.inv 0 29 RM.ff3.CR 01 30, 31 RM.ff3.zz 0 14 RM.ff3.sz 0 14 RM.ff3.dz 0 14 RM.ff5 000000000010010000001001 6, 8, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31 RM.ff5.mmode 0 6 RM.ff5.mask 000 8, 10, 11 RM.ff5.elwidth 00 12, 13 RM.ff5.ewsrc 00 14, 15 RM.ff5.subvl 00 16, 17 RM.ff5.mode 01001 27, 28, 29, 30, 31 RM.ff5.smask 000 24, 25, 26 RM.ff5.extra 100100000 18, 19, 20, 21, 22, 23, 24, 25, 26 RM.ff5.extra2 100100000 18, 19, 20, 21, 22, 23, 24, 25, 26 RM.ff5.extra2.idx0 10 18, 19 RM.ff5.extra2.idx1 01 20, 21 RM.ff5.extra2.idx2 00 22, 23 RM.ff5.extra2.idx3 00 24, 25 RM.ff5.extra3 100100000 18, 19, 20, 21, 22, 23, 24, 25, 26 RM.ff5.extra3.idx0 100 18, 19, 20 RM.ff5.extra3.idx1 100 21, 22, 23 RM.ff5.extra3.idx2 000 24, 25, 26 RM.ff5.SNZ 0 15 RM.ff5.VLi 1 28 RM.ff5.inv 0 29 RM.ff5.dz 0 30 RM.ff5.sz 1 31