a format needs to be agreed on for sv assembly mnemonics which is acceptable
draft implemented at
gcc list post:
binutils list post:
the goal here is to find the SVP64 assembly mnemonic format that is acceptable to all of:
* gcc developers
* binutils developers
* SV's developers and creators
* assembly code developers
clarity and "least conceptual disruption" should be given high priority.
example: actually changing the number of arguments should be given a lower
priority given that it implies that the underlying v3.0B mnemonic is entirely
different when prefixed by SVP64.
the full set of modes is here:
the overview describing the Simple-V concept is here:
clarifying the characteristics and requirements here (edit as required)
1 SVP64 embeds v3.0B scalar instructions as a hardware-level for-loop
2 implications: the scalar instruction needs to be "preserved" i.e. obvious from the SV-augmentation
3 the SV-augmentation must pass through gcc macro parsing without interference in that parsing
4 the SV-augmentation must meet binutils parsing requirements or at least not need significant changes to binutils
5 numerical-only register numbering should
from the above, we can evaluate an idea kindly suggested by Segher to fulfil requirement (5): that the SVP64 augmentation of registers as Vector or scalar be expressed as additional arguments.
instead of:
svadd 5.v, 6.v, 7.v
it would be:
svadd 5, 6, 7, 1, 1, 1
or svadd 5, 6, 7, 0b111
the issue with this is that it breaks requirement (2), in multiple ways.
firstly, it gives the impression that SV is adding extra arguments to the *scalar* v3.0B instruction, when it is not (SVP64 is *augmenting* the registers)
secondly: some instructions have optional arguments as scalar (sync is an alias for sync 0). if Vectorised it becomes ambiguous as to whether the optional argument applies to the underlying scalar operation or to the Vectorisation Augmentation/Embedding.
thirdly: adding new scalar instructions becomes problematic, in that every new instruction added, by having this lack of abstraction, now has to be evaluated carefully by inventing a new syntax not just for the scalar variant but also for the Vectorised variant.
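to make the extra-argument idea concrete, here is a minimal Python sketch that rewrites the suffixed form into the two extra-argument spellings shown above. the function name and the bit order of the mask (first operand as the most significant bit) are assumptions made for illustration, not something agreed in this discussion.

```python
def to_extra_args(line):
    """Rewrite 'svadd 5.v, 6.v, 7.v' into the two extra-argument spellings.

    Hypothetical helper: the mask bit order (first operand = leftmost bit)
    is an assumption, not part of any agreed syntax.
    """
    mnemonic, _, rest = line.strip().partition(" ")
    regs, flags = [], []
    for operand in rest.split(","):
        number, _, suffix = operand.strip().partition(".")
        regs.append(number)
        # "v" marks a Vector register; anything else is treated as scalar
        flags.append(1 if suffix == "v" else 0)
    # one extra argument per register: svadd 5, 6, 7, 1, 1, 1
    per_operand = f"{mnemonic} {', '.join(regs + [str(f) for f in flags])}"
    # a single bitmask argument: svadd 5, 6, 7, 0b111
    mask = "".join(str(f) for f in flags)
    bitmask = f"{mnemonic} {', '.join(regs)}, 0b{mask}"
    return per_operand, bitmask


if __name__ == "__main__":
    for spelling in to_extra_args("svadd 5.v, 6.v, 7.v"):
        print(spelling)
```

note how the output has a different operand count from the scalar "add 5, 6, 7", which is exactly the concern raised under requirement (2).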
the "fallback position": ".long xxxxx; op N,N,N" where .long contains a v3.1 EXT01 major opcode
clearly, having gcc output such is not exactly desirable.
an alternative is the addition of an svp64 32 bit instruction:
this would become very confusing due to the fact that the qualified v3.0 32 bit instruction following it will have non-obvious register numbers
(SVP64 augments 5-bit register numbers RA etc, FRA etc and BA etc, and also 3-bit CR fields BFA etc)
what about just modifying the mnemonic:
sv.add. 5.v, 10.s, 12.v
sv.add 120.s, 120.s, 121.s
sv.ldu 20.v, 8(23.s)
sv.add.vsv. 5, 10, 12
sv.add.sss 120, 120, 121
sv.ldu.vs 20, 8(23)
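the relationship between the two spellings above can be sketched in Python as follows. this is only an illustration of the proposed mapping: the helper name is made up, and it assumes every register carries an explicit ".v" or ".s" marker.

```python
import re

# a register number followed by its ".v" or ".s" marker
REG = re.compile(r"(\d+)\.([vs])")


def move_marks_to_mnemonic(line):
    """Move per-register v/s markers into a mnemonic suffix (hypothetical helper)."""
    mnemonic, _, operands = line.strip().partition(" ")
    # collect the markers in operand order, e.g. "vsv"
    marks = "".join(m.group(2) for m in REG.finditer(operands))
    # strip the markers, leaving numerical-only register numbers
    plain = REG.sub(r"\1", operands)
    if mnemonic.endswith("."):
        # keep the Rc=1 dot at the very end: sv.add. -> sv.add.vsv.
        mnemonic = f"{mnemonic[:-1]}.{marks}."
    else:
        mnemonic = f"{mnemonic}.{marks}"
    return f"{mnemonic} {plain}"


if __name__ == "__main__":
    print(move_marks_to_mnemonic("sv.add. 5.v, 10.s, 12.v"))
    print(move_marks_to_mnemonic("sv.ldu 20.v, 8(23.s)"))
```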
(In reply to Jacob Lifshay from comment #4)
> sv.add.vsv. 5, 10, 12
> sv.add.sss 120, 120, 121
> sv.ldu.vs 20, 8(23)
not keen on these as they separate out the vector-note from the register numbers. this makes it really hard to write assembly code (including hard to write unit tests).
additionally when immediate fields (L, sh, mb) are involved it becomes horribly confusing
also for LD/ST there is a notation for unit stride and element stride where it is clear which one is which by marking the immediate-offset rather than the register
separating out svv to be part of the op is, well, if we're forced to, then we're forced to. the lack of clarity means it's definitely low on the list.
(In reply to Luke Kenneth Casson Leighton from comment #5)
I'd say this is waay less clear...
if the dot in register numbers is the only problem, we might as well drop it. v and s will do just fine ending the register number. but I'm curious as to what motivated segher's objections. was it because '.' could be part of a number?
(In reply to Alexandre Oliva from comment #7)
> if the dot in register numbers is the only problem, we might as well drop
> it. v and s will do just fine ending the register number. but I'm curious
> as to what motivated segher's objections. was it because '.' could be part
> of a number?
yes, he pointed out that "." anywhere will not be well received.
i suspect the only reason that "." is allowed at all is because
it's part of the OpenPOWER v3.0B spec for mnemonics:
add. => set Rc=1
add => set Rc=0
he did say we need some way to support numerical-only register numbers.
to achieve that i am inclined there to say "screw it" and simply have:
svadd/extra=0xNN N1, N2, N3
as an "option" where the N1, N2, N3 is *verbatim* v3.0B *not* the
*SV-augmented* register numbers
svadd/extra=0xNN N1, N2, N3
would translate to:
.long 0x...NN # that NN is the same bits from extra above
add N1, N2, N3 # these do not change
in other words the actual v3.0B augmented opcode is clearly unmodified
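a rough Python sketch of that translation, purely as an illustration. the helper name is hypothetical, and the upper bits of the prefix word are deliberately left as the same "..." placeholder used above because they are not specified here, so the output is notation rather than assemblable text.

```python
def expand_extra(line):
    """Expand 'svOP/extra=0xNN N1, N2, N3' into a prefix word plus scalar op.

    Hypothetical helper. The prefix is emitted as '.long 0x...NN' with the
    '...' kept as a placeholder: the remaining prefix bits are not given.
    """
    head, _, operands = line.strip().partition(" ")
    mnemonic, _, qualifier = head.partition("/")
    if not mnemonic.startswith("sv") or not qualifier.startswith("extra=0x"):
        raise ValueError(f"not an svOP/extra=0xNN line: {line!r}")
    extra = qualifier[len("extra=0x"):]
    scalar_op = mnemonic[2:]  # "svadd" -> "add"
    # the operands are copied verbatim: they are v3.0B register numbers
    return [f".long 0x...{extra}", f"{scalar_op} {operands.strip()}"]


if __name__ == "__main__":
    for out in expand_extra("svadd/extra=0x3f 5, 6, 7"):
        print(out)
```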
(In reply to Jacob Lifshay from comment #6)
> (In reply to Luke Kenneth Casson Leighton from comment #5)
> > 4.v(r3)
> I'd say this is waay less clear...
there does exist an alternative which is to have the ld/st be qualified
with a mode. i thiiink... no i haven't put LD/ST modes into SVP64Asm yet
so can't point you at it.
example (made up):
svld/els/ew=8 RS, D(RA)
this says "els" (for element-strided) and the hw knows to deploy this:
for i in range(VL):
    # element-strided - the very unclear D.v(RA)
    Effective_Address = (i * D) + GPR[RA]
as opposed to:
for i in range(VL):
    # unit strided
    Effective_Address = D + (i * elwidth) + GPR[RA]
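the two address patterns can be compared side by side with a small runnable sketch. the function names are made up, and it assumes elwidth is expressed in bytes (so ew=8, i.e. 8 bits, would be 1 byte); neither is specified above.

```python
def element_strided(base, d, vl):
    """Effective addresses when the immediate D is the per-element stride."""
    return [base + i * d for i in range(vl)]


def unit_strided(base, d, vl, elwidth_bytes):
    """Effective addresses when elements are contiguous, starting at base + D.

    elwidth_bytes is an assumption: the element width expressed in bytes.
    """
    return [base + d + i * elwidth_bytes for i in range(vl)]


if __name__ == "__main__":
    base, d, vl = 0x1000, 8, 4
    print([hex(ea) for ea in element_strided(base, d, vl)])
    print([hex(ea) for ea in unit_strided(base, d, vl, elwidth_bytes=1)])
```

with the same D, the element-strided form steps by D each iteration whereas the unit-strided form uses D once as an offset and then steps by the element width.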
also in conversation with Segher i suggested the prefixing of reg names
"vr0" --> Vector-augmented r0
"vf0" --> Vector-augmented f0
unfortunately he said that "vrNN" is already taken (an alias for vsrNN)
however now that i think about it, given that SV applies to *scalar* operations only, there's no actual conflict there.
commit 477bc257449d3679a5ffb9807609da1304606163 (HEAD -> master)
Author: Luke Kenneth Casson Leighton <firstname.lastname@example.org>
Date: Sun Mar 14 14:54:45 2021 +0000
remove "sv." and replace with "sv" in all SVP64Asm
i'm not keen on the lack of a separator. it tends to imply that
"svadd" is a completely different instruction from "add".
just looking at binutils gnu-as tc-ppc.c
this is where the opcode parsing starts. the syntax we came up with
(svadd/qualifiers/x/y/z operand1, operand2, ...) should be pretty easy
see if the opcode starts with "sv", if so, call a function
that parses any "/" separated SVP64 qualifiers. these get
inserted into the "upper" bits of insn (32-63) to be extracted
at the end, replace "/" by "\0" and move str on a bit
gather operands as normal
the key function here seems to be ppc_optional_operand_value
also that ppc_insert_operand seems to be where the "magic" happens
(actually putting the operand into the required location in the
there exists an "override" mechanism in the powerpc_operands
table which ppc_insert_operand can call for doing "special"
this is where register_names are identified by dropping a
structure into "ex" (type ExpressionS)
therefore it should be possible to hook into "register_names"
and either spot the suffix "v" (or ".v", or whatever) or otherwise
extend it to support EXTRA2/3.
extract the upper bits (32-63), which could even have EXT01
pre-inserted into them.
output two assembled instructions (64 bit) rather than one
this is all actually pretty straightforward.
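the first parsing steps can be modelled outside binutils with a short Python sketch. this is not tc-ppc.c code and does not use any binutils API: the function name is hypothetical and it only shows the splitting of "/"-separated qualifiers from the mnemonic before operands are gathered as normal.

```python
def split_sv_opcode(line):
    """Split 'svadd/q1/q2 op1, op2' into (mnemonic, qualifiers, operands).

    Hypothetical model of the parsing plan; the real work would happen in C
    inside gas, where the qualifiers would end up in bits 32-63 of insn.
    """
    head, _, rest = line.strip().partition(" ")
    operands = [op.strip() for op in rest.split(",") if op.strip()]
    if not head.startswith("sv"):
        # not an SVP64 mnemonic: leave it untouched, no qualifiers
        return head, [], operands
    # everything after the first "/" is a qualifier
    mnemonic, *qualifiers = head.split("/")
    return mnemonic, qualifiers, operands


if __name__ == "__main__":
    print(split_sv_opcode("svadd/els/ew=8 5, 6, 7"))
    print(split_sv_opcode("add 1, 2, 3"))
```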
an idea came up which would remove the need to modify gcc's rs6000.md opcode mnemonic generators significantly or even at all:
have gcc create the svp64 prefix as a new 32 bit "fake" instruction which binutils picks up and outputs with an EXT01 Primary Opcode.
this is "odd" to say the least but would require no alteration of *any* of the mnemonic generators that create v3.0B scalar instructions, except perhaps to provide the option to output the prefix.
needs more thought but would be considerably less intrusive than altering every single v3.0B scalar assembly generator to optionally output "sv." in front of the mnemonic
or, worse, duplicating every single mnemonic generator to put an SV-augmented variant in an already overloaded file.
(In reply to Luke Kenneth Casson Leighton from comment #13)
> an idea came up which would remove the need to modify gcc's rs6000.md opcode
> mnemonic generators significantly or even at all:
> have gcc create the svp64 prefix as a new 32 bit "fake" instruction which
> binutils picks up and outputs with an EXT01 Primary Opcode.
That could definitely work -- Arm does something kinda like that with their Thumb-mode IT instruction which sets up a predicate to conditionally execute the next 1-4 instructions.
(In reply to Jacob Lifshay from comment #14)
> > have gcc create the svp64 prefix as a new 32 bit "fake" instruction which
> > binutils picks up and outputs with an EXT01 Primary Opcode.
> That could definitely work -- Arm does something kinda like that with their
> Thumb-mode IT instruction which sets up a predicate to conditionally execute
> the next 1-4 instructions.
ahh iiinteresting! there is a keyword "parallel" in the macro language which associates groups of other macros, typically predicates.
cntlz simple pattern
there are attributes "predicable" and "multiple"
then... urk. i started looking through the arm11 md file, no joy finding anything obvious.
My suggestion is that the GCC support for SV look at print_operand()
case '0' to adjust the register names and rs6000_asm_output_opcode()
to adjust the mnemonic instead of changing the output template of all
patterns. One can mark patterns as allowing SV, similar to prefixed
I'm requesting that any changes be as localized and inconspicuous as
possible. Adding attributes and updating the patterns that emit
instructions keeps the changes less visible in the impact on the rest
of the port.
Alan Modra has helped advise on whether "/" would be acceptable, and given
that we're not using "/" in immediate-operand computations, it should be
"=" on the other hand is not. an available character that is unused by
binutils macro-expansion is "?".
(In reply to Luke Kenneth Casson Leighton from comment #17)
> Alan Modra has helped advise on whether "/" would be acceptable, and given
> that we're not using "/" in immediate-operand computations, it should be
> "=" on the other hand is not. an available character that is unused by
> binutils macro-expansion is "?".
I think we should use something other than "?" due to its very unintuitive nature:
(In reply to Jacob Lifshay from comment #18)
> I think we should use something other than "?" due to it's very unintuitive
there are very few choices. Alan Modra - the 15+ year experienced maintainer
of the powerpc-binutils port - would not have suggested it if there were any