577 – gcc compiler, binutils and assembly macros for OpenPOWER-SV

Status:	RESOLVED FIXED

Alias:	None

Product:	Libre-SOC's first SoC
Classification:	Unclassified
Component:	Source Code
Version:	unspecified
Hardware:	Other Linux

Importance:	--- enhancement
Assignee:	Luke Kenneth Casson Leighton

Depends on:	579 836 550 578 834 907
Blocks:	158

Reported:	2021-01-19 17:04 GMT by Luke Kenneth Casson Leighton
Modified:	2022-10-12 14:35 BST
CC List:	4 users

See Also:	558 615 871 211 917 836
NLnet milestone:	NLNet.2019.10.032.Formal
total budget (EUR) for completion of task and all subtasks:	12000
budget (EUR) for this task, excluding subtasks' budget:	925
parent task for budget allocation:	158
child tasks for budget allocation:	550 578 834 847 857
The table of payments (in EUR) for this task; TOML format:
ghostmansd = { amount = 525, submitted = 2022-09-25, paid = 2022-10-06 }
veera = { amount = 400, submitted = 2022-09-29, paid = 2022-10-04 }

Description Luke Kenneth Casson Leighton 2021-01-19 17:04:20 GMT

to write programs for SimpleV, even in assembler, ppc64 compiler and binutils support is needed at a basic, fundamental level.

whilst not attempting full optimisation, this milestone covers:

* convenience C/C++ wrapper macros around standard OpenPOWER v3.0B assembler to create a SimpleV Vectorisation context
* basic support in ppc64 binutils for SimpleV
* basic support in gcc for SimpleV by first adding abstracted intrinsics and extended register files and going from there.
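
As an illustration of the first bullet, here is a minimal sketch of what a convenience wrapper macro could look like. It only builds an assembly string: a raw 32-bit word emitted with the GNU assembler's `.long` directive, followed by an unmodified v3.0B instruction. The macro names (`SV_STR`, `SV_PREFIXED`) and the value `SV_EXAMPLE_PREFIX` are hypothetical and are not taken from the Libre-SOC sources; in particular the prefix value is a placeholder, not a real SimpleV encoding.

```c
/* Hypothetical helpers: stringify a token after macro expansion. */
#define SV_STR_(x) #x
#define SV_STR(x) SV_STR_(x)

/* Hypothetical wrapper: a raw 32-bit word (via .long) placed in front of
 * an ordinary OpenPOWER v3.0B instruction given as a string literal. */
#define SV_PREFIXED(prefix_word, insn) \
    ".long " SV_STR(prefix_word) "\n\t" insn "\n\t"

/* Placeholder only: NOT a real SimpleV prefix encoding. */
#define SV_EXAMPLE_PREFIX 0x04000000

/* The resulting text, as it would be handed to the assembler. */
static const char sv_example_asm[] =
    SV_PREFIXED(SV_EXAMPLE_PREFIX, "add 3, 4, 5");
```

On a ppc64 target a string built this way could presumably be passed to an inline `__asm__` statement; that step is left out here so that the sketch compiles on any host.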

the primary objective is to first support writing of assembly and move upwards to "correct" (non-optimised) programs, sufficient to do a much more advanced optimisation phase at a much later date.

Comment 1 Luke Kenneth Casson Leighton 2021-03-26 12:12:46 GMT

in issue #615 i am keeping notes from various conversations with ppc binutils and gcc maintainers, as well as OPF.

summary of OPF advice: an architectural fork inside gcc will not be well received due to the implication of ecosystem fragmentation.

one idea came up from David: to use the same trick intended for v3.1. there, they intend to mark entries in rs6000.md as "v3.1prefixableto64bit", and David said he would have no problem with us doing the same thing: setting an attribute "svp64vectoriseable".

for us this would indicate that, when it came to assembly output, a special 32-bit EXT01 instruction would be emitted in front of any instruction marked with the attribute.
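
A minimal sketch of that output step, under stated assumptions: "EXT01" is taken to mean primary opcode 1, held in the top six bits of a 32-bit Power ISA instruction word. The remaining 26 bits are treated as an opaque payload, because their layout is not described in this report. All function names are hypothetical.

```c
#include <stdint.h>

/* Assumption: EXT01 means primary opcode 1 (top six bits of the word). */
#define EXT01_PRIMARY_OPCODE 1u
#define PRIMARY_OPCODE_SHIFT 26
#define PAYLOAD_MASK 0x03ffffffu

/* Extract the primary opcode from a 32-bit instruction word. */
static uint32_t primary_opcode(uint32_t insn)
{
    return insn >> PRIMARY_OPCODE_SHIFT;
}

/* Build a prefix word; the 26-bit payload layout is left opaque here. */
static uint32_t make_ext01_prefix(uint32_t payload)
{
    return (EXT01_PRIMARY_OPCODE << PRIMARY_OPCODE_SHIFT)
         | (payload & PAYLOAD_MASK);
}

/* Emit the prefix word first, then the marked instruction unchanged. */
static void emit_prefixed(uint32_t out[2], uint32_t payload, uint32_t insn)
{
    out[0] = make_ext01_prefix(payload);
    out[1] = insn;
}
```

The point of the sketch is only the ordering: the existing instruction pattern is left untouched and one extra word goes in front of it.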

Segher then suggested *redefining* the underlying data structure that is used by the macro system for representing registers.

this combination effectively empowers all svp64-marked macro patterns to have a massive additional set of matching capabilities.

on registers alone this would be:

* RT=s RA=s RB=s
* RT=v RA=s RB=s
* ....
* RT=v RA=v RB=v

when element-width overrides are introduced these permutations multiply by 4 for source elwidth override *and another* four for dest elwidth override.
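
The arithmetic can be sketched as follows, assuming a three-operand instruction (RT, RA, RB) where each operand is independently scalar or vector, and taking the factors of four for source and destination element-width overrides from the paragraph above. The function names are hypothetical.

```c
#include <stdint.h>

/* Each register operand is independently scalar (s) or vector (v),
 * so n operands give 2^n combinations. */
static uint64_t sv_register_permutations(unsigned num_operands)
{
    return (uint64_t)1 << num_operands;
}

/* Multiply by the number of source and destination element-width
 * choices to get the patterns needed per macro if listed explicitly. */
static uint64_t sv_pattern_permutations(unsigned num_operands,
                                        unsigned src_elwidths,
                                        unsigned dest_elwidths)
{
    return sv_register_permutations(num_operands)
         * src_elwidths * dest_elwidths;
}
```

For RT, RA and RB this gives 8 register combinations, and 8 x 4 x 4 = 128 explicit patterns for a single macro, before any further capabilities are counted.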

when additional capabilities such as saturation are also added, the thought of creating a macro file (even an autogenerated one) with all these permutations listed explicitly *per macro* is, at best, insane and, frankly, stupid.

a little intelligent thought shows that the pattern-matching can be done implicitly (using existing rs6000.md patterns) when marked with an appropriate attribute.

this will allow us to do very basic (and i mean very basic) matching between vector patterns and svp64-attribute-marked rs6000.md macros.

that covers anything which is not part of a conditional if/else computation, for example straight unconditional for-loops.

where it gets more complicated is anything computed that is then used for a branch decision. this requires predication (as used in 32-bit ARM), which is not a "normal" part of ppc except in very special circumstances.

avoiding that situation for now and simply doing unconditional for-loop expansion would still be a huge leap forward.

Comment 2 Dmitry Selyutin 2022-08-13 22:07:02 BST

I observe a change with lfs.

     .desc = {
       .in1 = SVP64_IN1_SEL_RA_OR_ZERO,
-      .in2 = SVP64_IN2_SEL_CONST_SVD,
-      .in3 = SVP64_IN3_SEL_RC,
+      .in2 = SVP64_IN2_SEL_CONST_SI,
+      .in3 = SVP64_IN3_SEL_NONE,
       .out = SVP64_OUT_SEL_FRT,
-      .out2 = SVP64_OUT_SEL_NONE,
+      .out2 = SVP64_OUT_SEL_FRT,
       .cr_in = SVP64_CR_IN_SEL_NONE,
       .cr_out = SVP64_CR_OUT_SEL_NONE,
       .sv_ptype = SVP64_PTYPE_P2,
-      .sv_etype = SVP64_ETYPE_EXTRA3,
-      .sv_in1 = SVP64_EXTRA_IDX1,
+      .sv_etype = SVP64_ETYPE_EXTRA2,
+      .sv_in1 = SVP64_EXTRA_NONE,
       .sv_in2 = SVP64_EXTRA_NONE,
       .sv_in3 = SVP64_EXTRA_NONE,
       .sv_out = SVP64_EXTRA_IDX0,

This breaks the remapping algorithm, which was not ready at all for such a change. Apparently I am missing how to remap this. Ideas/suggestions?