Figure out which target features are required for which SIMD size #131800

RalfJung · 2024-10-16T19:19:09Z

workingjubilee · 2024-10-17T01:54:36Z

cc @programmerjake to confirm what features are relevant re: PowerPC

workingjubilee · 2024-10-17T02:08:12Z

arm: may be able to rip something out of Revise arm platform notes regarding soft float #130987
csky: cc @Dirreke re: target-features that affect vector ABIs?
loongarch: cc @heiher re: target-features that affect vector ABIs?
riscv: cc @kito-cheng or @topperc to confirm for LLVM features that can affect the vector ABI?
s390x: this got a nice writeup recently in s390x vector facilities support #130869 (comment)

programmerjake · 2024-10-17T02:11:07Z

+altivec enables 128-bit vectors, I'm not sure if there are any wider types -- there's MMA with 512-bit accumulators, but idk if they are vector types, they're used for matrix ops.

topperc · 2024-10-17T02:29:01Z

riscv: cc @kito-cheng or @topperc to confirm for LLVM features that can affect the vector ABI?

+zve32x, +zve32f, +zve64x, +zve64f, +zve64d, +v, +zvbb, +zvkb. There are several others that start with +zv*. There is an implies relationship, so they all ultimately imply +zve32x. These change the ABI for fixed length vector arguments/returns in IR in the backend.

When compiling with clang with the default C ABI, fixed length vectors are passed coerced to a scalar integer type or passed indirectly through memory. clang will not create fixed length vector return types or arguments in IR.

There is a fixed length vector ABI being implemented via an attribute llvm/llvm-project#100346. This changes how fixed length vectors are passed.
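
As a rough illustration of the "implies" relationship described above, here is a minimal Rust sketch that walks a small implication graph and checks whether a feature ultimately reaches `zve32x`. The function names and the exact edges are assumptions made for illustration; only the end result (each listed `zve*`/`v` feature reaches `zve32x`) is taken from the comment. `zvbb` and `zvkb` are left out on purpose, because the follow-up comments question whether LLVM encoded those edges at the time.

```rust
use std::collections::HashSet;

/// Direct "implies" edges between RISC-V vector features.
/// Assumption: these edges are illustrative, not copied from LLVM or rustc.
fn direct_implies(feature: &str) -> &'static [&'static str] {
    match feature {
        "v" => &["zve64d"],
        "zve64d" => &["zve64f"],
        "zve64f" => &["zve32f", "zve64x"],
        "zve64x" => &["zve32x"],
        "zve32f" => &["zve32x"],
        _ => &[],
    }
}

/// Returns true if `feature` is `zve32x` or transitively implies it.
fn implies_zve32x(feature: &str) -> bool {
    let mut seen: HashSet<&str> = HashSet::new();
    let mut stack = vec![feature];
    while let Some(f) = stack.pop() {
        if f == "zve32x" {
            return true;
        }
        // Only expand each feature once so cycles cannot loop forever.
        if seen.insert(f) {
            for &next in direct_implies(f) {
                stack.push(next);
            }
        }
    }
    false
}
```

With a closure like this, a check would only need to ask whether the enabled feature set reaches `zve32x`, rather than listing every `zv*` feature separately.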

programmerjake · 2024-10-17T02:40:26Z

vsx just enables an additional 32 registers that overlap with the scalar floating point registers but are otherwise the same as the 32 128-bit registers from vmx (aka. altivec), so no new simd bitwidths.

workingjubilee · 2024-10-17T02:57:47Z

@programmerjake and no alterations to the calling convention?

workingjubilee · 2024-10-17T03:13:30Z

@topperc hm, it seems LLVM doesn't have the crypto implications for these features specified here? is it somewhere else? https://github.com/llvm/llvm-project/blob/d54953ef472bfd8d4b503aae7682aa76c49f8cc0/llvm/lib/Target/RISCV/RISCVFeatures.td#L734-L746

it seems to rather be the opposite, a requirement relationship, but perhaps I'm misunderstanding: https://github.com/llvm/llvm-project/blob/d54953ef472bfd8d4b503aae7682aa76c49f8cc0/llvm/lib/TargetParser/RISCVISAInfo.cpp#L754-L758

topperc · 2024-10-17T03:19:11Z

@topperc hm, it seems LLVM doesn't have the crypto implications for these features specified here? is it somewhere else? https://github.com/llvm/llvm-project/blob/d54953ef472bfd8d4b503aae7682aa76c49f8cc0/llvm/lib/Target/RISCV/RISCVFeatures.td#L734-L746

it seems to rather be the opposite, a requirement relationship, but perhaps I'm misunderstanding: https://github.com/llvm/llvm-project/blob/d54953ef472bfd8d4b503aae7682aa76c49f8cc0/llvm/lib/TargetParser/RISCVISAInfo.cpp#L754-L758

My mistake. I'm not sure why we don't have the implies. gcc does.

programmerjake · 2024-10-17T04:22:19Z

@programmerjake and no alterations to the calling convention?

enabling vsx doesn't alter the calling convention.

RalfJung · 2024-10-17T06:05:50Z

+altivec enables 128-bit vectors, I'm not sure if there are any wider types -- there's MMA with 512-bit accumulators, but idk if they are vector types, they're used for matrix ops.

Are there types which are passed via these MMA registers?

RalfJung · 2024-10-17T06:07:28Z

My mistake. I'm not sure why we don't have the implies. gcc does.

For us it's nice that there's no "implies" here, that makes it a lot easier to check the ABI consequences. ;) This way we just have to block the actual feature changing something, not other features implying them.

Though maybe this is also unnecessary if #131807 takes care of all that.

EDIT: Ah, that's just for float ABI, not for vectors, is it?

RalfJung · 2024-10-17T06:08:44Z

-Cllvm-args="--riscv-v-vector-bits-min=N"

Uh wait a second, we are exposing another flag that can change ABI? 😢 😭

EDIT: That's probably a discussion for Zulip.

heiher · 2024-10-17T07:12:17Z

LoongArch: According to the LoongArch ABI Specs, vector type parameters and return values are passed in GARs (general-purpose argument registers) or on the stack, and do not rely on vector registers or vector features.

programmerjake · 2024-10-17T07:15:59Z

+altivec enables 128-bit vectors, I'm not sure if there are any wider types -- there's MMA with 512-bit accumulators, but idk if they are vector types, they're used for matrix ops.

Are there types which are passed via these MMA registers?

after a bit more research, there are types for MMA, but you can't pass them by value in function arguments or return, so they're not ABI-breaking: https://clang.godbolt.org/z/e4sTY37Pv
they lower to <256 x i1> and <512 x i1>

workingjubilee · 2024-10-17T07:25:54Z

...concerning. these blockades are in clang's semantic checks, they don't seem to be enforced by LLVM.

workingjubilee · 2024-10-17T07:33:21Z

cc @jacobbramley re: aarch64

workingjubilee · 2024-10-17T07:35:27Z

cc @androm3da re: hexagon

RalfJung · 2024-10-17T08:13:15Z

after a bit more research, there are types for MMA, but you can't pass them by value in function arguments or return

How do we handle that in Rust? We'd need a special pass during collection rejecting them as arguments, likely as part of the simd arg check that this issue is about.

I assume we don't support these types yet, but this will need to be considered when someone decides to add them.

RalfJung · 2024-10-17T08:24:22Z

-Cllvm-args="--riscv-v-vector-bits-min=N"

Uh wait a second, we are exposing another flag that can change ABI? 😢 😭

I only just realized this only affects scalable vector types. Which anyway we don't support. So we can ignore this for now.

Dirreke · 2024-10-17T12:48:14Z

According to the CSKY Development Guide: 4.5 vdsp, the vector width is configured as follows:

For vdspv2, the width is fixed at 128 bits.
For vdspv1, the default width is 128 bits, but it can optionally be set to 64 bits using the -mvdsp-width=64 compiler flag. However, please note that the 64-bit width option is currently unsupported by LLVM.

Therefore, for both versions, you can safely set the vector width to 128 bits.

csky:
"vdspv1" | "vdspv2" => vlen(128),

…ngjubilee ABI checks: add support for tier2 arches See rust-lang#131800 for the data collection behind this change. r? RalfJung

RalfJung · 2024-11-13T10:09:10Z

How big are the SPARC vectors that get unlocked by the vis feature?

taiki-e · 2024-11-13T15:02:13Z

How big are the SPARC vectors that get unlocked by the vis feature?

AFAIK it's at least 64-bit.

GCC's SPARC VIS builtins provide vector_size (8) (64-bit) and vector_size (4) (32-bit) vectors.
According to calling conventions implemented by GCC:
- On SPARC32: 64-bit or smaller vector integer is passed using int reg (argument) / FP reg (return value)
- On SPARC64: 128-bit(argument)/256-bit(return value) or smaller vector integer/float are passed using FP reg

RalfJung · 2024-11-13T15:46:57Z

@veluca93's PR has a comment indicating 128 now, that seems wrong then?
64bit vector registers seem kind of odd on a 64bit system though...

On SPARC64: 128-bit(arg)/256-bit(ret) or smaller integer/float vectors are passed using FP reg

That doesn't mean the register is 256 bits large though, is it? It probably uses more than one register? OTOH vectors larger than a register are also odd. So I don't really understand what to do with this.

taiki-e · 2024-11-13T16:26:59Z

SPARC FP registers (f[0-63]) are 32-bit long, and two/four of these are combined to process f64/f128. 64-bit VIS vectors also use two FP registers, as does f64.

128-bit/256-bit vectors are also passed or returned using four/eight FP registers.
https://github.com/gcc-mirror/gcc/blob/730f28b081bea4a749f9b82902446731ec8faa93/gcc/config/sparc/sparc.cc#L7388

(By the way, this complication of the nature of FP registers is one of the reasons I postponed FP register support in the initial implementation of SPARC inline assembly.)
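
The register counts quoted above follow from simple division. A minimal Rust sketch, assuming the 32-bit FP register width stated in the comment (the constant and function names are hypothetical):

```rust
/// Width of one SPARC FP register in bits, as stated in the comment above.
const FP_REG_BITS: u32 = 32;

/// Number of 32-bit FP registers needed to hold a value of `value_bits` bits,
/// rounding up for sizes that are not a multiple of the register width.
fn fp_regs_needed(value_bits: u32) -> u32 {
    (value_bits + FP_REG_BITS - 1) / FP_REG_BITS
}
```

For example, `fp_regs_needed(64)` gives the two registers used for f64 and 64-bit VIS vectors, and 128-bit and 256-bit vectors need four and eight.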

RalfJung · 2024-11-13T16:31:54Z

The relevant question here is: for a vector of size N bits, which target feature must be present so that the vector is passed in a register (whereas if the feature is absent, vectors of that size are passed in a different way). Or I guess in this case, in a group of registers, or whatever these are called.
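
One way to picture the question is as a lookup from (architecture, vector size) to the target features that cover that size. The Rust sketch below uses only the PowerPC and CSKY numbers reported earlier in this thread; the table layout and all names are hypothetical and are not meant to mirror rustc's actual data structures.

```rust
/// (arch, feature, largest vector width in bits that the feature covers).
/// Values are the ones reported in this thread; the layout is an assumption.
const FEATURES_FOR_VECTOR_ABI: &[(&str, &str, u64)] = &[
    ("powerpc", "altivec", 128),
    ("csky", "vdspv1", 128),
    ("csky", "vdspv2", 128),
];

/// Features on `arch` that are reported to cover vectors of `bits` bits.
fn features_allowing(arch: &str, bits: u64) -> Vec<&'static str> {
    FEATURES_FOR_VECTOR_ABI
        .iter()
        .filter(|(a, _, max)| *a == arch && bits <= *max)
        .map(|(_, f, _)| *f)
        .collect()
}

/// True if at least one feature covering this vector size is enabled.
fn vector_abi_ok(arch: &str, bits: u64, enabled: &[&str]) -> bool {
    features_allowing(arch, bits)
        .iter()
        .any(|f| enabled.contains(f))
}
```

A lint built on such a table would flag a vector argument whenever `vector_abi_ok` returns false for the function's enabled features. Architectures where the feature makes no difference to how vectors are passed would need a different treatment than "no entry means reject".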

taiki-e · 2024-11-13T17:03:32Z

IIUC, in that case, we need to refer to the calling conventions.

According to calling conventions implemented by GCC:

On SPARC32: 64-bit or smaller integer vectors are passed using FP reg

On SPARC64: 128-bit(arg)/256-bit(ret) or smaller integer/float vectors are passed using FP reg

Assuming our ABI code is correct:

"sparc": "vis" => vlen(64)
"sparc64": "vis" => vlen(256)

However, IIUC our ABI code does not support vector types correctly, so for now we may need to treat it as vlen(0) ~~for SPARC32 and vlen(128) for SPARC64~~ (EDIT: see #131800 (comment)) in the lint. (Otherwise, it could incorrectly use the SPARC64 256-bit vectors in arguments or the SPARC32 float vectors.)

RalfJung · 2024-11-13T17:51:40Z

You said it's 256 only for return values... not sure how that makes any sense, but something can't be right here then. It should probably be "128" for sparc64? Like, maybe in the future they define 256 bit vectors and then pass them differently for arguments. Or so?

taiki-e · 2024-11-13T19:01:37Z

You said it's 256 only for return values... not sure how that makes any sense, but something can't be right here then.

The 256-bit vector in argument position is passed through memory, not FP registers.

From calling conventions implemented by GCC:

                           size      argument     return value
 vector integer           <=16        FP reg.        FP reg.
 vector integer       16<s<=32        memory         FP reg.
 vector integer            >32        memory         memory

IIUC, our ABI code should handle this correctly by calling make_indirect for vectors with size>16(argument) or size>32(return value). (If it is handled correctly, the 256-bit vector in argument position should not generate a warning anyway, since the lint trigger condition is no longer satisfied.)
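
The table above can be written as a small classification function. This is a sketch under two assumptions: that the sizes in the table are in bytes (16 bytes matching the 128-bit figure and 32 bytes the 256-bit figure quoted for SPARC64), and that the type and function names, which are hypothetical, stand in for whatever the real ABI code uses.

```rust
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum PassMode {
    FpReg,
    Memory,
}

#[derive(Debug, Clone, Copy)]
enum Position {
    Argument,
    ReturnValue,
}

/// How a vector integer of `size_bytes` bytes is passed, per the table above.
/// Arguments stay in FP registers up to 16 bytes, return values up to 32 bytes;
/// anything larger goes through memory.
fn vector_int_pass_mode(size_bytes: u64, pos: Position) -> PassMode {
    let limit = match pos {
        Position::Argument => 16,
        Position::ReturnValue => 32,
    };
    if size_bytes <= limit {
        PassMode::FpReg
    } else {
        PassMode::Memory
    }
}
```

The asymmetric case is a 32-byte vector: `vector_int_pass_mode(32, Position::Argument)` is `Memory`, while the same size as a return value is `FpReg`.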

maybe in the future they define 256 bit vectors

GCC's VIS builtins support 32-bit and 64-bit vectors, but larger sizes of vector types are available via vector_size(...). (In Rust, the former is a core::arch vector type (not available for SPARC yet) and the latter is a core::simd or custom #[repr(simd)] vector type.)

Here is code generated by GCC (only the 256-bit vector is moved from memory to FP reg): https://godbolt.org/z/5vEdejj6j

taiki-e · 2024-11-13T19:05:34Z

In any case, LLVM doesn't currently support Vector ABI (llvm/llvm-project#45418), so it seems that using vlen(0) in the lint is correct for now.

RalfJung · 2024-11-13T19:54:31Z

I guess the number we need is "for vectors of which size is this calling convention guaranteed to be the final word, and future extensions won't change how those vectors are passed". Even assuming a correct ABI implementation on our side, what is that number for SPARC32 + vis / SPARC64 + vis? Or is this somehow not a well-formed question?

taiki-e · 2024-11-14T03:01:55Z

SPARC's Vector ABI is defined based on the existing float and aggregate calling convention, not the VIS ISA ¹, and changing it without a new ABI would also break other non-vector arguments due to the nature of using FP registers. So, I don't believe it can be changed without a new ABI. (This is very different from the x86_64, which extended the ISA in the form of increasing the size of the vector registers.)

That said, if our policy is to not trust the current behavior of the psABI with respect to vectors larger than the ISA supports ², I think the 64-bit mentioned in #131800 (comment) is a safe selection.

¹ sparc_pass_by_reference and sparc_return_in_memory in GCC have comments about this.
² I think this is the correct policy for most architectures.

Rollup merge of rust-lang#132842 - veluca93:abi-checks-tier2, r=workingjubilee ABI checks: add support for tier2 arches See rust-lang#131800 for the data collection behind this change. r? RalfJung

RalfJung · 2024-11-14T06:25:47Z

SPARC's Vector ABI is defined based on the existing float and aggregate calling convention, not the VIS ISA

So having or not having vis actually makes no difference for how things are passed?
That would mean we could actually allow arbitrary-size vectors -- once we implement the existing ABI correctly, of course.

taiki-e · 2024-11-14T12:46:41Z

So having or not having vis actually makes no difference for how things are passed?
That would mean we could actually allow arbitrary-size vectors -- once we implement the existing ABI correctly, of course.

Good point. Looking at the assemblies generated by GCC it does not appear to be affected by -mvis (affected by -msoft-float, though): https://godbolt.org/z/3sTMqoaq4

(As for SPARC's soft-float target feature, it should be marked as Forbidden anyway because it affects the float ABI: #131799 (comment))

RalfJung · 2024-11-14T12:50:23Z

Yeah, most targets have a soft-float feature and it should be forbidden on all of them. But not all do, e.g. aarch64 does not.

workingjubilee · 2024-11-16T04:00:58Z

OTOH vectors larger than a register are also odd. So I don't really understand what to do with this.

Fun facts to know and tell: RVV also supports these! it's something called a "register group", and is modified by LMUL ("length multiplier"), which is a bitfield that specifies a value in {8, 4, 2, 1, 1/2, 1/4, 1/8}. When LMUL is greater than 1, instructions that manipulate vectors in vector registers must address a register with an appropriate factor, but they otherwise operate like a vector of the currently-set vector length multiplied by LMUL.
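
To make the register-group arithmetic concrete, here is a small Rust sketch. The `Lmul` type and its methods are hypothetical, the set of permitted values is the one listed above, and the alignment rule encodes my reading of "appropriate factor" as "the base register number is a multiple of LMUL".

```rust
/// LMUL stored as a fraction num/den.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Lmul {
    num: u32,
    den: u32,
}

impl Lmul {
    /// Accepts only the values listed above: 8, 4, 2, 1, 1/2, 1/4, 1/8.
    fn new(num: u32, den: u32) -> Option<Lmul> {
        let ok = matches!(
            (num, den),
            (1, 1) | (2, 1) | (4, 1) | (8, 1) | (1, 2) | (1, 4) | (1, 8)
        );
        if ok { Some(Lmul { num, den }) } else { None }
    }

    /// Effective width in bits of one register group for a register width of `vlen_bits`.
    fn group_bits(self, vlen_bits: u32) -> u32 {
        vlen_bits * self.num / self.den
    }

    /// Number of vector registers that make up one group (1 for fractional LMUL).
    fn regs_in_group(self) -> u32 {
        if self.den == 1 { self.num } else { 1 }
    }

    /// Assumed rule: a group may only start at a register number divisible by its size.
    fn is_valid_base_register(self, reg: u32) -> bool {
        reg % self.regs_in_group() == 0
    }
}
```

With an assumed register width of 128 bits, LMUL = 4 behaves like a 512-bit vector spread over four registers, while LMUL = 1/2 uses half of a single register.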

…jubilee ABI checks: add support for some tier3 arches, warn on others. Followup to - rust-lang#132842 - rust-lang#132173 - rust-lang#131800 r? `@workingjubilee`

Rollup merge of rust-lang#133029 - veluca93:abi-checks-tier3, r=workingjubilee ABI checks: add support for some tier3 arches, warn on others. Followup to - rust-lang#132842 - rust-lang#132173 - rust-lang#131800 r? ``@workingjubilee``

workingjubilee · 2024-11-17T22:00:23Z

Thank you everyone for participating, I believe we have concluded this investigation for now. There are some only-partly-resolved questions so I opened new issues for them:

RalfJung mentioned this issue Oct 16, 2024: The extern "C" ABI of SIMD vector types depends on target features (tracking issue for abi_unsupported_vector_types future-incompatibility lint) #116558 (Open)

workingjubilee mentioned this issue Oct 17, 2024: [RISCV] Allow crypto features to imply dependents llvm/llvm-project#112659 (Merged)

workingjubilee mentioned this issue Nov 16, 2024: ABI checks: add support for some tier3 arches, warn on others. #133029 (Merged)

This was referenced Nov 17, 2024:
- How should we handle SPARC's vector ABI? #133141 (Open)
- How should we handle matrix ABIs? #133144 (Open)
- How should we handle dynamic vector ABIs? #133146 (Open)

workingjubilee closed this as completed Nov 17, 2024
