nvptx abi adjustment may keep PassMode::Direct for small aggregates in conv extern "C"
#117480
Comments
@rustbot label +O-NVPTX |
extern "C"
Thanks for pointing this out, @RalfJung. As a note I had to use Do you know why this test started failing with your changes in #117351 and not before that? The reason I'm asking is that I'm curious about how to make the tests catch it if we have a regression and start producing |
We used to only run the assertion in codegen, i.e. when a function with such argument types actually gets generated or called. The abi compatibility test works on function pointers, it computes their ABI but never calls them so we never enter the part of the codegen backend that had the assertion. I'm going to try to find a way to move this check to the ABI computation logic, but that turns out to be hard since we do compute some ABIs that we don't actually support... |
#117500 does ensure we'd catch this. |
…ct, r=davidtwco
NVPTX: Avoid PassMode::Direct for args in C abi
Fixes rust-lang#117480
I must admit that I'm confused about `PassMode` altogether; is there a good summary thread for this anywhere? I'm especially confused about how "indirect" and "byval" go together. To me it seems like "indirect" basically means "use an indirection through a pointer", while "byval" basically means "do not use indirection through a pointer".
The return used to keep `PassMode::Direct` for small aggregates. It turns out that `make_indirect` messes up the tests, and one way to fix it is to keep `PassMode::Direct` for all aggregates. I have mostly seen this PassMode mentioned for args. Is it also a problem for returns?
When experimenting with `byval` as an alternative I ran into [this assert](https://github.com/rust-lang/rust/blob/61a3eea8043cc1c7a09c2adda884e27ffa8a1172/compiler/rustc_codegen_llvm/src/abi.rs#L463C22-L463C22)
I have added tests for the same kinds of types that are already tested for the "ptx-kernel" abi. The tests cannot be enabled until something like rust-lang#117458 is completed and merged.
CC: `@RalfJung` since you seem to be the expert on this and have already helped me out tremendously
CC: `@RDambrosio016` in case this influences your work on `rustc_codegen_nvvm`
`@rustbot` label +O-NVPTX
Rollup merge of rust-lang#117671 - kjetilkjeka:nvptx_c_abi_avoid_direct, r=davidtwco
This should not have been closed without enabling the relevant tests for nvptx: |
Actually the issue is still open -- the test fails:
|
#117671 only changed |
This comment explains the problem but is essentially this.
I unfortunately don't think it's possible to have a correct C abi for ptx without accepting Direct or implementing the new ABI handling. |
Hm, I only understand half of that comment.^^ I cannot parse the PTX signature. What does it all mean? What does the PTX signature look like when the struct has multiple fields with padding between them? I also don't know much about on_stack. The fact that that is part of This is all completely terrible. LLVM's types are already ill-suited to express function call ABIs, so LLVM backends have to use heuristics to reconstruct the actual ABI (in particular for targets like PTX that aren't actually assembly but still IRs), and then in Rust we use the That said... I am a bit surprised that
what exactly does that look like? For more complicated types this does sound quite tricky to do correctly, though. |
PTX is an intermediate format where passing something is basically just listing a named argument of the type it is passed as. The C ABI in PTX will list a byte array of elements with a specific alignment requirement when passing structs. When passing Primitive integer types are mainly passed as the actual type they are, except when they are smaller than 32 bits. All integers smaller than 32 bits will be padded to a 32-bit integer. This is somewhat unfortunate, but also just something we have to deal with. When passing a
When I try to cast to a single 8-bit register it passes it as a
I have had great problems with figuring out how I think my next course of action in this task will be to look into the LLVM output from clang when compiling these functions. Perhaps that can guide what we need to output to LLVM from Rust. I guess a problem arises if that is exactly what we do generate when using |
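To make the two rules from the comment above concrete (structs listed as an aligned byte array, integers narrower than 32 bits widened to 32 bits), here is a minimal standalone sketch. The `Value` enum and `ptx_param` function are hypothetical names invented for illustration, not part of rustc or LLVM, and the `.param` strings are only an approximation of how a PTX declaration is typically written.

```rust
/// A deliberately simplified description of a value crossing a call boundary.
/// (Hypothetical type for illustration; rustc uses its own layout types.)
#[derive(Debug, Clone, Copy)]
pub enum Value {
    /// A primitive integer with the given width in bits.
    Int { bits: u32 },
    /// A struct-like aggregate with total size and alignment in bytes.
    Aggregate { size: u32, align: u32 },
}

/// Render an approximate PTX `.param` declaration for `value`.
pub fn ptx_param(name: &str, value: Value) -> String {
    match value {
        // Integers narrower than 32 bits are widened to 32 bits;
        // wider integers keep their own width.
        Value::Int { bits } => {
            let widened = bits.max(32);
            format!(".param .b{widened} {name}")
        }
        // Aggregates are listed as a byte array carrying an alignment.
        Value::Aggregate { size, align } => {
            format!(".param .align {align} .b8 {name}[{size}]")
        }
    }
}
```

Under these assumptions, a `u8` argument and a `u32` argument render identically, while a struct of two `i32` fields becomes an 8-byte array with alignment 4.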
So if this was a struct with an i32 field it would be
How are you setting up the cast? |
Yes, this is correct.
Ahh, that is not intuitive at all to me, but it does indeed work. I made the return tests pass with this code:

    if ret.layout.is_aggregate() {
        let align_bytes = ret.layout.align.abi.bytes();
        let size = ret.layout.size;
        // Pick an integer register whose width matches the aggregate's alignment.
        let reg = match align_bytes {
            1 => Reg::i8(),
            2 => Reg::i16(),
            4 => Reg::i32(),
            8 => Reg::i64(),
            16 => Reg::i128(),
            _ => unreachable!("Align is given as power of 2 no larger than 16 bytes"),
        };
        if align_bytes == size.bytes() {
            // The aggregate fits exactly in one register: put it in the prefix
            // and leave the uniform "rest" part empty.
            ret.cast_to(CastTarget {
                prefix: [Some(reg), None, None, None, None, None, None, None],
                rest: Uniform::new(Reg::i8(), Size::from_bytes(0)),
                attrs: ArgAttributes {
                    regular: ArgAttribute::default(),
                    arg_ext: ArgExtension::None,
                    pointee_size: Size::ZERO,
                    pointee_align: None,
                },
            });
        } else {
            // Otherwise cover the whole size with registers of that width.
            ret.cast_to(Uniform::new(reg, size));
        }
    } else if ret.layout.size.bits() < 32 {
        // Scalars narrower than 32 bits are extended to 32 bits.
        ret.extend_integer_width_to(32);
    }

I will most likely get a chance to create a PR on Friday. Thanks for that trick. |
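The decision made in the snippet above can be restated without any rustc types. The following is only a sketch: `CastPlan` and `plan_aggregate_return` are hypothetical names, and unlike the original (which treats other alignments as unreachable) this version returns `None` for them so it can be exercised in isolation.

```rust
/// How an aggregate return value would be cast (hypothetical model type).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum CastPlan {
    /// One integer register that exactly covers the aggregate.
    SingleReg { bits: u64 },
    /// As many registers of `reg_bits` as are needed to cover `size_bytes`.
    Uniform { reg_bits: u64, size_bytes: u64 },
}

/// Mirror the branch structure of the snippet: the register width follows
/// the alignment, and the single-register case applies only when the size
/// equals the alignment.
pub fn plan_aggregate_return(size_bytes: u64, align_bytes: u64) -> Option<CastPlan> {
    let reg_bits = match align_bytes {
        1 | 2 | 4 | 8 | 16 => align_bytes * 8,
        // Assumption: anything else is rejected here instead of panicking.
        _ => return None,
    };
    if size_bytes == align_bytes {
        Some(CastPlan::SingleReg { bits: reg_bits })
    } else {
        Some(CastPlan::Uniform { reg_bits, size_bytes })
    }
}
```

For example, a 4-byte struct with alignment 4 maps to a single 32-bit register, while a 12-byte struct with alignment 4 maps to a uniform run of 32-bit registers covering 12 bytes.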
Yeah to be clear I have no idea how CastTarget is intended to be used, I just read the code for how it gets translated into an LLVM type, and then found a way to make that generate the LLVM type you want, for the one case you mentioned. I have no idea how to turn that into a general principle that works for all types. You seem to make a special case for |
I have specifically added a test for this type in #125980. It seems to produce correct results. |
…ssmode, r=davidtwco
Nvptx remove direct passmode
This PR does what should have been done in rust-lang#117671. That is, fully avoid using `PassMode::Direct` for `extern "C" fn` for `nvptx64-nvidia-cuda` and enable the compatibility test.
`@RalfJung` [pointed me in the right direction](rust-lang#117480 (comment)) for solving this issue.
There are still some ABI bugs after this PR is merged. These ABI tests are created based on what is actually correct, and since they continue passing with even more of them enabled, things are improving. I don't have the time to tackle all the remaining issues right now, but I think getting these improvements merged is very valuable in itself, and I plan to tackle more of them long term.
This also doesn't remove the use of `PassMode::Direct` for `extern "ptx-kernel" fn`. This was also not trivial to make work. And since the ABI is hidden behind an unstable feature, it's less urgent.
I don't know if it's correct to request `@RalfJung` as a reviewer (due to team structures), but he helped me a lot to figure out this stuff. If that's not appropriate then `@davidtwco` would be a good candidate since he knows about this topic from rust-lang#117671
r? `@RalfJung`
Rollup merge of rust-lang#125980 - kjetilkjeka:nvptx_remove_direct_passmode, r=davidtwco
rust/compiler/rustc_target/src/abi/call/nvptx64.rs
Lines 10 to 14 in b75b3b3
That's invalid since it doesn't say what to do for smaller aggregates, and they default to the (bad) `Direct`. Instead you have to say which register they are supposed to be passed in. You can check what the other targets are doing. Targets are expected to set an explicit pass mode for all aggregate arguments.
This should be easy to reproduce by having a function like
which will likely ICE on the current compiler already.
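The snippet from the original post is not preserved here. As a purely hypothetical illustration of the kind of function being described (an `extern "C"` function taking a small aggregate by value), something like the following could be used; the names `Small` and `pass_small` are invented, and whether this exact shape triggered the ICE on `nvptx64-nvidia-cuda` is an assumption.

```rust
/// A small `repr(C)` aggregate: 2 bytes, well below pointer size.
/// (Hypothetical type, not taken from the original issue.)
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Small {
    pub a: u8,
    pub b: u8,
}

/// An `extern "C"` function that takes and returns the small aggregate by
/// value, so the target's ABI adjustment has to choose a pass mode for it.
pub extern "C" fn pass_small(s: Small) -> Small {
    s
}
```

On a host target this compiles and runs normally; the point of interest is only how the NVPTX ABI adjustment classifies the `Small` argument and return value.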
Originally posted by @RalfJung in #117351 (comment)