-
Notifications
You must be signed in to change notification settings - Fork 12.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Tracking Issue for RFC #2972: Constrained Naked Functions #90957
Comments
"produce a clear warning if any of the above suggestions are not heeded" in RFC seems to contain typo |
This is implemented, checking the box. Adding a new step: "Confirm that all errors and warnings are emitted properly". |
Request for StabilizationA proposal that the SummaryAdds a new attribute, The body of a naked function must consist of a single An example of a naked function: const THREE: usize = 3;
#[naked]
/// Adds three to a number and returns the result.
pub extern "sysv64" fn add_n(number: usize) -> usize {
// SAFETY: the validity of these registers is guaranteed according to the "sysv64" ABI
unsafe {
std::arch::asm!(
"add rdi, {}",
"mov rax, rdi",
"ret",
const THREE,
options(noreturn)
);
}
} DocumentationThe Rust Reference: rust-lang/reference#1153 Tests
HistoryThis feature was originally proposed in RFC 1201, filed on 2015-07-10 and accepted on 2016-03-21. Support for this feature was added in #32410, landing on 2016-03-23. Development languished for several years as it was realized that the semantics given in RFC 1201 were insufficiently specific. To address this, a minimal subset of naked functions was specified by RFC 2972, filed on 2020-08-07 and accepted on 2021-11-16. Prior to the acceptance of RFC 2972, all of the stricter behavior specified by RFC 2972 was implemented as a series of warn-by-default lints that would trigger on existing uses of the Unresolved Questions
|
This stabilizes the feature described in RFC 2972, which supersedes the earlier RFC 1201. Closes rust-lang#32408 Closes rust-lang#90957
There is no observable difference between LLVM adding an epilogue and a following unnamed function starting with instructions identical to a regular epilogue. |
@bjorn3 I believe it can cause problems in circumstances such as #32408 (comment) . If someone were to work around that issue by assuming the existence of extra instructions and manually accounting for it, then their code would be broken if Rust ever stopped generating those instructions. It is this sort of edge case that this defensive wording is designed to address. |
I believe it's possible to observe a difference when |
👍 for stabilizing, though I'd like to make sure that there's consensus on #32408 (comment) . |
Shall we stabilize constrained naked functions? @rfcbot merge |
Team member @joshtriplett has proposed to merge this. The next step is review by the rest of the tagged team members: No concerns currently listed. Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up! See this document for info about what commands tagged team members can give me. |
I'd like to have a whitelist of attributes that are allowed to be applies to a naked function. In the future we may want to lower naked functions directly to LLVM module-level assembly instead of using LLVM's native naked function support. For that to happen rustc needs to be able to manually translate any attributes (e.g. |
Why, because of the epilogue issue? Personally I would much rather see that fixed on LLVM’s end, rather than going to great lengths to work around it in rustc. Or is there another reason? |
Lowering to |
@comex In addition to the epilogue issue, it seems the inlining issue could also be resolved by lowering to @Amanieu I'd be interested in seeing such a list. Obviously part of the appeal of naked functions is that they can be treated just like a normal function in most cases, which includes the ability to be marked with attributes (e.g. |
Most attributes that work with arbitrary functions should work with naked functions IMO. Taking the list here, those make sense to use with naked fns:
The only ones that I think make sense to disallow are those in the "Code Generation" category:
|
codegen `#[naked]` functions using global asm tracking issue: rust-lang#90957 Fixes rust-lang#124375 This implements the approach suggested in the tracking issue: use the existing global assembly infrastructure to emit the body of `#[naked]` functions. The main advantage is that we now have full control over what gets generated, and are no longer dependent on LLVM not sneakily messing with our output (inlining, adding extra instructions, etc). I discussed this approach with `@Amanieu` and while I think the general direction is correct, there is probably a bunch of stuff that needs to change or move around here. I'll leave some inline comments on things that I'm not sure about. Combined with rust-lang#127853, if both accepted, I think that resolves all steps from the tracking issue. r? `@Amanieu`
codegen `#[naked]` functions using global asm tracking issue: rust-lang/rust#90957 Fixes #124375 This implements the approach suggested in the tracking issue: use the existing global assembly infrastructure to emit the body of `#[naked]` functions. The main advantage is that we now have full control over what gets generated, and are no longer dependent on LLVM not sneakily messing with our output (inlining, adding extra instructions, etc). I discussed this approach with `@Amanieu` and while I think the general direction is correct, there is probably a bunch of stuff that needs to change or move around here. I'll leave some inline comments on things that I'm not sure about. Combined with rust-lang/rust#127853, if both accepted, I think that resolves all steps from the tracking issue. r? `@Amanieu`
Request for StabilizationTwo years later, we're ready to try this again. Even though this issue is already marked as having passed FCP, given the amount of time that has passed and the changes in implementation strategy, we should follow the process again. SummaryThe An example of a naked function: const THREE: usize = 3;
#[naked]
pub extern "sysv64" fn add_n(number: usize) -> usize {
// SAFETY: the validity of the used registers
// is guaranteed according to the "sysv64" ABI
unsafe {
core::arch::naked_asm!(
"add rdi, {}",
"mov rax, rdi",
"ret",
const THREE,
)
}
} When the The body of a naked function must consist of a single DocumentationThe Rust Reference: rust-lang/reference#1153 Tests
HistoryThis feature was originally proposed in RFC 1201, filed on 2015-07-10 and accepted on 2016-03-21. Support for this feature was added in #32410, landing on 2016-03-23. Development languished for several years as it was realized that the semantics given in RFC 1201 were insufficiently specific. To address this, a minimal subset of naked functions was specified by RFC 2972, filed on 2020-08-07 and accepted on 2021-11-16. Prior to the acceptance of RFC 2972, all of the stricter behavior specified by RFC 2972 was implemented as a series of warn-by-default lints that would trigger on existing uses of the More recently, the The implementation strategy was changed to emitting a global assembly block. In effect, an extern function extern "C" fn foo() {
core::arch::naked_asm!("ret")
} is emitted as something similar to core::arch::global_asm!(
"foo:",
"ret"
);
extern "C" {
fn foo();
} The codegen approach was chosen over the llvm naked function attribute because:
Finally, there is now an allow list of compatible attributes on naked functions, so that e.g. relevant PRs for these recent changes
unresolved questionsNone |
didn't the "naked asm is implemented using global asm" merge like yesterday? I'd love to have this feature, but we should probably give it a little time to bake after a change like that. |
It was my understanding that |
It will have that time, it would still take months to actually reach stable even when the quickest route is taken (and I suspect it won't be).
the reference says:
But naked functions often do actually return, they just do it from assembly rather than from rust code. That is a fundamental difference, and we decided that |
That's not clear to me though. In #90957 (comment) the generated
Which is it? If the |
The example in that comment seems fundamentally incomplete, because it didn't have a I agree that the example seems wrong in some form or another. |
that example was incorrect, I've fixed it now. Just to be clear: no implicit instructions are ever added to the body: it's just the asm code as written by the user, and the user must make sure that that assembly upholds the contract of |
Would it make sense to have the compiler insert an additional |
Short answer: no, that is exactly what the compiler must never do (insert any instructions that weren't written in the source program). |
@Lokathor I'm not sure I agree. The defined behavior is that the author MUST handle return correctly. Adding a Are you worried about being able to predict code size? |
Yes, being able to predict the exact code size is one reason you'd not want To quote the stabilization report itself:
I think it would be reasonable for debug builds, or some opt-in flag, to add the instruction, if you want to "turn on sanitizers". But it should not be the default in --release builds. EDIT: and "illegal instructions" don't even do anything necessarily on all platforms rust supports. sometimes they are literally nothing but a waste of space for no benefit at all. please no. |
I'm in favor of the simpler policy of never generating additional instructions, but as a datapoint, gcc does actually emit a |
I also agree with this: naked_asm is specifically useful when you want precise control over what instructions are emitted. We don't want the compiler to insert instructions behind your back. |
Some nop padding will be emitted most of the time anyway without the user having control to keep everything aligned. Might as well replace one nop with an ud2 for some extra safety. |
That very much depends on your platform, doesn't it? On old ARM functions are 4 bytes per a32 instruction and sections are usually aligned to only 4 or maybe 8, so there's maybe one Again, I'm not against the possibility of trap instructions being inserted with a flag or configuration, but it should not be required, and I don't even think it should be default. |
I don't think it should be required, but I think we should do it by default when we can. At least with debug assertions enabled. |
Summarizing a few brought up points:
In my opinion rust strives for safety and security even when using low-level mechanism and unsafe code. Adding Would it be a good compromise to put adding the |
@bjorn3 I've not seen I also strongly disagree that I could imagine that in similar cases the machine code has to be composed in some non-trivial way, meaning that disparate, perhaps "malformed" (as per the noreturn requirement, for example) Incredibly niche use case, I know, but this feature is already squarely in the land of niche. It really would be a huge pain to reverse-engineer the extra At the very least, put it behind some flag. Maybe an |
An alternative approach: could a lint be designed to warn on possibly invalid naked asm? I'm not sure how to make this robust, but in principle it would be a |
That's going down an analysis rabbit hole that isn't really feasible (for this case, at least). Not all textual ASM blocks are going to end in That level of understanding of bytecode would be an impressive project in its own right (e.g. unicorn engine levels of complexity), but developing all of that just for linting or automatic I do agree a linting rule here would be preferred, but I don't see how it could feasibly be implemented. |
On x86_64 there is padding after every function to align the next function to 16 bytes.
Adding ud2 after every naked function is indistinguishable from a function which happens to compile to ud2 being placed right after the naked function, so you still get as much WYSIWYG. WYSIWYG only applies to the function itself, not the padding after it.
You have to indicate the size of the naked function if you want to copy it's bytes. This size woulf exclude the ud2 instruction inserted by the compiler and thus not be coppied anyway. |
Linting? what is valid? ret? iret? some future TDX op code extension to enter a TEE? Please! What are you trying to guard against? The whole asm code is unsafe. |
yes, fair point. I was going to make some argument about most
My idea was that it is at least a signal to the programmer to check that their assembly satisfies the constraints: if they deem the lint fired incorrectly, they can just But I agree that it is hard/impossible to actually make it robust (enough) to work in practice. |
That's not entirely true: ELF symbols have a length and typically any padding bytes are not included in this length since they are inserted by the linker. If we append ud2 then this will observably be different when inspecting the symbol table. |
Given that rustc generates inline assembly which defines the symbol, in the inline assembly rustc can specify the length of the symbol to exclude the ud2 instruction. |
Please don't insert UD2's and things like that (or similar things for non-x86 targets) in naked functions. One of the problems with the earlier iterations of naked functions, mentioned above, was that they did this, which made it impossible to have precise control over layout and size. Here's a motivating example: for example, consider two naked functions placed into a specific linker segment; both need to occupy exactly 4KiB (so, page sized on x86) and further they are both supposed to be aligned on 4K boundaries. One may not care what order they're put into the resulting binary, A before B or B before A, but the size and alignment requirements are absolute. They're not copied; instead, one relies on the linker to place them correctly. Adding extra UD2's at the end of these the NF's throws this off. I appreciate the arguments for safety here, and in Rust generally, but naked functions are already incredibly niche: if one is reaching for them one's doing something so far out of the ordinary for most programmers that trying to add a ud2 to catch bugs here feels superfluous. A response to this may be to use module-level assembly instead of naked functions for this sort of thing. While that's valid, there are some properties of naked functions that make them attractive versus global asm: symbol naming and visibility and so on. |
So in the cases when it's already aligned, there's no extra nop to replace, so no ud2 is emitted. So we're back in the territory of "ud2 sometimes, but not other times, and only in debug mode, and changing the preceding body can affect if a ud2 is there, or not". So in such a case, it is functionally impossible to rely on it and therefore going to masquerade a ton of bugs and cause strange behavior in some cases and not others. Some developers will rely on it being there, erroneously of course. But then be surprised when it's not there. Instead of strictly specifying the safety constraints that must hold, which are there already. Also, nops are there to indicate padding. ud2/similar are not padding instructions. It has side effects, and we cannot know how different environments/tools/chips are going to react to them being there, even if they're not hit (e.g. in the case of pipelining). There is a flimsy at best reason for emitting them, and a multitude of strong reasons not to emit them. Adding in such instructions is subverting expectation for some illusion of safety in a place that is very much opt-in and explicitly, objectively unsafe. |
This is a tracking issue for the RFC "Constrained Naked Functions" (rust-lang/rfcs#2972).
The feature gate for the issue is
#![feature(naked_functions)]
.About tracking issues
Tracking issues are used to record the overall progress of implementation.
They are also used as hubs connecting to other relevant issues, e.g., bugs or open design questions.
A tracking issue is however not meant for large scale discussion, questions, or bug reports about a feature.
Instead, open a dedicated issue for the specific matter and add the relevant feature gate label.
Steps
#[naked]
Unresolved Questions
None.
Implementation history
The text was updated successfully, but these errors were encountered: