WIP: "C panic" ABI specifier #2753
By default, Rust assumes that an external function imported with extern "C"
cannot unwind, and Rust will abort if a panic would propagate out of a Rust
function exported with extern "C" function. If you specify the #[unwind]
Suggested change:
- function exported with extern "C" function. If you specify the #[unwind]
+ function with a non-"Rust" ABI ("`extern`") specification. If you specify the #[unwind(allowed)]
This verbiage could still be improved
# Guide-level explanation
[guide-level-explanation]: #guide-level-explanation

TODO
Suggested change:
- TODO
+ The `#[unwind(allowed)]` attribute permits functions with non-Rust ABIs (e.g. `extern "C" fn`) to unwind rather than terminating the process.
(The current verbiage is bad. Please suggest improvements.)
I don't think we've yet reached consensus on whether to include the `(allowed)`. Is it possible for annotations to exist in multiple forms, one that has an argument and one that doesn't? If not, I think we should use `(allowed)` for forwards compatibility.
Is it possible for annotations to exist in multiple forms, one that has an argument and one that doesn't?
Yes.
By default, Rust assumes that an external function imported with extern "C"
cannot unwind, and Rust will abort if a panic would propagate out of a Rust
function exported with extern "C" function. If you specify the #[unwind]
attribute on an extern "C" function, Rust will instead allow an unwind (such as
Suggested change:
- attribute on an extern "C" function, Rust will instead allow an unwind (such as
+ attribute on a function with a non-"Rust" ABI, Rust will instead allow an unwind (such as
To be clarified: the annotation can be applied to either function definitions (`extern "C" fn...`) or function declarations-without-definitions (`extern "C" { fn ...`; I'm not sure if there's a better term for these declarations).
Without this annotation, what is the behavior supposed to be? I'm pretty sure that for maximum soundness, definitions should always abort and declarations should be assumed not to abort. Should declarations without `#[unwind(allowed)]` be given automatic "wrapper" functions that would catch unwinding and trigger an abort, or should they simply propagate the exception as normal?
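For illustration, the kind of "wrapper" asked about above can be written by hand on stable Rust. This is only a sketch of the idea, not what the compiler would generate: `abort_on_unwind` and `checked_double` are hypothetical names, and the guard relies on `std::panic::catch_unwind` and `std::process::abort`.

```rust
use std::panic::{self, UnwindSafe};
use std::process;

/// Hypothetical helper (not part of the RFC): runs `f` and aborts the
/// process if a panic tries to escape it.
fn abort_on_unwind<R>(f: impl FnOnce() -> R + UnwindSafe) -> R {
    match panic::catch_unwind(f) {
        Ok(value) => value,
        // A panic reached the boundary: terminate instead of unwinding
        // into a caller that does not expect it.
        Err(_) => process::abort(),
    }
}

/// A Rust function exported with the C ABI, guarded by hand.
extern "C" fn checked_double(x: i32) -> i32 {
    abort_on_unwind(move || x.checked_mul(2).expect("overflow"))
}

fn main() {
    println!("{}", checked_double(21));
}
```

Whether such a guard should be inserted automatically, and on which side of the call, is exactly the open question in this comment.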
Thank you all for pushing this forward! Getting this capability is very important to the safety of our Lucet WebAssembly runtime. One case where an attribute might not suffice is for function pointers. For example:

fn foo(bar_ptr: usize) {
    unsafe {
        // Reinterpret the raw address as a C-ABI function pointer, then call it.
        let bar = std::mem::transmute::<usize, extern "C" fn()>(bar_ptr);
        bar();
    }
}

In the case of Lucet, where we are grabbing function pointers from a dynamically-loaded shared object, we need to be able to make sure these are not
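The function-pointer concern can be sketched with an ABI string instead of an attribute. The example assumes the `"C-unwind"` spelling that later stable Rust releases accept (this thread was still weighing `"C panic"` against `"C unwind"`), and `guest_entry` and `call_guest` are hypothetical stand-ins for a symbol loaded from a shared object and the code that calls it.

```rust
use std::mem;
use std::panic;

// Stand-in for a function loaded from a shared object (hypothetical).
extern "C-unwind" fn guest_entry() {
    panic!("guest trapped");
}

/// Calls the function at `addr` and reports whether it unwound.
///
/// # Safety
/// `addr` must be the address of an `extern "C-unwind" fn()`.
unsafe fn call_guest(addr: usize) -> bool {
    // The ABI string is part of the pointer type, so the "may unwind"
    // property travels with the pointer; no attribute is involved.
    let entry = unsafe { mem::transmute::<usize, extern "C-unwind" fn()>(addr) };
    panic::catch_unwind(move || entry()).is_err()
}

fn main() {
    let addr = guest_entry as extern "C-unwind" fn() as usize;
    let unwound = unsafe { call_guest(addr) };
    println!("guest unwound: {unwound}");
}
```

Because the unwinding permission lives in the type, a `transmute` target can state it directly, which an attribute on a declaration cannot do.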
In case we go the attribute route, my RFC #2602 would provide the syntactic support allowing us to write |
Eesh. That makes me strongly favor deprecating |
Co-Authored-By: Ralf Jung <[email protected]>
Do you feel that Personally, I want to limit the additional spec and compiler complexity and still satisfy the common use cases. As such, currently, I'm not positively inclined towards using editions to change things here (well you might not need an edition anyways; it might be backwards compatible to just add something to all editions (e.g. |
Let me clarify first that I'm not trying to suggest any breaking changes in this RFC! I just think the
My reaction is mostly "wow, how is that annotation becoming part of the type system?" Whereas with an
The only change I'm suggesting for a new edition are to introduce |
If you find I'd like to note that we do have type-system-affecting attributes already, e.g. |
I knew this would be the response to my comment about not liking attributes affecting the type system. 😞 But I'm not sure I understand how
It does.

#![forbid(safe_packed_borrows)]
// ^-- until the soundness hole goes away and this becomes on by default.
#[repr(packed)]
struct Foo {
    bar: u8,
    baz: u16,
}
fn stuff() {
    let packed = Foo { bar: 0, baz: 0 };
    &packed.baz;
    //~^ ERROR borrow of packed field is unsafe and requires unsafe function or block
}
That...also surprises me. How can packed structs be used safely? Edit: that was very much a LMGTFY question; sorry. I read some of the RFC that made it |
Note that while The position in the Longer term, I think a syntax that specifies call/unwind ABI inline and not in an attribute is desirable. Short term, however, we really want to minimally have |
For function pointers, another simpler alternative is to just not add |
As I understand it, the point of this RFC is to affect only function generation. So all function pointers shall be considered unwinding, and no nounwind function pointer shall be introduced. That can be done at a later time, preferably by refining the ABI definitions. This is a conservative approach that can be implemented quickly. Discussion about adding the nounwind flag to the ABI shall be postponed to #2699 or another later RFC.
# Rationale and alternatives
[rationale-and-alternatives]: #rationale-and-alternatives

TODO
Suggested change:
- TODO
+ The goal is to enable the current use cases for unwinding panics through foreign code while providing safer behaviour towards bindings that do not expect it, with minimal changes to the language.
+ We should try not to prevent RFC 2699 or another later RFC from resolving the above disadvantages on top of this one.
# Unresolved questions
[unresolved-questions]: #unresolved-questions

TODO
Suggested change:
- TODO
+ - Since this simple approach does not allow bindings to specify whether they expect callbacks they accept to unwind, we may want to restrict `#[unwind]` to `unsafe` functions. This restriction can be lifted when unwind is made part of the type.
Isn't this already addressed by your point above that "function pointers shall all be considered unwind and no nounwind function pointer shall be introduced"? I think that statement applies to all `FnOnce` types, doesn't it?
....actually, this should probably be mentioned in the "alternatives" section, now that the primary proposal recommends treating all function pointers as `#[unwind(allowed)]`.
@gnzlbg What do you think of this suggestion?
@BatmanAoD If we switch to `extern "C panic" fn` then this issue goes away, because that will also let us handle function pointers easily.
Drawbacks Co-Authored-By: Jan Hudec <[email protected]>
If this turns out to be the case, perhaps we can explore an even weaker variation of this RFC that would drop the emphasis on Rust panics and simply specify the new ABI string in terms of not emitting I would very much like to avoid a situation where we are stuck between either taking on all the downsides to the wrapper approach that I described upthread, running a nightly compiler in production, or even building a custom |
Actually, it's precisely what this RFC should and must do. If we now say that
Then in #2699 we can define how to extend it to handle and/or convert C++ (and structural) exceptions too. I think it should be an extension (i.e. not yet another ABI string), so maybe the more general name would be OK here. |
I don't think so. That paragraph is specifically for changes resulting from specifying unspecified behavior. For example, we say that unwinding through FFI is UB, and that paragraph technically allows us to change it to say that this is now defined to abort (I say technically, because in practice, that paragraph has not allowed this). However, changing the implementation of the Rust unwinding ABI in an ABI breaking way does not specify any behavior. It just changes from one unspecified behavior to another. Also, I don't think it was the intent of that paragraph to allow adding new features to the language with the intent of them breaking later (I don't recall that use case showing up in the RFC discussion).
I have no reason to believe that breaking code that uses "C panic" will be possible in practice because experience shows it isn't. We changed the behavior of unwinding through FFI from "undefined" to "abort" two years ago, when fewer crates were relying on it, and such a breaking change, which is explicitly allowed per that RFC, was not possible. I therefore have no reason to believe that such a change will be possible for "C panic" in practice, independently of whatever any RFC says. Such a prognosis does not match the experience that we have already acquired on this issue.
An ABI is a contract between the caller and the callee. The "C panic" proposal leaves the Rust ABI unspecified such that no contract exists, allowing one part to change the contract any time. This can be made to work at one particular point in time, but it does not result in a stable language feature, and code using it can break any time.
Only if those are guaranteed to have a certain ABI. The ABI of the rust unwinding mechanism is unspecified and it is illegal to use it through FFI, so if we discover a better way to do that, we can just ship it to users silently just like we ship new layout optimizations for Users that don't want this could globally opt-in to
I think this is one of the best of all the options explored, and that is essentially what the "C+unwind" / |
In the undefined -> abort case, we had a situation where the folks initiating the change did not know about the potential breakage, much less who specifically would be affected. The change did not offer a workaround for folks whose code would break. There was no subsequent RFC already being drafted to refine the changed behavior By contrast, we now have a very good idea that people depend on this behavior (there's a whole RFC, after all), and who would be affected by breaking Even though these cases both involve the same technical issue, the circumstances around them are completely different. Assuming they would end up in the same place is far too pessimistic to the point of feeling hostile towards would-be users of this RFC. Regarding your interpretation of RFC: Language Semver, it's clear that at this point we disagree. I'll just second @BatmanAoD's request for representatives of the lang team to resolve the disagreement. |
The Rust ABI is unspecified. We change it relatively often, and those changes do not need an RFC. I don't think we ever even had an FCP for them. That's what "we make no guarantees about the Rust ABI" means.
The second time the change was made (a year ago) a workaround was introduced with it (
What's extremely user hostile is telling users that a feature is "stable" while actually reserving the right to break code using it any time. Just because you can tolerate such breakage does not mean that all other users will. Preserving user workflows isn't hostility, it's user friendliness. Nobody likes to get called Friday night because their stable Rust toolchain stopped compiling some correct Rust code.
How does this impact |
For the "Rust" ABI, if the dylibs are compiled with different Rust toolchains, we provide no guarantees. Iff the dylib links and the ABI changed, then that's UB of the form "calling a function with the wrong ABI", but since name mangling can change, the dylib might not even link. So for dylibs, either one recompiles everything on every toolchain upgrade, or one uses an ABI that's guaranteed not to change in the dylib public API like, e.g., |
Is it actually guaranteed that the But even if |
I believe @gnzlbg's framing is that |
Indeed. Upgrading rustc should not generally require you to upgrade your OS. rustc might eventually drop support for obsolete OS versions, but that's a separate and explicit step, usually taken long after support for newer OS versions is added. Changing the ABI of an existing target means dropping support for the old OS version at the same time as adding support for the new one, which is way too soon. Also, most non-embedded OSes don't make straight-out ABI-breaking changes, since they value being able to run old binaries. Some BSD variants are an exception, e.g.:
How to deal with platforms that do ABI breaking changes is a complex topic, e.g., RFC: target-extension, libc#7570: How to deal with breaking changes on platform ?, Rust Internals: Pre-RFC: target extension (dealing with breaking changes at OS level). All platforms (Linux, Windows, *BSDs) do ABI breaking changes relatively often, but not all ABI breaking changes are equally severe, and each platform uses a different bar. Some don't need any handling, some can be handled without users knowing, some require new target triples, and some require new target triples, two versions of every system library, being very careful what you link, etc. |
We discussed this RFC in the lang-team meeting last Thursday. We generally agreed on a few things:

### General direction

We like the proposal of creating a distinct ABI for "foreign functions that may unwind". It seems to be a good solution that enables forward progress on inserting "abort on unwind" boundaries while still giving people who do want unwinding a migration path. It does not however provide a complete solution: obviously this RFC leaves a number of details unspecified. For example, what sort of unwinding mechanism is permitted across these function boundaries? What happens if unwinding actually occurs?

### Semver and how it relates to this RFC

Much of the discussion on this thread centers around just what is being guaranteed as stable code here, along with a dash of what constitutes a SemVer breaking change. We felt that it would be ok to declare that certain details are not yet specified, and that those details may change across compiler revisions without it being considered a breaking change -- but that we should be very clear about what those details are.

Moreover, there was concern about "de facto stabilization" -- basically, being "locked in" and unable to change details because of the amount of code relying on them. The main way we plan to avoid this is by continuing to make progress on the remaining issues. The primary driver of de facto stabilization would be letting the problem continue to sit for years, thus giving time for implicit dependencies to develop. If we are able to continue making progress and actually enable a sufficient number of use cases, then we have less to fear (though we do not commit to resolving all the uncertainty; our goal is only to support specific key use cases).

### Proposed actions

To that end, we discussed doing the following:
Speaking personally, I hope we can use this "working group" as a lang team experiment in handling more complex design initiatives like this. That is, I'd like to create a repository where we can collect design history, and where we can try to focus the conversation in more narrow ways. (Speaking as someone who spent several hours re-reading the comments on just this RFC -- most of which were out of date! -- I can tell you it is very hard to get up-to-speed on this topic based on the existing RFC threads. But I think you already knew that.) One of the key things I think we should start with is clarifying which details we aim to resolve and in what order.

OK, also speaking personally, I'm hoping to move away from the term 'working group', as I think it engenders confusion between this sort of working group and (say) the domain working groups such as the embedded working group. I think perhaps the term "project" is more appropriate (as the embedded team has proposed). We'll see.

### C panic vs C unwind

This is obviously not just a naming issue. Specifically, "C panic" is meant to signal "a function that unwinds with the Rust unwinding mechanism" and "C unwind" to signal "a function that unwinds with the 'native' unwinding mechanism, as given by the current target" (today: DWARF on Linux, SEH on MSVC). Of those present at the meeting, I believe the consensus was that we would prefer C unwind semantics. The reasoning is that it is clear that we will want some way to declare "native unwinding mechanism" eventually, and so we might as well start with that, especially because the native mechanism and the Rust mechanism are presently the same, so we're not introducing any kind of overhead in doing so. (We could always add a "C panic" ABI later, in other words, if we felt the need.) One other point is that we didn't know of any particular use case for "C panic", whereas "C unwind" has obvious use cases.
(Note that the behavior for "C unwind" would be defined by the target, and hence subject to change should target definition or compilation options change, but of course any such changes to existing targets would have to be executed very carefully. This would be analogous to adjusting our defaults to require more modern chipsets and other such changes.)

However, we also felt that this was a largely moot distinction. Obviously the two are the same now. Whatever we say today, if we decide in the future to alter the Rust unwinding and native unwinding mechanisms, then we could decide then what the meaning of "C panic" will be. (That said, as @gnzlbg and others have argued, obviously the "most compatible" thing would be for it to continue with the meaning it has today, which is effectively the native unwinding mechanism, since by definition that is what all existing code will be expecting.)

(It's also worth pointing out that @joshtriplett wasn't present at the meeting, though they've approved this message.)

### On terminology

One final point. We observed that there are a lot of subtle distinctions at play here, for example between "undefined behavior" and "not yet specified" or "target dependent" behavior (and perhaps further between things like "Rust UB", "LLVM UB", and "Spec-only UB"). In terms of this RFC, it makes sense to try and carefully define all terms used. But thinking a bit more broadly it would also be good perhaps to try and document a consistent terminology that we can use across many discussions (and also to explicitly define relationships and differences between this terminology and that used in other language specifications).
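As a rough illustration of the "distinct ABI for functions that may unwind" direction, the sketch below puts the two ABIs side by side. It assumes the `"C-unwind"` spelling that later stable Rust releases accept, which is not something this comment settles; `must_not_unwind`, `may_unwind`, and `panic_message` are hypothetical names.

```rust
use std::panic::{self, UnwindSafe};

/// Plain C ABI: a panic must not propagate out of this function.
extern "C" fn must_not_unwind(x: i32) -> i32 {
    x.wrapping_mul(2)
}

/// Unwinding-capable ABI: a panic may cross this boundary.
extern "C-unwind" fn may_unwind(x: i32) -> i32 {
    if x < 0 {
        panic!("negative input");
    }
    x.wrapping_mul(2)
}

/// Runs `f` and returns the panic message if it unwound.
fn panic_message<R>(f: impl FnOnce() -> R + UnwindSafe) -> Option<String> {
    let payload = panic::catch_unwind(f).err()?;
    payload
        .downcast_ref::<&str>()
        .map(|s| s.to_string())
        .or_else(|| payload.downcast_ref::<String>().cloned())
}

fn main() {
    println!("{}", must_not_unwind(21));
    println!("{}", may_unwind(21));
    println!("{:?}", panic_message(|| may_unwind(-1)));
}
```

Here both sides are Rust, so the native and Rust unwinding mechanisms trivially agree; the unspecified details discussed above only matter once real foreign frames sit between the panic and the catch.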
I am not convinced of that. Sure, even with a "minimal RFC" code relying on "C+unwind" to match the native unwind ABI still relies on implementation-specific behavior and can thus break any time -- but that's still way better than UB! Let's take one step at a time?
Code relying on the unwind mechanism will be as (in)correct as code relying on (I think that's also the gist of @nikomatsakis' message.)
Talking just in terms of naming, as in "panic" vs "unwind", note that there is precedent in
The next iteration of this RFC kind of assumes that this isn't the case, and nobody has raised this issue yet, so could you explain how that could break?
Since we haven't been able to break existing code that relies on this for landing soundness fixes, what makes you believe that we will be able to break code using this feature for landing performance improvements?
I do not understand the question. I think I basically just paraphrased what @nikomatsakis wrote, quoting: "We felt that it would be ok to declare that certain details are not yet specified, and that those details may change across compiler revisions without it being considered a breaking change"
"Haven't been able" is putting it wrong IMO. When considering the trade-off between breaking that code and not breaking it, right now the "not breaking" seems like the better option. That might change as new evidence surfaces. And certainly whatever the trade-off is for having abort-on-panic shims has little to do with whatever the trade-off is when someone finds a new amazing unwind ABI that we'd like to use for panics instead of the native one. |
By "details that are not yet specified" I understood that they mean that such an initial RFC might say things like "If a native unwind unwinds through a function frame containing types with destructors, the behavior is undefined" and whatever the implementation does when that happens might change between toolchain versions, but that's ok because we are talking about undefined behavior. That's different from claiming that "code relying on "C+unwind" to match the native unwind ABI still relies on implementation-specific behavior and can thus break any time", since that sounds to me that it would be impossible to write code that uses a minimal "C+unwind" correctly to solve any kind of problem reliably. E.g. code using a minimal "C+unwind" feature that does not support unwinding across stack frames with types with destructors can just be carefully written to avoid such types.
Agreed that the trade-offs are very different. My opinion of the current trade off situation is that breaking code to land a critical language fix is a better reason for breaking code (independently of whether it is worth breaking that code) than doing so for landing a performance improvement. I don't really see us ever breaking any working code for landing any performance improvements. |
I think Niko was pretty clear that this is not about making things UB.
Your notion of "impossible" seems to be such that relying on implementation-specific (unstable) behavior is impossible. That's very black-and-white. There is a huge gray area between "stable and well-defined" and "UB", a position that you seem to ignore (based on the fact that I have to repeat myself right now). There totally is a world where we say that Maybe that's not reliable enough for you, but it's certainly more reliable than UB; so people that are willing to risk UB, I expect, will be fine with this level of reliability. I feel like you are making the perfect the enemy of the better-than-right-now.
And that removes undefined behavior from the language.
That isn't my notion of "impossible". |
But in such a world, I cannot call In particular, one of the use cases discussed in this thread was a JIT compiler. If the unwinding strategy can change, then the code generator needs to change accordingly depending on the compiler's version. I cannot see how such a situation is compatible with Rust's stability guarantee. That's why I think that if we ever want to have this feature stabilised we need to guarantee that, for a given target, the unwinding strategy of |
IMO if that is the standard, we should not block abort-on-panic shims on a solution. That standard requires a huge level of commitment. That makes the time window so big that it is important to avoid UB in the meantime. The compromise was "we'll get in something simple quickly and only then re-enable the shims". If there is enough opposition to a simple solution to make that impossible, we should just re-enable the shims now.
We only have one actual Rust compiler. Therefore, I think "implementation-defined" vs. "stable" would be a distinction without much practical difference in this case. I do think that rustc / t-compiler / t-lang should be free to change the unwinding strategy including for existing important tier-1 targets if there are benefits we want (perf, improvements in C/C++, ...). |
The mere possibility of changing the unwinding strategy, in some unspecified, but possibly totally incompatible way, has been mentioned many times and is paralyzing all discussions about unwinding. It's impossible to argue with the hypothetical future (and on hypothetical platforms).
I'm not aware of any plans to change unwinding. And I'm struggling to imagine why and how Rust could make a change to unwinding so massive and disruptive, that it couldn't support unwinding over C any more. Unwinding over C stack frames is really simple. |
It's not so hypothetical as it may sound. C++ is currently considering a new unwinding strategy similar to (but more optimized at the ABI level than) returning The reasoning is that current unwinding strategies are so unpredictable and costly when they are used that many codebases simply forbid unwinding, and the For more information: https://wg21.link/p0709r2 |
@ralfj the "C with native unwinding" ABI is specified by the platform and allows interfacing with thousands of libraries on each platform. That standard level of stability that you appear to be against is the same standard that we follow for extern "C".
If you are really proposing lowering the standard for "C unwind" then please argue appropriately. Changing the ABI that "C unwind" implements would require a level of effort comparable to changing extern "C" to implement the Rust ABI (reimplementing all C libraries in Rust and updating the whole ecosystem). I don't see how 6 weeks' time would be enough for that, nor who would do that work, and personally I would never want to have a dependency that uses such a feature.
As much as I like the C++ "static exceptions" proposal, I don't think it's actually relevant here. It's not proposing a new runtime unwinding mechanism or implementation strategy. Rather, it's proposing a new semantics for If anything, the fact that "current unwinding strategies are so unpredictable and costly when they are used that many codebases simply forbid unwinding", combined with the fact that C++ is seriously considering such invasive changes to dramatically reduce the usage of unwinding in practice, to me indicates that the trend in unwinding is not to implement it better but rather to minimize its usage. So at least for now, I strongly agree with referring to a potentially desirable future unwinding implementation as "largely hypothetical", and that we need to find a less paralyzing way of discussing this. Personally, I don't see the problem with just saying that this scenario is so unlikely that, if it ever does come to pass, it's fine to use "heavy" migration mechanisms like a new edition or an explicit |
I am aware that for "normal" calls, we have much better support than what we currently seem to be willing to provide for unwinding. That seems okay though, unwinding is much more tricky / less engraved in stone than the "normal" call ABI (see the other comments re: C++ changing what they do). Also I never said I am against proper stability. I just observe that we / the lang team do not seem ready to make the strong kind of commitment which providing that kind of stability requires. Of course the best solution would be to have something fully stable, but that just seems enough out of reach that I strongly feel it is unreasonable to block landing the abort-on-panic shim on that.
Again, the discussion here is about a stop-gap that just needs to be "good enough" for users that currently rely on UB. My proposal (not really mine, it's the original intent of this RFC as I interpret it) raises the standard compared to the status quo. Surpassing UB in terms of reliability should really not be hard. But then each time we try that, some people make the perfect the enemy of the good, stalling progress. :(
So maybe here's a different angle for this: maybe we can say that the At that point So that would avoid the ABI suddenly changing under the feet of people using Now I am not an FFI author myself (I am in this because I want to fix existing crates relying on UB), so... does this make sense? (And maybe I am just paraphrasing what @Ixrec was saying, not sure.) |
The proposed C++ change is an addition which with an opt-in repurposes some C++ syntax locally, but doesn't change the implementation of the existing unwinding mechanism. If C++ did it, it wouldn't affect Rust. If Rust decided to replace existing mechanism with the new C++ approach (which is basically Is there anything stopping Rust from converting any future unwinding mechanism to an old one (or even setjmp/longjmp) on boundaries of "C unwind" FFI? |
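One way to picture "converting at the boundary" with what stable Rust offers today is to stop the panic on one side of a plain `extern "C"` call and re-raise it on the other. This is only a sketch of the general idea, not the mechanism any proposal in this thread specifies; `PENDING`, `callback`, `foreign_driver`, and `drive` are hypothetical names, and `foreign_driver` stands in for real foreign code.

```rust
use std::any::Any;
use std::cell::RefCell;
use std::panic;

thread_local! {
    // Parked panic payload, waiting to be re-raised on the Rust side.
    static PENDING: RefCell<Option<Box<dyn Any + Send>>> = RefCell::new(None);
}

/// Callback with a plain C ABI: never unwinds, reports failure as -1.
extern "C" fn callback(x: i32) -> i32 {
    let result = panic::catch_unwind(move || {
        if x < 0 {
            panic!("negative input");
        }
        x + 1
    });
    match result {
        Ok(value) => value,
        Err(payload) => {
            // Convert the unwind into a status code and park the payload.
            PENDING.with(|p| *p.borrow_mut() = Some(payload));
            -1
        }
    }
}

/// Stand-in for foreign code that calls back into Rust.
extern "C" fn foreign_driver(cb: extern "C" fn(i32) -> i32, x: i32) -> i32 {
    cb(x)
}

/// Rust-side entry point: re-raises any panic parked by the callback.
fn drive(x: i32) -> i32 {
    let status = foreign_driver(callback, x);
    if let Some(payload) = PENDING.with(|p| p.borrow_mut().take()) {
        panic::resume_unwind(payload);
    }
    status
}

fn main() {
    println!("{}", drive(1));
    println!("{}", panic::catch_unwind(|| drive(-1)).is_err());
}
```

No unwinding ever crosses the `extern "C"` frames here, so the pattern works regardless of which unwinding mechanism Rust uses internally, at the cost of boilerplate on both sides of every call.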
That's not a different angle. That's exactly how "C unwind" is currently being proposed. I don't know which other proposal you are referring to in your other comments (apparently not this one), but I'm glad to see that you have arrived at this conclusion.
I don't think the default panic strategy (e.g. from
Adding an
Under this proposal Rust code can only be unwound using the Rust unwind ABI (at the "C unwind" ABI boundary the native mechanism must be "translated" to the Rust one), so
Yes. The only main realization you are missing is that, were the Rust implementation of the |
Dear world, I'm writing with a quick update. As I mentioned in my update comment, the plan of record here is as follows:
In fact, we've already created a repository for tracking the progress of this group. Right now that's a personal repository of mine, but it will move to rust-lang once the group is "official". The details there are still a "WIP" but hopefully it gives you some sense of the roadmap and the steps we plan to take. We're also chatting in For the time being, I'm going to go ahead and close this RFC because this discussion thread has gotten long and is quite distracting. Our intention is to open a fresh RFC that declares the above steps. To be clear, I expect that RFC to move fairly quickly through the process -- I think we've had plenty of discussion on the topic and the tradeoffs involved in different strategies. |
Update 28 October 2019: The effort to provide this language feature has been spun into a working group. See the announcement RFC.
Some preliminary conversation and drafting took place in rust-lang/rust#63909.
There are some remaining "TODO" sections.
Rendered