-
Notifications
You must be signed in to change notification settings - Fork 59
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Minimum guarantees regarding UnsafeCell
#495
Comments
Note: I think you're using "live" when you actually mean something closer to "active," which obfuscates the point you're trying to make. "Live" has a specific meaning used by the NLL RFC; specifically, that the reference is potentially read from read through in the future; that the borrow region/lifetime is live. Thus in both the The unifying rule is that a byte may be borrowed as As for reads, I'm reasonably sure we've documented somewhere that I do think all cases of ref casting used in std amount to |
Yep, sounds like I mean "active"! I've updated the original text to use that term.
Do you know if a rule like this is documented anywhere normative (ie, not in not-yet-accepted proposals like stacked borrows/tree borrows)? IIUC, you're using the term "permitted" in the same sense that Miri uses it when tracking "permissions" on an allocation (e.g., treating Also, is this an "if and only if" or merely an "only if"? In other words, is it the case that "I am permitted to mutate this byte" is sufficient to guarantee that "I am allowed to borrow this byte as Finally, one concern with this rule: Doesn't this imply that going from
My concern with this case is that it seems possible to make this unsound. For example, imagine that I |
Here I'm use the term mostly informally to include both mutable references and shared references to what the
I carefully worded my rule to only apply to The informal implication is only that
That's defined as UB, because it causes a data race, and data races are defined as UB. The exact same as if there were never a reference created, one thread did |
Ah I see. Unfortunately what we need is some set of behavior which is guaranteed sound (so that we can use it as the precondition of a safety proof). To make this more concrete: If someone could demonstrate a proof that Alternatively, if it's not possible to write such a proof (because the necessary preconditions aren't guaranteed by the Rust Reference or by the standard library documentation), we'd like to know what the gaps are so we can work on adding them as guarantees to the language. |
Note that the std docs explicitly call out that it is considered unsound to project from shared That could imply that the same holds for More generally speaking, std API can only be witness to that something is possible, but std could always have magical instructions not replicable by user code. |
Definitely agreed. But do you believe that it's not possible to prove sound given any of the other guarantees provided by the language? If so, would opsem be comfortable adding the necessary guarantees to make it possible to complete such a proof? |
cc @RalfJung ; I assume you may have thoughts here |
Yeah I may; I also have quite a big backlog of Rust things so I can't say when I'll have time to take a look at this thread. |
Note sure if that was a good idea, it's usually better to keep one-topic-per-thread/issue...
I consider that a Stacked Borrows bug that must be fixed for the official aliasing model. Mixing
Rust does not do C-style TBAA. So yes I consider that pretty much settled. (Not just for UnsafeCell but for all references.)
Note on that example:
You're only using the UnsafeCell by-value here, so
Same as above.
That's basically "Rust has no TBAA except for its reference guarantees". I don't know if this is officially documented somewhere. I don't even know how to document this. A list of "all the UB Rust does not have" could become quite long. ;) Note that on top of what you said here, it is obviously also crucial to not mutate any of the parts that an active shared reference points to outside of an UnsafeCell! And you cannot violate uniqueness of
I am confused by your / @CAD97's use of the term "active" here. I would say a mutable reference that has been reborrowed is still active. It just has a child now. It becomes inactive when it gets invalidated y a conflicting access.
I would say that falls out of the fact that
They are trivially sound because |
On "active:" terms are hard, and there aren't enough words. NLL uses "live" for roughly "potential access later in the CFG" and TB uses "active" for roughly "hasn't been invalidated, would be DB to write through." There isn't a good word for the stronger property of "leaf active," roughly "spurious operations are acceptable and don't change the tag state." In any case, while the disclaimer stating that |
I also didn't really need such a word before, so this never seemed like a problem to me. For the "disagree about where the UnsafeCell are" question, I think "active (including possibly with children)" is the notion we want, isn't it? Though I think I'd put it more precisely as: it is UB to derive
Also worth noting that this is a form of library UB -- it currently seems very likely that there will be no language UB from side-stepping |
Amazing, thank you for this! It'll take me some time to wrap my head around everything, but this is very helpful. One thing that I would like to get clarity on, since it seems like the answer is "this is sound", but I'm not entirely sure that we all agree on why it's sound, or on how to prove that it's sound: See the following code: /// This is a function exposed in a crate's public API.
///
/// # Safety
///
/// The caller promises that `T` and `U` have the same layout and bit validity.
pub unsafe fn as_mut<T, U>(u: &mut UnsafeCell<T>) -> &mut U {
let cell_ptr: *mut UnsafeCell<T> = u;
// SAFETY: TODO
unsafe { &mut *cell_ptr.cast::<U>() }
} Three questions:
At the end of the day, what we're really after is the text of that safety proof. If we can't write it, then we can't commit the relevant code to zerocopy. |
This is sufficient to avoid immediate UB, but not sufficient for soundness. Types with the same layout and bit validity can still have different safety invariants (e.g., But other than that, isn't your function a straight-forward composition of You don't have to justify the soundness of |
Yeah, I'm ignoring safety invariants here to keep the question simple.
I agree that this is in principle straightforward, but:
I know I'm being incredibly pedantic, but my point with this exercise is that we're trying to write a soundness proof that is correct only in virtue of what's documented to the public, not in virtue of what is intended to be documented in the future, what's commonly understood to be true, etc. Often, when we go through these exercises, we discover that things that are "obviously true" are not actually documented anywhere. If you were writing that safety comment, what would you write? My hope in asking is to discover whether sufficient axioms are already provided by the documentation, or whether we need to add things. |
It's not lots, and this is being fixed rust-lang/rust#124251. Considering how often this example is trotted out, I actually wonder if people will have another similar example after that PR is merged. There's reason to think we will eliminate as many of these as possible over time. The sort of implicit reliance makes it hard to detect when third-party code is accidentally relying on such details. Note that before fat pointer layout, the example everyone used was uninitialized arrays of |
Yeah, I'm ignoring safety invariants here to keep the question simple.
You cannot talk precisely about soundness without talking about safety invariants. That is literally the core concept that all of soundness rests on.
|
Since the function is |
the fact that UnsafeCell::get_mut exists isn't a guarantee that we're allowed to do the same thing.
You don't have to do the same thing, you can just call that function and then convert &mut T to &mut U based on whatever makes that sound for you - an argument that does not involve UnsafeCell at all. That's what I keep saying and it is why I do not understand your question. ;)
Where is it documented that "UnsafeCell only makes a difference below &"?
In the UnsafeCell docs:
"Note that only the immutability guarantee for shared references is affected by UnsafeCell. The uniqueness guarantee for mutable references is unaffected."
"there is no magic whatsoever when dealing with exclusive accesses (e.g., through an &mut UnsafeCell<_>): neither the cell nor the wrapped value may be aliased for the duration of that &mut borrow. This is showcased by the .get_mut() accessor, which is a safe getter that yields a &mut T."
|
For an unsafe function, there is no clear definition of soundness. But insofar as it may be a generalization of regular soundness (of safe functions) that takes into account extra preconditions, it is still entirely unclear to me how one could talk about it without considering safety invariants.
Soundness is not "this call does not cause UB inside the function body". Soundness is "this function can be arbitrarily composed with other sound things and the result will never have UB". This idea of composition is key to the notion of soundness, and it requires safety invariants.
|
Hey, sorry for such long radio silence on this! I was traveling to a work conference and had to go into the conference right as you sent you last message, then lost all mental context and it fell off my radar. I think that I understand the model better now, specifically that (in the absence of TBAA):
*Except for in Stacked Borrows, where it’s considered a bug |
That does sound right. So... what is the remaining thing to do in this issue? It began with a list of questions, some subset of which has been answered. I don't know if there is anything actionable that's left to do. |
I believe all of the outstanding questions have been resolved! Thanks so much for spending the time to help me work through all of this. It's been incredibly helpful! |
As part of google/zerocopy#251, we're trying to teach zerocopy to reason in more precise ways about
UnsafeCell
s. I'm making this issue as sort of an omnibus issue to clarify a number of questions regardingUnsafeCell
s. As always with zerocopy, I'm only interested in what can currently be promised, not what might be promised in the future.In particular, my questions concern when it's sound for different code to disagree on whether a given memory region is covered by an
UnsafeCell
.Constraints
Based on Miri's behavior and APIs that already exist in the standard library, we can observe some constraints on what code is sound/unsound. In particular:
It's unsound (under stacked borrows) to have multiple active references which "disagree" about whether a memory region is covered by an
UnsafeCell
Status: can't assume sound
Prior art: #455, #303
As soon as there are multiple references which are "active"* or "not shadowed" or whatever the proper terminology is, and these references disagree about whether a given memory region is covered by an
UnsafeCell
, this is insta-UB according to stacked borrows even if the references are never used for anything. For example:*Previously, this post used the term "live", but @CAD97 pointed out that "active" is the more accurate term.
Miri output
It's still unsound even if the overlap is only "partial"
Status: can't assume sound
In this example, only some of the bytes go from covered by an
UnsafeCell
to not covered.Miri output
It's no longer unsound if the
UnsafeCell
region is of zero sizeStatus: might be able to guarantee sound
In other words, it's not the mere presence of
UnsafeCell
that matters, but rather the bytes that are "covered" byUnsafeCell
s. For this reason, ZSTUnsafeCell
s have no effect; this code is accepted by Miri:It is sound to have multiple references disagree about what type a memory region has so long as those references agree on
UnsafeCell
sStatus: probably guaranteed sound
For example, in both of the following cases, while the types are not the same, the byte ranges covered by
UnsafeCell
s are the same.It is sound to have multiple references disagree about
UnsafeCell
s so long as only one of them is "active"Status: guaranteed sound
In this example, I believe that what's happening is that
u_mut
becomes inaccessible so long as_c_mut
exists, so there's no way to exercise theUnsafeCell
disagreement. Not only is this accepted by Miri in practice, it's also codified in theUnsafeCell
methodsget_mut
(stable) andfrom_mut
(unstable nightly feature).It is sound to read a non-
UnsafeCell
location as anUnsafeCell
viaptr::read
Status: unclear
I'm not sure how to think about why this is sound, and whether it's guaranteed to remain sound.
It is sound to read an
UnsafeCell
location as a non-UnsafeCell
viaptr::read
Status: unclear
I'm even less sure how to think about this one - my intuition tells me that Miri should reject this, but it doesn't.
Guarantees
I would like to be able to rely on the following guarantees. For each guarantee, my questions are:
Sound given
UnsafeCell
agreementIn particular: it is sound to create references to the same memory which disagree about type so long as they agree about which bytes are covered by
UnsafeCell
. It is sound to use such references to perform loads and stores (assuming those loads and stores do not violate the referent types' bit validity).These analyses only consider active references
For the purposes of the previous soundness guarantee, only "active" (i.e., non-shadowed) references are considered.* This allows us to prove that code which reborrows a
&mut T
(with noUnsafeCell
s) as a&mut U
(withUnsafeCell
s) or vice versa (e.g.,UnsafeCell::get_mut
) is sound.I am aware that the formal definition of "active-ness" is up in the air, and subject to models like stacked borrows and tree borrows. I am assuming that there is some way we can formally define borrows so that we can get some minimum guarantees about this section despite not having completely agreed on the full model.
It is sound to read a non-
UnsafeCell
location as anUnsafeCell
viaptr::read
Given
&T
whereT
doesn't contain anyUnsafeCell
s, it is sound to convert&T
to*const U
(whereU
does containUnsafeCell
s) and useptr::read
to read aU
out of that memory location (assuming that this doesn't violateU
's bit validity, read uninitialized bytes, read unaligned memory, etc etc).The text was updated successfully, but these errors were encountered: