Stacked Borrows vs inner pointers into "stable-deref" types #194
Comments
cc @RalfJung |
At first glance this looks like a duplicate of #148. Can you confirm?
Finding trade-offs like this is exactly why we have Stacked Borrows implemented in Miri. If this is really a duplicate of the above, it will be very interesting to figure out which optimizations it will cost us to support code like this. Maybe we end up needing some kind of annotation for self-referential structs. |
I don't see anything self-referential here. The code is just taking a pointer to the contents of the Box and expects this pointer to remain usable as the Box itself is moved around. |
So the If I modify the example to involve mutation, this can actually cause miscompilation:

    use std::cell::Cell;

    fn helper(val: Box<Cell<u8>>, ptr: *const Cell<u8>) -> u8 {
        val.set(10);
        unsafe { (*ptr).set(20); }
        val.get()
    }

    fn main() {
        let val: Box<Cell<u8>> = Box::new(Cell::new(25));
        let ptr: *const Cell<u8> = &*val;
        let res = helper(val, ptr);
        assert_eq!(res, 20);
    }

At the time of writing, the assertion passes in debug mode but fails in release mode. I believe the modified example still corresponds to what What's happening is that the Surprisingly, In this example, though, the misoptimization happens because we've violated the Edit: Interestingly, my example does not trigger a miri error. |
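For contrast, here is a minimal sketch of the same access pattern with ownership passed around as a raw pointer, so that no `Box` (and no uniqueness assumption attached to it) is live while `ptr` is in use. The names `helper_raw` and `run_raw_example` are hypothetical and do not come from the thread; whether this shape fits a given API is an assumption.

```rust
use std::cell::Cell;

// Hypothetical variant of `helper`: the allocation arrives as a raw pointer
// instead of a `Box`, so the parameter carries no uniqueness promise.
fn helper_raw(val: *mut Cell<u8>, ptr: *const Cell<u8>) -> u8 {
    unsafe {
        (*val).set(10);
        (*ptr).set(20);
        let res = (*val).get();
        // Rebuild the Box (and free the allocation) only after every
        // aliasing raw pointer has been used for the last time.
        drop(Box::from_raw(val));
        res
    }
}

// Hypothetical driver: leak the Box up front and derive both pointers
// from the same raw pointer.
fn run_raw_example() -> u8 {
    let val: *mut Cell<u8> = Box::into_raw(Box::new(Cell::new(25)));
    let ptr: *const Cell<u8> = val;
    helper_raw(val, ptr)
}
```

Because both pointers are derived from the result of `Box::into_raw`, neither access invalidates the other, and the write of 20 is the one observed by the final read.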
I looked into it, and sure enough, you can write an equivalent program without unsafe code using the APIs provided by It should be possible for |
I definitely think it's the same underlying reasoning that would make this and self-referential (i.e. #148) sound. It boils down to:
The problem is that this is a completely implicit reborrowing, effectively from just having the This is the exact behavior that
The result is that |
Another update: It's also possible to violate the This is all you need to violate the condition:

    pub async fn test() -> i32 {
        let a: Cell<i32> = Cell::new(0);
        let mut b: &Cell<i32> = &a;
        SomeFuture.await;
        b.set(100);
        a.get()
    }

It's desugared into something like

    struct Test {
        state: ?,
        a: Cell<i32>,
        b: *const Cell<i32>,
    }
    impl Future for Test {
        fn poll(self: Pin<&mut Self>, ...) {
            loop {
                match self.state {
                    State1 => {
                        self.b = &self.a;
                        ...poll on SomeFuture...
                    },
                    State2 => {
                        (*self.b).set(100);
                        Poll::Ready(self.a.get())
                    },
                }
            }
        }
    }

Here, On the other hand, if we enter I was sort of hoping to drop a bombshell that might affect async/await stabilization, but no such luck. ;) For a long-term solution, we need to avoid marking |
@comex I think it is worth mentioning that in your example here |
Oh, I forgot what owning-ref does. The pointer into the box lives next to the box, not inside it. I was also confused by "the box is moved", because the box has to stay in place for the pointer to remain valid. But the However, in terms of Stacked Borrows, it doesn't matter where the pointer is stored, whether inside the box or outside of it. What matters is that an inner pointer gets created, then an outer pointer gets used (for a move of a box or a reborrow of a mutable reference), and then the old inner pointer gets used again. So the pattern is the same. |
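The ordering described above (inner pointer created, outer pointer used, inner pointer used again) can be sketched with a mutable reference. This is only an illustration: the function name `interleaving_demo` is hypothetical, and the final use of the inner pointer is left commented out because, as far as I understand the model, executing it is what Stacked Borrows would reject.

```rust
// Hypothetical illustration of the access ordering discussed in the thread.
fn interleaving_demo() -> u8 {
    let mut x = 0u8;
    let outer: &mut u8 = &mut x;
    // Inner pointer, derived from the outer one.
    let inner: *mut u8 = &mut *outer;
    // Using the inner pointer right away is fine: nothing has used `outer` since.
    unsafe { *inner = 1; }
    // The outer pointer is used again; under Stacked Borrows this is the
    // step that is expected to invalidate `inner`.
    *outer = 2;
    // Using the old inner pointer now would be the problematic third step:
    // unsafe { *inner = 3; }
    x
}
```

With the last write commented out the function is ordinary, well-defined Rust; the point is only to show which of the three steps is the one in question.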
Note that right now, Stacked Borrows special-cases
Nice example! Also the one for Interesting indeed. Probably this is because all raw pointers are equal for Miri (at least currently; I hope to change this eventually but without
Exactly. Maybe such things "just" need to be specially marked to the compiler.
The compiler doesn't know about the preconditions, so violating the preconditions never explains UB (from an abstract machine perspective). The reason it is UB is that aliasing constraints are violated, and as @comex demonstrated owning-ref can do the same. So should we close this now in favor of #148? I think we established that they are definitely closely related. |
Of course it does? All |
I think it is just ontologically wrong to say "this is UB because the precondition is violated", the IMO more correct view is "this is a precondition because otherwise there is UB" (and the UB is caused by the aliasing model that's part of the abstract machine). Following your argument "the calling code violates its pre-conditions, so the program has undefined behavior", calling |
I claimed that:
I'm not sure how you end up arguing that there are pre-conditions that can be violated without invoking undefined behavior (e.g. I did not make any claims about the source of the undefined behavior, I certainly did not intend to make the tautological claim that this is UB because helper has a precondition due to this being UB. When I read the thread, I got the impression from @comex comment that this might be a mis-optimization. I just wanted to point out that this is intended behavior. |
In the case of More generally, for a pointer I don't know how this would generalise to |
This phrasing heavily implied a general entailment of the form "precondition violated implies UB". Specifically, the "so" in your sentence indicates a causal connection. If you didn't mean to say that, take this as feedback for trying a clearer wording next time. For @comex' example, it also doesn't really matter whether that function is marked
Ah, the term "miscompilation" is the culprit here. Do we have a good word for "code that has UB that the compiler actually exploits", in contrast to "code that has UB but still works as intended by the programmer"? We often say "miscompilation" but that sounds like a compiler bug, which it isn't if the code really does have UB. As far as I can tell, there was never any disagreement that that code is wrong according to our current aliasing rules, and so the compiler is in its right to "miscompile". @comex was not trying to argue the UB away. The point of that comment was that this code does the exact same thing as what you can do with owning-ref using only safe code. So either owning-ref is wrong or our aliasing rules are wrong. Cc @Kimundi (owning-ref author); I also reported this upstream.
We might... :/ |
FWIW, we've attempted to address this in ndarray, where it is using a Vec for allocation/deallocation (partly for historical reasons, now for zero-copy transfer of data between Array and Vec). The description of the problem is this:
and solution this:
It feels like a non-issue, because I can't find anywhere that we "interleave" read/writes through the Vec with reads/writes through our own "head pointer". The only places where we write to the Vec (for example in |
To clarify, you are saying this code should be fine and Stacked Borrows should not complain, for the reasons given? |
If this code is the ndarray code in question: Yes, but I don't know the model that well, so it's just a hunch. |
I still feel that if people / libraries (i.e., I am not talking about the compiler-generated unsugaring for generators and the That is, a Otherwise, its pointee can be aliased, and depending on the usage, either by a pointer that assumes immutability, so as long as the
|
Any Rust container like `Box<T>`, `Vec<T>` or `String` internally contains a `Unique<T>` pointer, which communicates to the compiler that this container is the owner of that memory location and that all access goes through that pointer. See rust-lang/unsafe-code-guidelines#194 for details. Passing out a pointer to the underlying buffer to sqlite could cause UB according to this definition, at least if someone else accesses the buffer through the original pointer. To prevent that, we temporarily leak the buffer and manage the pointer ourselves. Additionally, this change introduces a way to construct the `BoundStatement` as early as possible as part of the `BoundStatement::bind` function, so that all cleanup code can be concentrated in the corresponding `Drop` impl.
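The "temporarily leak the buffer and manage the pointer ourselves" approach can be sketched roughly as follows. This is not the code from the referenced change: the type `LeakedBuffer` and its methods are hypothetical names, and the sketch assumes a plain `Vec<u8>` buffer.

```rust
use std::mem::ManuallyDrop;

/// Hypothetical owner of a byte buffer whose pointer is handed to foreign code.
/// It stores the raw parts instead of the `Vec`, so moving this struct does not
/// reassert unique ownership of the heap allocation.
struct LeakedBuffer {
    ptr: *mut u8,
    len: usize,
    cap: usize,
}

impl LeakedBuffer {
    fn new(buf: Vec<u8>) -> Self {
        // Prevent the Vec destructor from running; we free the memory in `Drop`.
        let mut buf = ManuallyDrop::new(buf);
        let ptr = buf.as_mut_ptr();
        let len = buf.len();
        let cap = buf.capacity();
        LeakedBuffer { ptr, len, cap }
    }

    /// Pointer that may be passed to foreign code while `self` is alive.
    fn as_ptr(&self) -> *const u8 {
        self.ptr
    }

    fn len(&self) -> usize {
        self.len
    }
}

impl Drop for LeakedBuffer {
    fn drop(&mut self) {
        // SAFETY: `ptr`, `len` and `cap` come from a `Vec<u8>` that was never dropped.
        unsafe { drop(Vec::from_raw_parts(self.ptr, self.len, self.cap)) }
    }
}
```

All cleanup lives in the single `Drop` impl, and the pointer returned by `as_ptr` stays usable when the `LeakedBuffer` value itself is moved, since the struct holds only a raw pointer to the allocation.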
We have separate issues to discuss the design of |
Links for those things: |
When running Miri on owning-ref, I ran into an error with this (minimized) snippet (playground):
which errors with:
In this snippet, we're creating a raw pointer into a Box, moving the Box, then dereferencing the raw pointer. The owning-ref crate (and more generally, anything relying on stable_deref_trait) relies on this working. In fact, this is the entire reason for the existence of `stable_deref_trait`.

The error message is a little confusing. However, I believe that the read from `ptr` causes the `Unique` item on the stack to be disabled (since the `Unique` is above the `SharedReadOnly` that grants access).

Does it make sense to consider this kind of behavior UB, given that `stable_deref_trait` and `owning-ref` (and probably other crates as well) rely on it?