-
Notifications
You must be signed in to change notification settings - Fork 59
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Mutable references vs self-referential structs #148
Comments
In some sense there are two problems here:
If we still had |
If we ignore the fact that references are wrapped inside the fn main() {
let mut local = 42;
let raw_ptr = &mut local as *mut i32; // create raw pointer
let safe_ref = &mut local; // create a reference, which is unique, and hence invalidates the raw pointer
println!("{}", unsafe { *raw_ptr }); // UB
} I explicitly wanted that code to be UB. While raw pointers aliasing with each other is allowed, having a single raw pointer and a mutable reference acts pretty much like having two mutable references -- after all, between the two of them, it's enough if one side does not allow aliasing, since "aliases-with" is a symmetric relation. If we replace So to fix this problem, I see two general options:
|
@RalfJung: Could this be resolved by turning This would likely require turning Pin into Compiler Magic(tm) that is hard to implement for user-defined pointer types, though, and I can imagine backwards-incompatible implications in other areas such as changing the safety contract for extracting mutable references from Pin... EDIT: Ah, I see that you explored a |
You don't need |
The way I understand how Pin works, the self referential references borrow their validity from the one in the pin. So under SB, they're retagged from the pin. The most minimal change that would make pin play nicely with SB I think would be to somehow keep the interior references valid even while the unsafe map_unchecked_mut reference is used. Could it be possible to not retag for pin's map_unchecked_mut and only pop tags when the reference is used to access that memory location? (This is super spitball, sorry) |
Correct. However, then the pin itself eventually gets retagged, and that kills the self-referential borrows. This currently happens all the time, but could be reduced to just "inside the
Well it'd need a magic marker or so.
That's basically option (2) from above. |
#194 indicates that this is a problem with self-referential structs in general, not just self-referential generators. That's not surprising, so I generalized the issue title. |
@RalfJung: Related to your example of:
While working on pin-project, I wanted to write this code: use std::pin::Pin;
struct Foo {
// #[pin]
field1: u8,
field2: bool
}
struct FooProj<'a> {
__self_ptr: *mut Foo,
field1: Pin<&'a mut u8>,
field2: &'a mut bool
}
impl Foo {
fn project<'a>(self: Pin<&'a mut Self>) -> FooProj<'a> {
let this = unsafe { self.get_unchecked_mut() };
let __self_ptr: *mut Foo = this;
let field1 = unsafe { Pin::new_unchecked(&mut this.field1) };
let field2 = &mut this.field2;
FooProj {
__self_ptr,
field1,
field2
}
}
}
impl<'a> FooProj<'a> {
fn unproject(self) -> Pin<&'a mut Foo> {
unsafe {
let this: &mut Foo = &mut *self.__self_ptr;
Pin::new_unchecked(this)
}
}
}
fn main() {
let mut foo = Foo { field1: 25, field2: true };
let foo_pin: Pin<&mut Foo> = Pin::new(&mut foo);
let foo_proj: FooProj = foo_pin.project();
let foo_orig: Pin<&mut Foo> = foo_proj.unproject();
} The key thing here is the
However, Miri flags this as UB for (I believe) a similar reason as your example - creating a mutable reference to a field ends up transitively asserts that we have unique ownership of the base type. Therefore, any pre-existing raw pointers to the base type will be invalidated. Allowing this kind of pattern would make
This is especially useful when working with enums - here's a real-world example in Hyper. However, I can't see a way to avoid creating and later 'upgrading' a raw pointer when trying to safely abstract this pattern in |
With It is the same kind of reasoning as to why this could be sound: in the self-referential case, it's the whole pointer existing that invalidates the part pointers, though their usage regions are disjoint. In the |
Based on rust-lang/unsafe-code-guidelines#148 (comment) by @CAD97 Currently, the generated 'project' method takes a 'Pin<&mut Self>', consuming it. This makes it impossible to use the original Pin<&mut Self> after calling project(), since the 'Pin<&mut Self>' has been moved into the the 'Project' method. This makes it impossible to implement useful pattern when working with enums: ```rust enum Foo { Variant1(#[pin] SomeFuture), Variant2(OtherType) } fn process(foo: Pin<&mut Foo>) { match foo.project() { __FooProjection(fut) => { fut.poll(); let new_foo: Foo = ...; foo.set(new_foo); }, _ => {} } } ``` This pattern is common when implementing a Future combinator - an inner future is polled, and then the containing enum is changed to a new variant. However, as soon as 'project()' is called, it becoms imposible to call 'set' on the original 'Pin<&mut Self>'. To support this pattern, this commit changes the 'project' method to take a '&mut Pin<&mut Self>'. The projection types works exactly as before - however, creating it no longer requires consuming the original 'Pin<&mut Self>' Unfortunately, current limitations of Rust prevent us from simply modifiying the signature of the 'project' method in the inherent impl of the projection type. While using 'Pin<&mut Self>' as a receiver is supported on stable rust, using '&mut Pin<&mut Self>' as a receiver requires the unstable `#![feature(arbitrary_self_types)]` For compatibility with stable Rust, we instead dynamically define a new trait, '__{Type}ProjectionTrait', where {Type} is the name of the type with the `#[pin_project]` attribute. This trait looks like this: ```rust trait __FooProjectionTrait { fn project('a &mut self) -> __FooProjection<'a>; } ``` It is then implemented for `Pin<&mut {Type}>`. This allows the `project` method to be invoked on `&mut Pin<&mut {Type}>`, which is what we want. If Generic Associated Types (rust-lang/rust#44265) were implemented and stablized, we could use a single trait for all pin projections: ```rust trait Projectable { type Projection<'a>; fn project(&'a mut self) -> Self::Projection<'a>; } ``` However, Generic Associated Types are not even implemented on nightly yet, so we need for generate a new trait per type for the forseeable future.
Based on rust-lang/unsafe-code-guidelines#148 (comment) by @CAD97 Currently, the generated 'project' method takes a 'Pin<&mut Self>', consuming it. This makes it impossible to use the original Pin<&mut Self> after calling project(), since the 'Pin<&mut Self>' has been moved into the the 'Project' method. This makes it impossible to implement useful pattern when working with enums: ```rust enum Foo { Variant1(#[pin] SomeFuture), Variant2(OtherType) } fn process(foo: Pin<&mut Foo>) { match foo.project() { __FooProjection(fut) => { fut.poll(); let new_foo: Foo = ...; foo.set(new_foo); }, _ => {} } } ``` This pattern is common when implementing a Future combinator - an inner future is polled, and then the containing enum is changed to a new variant. However, as soon as 'project()' is called, it becoms imposible to call 'set' on the original 'Pin<&mut Self>'. To support this pattern, this commit changes the 'project' method to take a '&mut Pin<&mut Self>'. The projection types works exactly as before - however, creating it no longer requires consuming the original 'Pin<&mut Self>' Unfortunately, current limitations of Rust prevent us from simply modifiying the signature of the 'project' method in the inherent impl of the projection type. While using 'Pin<&mut Self>' as a receiver is supported on stable rust, using '&mut Pin<&mut Self>' as a receiver requires the unstable `#![feature(arbitrary_self_types)]` For compatibility with stable Rust, we instead dynamically define a new trait, '__{Type}ProjectionTrait', where {Type} is the name of the type with the `#[pin_project]` attribute. This trait looks like this: ```rust trait __FooProjectionTrait { fn project(&'a mut self) -> __FooProjection<'a>; } ``` It is then implemented for `Pin<&mut {Type}>`. This allows the `project` method to be invoked on `&mut Pin<&mut {Type}>`, which is what we want. If Generic Associated Types (rust-lang/rust#44265) were implemented and stablized, we could use a single trait for all pin projections: ```rust trait Projectable { type Projection<'a>; fn project(&'a mut self) -> Self::Projection<'a>; } ``` However, Generic Associated Types are not even implemented on nightly yet, so we need for generate a new trait per type for the forseeable future.
47: Make generated 'project' reference take an '&mut Pin<&mut Self>' r=taiki-e a=Aaron1011 Based on rust-lang/unsafe-code-guidelines#148 (comment) by @CAD97 Currently, the generated 'project' method takes a 'Pin<&mut Self>', consuming it. This makes it impossible to use the original Pin<&mut Self> after calling project(), since the 'Pin<&mut Self>' has been moved into the the 'Project' method. This makes it impossible to implement useful pattern when working with enums: ```rust enum Foo { Variant1(#[pin] SomeFuture), Variant2(OtherType) } fn process(foo: Pin<&mut Foo>) { match foo.project() { __FooProjection(fut) => { fut.poll(); let new_foo: Foo = ...; foo.set(new_foo); }, _ => {} } } ``` This pattern is common when implementing a Future combinator - an inner future is polled, and then the containing enum is changed to a new variant. However, as soon as 'project()' is called, it becoms imposible to call 'set' on the original 'Pin<&mut Self>'. To support this pattern, this commit changes the 'project' method to take a '&mut Pin<&mut Self>'. The projection types work exactly as before - however, creating it no longer requires consuming the original 'Pin<&mut Self>' Unfortunately, current limitations of Rust prevent us from simply modifying the signature of the 'project' method in the inherent impl of the projection type. While using 'Pin<&mut Self>' as a receiver is supported on stable rust, using '&mut Pin<&mut Self>' as a receiver requires the unstable `#![feature(arbitrary_self_types)]` For compatibility with stable Rust, we instead dynamically define a new trait, '__{Type}ProjectionTrait', where {Type} is the name of the type with the `#[pin_project]` attribute. This trait looks like this: ```rust trait __FooProjectionTrait { fn project(&'a mut self) -> __FooProjection<'a>; } ``` It is then implemented for `Pin<&mut {Type}>`. This allows the `project` method to be invoked on `&mut Pin<&mut {Type}>`, which is what we want. If Generic Associated Types (rust-lang/rust#44265) were implemented and stabilized, we could use a single trait for all pin projections: ```rust trait Projectable { type Projection<'a>; fn project(&'a mut self) -> Self::Projection<'a>; } ``` However, Generic Associated Types are not even implemented on nightly yet, so we need to generate a new trait per type for the foreseeable future. Co-authored-by: Aaron Hill <[email protected]>
I don't understand what you are trying to say here. The type of the referent doesn't really matter here. (It only matters for its size and where the
Only raw pointers that are not "parents" of the mutable reference get invalidated. So, I think you can fix your code by deriving |
Fixes actix#1321 A better fix would be to change `MessageBody` to take a `Pin<&mut Self>`, rather than a `Pin<&mut Self>`. This will avoid requiring the use of `Box` for all consumers by allowing the caller to determine how to pin the `MessageBody` implementation (e.g. via stack pinning). However, doing so is a breaking change that will affect every user of `MessageBody`. By pinning the inner stream ourselves, we can fix the undefined behavior without breaking the API. I've included @sebzim4500's reproduction case as a new test case. However, due to the nature of undefined behavior, this could pass (and not segfault) even if underlying issue were to regress. Unfortunately, until rust-lang/unsafe-code-guidelines#148 is resolved, it's not even possible to write a Miri test that will pass when the bug is fixed.
Fixes actix#1321 A better fix would be to change `MessageBody` to take a `Pin<&mut Self>`, rather than a `Pin<&mut Self>`. This will avoid requiring the use of `Box` for all consumers by allowing the caller to determine how to pin the `MessageBody` implementation (e.g. via stack pinning). However, doing so is a breaking change that will affect every user of `MessageBody`. By pinning the inner stream ourselves, we can fix the undefined behavior without breaking the API. I've included @sebzim4500's reproduction case as a new test case. However, due to the nature of undefined behavior, this could pass (and not segfault) even if underlying issue were to regress. Unfortunately, until rust-lang/unsafe-code-guidelines#148 is resolved, it's not even possible to write a Miri test that will pass when the bug is fixed.
Fixes #1321 A better fix would be to change `MessageBody` to take a `Pin<&mut Self>`, rather than a `Pin<&mut Self>`. This will avoid requiring the use of `Box` for all consumers by allowing the caller to determine how to pin the `MessageBody` implementation (e.g. via stack pinning). However, doing so is a breaking change that will affect every user of `MessageBody`. By pinning the inner stream ourselves, we can fix the undefined behavior without breaking the API. I've included @sebzim4500's reproduction case as a new test case. However, due to the nature of undefined behavior, this could pass (and not segfault) even if underlying issue were to regress. Unfortunately, until rust-lang/unsafe-code-guidelines#148 is resolved, it's not even possible to write a Miri test that will pass when the bug is fixed. Co-authored-by: Yuki Okushi <[email protected]>
As a passing thought: |
The property we need of some |
disable test with self-referential generator on Miri Running the libcore test suite in Miri currently fails due to the known incompatibility of self-referential generators with Miri's aliasing checks (rust-lang/unsafe-code-guidelines#148). So let's disable that test in Miri for now.
rust-lang/miri#1952 could help with this by disabling uniqueness of mutable references for I wonder what people think about this. |
I think it would be very interesting to have a version of miri that disables uniqueness for One particular concern I have is regarding what happens when you have an |
The moment you create a reference to that field, it would require uniqueness. |
As in, make this configurable via a flag? That's possible, but then the question remains what the default should be. |
Whether it's a flag or not is not that important to me.
Yes, that was the impression I got as well. In Tokio I have been doing this because it's unclear to me exactly what is required: However, we haven't been doing the same for all structs containing a |
Hm, I don't quite understand what that But anyway that seems unrelated to self-referential generators, so we should probably move the discussion elsewhere (e.g. Zulip). |
exclude mutable references to !Unpin types from uniqueness guarantees This basically works around rust-lang/unsafe-code-guidelines#148 by not requiring uniqueness any more for mutable references to self-referential generators. That corresponds to [the same work-around that was applied in rustc itself](https://github.com/rust-lang/rust/blob/b81553267437627af63c79c1a20c73af865a842a/compiler/rustc_middle/src/ty/layout.rs#L2482). I am not entirely sure if this is a good idea since it might hide too many errors in case types are "accidentally" `!Unpin`. OTOH, our test suite still passes, and to my knowledge the vast majority of types is `Unpin`. (`place.layout.ty` is monomorphic, we should always exactly know which type this is.)
I believe I'm running into a variation of this here: mitsuhiko/deser#37 — While I don't have a generator, I have the situation where I am trying to stash away a value temporarily on an iterator like type for one iteration and the returning the value invalidates all pointers. While what I'm doing is probably not a very common issue, it seems like you can run into this in the absence of generators, futures and pin as well. |
Yes, it's definitely a thing that happens outside of the case that's mentioned at the start. |
This issue here is specific about self-referential data structures though, not sure how those would come up in an iterator? So on first sight that does not really sound like the same problem. |
@RalfJung it's not a traditional iterator, it's a struct that is iterated like this: while let Some(value) = thing.next()? { ... } And it works by stashing data onto |
Cross-linking recent zulip discussion about " |
Visiting for backlog bonanza. Retitling to clarify that this is not SB specific |
I also just posted an RFC intended to properly resolve this :)
rust-lang/rfcs#3467
|
As an observer, a thought occurred regarding the UnsafeWindow proposal. Would it be possible to prevent this type from ever being referenced? As in, it should be treated effectively like an UnsafeCell, but creating direct references to an UnsafeWindow should be disallowed. Based on a minute's worth of thought it seems like if UnsafeWindow were allowed to be turned into a reference, we run into the same set of problems every other proposal has, where we end up having to make exemptions for certain referencing patterns wholesale where we could instead restrict it to certain regions through api enforcement. It feels like an approach where UnsafeWindow would instead somehow provide non aliasing mutable references to the inner value may allow for easier restrictions on what is allowed in general while still allowing the self referential behaviour. Thinking in terms of mutability, Id say UnsafeWindow would behave like UnsafeCell, as in an immutable UnsafeWindow would still allow for multiple mutable references to the inner value to exist. In summary, UnsafeWindow would be ownership only, and dereferencing an unsafe window should result in, say, a new reference that is implicitly assumed to be non aliasing. Usage in ways that allow the reference to exist outside of the variable's scope would be UB, but I can' figure out if there is a way to statically disallow such a situation, aince if we attach ifetimes to the created references it would imply they are tied to the UnsafeWindow and we may end up looping around to the aliasing mutable reference problem again. Welp, that RFC sort of is exactly what I envisioned. |
Turns out that stacked borrows and self-referential generators don't go well together. This code fails in Miri:
The reason it fails is that each time through the loop, we call
Pin::as_mut
, which creates a fresh mutable reference to the generator. Since mutable references are unique, that invalidates the pointer that points from one field of the generator to another.This is basically a particularly bad instance of #133.
Cc @cramertj @tmandry @nikomatsakis @arielb1
The text was updated successfully, but these errors were encountered: