- Feature Name:
pin
- Start Date: 2018-02-19
- RFC PR: rust-lang/rfcs#2349
- Rust Issue: rust-lang/rust#49150
Introduce new APIs to libcore / libstd to serve as safe abstractions for data which cannot be safely moved around.
A longstanding problem for Rust has been dealing with types that should not be moved. A common motivation for this is when a struct contains a pointer into its own representation - moving that struct would invalidate that pointer. This use case has become especially important recently with work on generators. Because generators essentially reify a stackframe into an object that can be manipulated in code, it is likely for idiomatic usage of a generator to result in such a self-referential type, if it is allowed.
This proposal adds an API to std which would allow you to guarantee that a particular value will never move again, enabling safe APIs that rely on self-references to exist.
The core goal of this RFC is to provide a reference type where the referent is guaranteed to never move before being dropped. We want to do this with a minimum disruption to the type system, and in fact, this RFC shows that we can achieve the goal without any type system changes.
Let's take that goal apart, piece by piece, from the perspective of the futures (i.e. async/await) use case:
-
Reference type. The reason we need a reference type is that, when working with things like futures, we generally want to combine smaller futures into larger ones, and only at the top level put an entire resulting future into some immovable location. Thus, we need a reference type for methods like
poll
, so that we can break apart a large future into its smaller components, while retaining the guarantee about immobility. -
Never to move before being dropped. Again looking at the futures case, once we begin
poll
ing a future, we want it to be able to store references into itself, which is possible if we can guarantee that the whole future will never move. We don't try to track whether such references exist at the type level, since that would involve cumbersome typestate; instead, we simply decree that by the time you initiallypoll
, you promise to never move an immobile future again.
At the same time, we want to support futures (and iterators, etc.) that can move. While it's possible to do so by providing two distinct Future
(or Iterator
, etc) traits, such designs incur unacceptable ergonomic costs.
The key insight of this RFC is that we can create a new library type, Pin<'a, T>
, which encompasses both movable and immobile referents. The type is paired with a new auto trait, Unpin
, which determines the meaning of Pin<'a, T>
:
- If
T: Unpin
(which is the default), thenPin<'a, T>
is entirely equivalent to&'a mut T
. - If
T: !Unpin
, thenPin<'a, T>
provides a unique reference to aT
with lifetime'a
, but only provides&'a T
access safely. It also guarantees that the referent will never be moved. However, getting&'a mut T
access is unsafe, because operations likemem::replace
mean that&mut
access is enough to move data out of the referent; you must promise not to do so.
To be clear: the sole function of Unpin
is to control the meaning of Pin
. Making Unpin
an auto trait means that the vast majority of types are automatically "movable", so Pin
degenerates to &mut
. In the case that you need immobility, you opt out of Unpin
, and then Pin
becomes meaningful for your type.
Putting this all together, we arrive at the following definition of Future
:
trait Future {
type Item;
type Error;
fn poll(self: Pin<Self>, cx: &mut task::Context) -> Poll<Self::Item, Self::Error>;
}
By default when implementing Future
for a struct, this definition is equivalent to today's, which takes &mut self
. But if you want to allow self-referencing in your future, you just opt out of Unpin
, and Pin
takes care of the rest.
This RFC also provides a pinned analogy to Box
called PinBox<T>
. It works along the same lines as the Pin
type discussed here - if the type implements Unpin
, it functions the same as the unpinned Box
; if the type has opted out of Unpin
, it guarantees that they type behind the reference will not be moved again.
This new auto trait is added to the core::marker
and std::marker
modules:
pub unsafe auto trait Unpin { }
A type that implements Unpin
can be moved out of one of the pinned reference
types discussed later in this RFC. Otherwise, they do not expose a safe API
which allows you to move a value out of them. Because Unpin
is an auto trait,
most types in Rust implement Unpin
. The types which don't are primarily
self-referential types, like certain generators.
This trait is a lang item, but only to generate negative impls for certain
generators. Unlike previous ?Move
proposals, and unlike some traits like
Sized
and Copy
, this trait does not impose any compiler-based semantics
types that do or don't implement it. Instead, the semantics are entirely
enforced through library APIs which use Unpin
as a marker.
The Pin
struct is added to both core::mem
and std::mem
. It is a new kind
of reference, with stronger requirements than &mut T
#[fundamental]
pub struct Pin<'a, T: ?Sized + 'a> {
data: &'a mut T,
}
Pin
implements Deref
, but only implements DerefMut
if the type it
references implements Unpin
. This way, it is not safe to call mem::swap
or
mem::replace
when the type referenced does not implement Unpin
.
impl<'a, T: ?Sized> Deref for Pin<'a, T> { ... }
impl<'a, T: Unpin + ?Sized> DerefMut for Pin<'a, T> { ... }
It can only be safely constructed from references to types that implement
Unpin
:
impl<'a, T: Unpin + ?Sized> Pin<'a, T> {
pub fn new(reference: &'a mut T) -> Pin<'a, T> { ... }
}
It also has a function called borrow
, which allows it to be transformed to a
pin of a shorter lifetime:
impl<'a, T: ?Sized> Pin<'a, T> {
pub fn borrow<'b>(this: &'b mut Pin<'a, T>) -> Pin<'b, T> { ... }
}
It may also implement additional APIs as is useful for type conversions, such
as AsRef
, From
, and so on. Pin
implements CoerceUnsized
as necessary to
make coercing them into trait objects possible.
Pin
can be unsafely constructed from mutable references to types that may not
implement Unpin
. Users who use this constructor must know that the type
they are passing a reference to will never be moved again after the Pin
is
constructed, even after the lifetime of the reference has ended. (For example,
it is always unsafe to construct a Pin
from a reference you did not create,
because you don't know what will happen once the lifetime of that reference
ends.)
impl<'a, T: ?Sized> Pin<'a, T> {
pub unsafe fn new_unchecked(reference: &'a mut T) -> Pin<'a, T> { ... }
}
Pin
also has an API which allows it to be converted into a mutable reference
for a type that doesn't implement Unpin
. Users who use this API must
guarantee that they never move out of the mutable reference they receive.
impl<'a, T: ?Sized> Pin<'a, T> {
pub unsafe fn get_mut<'b>(this: &'b mut Pin<'a, T>) -> &'b mut T { ... }
}
Finally, as a convenience, Pin
implements an unsafe map
function, which
makes it easier to project through a field. Users calling this function must
guarantee that the value returned will not move as long as the referent of this
pin doesn't move (e.g. it is a private field of the value). They also must not
move out of the mutable reference they receive as the closure argument:
impl<'a, T: ?Sized> Pin<'a, T> {
pub unsafe fn map<'b, U, F>(this: &'b mut Pin<'a, T>, f: F) -> Pin<'b, U>
where F: FnOnce(&mut T) -> &mut U
{ ... }
}
// for example:
struct Foo {
bar: Bar,
}
let foo_pin: Pin<Foo>;
let bar_pin: Pin<Bar> = unsafe { Pin::map(&mut foo_pin, |foo| &mut foo.bar) };
// Equivalent to:
let bar_pin: Pin<Bar> = unsafe {
let foo: &mut Foo = Pin::get_mut(&mut foo_pin);
Pin::new_unchecked(&mut foo.bar)
};
The PinBox
type is added to alloc::boxed and std::boxed. It is analogous to
the Box
type in the same way that Pin
is analogous to the reference types,
and it has a similar API.
#[fundamental]
pub struct PinBox<T: ?Sized> {
inner: Box<T>,
}
Unlike Pin
, it is safe to construct a PinBox
from a T
and from a
Box<T>
, even if the type does not implement Unpin
:
impl<T> PinBox<T> {
pub fn new(data: T) -> PinBox<T> { ... }
}
impl<T: ?Sized> From<Box<T>> for PinBox<T> {
fn from(boxed: Box<T>) -> PinBox<T> { ... }
}
It also provides the same Deref impls as Pin
does:
impl<T: ?Sized> Deref for PinBox<T> { ... }
impl<T: Unpin + ?Sized> DerefMut for PinBox<T> { ... }
If the data implements Unpin
, its also safe to convert a PinBox
into a
Box
:
impl<T: Unpin + ?Sized> From<PinBox<T>> for Box<T> { ... }
Finally, it is safe to get a Pin
from borrows of PinBox
:
impl<T: ?Sized> PinBox<T> {
fn as_pin<'a>(&'a mut self) -> Pin<'a, T> { ... }
}
These APIs make PinBox
a reasonable way of handling data which does not
implement Unpin
. Once you heap allocate that data inside of a PinBox
, you
know that it will never change address again, and you can hand out Pin
references to that data.
PinBox
can be unsafely converted into &mut T
and Box<T>
even if the type
it references does not implement Unpin
, similar to Pin
:
impl<T: ?Sized> PinBox<T> {
pub unsafe fn get_mut<'a>(this: &'a mut PinBox<T>) -> &'a mut T { ... }
pub unsafe fn into_inner(this: PinBox<T>) -> Box<T> { ... }
}
Today, the unstable generators feature has an option to create generators which contain references that live across yield points - these are, in effect, internal references into the generator's state machine. Because internal references are invalidated if the type is moved, these kinds of generators ("immovable generators") are currently unsafe to create.
Once the arbitrary_self_types feature becomes object safe, we will make three changes to the generator API:
- We will change the
resume
method to take self byself: Pin<Self>
instead of&mut self
. - We will implement
!Unpin
for the anonymous type of an immovable generator. - We will make it safe to define an immovable generator.
This is an example of how the APIs in this RFC allow for self-referential data types to be created safely.
This adds additional APIs to std, including an auto trait. Such additions should not be taken lightly, and only included if they are well-justified by the abstractions they express.
One previous proposal was to add a built-in Move
trait, similar to Sized
. A
type that did not implement Move
could not be moved after it had been
referenced.
This solution had some problems. First, the ?Move
bound ended up "infecting"
many different APIs where it wasn't relevant, and introduced a breaking change
in several cases where the API bound changed in a non-backwards compatible way.
In a certain sense, this proposal is a much more narrowly scoped version of
?Move
. With ?Move
, any reference could act as the "Pin" reference does
here. However, because of this flexibility, the negative consequences of having
a type that can't be moved had a much broader impact.
Instead, we require APIs to opt into supporting immovability (a niche case) by
operating with the Pin
type, avoiding "infecting" the basic reference type
with concerns around immovable types.
Another alternative we've considered was to just have the APIs which require
immovability be unsafe
. It would be up to the users of these APIs to review
and guarantee that they never moved the self-referential types. For example,
generator would look like this:
trait Generator {
type Yield;
type Return;
unsafe fn resume(&mut self) -> CoResult<Self::Yield, Self::Return>;
}
This would require no extensions to the standard library, but would place the burden on every user who wants to call resume to guarantee (at the risk of memory unsafety) that their types were not moved, or that they were movable. This seemed like a worse trade off than adding these APIs.
In a previous iteration of this RFC, there was a wrapper type called Anchor
that could "anchor" any smart pointer, and there was a hierarchy of traits
relating to the stability of the referent of different pointer types. This has
been replaced with PinBox
.
The primary benefit of this approach was that it was partially integrated with crates like owning-ref and rental, which also use a hierarchy of stability traits. However, because of differences in the requirements, the traits used by owning-ref et al. ended up being a non-overlapping subset of the traits proposed by this RFC from the traits used by the Anchor type. Merging these into a single hierarchy provided relatively little benefit.
And the only types that implemented all of the necessary traits to be put into
an Anchor before were Box<T>
and Vec<T>
. Because you cannot get mutable
access to the smart pointer (unless the referent implements Unpin
), an
Anchor<Vec<T>>
was not really any different from an Anchor<Box<[T]>>
in the
previous iteration of the RFC. For this reason, replacing Anchor
with
PinBox
and just supporting PinBox<[T]>
reduced the API complexity without
losing any expressiveness.
This API supports pinning !Unpin
types in the heap. However, they can also
be safely held in place in the stack, allowing a safe API for creating a Pin
referencing a stack allocated !Unpin
type.
This API is small, and does not become a part of anyone's public API. For that reason, we'll start by allowing it to grow out of tree, in third party crates, before including it in std. Here a version of the API, for reference purposes:
pub fn pinned(data: T) -> PinTemporary<'a, T> {
PinTemporary { data, _marker: PhantomData }
}
struct PinTemporary<'a, T: 'a> {
data: T,
_marker: PhantomData<&'a &'a mut ()>,
}
impl<'a, T> PinTemporary<'a, T> {
pub fn into_pin(&'a mut self) -> Pin<'a, T> {
unsafe { Pin::new_unchecked(&mut self.data) }
}
}
The Pin
type could instead be a new kind of first-class reference - &'a pin T
. This would have some advantages - it would be trivial to project through
fields, for example, and "stack pinning" would not require an API, it would be
natural. However, it has the downside of adding a new reference type, a very
big language change.
For now, we're happy to stick with the Pin
struct in std, and if this type is
ever added, turn the Pin
type into an alias for the reference type.
Instead of just having Pin
, the type called Pin
could instead be PinMut
,
and we could have a type called Pin
, which is like PinMut
, but only
contains a shared, immutable reference.
Because we've determined that it should be safe to immutably dereference
Pin
/PinMut
, this Pin
type would not provide significant guarantees that a
normal immutable reference does not. If a user needs to pass around references
to data stored pinned, an &Pin
(under the definition of Pin
provided in
this RFC) would suffice. For this reason, the Pin
/PinMut
distinction
introduced extra types and complexity without any impactful benefit.
In addition to the future extensions discussed above, the APIs of the three pin types in std will grow over time as they implement more common conversion traits and so on.
We may also choose to require that Pin
uphold stricter guarantees, requiring
that Unpin
data inside the Pin
not leak unless the memory remains valid for
the remainder of the program lifetime. This would make the stack API documented
above unsound, but might also enable other APIs to make use of these guarantees
to ensure that a destructor always runs if the memory becomes invalid.