-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Scoped impl Trait for Type
#3634
Conversation
- Defined rules for the observed `TypeId` of generic type parameters' opaque types, rewrote sections on type identity and layout-compatibility - Added section on sealed traits - Corrected the use of 'blanket implementation' vs. 'generic implementation' vs. 'implementation on a generic type' - Sketched two more warnings and one more error - Added a section Behaviour change/Warning: `TypeId` of generic discretised using generic type parameters - Removed the Trait binding site section but kept Coercion to trait objects and slightly expanded it - Added section Unexpected behaviour of `TypeId::of::<Self>()` in implementations on generics in the consumer-side presence of scoped implementations and `transmute` - Near-completely rewrote the Rationale and alternatives section with subheadings and before/after-style examples, added more positive effects of the feature - Rewrote Alternatives section - Removed some Unresolved questions that are now tentatively resolved - Added top-level syntax and a field example to Explicit binding, elaborated a bit more - Added Future possibilities: - Conversions where a generic only cares about specific bounds' consistency - Scoped bounds as contextual alternative to sealed traits - Glue crate suggestions - Various small fixes, adjustments and clarifications
- Added a list of bullet points to the Summary and revised it slightly - Coined the term *implementation environment* to refer to the set of all implementations applicable (to a given type) in a given place - Near-completely rewrote the Logical consistency subsection to add subheadings and examples - Small fixes and adjustments
- Distinguish between implementation-aware generics and implementation-invariant generics - Minor edits and fixes
- Added section "(Pending changes to this draft)". - Revised "`TypeId` of generic type parameters' opaque types" as tuples are implementation-invariant. - Replaced section "Contextual monomorphisation of generic implementations and generic functions" with sections "Binding choice by implementations' bounds" and "Binding and generics". - Added sections "Marking a generic as implementation-invariant is a breaking change" and "Efficient compilation". - Renamed future possibilities section "Scoped bounds as contextual alternative to sealed traits" to "Sealed trait bounds". - Added future possibilities section "Reusable limited-access APIs". - a range of smaller adjustments to wording and formatting
- Implemented the former Future Possibilities section "Explicit binding" into the main text as "inline implementation environments", mainly in form of grammar extensions. - Specified that call expressions capture the implementation environment of their function operand, acting as host for implementation-invariant generics there. - Miscellaneous wording clarifications and fixes. (Turns out you can't call a tuple struct's implicit constructor through a type alias, whoops.)
- Added more guide-level documentation changes - Reordered warnings and errors to the end of the reference-level explanation - Some small adustments to wording and formatting
I wasn't sure what to use as "Start Date", so I used today's date. The main question here is probably whether the feature is easy enough to understand and work with for developers. As far as possible, I tried to do this by not adding any original algorithms and by making sure that the most simple/concise implementation is most likely the ideal one in each situation, or at least can always be refactored without breaing changes, but especially the split between implementation-aware and implementation-invariant generics not quite along which ones are built-ins with special syntax adds a bit of complexity for better ergonomics. Some of the changes are modular, for example the syntax for using a scoped implementation directly, without bringing it into scope, could be added separately from that for importing it into and later reexporting it from another scope, and referring to global implementations in place of scoped ones could be another separate addition, but anything related to type identity would have to be implemented first in order for the feature to be backwards-compatible and not require an edition change. |
This RFC covers the use-cases of the following postponed and draft RFCs:
It would likely help with the following RFC by providing a mechanism for access control: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some comments for clarification and some questions.
This also affects trait-bounded generic type parameters: If `T` is bounded on `Hash`, then | ||
`TypeId::of::<T>()` results in distinct `TypeId`s in that context depending on the | ||
captured implementation. | ||
|
||
However, note that `TypeId::of::<T>()` and `TypeId::of::<T as Hash in module>()` are | ||
always equivalent for one definition of `T`, as `TypeId::of`'s implementation does **not** | ||
have a `T: Hash` bound! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if this was cleared up elsewhere, but as someone looking at this documentation the first time I'm confused by this explanation. On its face it seems like it's saying that Type
and Type as Trait in module
are both the same type and different types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Type as Trait in module
is technically not a type at all, as implementations never "stick" to the outside of a type.
It's a type argument consisting of a type and an inline implementation environment.
This combination, along with any scoped implementations inherited from the surrounding scope, is "baked" into the effective (opaque) type of the generic type parameter as part of monomorphisation.
These opaque types are special-cased in terms of how the baked implementation environment influences the observed TypeId
:
-
If the type parameter is wrapped in an implementation-aware generic (even indirectly), e.g. as
TypeId::of::<HashSet<T>>()
, then theTypeId
identifies the combination of the discrete type given as part of the type argument and the full set of baked implementations.
(Normally this would happen because the e.g.HashSet<T>
is used as type argument and then has itsTypeId
taken inside another layer of generic function call.)This is necessary to ensure existing
unsafe
code doesn't accidentally transmute hashtables in violation of their coherence expectations, i.e. part of version 1 of "the hashtable problem".(Writing it out explicitly like above, without indirection through another generic call, also produces the Warning:
TypeId
of implementation-aware generic discretised using generic type parameters, since this is almost never what you actually want to observe when making type-keyed collections.) -
In all other cases, the observed
TypeId
identifies the combination of the type given as type argument and any bounds-relevant captured implementations.This makes type-keyed and type-erased collections work as expected by consumer code, without over-distinguishing entries based on scoped implementations that are irrelevant to their function (which is what I call 'version 2 of "the hashtable problem"').
You can find more on the reasoning here in the section on Logical consistency of type-erased collections.
Please let me know if this works as clarification.
I should add a cross-reference to the Logical consistency explanations here in any case, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is my understanding:
- When a concrete type is used as a type argument, the observed type-identity of the corresponding type parameter in the generic item changes depending on the where-bounds present on the type parameter.
This seems strange, to say the least. As an example:
trait TypeContainer {
type Type;
}
struct Generic<T>(T) /* where T: Marker */;
impl<T> TypeContainer for Generic<T> {
type Type= T;
}
trait Marker {}
mod one {
use impl super::Marker for () {}
pub type Alias = super::Generic<()>;
}
mod two {
use impl super::Marker for () {}
pub type Alias = super::Generic<()>;
}
type X = ();
type Y = <one::Alias as TypeContainer>::Type;
type Z = <two::Alias as TypeContainer>::Type;
// or, equivalently
// type Y = <Generic<() as Marker in one> as TypeContainer>::Type;
// type Z = <Generic<() as Marker in two> as TypeContainer>::Type;
Is it then the case that X, Y, and Z are the same if there is no T: Marker
bound, but distinct if there is one (despite all three types being known to be ()
)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or is it the case that; no, all three types are known to be identical in concrete code, but generic code defining these three types are not only not known to be identical, but are in fact known to be distinct (as witnessed by runtime type comparison).
If it is intended that the same type can have different type IDs depending on the surrounding code, have you made sure this does not cause a regression of 97156 (perhaps by allowing generic code similar to qcell
to create two unique tokens for the same underlying type)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or is it the case that; no, all three types are known to be identical in concrete code, but generic code defining these three types are not only not known to be identical, but are in fact known to be distinct (as witnessed by runtime type comparison).
[…]
This was my intention, but I failed to consider that generic type parameters can flow back into concrete code transparently like this. I'll have to go over this (and the issue) carefully and see how my ideas hold up and where I need to specify more thoroughly or change something.
(Tentatively, I think that the implementations can/should 'fall off' in concrete code so that X
, Y
, Z
and ()
are statically known to be identical and would/could¹ yield identical TypeIds
, but that this would not regress the issue because the different concrete Generic<…>
s still are not at all assignable to each other (regardless of whether the bound is present or not).)
I can't look into this properly quite right now unfortunately, but hopefully I'll manage to do it at some point this week.
¹ I have to read the issue a bit more thoroughly to be sure what the current status is on this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tentatively, I think that the implementations can/should 'fall off' in concrete code so that X, Y, Z and () are statically known to be identical and would/could¹ yield identical TypeIds
In effect, implicitly making the type of a generic type parameters a sort of Wrapper<T>
with partial implementation-awareness.
On a related note, I wonder if you can automate implementation-awareness based on impl
blocks (e.g. HashMap
relies on K: Eq + Hash
in one of its impl blocks, so functions from that impl block carry implementation-awareness, rather than the type itself?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tentatively, I think that the implementations can/should 'fall off' in concrete code so that X, Y, Z and () are statically known to be identical and would/could¹ yield identical TypeIds
In effect, implicitly making the type of a generic type parameters a sort of
Wrapper<T>
with partial implementation-awareness.
In a way, yes. I'll think about whether I can explain it that way to some extent, though it's possible it's too not-quite-the-same (for me) to fit it into the text well.
On a related note, I wonder if you can automate implementation-awareness based on
impl
blocks (e.g.HashMap
relies onK: Eq + Hash
in one of its impl blocks, so functions from that impl block carry implementation-awareness, rather than the type itself?
Implementation-awareness must be shared between distinct impl blocks.
For example, the std
's HashMap
has inherent, Extend
, From
, FromIterator
, Index
, PartialEq
and Eq
impl
s that expect K: Hash
to be coherent. The category likely also should include external implementations in at least some cases, which for simplicity would mean all cases.
Having that information "flow backwards" from impl
s to overall type
s' identities then would introduce a lot of tricky behaviour impacts at a distance, and most likely wouldn't be compatible with early code generation for concrete code.
|
||
See also [scoped-implementation-of-external-sealed-trait]. | ||
|
||
## Type parameters capture their *implementation environment* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The sections talking about interaction of scoped implementations with generic code rely on precise terminology that is not defined and not universal within the Rust programming language community, So, I think there should be a section defining terms relating to generics as used in this RFC. I recommend using the following terminology:
Type parameter: a name that acts as a placeholder for a type, introduced by generic type and trait definitions (enum Option<T>
), type alias definitions and generic associated type declarations (type Result<T> = ...;
), generic function declarations (fn collect<T>(...) -> ...
), generic inherent and trait implementations (impl<T> ...
), and (with this RFC) imports of generic scoped implementations; collectively "generic items". The Self
type is a type variable within a trait definition.
Type argument: a type (which can be a concrete type, type parameter, or partially-applied generic type) that takes the place of a type parameter in a use of a generic item. Examples include the i32
in let x: Vec<i32>
, and the [T]
in impl<T> Default for
Box<[T]>`.
Concrete type: a type that contains no type variables.
Using this terminology, type arguments can have implementation environments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just edited this section (and some related wording that I caught) a bit to effectively use this terminology in ef35245 ¹, but adding a section that defines the terms formally is a great idea.
I used 'concrete' and (mainly) 'discrete' interchangeably in this RFC so far, but it's also a good idea to be consistent about that. (I personally would tend to use 'discrete' for types and 'concrete' for values, but that's purely based on vibes so I'll do a bit of research tomorrow to see if there's any precedent in that regard.)
(I should be able to add the definition properly in the next one or two days, depending on how busy I am.)
¹ I'm not sure it was you who pointed this out on Discord. If not: Thank you for the review! If yes: Thanks again!
## Type identity of discrete types | ||
[type-identity-of-discrete-types]: #type-identity-of-discrete-types | ||
|
||
The type identity and `TypeId::of::<…>()` of discrete types, including discretised generics, are not affected by scoped implementations *on* them. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is basically "The identity of a concrete type, including a fully-instantiated generic type, is not affected by any scoped implementations for that type." Perhaps an example would do here as well, e.g. the identity of HasSet<String>
is not affected by any scoped implementations for HashSet<String>
.
`where`-clauses without generics or `Self` type, like `where (): Debug`, **do not** affect binding of implementations within an `impl` or `fn`, as the non-type-parameter-type `()` is unable to receive an *implementation environment* from the discretisation site. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this mean that a scoped implementation of use impl Debug for () { ... }
does not affect the behavior of the function? That is, trivial bounds are still trivial with this RFC, because implementation environments are only carried by type arguments. This could be unexpected behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, exactly that. (More precisely, such a scoped implementation at the caller does not affect the behaviour of the function, but at its definition it does, and the trivial bound is satisfied according to the function definition's implementation environment too.)
I think that, in this case, it's overall better to keep monomorphisation in check and not turn discrete-as-written implementations into monomorphised ones. Rustdoc does display trivial bounds though 🤔
(Another practical issue is inability to write implementation environments inline for anything that's not a generic type argument… but that could just be a case of me personally coming up blank if I think about the syntax for it.)
Changing this doesn't sit well with me because it seems unintuitive (it would make where
-clauses a form of first-class implicit-only parameter) and there may be macros that emit spurious trivial bounds. It would also break that trivial bounds can be used as assertions, at least if allowed without global implementation… though the static_assertions
crate notably does not currently do this.
In general, I'm very hesitant to add anything that reinterprets currently valid Rust code, even if I'm pretty sure this wouldn't be a compilation-breaking change. Maybe it would be a better idea to hide trivial bounds from the docs instead, going for consistency in the other direction?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense to me, but you might consider explicitly calling out that trivial bounds are not affected by scoped implementations.
The following functions do not compile, as the implicitly returned `()` is not stated *inside* the scope where the implementation is available: | ||
|
||
```rust | ||
trait Trait {} | ||
|
||
fn function() -> impl Trait { | ||
^^^^^^^^^^ | ||
use impl Trait for () {} | ||
--------------------- | ||
|
||
// Cannot bind on implicit `()` returned by function body without trailing *Expression*. | ||
} | ||
|
||
fn function2() -> impl Trait { | ||
^^^^^^^^^^ | ||
use impl Trait for () {} | ||
--------------------- | ||
|
||
return; // Cannot bind on `return` without expression. | ||
------- | ||
} | ||
``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why would the implicit return not pick up scoped implementations?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In theory it could be made to, but I (subectively) think that would be hard to follow, since for example other items nested in function bodies aren't available to their signature (if I'm not completely mistaken).
The function body block isn't visually located inside itself, so to me it appears one or a half level of scoping higher than a scoped impl Trait for Type
placed inside it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like too subtle of a distinction to make. I don't think most rust programmers conceptualize the implicit return value to be any different than an explicit return value. While, yes, there are already subtle differences with regards to implicit drop and lifetimes, I don't think this RFC should add to those pain points by distinguishing between implicit return, return with expression, and return without expression.
However, I admit that the surface area of this particular change is rather small, as it only affects functions that return ()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have a strong preference in this regard. return
without expression at least could definitely be defined to pick up the environment on the return
token instead without it feeling odd in the long run, so I'll change that here when I can.
If I had totally free reign, I would probably put off the matter with the fully implicit return until the very last minute and then run a survey, but I don't know if that's a thing 😅
It's really hard for me to tell whether reaching the end of the function body should observe the implementation environment just outside or just inside the function, though. From a purely practical/convenience point of view, I think I would say 'inside' too, and I now suspect it may also be closer to developer intent in mixed scenarios, so I'll change it to that.
(Sorry, today I'm having some trouble with the wider reason it took me so long to submit this in the first place, so I'll be a bit slow to implement changes and respond to the larger points you brought up.)
```rust | ||
struct Type; | ||
struct Generic<T>(T); | ||
trait Trait {} | ||
|
||
mod a { | ||
use super::{Type, Generic, Trait}; | ||
pub use impl Trait for Type {} | ||
pub type Alias = Generic<T>; | ||
} | ||
|
||
mod b { | ||
use super::{Type, Generic, Trait}; | ||
pub use impl Trait for Type {} | ||
pub type Alias = Generic<T>; | ||
} | ||
|
||
use impl Trait for a::Alias {} | ||
use impl Trait for b::Alias {} | ||
``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to be able to have distinct impl Trait for Generic<Type as X in one>
and impl Trait for Generic<Type as X in two>
? This seems unexpected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's required for backwards-compatibility, or at least having a distinct impl Trait for Generic<Type>
that isn't available for Generic<Type as X in one>
or Generic<Type as X in two>
is. Existing code can make safety-critical assumptions about the exact behaviour of the former even if X
is not sealed.
(Forbidding implementations for concrete types that contain captured scoped implementations outright is technically also possible, but I don't think that would helpful.)
Fitting these distinct implementation targets into Rustdoc could be interesting, though, just in terms of information arrangement.
…*" to "## Type arguments capture their *implementation environment*" and clarified the role of type arguments vs. type parameters This was pointed out in the Rust Programming Language Community Discord.
Fixes https://github.com/rust-lang/rfcs/pull/3634/files#r1598697056 . An inline implementation environment can only refer to implementations that it can actually see.
…riority" with more and more precise information
(In fairness I should say that a few of those 🚀 reactions on the OP are from friends who have been following along a bit and seem excited about this. I did not ask them to add those, didn't even give them a link directly to this PR.) |
I think the consequences for type identity should be discussed more.
|
Patched up now in 4d74669, along with cases where the bounds are attached to a concrete type… though that's not enough yet, because even if the trait isn't sealed, implementations on concrete types effectively are. I'll revise it again in a moment. (Beyond Seconding what @oskgo wrote, Relaxing the Orphan Rule is definitely less complicated. |
That's definitely not pretty, even if the cases it forbids should be relatively rare for the most part. (I guess indirecting the bound like that could be one very hacky form of access control? |
…right nevermind, there's more.
can check the |
It's not enough to handle traits that are explicitly sealed. What about the following case: mod M1 {
pub struct Foo;
pub trait Tr1 {
const ID1: u32;
}
impl Tr1 for Foo {
const ID1: u32 = 0;
}
}
pub struct Proof<T: ?Sized>{
_ph: core::marker::PhantomData<T>
}
pub trait Tr2 {
fn prove() -> Proof<Self>;
const ID2: u32;
}
impl Tr2 for M1::Foo {
fn prove() -> Proof<Self> {Proof{_ph: core::marker::PhantomData}}
const ID2: u32 = 0;
}
pub fn explode<T: Tr2 + M1::Tr1>() {
T::prove();
if T::ID1 != T::ID2 {
unsafe {core::hint::unreachable_unchecked()}
}
} Many variations of this can be made. The only essential part is that it should be impossible to construct an instance of The list of forbidden implementation combinations is starting to look quite complex now. To avoid errors I think that an explanation for why these are necessary, and more importantly why they are sufficient should be present somewhere. Personally I have no idea what an explanation for why the list is sufficient would even look like though. EDIT: Also, I'm not sure how this RFC interacts with future changes to the language that might expand on the situations where we can observe or rely on the consistency of trait implementations across different scopes. How would this RFC constrain future language design? |
Your example isn't unsound under this RFC with the latest corrections. It falls under the As for implementation consistency: Outside of these combinations with a separate proof type, the inconsistency isn't observable. The effects on the runtime type identity of generic type parameters ensure that generic code safely perceives the types as distinct. In some cases this can lead to an unexpected mismatch, but shouldn't in any way that causes unsoundness with current crates1. Footnotes
|
use std::{any::TypeId, collections::HashSet, sync::Mutex};
pub struct Token<T>(PhantomData<T>);
static STORAGE: Mutex<HashSet<TypeId>> = Mutex::default();
pub fn insert<T>() -> Token<T> {
STORAGE.lock().unwrap().insert(TypeId::of::<T>());
Token(PhantomData)
}
pub fn frob<T: Frob>(token: &Token<T>) {
unsafe { STORAGE.lock().unwrap().get(&TypeId::of::<T>()).unwrap_unchecked(); }
} is the pattern of currently legal code that would become unsound here since the caller can You can create a non-static version of this by attaching an invariant must-match lifetime to collection and token type, then exposing collection instances only one at a time with a transient lifetime to a callback. Edit: Would actually be unsound without the transmute, yes. |
@Tamschi It's not sealed, at least not in the conventional sense. I've changed the example to add Obviously this is extremely contrived, but if you're going to change what code is considered sound then you should provide guidance on which rules unsafe code should follow after the implementation of this RFC, and the new rules should be consistent with existing practice. People using unsafe Rust are already relying on trait implementations they can observe to stay consistent across different scopes in various ways. That's an implicit assumption in current Rust. It might be possible to allow them to keep this rule by adding enough exceptions to when scoped impls can be made and just making sure that counterexamples are unobservable, but then the language needs to make sure they stay unobservable as it evolves, we need to make sure that all counterexamples (that exist, not just that I can come up with) are dealt with and we need to deal with counterexamples coming from future language changes or block those changes. |
I don't think the example is all that contrived, the pattern seems quite common. |
Maybe my notation is a bit imprecise here. The idea is that the contents of |
Oh, I see. In that case, the crate that publishes |
It's sound under current rules because
|
I don't think this is sound.
Source: https://jacko.io/safety_and_soundness.html Consider following.
I just caused UB without any unsafe code. That to me points that For |
@Ygg01 I probably should have defined In case you don't know about the sealed trait pattern, Anyways, that example is outdated since it has been mitigated in the more recent changes to the RFC. It should be easy to check whether a trait is sealed. My newer example is a variant of this that's not so easy to check for. |
Ok. I assume you meant the proof one? It doesn't even have the It still doesn't enforce its bounds. pub struct Fubar;
impl Tr1 for Fubar{
const ID1: u32 = 0;
}
impl Tr2 for Fubar {
fn prove() -> Proof<Self> {Proof{_ph: core::marker::PhantomData}}
const ID2: u32 = 1;
}
fn main() {
explode::<Fubar>(); // UB here, no unsafe, explode is unsound.
} If you write code that depends on two traits having to have their |
@Ygg01 The |
Ahh, now I get it. Sorry, I missed that before. You're right, that does effectively seal the trait and make your example sound. I also found another case where an assumption can be made: use std::mem::MaybeUninit;
pub trait Friend {
fn prepare(&mut self);
}
pub trait HasUnsafe {
/// #Safety
///
/// May only be called after `<Self as Fried>::prepare(&mut self)` has returned at least once.
unsafe fn assume(&self);
}
pub struct Type(MaybeUninit<u8>);
impl Type {
pub fn new() -> Self {
Self(MaybeUninit::uninit())
}
}
impl Friend for Type {
fn prepare(&mut self) {
self.0.write(0);
}
}
impl HasUnsafe for Type {
unsafe fn assume(&self) {
dbg!(unsafe {
// SAFETY:
// According to the safety requirements for this function, `Friend::prepare`
// has been called on this instance, so `self.0` is initialised (which this
// function knows because it's in the same crate as or the behaviour is
// documented on the implementation of `Friend` for `Type`).
self.0.assume_init();
});
}
}
pub fn sound<T: Friend + HasUnsafe>(value: &mut T) {
value.prepare();
unsafe {
// SAFETY: `.prepare` has been called.
value.assume();
}
} With this RFC, ¹ and any variation of the link (i.e. as associated type or type argument), in either direction I just understood that these combination issues are why specialisation requires marking further-specialisable |
As this feels largely academic now, there is a much simpler but still interesting flavor of all this:
Does this solve any problems for which delegation makes solutions hard? |
Unfortunately not, unless the module has knowledge of implementation details of the original mod deimpl Type { // Hide all implementations of all traits for Type
unsafe impl<..> Trait1<..> for Type<..>; // Restore the impl of Trait1 for Type
impl<..> Trait2<..> for Type<..> { ... } // Provide a new impl of Trait2 for Type
} could be sound if it does, if the It's possible to achieve the same effect for this RFC by forbidding directly type-linked bounds fulfillment that combines scoped and global implementations in general and also requiring |
An opt-in to having traits allow scoped impls would work, but opting in would then be a breaking change so it couldn't be used on stdlib traits. Still, one can argue that scoped impls are less likely to be useful for std anyways because pushing crate maintainers to implement stdlib traits isn't as hard as pushing them to implement third party traits. I think that requiring opt-in (together with exportable scoped impls) could be sufficient to allow a second serialization library to start growing support without first needing maintainer buy-in. It's going to be less useful for specialization, but that might be necessary for a sound version of this anyways. There's still the drawback of unsafe Rust users needing to learn that they can't assume that traits opting in will have a unique consistent implementation. It's unclear to me how much of a restriction this is in practice, maybe a future possibility would be to allow generics to prioritize between trait implementations to get back the original behavior when this is needed. |
I think that doing both would work and be a good balance:
For most uses (derives for common serialisation libraries, for example, but also of Since scoping has higher priority than Specialisation, this approach also avoids semver hazards. There would still be friction when using such traits as collection items though, I think, since that would represent a link between e.g. It's still allowed in safe code to inline-specify the implementation environment for a generic type argument so that it captures only global implementations. ¹ I also have to think about how far such types can be published… Though at a glance I think that any code that can statically see that the implementation is scoped would have to make a further bounded call resolving with a global implementation to practically mix them, which would trigger the restriction there, without it being a soundness or breakage issue for any code that exists today, at least. There may still be a problem with If a trait definition opts into allowing alternatives, then:
It would be good to also require the signifier for this on each global implementation of an opted-in trait. This would signal to users of default trait Trait {} // Is this an auto-trait? 🤔
default impl<T> Trait for T {} // This may have alternatives! 🙂 I don't know if there's a keyword that would be a good fit there, currently. Edit (next day): A small addendum: The (only, I believe) reason that this needs to be marked on the trait definition and not just global implementations is that it needs to be consistent across specialisation-differentiated global implementations to not create a semver hazard… more precisely a specific global implementation mustn't deny it when added later than a more general global implementation that does allow global/scoped-mixing. It's possible that this may be a non-issue though, for different reasons in different cases. There are eight cases, depending on whether the special or general implementation came first and whether each implementation (wants to) allow mixing with (type-linked) scoped implementations of other traits. Mitigating factors are where existing global implementations are guaranteed to be visible and limitations to where global implementations may be added. For example, if an existing (broader) global implementation that allows mixing is guaranteed to be visible everywhere a new more specific global implementation may be added, then the semver hazard can be mitigated by warning on specialisations that don't allow mixing when at least one of its respective next-more-general implementations (in terms of specialisation!) does allow mixing. Additionally, if specialising implementations always lie in order in the crate dependency hierarchy (I think that's the case due to the current orphan rule, but please correct me if not), then later allowing mixing for an implementation that may do so is not an unsound or semver-breaking change because mixing would remain disabled anywhere the existing specialisations are in use already. By the same condition, it would be possible for a specialising implementation to allow mixing while its more general counterparts do not. This needs to be checked carefully for all eight cases though (or 12, if you add allowing mixing for an existing global implementation), and right now I'm pretty sure I won't be able to do that before the end of next week.
|
I (finally) had some time to start looking into this just now:
|
Point three is not as complex as I had feared - in the least rigid solution, a specialisable implementation that newly adds final associated items where it didn't have any before must opt into mixing to avoid a breaking change, but that's it. Unfortunately, this is most likely something that can't be linted for statically with enough accuracy for the lint to be useful. It can be detected easily by comparing the API to a previous version of the crate, but I don't think there's precedent for that outside of third party tools like cargo-semver-checks or cargo-public-api. Outside of that, I could only confirm the development-time 'race condition' I mentioned in point two of my previous comment. It can be avoided by declaring 'a newly mixing-enabled specialisable impl that was previously specialisable or absent' a breaking change, but I don't have the guidelines knowledge here to decide that, so I'll defer it into an unresolved question. Everything else seems to work out nicely. The case where a new specific impl shadows an existing mixing-enabled impl is easy to lint for and everything else seems to be either non-breaking or wouldn't compile. Of course this isn't the actual formal write-up, so something else may come up. Unrelatedly, blanket implementations effectively seal a trait in some ways and I still have to look into that properly. Adding such a blanket implementation (that's not specialisable) already is a breaking change, so I think letting that narrowly prevent scoped implementations doesn't cause additional SemVer hazards vs. the status quo. A second opt-in on impls to allow them anyway should be fairly easily possible and unproblematic, though, and specialisable impls without final associated items should always allow scoped implementations since they aren't sealing at all. (Regarding blanket implementations with final associated items… I think the natural action may be to let scoped implementations 'inherit' them? This is just a hunch, but I think this should work out seamlessly without any workarounds.) |
I have to agree with everything @lcnr wrote here -- I am very much in favor of implementing (some form of) scoped impl trait. But we are not in a position to implement it and I don't want to invest energy in the RFC until that time. I think this would be a good thing to propose as a project goal for 2025H1 once that program gets going in October (though I don't know that we'll be in a position to drive it by then, it'd be a good way to have that conversation). Until then, I think keeping this RFC open is misleading (I would like us to be a bit more aggressive on closing RFCs that don't have an active path to go forward). Therefore I'm going to close the RFC but I want to encourage @Tamschi to reach out to me on Zulip and discuss the idea of how to move this forward. |
(I marked this as 'postponed' to indicate that there is a desire for the feature later -- we have sometimes used FCPs for this but I think that's a lot of procedural overhead and am inclined not to require it.) |
This may be a bit long. I thought it would be best to go over the interactions with other language features and tooling as well as potential impacts thoroughly in advance, since while the addition to the language itself can be explained quickly, it does interact with a lot of other Rust features and aspects of the crate ecosystem.
In (very) short this is a version of orphan-rule avoidance (i.e. non-unique trait implementations) that does not suffer from either version of "the hashtable problem", does not obscure imports, can be safely ignored unless needed (but is discoverable), is seamlessly compatible with specialisation and tries to strike a good balance between usability and complexity, so that developers are automatically funneled towards correct and unsurprising but also highly compatible code.
I'm aware that there are often rash proposals in this direction, but I believe I've covered all previously-discussed issues.
As the overall RFC is a fairly dense read, I've added cross-references and less-formal comments in form of blockquotes to it.
Thanks
Rendered