impl IntoIterator for arrays #49000
This allows an array to move its values out through iteration. I find this especially handy for `flat_map`, to expand one item into several without having to allocate a `Vec`, like one of the new tests: fn test_iterator_flat_map() { assert!((0..5).flat_map(|i| [2 * i, 2 * i + 1]).eq(0..10)); } Note the array must be moved for the iterator to own it, so you probably don't want this for large `T` or very many items. But for small arrays, it should be faster than bothering with a vector and the heap.
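To make the `flat_map` use case concrete, here is a minimal sketch comparing an array-returning closure with the `Vec`-returning one it would replace. The function names are made up for illustration, and the array version assumes a toolchain where arrays implement `IntoIterator` by value (Rust 1.53 or later).

```rust
// Hypothetical helper names; only `flat_map` and the array/Vec literals
// come from the discussion above.

/// Expands each item into two, using a stack-allocated array per item.
fn expand_with_array() -> Vec<u32> {
    (0..5).flat_map(|i| [2 * i, 2 * i + 1]).collect()
}

/// Same expansion, but every item allocates a temporary `Vec` on the heap.
fn expand_with_vec() -> Vec<u32> {
    (0..5).flat_map(|i| vec![2 * i, 2 * i + 1]).collect()
}
```

Both functions produce the same sequence; the array version just skips the per-item heap allocation.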
It comes up fairly often that users are surprised they can't directly iterate array values. For example, this reddit user didn't understand why the compiler was creating an iterator with reference items: |
@bors try |
impl IntoIterator for arrays
This allows an array to move its values out through iteration.
This was attempted once before in #32871, but closed because the `IntoIter<T, [T; $N]>` type is not something we would want to stabilize. However, RFC 2000's const generics (#44580) are now on the horizon, so we can plan on changing this to `IntoIter<T, const N: usize>` before stabilization.
Adding the `impl IntoIterator` now will allow folks to go ahead and iterate arrays in stable code. They just won't be able to name the `array::IntoIter` type or use its inherent `as_slice`/`as_mut_slice` methods until they've stabilized.
Quite a few iterator examples were already using `.into_iter()` on arrays, getting auto-deref'ed to the slice iterator. These were easily fixed by calling `.iter()` instead, but it shows that this might cause a lot of breaking changes in the wild, and we'll need a crater run to evaluate this. Outside of examples, there was only one instance of in-tree code that had a problem.
Fixes #25725.
r? @alexcrichton
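The difference between the two kinds of iteration can be sketched without relying on how `.into_iter()` method syntax resolves. The helper names below are invented for illustration, and `by_value` assumes a toolchain where the by-value impl exists (Rust 1.53 or later).

```rust
// Hypothetical helpers; only `.iter()` and `IntoIterator` come from the
// description above.

/// Borrowing iteration: items are references into the array.
fn by_reference(a: &[i32; 3]) -> Vec<&i32> {
    a.iter().collect()
}

/// Owning iteration: the array is moved and its values are yielded directly.
/// The fully qualified call does not depend on method-call auto-ref rules.
fn by_value(a: [String; 2]) -> Vec<String> {
    IntoIterator::into_iter(a).collect()
}
```

Code that wants references keeps working unchanged if it calls `.iter()`, which is the fix the description applies to the affected examples.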
@@ -210,6 +214,21 @@ macro_rules! array_impls {
}
}
#[unstable(feature = "array_into_iter", issue = "0")]
Trait implementations are still insta-stable, right?
Oh, yeah. I'll need to update the truly unstable items with a tracking issue (if this PR is accepted), so I'll change the impls to stable then.
Even if the `IntoIter` type is left as unstable, it will still be reachable through `<<[u8; 0]> as IntoIter>::Iter>`, right?
True, you can write something like:
fn my_iter() -> <[u8; 0] as IntoIterator>::IntoIter {
[].into_iter()
}
I think that's OK. We could change the associated type entirely without breaking such code.
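A slightly fuller sketch of the same idea: the iterator type is reached only through the `IntoIterator` associated type, so this code never names `array::IntoIter`. The alias name `ByteIter` and the array length are made up for illustration, and the by-value impl is assumed to be available (Rust 1.53 or later).

```rust
// Hypothetical alias; the concrete type behind it is whatever the
// `IntoIterator` impl for `[u8; 3]` chooses.
type ByteIter = <[u8; 3] as IntoIterator>::IntoIter;

/// Returns the owning iterator without spelling out its real name.
fn my_iter() -> ByteIter {
    IntoIterator::into_iter([1u8, 2, 3])
}
```

Because the signature depends only on the associated type, the concrete iterator type could change later without breaking this function.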
☀️ Test successful - status-travis |
I'm curious whether the discussion in postponed rust-lang/rfcs#2185 applies to this as well. |
I was just following up on my older PR. I wasn't aware of that RFC, but it would apply, yes. If nothing else, I guess we'll have this PR as a proof of concept, that the implementation is feasible. I don't agree with the conclusion though. I think this lack is a usability wart that needs fixing, and it is easily fixed. The |
Do you mind if I incorporate some of this into the |
@novacrazy I don't mind at all - go for it! |
#[inline]
fn next(&mut self) -> Option<T> {
    if self.index < self.index_back {
        let p = &self.array.as_slice()[self.index];
Should these indexes use `ptr::offset` so as not to acquire `&[T]` to a partially uninitialized `[T]`? The same applies down in `next_back`.
Is an uninitialized slice really a problem if we never read from those parts? I would think it's fine, otherwise sequences like `mem::uninitialized()` + `ptr::write()` would be a problem too.
It's certainly problematic under the types-as-contracts view. AFAICT the unsafe code guidelines aren't solid enough to really give a hard answer yet, but it would certainly be better to not create invalid slices. Incidentally, yes, `mem::uninitialized` is problematic :)
If that's the case, how can I even get the pointer base to offset from? `Deref for ManuallyDrop` will create a momentary reference here too. Or even with a local union type, getting the pointer to the field will require forming a reference first.
I'd add an unstable `ManuallyDrop::as_ptr` and mark `ManuallyDrop` as `repr(transparent)`. This will tell you that `*const ManuallyDrop<T>` points to the same place as `*const T` would.
At the moment, `repr(transparent)` is only allowed on a `struct`. The RFC left unions as future work: rust-lang/rfcs#1758 (comment)
I don't think you need repr(transparent) for this, since this is just a question of memory layout, not calling convention. repr(C) would suffice, or inside libstd we could even simply rely on what rustc currently does (which should place the union contents at the same address as the union).
@rkruppe I've been thinking more about this, and if my conclusion in #49056 (comment) is correct, then this PR would be fine with the current "invalid slice" usage, since the values left in the slice would still be valid values of type T, just ones in the "dropped" typestate.
FWIW, here's a comparison using pointers to avoid possibly dropped/uninit refs:
cuviper/rust@array-intoiter...cuviper:array-intoiter-ptr
I can submit the `ManuallyDrop` additions in a separate PR, if you think it's worth it.
> since the values left in the slice would still be valid values of type T, just ones in the "dropped" typestate.
The `Clone` implementation will create truly uninitialized entries too. IMO the unsafe rules should not declare such references invalid if they're not read, but we can deal with it in pointers if necessary.
Yes, that clone implementation is certainly UB due to the uninitialized entries, but that's true of basically every use of `mem::uninitialized`, so we'll have to go clean them all up later no matter what once the uninit RFC lands.
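For readers following the "don't form references to moved-out values" concern, here is one way it can be sidestepped today. This is a sketch under assumptions, not the code in this PR and not the `ManuallyDrop::as_ptr` proposal: it swaps in `MaybeUninit` slots and const generics, both of which postdate this thread. The type name `OwnedArrayIter` is hypothetical; the `index`/`index_back` field names mirror the diff above.

```rust
use std::mem::MaybeUninit;
use std::ptr;

/// Hypothetical by-value array iterator (illustration only, not this PR's type).
pub struct OwnedArrayIter<T, const N: usize> {
    // Only slots[index..index_back] hold live values; the rest were moved out.
    slots: [MaybeUninit<T>; N],
    index: usize,
    index_back: usize,
}

impl<T, const N: usize> OwnedArrayIter<T, N> {
    pub fn new(array: [T; N]) -> Self {
        OwnedArrayIter {
            // Move every element into its own `MaybeUninit` slot.
            slots: array.map(MaybeUninit::new),
            index: 0,
            index_back: N,
        }
    }
}

impl<T, const N: usize> Iterator for OwnedArrayIter<T, N> {
    type Item = T;

    fn next(&mut self) -> Option<T> {
        if self.index < self.index_back {
            let i = self.index;
            self.index += 1;
            // SAFETY: slot `i` was live and has just left the live range,
            // so its value is read out exactly once.
            Some(unsafe { self.slots[i].as_ptr().read() })
        } else {
            None
        }
    }
}

impl<T, const N: usize> DoubleEndedIterator for OwnedArrayIter<T, N> {
    fn next_back(&mut self) -> Option<T> {
        if self.index < self.index_back {
            self.index_back -= 1;
            let i = self.index_back;
            // SAFETY: same argument as in `next`, from the other end.
            Some(unsafe { self.slots[i].as_ptr().read() })
        } else {
            None
        }
    }
}

impl<T, const N: usize> Drop for OwnedArrayIter<T, N> {
    fn drop(&mut self) {
        for slot in &mut self.slots[self.index..self.index_back] {
            // SAFETY: slots inside the live range are still initialized.
            unsafe { ptr::drop_in_place(slot.as_mut_ptr()) };
        }
    }
}
```

The only references formed here are to `MaybeUninit<T>` slots, which are allowed to be uninitialized, so the question of whether `&[T]` over moved-out elements is valid never comes up.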
#[inline]
#[unstable(feature = "array_into_iter", issue = "0")]
pub fn as_mut_slice(&mut self) -> &mut [T] {
    &mut self.array.as_mut_slice()[self.index..self.index_back]
Same question here-- I think this implementation should use `ptr::offset` and `from_raw_parts` so as not to acquire a reference to a partially uninitialized slice.
Same question as above, how can I get a base pointer at all without forming some reference?
As above-- if you make `ManuallyDrop` `repr(transparent)`, then you know that pointers to it will point directly to its element.
Genuine question: considering how |
@clarcharr The Iterator may be moved, which changes the address. |
Ping from triage @alexcrichton! This PR needs your review. |
// Only values in array[index..index_back] are alive at any given time.
// Values from array[..index] and array[index_back..] are already moved/dropped.
array: ManuallyDrop<A>,
index: usize,
One thing I've personally enjoyed doing historically is storing these two indices as `Range<usize>`, as it helps make implementations for the iterator methods easier.
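A rough sketch of what the `Range<usize>` suggestion buys: the range already knows how to advance from either end and report its length, so the iterator methods mostly delegate to it. To keep the example free of `unsafe`, it is restricted to `Copy` elements; the type name `CopyArrayIter` and the field name `alive` are hypothetical, and const generics are assumed.

```rust
use std::ops::Range;

/// Hypothetical iterator over a `Copy` array, tracking live indices in a range.
pub struct CopyArrayIter<T: Copy, const N: usize> {
    array: [T; N],
    // Indices that have not been yielded yet.
    alive: Range<usize>,
}

impl<T: Copy, const N: usize> CopyArrayIter<T, N> {
    pub fn new(array: [T; N]) -> Self {
        CopyArrayIter { array, alive: 0..N }
    }

    /// The elements that have not been yielded yet.
    pub fn as_slice(&self) -> &[T] {
        &self.array[self.alive.clone()]
    }
}

impl<T: Copy, const N: usize> Iterator for CopyArrayIter<T, N> {
    type Item = T;

    fn next(&mut self) -> Option<T> {
        // The range hands out the next front index, or `None` when empty.
        self.alive.next().map(|i| self.array[i])
    }

    fn size_hint(&self) -> (usize, Option<usize>) {
        let remaining = self.alive.len();
        (remaining, Some(remaining))
    }
}

impl<T: Copy, const N: usize> DoubleEndedIterator for CopyArrayIter<T, N> {
    fn next_back(&mut self) -> Option<T> {
        // Same delegation, from the back of the range.
        self.alive.next_back().map(|i| self.array[i])
    }
}

impl<T: Copy, const N: usize> ExactSizeIterator for CopyArrayIter<T, N> {}
```

Compared with two separate `usize` fields, there is no hand-written bounds comparison or increment to get wrong.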
@@ -1242,7 +1242,7 @@ pub trait Iterator {
/// // let's try that again
/// let a = [1, 2, 3];
///
/// let mut iter = a.into_iter();
Oh dear, these changes mean that this is technically a breaking change, right?
If I understand correctly what's going on, this "breaking change" is no different from adding a new method to `Iterator`: if e.g. itertools already implements that method, doing so would break all users of itertools who called it with method syntax instead of UFCS.
Correct yeah. This doesn't mean it's impossible to land, just that it needs more scrutiny/review.
This does feel like one of those changes that, while (allowed) breaking, people have wanted for a very long time and we definitely want to do at some point.
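The itertools analogy in this thread can be illustrated with a toy case. Everything below (the `Meters` type and the `Describe`/`Summarize` traits) is invented for illustration; the point is only that method-call syntax becomes ambiguous when two traits in scope offer the same method name, while fully qualified calls keep working.

```rust
// Hypothetical type and traits, standing in for "a std trait gains a method
// that an extension trait already provides".
pub struct Meters(pub f64);

pub trait Describe {
    fn describe(&self) -> String;
}

pub trait Summarize {
    fn describe(&self) -> String;
}

impl Describe for Meters {
    fn describe(&self) -> String {
        format!("{} m", self.0)
    }
}

impl Summarize for Meters {
    fn describe(&self) -> String {
        format!("length: {}", self.0)
    }
}

pub fn both(m: &Meters) -> (String, String) {
    // `m.describe()` would be rejected here as ambiguous, because both traits
    // are in scope. Naming the trait explicitly resolves it.
    (Describe::describe(m), Summarize::describe(m))
}
```

Callers who already used the fully qualified form are unaffected by the second trait appearing, which is why this kind of breakage is usually treated as minor.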
cc @rust-lang/libs, it looks like this may cause compiling code to stop compiling but fall under the umbrella of "accepted type inference breakage". Along those lines I wanted to cc others on this (especially as everything added is unstable) to see if anyone's got concerns. I'd personally prefer to hold off on stabilizing these for now until we can get the true |
If we hold off we should at least create a future compat lint that suggests iter over into_iter, so the breakage will be less severe |
I think we should merge this, but not stabilize the types until they have the correct signature. However I think you'll be able to use them on stable (without naming the type) because the I'd also like to see a crater run, but unless it's really bad I'm not that perturbed because it feels like you're walking into breakage by calling |
It's terrifying enough that we already had code subject to breakage in our own code base. If trait impls are insta-stable (that is, they can't act as if they don't exist when the feature is disabled), we cannot really just merge this.
We could use inherent methods to have the iterator without the Could we make the I'm game for adding a compat/deprecation warning, but I'm not sure how to do this, at least not at the library level. I guess it would have to be checked as a special case within the compiler? |
Shadowing existing I’m much more uncomfortable with merging this if the feature is effectively insta-stable while we have definite plans to change the "shape" of the return types later, even if these types can’t be named directly yet. |
cc @rust-lang/infra, can this get a crater run? @bors: try |
It isn't possible today. I've opened rust-lang/crater#388 to track this. cc @pietroalbini. |
I think this lint has a record of triggering on std (vague memory), and I don't think it's possible to apply this patch without some isolation (however, feature gates do not apply to traits, and an edition gate cannot be used for this purpose either), given that this lint has triggered on at least one project I contribute to.
I think this will technically qualify as a minor breaking change -- the new trait implementation could break some existing code, but is easily worked around. The same is true of many library additions. But crater can help judge how pervasive this is, so we can decide if it's too much headache. A fallback position is that we forgo |
Perhaps we could still add an |
No, it's precisely the method call syntax that gets us into trouble, where auto-deref currently resolves Direct uses of |
Hmm, we could add an inherent array |
To expand on that a little, why not even mark the inherent In hindsight, it might be good to recommend that any collection-like types implementing Deref to another collection (e.g., slices, vec) should always "reserve" their |
This feels like one of those things we just need to do, even if it's a bit painful. There are so many places just calling out for this, like being able to do |
What about doing something like this?
Explicitly calling |
@xfix Yes, that's what @Mark-Simulacrum and I are talking about in our most recent comments. I guess we would need a new
|
It is now possible to start Crater runs with |
I don't think I have permissions for this, but I can give it a shot and maybe someone else will pick it up for me when it fails ;) @craterbot run name=clippy-test-run mode=clippy start=stable end=stable+rustflags=-Dclippy::into_iter_on_array |
👌 Experiment ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem. Learn more |
🚧 Experiment ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem. Learn more |
🎉 Experiment
|
Tons of failures with:
|
It seems passing clippy lints via |
We discussed the inherent-method trick at all-hands, but discovered that it doesn't actually work on bare values. See in this playground that the final bare call does not hit the inherent method. Method resolution will choose our new |
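The lookup order behind this finding can be reproduced with a stand-in type. This is a sketch as I understand method resolution: by-value receivers are probed before auto-ref'd ones, so a by-value trait method wins over an inherent `&self` method on a bare value. The `Arr` type, `Consume` trait, and `take` method are all hypothetical names, not anything from the PR.

```rust
// Hypothetical stand-in for an array type.
pub struct Arr;

// Plays the role of the new by-value `IntoIterator` impl.
pub trait Consume {
    fn take(self) -> &'static str;
}

impl Consume for Arr {
    fn take(self) -> &'static str {
        "trait, by value"
    }
}

// Plays the role of the proposed inherent method taking `&self`.
impl Arr {
    pub fn take(&self) -> &'static str {
        "inherent, by reference"
    }
}

/// Bare value receiver: the by-value step is tried first, and only the
/// trait method has a by-value receiver.
pub fn bare_call() -> &'static str {
    let a = Arr;
    a.take()
}

/// Explicit reference receiver: the inherent `&self` method matches directly.
pub fn ref_call() -> &'static str {
    let a = Arr;
    (&a).take()
}
```

So an inherent `&self` method can only intercept calls made through an existing reference, not calls on the value itself, which is the case the trick was meant to cover.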