Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracking issue for step_by stabilization #27741

Closed
aturon opened this issue Aug 12, 2015 · 107 comments
Closed

Tracking issue for step_by stabilization #27741

aturon opened this issue Aug 12, 2015 · 107 comments
Labels
B-unstable Blocker: Implemented in the nightly compiler and unstable. C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. E-help-wanted Call for participation: Help is requested to fix this issue. E-mentor Call for participation: This issue has a mentor. Use #t-compiler/help on Zulip for discussion. finished-final-comment-period The final comment period is finished for this PR / Issue. P-medium Medium priority T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@aturon
Copy link
Member

aturon commented Aug 12, 2015

Update (@SimonSapin): this is now the tracking issue for an iterator adaptor based on Iterator::nth:

pub trait Iterator {
    fn step_by(self, step: usize) -> StepBy<Self>
}

The Step trait used for making ranges iterators is now tracked at #42168.

Original issue:


The step_by method makes it possible to step through ranges with arbitrary-sized steps. There are several issues that need to be addressed before we can stabilize it:

  • The design/stabiliztion of the Step trait, which is currently a bit of a mess
  • The behavior on negative steps (see this thread)
  • Likely the design/stabilization of Zero/One, tracked here
@aturon aturon added T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. B-unstable Blocker: Implemented in the nightly compiler and unstable. I-nominated labels Aug 12, 2015
@aturon
Copy link
Member Author

aturon commented Aug 12, 2015

Nominating for P-High.

@alexcrichton
Copy link
Member

triage: P-high

@rust-highfive rust-highfive added P-high High priority and removed I-nominated labels Aug 19, 2015
@huonw
Copy link
Member

huonw commented Aug 19, 2015

cc #23588

@sfackler
Copy link
Member

We should probably start moving on this in the next cycle or drop the P-high.

@aturon aturon added the E-mentor Call for participation: This issue has a mentor. Use #t-compiler/help on Zulip for discussion. label Sep 23, 2015
@aturon
Copy link
Member Author

aturon commented Sep 23, 2015

Would definitely like some help getting this over the finish line. I'm happy to mentor!

@alexcrichton
Copy link
Member

The library subteam decided to not move this to FCP this cycle. The API is pretty likely to need revision in one form or another and that should be taken care of before a FCP happens. I believe discussion is still going on in the internals post though!

Also moving to P-medium

@alexcrichton alexcrichton added P-medium Medium priority and removed I-nominated P-high High priority labels Sep 24, 2015
@dschatzberg
Copy link
Contributor

It seems odd to me that the by type is Self. It seems similar to the Add trait where I can choose the RHS and the sum type separately. It seems like the type of the steps should similarly be potentially different from the elements themselves (steps are effectively differences).

@shssoichiro
Copy link
Contributor

Where do we stand on this at the moment? This is a feature I'd very much like to use in stable, and am willing to provide help to get this finalized (with some mentoring).

@ranma42
Copy link
Contributor

ranma42 commented Jan 24, 2016

Based on the documentation, I would expect (0u8..).step_by(1).count() to be 256, but currently it diverges. Is this a bug in the documentation or in the implementation?

@SimonSapin
Copy link
Contributor

Why do you expect that? Overflowing the range of an integer type is an error (it panics in debug mode), it’s not the expected end of a RangeFrom iterator (which doesn’t have an end).

@ranma42
Copy link
Contributor

ranma42 commented Jan 24, 2016

@SimonSapin The example for RangeFrom::step_by is

for i in (0u8..).step_by(2) {
    println!("{}", i);
}

I think that this example is misleading: its explanation says "This prints all even u8 values.", but instead it overflows on panic in debug and loops forever in release.

@SimonSapin
Copy link
Contributor

Indeed, that example is wrong. How does this look? #31172

@comex
Copy link
Contributor

comex commented Mar 13, 2016

Ditto @dschatzberg's comment. Imagine something like stepping through a range of two Instants by a Duration.

@nodakai
Copy link
Contributor

nodakai commented Mar 15, 2016

A sequence of floats like 0.1, 0.2, ..., 0.5 is so common in numerical codes that Matlab has a built-in operator for it:

But Step is only implemented for integer types, perhaps in the fear of a numerical error around the endpoint.

I understand your rigorous approach but it would be handy to have step_by() for floats (perhaps as an optional trait?)

@nodakai
Copy link
Contributor

nodakai commented Mar 15, 2016

Also, I think we should consider inclusion of Itertools::step(usize) into the stdlib as well (thanks u/levansfg on Reddit for letting me know)

which, to my eyes, is more generic (applicable to all iterators)

@yigal100
Copy link

IMHO, step_by should be a general abstraction for all iterators as mentioned by @nodakai and the current implementation that uses addition must be at most a specialization for efficiency sake for integral types (now that Rust has specialization implemented).

@dhardy
Copy link
Contributor

dhardy commented May 27, 2016

yigal100's approach seems sensible, in which case step_by should take a usize regardless of the iterator type and just call next a configurable number of times (>= 0).

Another option would be to drop step_by and instead allow a Range to be multiplied by something; e.g. (1..3) * -2 would return -2, -4. In this case it would be nice to be able to write (0..10) * 0.1, however, where would the (int->float) conversions come in? Also, it might be nice to be able to use other operators: (0..101) / 100.0 + 1.0 for 101 values from 1 to 2.

Neither of these options requires a One trait really; it depends how Rust converts from the a..b syntax to an iterator (if this is restricted to integer types, as seems to be the case, then 1 as TYPE is sufficient).

The questions which I think need answering are:

1 What are the main use-cases?

In my mind these are:

a. producing a sequential list of array indices (i.e. non-negative integers), usually stepping by +1 but possibly also -1, 2, etc., and
b. producing an evenly spaced set of sample points, usually floating-point and with any spacing. This doesn't need to be accommodated by the standard library.
c. (possibly) skipping some items in a general iterator (Itertools::step(usize) that nodaki mentioned)

2 What restrictions are there on syntax / implementation?

As far as I am aware, the a..b syntax must be kept for integers and must be usable as an iterator.

Is there a fundamental reason that -20..20 or 0.5..2.5 or "a".."z" cannot be implemented? Is there already a commitment to enable (0..8).step_by(2) in this form?

3 Flexibility, optimizability and intuitiveness

Which is clearer, (0..100).step_by(10) or (0..10) * 10? ` Or some other way?

Is there any point allowing negative steps? Maybe not for indices since .reverse() is clearer but possibly for a range of sample points (e.g. (0..11) / -100 to yield 0, -100, -200, ..., -1000).

One thing I like about the RANGE * SCALAR syntax is that it might enable type-conversions to floats or user-defined types without requiring support for those types in the a..b syntax.

@SimonSapin
Copy link
Contributor

I agree that Iterator::step_by is good, regardless of whether it is the solution for ranges. Let’s stabilize it.

@rfcbot fcp merge

The tracking issue for range (stepped) iterator is #42168

@rfcbot
Copy link

rfcbot commented Mar 17, 2018

Team member @SimonSapin has proposed to merge this. The next step is review by the rest of the tagged teams:

No concerns currently listed.

Once a majority of reviewers approve (and none object), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@rfcbot rfcbot added the proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. label Mar 17, 2018
@gnzlbg
Copy link
Contributor

gnzlbg commented Mar 18, 2018

Is there an RFC or post or something that documents the behavior being proposed for stabilization?

I find it extremely hard to extract from this issue what is exactly being proposed for stabilization, what is the rationale, alternatives that have been explored, etc.

I am not suggesting that this needs an RFC, but at least a summary of the work done and why what we have now is better and suitable for stabilization that what we had before. As a maintainer of collections implement step_by I would find such information useful to update my collections accordingly.

@SimonSapin
Copy link
Contributor

@gnzlbg Yeah, this issue has a lot of history and can be hard to read now. Iterator::step_by(self, n: usize) -> StepBy<Self> returns a new iterator that is based on the inner iterator’s nth method:

https://doc.rust-lang.org/std/iter/trait.Iterator.html#method.step_by

Creates an iterator starting at the same point, but stepping by
the given amount at each iteration.

Note that it will always return the first element of the iterator,
regardless of the step given.

Panics

The method will panic if the given step is 0.

Examples

Basic usage:

#![feature(iterator_step_by)]
let a = [0, 1, 2, 3, 4, 5];
let mut iter = a.into_iter().step_by(2);

assert_eq!(iter.next(), Some(&0));
assert_eq!(iter.next(), Some(&2));
assert_eq!(iter.next(), Some(&4));
assert_eq!(iter.next(), None);

@gnzlbg
Copy link
Contributor

gnzlbg commented Mar 18, 2018

@SimonSapin thanks for the summary, that is really helpful.

@bluss
Copy link
Member

bluss commented Mar 18, 2018

I agree that step_by is good for iterators in general, but we risk a situation where we have to advise not using .step_by on ranges, which is going to be uphill ergonomically. I had hopes .step_by(usize) would be straightforward for ranges, but it's a bit complicated.

With the right tooling and right method name, a “increment type matching” stepping adaptor for ranges might still be easy to find for users.

Problems with step_by for ranges include such things as:

  • Can't use u64 steps for an u64 range
  • How do we implement a for example i16 range with usize step, in a way that is efficient, ideally “zero cost”.
    • Maybe we can do most of the work already in the constructor of the stepping adaptor for an Range<i16>. We figure out what an "equivalent" step that fits inside an i16 is and adjust the step accordingly.

@leonardo-m
Copy link

leonardo-m commented Mar 18, 2018

Could two different named methods ("step" and step_by"?) solve this issue, having one for ranges that takes an usize, and one for intervals that takes the same type as the interval?

@ivandardi
Copy link
Contributor

Can't we have the step parameter be an associated type? Then on the generic Iterator we can have it be usize and for ranges we override it with Idx? Like

trait Iterator {
    type Item;
    type Idx = usize;
   
    /// ...
}

impl Iterator for Range<Idx=u16> {
    type Idx = Self::Idx;
    
    /// ...
}

Or something similar is this doesn't work out or if it's not desirable to add a new associated type to the Iterator trait.

@sfackler
Copy link
Member

@ivandardi If we did that Range<u16> wouldn't implement Iterator<Item = u16> anymore.

@letheed
Copy link
Contributor

letheed commented Mar 18, 2018

@bluss How about renaming step_by to every or every_nth since it relies on nth. I agree the word "step" could bring confusion. The method is valuable on its own though even if it's not the solution for number ranges.

@SimonSapin
Copy link
Contributor

I’m ok with every_nth / EveryNth.

@cuviper
Copy link
Member

cuviper commented Mar 18, 2018

Semantics and naming choices like every_nth were discussed in PR #41439. I think it would be confusing to name this similar to nth, since they're off by one -- step_by(x) is the same as next() then repeating nth(x-1).

@letheed
Copy link
Contributor

letheed commented Mar 20, 2018

Then how about every.

@rfcbot
Copy link

rfcbot commented Apr 18, 2018

🔔 This is now entering its final comment period, as per the review above. 🔔

@rfcbot rfcbot added final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. and removed proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. labels Apr 18, 2018
@rfcbot
Copy link

rfcbot commented Apr 28, 2018

The final comment period is now complete.

@scottmcm
Copy link
Member

I looked at overriding StepBy::try_fold to see if it would help. It does seem to help, but the biggest thing for me was that it clarified what made it hard for LLVM: the inner loop for (a..b).step_by(7).map(|x| x ^ (x - 1)).sum() is:

	mov	r8d, eax
	lea	r9d, [rcx - 1]
	lea	eax, [rcx - 2]
	xor	eax, r9d
	add	eax, r8d
	mov	r9d, ecx
	add	r9d, 6
	setb	r8b
	cmp	r9d, edx
	jae	.LBB1_6
	add	ecx, 7
	test	r8b, r8b
	je	.LBB1_4

which doesn't simplify further because of the slightly-awkward semantics of nth. It needs to return the 6th item and step by 7. (So the Range iterator is doing both Step::add_usize(6) and Step::add_one().)

I think this'd be perfectly efficient in terms of a "return next and also step forward" method, since the Range iterator for that would just return start and replace it with start.add_usize(7). I dunno if there'd be an appetite for such a thing, though...

@Centril Centril added disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. finished-final-comment-period The final comment period is finished for this PR / Issue. and removed final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. labels May 24, 2018
tmccombs added a commit to tmccombs/rust that referenced this issue Jun 3, 2018
bors added a commit that referenced this issue Jun 10, 2018
@Emerentius
Copy link
Contributor

Could we amend a note to step_by allowing us to use the pre-stepping semantics that @scottmcm spoke of? Given that iteration can have side-effects, I fear we may be overconstraining ourselves with regards to range iterators. While the current side-effect free integer ranges can deal with this, RangeFrom would already be changing semantics with pre-stepping because of overflow. It could affect additional situations when the Step trait will be stabilized.

While specializations of StepBy<Range*<_>> catch the most common case performance issues, they miss all the more complicated ones where the Range is hidden within an adapter.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
B-unstable Blocker: Implemented in the nightly compiler and unstable. C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. E-help-wanted Call for participation: Help is requested to fix this issue. E-mentor Call for participation: This issue has a mentor. Use #t-compiler/help on Zulip for discussion. finished-final-comment-period The final comment period is finished for this PR / Issue. P-medium Medium priority T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests