Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug: Iterators.cycle does not work for stateful iterators #43235

Open
jakobnissen opened this issue Nov 26, 2021 · 2 comments
Open

Bug: Iterators.cycle does not work for stateful iterators #43235

jakobnissen opened this issue Nov 26, 2021 · 2 comments
Labels
bug Indicates an unexpected problem or unintended behavior iteration Involves iteration or the iteration protocol

Comments

@jakobnissen
Copy link
Contributor

MWE:

julia> it = Iterators.cycle(Iterators.Stateful(1:3));

julia> z = zip(it, 1:10);

julia> length(z)
10

julia> collect(z)
10-element Vector{Tuple{Int64, Int64}}:
 (1, 1)
 (2, 2)
 (3, 3)
 (0, 0)
 (0, 0)
 (0, 0)
 (0, 0)
 (0, 0)
 (0, 0)
 (0, 0)

The 7 last elements of the vector are un-initialized and therefore set to arbitrary values (which could include #undef elements!)

This happens because Iterators.cycle assumes its underlying iterator is stateless and can be resumed from the beginning by calling iterate.

@JeffBezanson JeffBezanson added the iteration Involves iteration or the iteration protocol label Jun 27, 2022
@Moelf
Copy link
Contributor

Moelf commented Jul 3, 2022

Iterators.cycle(x::Iterators.Stateful)

should throw an error, because there's no way to cycle a stateful iterator other than recording all iterations, imagine x here being a file Stream or something

@jakobnissen
Copy link
Contributor Author

This is what Python does. Ideally, we should do just that for stateful iterators.
If that is not feasible, we should have it error for any stateful iterator (which also includes EachLine iterators, and any Generators built on Stateful or Eachline, and so on, recursively.). The problem is that there is currently no way of figuring out if an iterator is stateful.
I propose the following:

  1. Address Introduce iterator trait: IsStateful #43388 first - this means defaulting to IsStateful
  2. Have Iterators.cycle check the statefulness trait. If stateful, store all seen element in a Vector. There will be some false positives here, i.e. stateless iterators that have no opted into the Stateless trait. That's unfortunate, but I believe it's a necessity.

The proposal above is non-breaking and will fix the behaviour of Iterators.cycle. The downside is that it will cause needless caching of results for stateless iterators which have not opted in to the upcoming IsStateless trait. I believe that is the best we can do.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Indicates an unexpected problem or unintended behavior iteration Involves iteration or the iteration protocol
Projects
None yet
Development

No branches or pull requests

4 participants