Calling Iterators.takewhile(predicate, itr::Stateful) causes the first non-matching element to be dropped from the Stateful Iterator #48195

Open
ajahraus opened this issue Jan 9, 2023 · 4 comments

ajahraus commented Jan 9, 2023

To me, it's clear that the intended behaviour ought to be that calling Iterators.takewhile on a Stateful iterator returns all leading matching elements and leaves the Stateful iterator holding all remaining elements.
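
For reference, a minimal reproduction of the behaviour described in the title might look like the following. The range `1:10` and the predicate `<(4)` are only illustrative choices, not taken from the original report.

```julia
# Wrap a range in a Stateful iterator so that consumption is remembered.
itr = Iterators.Stateful(1:10)

# Take the leading elements that are less than 4.
taken = collect(Iterators.takewhile(<(4), itr))   # [1, 2, 3]

# takewhile had to call iterate to look at 4 before rejecting it,
# so 4 has already been consumed from the Stateful iterator.
rest = collect(itr)                               # [5, 6, 7, 8, 9, 10]
```

The element `4` appears in neither `taken` nor `rest`.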

One possible solution would be for takewhile to use peek when the iterator is Stateful, rather than iterate, as seen here:

julia/base/iterators.jl

Lines 856 to 861 in 53a0a69

function iterate(ibl::TakeWhile, itr...)
    y = iterate(ibl.xs,itr...)
    y === nothing && return nothing
    ibl.pred(y[1]) || return nothing
    y
end

I'm not certain if this can be applied to the generic implementation as a zero-cost abstraction, or if it would be better handled through multiple dispatch and a more specific method for Stateful Iterators.
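
As a rough sketch of the "more specific method" idea, one could imagine a separate wrapper type that only accepts Stateful iterators and looks ahead with peek before consuming anything. The name `PeekTakeWhile` is hypothetical and is not part of Base. The sketch defines its own type instead of adding methods to `Iterators.TakeWhile`, and it assumes Julia 1.5 or later, where `peek` is exported from Base.

```julia
# Hypothetical wrapper (not in Base): a takewhile restricted to Stateful iterators.
struct PeekTakeWhile{P,S<:Iterators.Stateful}
    pred::P
    xs::S
end

# The number of matching elements is not known in advance.
Base.IteratorSize(::Type{<:PeekTakeWhile}) = Base.SizeUnknown()
Base.eltype(::Type{PeekTakeWhile{P,S}}) where {P,S} = eltype(S)

function Base.iterate(it::PeekTakeWhile, _=nothing)
    # isempty on a Stateful iterator does not consume an element.
    isempty(it.xs) && return nothing
    # Look at the next element without removing it.
    it.pred(peek(it.xs)) || return nothing
    # Only now consume the element that passed the predicate.
    return popfirst!(it.xs), nothing
end

itr = Iterators.Stateful(1:10)
taken = collect(PeekTakeWhile(<(4), itr))   # [1, 2, 3]
rest = collect(itr)                         # [4, 5, 6, 7, 8, 9, 10]
```

This only covers Stateful; it says nothing about whether the generic `TakeWhile` method could do the same at zero cost.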

I suspect that a similar issue is affecting list comprehensions and Stateful Iterators. I haven't tested that myself, but in my research regarding this issue, I saw several people bringing up a similar complaint.

ajahraus commented Jan 9, 2023

Also, I'm not actually certain that this particular function is where the change would need to be made. It's possible that the change should be made in the constructor of the TakeWhile struct, or even somewhere within the implementation of Stateful.

@ajahraus
Author

I took a swing at implementing this in a few different ways, but nothing I did seemed to work. I also referenced Python's itertools library to see what they did, and it seems that there's a similar issue there.

Calling takewhile there also drops the first non-matching element. Instead, they have a before_and_after implementation in the itertools Recipes section, which returns two iterators: one corresponding to the takewhile portion and one corresponding to the rest. There's a specific provision for yielding the transition element.

Would something similar be necessary for Julia?

JZL commented Apr 27, 2023

I agree this is an unfortunate combination of events and just hit it too (I was reading a large CSV.Rows in dynamic chunks. Had to use Stateful so I could start where I left off, but takewhile loses rows).

Not the end of the world because I can iterate through, push!'ing into the main collector while elements pass, and on the last iteration, push! into a collector for the next round. But it would be cool if the docs warned of this, or there was a solution. Returning two iterators (the 'taken' elements + an iterator to the start of the rest) would be nice, or even returning the main 'taken' elements + the single next element would be nice.
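
A minimal sketch of the "taken elements + the single next element" variant could look like this. The function name `takewhile_with_leftover!` is hypothetical, and wrapping the leftover in `Some` is just one possible way to tell "no leftover" apart from a leftover that is itself `nothing`.

```julia
# Hypothetical helper (not in Base): eagerly collect the leading matching
# elements and hand back the first non-matching element instead of losing it.
function takewhile_with_leftover!(pred, itr::Iterators.Stateful)
    taken = eltype(itr)[]
    for x in itr
        # x has already been consumed from itr, so return it to the caller.
        pred(x) || return taken, Some(x)
        push!(taken, x)
    end
    # The iterator ran out before any element failed the predicate.
    return taken, nothing
end

itr = Iterators.Stateful(1:10)
taken, leftover = takewhile_with_leftover!(<(4), itr)
# taken == [1, 2, 3], something(leftover) == 4, and itr resumes at 5
```

The caller is then responsible for pushing `something(leftover)` into the collector for the next round, which is essentially the manual workaround described above.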

@jakobnissen
Contributor

So this is part of a larger problem in Julia where Base code just assumes all iterators are stateless. This is an assumption that I think we must remove from Base, everywhere we find it. That includes the method of iterate in your OP.

So, given that iterate can't simply iterate its underlying iterator and throw it away, I see two approaches:

  • Either use peek - but this generally does not work.
  • Or, store the next element of the iterator in the TakeWhile struct, similar to how Stateful works. This is generally pretty tricky for a few reasons, one being that TakeWhile then needs to statically know the types of its iterator's state and last element, which is very hard.
