Implement locked iteration for PyList #4789

Open
wants to merge 11 commits into
base: main

@ngoldbaum ngoldbaum commented Dec 10, 2024

Fixes #4571.

Re-enables get_item_unchecked on the free-threaded build (with a new free-threaded-specific note about safety), adds locked_for_each, and implements a number of iterator methods for BoundListIterator on the free-threaded build to amortize synchronization overhead where possible.

Largely follows the implementation and tests from #4439, along with fixes similar to the ones I implemented for #4788.
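
To illustrate what "amortizing synchronization overhead" means here, the following is a minimal std-only sketch, not PyO3 code. `CountingList`, `sum_locking_per_item`, and `sum_locking_once` are hypothetical names, and a `std::sync::Mutex` stands in for the per-object critical section. It only counts how often the lock is taken when a traversal locks per element versus once for the whole traversal.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Mutex, MutexGuard};

/// Hypothetical stand-in for a shared list; counts how often its lock is taken.
struct CountingList {
    items: Mutex<Vec<i32>>,
    acquisitions: AtomicUsize,
}

impl CountingList {
    fn new(items: Vec<i32>) -> Self {
        CountingList {
            items: Mutex::new(items),
            acquisitions: AtomicUsize::new(0),
        }
    }

    /// Take the lock and record the acquisition.
    fn lock(&self) -> MutexGuard<'_, Vec<i32>> {
        self.acquisitions.fetch_add(1, Ordering::Relaxed);
        self.items.lock().unwrap()
    }

    fn acquisitions(&self) -> usize {
        self.acquisitions.load(Ordering::Relaxed)
    }
}

/// Lock once per element, as a plain `next()`-driven loop would.
fn sum_locking_per_item(list: &CountingList) -> i32 {
    let mut total = 0;
    let mut index = 0;
    loop {
        // The guard is dropped at the end of every loop iteration.
        let guard = list.lock();
        match guard.get(index) {
            Some(x) => total += *x,
            None => return total,
        }
        index += 1;
    }
}

/// Lock once and keep the lock for the whole traversal.
fn sum_locking_once(list: &CountingList) -> i32 {
    let guard = list.lock();
    let total: i32 = guard.iter().sum();
    total
}
```

For a three-element list, the per-item version takes the lock four times (three items plus the final bounds check), while the single-lock version takes it once. Specialized iterator methods such as fold can follow the second pattern.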

@@ -302,7 +315,7 @@ impl<'py> PyListMethods<'py> for Bound<'py, PyList> {
/// # Safety
///
/// Caller must verify that the index is within the bounds of the list.
- #[cfg(not(any(Py_LIMITED_API, Py_GIL_DISABLED)))]
+ #[cfg(not(Py_LIMITED_API))]
@ngoldbaum ngoldbaum Dec 11, 2024

Just now noticing while looking over the code: maybe this should be cfg(not(any(Py_LIMITED_API, PyPy)))?

@ngoldbaum

I was chatting with @epilys on an IRC channel we both use and he had a suggestion to avoid the inner struct. I've included a commit from him implementing that.

Comment on lines +499 to 507
macro_rules! split_borrow {
    ($instance:expr, $index:ident, $length:ident, $list:ident) => {
        let Self {
            ref mut $index,
            ref mut $length,
            ref $list,
        } = $instance;
    };
}
@mejrs mejrs Dec 13, 2024

Rather than just using this to split the borrow, maybe we should have a macro that wraps

        crate::sync::with_critical_section(list, || {
            ...
        })

as well?
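
As a rough sketch of the shape such a macro could take, here is a std-only example. It is not the PR's code and makes no claim to solve the compile problems discussed in this thread. `List`, `ListIter`, `locked!`, and this `with_critical_section` are hypothetical stand-ins, with a `std::sync::Mutex` playing the role of the per-object critical section. The macro destructures the iterator into disjoint field borrows and then runs the caller's block inside the locked region.

```rust
use std::sync::Mutex;

/// Hypothetical stand-in for a list object guarded by a per-object lock.
struct List {
    lock: Mutex<()>,
    items: Vec<i32>,
}

/// Stand-in for a critical-section helper: hold the list's lock while `f` runs.
fn with_critical_section<R>(list: &List, f: impl FnOnce() -> R) -> R {
    let _guard = list.lock.lock().unwrap();
    f()
}

/// Hypothetical iterator state: cursor, cached length, and the list itself.
struct ListIter {
    index: usize,
    length: usize,
    list: List,
}

/// Split `$instance` into disjoint field borrows, then run `$body`
/// while the list's lock is held.
macro_rules! locked {
    ($instance:expr, $index:ident, $length:ident, $list:ident, $body:block) => {{
        let ListIter {
            ref mut $index,
            ref mut $length,
            ref $list,
        } = $instance;
        with_critical_section($list, || $body)
    }};
}

impl ListIter {
    /// Sum the remaining items, advancing the cursor under a single lock.
    fn sum_remaining(&mut self) -> i32 {
        locked!(*self, index, length, list, {
            let mut total = 0;
            // Clamp the cached length in case the list is shorter than expected.
            while *index < (*length).min(list.items.len()) {
                total += list.items[*index];
                *index += 1;
            }
            total
        })
    }
}
```

Because the binding names are passed in from the call site, the block can refer to `index`, `length`, and `list` directly, which is the same trick the split_borrow macro above relies on.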

Contributor Author

I tried to do this this afternoon and got stuck. I'm a novice at writing macros - do you know of any similar examples I could look at somewhere?

Here's what I tried, but it doesn't compile when I use it in, e.g., the fold implementation: https://gist.github.com/ngoldbaum/13ac11629f042ae0bee84559e4e8bb31


crate::sync::with_critical_section(list, || {
    let mut accum = init;
    while let Some(x) = unsafe { Self::next_unchecked(index, length, list) } {
Member

Is it really safe to call next_unchecked here (and elsewhere)? Can't the closure modify the list?

@ngoldbaum ngoldbaum Dec 13, 2024

More thoughts about thread safety here:

  • It is possible for another thread to try to acquire a critical section on the list, but we hold a critical section here so that thread will block until this thread exits the critical section.
  • That means the index and length are correct going into f, despite us getting them without any synchronization in next_unchecked.
  • 99% of the time, the critical section never gets released. It does get released if f creates a new innermost critical section on the list, but even then only this thread can access the list. It is possible that fold is getting called recursively, in which case this thread would create new innermost critical sections until the recursion terminates.
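
The first two points can be illustrated with a std-only analogy. This is a sketch under loud assumptions: a `std::sync::Mutex` stands in for the critical section and `fold_while_writer_waits` is a hypothetical name. Unlike Python critical sections, a std `Mutex` is never implicitly released and is not reentrant, so this does not model the third point.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Fold over a shared list while holding its lock, with a writer thread
/// waiting on the same lock. Returns (sum seen by the fold, final length).
fn fold_while_writer_waits() -> (i32, usize) {
    let list = Arc::new(Mutex::new(vec![1, 2, 3]));

    // Take the lock first, so the writer below cannot push yet.
    let guard = list.lock().unwrap();

    let writer = {
        let list = Arc::clone(&list);
        thread::spawn(move || {
            // Blocks here until the folding thread releases the lock.
            list.lock().unwrap().push(100);
        })
    };

    // The list cannot change during the fold because the lock is still held.
    let sum = guard.iter().fold(0, |acc, x| acc + x);

    // Release the lock; only now can the writer make progress.
    drop(guard);
    writer.join().unwrap();

    let len = list.lock().unwrap().len();
    (sum, len)
}
```

The fold always sees the original three items, and the writer's push only lands after the lock is released.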

(Edited the text above to drop irrelevant references to the GIL.)

    length: &mut Length,
    list: &Bound<'py, PyList>,
) -> Option<Bound<'py, PyAny>> {
    let length = length.0.min(list.len());
@ngoldbaum ngoldbaum Dec 13, 2024

@mejrs if a closure changes the list's length so that the old length stored in the iterator is out of bounds, this step makes the index.0 < length check below false, so the iterator returns None and iteration terminates.

We hold a critical section so other threads can't modify the list between here and the next get_item_unchecked call.

Does that make sense? Are you worried about other scenarios?

In any case, I'll try to add a test where the closure modifies the list to see what happens.
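
The clamping argument above can be sketched without PyO3. This is a single-threaded illustration, not the PR's implementation: `ClampedIter` and `fold_with_shrinking_list` are hypothetical names, and a `RefCell<Vec<i32>>` stands in for the list so that the fold closure can shrink it mid-iteration.

```rust
use std::cell::RefCell;

/// Hypothetical iterator state: a cursor plus the length cached at creation.
struct ClampedIter<'a> {
    index: usize,
    length: usize,
    list: &'a RefCell<Vec<i32>>,
}

impl<'a> ClampedIter<'a> {
    fn new(list: &'a RefCell<Vec<i32>>) -> Self {
        let length = list.borrow().len();
        ClampedIter { index: 0, length, list }
    }
}

impl Iterator for ClampedIter<'_> {
    type Item = i32;

    fn next(&mut self) -> Option<i32> {
        let list = self.list.borrow();
        // Clamp the cached length to the current length before the bounds
        // check, so a list that shrank cannot be indexed out of bounds.
        let length = self.length.min(list.len());
        if self.index < length {
            let item = list[self.index];
            self.index += 1;
            Some(item)
        } else {
            None
        }
    }
}

/// Sum the items while the closure shrinks the list mid-iteration.
fn fold_with_shrinking_list() -> i32 {
    let list = RefCell::new(vec![1, 2, 3, 4, 5]);
    let sum = ClampedIter::new(&list).fold(0, |acc, x| {
        // Simulate a closure that modifies the list being iterated.
        list.borrow_mut().truncate(2);
        acc + x
    });
    sum
}
```

Here the fold visits only the first two items and returns 3: after the truncation the clamped length is 2, so the bounds check fails on the third call to `next` and iteration ends instead of reading past the end.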

Contributor Author

See test_iter_fold_out_of_bounds added in the last commit.

Development

Successfully merging this pull request may close these issues.

Add locked iterations APIs for dicts and lists
3 participants