Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sync: Add try_recv back #3639

Closed

Conversation

zaharidichev
Copy link
Contributor

Signed-off-by: Zahari Dichev [email protected]

Motivation

Some time ago, the Receiver::try_recv was removed, due to users experiencing
race conditions when consuming from the chan. This problem caused consumers
to receive a TryRecvError::Empty even though a write to the chan has been complete.

Solution

The solution in this PR is to destinguish between a slot in the concurrent
block list not being ready and the list being empty. When the chan is not
empty and a slot is still not ready to be consumed, we spin in try_recv.

Fix: #3350

Signed-off-by: Zahari Dichev <[email protected]>
@Darksonn Darksonn added A-tokio Area: The main tokio crate M-sync Module: tokio/sync labels Mar 22, 2021
Signed-off-by: Zahari Dichev <[email protected]>
@carllerche
Copy link
Member

Generally speaking, we try to avoid spinning. It’s going to be tricky to avoid that here though. I’m interested in thoughts on that.

@zaharidichev
Copy link
Contributor Author

zaharidichev commented Mar 26, 2021

@carllerche Thought about it a little bit. The way things are setup at the moment, we need to either spin on pop() or spin on push(), where we try to CAS in a block state where all previous slots have been set to ready...

Another option that I have not spent too much time on is to make the receiver be able to "look ahead". So if there is a slot ahead that has been set as Ready, just read its contents and return them, marking it as Read. Then it can be ignored once the index is advanced to point at it. Not sure whether that can actually work. Figuring that out across block boundaries makes this a bit harder. WDYT? Any better suggestions?

return Ok(value);
}
Some(Closed) => return Err(TryRecvError::Closed),
Some(NotReady) => thread::yield_now(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be using std::hint::spin_loop?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. I borrowed it from https://github.com/tokio-rs/tokio/blob/master/tokio/src/sync/mpsc/block.rs#L348. It has a comment that convinced me this is the right thing to use. But is this comment actually correct? Unclear. Any thoughts ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, what about this bit of docs: https://docs.rs/loom/0.4.0/loom/#yielding

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment you linked appears to be wrong, as the non-loom version just re-exports std::thread. It also has a comment on MSRV.

@zaharidichev zaharidichev requested a review from Darksonn April 19, 2021 13:40
@gerdoe-jr
Copy link

How does it go?

@Darksonn
Copy link
Contributor

Darksonn commented Jun 1, 2021

The PR is relatively complicated, and I haven't found the time to read it thoroughly.

Comment on lines 347 to 352
None => {
// The compare-and-swap succeeded and the newly allocated block
// is successfully pushed.
self.set_has_next();
return new_block;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another thread could see the compare and swap before this set_has_next has completed. Is that a problem?

Copy link
Contributor Author

@zaharidichev zaharidichev Jun 28, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Darksonn Trying to page all of that back as it has been a while. I had a thought about that when I was writing it and I think it is ok. I think what happens in the situation that you are describing is that for example if you have two blocks such:

        B1                               B2
| a | b | c | _ | _ | -------> | _ | _ | _ | _ |

If we are trying to read at the second to last slot of B1 and a value is being written at the first slot of B2 and the CAS has succeeded and the thread reading B1 observes that, this means that we have a block but a value will not be written to this block before the current one is marked as having a next block. In this case, we will receive an Empty error, which will technically be correct because the value is still not there. Does that sound correct or I am making things up?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the consequence of this is that it reports Empty if a send is happening concurrently, then that's fine.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that is correct.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Darksonn is that PR still relevant? Is there plan to merge it at some point. If not we can close it I guess.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do want to add back try_recv, but I'm not sufficiently familiar with the internals of the mpsc channel to properly review it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it, makes sense. Thanks for looking at it nevertheless

Comment on lines +335 to +337
// Notify the rx tasks
self.rx_waker.wake();
self.not_ready_waker.wake();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are other places that wake rx_waker. Should the other places not be updated as well?

run(&ctx);

for th in ths {
th.join().unwrap()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
th.join().unwrap()
th.join().unwrap();

Comment on lines +229 to +232
/// Marks the block as having a next one
fn set_has_next(&self) {
self.ready_slots.fetch_or(HAS_NEXT, Release);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You call this in grow, but not in try_push. That seems incorrect to me.

@Darksonn
Copy link
Contributor

Thank you for doing this PR. We did not end up accepting it because it was complicated and a simpler solution was available, but the solution that just got merged used several ideas from your PR, and reviewing your PR was helpful in coming up with the solution.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-tokio Area: The main tokio crate M-sync Module: tokio/sync
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add back mpsc try_recv
4 participants