Always refresh ring on topology changes and reconnections #1680
Hi! Thanks for the pull request.
Did you mean to put removal of [...]
Could you please update the description of commit 326072b to what the description of the merge request says?
It seems gocql currently does not debounce the ring refresh, so always refreshing the ring could theoretically cause a lot of load if events come in quick succession. I suspect this is not a problem in practice, but we can still add the debounce (maybe in a separate pull request?) to guard against that.
Another thing that I noticed while reviewing the code is that [...]
Otherwise this looks good to me. It would be great if you could add a test to make sure that we don't introduce the same issues in the future.
Force-pushed from 3d83e75 to 829456e.
Thanks for the quick review. I've changed the commits a bit so they should be clearer now.
I removed [...]
The topology change events are being debounced but I agree that it would probably be a good idea to debounce the ring refresh operations themselves. I'm happy to address this in a subsequent PR or I can just create a new issue so another contributor can pick it up, let me know what you prefer.
I'll start working on this and I'll also look into writing a test for this.
Technically, [...]
Feel free to send a pull request, that would be great.
Thanks!
Force-pushed from 829456e to e204b56.
I've rebased my branch and pushed commits to make [...]
For the debouncer, I would think having an event debouncer would be enough and I see that [...]
@martin-sucha Can you take a look at the merge request? This should help tremendously with Kubernetes environments. Thanks!
I've implemented a debouncer. Can you take a look and see whether this is aligned with what you were thinking? I'm going to look into tests now.
Force-pushed from 295a4a5 to a891a8b.
I've added unit tests, but I don't see a way to simulate a topology change for an integration test. Does the current test code support writing an integration test that checks whether the control connection updates the topology when it reconnects?
Oh, I've also changed the [...]
Yeah, the current integration tests use ccm and it would probably be hard to write an integration test that changes topology. We should probably use Go code directly instead of ccm to set up the cluster. That is a bigger chunk of work though. I've opened #1686 to track that.
Nice catch. Thanks!
Thank you for preparing the changes! Looks good to me in general, I've added some inline comments.
Managed to add a ccm test that tests whether the control connection refreshes the ring on reconnect. I used [...]
I'll take a look at your review feedback now, thanks.
Force-pushed from c0ca465 to c91e08f.
Thanks! I've noticed one more thing, added inline comment.
- remove Conn.localHostInfo() and add Session.hostInfoFromIter()
- add ringDescriber.getLocalHostInfo()
- remove local host from ringDescriber.getClusterPeerInfo() returned slice
- make ringDescriber.GetHosts() call ringDescriber.getLocalHostInfo() in addition to ringDescriber.getClusterPeerInfo()
- Ring refresh can be requested by topology event handler and control connection reconnections
Force-pushed from c91e08f to b9fa942.
@martin-sucha @joao-r-reis Is this ready to be reviewed and merged? Really looking forward to getting this out as it will help several of our customers.
I'm waiting on a review and ready to implement any changes that are suggested. I've addressed all comments of the previous reviews.
Looks good to me. Thank you!
It seems there is a data race in the ccm test. @joao-r-reis could you please check #1691?
@joao-r-reis @martin-sucha Any updates? What's needed at this point?
This has already been released in v1.4.0 AFAIK
Currently, there are some cases where gocql does not refresh the ring during a control connection reconnection. If topology changes are happening on the cluster during the reconnection, some NEW_NODE and REMOVED_NODE protocol events might be missed.
To fix this, every successful reconnection on the control connection should trigger a ring refresh operation. This PR adds that.
Also, this PR fixes an issue where NEW_NODE events can be overwritten by UP events and cause certain nodes to not be added to the ring. This is an issue already described in #1669 but the fix proposed by this PR is a bit different than one outlined on that PR. This PR causes any topology event to trigger a ring refresh regardless of how many nodes were removed or added. This eliminates the need to have duplicated logic around adding and removing nodes in both ringDescriber and Session.
Note: this is not an implementation of #1681.