Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

query_executor: Fixed race between Query.Release() and speculative executions #1684

Conversation

wprzytula
Copy link
Contributor

@wprzytula wprzytula commented Mar 30, 2023

As described in #1315, there is a race condition when there is more than one concurrent execution (by means of speculative execution) and Query.Release() is called after one of them completes. Query.Release() calls Query.reset(), which in turns zeroes the whole Query. Then, after another execution completes, it tries to access metrics, but they are already set to nil by the call to Release(), so a segfault is triggered.

The solution is quite simple: a ref counter is introduced into the Query. It is obviously initially set to 1. Every execution fiber (i.e. every execution goroutine) has the refcount atomically incremented before it is started, and decrements the refcount on completion. Query.Release() merely decrements the refcount. Query.reset() is called by whichever goroutine that decrements the refcount to 0.

Fixes: #1315

@mykaul
Copy link

mykaul commented Apr 3, 2023

@martin-sucha - will you be able to review this change? TIA!

Copy link
Contributor

@martin-sucha martin-sucha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the pull request! Code looks good to me. Could you please also update AUTHORS and CHANGELOG.md as per https://github.com/gocql/gocql/blob/master/CONTRIBUTING.md#minimum-requirement-checklist ?

…ecutions

As described in apache#1315, there is a race condition when there is more than
one concurrent execution (by means of speculative execution) and
Query.Release() is called after one of them completes. Query.Release()
calls Query.reset(), which in turns zeroes the whole Query. Then, after
another execution completes, it tries to access metrics, but they are
already set to nil by the call to Release(), so a segfault is triggered.

The solution is quite simple: a ref counter is introduced into the Query.
It is obviously initially set to 1. Every execution fiber
(i.e. every execution goroutine) has the refcount atomically incremented
before it is started, and decrements the refcount on completion.
Query.Release() merely decrements the refcount. Query.reset() is called
by whichever goroutine that decrements the refcount to 0.
@wprzytula wprzytula force-pushed the fix-race-on-speculative-and-release-upstream branch from 15ea2a4 to 47d05f0 Compare April 12, 2023 10:48
@wprzytula
Copy link
Contributor Author

Thanks for the pull request! Code looks good to me. Could you please also update AUTHORS and CHANGELOG.md as per https://github.com/gocql/gocql/blob/master/CONTRIBUTING.md#minimum-requirement-checklist ?

Done.

@wprzytula wprzytula requested a review from martin-sucha April 12, 2023 10:49
@wprzytula
Copy link
Contributor Author

@martin-sucha ping

@martin-sucha martin-sucha merged commit a4fc89a into apache:master Apr 20, 2023
@martin-sucha
Copy link
Contributor

Thanks! Merged 🚀

@wprzytula wprzytula deleted the fix-race-on-speculative-and-release-upstream branch April 20, 2023 11:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Data race in Query
3 participants