-
Notifications
You must be signed in to change notification settings - Fork 123
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: avoid unbalanced session pool creation #2442
Conversation
A query storm at the startup of the client library before the session pool had initialized could cause the creation of an unbalanced session pool. This again would put a large batch of sessions using the same gRPC channel at the head of the pool, which could then continously be used by the application.
…googleapis/java-spanner into randomize-session-position-at-first-add
…googleapis/java-spanner into randomize-session-position-at-first-add
// A session should only be added at a random position the first time it is added to | ||
// the pool. All following releases into the pool should happen at the front of the | ||
// pool. | ||
session.releaseToPosition = Position.FIRST; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks a bit hacky that we are mutating session.releaseToPosition
within the switch case. Instead of doing this within the switch case, should we add this condition as a branch at L2451?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that adding it as a branch at L2451 would work. The basic logic is:
- The session should normally only be added to the pool at a random position once.
- So we need an if/switch case that determines whether it is currently
RANDOM
, and if it is, then it should be reset toFIRST
so it is not randomized the next time.
I've changed the switch case to an if-else, as we currently only support FIRST
and RANDOM
, so a switch does not really make sense. I think that makes the logic a bit clearer as well.
google-cloud-spanner/src/main/java/com/google/cloud/spanner/SessionPool.java
Show resolved
Hide resolved
google-cloud-spanner/src/main/java/com/google/cloud/spanner/SessionPool.java
Outdated
Show resolved
Hide resolved
google-cloud-spanner/src/test/java/com/google/cloud/spanner/SessionPoolMaintainerTest.java
Show resolved
Hide resolved
google-cloud-spanner/src/test/java/com/google/cloud/spanner/SessionPoolStressTest.java
Show resolved
Hide resolved
@@ -2423,6 +2471,48 @@ private void releaseSession(PooledSession session, Position position) { | |||
} | |||
} | |||
|
|||
private boolean isUnbalanced(PooledSession session) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add @VisibleForTesting
and add a few more unit tests for this method? It's doing a bunch of things which is currently untested with the existing tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, good point. I've changed it a bit so we have a static method that does the actual calculation. That method is easier to test than one that depends on all the state fields in the SessionPool class. And I've added different tests for that method.
google-cloud-spanner/src/main/java/com/google/cloud/spanner/SessionPool.java
Outdated
Show resolved
Hide resolved
sessions.remove(name); | ||
return null; | ||
} | ||
// We do not take the test lock here, as we already hold the session pool lock. Taking the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change is unrelated to the session pool re-balancing, but it fixes a potential deadlock in this test. The deadlock was caused by the fact that the session pool maintenance task has taken the SessionPool#lock
when it comes here. Trying to also take the lock
of this test case could cause a deadlock if one of the tests in this file had already taken this lock, and then tried to execute an action on the session pool that would also require the SessionPool#lock
.
return true; | ||
} | ||
} | ||
} | ||
// Ideally, the use of a channel in the checked out sessions is exactly | ||
// numCheckedOut / numChannels | ||
// We check whether we are more than a factor two away from that perfect distribution. | ||
// If we are, then we randomize. | ||
// If we are, then we re-balance. | ||
count = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was a bug in the previous version that was found with the additional test cases.
session.releaseToPosition = Position.FIRST; | ||
} | ||
if (session.releaseToPosition == Position.RANDOM && !sessions.isEmpty()) { | ||
// A session should only be added at a random position the first time it is added to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Due to the condition !sessions.isEmpty()
, apart from the first session, all other sessions will enter the else block
and get added to the front of the queue. If this is intended, then isn't this comment inconsistent?
How are we measuring if a session was added for the first time or has been added before. Shouldn't that be a different state variable that get's toggled the first time the session was added to the pool?
Or are you saying session.releaseToPosition == Position.RANDOM
returns true only when session was released first time. I guess all other releases will toggle session.releaseToPosition
to Position.FIRST
Assuming that session.releaseToPosition = Position.RANDOM indicates first time insertion isn't great right. And it can easily break the algorithm.
WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment was indeed outdated. It was initially written before adding the isUnbalanced(..)
part to the solution, as I initially thought that it would be enough to only randomize the order when a session was initially added to the pool. That was not enough, because it could easily happen that a session would be used multiple times before it would actually be released into the pool (that is; because there was always a waiter for a session, which meant that the session was given to that waiter instead of being released to the pool). In addition, it could still be that an initial randomization would give an unbalanced pool.
I've updated the comment to reflect the current situation.
Due to the condition
!sessions.isEmpty()
, apart from the first session, all other sessions will enter the else block and get added to the front of the queue. If this is intended, then isn't this comment inconsistent?
Hmmm.... I'm not sure I understand your question correctly. The !sessions.isEmpty()
ensures that only the first session to be released into the pool enters the else block. We do that because adding it at a random place in an empty list does not make sense.
How are we measuring if a session was added for the first time or has been added before. Shouldn't that be a different state variable that get's toggled the first time the session was added to the pool?
The original implementation would only set it to RANDOM
when the session was created, and then reset it to FIRST
here. So in that implementation, it was a valid indicator for whether it was the first time it was added. With the addition of checking for an unbalanced pool, that has changed (but the comment unfortunately not), and a session can be added at a random position multiple times, but the value is always reset to FIRST
after it has been added, as that is the default as long as the pool remains balanced.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. I had got confused thinking each session when it enters the queue first has to be put in front of the queue. But I think you meant only the first session. Now it's clear.
🤖 I have created a release *beep* *boop* --- ## [6.47.0](https://togithub.com/googleapis/java-spanner/compare/v6.46.0...v6.47.0) (2023-09-12) ### Features * Add devcontainers for enabling github codespaces usage. ([#2605](https://togithub.com/googleapis/java-spanner/issues/2605)) ([a7d60f1](https://togithub.com/googleapis/java-spanner/commit/a7d60f13781f87054a1631ca511492c5c8334751)) * Disable dynamic code loading properties by default ([#2606](https://togithub.com/googleapis/java-spanner/issues/2606)) ([d855ebb](https://togithub.com/googleapis/java-spanner/commit/d855ebbd2dec11cdd6cdbe326de81115632598cd)) ### Bug Fixes * Add reflection configurations for com.google.rpc classes ([#2617](https://togithub.com/googleapis/java-spanner/issues/2617)) ([c42460a](https://togithub.com/googleapis/java-spanner/commit/c42460ae7b6bb5874cc18c7aecff34186dcbff2a)) * Avoid unbalanced session pool creation ([#2442](https://togithub.com/googleapis/java-spanner/issues/2442)) ([db751ce](https://togithub.com/googleapis/java-spanner/commit/db751ceebc8b6981d00cd07ce4742196cc1dd50d)) ### Dependencies * Update dependency com.google.cloud:google-cloud-shared-dependencies to v3.15.0 ([#2615](https://togithub.com/googleapis/java-spanner/issues/2615)) ([ac762fb](https://togithub.com/googleapis/java-spanner/commit/ac762fbf079db79eab5f2ebee971b850ac89eb11)) --- This PR was generated with [Release Please](https://togithub.com/googleapis/release-please). See [documentation](https://togithub.com/googleapis/release-please#release-please).
🤖 I have created a release *beep* *boop* --- ## [6.47.0](https://togithub.com/googleapis/java-spanner/compare/v6.46.0...v6.47.0) (2023-09-12) ### Features * Add devcontainers for enabling github codespaces usage. ([googleapis#2605](https://togithub.com/googleapis/java-spanner/issues/2605)) ([a7d60f1](https://togithub.com/googleapis/java-spanner/commit/a7d60f13781f87054a1631ca511492c5c8334751)) * Disable dynamic code loading properties by default ([googleapis#2606](https://togithub.com/googleapis/java-spanner/issues/2606)) ([d855ebb](https://togithub.com/googleapis/java-spanner/commit/d855ebbd2dec11cdd6cdbe326de81115632598cd)) ### Bug Fixes * Add reflection configurations for com.google.rpc classes ([googleapis#2617](https://togithub.com/googleapis/java-spanner/issues/2617)) ([c42460a](https://togithub.com/googleapis/java-spanner/commit/c42460ae7b6bb5874cc18c7aecff34186dcbff2a)) * Avoid unbalanced session pool creation ([googleapis#2442](https://togithub.com/googleapis/java-spanner/issues/2442)) ([db751ce](https://togithub.com/googleapis/java-spanner/commit/db751ceebc8b6981d00cd07ce4742196cc1dd50d)) ### Dependencies * Update dependency com.google.cloud:google-cloud-shared-dependencies to v3.15.0 ([googleapis#2615](https://togithub.com/googleapis/java-spanner/issues/2615)) ([ac762fb](https://togithub.com/googleapis/java-spanner/commit/ac762fbf079db79eab5f2ebee971b850ac89eb11)) --- This PR was generated with [Release Please](https://togithub.com/googleapis/release-please). See [documentation](https://togithub.com/googleapis/release-please#release-please).
A query storm at the startup of the client library before the session pool had initialized could cause the creation of an unbalanced session pool. This again would put a large batch of sessions using the same gRPC channel at the head of the pool, which could then continously be used by the application.