Token aware host policy chooses primary replicas only #1621

Closed
havaker opened this issue May 10, 2022 · 1 comment · Fixed by #1714

havaker commented May 10, 2022

If no keyspace is specified in ClusterConfig, the token aware host policy has degraded functionality.

What version of Gocql are you using?

master

What did you do?

I did not specify .Keyspace in ClusterConfig and used gocql.TokenAwareHostPolicy(gocql.RoundRobinHostPolicy(), gocql.ShuffleReplicas()) as the host selection policy. I then ran a few SELECT queries against a 3-node cluster.

package main

import (
	"time"

	"github.com/gocql/gocql"
)

func main() {
	/* The example assumes the following CQL was used to setup the keyspace:
	   create keyspace example with replication = { 'class' : 'SimpleStrategy', 'replication_factor' : 3 };
	   create table example.tweet(timeline text, id UUID, text text, PRIMARY KEY(id));
	*/

	cluster := gocql.NewCluster("localhost:9042")

	//cluster.Keyspace = "example"

	cluster.PoolConfig.HostSelectionPolicy = gocql.TokenAwareHostPolicy(gocql.RoundRobinHostPolicy(), gocql.ShuffleReplicas())
	session, err := cluster.CreateSession()
	if err != nil {
		panic(err)
	}
	defer session.Close()

	someUUID := gocql.TimeUUID()
	for i := 0; i < 997; i++ {
		if err := session.Query(`SELECT id FROM example.tweet WHERE id = ?`, someUUID).Exec(); err != nil {
			panic(err)
		}

		time.Sleep(time.Second)
	}
}

What did you expect to see?

I expected each of the 3 nodes to receive some queries.

What did you see instead?

Only the primary replica received the select queries (even though I was operating on a 3-node cluster, on a keyspace with replication factor 3 and passed gocql.ShuffleReplicas() to TokenAwareHostPolicy).

More

This issue persists if the .Keyspace field is not equal to a keyspace directly specified in a query string. For example:

cluster.Keyspace = "example"

// later

session.Query(`SELECT id FROM differentkeyspacename.tweet WHERE id = ?`, someUUID).Exec()

The root of this problem is the way that tokenAwareHostPolicy manages replica maps for keyspaces.

When the Pick method of tokenAwareHostPolicy is invoked, replica hosts are chosen using previously built replica maps. Because replica maps are only* built for the keyspace returned by tokenAwareHostPolicy.getKeyspaceName(), a query against any other keyspace finds no replica map, and the codepath that is triggered selects the primary replica only.

* They are also built when a KeyspaceUpdateEvent is received by the policy. I did not observe any such events during my tests.
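
To make the mechanism above concrete, here is a minimal sketch of the lookup-and-fallback behavior. It is a simplified model and not gocql's actual code: the types replicaMaps and policy, the helper firstHostCounts, the node names and the token value 42 are all hypothetical, and the real policy computes tokens and replica sets from cluster metadata.

```go
package main

import (
	"fmt"
	"math/rand"
)

// replicaMaps is a hypothetical stand-in for the policy's per-keyspace
// replica metadata: keyspace name -> token -> replicas (primary first).
type replicaMaps map[string]map[int64][]string

// policy is a simplified model, NOT gocql's tokenAwareHostPolicy.
type policy struct {
	maps    replicaMaps
	primary map[int64]string // token ring: token -> primary replica
	rng     *rand.Rand
}

// pick returns the order in which hosts would be tried for a query.
func (p *policy) pick(keyspace string, token int64) []string {
	replicas, ok := p.maps[keyspace][token]
	if !ok {
		// No replica map was built for this keyspace, so only the
		// primary replica from the token ring is known.
		return []string{p.primary[token]}
	}
	// Replica map found: shuffle a copy, as ShuffleReplicas would request.
	shuffled := append([]string(nil), replicas...)
	p.rng.Shuffle(len(shuffled), func(i, j int) {
		shuffled[i], shuffled[j] = shuffled[j], shuffled[i]
	})
	return shuffled
}

// firstHostCounts issues n picks and tallies which host is tried first.
func firstHostCounts(p *policy, keyspace string, token int64, n int) map[string]int {
	counts := make(map[string]int)
	for i := 0; i < n; i++ {
		counts[p.pick(keyspace, token)[0]]++
	}
	return counts
}

// newExamplePolicy builds a policy that only knows replicas for "example",
// mirroring a session whose configured keyspace is "example".
func newExamplePolicy() *policy {
	const token = int64(42)
	return &policy{
		maps: replicaMaps{
			"example": {token: {"node1", "node2", "node3"}},
		},
		primary: map[int64]string{token: "node1"},
		rng:     rand.New(rand.NewSource(1)),
	}
}

func main() {
	p := newExamplePolicy()
	// Keyspace with a replica map: first host varies between replicas.
	fmt.Println(firstHostCounts(p, "example", 42, 999))
	// Keyspace without a replica map: every pick lands on the primary.
	fmt.Println(firstHostCounts(p, "differentkeyspacename", 42, 999))
}
```

In this model an empty keyspace name behaves the same way as "differentkeyspacename": no map is found, so every pick returns the primary replica.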

Sending every query to its primary replica can cause unequal load in a cluster. I did not find any records of this behavior in the documentation, so I'm considering it a bug.

@martin-sucha
Contributor

Hi! I haven't tried to reproduce yet, but indeed this seems like a bug. Thanks for reporting it!

fruch added a commit to fruch/scylla-bench that referenced this issue Jun 27, 2022
Seem what's described in apache/cassandra-gocql-driver#1621 might be
affecting us, and making scylla-bench send unblanced
traffic to scylla

REF: apache/cassandra-gocql-driver#1621
fruch added a commit to fruch/scylla-cluster-tests that referenced this issue Jun 27, 2022
fruch added a commit to fruch/scylla-bench that referenced this issue Sep 1, 2022
Seem what's described in apache/cassandra-gocql-driver#1621 might be
affecting us, and making scylla-bench send unblanced
traffic to scylla

REF: apache/cassandra-gocql-driver#1621
roydahan pushed a commit to scylladb/scylla-bench that referenced this issue Sep 4, 2022
Seem what's described in apache/cassandra-gocql-driver#1621 might be
affecting us, and making scylla-bench send unblanced
traffic to scylla

REF: apache/cassandra-gocql-driver#1621
sylwiaszunejko added a commit to sylwiaszunejko/gocql that referenced this issue Jul 12, 2023
Previously TokenAwarePolicy always used Keyspace explicitly set in
cluster.Keyspace regardless of the keyspace in the Query. Now after
preparing statement Keyspace and Table names are transferred to the
Query and it can make use of that.

Fixes: apache#1621
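
The change described in this commit message can be sketched roughly as follows. This is a simplified model under stated assumptions, not the code from the commit: the query type, its preparedKeyspace field and both functions are hypothetical names, and the fallback to the session keyspace when nothing is known from the prepared statement is an assumption.

```go
package main

import "fmt"

// query is a hypothetical, stripped-down query. preparedKeyspace models the
// keyspace name learned when the statement was prepared ("" if unknown).
type query struct {
	stmt             string
	preparedKeyspace string
}

// legacyRoutingKeyspace models the old behavior described in the commit
// message: the session-level keyspace is used regardless of the query.
func legacyRoutingKeyspace(q query, sessionKeyspace string) string {
	return sessionKeyspace
}

// routingKeyspace models the new behavior: prefer the keyspace transferred
// to the query after preparing. Falling back to the session keyspace is an
// assumption of this sketch.
func routingKeyspace(q query, sessionKeyspace string) string {
	if q.preparedKeyspace != "" {
		return q.preparedKeyspace
	}
	return sessionKeyspace
}

func main() {
	q := query{
		stmt:             "SELECT id FROM differentkeyspacename.tweet WHERE id = ?",
		preparedKeyspace: "differentkeyspacename",
	}
	fmt.Println(legacyRoutingKeyspace(q, "example"))
	fmt.Println(routingKeyspace(q, "example"))
}
```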
sylwiaszunejko added a commit to sylwiaszunejko/gocql that referenced this issue Jul 14, 2023
Previously TokenAwarePolicy always used Keyspace explicitly set in
cluster.Keyspace regardless of the keyspace in the Query. Now after
preparing statement Keyspace and Table names are transferred to the
Query and it can make use of that.

Fixes: apache#1621
sylwiaszunejko added a commit to sylwiaszunejko/gocql that referenced this issue Jul 17, 2023
Previously TokenAwarePolicy always used Keyspace explicitly set in
cluster.Keyspace regardless of the keyspace in the Query. Now after
preparing statement Keyspace and Table names are transferred to the
Query and it can make use of that.

Fixes: apache#1621
sylwiaszunejko added a commit to sylwiaszunejko/gocql that referenced this issue Jul 21, 2023
Previously TokenAwarePolicy always used Keyspace explicitly set in
cluster.Keyspace regardless of the keyspace in the Query. Now after
preparing statement Keyspace and Table names are transferred to the
Query and it can make use of that.

Fixes: apache#1621
sylwiaszunejko added a commit to sylwiaszunejko/scylla-bench that referenced this issue Aug 9, 2023
Setting the keyspace explicitly was done to workaround
the following bug: apache/cassandra-gocql-driver#1621

After fixing this bug it is not necessary anymore.
(PR fixing it: apache/cassandra-gocql-driver#1714)