Bugfix: we address hosts using string(rune(shardID)), not by itoa(shardID) #5952

Merged
merged 3 commits into from
Apr 29, 2024

Conversation

dkrotx
Member

@dkrotx dkrotx commented Apr 29, 2024

I wish it were the other way around, but we can't change how it works now,
since everywhere else we already use rune(shardID).
This was not a critical issue: we just returned ShardOwnershipLostError
with a bad target host in very specific cases, for instance when the
persistence engine returned an error from a transaction.

Previously we returned the wrong host in ShardOwnershipLostError, which led cadence-frontend to hit the wrong
host when a cadence-history instance lost ownership of the shard during a persistence operation.

Everywhere else we use hashring.Lookup(string(rune(shardID))), except for two cases.
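The two key forms are not interchangeable, which a small sketch can make concrete. The helper names `runeKey` and `itoaKey` below are hypothetical and exist only for illustration; they wrap the two expressions quoted in this PR.

```go
package main

import (
	"fmt"
	"strconv"
)

// runeKey (hypothetical name) mirrors string(rune(shardID)):
// the shard ID is treated as a Unicode code point and UTF-8 encoded.
func runeKey(shardID int) string {
	return string(rune(shardID))
}

// itoaKey (hypothetical name) mirrors strconv.Itoa(shardID):
// the shard ID is rendered as decimal digits.
func itoaKey(shardID int) string {
	return strconv.Itoa(shardID)
}

func main() {
	// Print both keys side by side for a few shard IDs.
	for _, id := range []int{1, 65, 300} {
		fmt.Printf("shard %d: rune key %q, itoa key %q\n", id, runeKey(id), itoaKey(id))
	}
}
```

For shard 65, for example, the rune form yields "A" while the Itoa form yields "65", so the two call sites were asking the ring about different keys.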

Not verified end to end: it looks very much like an error and is hard to reproduce (we would have to be in the middle of a persistence operation for this).

I think it is a clear error now, so let's just fix it.

Fixed a bug which sometimes caused a wrong host in ShardOwnershipLostError.
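As a rough illustration of how a key mismatch turns into a wrong host, here is a toy sketch. `toyRing` is a hypothetical stand-in that hashes a key onto a fixed host list; it is not Cadence's hash ring, and the host names are made up. It only shows that looking up the same shard under two different keys will often name two different hosts.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"strconv"
)

// toyRing is a hypothetical, simplified ring: hash the key, pick a host.
// It is an assumption for illustration, not the real membership resolver.
type toyRing struct {
	hosts []string
}

// Lookup maps a string key to one of the hosts deterministically.
func (r toyRing) Lookup(key string) string {
	h := fnv.New32a()
	h.Write([]byte(key))
	return r.hosts[h.Sum32()%uint32(len(r.hosts))]
}

// countMismatches counts shards for which the rune-based key (used
// everywhere else) and the Itoa-based key resolve to different hosts.
func countMismatches(r toyRing, numShards int) int {
	mismatches := 0
	for id := 0; id < numShards; id++ {
		owner := r.Lookup(string(rune(id)))    // key form used everywhere else
		reported := r.Lookup(strconv.Itoa(id)) // key form fixed by this PR
		if owner != reported {
			mismatches++
		}
	}
	return mismatches
}

func main() {
	ring := toyRing{hosts: []string{"history-0", "history-1", "history-2"}}
	const numShards = 1000
	fmt.Printf("%d of %d shards resolve to a different host\n",
		countMismatches(ring, numShards), numShards)
}
```

With a single host the two lookups always agree; with several hosts a caller following the reported host can be sent to an instance that does not own the shard.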

codecov bot commented Apr 29, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 62.82%. Comparing base (b858891) to head (f519f89).

Additional details and impacted files
Files Coverage Δ
client/history/peer_resolver.go 100.00% <100.00%> (ø)
service/frontend/admin/handler.go 47.44% <100.00%> (+1.22%) ⬆️
service/history/handler/handler.go 24.16% <100.00%> (+0.81%) ⬆️
service/history/lookup/lookup.go 100.00% <100.00%> (ø)
service/history/shard/controller.go 68.91% <100.00%> (ø)

... and 7 files with indirect coverage changes



dkrotx added 2 commits April 29, 2024 16:57
itoa(shardID)
Also refactored the existing test: don't duplicate the test name in
table tests, and don't set mock expectations in SetupTest.
@dkrotx dkrotx force-pushed the fix-shardid-misuse branch from 7312cd2 to d1c9a27 Compare April 29, 2024 14:57
@@ -2069,7 +2068,7 @@ func (h *handlerImpl) RatelimitUpdate(
 func (h *handlerImpl) convertError(err error) error {
 	switch err := err.(type) {
 	case *persistence.ShardOwnershipLostError:
-		info, err2 := h.GetMembershipResolver().Lookup(service.History, strconv.Itoa(err.ShardID))
+		info, err2 := h.GetMembershipResolver().Lookup(service.History, string(rune(err.ShardID)))
Contributor

Maybe as a follow-up PR, but could we have a function that accepts shardID and hides this logic in one place? :)

Member Author

That's a good idea!
I'm afraid codecov for new lines won't pass this though :-(
Let me check what's with the coverage of the other multiple lines where Lookup() is used.

Member Author

Turns out, not that bad! Had to introduce a new directory to avoid cycle dependencies on go-generate though.
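
The helper discussed in this thread could look roughly like the sketch below. Everything here is an assumption for illustration: `Resolver`, `shardKey`, `LookupByShardID`, and `mapResolver` are hypothetical names, and the interface is reduced to a string result rather than the real resolver's return type. The point is only that the shard-ID-to-key conversion lives in one function.

```go
package main

import (
	"errors"
	"fmt"
)

// Resolver is a hypothetical, minimal key-based lookup interface.
type Resolver interface {
	Lookup(service string, key string) (string, error)
}

// shardKey is the single place that encodes a shard ID as a ring key.
// It keeps the string(rune(shardID)) form used throughout the code base.
func shardKey(shardID int) string {
	return string(rune(shardID))
}

// LookupByShardID (hypothetical) hides the key encoding from callers.
func LookupByShardID(r Resolver, service string, shardID int) (string, error) {
	return r.Lookup(service, shardKey(shardID))
}

// mapResolver is a fake resolver backed by a map, for demonstration only.
type mapResolver map[string]string

func (m mapResolver) Lookup(_ string, key string) (string, error) {
	host, ok := m[key]
	if !ok {
		return "", errors.New("no owner for key")
	}
	return host, nil
}

func main() {
	r := mapResolver{shardKey(7): "history-host-a"}
	host, err := LookupByShardID(r, "history", 7)
	fmt.Println(host, err)
}
```

Callers that only ever pass a shard ID cannot pick the wrong encoding, which is the class of mistake this PR fixes.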

Member

@taylanisikdemir taylanisikdemir left a comment

Good catch, and thanks for addressing this. +1 for @3vilhamster's idea of moving the shardID casting to a common place.

@dkrotx dkrotx merged commit 8871abc into cadence-workflow:master Apr 29, 2024
20 checks passed