-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
grpclb: recover after receiving an empty server list #4879
Conversation
The root cause is SubConn not handling empty addr list correctly, filed #4881 |
balancer/grpclb/grpclb_test.go
Outdated
}} | ||
|
||
creds := serverNameCheckCreds{} | ||
ctx, cancel := context.WithTimeout(context.Background(), 20*time.Second) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We now have a defaultTestTimeout
const in this file. You can use that instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
balancer/grpclb/grpclb_test.go
Outdated
for i := 0; i < 100 && !gotUnavailable; i++ { | ||
func() { | ||
ctx, cancel := context.WithTimeout(ctx, time.Second*1) | ||
defer cancel() | ||
_, err := testC.EmptyCall(ctx, &testpb.Empty{}) | ||
if status.Code(err) == codes.Unavailable { | ||
gotUnavailable = true | ||
} | ||
}() | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we check clientConn state instead of this? Are we sure this will not be flaky?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is copied from Alex's reproduction code.
I don't think this will be flaky. But I simplified the test a bit.
break | ||
} | ||
if sc != nil { | ||
if len(backendAddrs) == 0 { | ||
lb.cc.cc.RemoveSubConn(sc) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it better to handle this UpdateAddresses()
so that every balancer need not handle this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. That's what #4881 is tracking.
6a60fa9
to
db2b3bc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a couple of minor nits.
balancer/grpclb/grpclb_test.go
Outdated
creds := serverNameCheckCreds{} | ||
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout) | ||
defer cancel() | ||
cc, err := grpc.DialContext(ctx, r.Scheme()+":///"+beServerName, grpc.WithResolvers(r), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor nit: Could we not split these lines in a random way. In a recent PR, I cleaned up this test to instead do
cc, err := grpc.Dial(r.Scheme()+":///"+beServerName,
grpc.WithResolvers(r),
grpc.WithTransportCredentials(&serverNameCheckCreds{}),
grpc.WithContextDialer(fakeNameDialer))
if err != nil {
t.Fatalf("Failed to dial to the backend %v", err)
}
which is a tad more readable I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
balancer/grpclb/grpclb_test.go
Outdated
r.UpdateState(resolver.State{ | ||
Addresses: []resolver.Address{{ | ||
Addr: tss.lbAddr, | ||
Type: resolver.GRPCLB, | ||
ServerName: lbServerName, | ||
}}, | ||
ServiceConfig: scpr, | ||
}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: similarly I cleaned up calls to UpdateState
to include balancer addresses through an attribute instead of the deprecated Type
field.
rs = grpclbstate.Set(resolver.State{ServiceConfig: r.CC.ParseServiceConfig(grpclbConfig)},
&grpclbstate.State{BalancerAddresses: []resolver.Address{{
Addr: tss.lbAddr,
ServerName: lbServerName,
}}})
r.UpdateState(rs)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
When grpclb receives an empty address list, it calls
sc.UpdateAddress([])
, which closes the underlyingaddrConn
. It won't recover if a new addr list is received later.RELEASE NOTES: