WIP fix(core): allow requests to be queued in CONNECTING state (#374) #583

ztzg · 2020-02-09T16:09:19Z

With this patch, requests issued while the client is in the CONNECTING state get queued instead of raising a (misleading) SessionExpiredError.

TBD:

This patch does not prevent requests which have been queued but not emitted from being rejected with ConnectionLoss. It should also get rid of the second part of _notify_pending, shouldn't it? [Not doing this, as the C/Java clients do not, either.]
This patch does not try to limit the maximum queue length, which should probably be controlled via a new KazooClient parameter. (How about max_queue_length? What should it default to? Which exception should we raise when the queue overflows?) [Not doing this, as the C/Java clients do not, either.]

ztzg · 2020-02-09T17:13:04Z

@jeffwidman, @StephenSorriaux: As far as I can tell, just removing the check for CONNECTING from KazooClient._call is sufficient to obtain the desired behavior; _queue and _write_sock should remain valid in that state (with the exception of the "race with close()" which is already handled by the code). Do you agree?

jeffwidman · 2020-02-10T18:47:10Z

First, I have not looked deeply at this section of the code, so please take what I say with a grain of salt.

removing the check for CONNECTING from KazooClient._call is sufficient to obtain the desired behavior; _queue and _write_sock should remain valid in that state (with the exception of the "race with close()" which is already handled by the code). Do you agree?

That sounds very reasonable.

This patch does not prevent requests which have been queued but not emitted from being rejected with ConnectionLoss. It should also get rid of the second part of _notify_pending, shouldn't it?

I'm not sure. @StephenSorriaux can you also take a look at this? It's a very small code change, but connection blips are common enough that we do want to make sure we get the model right.

This patch does not try to limit the maximum queue length, which should probably be controlled via a new KazooClient parameter. (How about max_queue_length? What should it default to? Which exception should we raise when the queue overflows?)

How do the Java and C clients handle this? IMO they're the canonical examples so most people will be expecting similar behavior...

ztzg · 2020-02-10T21:43:22Z

This patch does not try to limit the maximum queue length, which should probably be controlled via a new KazooClient parameter. (How about max_queue_length? What should it default to? Which exception should we raise when the queue overflows?)

How do the Java and C clients handle this? IMO they're the canonical examples so most people will be expecting similar behavior...

From the top of my head: they just queue requests, and don't have a maximum queue length. (I will double-check when I circle back to this.)

ztzg · 2020-02-12T18:59:10Z

This patch does not try to limit the maximum queue length, which should probably be controlled via a new KazooClient parameter. (How about max_queue_length? What should it default to? Which exception should we raise when the queue overflows?)

How do the Java and C clients handle this? IMO they're the canonical examples so most people will be expecting similar behavior...

Java: The client simply queues the request "packet" on a LinkedBlockingQueue of capacity Integer.MAX_VALUE (2^31 - 1):

https://github.com/apache/zookeeper/blob/517ecde9dd3cfe574118231ec242ea73b80a4e79/zookeeper-server/src/main/java/org/apache/zookeeper/ClientCnxn.java#L533

It thus (indirectly) throws an IllegalStateException("Queue full") after queuing 2+ billions of packets.
C: Same, except that the client uses an unbounded intrusive linked list, returning ZSYSTEMERROR if it fails to allocate memory:

https://github.com/apache/zookeeper/blob/ae1a52c75b19e15db5913ac34d34b1c6fbf75e36/zookeeper-client/zookeeper-client-c/src/zookeeper.c#L1645-L1652

Neither checks the state of the connection before doing so.

jeffwidman · 2020-02-12T20:10:19Z

Sounds good, we can do the same. That handles one of your outstanding TODOs.

For the second part:

This patch does not prevent requests which have been queued but not emitted from being rejected with ConnectionLoss. It should also get rid of the second part of _notify_pending, shouldn't it?

Can you explain this a little more?
On the surface it sounds like a good thing if requests that were queued but then not emitted due to a permanent network failure eventually raise a ConnectionLoss... but we'd want to make sure some timeout had passed to guarantee permanent failure. Because if it was a temp network blip that the library recovers successfully and then sends these messages, then obviously we want to mark as successful.

Sorry I'm asking for all these details, I'm trying to be somewhat thoughtful on a limited time budget and I know you've looked at this code deeply so already understand it.

Looking forward to landing this soon!

ztzg · 2020-02-12T20:43:12Z

Sounds good, we can do the same. That handles one of your outstanding TODOs.

Seems reasonable.

For the second part:

This patch does not prevent requests which have been queued but not emitted from being rejected with ConnectionLoss. It should also get rid of the second part of _notify_pending, shouldn't it?

Can you explain this a little more?

Sure.

On the surface it sounds like a good thing if requests that were queued but then not emitted due to a permanent network failure eventually raise a ConnectionLoss... but we'd want to make sure some timeout had passed to guarantee permanent failure. Because if it was a temp network blip that the library recovers successfully and then sends these messages, then obviously we want to mark as successful.

First, something I am not planning to change: when the connection is lost, the requests which have been emitted but are not completed fail with ConnectionLoss (as the ACKs will never reach the client anyway).

With the current version of the patch, the requests which were queued but not emitted are also cancelled with the same exception:

kazoo/kazoo/client.py

Lines 564 to 570 in 3a1d442

    
           while True: 
        
               try: 
        
                   request, async_object = self._queue.popleft() 
        
                   if async_object: 
        
                       async_object.set_exception(exc) 
        
               except IndexError: 
        
                   break

The idea would be not to cancel these requests when the new state is CONNECTING—as the client is trying to validate an existing session, and these "pristine" requests could very well be submitted in-order over the new connection if the session is recovered.

(Moreover, this would be more consistent with the rest of the patch; it would not matter if a request was submitted just before or during the "outage"—as long as it was not emitted.)

I have implemented the feature, but it currently breaks some assumptions in the tests; I will append a a second commit to the PR once I get to fix that. (And it is fine with me if we decide to only merge the first half in the end.)

Sorry I'm asking for all these details, I'm trying to be somewhat thoughtful on a limited time budget and I know you've looked at this code deeply so already understand it.

No problem! I have looked somewhat deeply, but am not very familiar with the codebase—so I am happy that you are questioning my assumptions.

jeffwidman · 2020-02-12T22:25:07Z

Thanks, that is helpful.

I definitely agree with you that queued but not emitted should not fail while in CONNECTING state... essentially we want that reconnection to be invisible to the application (other than error logs or possibly if they can explicitly check the state of each message they can see it's not complete yet).

The one thing I do want to make sure of is that if the connection never manages to recover, all the queued-but-not-emitted requests are in some way are visible as either failed or incomplete to the application. I think you understand why, but if not let me know and I can explain further.

Assuming that is true, then I am in agreement with everything you said.

Thank you again!

ztzg · 2020-02-13T05:44:12Z

The one thing I do want to make sure of is that if the connection never manages to recover, all the queued-but-not-emitted requests are in some way are visible as either failed or incomplete to the application.

ACK; I think we are on the same page :)

StephenSorriaux · 2020-02-13T12:54:16Z

Thank you for this PR and sorry for this late reply.

I agree with both of you on the fact that the _queue should not be emptied if the connection was somehow temporary lost.

Do you think it would be possible to precise this "new" behavior somewhere in the documentation?

…#374) As discussed in python-zk#570 (comment): With this patch, requests issued while the client is in the 'CONNECTING' state get queued instead of raising a (misleading) 'SessionExpiredError'.

…thon-zk#374) If the connection is lost, but the state is 'CONNECTING', the client is trying to revalidate an existing session. When that happens, the requests which have been dequeued but are still pending are interrupted with a 'ConnectionLoss' exception--as we don't know if the packet reached the server, and the ACKs will never come back anyway. Without this patch, the requests which are still queued, and thus haven't been emitted are also cancelled with the same exception (see bottom half of '_notify_pending'). It seems that there is no reason to cancel such requests when the new state is 'CONNECTING', as the client is trying to validate an existing session; these "pristine" requests could very well be submitted in-order over the new connection if the session is recovered. This also seems more consistent with the first patch for issue python-zk#374: it does not matter whether a request was queued just before or during the "outage"--as long as it was not emitted. This patch implements that feature. ATTN: The patch is marked WIP because it should most probably *NOT* be merged--as it turns out that both the Java and the C client cancel such requests when the connection is lost. Java: ClientCnxn socket error -> cleanAndNotifyState -> cleanup -> conLossPacket/remove C: Socket error -> handle_socket_error_msg -> handle_error -> cleanup -> cleanup_bufs -> free_buffers/free_completions

ztzg · 2020-02-14T18:55:13Z

@StephenSorriaux wrote:

Thank you for this PR and sorry for this late reply.

No problem!

I agree with both of you on the fact that the _queue should not be emptied if the connection was somehow temporary lost.

@jeffwidman, @StephenSorriaux: I believe the first commit of this series (the actual fix for #374), should be good to go. It does not try to bound the queue length, but neither do the Java or C clients.

I have now implemented the new feature (not emptying the queue on retry-able disconnect), and have fixed/augmented the tests—but am now inclined to think that it should not be merged! (Hence the WIP in the commit title.)

Indeed, it turns out that both the Java and C clients cancel such requests when the connection is lost:

Java: ClientCnxn socket error → cleanAndNotifyState → cleanup → conLossPacket/remove;
C: Socket error → handle_socket_error_msg → handle_error → cleanup → cleanup_bufs → free_buffers/free_completions.

I have double-checked; this happens even when the errors are not fatal and the session is recovered. (The C client doesn't even have a way to distinguish between queued & pending, and thus naturally cancels everything.)

I'm not sure it makes a difference in practice, as there is a "race" between the queuing of requests and the session event notification—particularly in the Java version which does not even lock the outgoingQueue—and one cannot assume that every request submitted before the notification is going to be canceled. But staying aligned with the Java/C clients might just be a better idea.

Anyway: I have pushed both patches, and will let you have a look. Unless you disagree, I will respin without the second one in a few days.

Do you think it would be possible to precise this "new" behavior somewhere in the documentation?

Sure; I'll add a note for each patch we end up including.

jeffwidman · 2020-02-16T06:01:52Z

Oh boy, that's a doozy of a research report. Thank you for digging into this. If you ever get tired of consulting and want a full time gig, I know a number of folks who are always looking to hire engineers with such attention to detail. 😁

So I think you're on the right track here... on the surface at least, it seems the Java and C clients are making a mistake. But I also agree we should not deviate from them because most knowledgable ZK folks tend to be experts in ZK who switch-hit in polyglot environments so it's easier for the ecosystem if the behavior stays consistent across clients, even if it's arguably slightly incorrect (but clearly workable otherwise it would have been fixed long ago).

So I think we should do as you suggest:

open a second PR with just the first fix
keep the second commit in this PR around (ie, don't force-push and obliterate it), but close it for now so it doesn't get merged.

Furthermore, I think we should go one step further and submit an issue to the upstream ZK team to change this behavior in the Java and C clients.

Do you want to submit this or shall I? If you don't have time I'm willing to do it if, but you discovered the issue, you understand it far better than I do, and as a consultant it's always a nice trust builder with potential clients to say that you've got a patch accepted to a core project like Zookeeper. Up to you, I just want to make sure in some way this is pushed upstream which which should result in either a fix or further clarification.

As a heads up, my personal experience with the core ZK project is that it tends to be a slower moving project.

Thoughts?

ceache · 2020-02-16T22:56:51Z

@jeff Damien knows, he already has helped get SASL support in the C client 😉

…

On Sun, Feb 16, 2020, 01:01 Jeff Widman ***@***.***> wrote: Oh boy, that's a doozy of a research report. Thank you for digging into this. If you ever get tired of consulting and want a full time gig, I know a number of folks who are always looking to hire engineers with such attention to detail. 😁 So I think you're on the right track here... on the surface at least, it seems the Java and C clients are making a mistake. But I also agree we should not deviate from them because most knowledgable ZK folks tend to be experts in ZK who switch-hit in polyglot environments so it's easier for the ecosystem if the behavior stays consistent across clients, even if it's arguably slightly incorrect (but clearly workable otherwise it would have been fixed long ago). So I think we should do as you suggest: 1. open a second PR with just the first fix 2. keep the second commit in this PR around (ie, don't force-push and obliterate it), but close it for now so it doesn't get merged. Furthermore, I think we should go one step further and submit an issue to the upstream ZK team to change this behavior in the Java and C clients. Do you want to submit this or shall I? If you don't have time I'm willing to do it if, but you discovered the issue, you understand it far better than I do, and as a consultant it's always a nice trust builder with potential clients to say that you've got a patch accepted to a core project like Zookeeper. Up to you, I just want to make sure in some way this is pushed upstream which which should result in either a fix or further clarification. As a heads up, my personal experience with the core ZK project is that it tends to be a slower moving project. Thoughts? — You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub <#583?email_source=notifications&email_token=AAIFTHVGYWVJKACUGHAVTYTRDDJFFA5CNFSM4KSCUODKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEL36VLI#issuecomment-586672813>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAIFTHQ6RDKBNTWYIMJ7BODRDDJFFANCNFSM4KSCUODA> .

…#374) With this patch, requests issued while the client is in the 'CONNECTING' state get queued instead of raising a misleading 'SessionExpiredError'. This fixes python-zk#374, and brings Kazoo more in line with the Java and C clients. See the 'kazoo.client.KazooClient.state' documentation as well as these discussions for more details: python-zk#570 (comment) python-zk#583 (comment)

ztzg · 2020-02-17T18:11:42Z

Oh boy, that's a doozy of a research report. Thank you for digging into this. If you ever get tired of consulting and want a full time gig, I know a number of folks who are always looking to hire engineers with such attention to detail.

:) I'm just trying to be a bit careful with concurrent systems programming; I know that one never is too prudent with those. But thank you for the kind words !

So I think you're on the right track here... on the surface at least, it seems the Java and C clients are making a mistake. But I also agree we should not deviate from them because most knowledgable ZK folks tend to be experts in ZK who switch-hit in polyglot environments so it's easier for the ecosystem if the behavior stays consistent across clients, even if it's arguably slightly incorrect (but clearly workable otherwise it would have been fixed long ago).

Right. But I don't know if I would say "incorrect"; perhaps just a bit pessimistic and/or overly disruptive.

(Which is not out of character; ZK currently is of a pessimistic nature. It would be nice to have something akin to ZOOKEEPER-22, but that proposal has been lingering for a long time now! Raft has that feature IIRC.)

So I think we should do as you suggest:

open a second PR with just the first fix

Done: #588.

That commit is based on the first fix, but with an expanded set of tests adapted from what I had cooked up for the second part.

keep the second commit in this PR around (ie, don't force-push and obliterate it), but close it for now so it doesn't get merged.

Okay; good idea.

Furthermore, I think we should go one step further and submit an issue to the upstream ZK team to change this behavior in the Java and C clients.

Do you want to submit this or shall I? […] I just want to make sure in some way this is pushed upstream which which should result in either a fix or further clarification.

I am definitely going to keep this in mind, and on my TODO list. I would like to think a bit more about which "use-cases" it could break, though.

(I could imagine an application batching asynchronous requests and hoping to get away with it by taking note of state changes; it could suddenly find itself in a strange place much more often.)

I am planning to ask for clarification on the ML a bit later, unless I identify an issue with this "proposal" in the meantime. (I will report here in any case.)

But feel free to go ahead if you are curious and/or impatient, I'm not trying to collect points, and will gladly follow that conversation!

As a heads up, my personal experience with the core ZK project is that it tends to be a slower moving project.

Right; as @ceache mentions, I have already gotten a taste of it :) The good news is that they seem to have picked up some steam lately!

Cheers, -D

jeffwidman · 2020-02-18T06:51:01Z

Sounds good.

TBH, I switched teams at my day job last fall and am not currently responsible for any production ZK ensembles or applications that talk to ZK, so due to lack of time I unsubscribed from the mailing list... I pitch in here from time to time for fun and to give back to the community so that PR's don't languish. That said, I'll be curious to hear what the eventual outcome is.

…#374) With this patch, requests issued while the client is in the 'CONNECTING' state get queued instead of raising a misleading 'SessionExpiredError'. This fixes python-zk#374, and brings Kazoo more in line with the Java and C clients. See the 'kazoo.client.KazooClient.state' documentation as well as these discussions for more details: python-zk#570 (comment) python-zk#583 (comment)

With this patch, requests issued while the client is in the 'CONNECTING' state get queued instead of raising a misleading 'SessionExpiredError'. This fixes #374, and brings Kazoo more in line with the Java and C clients. See the 'kazoo.client.KazooClient.state' documentation as well as these discussions for more details: #570 (comment) #583 (comment)

ztzg force-pushed the fix-374-queue-requests-while-connecting branch from 6f719bc to 3a1d442 Compare February 9, 2020 16:53

ztzg requested review from jeffwidman and StephenSorriaux February 9, 2020 17:07

ztzg added 2 commits February 14, 2020 17:54

fix(core): allow requests to be queued in CONNECTING state (python-zk…

2b0380d

…#374) As discussed in python-zk#570 (comment): With this patch, requests issued while the client is in the 'CONNECTING' state get queued instead of raising a (misleading) 'SessionExpiredError'.

ztzg force-pushed the fix-374-queue-requests-while-connecting branch from 3a1d442 to b989f10 Compare February 14, 2020 18:23

ztzg mentioned this pull request Feb 17, 2020

fix(core): allow requests to be queued in CONNECTING state (#374) [v2] #588

Merged

4 tasks

ztzg closed this Feb 17, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

WIP fix(core): allow requests to be queued in CONNECTING state (#374) #583

WIP fix(core): allow requests to be queued in CONNECTING state (#374) #583

ztzg commented Feb 9, 2020 •

edited

Loading

ztzg commented Feb 9, 2020

jeffwidman commented Feb 10, 2020

ztzg commented Feb 10, 2020

ztzg commented Feb 12, 2020

jeffwidman commented Feb 12, 2020

ztzg commented Feb 12, 2020

jeffwidman commented Feb 12, 2020 •

edited

Loading

ztzg commented Feb 13, 2020

StephenSorriaux commented Feb 13, 2020

ztzg commented Feb 14, 2020

jeffwidman commented Feb 16, 2020

ceache commented Feb 16, 2020 via email

ztzg commented Feb 17, 2020

jeffwidman commented Feb 18, 2020 •

edited

Loading

WIP fix(core): allow requests to be queued in CONNECTING state (#374) #583

WIP fix(core): allow requests to be queued in CONNECTING state (#374) #583

Conversation

ztzg commented Feb 9, 2020 • edited Loading

ztzg commented Feb 9, 2020

jeffwidman commented Feb 10, 2020

ztzg commented Feb 10, 2020

ztzg commented Feb 12, 2020

jeffwidman commented Feb 12, 2020

ztzg commented Feb 12, 2020

jeffwidman commented Feb 12, 2020 • edited Loading

ztzg commented Feb 13, 2020

StephenSorriaux commented Feb 13, 2020

ztzg commented Feb 14, 2020

jeffwidman commented Feb 16, 2020

ceache commented Feb 16, 2020 via email

ztzg commented Feb 17, 2020

jeffwidman commented Feb 18, 2020 • edited Loading

ztzg commented Feb 9, 2020 •

edited

Loading

jeffwidman commented Feb 12, 2020 •

edited

Loading

jeffwidman commented Feb 18, 2020 •

edited

Loading