-
Notifications
You must be signed in to change notification settings - Fork 38
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ending a session gracefully #31
Comments
(If we agree this sounds plausible, I will instead express it as a PR against the protocol specs alongside Python implementations) |
I'd like to leave this up to the application level protocol how to deal with session closing, as the needs are really diverse. Between rendezvous messages, Dilation and Transit, there are seven different combinations of communication channels that may be used, of which five are practicable (using Dilation and Transit connections at once makes no sense). Each application should decide on one communication channel as their "source of truth", and implement a proper closing mechanism there. However, I agree that there is need to close a mailbox for all participants (instead of having a per-client counter and only closing after the last one is gone), see for example #19 |
"Close a mailbox for all participants" is precisely the use-case being addressed here. (And as a bonus, it should serve as a "source of truth" for many protocols -- of course, other protocols could choose a different way to shut down if desired). |
The reason I don't want to do this is precisely because of possible confusion, and because I believe all application protocols need this feature. Therefore, it makes sense to have a built-in way to do it. The mailbox is the correct communication channel because it's the only one that definitely exists. Transit or Dilation may never be established. Since an application protocol that only uses the mailbox is possible, it has to be the mailbox channel. Note that the existing file-transfer application protocol does the session-closing implicitly: only a single offer is allowed, and only a single file-transfer will occur. Thus, the Transit connection never needs to stay open and there is no ambiguity as to when the each side should close the mailbox. There is a remaining problem, though: if a transfer is cancelled. There is no explicit way to signal this, and the protocol suffers some ambiguous behavior in those cases. An explicit "close" signal like the one described here could answer that need. For Dilation the above is no longer true: the session may stay open for an arbitrary amount of time. The only other application protocols I know of are magic-folder and tahoe-lafs Grid invitations: both of those also have a non-ambiguous end, so don't strictly need this mechanism. However, they do not have a way to cancel an ongoing invite -- this could be used for that. |
The rendezvous server already enforces that at most two parties may be in one room. Therefore I'd like to expand the semantics some more on that premise:
There are also my port forwarding and transfer v2 experiments (the former being released but marked as experimental feature), both automatically close the wormhole as soon as a transit connection is established and use that for subsequent error handling and everything else. |
Changing the semantics of That is, if side A sends a Note that applications don't have to use the above proposed flow as the only way to shut down. This would still work with the port-forwarding flow you outline, I think? That is, you follow the above phased close once Transit is established. (In fact, you could ignored the phased close entirely and have each side implicitly know when to |
(Note that I'm not totally against using |
Hm, do you have an example of a protocol that might make use of that close behavior on the happy path? I understand your proposal, but it adds some significant complexity to the protocol so I need a compelling use-case to convince me it is worth the cost. |
Dilated File Transfer edit: to expand, there isn't a way to "end the session" in the proposed Dilated File Transfer application protocol. This would be it. |
Actually, Dilated file transfer can use that same mechanism on the Dilation level: closing the control channel will already do the closing/closed dance and then once the control channel is closed the Wormhole may safely be closed without need for further negotiation. |
Right, the control-channel could be used (however, this is more general -- and covers the cases where we've never actually succeeded at opening a Dilation channel -- which may never succeed). That is, the only communications channel the definitely exists is the mailbox. |
That's why I was specifically talking about the happy path. On the unhappy path, I think just closing in the face and triggering errors on the other side is fine, because it would have been an error anyways. For example, in a dilated file transfer session one side decides that it wants to close. If there are any ongoing transfers those will be awaited, but it will reject all new offers and make no send offers either. Once the last transfer completed, it can simply close the connection (ignoring graceful shutdown of the Dilation control channel for now). If the other side was to make an offer attempt concurrently it would simply get a connection lost error, which would be fine and expected. |
Again, I'm not saying we can't use the control-connection (in fact, that was my first attempt at this) but it's not as general (i.e. it doesn't help mailbox-only protocols). (It's Dilation-only -- and of course depends on that channel being established in the first place. Application-protocols inventing baroque ways of shutdown is precisely why I wanted a more-general way (we've already got 3 different ways to shut down with just 2 protocols under consideration here). The "ongoing transfers awaited" part is why I believe we need an intermediate state between "open" and "closed" (i.e. "closing", where no new messages / offers / etc are sent but ongoing things including in-flight mailbox messages can be completed). There's a few options proposed, maybe it makes sense to summarize the pro/con of them and missing features etc. Are there any besides:
There's another edge-case here which might need more-careful wording of the spec: side A has decided to stop the Dilated File Transfer session (i.e. entered This allows the subchannel with the ongoing transfer to be re-established (via a further |
Maybe this is easier to analyze as a pull-request or better write-up than a github comment ... do we at least agree that a unilateral |
Could we make the feature optional so that applications may use either depending on what they need? I'd be in for the change of giving the current CLOSE signal stronger properties, and adding a second shutdown mechanism for other use cases. |
By "giving CLOSE stronger properties" you mean changing the protocol so that either side can unilaterally close a mailbox? |
What problem does that solve? |
When the send side abandons before ever seeing the other side and then the other side joins, it will be stuck currently. This could be avoided. Probably also helps fixing #19. |
It could still be stuck with the I don't think the change helps with 19, because neither Bob nor Carol fully join the mailbox if I understand it correctly (so they wouldn't have a chance to send My problem here is this: changing the behavior of |
Actually I think there might be some issues about a
|
The "phase" itself isn't encrypted ... but yeah, what to do with the payload. Could we special-case this part to say that if you receive any Ordering is indeed an interesting question. One solution might be to include information in the But, let's see: a client re-connects to a mailbox and receives:
Can it still operate correctly? With the extra information knowing that The other side won't fully close (i.e the second I'll try writing a more formal spec with all the above considerations as well, then we can see if any other edge-cases etc come to mind. |
Going back to this, I'd like to explore a solution that works with the The reason I want the the Mailbox server to be directly involved (compared to just sending yet another message over the Mailbox to indicate closing), is that otherwise a client might disconnect before sending a first message, leaving the other side hanging. But if the Mailbox server is involved, it can act like if the disconnected side closed the Mailbox and thus correctly notify the other side. |
Adding another phase doesn't necessarily rule out "the server being involved". Anyway that sounds like an important case to capture no matter what the ultimate solution is. |
p.s. since the phase is unencrypted, using the "phase" approach can allow "the server to be involved" it it needs to be. For example, it could enforce the requirement that a side who has sent a "phase=closing" must not send any more number-phase messages. After re-reading this, I think we could/should encrypt the close message-body (that tells the other side the highest number-phase message sent) while still allowing the server to do the proper things:
Note that for "early close" we can't encrypt a message-body -- but also we don't need one, because there's no 'highest message' to communicate. However, that should be made clear in any spec. |
Considering the state diagram in #36 I'm not sure we'd need to do anything at all on the server -- it can still consider a mailbox "open" while the peers do their two However, it could split open into three states: "open", "half-open", "closed" (keeping "done" as well) to distinguish the above situations. "half-open" is when one peer has sent "phase=closing", and "closed" is when the other peer has sent its "phase=closing". Then "done" is reached after both sides do "close". We didn't talk about versioning yet. Maybe this needs a version signal... Lets consider back/forward interop without an explicit version signal: |
Meeting today, @piegamesde and I tentatively decided to pursue an approach limited to Dilation: essentially use application-layer messages similar to the "phase=closing" or the "close/closing/closed" ideas above. This avoids changing the core protocol and gives us some operational experience with the concept -- while still leaving open the possibility of bringing this up to the core protocol level later. (Since Dilation is still marked "experimental" it's somewhat less disruptive as well). |
Consider magic-wormhole connections as having two layers: the generic setup of a channel (or many subchannels in Dilation) and an "application layer" on top (such as file-transfer).
Most "application layer" protocols will want a way to gracefully end their session.
(I use "session" here to include Dilation because those may span many TCP connections).
The file-transfer protocol has an implicit end-of-session: any offer that is exchanged is the only one, so no more mailbox communication will happen after that.
However, the next-generation file transfer (see magic-wormhole/magic-wormhole-protocols#23 ) for example needs some way to indicate that the entire session is over.
I expect all Dilation protocols will have a similar need. In Dilation specifically, we could choose to declare that closing the control-channel is this "session is over" signal. However: that doesn't work for non-Dilated protocols; and the control-channel may never exist (because of network conditions). So, we need a mailbox-level message here.
Another consideration (thanks @warner) is synchronization around
close
signals. In subchannels in Dilation, there is a "half-closed" state where side A has sent all the data they intend to, and sent a CLOSE message. So side A knows side B will see all their data before the CLOSE. Similarly, since side A will wait for side B's CLOSE message, they now they'll see all of side B's data. Maybe it is possible to extend this synchronization to the mailbox itself.Currently, the
w.close()
API sends a CLOSE message to the relay, which responds with a CLOSED (but doesn't, for example, send a CLOSED to the other side). So an option could be to re-use this signal -- amend the protocol so that either side could receive aCLOSED
message that isn't in response to aCLOSE
. This kind of breaks the symmetry of close/closed though.I think an explicit mailbox-level message + phase should be added to the protocol. This can be used by both "classic" Transit-using (or even mailbox-only) protocols and by Dilation. It also allows proper recover in "journaled" mode, I believe.
The rough idea is this:
A
) sends its final mailbox messagephase=closing
mailbox message with body{"closing": true}
phase=closing
message, sideB
posts any remaining mailbox messages as then a finalphase=closing
message with body{"closed": true}
.B
MUST close its mailbox immediately after this (sends CLOSE to mailbox-server, gets CLOSED back) -- it knows sideA
will produce no new messagesA
receives any additional mailbox messages from sideB
(maybe none)A
gets aphase=closing
with{"closed": true}
from sideB
A
now knows it has all mailbox messages, and MUST immediately close its mailbox (because it knows the other side has done this also)The text was updated successfully, but these errors were encountered: