Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Try to reword Dilation specification #33

Open
wants to merge 16 commits into
base: dilation
Choose a base branch
from
Open

Conversation

piegamesde
Copy link
Member

At the moment it is written more as the documentation of the Python implementation. In the end, we want it to be a standalone document that helps others implement that protocol too, without having to know any of the Python internals or Twisted.

We want this repository to be agnostic of any site generators or similar.
Information related to rendering the markdown as web page should be stored
externally.
It doesn't seem to be reference anywhere and also looks like a prototype/sketch.
Closes #30.

Those files also contained a couple of paragraphs – mainly the general introduction –
that will have to be rewritten in a client-agnostic style.
dilation-protocol.md Outdated Show resolved Hide resolved
piegamesde and others added 10 commits March 20, 2023 22:29
Rewrote the existing topics. Tried to minimize the discussion of any hypotheticals
and future work, as this should be done outside of the document itself.

Added privacy sections discussing file size metadata and IP addresses.
- Deprecate the `current_cli_version` key
- Improved wording, added type information
Slowly make the naming more consistent.
semantic newlines; document what the implementation does; versions are integers
update (just) the Dilation abilities section
At the moment it is written more as the documentation of the Python implementation.
In the end, we want it to be a standalone document that helps others implement that
protocol too, without having to know any of the Python internals or Twisted.
@piegamesde piegamesde changed the title [WIP] Try to reword Dilation specification Try to reword Dilation specification May 24, 2023
@piegamesde piegamesde marked this pull request as ready for review May 24, 2023 12:50
@piegamesde piegamesde requested a review from meejah May 24, 2023 12:50
Copy link
Member

@meejah meejah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still see benefits in keeping layers distinct, or at least describing them that way. Naming aside, grouping the different concerns I think helps implementers -- knowing that the L4 layer doesn't have to care about anything to do with channel-selection, encryption, deserialization, framing etc. Basically all it is concerned with is the high-level messages and their acknowledgement.

We don't have to strictly follow what's already there, but some separation of concerns is a big benefit I think. For example, you have a "todo" about ordering of subchannel vs seq-num serialization, but that doesn't matter because you've already gotten "one whole message" from L2 so it's irrelevant what order the bytes are in.

Have you read the latest draft of the python specification? Also it would be good to link that in this document.

(Some inline comments too of course :)

index.md Show resolved Hide resolved
Additionally, the protocol takes care of automatically re-establishing such a channel after a connection interruption,
including re-sending all messages that got lost.
Furthermore, it has the concept of sub-channels built in,
and will take care of the sub-channel management and data multiplexing too.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This "kind of" gets at the "durability" aspect, but I feel that should be stated more clearly.

What you get from the subchannels is a stream of data that will definitely be delivered to the other side even changing networks, etc, until one side gives up or both sides are disconnected from the mailbox for "too long".

It is additionally used by the Dilation implementation to exchange internal coordination messages.

The usual Transit connection is replaced by a Dilation connection, that is managed by the Dilation implementation.
Compared to the Transit connection, it has additional features like resilience against connection interruptions and sub-channels.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this hints at durability too, but similar to above doesn't quite state that?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, what would you like to hear here?

dilation-protocol.md Outdated Show resolved Hide resolved
dilation-protocol.md Outdated Show resolved Hide resolved
* 0x04 CLOSE, 4-byte subchannel-id, 4-byte seqnum
* 0x05 ACK, 4-byte response-seqnum

TODO it would make more sense to have the seqnum before the subchannel-id, processing wise ~piegames
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? Also you have to deserialize all of this to process it, so I can't see why order matters at all.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that it mostly doesn't matter, this is just me being pedantic.

dilation-protocol.md Show resolved Hide resolved

### Flow Control

TODO review and potentially remove? ~piegames
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say: absolutely not, flow-control is pretty central to streaming applications.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically, as far as I can see flow control is something for the Dilation implementation to handle, and requires no protocol support at all. So if you remove the Python specifics from the paragraph, it will go something like "Implementations might offer some flow-control functionality or quality of service for the sub-channels, so that a single very busy channel does not starve the others. This is not a requirement though, as it significantly increases implementation complexity and most applications do just fine without".

Copy link
Member

@meejah meejah May 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it would still say the stuff about pausing ALL channels whenever a single one needs to provide backpressure. That is, that implementations should not attempt to provide their own subchannel flow-control, and instead rely on TCP's. (Competing flow-controllers on the same channel do not play nicely)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see, I think I was confusing flow control and quality of service. (Tbf not entirely my fault, because the paragraph also talks about the one while actually describing the latter.)

So is the takeaway here "don't do your own flow control but feel free to implement QoS"?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original (https://github.com/magic-wormhole/magic-wormhole/blob/master/docs/dilation-protocol.md#flow-control) doesn't mention QoS at all, and I wouldn't mention it here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know but it talks about QoS: "e.g. a chat conversation sharing a wormhole with file-transfer might prefer the IM text to take priority". Therefore why not talk about that explicitly?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess that's certainly a use-case for QoS -- but what do you think is worth mentioning here? There's nothing to be done by implementers -- they have to pause "the TCP stream" to achieve backpressure.

If anything, this would be an interesting point for application developers: mixing high-bandwidth subchannels with low-bandwith ones may perform poorly under bad network conditions. But I think all they can do here is "not do that"?

That is, I don't believe that sentence is supposed to imply to Dilation implementers that they should attempt to implement some sort of QoS system. I don't recall details on how that'll interact (or not) with TCP windowing etc. (Usually QoS is done per-TCP-stream, e.g. prioritizing one whole stream over another).

dilation-protocol.md Outdated Show resolved Hide resolved
This involves sending a `reconnect` message and according hints over the Mailbox connection,
the same way as for initially establishing the connection.
Once a new connection is established and the handshake completes,
both sides re-transmit their outbound messages that were lost due to the connection interruption.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Careful here: these are L4 messages. If you instead re-transmitted L2 messages, the crypto may break (and another reason to use distinct keys for different generations).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I see you resolved this, but I don't seen changes .. maybe your workflow is different from mine).

This should also say how a peer determines which messages were lost (or potentially lost).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops I'm sorry.

I usually keep track of open discussions as my TODO list (at least for things that are easily fixable), but I forgot to push my changes after doing so.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unresolving because of "This should also say how a peer determines which messages were lost"

@piegamesde
Copy link
Member Author

Thanks for the review

I still see benefits in keeping layers distinct, or at least describing them that way. […]
We don't have to strictly follow what's already there, but some separation of concerns is a big benefit I think.

I don't like the concept of "layer" because to me it invokes feeling of an "OSI layers" stack, which is only true for parts of it. I'd rather think of them as interdependent components, and treat them as such. But you are right, the new wording could make the separation a bit more clear again.

One thing to note is that sometimes the boundaries are a bit fuzzy: For example while Python has L5 stacked on top of L4, one could also write the code that mostly does both at once, but delegates parts of it to a "subchannel manager". (Basically the reverse from what Python does.)

Have you read the latest draft of the python specification? Also it would be good to link that in this document.

Not really no, these changes here are mostly a rebase from my work in last autumn. I think it has changes over there that I should incorporate here? (If possible, I'd like to delegate that too to a future PR)

@meejah
Copy link
Member

meejah commented May 25, 2023

Yes, they're like the OSI stack: a "higher layer" can depend on a lower one, but NOT vice-versa. This is a good thing.

So they're not "inter"-dependent. It's fine if you want to assign different names, I suppose, but for network protocol implementers the OSI stack is very familiar and talking about "layers" is pretty common in lots of Internet protocols.

It also lets you forget / compartmentalize complexity: I can forget about framing, encryption, ACKs, re-sends, etc etc at the higher layers and just think about the logical flow of in-order messages. It also helps with state-machine diagrams and separation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants