Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WebRTC opening failure panic #2900

Closed
tomaka opened this issue Oct 19, 2022 · 6 comments · Fixed by #2936
Closed

WebRTC opening failure panic #2900

tomaka opened this issue Oct 19, 2022 · 6 comments · Fixed by #2936

Comments

@tomaka
Copy link
Contributor

tomaka commented Oct 19, 2022

When a WebRTC connection fails to open, there seems to be a panic here:

let stream = lock.streams.get_mut(&(connection_id, stream_id)).unwrap();

@tomaka
Copy link
Contributor Author

tomaka commented Oct 25, 2022

The problem is simply caused by the fact that the browser calls the close or error callback on the data channel when it fails to open.

Because we only register the data channel after the open callback is called, it's not there yet.

This is a bit tricky to solve, because the definition of "open" of a WebRTC connection is a bit hacky.

@tomaka
Copy link
Contributor Author

tomaka commented Oct 25, 2022

I think that this is simply solved by making connection_stream_open return the stream ID, and make connection_stream_opened interpret already-assigned IDs as "outbound" and unknown IDs as "inbound".

@tomaka
Copy link
Contributor Author

tomaka commented Oct 25, 2022

I'll wait for #2925 in order to not introduce annoying merge conflicts.

@tomaka
Copy link
Contributor Author

tomaka commented Oct 27, 2022

Actually, the idea is that we should just try opening the substream again.
There is no gain is actually tracking the pending substream openings compared to using a timeout.

@melekes
Copy link
Contributor

melekes commented Oct 27, 2022

Related comment from the spec:

Implementations SHOULD setup all the necessary callbacks (e.g.
[`ondatachannel`](https://developer.mozilla.org/en-US/docs/Web/API/RTCPeerConnection/datachannel_event))
before starting the Noise handshake. This is to avoid scenarios like one where
_A_ initiates a stream before _B_ got a chance to set the `ondatachannel`
callback. This would result in _B_ ignoring all the messages coming from _A_
targeting that stream.

@mergify mergify bot closed this as completed in #2936 Oct 27, 2022
mergify bot pushed a commit that referenced this issue Oct 27, 2022
Fix #2900

In order to open outbound substreams on a WebRTC connection, the
mechanism that the API of smoldot uses is as follows: one function to
ask the implementation to open a substream, and one callback that is
later called in order to notify that a substream has been fully opened.

The API of smoldot doesn't provide a way to report that opening an
outbound substream has failed. This is because this isn't really
supposed to happen. If you have a connection, opening a substream is
purely a "local side change": you allocate an ID, then send a message.
It should maybe not even be asynchronous, but that's off-topic here.

The problem here is that this mental model doesn't really fit what the
browser does. In particular, the browser does nothing until a data
channel is first opened, plus the browser can call `onclose` or
`onerror` on the data channel before `onopen`. To accomodate for that,
the current code is a bit hacky in the sense that it reports the
connection as immediately open.

This PR does it a bit more differently and more properly: we immediately
open a data channel at the same time as we create the
`RTCPeerConnection` in order for the browser to start connecting, even
though smoldot didn't request a channel yet. We store this first data
channel on the side, and when smoldot asks for a data channel we provide
this one.
If a data channel closes while it is still opening (the handshake one or
another), we just kill the entire connection.
@tomaka
Copy link
Contributor Author

tomaka commented Oct 27, 2022

Related comment from the spec:

That's not really related.
The problem here was that Firefox calls onclose or onerror on the handshake substream because we fail to reach the server, and our implementation didn't consider this possibility. I was assuming that onclose/onerror would only be called after the substream has been opened, and that we'd get a onconnectionstatechange event instead for the reach failure.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants