Add fake out requests system in peers.rs #2369
Automatically approving tomaka's pull requests. This auto-approval will be removed once more maintainers are active.
twiggy diff report: difference in .wasm size before and after this pull request.
@@ -11,6 +11,7 @@
- Fix another panic in case of a carefully-crafted LEB128 length. ([#2337](https://github.com/paritytech/smoldot/pull/2337))
- Fix a panic when decoding a block header containing a large number of Aura authorities. ([#2338](https://github.com/paritytech/smoldot/pull/2338))
- Fix multiple panics when decoding network messages in case where these messages were truncated. ([#2340](https://github.com/paritytech/smoldot/pull/2340), [#2355](https://github.com/paritytech/smoldot/pull/2355))
- Fix panic when the Kademlia random discovery process initiates a request on a connection that has just started shutting down. ([#2369](https://github.com/paritytech/smoldot/pull/2369))
I'm mentioning Kademlia because I've only ever seen this problem happen with the Kademlia discovery. In theory it can happen with other protocols, but in practice it doesn't seem to, probably because we accidentally conveniently never start any other kind of requests during shut down.
It's not immediately obvious to me why the answer is yes. If we've reported that connection is shutting down in
I think we've talked about this before a little bit, but this bug again is a "reminder" that the underlying design is too complex and buggy, and it would be far simpler to have a single connection to a peer instead of |
👍
Well, that's #2370
This opens us to many issues and even an attack:
So C managed to force A to connect twice to B. If we add some code that closes one connection when there are two, the question is which connection do you close? The older one or the newer one? Node A knows that the newer connection is "wrong", but node B doesn't know. The only good solution to that problem is to allow 2 simultaneous connections per peer. See also paritytech/substrate#4272 (comment)
Can't |
Thanks for the link! Didn't know there were some issues with |
C could also advertise an IP that it controls, and then when A connects it tunnels the traffic between A and this IP to B, thus making A believe that this IP is actually B. Comparing IPs is not a good way of ensuring that you don't connect to the same peer. In addition to this, there's also the fact that libp2p has many different ways of connecting to a peer, such as using a relay. Sometimes you don't have an IP address available.
Well, the design is IMO not wrong by itself. The problem here is more that the abstraction of This PR right now is about fixing a panic. |
That's what peer IDs are for, correct? And the same process (
That makes sense, thanks 🙏
I don't know which certificates you're mentioning. The original problem is that C can force A to connect to B. If A was already connected to B, then C will have forced A to open a second connection.
Fix #2361
This PR solves a tricky problem in peers.rs.
A connection can be in three phases: handshaking, established, or shutting down (and, later, completely shut down, but that doesn't count as a phase because when the connection is shut down it no longer exists).
Once a connection starts its shutdown phase, it is forbidden to start new requests on this connection, but incoming notifications can still arrive.
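The phases and the rule above can be sketched as follows. This is only an illustrative model: the `Phase` enum and its methods are hypothetical names, not the actual types in connection.rs, and treating the handshaking phase as unable to carry requests or notifications is an assumption.

```rust
/// Hypothetical model of the three connection phases described above.
/// Not the actual smoldot types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Phase {
    Handshaking,
    Established,
    ShuttingDown,
}

impl Phase {
    /// New requests may not be started once the shutdown phase has begun.
    /// Assumption: they also cannot be started while still handshaking.
    pub fn can_start_request(self) -> bool {
        matches!(self, Phase::Established)
    }

    /// Incoming notifications can still arrive while shutting down.
    pub fn can_receive_notifications(self) -> bool {
        matches!(self, Phase::Established | Phase::ShuttingDown)
    }
}
```

With this model, `Phase::ShuttingDown.can_start_request()` is `false` while `Phase::ShuttingDown.can_receive_notifications()` is `true`, which is the asymmetry that matters for the rest of the description.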
This is not a problem in connection.rs because we simply indicate when a connection starts shutting down, and the upper layers of the code are aware that they shouldn't start any new request but can still receive incoming notifications.
But in peers.rs this is problematic, because peers.rs groups together multiple connections belonging to the same peer and exposes them as just one. It is for example possible to have one connection to a peer shutting down while another connection to the same peer is still established. For this reason, peers.rs doesn't report shutdowns, and instead simply says "we are connected to a peer" or "we are no longer connected to a peer".
But this causes another problem: if you have only one connection left and it is shutting down, should you still tell the upper API layers that you're connected? The answer is yes, otherwise it would be weird to potentially receive incoming notifications from peers you are not connected to.
But if the upper API layers think that you are connected to a peer despite having only one connection that is shutting down, what happens if these upper API layers start new requests? This is the problem that this PR solves.
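To make the gap concrete, here is a rough sketch of the aggregated view. The names `ConnState`, `PeerConnections`, `reported_as_connected`, and `has_usable_connection` are hypothetical and only approximate what peers.rs tracks; handshaking connections are left out for brevity.

```rust
/// Hypothetical per-connection state (handshaking omitted for brevity).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ConnState {
    Established,
    ShuttingDown,
}

/// Hypothetical aggregate of all the connections to one peer.
pub struct PeerConnections {
    pub connections: Vec<ConnState>,
}

impl PeerConnections {
    /// What the upper API layers are told: connected as long as any
    /// connection exists, even if it is shutting down.
    pub fn reported_as_connected(&self) -> bool {
        !self.connections.is_empty()
    }

    /// Whether a real request could actually be sent right now.
    pub fn has_usable_connection(&self) -> bool {
        self.connections
            .iter()
            .any(|c| *c == ConnState::Established)
    }
}
```

The problematic case is the one where `reported_as_connected()` returns `true` but `has_usable_connection()` returns `false`: the upper layers may legitimately try to start a request, yet no connection can carry it.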
This PR introduces a fake requests system in peers.rs. When a request is started and all the connections to that peer are shutting down, we create a fake request and later report, as an event, that the request has failed.
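A minimal sketch of the idea, under stated assumptions: `Peer`, `RequestId`, `Event`, `start_request`, and `next_event` are hypothetical names chosen for illustration and do not reflect the actual API of peers.rs. Real requests sent over an established connection are omitted.

```rust
use std::collections::VecDeque;

/// Hypothetical request identifier.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct RequestId(pub u64);

/// Hypothetical event type reported to the upper API layers.
#[derive(Debug, PartialEq, Eq)]
pub enum Event {
    RequestFailed { id: RequestId },
}

/// Hypothetical per-peer state.
pub struct Peer {
    established: usize,
    shutting_down: usize,
    next_id: u64,
    /// Requests that were never sent and must be reported as failed.
    fake_requests: VecDeque<RequestId>,
}

impl Peer {
    pub fn new(established: usize, shutting_down: usize) -> Self {
        Peer { established, shutting_down, next_id: 0, fake_requests: VecDeque::new() }
    }

    /// Starts a request. If every connection is shutting down, the request
    /// is recorded as fake instead of panicking.
    pub fn start_request(&mut self) -> RequestId {
        assert!(self.established + self.shutting_down > 0, "not connected");
        let id = RequestId(self.next_id);
        self.next_id += 1;
        if self.established == 0 {
            self.fake_requests.push_back(id);
        }
        // Otherwise the request would be sent on an established
        // connection (omitted in this sketch).
        id
    }

    /// Later reports each fake request as a failure event.
    pub fn next_event(&mut self) -> Option<Event> {
        self.fake_requests
            .pop_front()
            .map(|id| Event::RequestFailed { id })
    }
}
```

From the caller's point of view, a fake request looks like any other request that happened to fail, so the upper layers need no special handling for the shutdown case.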