Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Shrink Swarm API and empower NetworkBehaviours? #3314

Closed
thomaseizinger opened this issue Jan 9, 2023 · 8 comments
Closed

Shrink Swarm API and empower NetworkBehaviours? #3314

thomaseizinger opened this issue Jan 9, 2023 · 8 comments
Labels
decision-pending Marks issues where a decision is pending before we can move forward. difficulty:moderate priority:important The changes needed are critical for libp2p, or are blocking another project

Comments

@thomaseizinger
Copy link
Contributor

Prior issues:

I'd like to propose that we adopt the following design:

  1. Remove pretty much all public functions of Swarm apart from access to the inner NetworkBehaviour.
  2. Remove SwarmEvent in favor of only emitting NetworkBehaviour::OutEvent as part of Stream::poll_next
  3. Extend NetworkBehaviours with all capabilities required to do what users can currently do via the Swarm API

The benefits of this design are:

  • More powerful NetworkBehaviours, see Extend NetworkBehaviourAction to add and remove listeners #3291 as an example
  • Less code duplication within our codebase, currently FromSwarm and SwarmEvent are very similar
  • Clear responsibilities of components: Swarm is just a "runtime" and plugins have full control over its state

Some ideas on how we can achieve this:

  • The banning and unbanning APIs will already go away once we start to leverage feat(swarm)!: allow NetworkBehaviours to manage connections #3254.
  • We can provide a Control behaviour that exposes public functions for dialing nodes and listening on addresses.
  • We can provide an Info or Stats behaviour that tracks connected peers, external addresses etc
  • We can provide a Log behaviour that emits nice log messages for anything that happens within the Swarm
@thomaseizinger thomaseizinger added difficulty:moderate decision-pending Marks issues where a decision is pending before we can move forward. priority:important The changes needed are critical for libp2p, or are blocking another project labels Jan 9, 2023
@thomaseizinger
Copy link
Contributor Author

thomaseizinger commented Jan 9, 2023

The priority:important only applies to reaching a decision. The actual implementation of this is priority:nice-to-have in my opinion and will almost certainly have to be done in multiple steps and can happen over a longer period of time, similar to #2217.

I think reaching a decision here is important so we know which direction to move into.

@thomaseizinger
Copy link
Contributor Author

Related: #3305.

Our examples are confusing users because we showcase a linear flow based on SwarmEvents when really, users need to implement state machines on top of Swarm to handle events correctly.

Our way of expressing state machines are NetworkBehaviours. Moving forward in the direction of this issue would require us to implement our examples as a NetworkBehaviour.

That is good because it makes modular pieces that you can actually use in our application. Unfortunately, it is also a bit clunky because our current custom derive is an all or nothing solution. If we force all user logic to be a NetworkBehaviour then we should make it easy to compose behaviours and add custom logic at the same time.

I don't have a definite answer to how to best do this at the moment but one option we should explore is splitting up the NetworkBehaviour trait into multiple so one can pick and choose what to override and what to implement yourself.

@thomaseizinger
Copy link
Contributor Author

we should make it easy to compose behaviours and add custom logic at the same time.

I wonder if we can get away with designing them such that the "policy" behaviour (i.e. the user code) doesn't need to compose our "mechanism" behaviours (i.e. the dcutr one).

Libraries are most useful if they provide mechanism (performing an action such as dctur) but are ignorant about policy (when to perform the action). Afaik, we currently always perform dcutr as soon as we dial a peer through a relay. "Always" is a policy that may or may not be what the user wants.

@thomaseizinger
Copy link
Contributor Author

Afaik, we currently always perform dcutr as soon as we dial a peer through a relay. "Always" is a policy that may or may not be what the user wants.

This issue actually spans further: Currently, almost1 all NetworkBehaviours treat all connections the same, i.e. libp2p-ping will start pinging regardless whether the connection is relayed or not. Relayed connections however are resource-constrained so we should be more selective on what kind of traffic we run over a relay. This is just one example though. From a design perspective, it doesn't scale to have each NetworkBehaviour re-implement rules on when it should or should not be active.

Footnotes

  1. The only exception I know of is libp2p-relay itself: We will refuse to run a relay on top of a relayed connection to avoid circuits.

@mxinden
Copy link
Member

mxinden commented Jan 23, 2023

We can provide a Control behaviour that exposes public functions for dialing nodes and listening on addresses.

Is there a use-case where one would not want a Control behaviour? In case there is not, I propose duplicating the must-have functionalities (e.g. dialing) on Swarm. In other words, I see NetworkBehaviour as a way to make things optional, but there is no value in making something optional (e.g. dialing) if everyone always uses it.

Extend NetworkBehaviours with all capabilities required to do what users can currently do via the Swarm API

👍

Less code duplication within our codebase, currently FromSwarm and SwarmEvent are very similar

That would be nice indeed.

@thomaseizinger
Copy link
Contributor Author

We can provide a Control behaviour that exposes public functions for dialing nodes and listening on addresses.

Is there a use-case where one would not want a Control behaviour? In case there is not, I propose duplicating the must-have functionalities (e.g. dialing) on Swarm. In other words, I see NetworkBehaviour as a way to make things optional, but there is no value in making something optional (e.g. dialing) if everyone always uses it.

It would be a way of encouraging users towards implementing their networking logic as NetworkBehaviours. Perhaps that in itself is not a good idea.

@elenaf9
Copy link
Contributor

elenaf9 commented Jan 30, 2023

Thank you for starting this discussion @thomaseizinger!

I feel like this proposal is kinda going into the opposite direction than previous changes on our Swarm API. We have recently encouraged users to rather implement their logic on top of the swarm, e.g. with changes like #2100 and #2784.
Changing this now would require major refactorings from existing users, and I am not sure if it's worth it given that our current design imo is already quite reasonable.

The purpose of NetworkBehaviour (and related traits) is currently to implement protocols that we speak with remote peers, including a handler for connections, etc. As the tutorial nicely puts it:

While the previously introduced trait Transport defines how to send bytes on the network, a NetworkBehaviour defines
what bytes to send on the network.

I always found that easy to understand, and also the swarm as the interface that wraps the Transport and NetworkBehaviour and implements protocol-independent must-have functionalities. (We could maybe do better with our naming though.)

If we can simplify the NetworkBehaviour trait and related components, e.g. with efforts like #2863, I am definitely in favor of doing that. I also agree that NetworkBehaviours should be able to do everything that can be done as a Swarm, and especially to get rid of the duplication between FromSwarm and SwarmEvent.
That said, I don't think we should force our users to implement all of their logic as NetworkBehaviour. Imo it makes sense here to offer the flexibility to decide based on individual user-cases and application what logic should be implemented as NetworkBehaviour, and what on top of the Swarm.

@thomaseizinger
Copy link
Contributor Author

Thanks for the comment @elenaf9 ! I don't see it as such a drastic step "backwards".

#2784 was about handling behaviour events outside of Swarm. We still encourage that with this proposal as the events are returned from polling Swarm.

Perhaps it is a bit too early for this proposal as we don't have the necessary infrastructure yet for behaviours to collaborate amongst each other which is what we'd need.

Long-term, I think we should keep this design in mind. Polling the Swarm with minimal latency is crucial for applications. Driving it as part of a bigger event-loop can cause problems if that loop has other yield points.

Short-term, a meaningful step might be to strive for SwarmEvent to be (naming subject to bike-shedding):

enum SwarmEvent {
    FromBehaviour(TOut),
    FromSwarm(FromSwarm)
}

@thomaseizinger thomaseizinger closed this as not planned Won't fix, can't repro, duplicate, stale May 15, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
decision-pending Marks issues where a decision is pending before we can move forward. difficulty:moderate priority:important The changes needed are critical for libp2p, or are blocking another project
Projects
None yet
Development

No branches or pull requests

3 participants