Refactor out routing system #655

aschmahmann · 2020-05-27T16:36:48Z

In order to support #616 we need to have more configurability of how records received from the DHT are processed (i.e. everything we do aside from the Kademlia logic itself).

This PR takes a stab at that by introducing a set of processing functions that operate on the received records and then making those configurable via routing options.

One thing currently missing from the PR is the ability to configure which peers the updating puts are sent to when we do a GetValue/SearchValue query.

A next step to examine is extracting the networking/RPC code from the routing + query logic so that they are attached to separate objects (if not packages) to make exposing additional functionality less overwhelming and confusing to new users.

Still a WIP, but looking for some feedback @aarshkshah1992

aschmahmann · 2020-05-28T23:04:50Z

lookup.go

+// GetClosestPeersSeeded is the Kademlia 'node lookup' operation
+func (dht *IpfsDHT) GetClosestPeersSeeded(ctx context.Context, key string, seedPeers []peer.ID, useRTPeers bool) (<-chan peer.ID, error) {


This is the first of these double functions. It's a bit unfortunate that we have to expose 2x as many functions for the same thing. We could move them into a new struct, but I'm not sure if it's worth it. Thoughts?

If we're going to leave these functions on the main struct, we should have a consistent naming scheme for them. Extended feels pretty general, and something to do with seeding the query or being a continuable query seems a little specific. I'm up for suggestions, otherwise I'll just use [OriginalName]Extended everywhere.

Also, while we're rewriting this it would be great to return []peer.ID instead of chan peer.ID, but some tests may have to be modified.

aschmahmann · 2020-05-28T23:08:25Z

routing/pipeline.go

+)
+
+var (
+	logger = logging.Logger("dht.routing")


Should we just use the dht logger here?

aschmahmann · 2020-05-28T23:11:49Z

routing/pipes.go

+type Processor interface {
+	Process(interface{}, func()) (interface{}, error)
+}


This is pretty unfortunate and is related to Go's lack of generics. We could also do RecordProcessor and ProviderProcessor here and just duplicate more of the code. Both options seem similarly gross to me, but having a single interface makes some of this a bit simpler (e.g. fewer routing options, sharing the CountStopper, etc.).

Also, once we move to a unified record system this should hopefully go away 🙏
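To make the trade-off concrete, here is a rough sketch of how a single `interface{}`-based processor can be shared between records and providers. The `Processor` interface is copied from the diff above; `countStopper` is a hypothetical simplification (only the `MaxCount` field name is taken from this PR, the rest is assumed) and is not the actual CountStopper implementation.

```go
package main

import "fmt"

// Processor mirrors the interface from the diff above.
type Processor interface {
	Process(interface{}, func()) (interface{}, error)
}

// countStopper is a hypothetical sketch: it passes every value through
// unchanged and calls stop once it has seen MaxCount values.
// A MaxCount of 0 means "never stop".
type countStopper struct {
	MaxCount int
	seen     int
}

func (c *countStopper) Process(v interface{}, stop func()) (interface{}, error) {
	c.seen++
	if c.MaxCount > 0 && c.seen >= c.MaxCount {
		stop()
	}
	return v, nil
}

func main() {
	var p Processor = &countStopper{MaxCount: 2}
	stopped := false
	stop := func() { stopped = true }

	// The same processor accepts a raw record and a provider-like value,
	// at the cost of losing compile-time type checking.
	for _, in := range []interface{}{[]byte("record"), "provider-peer"} {
		out, err := p.Process(in, stop)
		fmt.Printf("%v %v stopped=%v\n", out, err, stopped)
	}
}
```

With separate RecordProcessor and ProviderProcessor interfaces, a counter like this would have to be written twice.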

aschmahmann · 2020-05-28T23:13:49Z

routing/wrapper.go

+// FindProviders searches for the providers corresponding to given Key and streams the results.
+func FindProviders(ctx context.Context, key multihash.Multihash, findProvsFn findProvsFn, processors []Processor, cfg *routing.Options) (<-chan peer.AddrInfo, <-chan []peer.ID, error) {


This is basically just a duplicated version of the SearchValue function, but with added type safety. Not sure if it's worth it, but seems reasonable.

aschmahmann · 2020-05-28T23:16:12Z

routing/wrapper.go

+	outChSize := maxRequestedRecords
+	if outChSize == 0 {
+		outChSize = 1
+	}
+	out := make(chan peer.AddrInfo, outChSize)


This code existed previously: for a requested provider record count > 0, the channel buffer = requested record count. I don't know if this is really necessary anymore; it seems like we could just set the channel size to 1.

While this doesn't really hurt too much on the memory front it does feel a little weird and certainly doesn't align with the SearchValue setup.

Thoughts on just removing this and setting the channel size to 1?

aschmahmann · 2020-05-28T23:17:11Z

routing_options.go

+//
+// Deprecated: use github.com/libp2p/go-libp2p-kad-dht/routing.Quorum


Maybe moving the routing options into their own package was overkill. Thoughts?

aschmahmann · 2020-05-29T00:07:25Z

routing.go

-				case <-ctx.Done():
-					return false
-				}
+		processors = []dhtrouting.Processor{validation, quorum, bestValue}


The current code actually does quorum -> bestValue. The faithful representation of this here would be quorum -> validation -> bestValue (will fix in the next PR update). It's unfortunate that we end up validating twice (once in the getValues function and once in the pipeline), but we can optimize this as well.

We should also figure out what (if anything) we want to do with the quorum function for SearchValue. I think requiring a certain number of the latest records to be equal would be reasonable, and would allow us to throw out invalid records and make things a little easier. Alternatively, we could just drop the quorum function entirely.
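A toy sketch of why the order matters. Everything here other than the `Processor` signature is assumed for illustration: `processorFunc`, `newQuorum`, `validation` and `run` are hypothetical, and "returning nil drops the value" is a convention made up for this example, not necessarily what the pipeline in this PR does.

```go
package main

import "fmt"

type Processor interface {
	Process(interface{}, func()) (interface{}, error)
}

// processorFunc adapts a plain function to Processor (hypothetical helper).
type processorFunc func(interface{}, func()) (interface{}, error)

func (f processorFunc) Process(v interface{}, stop func()) (interface{}, error) {
	return f(v, stop)
}

// newQuorum returns a processor that calls stop once it has seen n values.
func newQuorum(n int) Processor {
	seen := 0
	return processorFunc(func(v interface{}, stop func()) (interface{}, error) {
		seen++
		if seen >= n {
			stop()
		}
		return v, nil
	})
}

// validation drops empty records by returning nil (assumed convention).
var validation = processorFunc(func(v interface{}, _ func()) (interface{}, error) {
	if s, ok := v.(string); !ok || s == "" {
		return nil, nil
	}
	return v, nil
})

// run feeds each input through the processors in order and collects the
// survivors. It stops reading inputs once any processor has called stop.
func run(inputs []string, processors ...Processor) []string {
	stopped := false
	stop := func() { stopped = true }
	var out []string
	for _, in := range inputs {
		if stopped {
			break
		}
		var v interface{} = in
		var err error
		for _, p := range processors {
			v, err = p.Process(v, stop)
			if err != nil || v == nil {
				break // value rejected or dropped; skip later stages
			}
		}
		if err == nil && v != nil {
			out = append(out, v.(string))
		}
	}
	return out
}

func main() {
	records := []string{"", "v1", "v2"} // "" stands in for an invalid record

	// quorum first: the invalid record counts toward the quorum.
	fmt.Println(run(records, newQuorum(2), validation))
	// validation first: only valid records count toward the quorum.
	fmt.Println(run(records, validation, newQuorum(2)))
}
```

In this toy setup, quorum -> validation yields only `v1`, while validation -> quorum yields `v1` and `v2`, which is the kind of difference the pipeline comments should spell out.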

aschmahmann · 2020-05-29T00:11:25Z

routing.go

+			MaxCount: dhtrouting.GetQuorum(&cfg),
+		}
+
+		processors = []dhtrouting.Processor{newValuesOnly, quorum}


Note: While SearchValue does quorum -> bestValue, here we do newValuesOnly -> quorum. I'd like some comments explaining the pipeline order for each function.
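For reference, a sketch of the kind of stage this ordering implies, assuming newValuesOnly filters out values that have already been seen so that the quorum stage after it only counts distinct ones. The struct below is a hypothetical reconstruction using string keys; it is not the code from routing/pipes.go.

```go
package main

import "fmt"

// newValuesOnly is a hypothetical sketch of a de-duplicating stage: it
// forwards a value the first time it appears and drops repeats by
// returning nil (an assumed convention for "skip this value").
type newValuesOnly struct {
	seen map[string]struct{}
}

func (n *newValuesOnly) Process(v interface{}, _ func()) (interface{}, error) {
	id, ok := v.(string)
	if !ok {
		return nil, fmt.Errorf("newValuesOnly: expected string, got %T", v)
	}
	if n.seen == nil {
		n.seen = make(map[string]struct{})
	}
	if _, dup := n.seen[id]; dup {
		return nil, nil
	}
	n.seen[id] = struct{}{}
	return v, nil
}

func main() {
	stage := &newValuesOnly{}
	for _, id := range []string{"peerA", "peerB", "peerA"} {
		out, err := stage.Process(id, func() {})
		fmt.Println(id, out != nil, err)
	}
}
```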

aschmahmann · 2020-05-29T00:13:08Z

routing.go

-		// If we have enough peers locally, don't bother with remote RPC
-		// TODO: is this a DOS vector?


If people start querying with count = 0 then this isn't a problem, so maybe we should just emphasize that?

aschmahmann · 2020-05-29T00:14:13Z

routing/pipes.go

+package routing
+


More comments needed in this file. Also, is this separate package currently a good idea?


aschmahmann requested a review from aarshkshah1992 May 27, 2020 16:39

aschmahmann force-pushed the feat/refactor-routing branch 2 times, most recently from 72f8ab7 to eca6287 Compare May 28, 2020 23:01

aschmahmann commented May 29, 2020

aschmahmann commented Jun 1, 2020

aschmahmann mentioned this pull request Jun 3, 2020

Extract DHT message sender from the DHT #659

Merged

aschmahmann force-pushed the feat/refactor-routing branch from eca6287 to a97bb8f Compare January 4, 2021 09:16

aschmahmann changed the base branch from master to refactor/extract-messages January 4, 2021 09:17

aschmahmann marked this pull request as draft January 4, 2021 09:17

aschmahmann added 7 commits January 4, 2021 15:02

refactor: remove extraneous quorum check in dht.GetValue

051e0ee

refactor: move quorum default value to be defined with quorum option

44375c3

refactor: move quorum option to into a separate routing package

757500b

refactor: switch to non-deprecated version of Quorum option in dual dht test

1cdfa39

feat: add extended routing functions that take functional options and return the closest peers used in the query

6a546cf

feat: add support for controlling the seed peers used in queries

9b4bfb9

feat: rework routing to be more configurable

961c097

aschmahmann force-pushed the feat/refactor-routing branch from a97bb8f to 961c097 Compare January 4, 2021 20:03

aschmahmann changed the base branch from refactor/extract-messages to master January 4, 2021 20:06
