This repository has been archived by the owner on Aug 19, 2022. It is now read-only.

Implement datastore-backed peerstore #28

Closed
6 tasks done
bigs opened this issue Jun 4, 2018 · 14 comments
@bigs
Contributor

bigs commented Jun 4, 2018

Creating an issue so we can track progress/discuss here. There's an open PR (#10) that seems to have stalled/disappeared. I'll be taking inspiration from that and starting anew.

Right now I'm thinking all of the various *Books will use the same datastore, potentially changing that in the future so that each Book might be able to use a different underlying store.

Todos

(To be migrated to a PR once it's up)

  • Implement basic benchmarks
  • Implement AddrBook
  • Implement KeyBook
  • Implement Metrics
  • Implement Peerstore meta methods
  • Extend benchmarks to run the same scenarios against each Peerstore impl
@bigs self-assigned this Jun 4, 2018
@bigs
Contributor Author

bigs commented Jun 4, 2018

Keeping it abstract for now, but wondering whether using Badger or RocksDB directly wouldn't be a big benefit here. They can handle TTL and the like on their own, and would enable the use of multiple column families with different sorts, if we desired.

@Stebalien
Member

We really do want to be generic over the datastore. Personally, I'd just occasionally sweep the datastore.

However, we can probably add some (optional) expiration functionality to the datastore interface if this becomes a performance issue. That is, define a new interface that provides PutExpiring/SetExpiration functions and a function ExpiringDatastore(d ds.Datastore) ExpiringDatastore that "upgrades" the datastore to one that supports expiration if it doesn't already support it. However, that'll be a lot of work...
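
For illustration, a minimal sketch of that "occasionally sweep" approach against a stand-in interface; the deleter type, the entries map (standing in for a prefix query result), and the deadline encoding are assumptions, not the real go-datastore API:

package sweep // illustrative only; not part of the peerstore

import (
    "strconv"
    "time"
)

// deleter is a stand-in for the datastore's delete capability.
type deleter interface {
    Delete(key string) error
}

// sweepExpired drops every record whose stored deadline has passed. Deadlines
// are assumed to be stored as UnixNano timestamps serialized as decimal strings.
func sweepExpired(d deleter, entries map[string][]byte, now time.Time) error {
    for key, raw := range entries {
        deadline, err := strconv.ParseInt(string(raw), 10, 64)
        if err != nil {
            continue // not a deadline record; leave it alone
        }
        if now.UnixNano() > deadline {
            if err := d.Delete(key); err != nil {
                return err
            }
        }
    }
    return nil
}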

@bigs
Contributor Author

bigs commented Jun 5, 2018

@Stebalien Sounds good, why mentioned a similar desire! I'm wrapping up a relatively straightforward pure badger impl to benchmark off of, but will look into extending Datastore once that's set! Looking to add two features if it can be done in a manner that makes sense:

  1. Transaction support
  2. TTL (& potentially other metadata)

EDIT: I do like the idea of "upgrading" a naive Datastore to support TTL, but agree that's probably a separate issue. I'll probably start with something like:

type ExpiringDatastore interface {
    Datastore
    PutExpiring(...)
    SetExpiration(...)
}

type Txn interface {
    Datastore
    Commit(...)
    Rollback(...)
}

type TransactionalDatastore interface {
    Datastore
    NewTransaction(...) Txn
}

@bigs
Contributor Author

bigs commented Jun 5, 2018

then could have something like

MakeExpiringDatastore(ds.Datastore) ds.ExpiringDatastore
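
As a rough sketch of what that upgrade could look like, using simplified stand-in interfaces rather than the real go-datastore types; the deadline index is kept in memory here purely for brevity, which runs straight into the extra-state problem discussed further down:

package expiring // illustrative only; not part of go-datastore

import (
    "sync"
    "time"
)

// Datastore is a simplified stand-in for ds.Datastore.
type Datastore interface {
    Put(key string, value []byte) error
    Delete(key string) error
}

// ExpiringDatastore layers TTL support on top of a plain Datastore.
type ExpiringDatastore interface {
    Datastore
    PutExpiring(key string, value []byte, ttl time.Duration) error
    SetExpiration(key string, ttl time.Duration) error
}

// MakeExpiringDatastore "upgrades" a naive datastore by tracking deadlines
// and sweeping expired keys on an interval.
func MakeExpiringDatastore(d Datastore, sweepEvery time.Duration) ExpiringDatastore {
    e := &expiringDS{Datastore: d, deadlines: make(map[string]time.Time)}
    go e.sweep(sweepEvery)
    return e
}

type expiringDS struct {
    Datastore
    mu        sync.Mutex
    deadlines map[string]time.Time // in-memory for brevity; a real adapter would persist this
}

func (e *expiringDS) PutExpiring(key string, value []byte, ttl time.Duration) error {
    if err := e.Put(key, value); err != nil {
        return err
    }
    return e.SetExpiration(key, ttl)
}

func (e *expiringDS) SetExpiration(key string, ttl time.Duration) error {
    e.mu.Lock()
    e.deadlines[key] = time.Now().Add(ttl)
    e.mu.Unlock()
    return nil
}

func (e *expiringDS) sweep(every time.Duration) {
    for range time.Tick(every) {
        now := time.Now()
        e.mu.Lock()
        for k, deadline := range e.deadlines {
            if now.After(deadline) {
                _ = e.Delete(k) // best effort in this sketch
                delete(e.deadlines, k)
            }
        }
        e.mu.Unlock()
    }
}

The sweep could just as well be driven lazily on reads; either way the adapter owns state the raw datastore knows nothing about.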

@bigs
Contributor Author

bigs commented Jun 7, 2018

@Stebalien Just mulling over this a bit... Is it reasonable to mix and match like so:

// ds is both a TransactionalDatastore and a TTLDatastore
txn, ok := ds.NewTransaction().(TTLDatastore)
txn.PutExpiring(...)
txn.Commit()

@Stebalien
Member

In theory. Unfortunately, we start beating our heads against the type system (and I can't blame go here because I don't know of any type systems that can handle this).

Basically, we have:

  1. A ton of extension interfaces: CheckedDatastore, ScrubbedDatastore, Batching, etc...
  2. A ton of wrappers: Delayed, Mount, Sync, Autobatch, etc...

And, unfortunately, every wrapper needs to implement and forward every extension interface to the underlying datastore. It's kind of a nightmare.

Bah humbug...
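
A tiny illustration of that forwarding problem, with stand-in types rather than the real go-datastore interfaces:

package wrapping // illustrative stand-ins only

type Datastore interface {
    Put(key string, value []byte) error
}

// Batching is one of many extension interfaces a datastore may implement.
type Batching interface {
    Datastore
    Batch() (Batch, error)
}

type Batch interface {
    Commit() error
}

// logged is a wrapper that only knows about the base interface.
type logged struct {
    Datastore
}

func Wrap(d Datastore) Datastore { return &logged{Datastore: d} }

func use(d Datastore) error {
    // Even if the wrapped datastore supports batching, this assertion fails,
    // because *logged does not forward Batch(); every wrapper has to add such
    // forwarding for every extension interface it wants to preserve.
    if b, ok := d.(Batching); ok {
        batch, err := b.Batch()
        if err != nil {
            return err
        }
        return batch.Commit()
    }
    return nil
}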

Back on topic,

Using adapters here will actually be a bit interesting. Unlike our current adapters, these will need to store additional state. That means that if a user opens the datastore and doesn't view it through the adapter, bad things will happen. I'm not sure what the best solution is.

> Just mulling over this a bit... Is it reasonable to mix and match like so:

Yes, but that's really tricky. Depending on how TTLs are implemented, that could make every transaction conflict with every other transaction.


(we should probably move this discussion to an issue in the go-datastore repo)

@bigs
Contributor Author

bigs commented Jun 7, 2018

Typeclasses and modules compose pretty cleanly 😛! Monad transformers exhibit that kind of extensionality pretty well! Type systems aside... moved the conversation to ipfs/go-datastore#87.

@bigs
Contributor Author

bigs commented Aug 27, 2018

@raulk, here are some guidelines for tomorrow!

background

i've just recently merged a big change that introduces a version of the datastore that runs on go-ds-badger. it was implemented before i landed the TxnDatastore and TTLDatastore interfaces, so let's get things up to speed over here!

take a look at badger to get a basic understanding of what the database is capable of. the transaction semantics are particularly important for us. likewise, check out the new datastore interfaces.

your changes will all be made in DatastoreAddrManager, the datastore-backed implementation of AddrBook. the corresponding tests are pretty comprehensive (though could be more so!), so you should be confident your change is ok if all the tests pass.

we'll start with these changes and then sync up when they're all done to talk about optimizations we can make and ways to improve the correctness of the address stream manager.

goals

note, these don't need to happen all at once!

  • establish a log of benchmark performance to help us track performance at a glance
    • BENCHMARKLOG.md?
    • list of <commit hash>, <benchmark results> tuples?
    • start with latest master
    • update after your changes
  • replace instances of Batching with TxnDatastore and update transactions accordingly
    • simple refactor/replacement
  • update Addrs to read in a single transaction (readOnly = true)
  • update ClearAddrs to leverage transactions
  • replace ttlmanager with appropriate use of TTLDatastore
    • you'll have to use a type assertion to cast the transaction to a TTLDatastore (see the sketch below)
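
roughly the pattern those last bullets describe, using simplified stand-in interfaces; the real TxnDatastore/TTL method names and signatures in go-datastore may differ:

package addrbook // illustrative sketch only; interface names and signatures are assumptions

import (
    "errors"
    "time"
)

// Simplified stand-ins for the go-datastore transaction and TTL interfaces
// referenced above.
type Txn interface {
    Put(key string, value []byte) error
    Commit() error
    Discard()
}

type TTLTxn interface {
    Txn
    PutWithTTL(key string, value []byte, ttl time.Duration) error
}

type TxnDatastore interface {
    NewTransaction(readOnly bool) (Txn, error)
}

// setAddrWithTTL shows the shape of the change: open a transaction, assert it
// to the TTL interface, write with a TTL, then commit.
func setAddrWithTTL(store TxnDatastore, key string, addr []byte, ttl time.Duration) error {
    txn, err := store.NewTransaction(false) // read-write transaction
    if err != nil {
        return err
    }
    defer txn.Discard() // a no-op after a successful Commit in badger-style stores

    ttlTxn, ok := txn.(TTLTxn)
    if !ok {
        return errors.New("transaction does not support TTL")
    }
    if err := ttlTxn.PutWithTTL(key, addr, ttl); err != nil {
        return err
    }
    return txn.Commit()
}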

@raulk
Member

raulk commented Aug 31, 2018

Benchmark logs here for now: #31

@raulk
Member

raulk commented Sep 4, 2018

To support the above goals and establish a comprehensive baseline for benchmarking, I am also proceeding with the following:

  • add the ability to disable the cache, and to adjust its size.
  • add an options struct for the DS Peerstore.
  • more benchmarks (SetAddrs, Addrs, ClearAddrs, etc).

@bigs
Contributor Author

bigs commented Sep 4, 2018

what are we thinking wrt the options struct? what kind of stuff would you like to be able to configure?

@raulk
Member

raulk commented Sep 4, 2018

@bigs Here it is: https://github.com/libp2p/go-libp2p-peerstore/pull/33/files#diff-2ab9c4c4a50a480d085b467584d97980R14

We need it, at least for now, to be able to benchmark different configurations, e.g. with/without the ARCCache, and to measure the performance degradation with peers that have many multiaddrs.

@raulk
Member

raulk commented Sep 20, 2018

Everything on this ticket appears to be done, so closing. Reopen if that's not the case!

@raulk closed this as completed Sep 20, 2018
@ghost removed the status/in-progress label Sep 20, 2018
@bigs
Contributor Author

bigs commented Sep 21, 2018 via email

@libp2p deleted a comment from bitcard Mar 1, 2021