Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Some groundwork for pipelining the store #3084

Merged
merged 15 commits into from
Jan 18, 2022
Merged

Conversation

lutter
Copy link
Collaborator

@lutter lutter commented Dec 17, 2021

This PR contains some fairly simple groundwork in preparation of pipelining the WritableStore. It might be good to merge that now since it touches things in quite a few places outside the store. Pipelining proper should then mostly happen within the store.

The biggest functional change in this PR is that the WritableAgent caches the block pointer and cursor in memory rather than getting it from the database.

@lutter lutter changed the title Some groundwork for pipeling the store Some groundwork for pipelining the store Dec 17, 2021
@leoyvens leoyvens self-requested a review December 17, 2021 20:55
@lutter lutter force-pushed the lutter/store-pipeline-prep branch from a673aa0 to f3d1b1a Compare December 18, 2021 21:17
@leoyvens
Copy link
Collaborator

Two points, which became easier to address after #3085:

  1. I think we can make the ownership of the writable agent more precise, by having the instance manager own it directly rather than have the store serve as a registry of writable agents (TIL this is the multiton pattern). It's the only place that calls writable_store, except for the polling block stream which still calls deployment_synced on it. But that's moving to the instance manager after Make Firehose subgraphs have their sync status be updated correctly #3047, and it could be moved to its own trait anyways.

  2. The instance manager is the only place where the block cursor and ptr are queried, at block stream construction. We could change this to be queried only at subgraph startup, and have the instance manager keep track of the latest block cursor and ptr it sent to the store (on apply and revert of block operations), rather than querying the writable agent for it. Btw, well noticed that the firehose cursor isn't updated on revert_block_operations, this seems wrong, ping @maoueh.

@maoueh
Copy link
Contributor

maoueh commented Dec 21, 2021

Yep indeed, that's wrong. We should update the cursor in this case also definitely.

@lutter
Copy link
Collaborator Author

lutter commented Dec 21, 2021

1. I think we can make the ownership of the writable agent more precise, by having the instance manager own it directly rather than have the store serve as a registry of writable agents (TIL this is the [multiton](https://wiki.c2.com/?MultitonPattern) pattern). It's the only place that calls `writable_store`, except for the polling block stream which still calls `deployment_synced` on it. But that's moving to the instance manager after [Make Firehose subgraphs have their sync status be updated correctly #3047](https://github.com/graphprotocol/graph-node/issues/3047), and it could be moved to its own trait anyways.

The problem with this is that writable stores are also constructed in tests - I had an initial go at that, and they are all over the place when it comes to constructing writable stores. But in the tests, the pattern will need to be

  • make some changes to the store
  • wait for those changes to actually hit the database
  • perform some queries to make sure the db is in the expected state

I suspect there's some utility we can add to the test-store to make that all easier that would mimic the instance manager's handling of writable stores, but it's still something I need to figure out.

2. The instance manager is the only place where the block cursor and ptr are queried, at block stream construction. We could change this to be queried only at subgraph startup, and have the instance manager keep track of the latest block cursor and ptr it sent to the store (on apply and revert of block operations), rather than querying the writable agent for it. Btw, well noticed that the firehose cursor isn't updated on `revert_block_operations`, this seems wrong, ping @maoueh.

I like that - we could change it so that block cursor and ptr are returned from start_subgraph_deployment, and can't be queried through a WritableStore anymore.

@lutter
Copy link
Collaborator Author

lutter commented Dec 21, 2021

Which leaves the question: in terms of this PR, what should change in it? Just remove the commit store: Make SubgraphStore.writable idempotent and change things after this has been committed?

@leoyvens
Copy link
Collaborator

@lutter I'd be fine to merge this with that commit removed, but how does that leave the WritableAgent, will it be dead code?

@lutter lutter force-pushed the lutter/store-pipeline-prep branch from f3d1b1a to c58521c Compare January 11, 2022 23:12
@lutter
Copy link
Collaborator Author

lutter commented Jan 11, 2022

@lutter I'd be fine to merge this with that commit removed, but how does that leave the WritableAgent, will it be dead code?

I just rebased the branch to latest master.

Thinking some more about making SubgraphStore.writable idempotent (a 'multiton'), I don't think there's a safe way to make sure that writable is called at most once for any given deployment. And once we write async to the store, having two WritableStore instances for the same deployment will result in fireworks/incorrect data. IMHO, making writable not idempotent is just too dangerous since it's too easy to violate the constraint "call this only once for any given deployment for the lifetime of the process" in the future, and you'll get no warning that the constraint was violated.

@leoyvens
Copy link
Collaborator

Do you think it's actually feasible to error in SubgraphStore::writable if there are outstanding references to the writable agent, by checking the ref count? That would be a strong invariant check.

@lutter lutter force-pushed the lutter/store-pipeline-prep branch from c58521c to 97846d6 Compare January 12, 2022 23:57
@lutter
Copy link
Collaborator Author

lutter commented Jan 13, 2022

Do you think it's actually feasible to error in SubgraphStore::writable if there are outstanding references to the writable agent, by checking the ref count? That would be a strong invariant check.

We'd still need a map from deployment id to that agent/writable; at the very least, we'd need to keep a HashSet of deployment ids for which we already made a WritableStore. It will also not work for tests which call writable(..) all over the place; I think it's best to keep maintaining a map to ensure idempotency.

@leoyvens
Copy link
Collaborator

@lutter Yes, I was thinking of how we could leverage the map to check an invariant that's truly useful. Having two instances of WritableAgent would be suspicious, but having two references to the same agent would also be suspicious. So the multiton doesn't prevent misuse, it just allows a different misuse. The misuse of two references is actually the scary one to me, because the inconsistencies it can create are more subtle and therefore harder to catch in tests.

But I'll defer to you if you prefer to keep the current approach which deals with WritableAgent through references. I'd perhaps suggest moving the mutex to a wrapping WritableAgent(Arc<Mutex<InnerWritableAgent>>) so that you can write &mut self methods. And then you could consider having a misuse check that errors if the mutex is already held by another thread when trying to acquire it.

lutter added 15 commits January 18, 2022 09:34
That is the last place still using it. With that, it's also clear that
MockStore does not need to implement SubgraphStore
That makes it possible to keep the SubgraphStore out of the runtime host
altogether and helps clarify that it doesn't need direct store access
except for ENS lookups
Right now, it is only a wrapper around WritableStore, but we will evolve it
into an agent that manages writing to the database asynchronously. We also
stop impelementing the `WritableStore` trait for `WritableStore` since we
will need to modify some method signatures.
@lutter lutter force-pushed the lutter/store-pipeline-prep branch from 97846d6 to 1645118 Compare January 18, 2022 17:42
@lutter lutter merged commit 1645118 into master Jan 18, 2022
@lutter lutter deleted the lutter/store-pipeline-prep branch January 18, 2022 18:12
@leoyvens leoyvens mentioned this pull request Feb 11, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants