
Resolve performance bottleneck for querying epoch ownership #5948

Closed
octol opened this issue May 7, 2020 · 2 comments · Fixed by #5962
Labels
I9-optimisation An enhancement to provide better overall performance in terms of time-to-completion for a task.

Comments

@octol
Contributor

octol commented May 7, 2020

The babe_epochAuthorship RPC call can potentially take a long time to complete. The code, originally designed for block authoring, is sub-optimal for querying all the slots of an epoch at once, and this causes noticeable performance problems.

Idea:

  • We should break apart authorship::claim_slot in Babe so that it can take a specific key to try (see the sketch below)

This should be resolved before we can merge paritytech/polkadot#1065
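
A rough sketch, in Rust, of what the split could look like. The types (Epoch, AuthorityPair, SlotClaim) and both function names are placeholders for illustration, not the actual Substrate API:

// Placeholder types standing in for the real BABE types.
struct Epoch;
struct AuthorityPair;
struct SlotClaim;

// Per-key work: try to claim `slot` with one already-fetched key.
fn claim_slot_using_key(
    _slot: u64,
    _epoch: &Epoch,
    _key: &AuthorityPair,
) -> Option<SlotClaim> {
    // VRF evaluation and threshold check against this single key.
    None
}

// The existing entry point becomes a thin wrapper that block authoring
// keeps using: try each locally held key in turn.
fn claim_slot(slot: u64, epoch: &Epoch, keys: &[AuthorityPair]) -> Option<SlotClaim> {
    keys.iter().find_map(|key| claim_slot_using_key(slot, epoch, key))
}

The RPC path could then call claim_slot_using_key directly with keys it fetched from the keystore once up front.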

@octol octol added I9-optimisation An enhancement to provide better overall performance in terms of time-to-completion for a task. M4-core labels May 7, 2020
@octol octol self-assigned this May 7, 2020
@burdges

burdges commented May 7, 2020

Why do we loop over keys? À la this flat_map?

How does it identify the BABE key from the keystore there? It should find only the current one, yes? We've no reason to sign under inactive keys.

Are we rotating BABE keys often? We need not rotate them often really, but maybe some people prefer to do so.

vrf_sign_after_check costs roughly 20 microseconds to fail, or roughly 35 microseconds to succeed, roughly twice as long as doing an ed25519 signature. You can reduce the succeed case if you need only the output, à la #5876, not the signature. It should not really be called in a loop over keys though, well not unless you've one substrate instance running many validators, which sucks for other reasons.

There is another performance hit in how we compute the randomness for each BABE epoch. We currently accumulate a Vec and hash it all at once. We could iteratively do the hashing, but at the cost of increasing the header size by 32 bytes, so maybe this Vec actually makes the most sense. But if you recompute it in a loop then it'll suck, and if you must pull it from slow storage in a loop then it'll suck even more.
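
For illustration, a minimal sketch of the two strategies, using the sha2 crate's SHA-256 as a stand-in for the actual hash function used by BABE:

use sha2::{Digest, Sha256};

// Accumulate-then-hash: keep every per-block VRF output for the epoch
// (a Vec bounded by epoch length) and hash them all in one pass.
fn randomness_accumulated(vrf_outputs: &[[u8; 32]]) -> [u8; 32] {
    let mut hasher = Sha256::new();
    for output in vrf_outputs {
        hasher.update(output);
    }
    hasher.finalize().into()
}

// Iterative: fold each VRF output into a running 32-byte state as blocks
// arrive, so no Vec is stored -- at the cost of carrying the extra
// 32-byte intermediate state forward (e.g. in the header).
fn randomness_iterative(state: [u8; 32], vrf_output: &[u8; 32]) -> [u8; 32] {
    let mut hasher = Sha256::new();
    hasher.update(state);
    hasher.update(vrf_output);
    hasher.finalize().into()
}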

@andresilva
Contributor

Currently the BABE implementation supports authoring under multiple keys on the same node. So we check the current authority set against all matching keys in the keystore (usually just one or none, since people don't normally run multiple BABE authorities on the same node).

The problem here is that for this API that calculates slot assignments, we are fetching (and parsing) the keys from the keystore for each slot we try; the bottleneck is actually parsing the keys. We're essentially doing:

for slot in epoch {
  all_local_keys = fetch_all_matching_keys_from_keystore();
  for key in all_local_keys {
    try_to_claim_slot(slot, key)
  }
}

Whereas we should do:

all_local_keys = fetch_all_matching_keys_from_keystore();
for slot in epoch {
  for key in all_local_keys {
    try_to_claim_slot(slot, key)
  }
}

After we do this change I suspect we will no longer have any performance bottleneck.
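
A minimal Rust sketch of the fixed query path; the stubbed helpers mirror the pseudocode above and are placeholders, not the real Substrate functions:

// Stubs mirroring the pseudocode above.
struct Key;
struct SlotClaim;

fn fetch_all_matching_keys_from_keystore() -> Vec<Key> {
    // One keystore read + parse, amortized over the whole epoch.
    Vec::new()
}

fn try_to_claim_slot(_slot: u64, _key: &Key) -> Option<SlotClaim> {
    // Per-key VRF threshold check for this slot.
    None
}

fn epoch_authorship(epoch_slots: std::ops::Range<u64>) -> Vec<(u64, SlotClaim)> {
    // Keys are fetched and parsed exactly once, outside the per-slot loop.
    let keys = fetch_all_matching_keys_from_keystore();
    epoch_slots
        .filter_map(|slot| {
            keys.iter()
                .find_map(|key| try_to_claim_slot(slot, key))
                .map(|claim| (slot, claim))
        })
        .collect()
}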

> There is another performance hit in how we compute the randomness for each BABE epoch. We currently accumulate a Vec and hash it all at once. We could iteratively do the hashing, but at the cost of increasing the header size by 32 bytes, so maybe this Vec actually makes the most sense. But if you recompute it in a loop then it'll suck, and if you must pull it from slow storage in a loop then it'll suck even more.

I guess that would be advantageous since we wouldn't have to maintain a large pool of data in runtime storage, although this Vec is bounded by epoch length so I think it's working out OK.
The current epoch randomness is always pulled from memory IIRC (since it is announced 1 epoch in advance), so here we just fetch the epoch randomness and then check all epoch slot assignments against it.
