
Optimize against ValidatorPubkeyCache timeouts #2327

Closed
paulhauner opened this issue Apr 29, 2021 · 9 comments
Labels: enhancement (New feature or request), optimization (Something to make Lighthouse run more efficiently)

@paulhauner
Member

paulhauner commented Apr 29, 2021

Description

A user in Discord recently posted the following log:

Apr 28 19:09:08.447 CRIT Beacon block processing error           error: ValidatorPubkeyCacheLockTimeout, service: beacon
Apr 28 19:09:08.447 WARN BlockProcessingFailure                  outcome: ValidatorPubkeyCacheLockTimeout, msg: unexpected condition in processing block.

This user is not the first; I've personally seen quite a few users report this same error.

The validator_pubkey_cache is protected by a TimeoutRwLock. This means there can only be lock contention when something holds the write lock.
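
For context, the wrapper pattern is roughly the following (a minimal sketch assuming a `parking_lot` lock underneath; names and signatures are illustrative rather than the exact Lighthouse implementation):

```rust
use parking_lot::{RwLock, RwLockReadGuard, RwLockWriteGuard};
use std::time::Duration;

/// Illustrative sketch of a timeout-aware RwLock wrapper.
pub struct TimeoutRwLock<T>(RwLock<T>);

impl<T> TimeoutRwLock<T> {
    pub fn new(inner: T) -> Self {
        Self(RwLock::new(inner))
    }

    /// Returns `None` if the read lock cannot be acquired within `timeout`;
    /// callers map that to an error like `ValidatorPubkeyCacheLockTimeout`.
    pub fn try_read_for(&self, timeout: Duration) -> Option<RwLockReadGuard<'_, T>> {
        self.0.try_read_for(timeout)
    }

    /// Same idea for the write lock, which is taken when importing new pubkeys.
    pub fn try_write_for(&self, timeout: Duration) -> Option<RwLockWriteGuard<'_, T>> {
        self.0.try_write_for(timeout)
    }
}
```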

I think (it's worth confirming) this is the only place we take a write-lock on the pubkey cache:

// If there are new validators in this block, update our pubkey cache.
//
// We perform this _before_ adding the block to fork choice because the pubkey cache is
// used by attestation processing which will only process an attestation if the block is
// known to fork choice. This ordering ensure that the pubkey cache is always up-to-date.
self.validator_pubkey_cache
    .try_write_for(VALIDATOR_PUBKEY_CACHE_LOCK_TIMEOUT)
    .ok_or(Error::ValidatorPubkeyCacheLockTimeout)?
    .import_new_pubkeys(&state)?;

I think (also worth confirming) that the bulk of the time holding the write-lock is spent inside this function:

/// Adds zero or more validators to `self`.
fn import<I>(&mut self, validator_keys: I) -> Result<(), BeaconChainError>
where
    I: Iterator<Item = PublicKeyBytes> + ExactSizeIterator,
{
    self.pubkey_bytes.reserve(validator_keys.len());
    self.pubkeys.reserve(validator_keys.len());
    self.indices.reserve(validator_keys.len());
    for pubkey in validator_keys {
        let i = self.pubkeys.len();
        if self.indices.contains_key(&pubkey) {
            return Err(BeaconChainError::DuplicateValidatorPublicKey);
        }
        // The item is written to disk _before_ it is written into
        // the local struct.
        //
        // This means that a pubkey cache read from disk will always be equivalent to or
        // _later than_ the cache that was running in the previous instance of Lighthouse.
        //
        // The motivation behind this ordering is that we do not want to have states that
        // reference a pubkey that is not in our cache. However, it's fine to have pubkeys
        // that are never referenced in a state.
        match &mut self.backing {
            PubkeyCacheBacking::File(persistence_file) => {
                persistence_file.append(i, &pubkey)?;
            }
            PubkeyCacheBacking::Database(store) => {
                store.put_item(&DatabasePubkey::key_for_index(i), &DatabasePubkey(pubkey))?;
            }
        }
        self.pubkeys.push(
            (&pubkey)
                .try_into()
                .map_err(BeaconChainError::InvalidValidatorPubkeyBytes)?,
        );
        self.pubkey_bytes.push(pubkey);
        self.indices.insert(pubkey, i);
    }
    Ok(())
}

So, it stands to reason that if we can optimize out some time spent in ValidatorPubkeyCache::import then we can avoid ValidatorPubkeyCacheLockTimeout errors.

I think the first step should be to add metrics to ValidatorPubkeyCache::import and determine what's taking the longest.
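
As a first pass, a small helper like the sketch below (illustrative only; in practice we'd record the durations into Lighthouse's Prometheus histograms rather than returning them) could time the disk-write arm and the in-memory updates separately:

```rust
use std::time::{Duration, Instant};

/// Illustrative helper: time a fallible operation and return its result
/// together with the elapsed duration. A real implementation would feed the
/// durations into metrics histograms instead of returning them.
fn timed<T, E>(f: impl FnOnce() -> Result<T, E>) -> (Result<T, E>, Duration) {
    let start = Instant::now();
    let result = f();
    (result, start.elapsed())
}
```

Wrapping the `persistence_file.append` / `store.put_item` arm in one timer and the `pubkeys`/`pubkey_bytes`/`indices` updates in another would show immediately whether the disk write dominates.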

Potential Optimization

I'm guessing that the write-to-disk part is taking the longest.

It's really important that we retain the current behaviour where the on-disk "backing" is updated somewhat-atomically with the internal Vecs and HashMap.

One potential option is to have a "prepare" step where we can copy the backing to a new random file and add the new keys to it. Then, we have a "commit" phase where we replace the old backing with the new (only a single file copy operation) and update the internal structures.

The "prepare" step only needs a read-lock, whist the "commit" still needs a write-lock (but is hopefully much faster).

Once we determine that it is indeed the disk write that's slow, we can expand upon this.

@paulhauner paulhauner added the enhancement New feature or request label Apr 29, 2021
@michaelsproul
Member

The default backing is the DB now (since #2234), so we can't use an intermediate file per se, although I agree we can probably do a prepare/commit step with a read lock initially during the DB write, and then a write lock during the mutation.

@paulhauner
Member Author

The default backing is the DB now

Oh yeah, I forgot about this.

I agree we can probably do a prepare/commit step with a read lock initially during the DB write, and then a write lock during the mutation.

We just need to maintain some sort of mechanism to ensure we mutate the DB and in-memory cache in lock-step. That might involve moving the locks inside ValidatorPubkeyCache.
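
Something like the following sketch, for illustration (hypothetical types and names, and assuming imports are serialised by block processing so the two phases can't interleave):

```rust
use parking_lot::RwLock;
use std::collections::HashMap;

type PubkeyBytes = [u8; 48];

/// Stand-in for the database handle.
trait Store {
    fn put_pubkey(&self, index: usize, pubkey: &PubkeyBytes) -> Result<(), String>;
}

struct Inner {
    pubkeys: Vec<PubkeyBytes>,
    indices: HashMap<PubkeyBytes, usize>,
}

/// With the lock inside the cache, the cache itself controls how long it is
/// held and can still guarantee the DB write happens before the in-memory
/// update.
struct ValidatorPubkeyCache<S> {
    inner: RwLock<Inner>,
    store: S,
}

impl<S: Store> ValidatorPubkeyCache<S> {
    fn import_new_pubkeys(&self, new_pubkeys: Vec<PubkeyBytes>) -> Result<(), String> {
        // "Prepare": read the starting index under a brief read lock, then do
        // the slow DB writes with no lock held at all. This assumes a single
        // writer (block import is serialised); otherwise `base` could go stale.
        let base = self.inner.read().pubkeys.len();
        for (offset, pubkey) in new_pubkeys.iter().enumerate() {
            self.store.put_pubkey(base + offset, pubkey)?;
        }

        // "Commit": a short write-lock section that only touches memory.
        let mut inner = self.inner.write();
        for pubkey in new_pubkeys {
            let i = inner.pubkeys.len();
            inner.pubkeys.push(pubkey);
            inner.indices.insert(pubkey, i);
        }
        Ok(())
    }
}
```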

@michaelsproul
Member

michaelsproul commented May 7, 2021

@noslav

noslav commented Sep 11, 2022

Hi Lighthouse team, we're currently experiencing issues related to this. Please find the following log:

Sep 09 21:41:21 cqt-operator-p1 systemd[1]: Started lighthouse eth2 service.
Sep 11 02:00:38 cqt-operator-p1 lighthouse[179796]: Sep 11 02:00:38.094 CRIT Beacon block processing error           error: ValidatorPubkeyCacheLockTimeout, service: beacon
Sep 11 02:25:18 cqt-operator-p1 lighthouse[179796]: Sep 11 02:25:18.922 CRIT Chain removed                           op: batch processing result, reason: WrongBatchState("Validation failed for batch in wrong state: AwaitingProcessing(16Uiu2HAm8HcQWjTmpEeqW96nDeYRs5c9bYV7SMPYSyeVGbEdPExU, 63 blocks)"), id: 12382284213649068281, from: 60234, to: 145933, end_root: 0x8093…fac2, current_target: 60238, batches: 7, peers: 4, validated_batches: 0, state: Syncing, sync_type: Finalized, service: sync
Sep 11 09:22:16 cqt-operator-p1 lighthouse[179796]: Sep 11 09:22:16.746 CRIT Beacon block processing error           error: ValidatorPubkeyCacheLockTimeout, service: beacon
Sep 11 10:44:41 cqt-operator-p1 lighthouse[179796]: Sep 11 10:44:41.898 CRIT Beacon block processing error           error: ValidatorPubkeyCacheLockTimeout, service: beacon
Sep 11 16:18:41 cqt-operator-p1 lighthouse[179796]: Sep 11 16:18:41.779 CRIT Beacon block processing error           error: ValidatorPubkeyCacheLockTimeout, service: beacon

The beacon node is running with the following config:

lighthouse \
--network mainnet \
beacon_node \
--datadir /scratch/lighthouse \
--debug-level crit \
--http \
--checkpoint-sync-url https://beaconstate.ethstaker.cc \
--execution-endpoint http://localhost:8551 \
--execution-jwt /scratch/node/eth-mainnet/geth/jwtsecret

To add, we have no idea of the status of the beacon node sync from the above log, but a quick size scan of the datadir gave the following:

root@cqt-operator-p1:/scratch/lighthouse/beacon# sudo du -sh *
42G	chain_db
6.3G	freezer_db
590M	logs
16K	network

the running CL version -

lighthouse --version
Lighthouse v3.1.0-aa022f4
BLS library: blst
SHA256 hardware acceleration: true
Specs: mainnet (true), minimal (false), gnosis (false)

in conjunction with EL version -

Geth
Version: 1.10.23-stable
Git Commit: 5774d758f4f1e45c23ff6c81b957445b6bfcc5e8
Git Commit Date: 20220910
Architecture: amd64
Go Version: go1.18.1
Operating System: linux

OS -

Linux cqt-operator-p1 #15-Ubuntu SMP Fri Jun 10 11:30:24 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

Kindly advise on how to proceed!

@michaelsproul
Member

@noslav Try turning on the --disable-lock-timeouts flag. It will silence the error and prevent spurious retries, but it will not make the node process blocks any faster (slow block processing being the fundamental reason for the timeout). You might want to look into improving the CPU and/or I/O performance of your hardware as well.

This issue is about improving our handling of the cache, which will not be available until a future release.

@noslav

noslav commented Sep 12, 2022

@noslav Try turning on the --disable-lock-timeouts flag. It will silence the error and prevent spurious retries, but it will not make the node process blocks any faster (slow block processing being the fundamental reason for the timeout). You might want to look into improving the CPU and/or I/O performance of your hardware as well.

This issue is about improving our handling of the cache, which will not be available until a future release.

thank you for the quick turnaround!

@noslav

noslav commented Sep 12, 2022

@michaelsproul your suggestion helped, thank you! However, I'm wondering why my logs don't show the line mentioned in the FAQ section of the merge migration page:

INFO Ready for the merge, current_difficulty: 10789363, terminal_total_difficulty: 10790000

The logs currently only show the following:

Sep 12 00:19:41 cqt-operator-p1 systemd[1]: Stopped lighthouse eth2 service.
Sep 12 00:19:41 cqt-operator-p1 systemd[1]: lighthouse.service: Consumed 12h 27min 57.266s CPU time.
Sep 12 00:19:41 cqt-operator-p1 systemd[1]: Started lighthouse eth2 service.

Does this have anything to do with my EL configuration?

@michaelsproul
Member

@noslav Remove the --debug-level crit flag; it causes your node to log only critical errors, which is quite a strange configuration and probably not what you want!

@michaelsproul
Member

I've overhauled the pubkey cache on the tree-states branch by making it into a generic cache for all validator information. I might try a new approach there using a CoW map for the data so that we can clone it when reading, and then mutate a copy while writing to disk, which then gets swapped into place.
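
For illustration, a rough sketch of that copy-on-write idea (using `Arc` plus a full clone as a stand-in for a proper persistent map; types and names are hypothetical):

```rust
use parking_lot::RwLock;
use std::collections::HashMap;
use std::sync::Arc;

type PubkeyBytes = [u8; 48];

struct CowPubkeyMap {
    map: RwLock<Arc<HashMap<PubkeyBytes, usize>>>,
}

impl CowPubkeyMap {
    /// Readers take a cheap snapshot: bump the Arc refcount and release the
    /// lock immediately, so they never hold it across any real work.
    fn snapshot(&self) -> Arc<HashMap<PubkeyBytes, usize>> {
        self.map.read().clone()
    }

    /// Writers build the updated map (and do the slow disk writes) with no
    /// lock held, then swap it into place under a very brief write lock.
    /// A persistent map would make the clone much cheaper than the full
    /// `HashMap` copy used here.
    fn import<E>(
        &self,
        new_pubkeys: &[PubkeyBytes],
        write_to_disk: impl Fn(usize, &PubkeyBytes) -> Result<(), E>,
    ) -> Result<(), E> {
        let snapshot = self.snapshot();
        let mut new_map = (*snapshot).clone();
        for pubkey in new_pubkeys {
            let i = new_map.len();
            write_to_disk(i, pubkey)?; // slow part, done outside the lock
            new_map.insert(*pubkey, i);
        }
        *self.map.write() = Arc::new(new_map); // fast swap
        Ok(())
    }
}
```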

@michaelsproul michaelsproul added the optimization Something to make Lighthouse run more efficiently. label Nov 17, 2022
@michaelsproul michaelsproul self-assigned this Nov 17, 2022
bors bot pushed a commit that referenced this issue Nov 30, 2022
## Issue Addressed

Closes #2327

## Proposed Changes

This is an extension of some ideas I implemented while working on `tree-states`:

- Cache the indexed attestations from blocks in the `ConsensusContext`. Previously we were re-computing them 3-4 times over.
- Clean up `import_block` by splitting each part into `import_block_XXX`.
- Move some stuff off hot paths, specifically:
  - Relocate non-essential tasks that were running between receiving the payload verification status and priming the early attester cache. These tasks are moved after the cache priming:
    - Attestation observation
    - Validator monitor updates
    - Slasher updates
    - Updating the shuffling cache
  - Fork choice attestation observation now happens at the end of block verification in parallel with payload verification (this seems to save 5-10ms).
  - Payload verification now happens _before_ advancing the pre-state and writing it to disk! States were previously being written eagerly and adding ~20-30ms in front of verifying the execution payload. State catchup also sometimes takes ~500ms if we get a cache miss and need to rebuild the tree hash cache.

The remaining task that's taking substantial time (~20ms) is importing the block to fork choice. I _think_ this is because of pull-tips, and we should be able to optimise it out with a clever total active balance cache in the state (which would be computed in parallel with payload verification). I've decided to leave that for future work though. For now it can be observed via the new `beacon_block_processing_post_exec_pre_attestable_seconds` metric.


Co-authored-by: Michael Sproul <[email protected]>
macladson pushed a commit to macladson/lighthouse that referenced this issue Jan 5, 2023
paulhauner added a commit to paulhauner/lighthouse that referenced this issue Jan 19, 2023
Squashed commit of the following:

commit ad08d07
Author: Paul Hauner <[email protected]>
Date:   Mon Dec 5 16:51:06 2022 +1100

    Remove crits for late block

commit 8e85d62
Author: Paul Hauner <[email protected]>
Date:   Mon Dec 5 16:48:43 2022 +1100

    Downgrade log for payload reveal failure

commit 84392d6
Author: Michael Sproul <[email protected]>
Date:   Fri Dec 2 00:07:43 2022 +0000

    Delete DB schema migrations for v11 and earlier (sigp#3761)

    ## Proposed Changes

    Now that the Gnosis merge is scheduled, all users should have upgraded beyond Lighthouse v3.0.0. Accordingly we can delete schema migrations for versions prior to v3.0.0.

    ## Additional Info

    I also deleted the state cache stuff I added in sigp#3714 as it turned out to be useless for the light client proofs due to the one-slot offset.

commit 18c9be5
Author: Mac L <[email protected]>
Date:   Thu Dec 1 06:03:53 2022 +0000

    Add API endpoint to count statuses of all validators (sigp#3756)

    ## Issue Addressed

    sigp#3724

    ## Proposed Changes

    Adds an endpoint to quickly count the number of occurrences of each status in the validator set.

    ## Usage

    ```bash
    curl -X GET "http://localhost:5052/lighthouse/ui/validator_count" -H "accept: application/json" | jq
    ```

    ```json
    {
      "data": {
        "active_ongoing":479508,
        "active_exiting":0,
        "active_slashed":0,
        "pending_initialized":28,
        "pending_queued":0,
        "withdrawal_possible":933,
        "withdrawal_done":0,
        "exited_unslashed":0,
        "exited_slashed":3
      }
    }
    ```

commit 2211504
Author: Michael Sproul <[email protected]>
Date:   Wed Nov 30 05:22:58 2022 +0000

    Prioritise important parts of block processing (sigp#3696)

    ## Issue Addressed

    Closes sigp#2327

    ## Proposed Changes

    This is an extension of some ideas I implemented while working on `tree-states`:

    - Cache the indexed attestations from blocks in the `ConsensusContext`. Previously we were re-computing them 3-4 times over.
    - Clean up `import_block` by splitting each part into `import_block_XXX`.
    - Move some stuff off hot paths, specifically:
      - Relocate non-essential tasks that were running between receiving the payload verification status and priming the early attester cache. These tasks are moved after the cache priming:
        - Attestation observation
        - Validator monitor updates
        - Slasher updates
        - Updating the shuffling cache
      - Fork choice attestation observation now happens at the end of block verification in parallel with payload verification (this seems to save 5-10ms).
      - Payload verification now happens _before_ advancing the pre-state and writing it to disk! States were previously being written eagerly and adding ~20-30ms in front of verifying the execution payload. State catchup also sometimes takes ~500ms if we get a cache miss and need to rebuild the tree hash cache.

    The remaining task that's taking substantial time (~20ms) is importing the block to fork choice. I _think_ this is because of pull-tips, and we should be able to optimise it out with a clever total active balance cache in the state (which would be computed in parallel with payload verification). I've decided to leave that for future work though. For now it can be observed via the new `beacon_block_processing_post_exec_pre_attestable_seconds` metric.

    Co-authored-by: Michael Sproul <[email protected]>

commit b4f4c0d
Author: Divma <[email protected]>
Date:   Wed Nov 30 03:21:35 2022 +0000

    Ipv6 bootnodes (sigp#3752)

    ## Issue Addressed
    Our bootnodes currently support only IPv4; this makes it so that they also support IPv6.

    ## Proposed Changes
    - Adds code necessary to update the bootnodes to run on dual stack nodes and therefore contact and store ipv6 nodes.
    - Adds some metrics about connectivity type of stored peers. It might have been nice to see some metrics over the sessions but that feels out of scope right now.

    ## Additional Info
    - some code quality improvements sneaked in since the changes seemed small
    - I think it depends on the OS, but enabling mapped addresses on an ipv6 node without dual stack support enabled could fail silently, making these nodes effectively ipv6 only. In the future I'll probably change this to use two sockets, which should fail loudly

commit 3534c85
Author: GeemoCandama <[email protected]>
Date:   Tue Nov 29 08:19:27 2022 +0000

    Optimize finalized chain sync by skipping newPayload messages (sigp#3738)

    ## Issue Addressed

    sigp#3704

    ## Proposed Changes
    Adds is_syncing_finalized: bool parameter for block verification functions. Sets the payload_verification_status to Optimistic if is_syncing_finalized is true. Uses SyncState in NetworkGlobals in BeaconProcessor to retrieve the syncing status.

    ## Additional Info
    I could implement FinalizedSignatureVerifiedBlock if you think it would be nicer.

commit a2969ba
Author: Paul Hauner <[email protected]>
Date:   Tue Nov 29 05:51:42 2022 +0000

    Improve debugging experience for builder proposals (sigp#3725)

    ## Issue Addressed

    NA

    ## Proposed Changes

    This PR sets out to improve the logging/metrics experience when interacting with the builder. Namely, it:

    - Adds/changes metrics (see "Metrics Changes" section).
    - Adds new logs which show the duration of requests to the builder/local EL.
    - Refactors existing logs for consistency and so that the `parent_hash` is included in all relevant logs (we can grep for this field when trying to trace the flow of block production).

    Additionally, when I was implementing this PR I noticed that we skip some verification of the builder payload in the scenario where the builder returns `Ok` but the local EL returns `Err`. Namely, we were skipping the bid signature and other values like parent hash and prev randao. In this PR I've changed it so we *always* check these values and reject the bid if they're incorrect. With these changes, we'll sometimes choose to skip a proposal rather than propose something invalid -- that's the only side-effect to the changes that I can see.

    ## Metrics Changes

    - Changed: `execution_layer_request_times`:
        - `method = "get_blinded_payload_local"`: time taken to get a payload from a local EE.
        - `method = "get_blinded_payload_builder"`: time taken to get a blinded payload from a builder.
        - `method = "post_blinded_payload_builder"`: time taken to get a builder to reveal a payload they've previously supplied us.
    - `execution_layer_get_payload_outcome`
        - `outcome = "success"`: we successfully produced a payload from a builder or local EE.
        - `outcome = "failure"`: we were unable to get a payload from a builder or local EE.
    - New: `execution_layer_builder_reveal_payload_outcome`
        - `outcome = "success"`: a builder revealed a payload from a signed, blinded block.
        - `outcome = "failure"`: the builder did not reveal the payload.
    - New: `execution_layer_get_payload_source`
        - `type = "builder"`: we used a payload from a builder to produce a block.
        - `type = "local"`: we used a payload from a local EE to produce a block.
    - New: `execution_layer_get_payload_builder_rejections` has a `reason` field to describe why we rejected a payload from a builder.
    - New: `execution_layer_payload_bids` tracks the bid (in gwei) from the builder or local EE (local EE not yet supported, waiting on EEs to expose the value). Can only record values that fit inside an i64 (roughly 9 million ETH).
    ## Additional Info

    NA

commit 99ec9d9
Author: kevinbogner <[email protected]>
Date:   Mon Nov 28 10:05:43 2022 +0000

    Add Run a Node guide (sigp#3681)

    ## Issue Addressed

    Related to sigp#3672

    ## Proposed Changes

    - Added a guide to run a node. Mainly, copy and paste from 'Merge Migration' and 'Checkpoint Sync'.
    - Ranked it high in ToC:
      - Introduction
      - Installation
      - Run a Node
      - Become a Validator
    	...
    - Hid 'Merge Migration' in ToC.

    ## Additional Info

    - Should I add/rephrase/delete something?
    - Now there is some redundancy:
      - 'Run a node' and 'Checkpoint Sync' contain similar information.
      - Same for 'Run a node' and 'Become a Validator'.

    Co-authored-by: kevinbogner <[email protected]>
    Co-authored-by: Michael Sproul <[email protected]>

commit 2779017
Author: Age Manning <[email protected]>
Date:   Mon Nov 28 07:36:52 2022 +0000

    Gossipsub fast message id change (sigp#3755)

    For improved consistency, this mixes in the topic into our fast message id for more consistent tracking of messages across topics.

commit c881b80
Author: Mac L <[email protected]>
Date:   Mon Nov 28 00:22:53 2022 +0000

    Add CLI flag for gui requirements (sigp#3731)

    ## Issue Addressed

    sigp#3723

    ## Proposed Changes

    Adds a new CLI flag `--gui` which enables all the various flags required for the gui to function properly.
    Currently enables the `--http` and `--validator-monitor-auto` flags.

commit 969ff24
Author: Mac L <[email protected]>
Date:   Fri Nov 25 07:57:11 2022 +0000

    Add CLI flag to opt in to world-readable log files (sigp#3747)

    ## Issue Addressed

    sigp#3732

    ## Proposed Changes

    Add a CLI flag to allow users to opt out of the restrictive permissions of the log files.

    ## Additional Info

    This is not recommended for most users. The log files can contain sensitive information such as validator indices, public keys and API tokens (see sigp#2438). However some users using a multi-user setup may find this helpful if they understand the risks involved.

commit e9bf7f7
Author: antondlr <[email protected]>
Date:   Fri Nov 25 07:57:10 2022 +0000

    remove commas from comma-separated kv pairs (sigp#3737)

    ## Issue Addressed

    Logs are in a comma-separated kv list, but the values sometimes contain commas, which breaks parsing.

commit d5a2de7
Author: Giulio rebuffo <[email protected]>
Date:   Fri Nov 25 05:19:00 2022 +0000

    Added LightClientBootstrap V1  (sigp#3711)

    ## Issue Addressed

    Partially addresses sigp#3651

    ## Proposed Changes

    Adds server-side support for light_client_bootstrap_v1 topic

    ## Additional Info

    This PR creates a bootstrap each time without using a cache. I do not know how necessary a cache is in this case, as this topic is not supposed to be called frequently and IMHO we can just prevent abuse by using the limiter, but let me know what you think, or if there is any caveat to this, or if it is necessary only for the sake of good practice.

    Co-authored-by: Pawan Dhananjay <[email protected]>
Woodpile37 pushed a commit to Woodpile37/lighthouse that referenced this issue Jan 6, 2024