Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Merged by Bors] - Added LightClientBootstrap V1 #3711

Closed
wants to merge 9 commits into from
Closed

[Merged by Bors] - Added LightClientBootstrap V1 #3711

wants to merge 9 commits into from

Conversation

Giulio2002
Copy link
Contributor

@Giulio2002 Giulio2002 commented Nov 9, 2022

Issue Addressed

Partially addresses #3651

Proposed Changes

Adds server-side support for light_client_bootstrap_v1 topic

Additional Info

This PR, creates each time a bootstrap without using cache, I do not know how necessary a cache is in this case as this topic is not supposed to be called frequently and IMHO we can just prevent abuse by using the limiter, but let me know what you think or if there is any caveat to this, or if it is necessary only for the sake of good practice.

@Giulio2002 Giulio2002 changed the base branch from stable to unstable November 9, 2022 22:20
@Giulio2002 Giulio2002 marked this pull request as draft November 9, 2022 22:22
@Giulio2002 Giulio2002 marked this pull request as ready for review November 10, 2022 01:14
@Giulio2002
Copy link
Contributor Author

I think it should work, but I am not sure how to test it though, I can try figure out a way to implement static peers in Erigon Lightclient or connect it to nimbus. Is there an easier way to test rpc calls?

@Giulio2002
Copy link
Contributor Author

but other than that it can be reviewed at first glance still

Copy link
Member

@michaelsproul michaelsproul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, this is looking very promising

I'm not the most well-versed networking expert, but I've given feedback on the parts I can, and we can get one of the network gurus to double check everything before merging.

beacon_node/lighthouse_network/src/rpc/methods.rs Outdated Show resolved Hide resolved
beacon_node/lighthouse_network/src/service/mod.rs Outdated Show resolved Hide resolved
beacon_node/network/src/beacon_processor/mod.rs Outdated Show resolved Hide resolved
beacon_node/network/src/beacon_processor/mod.rs Outdated Show resolved Hide resolved
beacon_node/network/src/beacon_processor/mod.rs Outdated Show resolved Hide resolved
bors bot pushed a commit that referenced this pull request Nov 11, 2022
## Issue Addressed

Part of #3651.

## Proposed Changes

Add a flag for enabling the light client server, which should be checked before gossip/RPC traffic is processed (e.g. #3693, #3711). The flag is available at runtime from `beacon_chain.config.enable_light_client_server`.

Additionally, a new method `BeaconChain::with_mutable_state_for_block` is added which I envisage being used for computing light client updates. Unfortunately its performance will be quite poor on average because it will only run quickly with access to the tree hash cache. Each slot the tree hash cache is only available for a brief window of time between the head block being processed and the state advance at 9s in the slot. When the state advance happens the cache is moved and mutated to get ready for the next slot, which makes it no longer useful for merkle proofs related to the head block. Rather than spend more time trying to optimise this I think we should continue prototyping with this code, and I'll make sure `tree-states` is ready to ship before we enable the light client server in prod (cf. #3206).

## Additional Info

I also fixed a bug in the implementation of `BeaconState::compute_merkle_proof` whereby the tree hash cache was moved with `.take()` but never put back with `.restore()`.
Copy link
Member

@pawanjay176 pawanjay176 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking good.

beacon_node/lighthouse_network/src/service/mod.rs Outdated Show resolved Hide resolved
beacon_node/lighthouse_network/src/rpc/outbound.rs Outdated Show resolved Hide resolved
@Giulio2002
Copy link
Contributor Author

I fixed all the comments except the one about the flag, will look through @michaelsproul commit

@Giulio2002
Copy link
Contributor Author

I also added the missing piece that was replaced with unreacheable and also added the comments needed.

@pawanjay176 pawanjay176 mentioned this pull request Nov 14, 2022
@pawanjay176
Copy link
Member

@Giulio2002 I made a PR here with the required changes to support the conditional advertisement of the rpc protocols #3720

We had some redundant code in the rpc module that I fixed there. Feel free to cherry pick the commits into this branch :)

@Giulio2002
Copy link
Contributor Author

Yes thank you, was very confused by the comments :)

@Giulio2002
Copy link
Contributor Author

Will close some of the comments above which I see you fixed yourself

@Giulio2002
Copy link
Contributor Author

Oh i see you added the check, then I guess it is ready for final review :)

@Giulio2002
Copy link
Contributor Author

Giulio2002 commented Nov 15, 2022

also, why not enabling it by default? ideally it is supposed to be the opposite, enabled by default but disabled manually if needed. I think that is what @michaelsproul wrote on #3651. Note that this adds broader support of lightclients in the Ethereum network so It does not affect Lighthouse itself in any way, other than serving extra stuff.

@michaelsproul
Copy link
Member

@Giulio2002 I agree the medium-term plan should be to enable by default but I want to be able to cut releases in the meantime where light client support is not fully complete/tested. I'm also worried perf will be an issue with our current caches

Copy link
Member

@michaelsproul michaelsproul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good from my side, thank you @Giulio2002 for your patience during review.

We're about to cut a v3.3.0 release, but once the dust from that settles and one of the networking crew signs off on it we can merge this.

@michaelsproul michaelsproul added blocked v3.4.0 Minor release following v3.3.0 labels Nov 21, 2022
Copy link
Member

@pawanjay176 pawanjay176 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Would be great if @divagant-martian or @AgeManning can take a look at the conditional rpc protocol advertisement logic as well.

Copy link
Member

@AgeManning AgeManning left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty good to me

@@ -156,6 +157,10 @@ const MAX_BLOCKS_BY_RANGE_QUEUE_LEN: usize = 1_024;
/// will be stored before we start dropping them.
const MAX_BLOCKS_BY_ROOTS_QUEUE_LEN: usize = 1_024;

/// The maximum number of queued `LightClientBootstrapRequest` objects received from the network RPC that
/// will be stored before we start dropping them.
const MAX_LIGHT_CLIENT_BOOTSTRAP_QUEUE_LEN: usize = 1_024;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just want to raise this length is probably higher than it needs to be. I can see it's on-par with blocks_by_root.

The blocks_by_root has an RPC rate limit of 128 per peer per 10 seconds, whereas the light client bootstrap only allows 1 per peer per 10 seconds.

I think we'd rate limit before we'd hit this queue length. If this is supposed to be designed to be similar to blocks by root, might want to increase the RPC limit.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Going to leave this as-is for now and we can revisit in future

@michaelsproul michaelsproul added ready-for-merge This PR is ready to merge. and removed ready-for-review The code is ready for review blocked labels Nov 21, 2022
@michaelsproul

This comment was marked as outdated.

@michaelsproul

This comment was marked as outdated.

@bors

This comment was marked as outdated.

@michaelsproul
Copy link
Member

bors r+

@michaelsproul
Copy link
Member

bors r-

@bors
Copy link

bors bot commented Nov 25, 2022

Canceled.

@michaelsproul
Copy link
Member

bors r+

bors bot pushed a commit that referenced this pull request Nov 25, 2022
## Issue Addressed

Partially addresses #3651

## Proposed Changes

Adds server-side support for light_client_bootstrap_v1 topic

## Additional Info

This PR, creates each time a bootstrap without using cache, I do not know how necessary a cache is in this case as this topic is not supposed to be called frequently and IMHO we can just prevent abuse by using the limiter, but let me know what you think or if there is any caveat to this, or if it is necessary only for the sake of good practice.


Co-authored-by: Pawan Dhananjay <[email protected]>
@bors bors bot changed the title Added LightClientBootstrap V1 [Merged by Bors] - Added LightClientBootstrap V1 Nov 25, 2022
@bors bors bot closed this Nov 25, 2022
macladson pushed a commit to macladson/lighthouse that referenced this pull request Jan 5, 2023
## Issue Addressed

Part of sigp#3651.

## Proposed Changes

Add a flag for enabling the light client server, which should be checked before gossip/RPC traffic is processed (e.g. sigp#3693, sigp#3711). The flag is available at runtime from `beacon_chain.config.enable_light_client_server`.

Additionally, a new method `BeaconChain::with_mutable_state_for_block` is added which I envisage being used for computing light client updates. Unfortunately its performance will be quite poor on average because it will only run quickly with access to the tree hash cache. Each slot the tree hash cache is only available for a brief window of time between the head block being processed and the state advance at 9s in the slot. When the state advance happens the cache is moved and mutated to get ready for the next slot, which makes it no longer useful for merkle proofs related to the head block. Rather than spend more time trying to optimise this I think we should continue prototyping with this code, and I'll make sure `tree-states` is ready to ship before we enable the light client server in prod (cf. sigp#3206).

## Additional Info

I also fixed a bug in the implementation of `BeaconState::compute_merkle_proof` whereby the tree hash cache was moved with `.take()` but never put back with `.restore()`.
macladson pushed a commit to macladson/lighthouse that referenced this pull request Jan 5, 2023
## Issue Addressed

Partially addresses sigp#3651

## Proposed Changes

Adds server-side support for light_client_bootstrap_v1 topic

## Additional Info

This PR, creates each time a bootstrap without using cache, I do not know how necessary a cache is in this case as this topic is not supposed to be called frequently and IMHO we can just prevent abuse by using the limiter, but let me know what you think or if there is any caveat to this, or if it is necessary only for the sake of good practice.


Co-authored-by: Pawan Dhananjay <[email protected]>
paulhauner added a commit to paulhauner/lighthouse that referenced this pull request Jan 19, 2023
Squashed commit of the following:

commit ad08d07
Author: Paul Hauner <[email protected]>
Date:   Mon Dec 5 16:51:06 2022 +1100

    Remove crits for late block

commit 8e85d62
Author: Paul Hauner <[email protected]>
Date:   Mon Dec 5 16:48:43 2022 +1100

    Downgrade log for payload reveal failure

commit 84392d6
Author: Michael Sproul <[email protected]>
Date:   Fri Dec 2 00:07:43 2022 +0000

    Delete DB schema migrations for v11 and earlier (sigp#3761)

    ## Proposed Changes

    Now that the Gnosis merge is scheduled, all users should have upgraded beyond Lighthouse v3.0.0. Accordingly we can delete schema migrations for versions prior to v3.0.0.

    ## Additional Info

    I also deleted the state cache stuff I added in sigp#3714 as it turned out to be useless for the light client proofs due to the one-slot offset.

commit 18c9be5
Author: Mac L <[email protected]>
Date:   Thu Dec 1 06:03:53 2022 +0000

    Add API endpoint to count statuses of all validators (sigp#3756)

    ## Issue Addressed

    sigp#3724

    ## Proposed Changes

    Adds an endpoint to quickly count the number of occurances of each status in the validator set.

    ## Usage

    ```bash
    curl -X GET "http://localhost:5052/lighthouse/ui/validator_count" -H "accept: application/json" | jq
    ```

    ```json
    {
      "data": {
        "active_ongoing":479508,
        "active_exiting":0,
        "active_slashed":0,
        "pending_initialized":28,
        "pending_queued":0,
        "withdrawal_possible":933,
        "withdrawal_done":0,
        "exited_unslashed":0,
        "exited_slashed":3
      }
    }
    ```

commit 2211504
Author: Michael Sproul <[email protected]>
Date:   Wed Nov 30 05:22:58 2022 +0000

    Prioritise important parts of block processing (sigp#3696)

    ## Issue Addressed

    Closes sigp#2327

    ## Proposed Changes

    This is an extension of some ideas I implemented while working on `tree-states`:

    - Cache the indexed attestations from blocks in the `ConsensusContext`. Previously we were re-computing them 3-4 times over.
    - Clean up `import_block` by splitting each part into `import_block_XXX`.
    - Move some stuff off hot paths, specifically:
      - Relocate non-essential tasks that were running between receiving the payload verification status and priming the early attester cache. These tasks are moved after the cache priming:
        - Attestation observation
        - Validator monitor updates
        - Slasher updates
        - Updating the shuffling cache
      - Fork choice attestation observation now happens at the end of block verification in parallel with payload verification (this seems to save 5-10ms).
      - Payload verification now happens _before_ advancing the pre-state and writing it to disk! States were previously being written eagerly and adding ~20-30ms in front of verifying the execution payload. State catchup also sometimes takes ~500ms if we get a cache miss and need to rebuild the tree hash cache.

    The remaining task that's taking substantial time (~20ms) is importing the block to fork choice. I _think_ this is because of pull-tips, and we should be able to optimise it out with a clever total active balance cache in the state (which would be computed in parallel with payload verification). I've decided to leave that for future work though. For now it can be observed via the new `beacon_block_processing_post_exec_pre_attestable_seconds` metric.

    Co-authored-by: Michael Sproul <[email protected]>

commit b4f4c0d
Author: Divma <[email protected]>
Date:   Wed Nov 30 03:21:35 2022 +0000

    Ipv6 bootnodes (sigp#3752)

    ## Issue Addressed
    our bootnodes as of now support only ipv4. this makes it so that they support ipv6

    ## Proposed Changes
    - Adds code necessary to update the bootnodes to run on dual stack nodes and therefore contact and store ipv6 nodes.
    - Adds some metrics about connectivity type of stored peers. It might have been nice to see some metrics over the sessions but that feels out of scope right now.

    ## Additional Info
    - some code quality improvements sneaked in since the changes seemed small
    - I think it depends on the OS, but enabling mapped addresses on an ipv6 node without dual stack support enabled could fail silently, making these nodes effectively ipv6 only. In the future I'll probably change this to use two sockets, which should fail loudly

commit 3534c85
Author: GeemoCandama <[email protected]>
Date:   Tue Nov 29 08:19:27 2022 +0000

    Optimize finalized chain sync by skipping newPayload messages (sigp#3738)

    ## Issue Addressed

    sigp#3704

    ## Proposed Changes
    Adds is_syncing_finalized: bool parameter for block verification functions. Sets the payload_verification_status to Optimistic if is_syncing_finalized is true. Uses SyncState in NetworkGlobals in BeaconProcessor to retrieve the syncing status.

    ## Additional Info
    I could implement FinalizedSignatureVerifiedBlock if you think it would be nicer.

commit a2969ba
Author: Paul Hauner <[email protected]>
Date:   Tue Nov 29 05:51:42 2022 +0000

    Improve debugging experience for builder proposals (sigp#3725)

    ## Issue Addressed

    NA

    ## Proposed Changes

    This PR sets out to improve the logging/metrics experience when interacting with the builder. Namely, it:

    - Adds/changes metrics (see "Metrics Changes" section).
    - Adds new logs which show the duration of requests to the builder/local EL.
    - Refactors existing logs for consistency and so that the `parent_hash` is include in all relevant logs (we can grep for this field when trying to trace the flow of block production).

    Additionally, when I was implementing this PR I noticed that we skip some verification of the builder payload in the scenario where the builder return `Ok` but the local EL returns with `Err`. Namely, we were skipping the bid signature and other values like parent hash and prev randao. In this PR I've changed it so we *always* check these values and reject the bid if they're incorrect. With these changes, we'll sometimes choose to skip a proposal rather than propose something invalid -- that's the only side-effect to the changes that I can see.

    ## Metrics Changes

    - Changed: `execution_layer_request_times`:
        - `method = "get_blinded_payload_local"`: time taken to get a payload from a local EE.
        - `method = "get_blinded_payload_builder"`: time taken to get a blinded payload from a builder.
        - `method = "post_blinded_payload_builder"`: time taken to get a builder to reveal a payload they've previously supplied us.
    - `execution_layer_get_payload_outcome`
        - `outcome = "success"`: we successfully produced a payload from a builder or local EE.
        - `outcome = "failure"`: we were unable to get a payload from a builder or local EE.
    - New: `execution_layer_builder_reveal_payload_outcome`
        - `outcome = "success"`: a builder revealed a payload from a signed, blinded block.
        - `outcome = "failure"`: the builder did not reveal the payload.
    - New: `execution_layer_get_payload_source`
        - `type = "builder"`: we used a payload from a builder to produce a block.
        - `type = "local"`: we used a payload from a local EE to produce a block.
    - New: `execution_layer_get_payload_builder_rejections` has a `reason` field to describe why we rejected a payload from a builder.
    - New: `execution_layer_payload_bids` tracks the bid (in gwei) from the builder or local EE (local EE not yet supported, waiting on EEs to expose the value). Can only record values that fit inside an i64 (roughly 9 million ETH).
    ## Additional Info

    NA

commit 99ec9d9
Author: kevinbogner <[email protected]>
Date:   Mon Nov 28 10:05:43 2022 +0000

    Add Run a Node guide (sigp#3681)

    ## Issue Addressed

    Related to sigp#3672

    ## Proposed Changes

    - Added a guide to run a node. Mainly, copy and paste from 'Merge Migration' and 'Checkpoint Sync'.
    - Ranked it high in ToC:
      - Introduction
      - Installation
      - Run a Node
      - Become a Validator
    	...
    - Hid 'Merge Migration' in ToC.

    ## Additional Info

    - Should I add/rephrase/delete something?
    - Now there is some redundancy:
      - 'Run a node' and 'Checkpoint Sync' contain similar information.
      - Same for 'Run a node' and 'Become a Validator'.

    Co-authored-by: kevinbogner <[email protected]>
    Co-authored-by: Michael Sproul <[email protected]>

commit 2779017
Author: Age Manning <[email protected]>
Date:   Mon Nov 28 07:36:52 2022 +0000

    Gossipsub fast message id change (sigp#3755)

    For improved consistency, this mixes in the topic into our fast message id for more consistent tracking of messages across topics.

commit c881b80
Author: Mac L <[email protected]>
Date:   Mon Nov 28 00:22:53 2022 +0000

    Add CLI flag for gui requirements (sigp#3731)

    ## Issue Addressed

    sigp#3723

    ## Proposed Changes

    Adds a new CLI flag `--gui` which enables all the various flags required for the gui to function properly.
    Currently enables the `--http` and `--validator-monitor-auto` flags.

commit 969ff24
Author: Mac L <[email protected]>
Date:   Fri Nov 25 07:57:11 2022 +0000

    Add CLI flag to opt in to world-readable log files (sigp#3747)

    ## Issue Addressed

    sigp#3732

    ## Proposed Changes

    Add a CLI flag to allow users to opt out of the restrictive permissions of the log files.

    ## Additional Info

    This is not recommended for most users. The log files can contain sensitive information such as validator indices, public keys and API tokens (see sigp#2438). However some users using a multi-user setup may find this helpful if they understand the risks involved.

commit e9bf7f7
Author: antondlr <[email protected]>
Date:   Fri Nov 25 07:57:10 2022 +0000

    remove commas from comma-separated kv pairs (sigp#3737)

    ## Issue Addressed

    Logs are in comma separated kv list, but the values sometimes contain commas, which breaks parsing

commit d5a2de7
Author: Giulio rebuffo <[email protected]>
Date:   Fri Nov 25 05:19:00 2022 +0000

    Added LightClientBootstrap V1  (sigp#3711)

    ## Issue Addressed

    Partially addresses sigp#3651

    ## Proposed Changes

    Adds server-side support for light_client_bootstrap_v1 topic

    ## Additional Info

    This PR, creates each time a bootstrap without using cache, I do not know how necessary a cache is in this case as this topic is not supposed to be called frequently and IMHO we can just prevent abuse by using the limiter, but let me know what you think or if there is any caveat to this, or if it is necessary only for the sake of good practice.

    Co-authored-by: Pawan Dhananjay <[email protected]>
Woodpile37 pushed a commit to Woodpile37/lighthouse that referenced this pull request Jan 6, 2024
## Issue Addressed

Part of sigp#3651.

## Proposed Changes

Add a flag for enabling the light client server, which should be checked before gossip/RPC traffic is processed (e.g. sigp#3693, sigp#3711). The flag is available at runtime from `beacon_chain.config.enable_light_client_server`.

Additionally, a new method `BeaconChain::with_mutable_state_for_block` is added which I envisage being used for computing light client updates. Unfortunately its performance will be quite poor on average because it will only run quickly with access to the tree hash cache. Each slot the tree hash cache is only available for a brief window of time between the head block being processed and the state advance at 9s in the slot. When the state advance happens the cache is moved and mutated to get ready for the next slot, which makes it no longer useful for merkle proofs related to the head block. Rather than spend more time trying to optimise this I think we should continue prototyping with this code, and I'll make sure `tree-states` is ready to ship before we enable the light client server in prod (cf. sigp#3206).

## Additional Info

I also fixed a bug in the implementation of `BeaconState::compute_merkle_proof` whereby the tree hash cache was moved with `.take()` but never put back with `.restore()`.
Woodpile37 pushed a commit to Woodpile37/lighthouse that referenced this pull request Jan 6, 2024
Partially addresses sigp#3651

Adds server-side support for light_client_bootstrap_v1 topic

This PR, creates each time a bootstrap without using cache, I do not know how necessary a cache is in this case as this topic is not supposed to be called frequently and IMHO we can just prevent abuse by using the limiter, but let me know what you think or if there is any caveat to this, or if it is necessary only for the sake of good practice.

Co-authored-by: Pawan Dhananjay <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ready-for-merge This PR is ready to merge. v3.4.0 Minor release following v3.3.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants