Improvements to all_forks syncing #2114

tomaka · 2022-03-03T09:25:09Z

I don't know how to title this PR more appropriately than "generic improvements to the code".
This PR basically finishes the original vision of the pending_blocks.rs module by adding a unnecessary_unverified_blocks function.

For context, the data structure in pending_blocks.rs tracks blocks that we know exist but can't be verified yet (because they're not a direct child of a block that has been verified, or because haven't downloaded their header yet, or in the case of the full node because we only have their header and not their body).

This newly-added unnecessary_unverified_blocks function returns a list of unverified blocks that aren't strictly necessary to make the syncing progress. These blocks are blocks that are known to be on a bad fork, and blocks that are somewhere in-between the best block of a peer and a locally-known fork.

For example, if we are at block 10 and a peer announces block 15, we will download block 14, then block 13, then block 12, etc. (note: in reality we should download all 5 blocks at once, but in the worst case scenario we'll download them one by one). After block 13 has been downloaded, we can remove block 14 from memory (and keep block 13) without impacting the syncing, as the download of block 12 will still happen.

These unnecessary unverified blocks are removed from the data structure if it reaches a certain size.
It is important to do so, in order to avoid spam attacks where peers announce for example block 9999 when the chain is actually only at block 10. The PR sets the arbitrary threshold to 100 blocks, meaning that in that example, after we've downloaded block 9900, we will start removing block 9998, then block 9997, etc. Despite removing blocks, we will, in finite time, still be able to determine whether this alternative chain is good or bad. If the announcement of block 9999 turned out to be legit, then we will be synchronized at block 110, and will start downloading the blocks from block 9999 again from scratch. It isn't great, but again it guarantees that we'll be fully in sync in finite time, and without an explosion in memory usage.
For long range downloads, such as the initial syncing, the so-called "optimistic" syncing should be used instead. In other words, long-range downloads are out of scope of this code.

In addition to this new method, this PR does a lot of renaming and documentation clean ups, in order to clarify how the data structure works. I've had to do this in order to implement unnecessary_unverified_blocks because I was myself a bit lost. It introduces some small hacks into all_forks.rs, but all_forks.rs isn't super clean at the moment, and I don't think its quality is really degraded by these changes. It is necessary to put pending_blocks.rs in a clean state before all_forks.rs itself can be cleaned up.

@melekes I understand that this is a bit complicated since you don't know about this code at all, so I'm fine with you not properly reviewing this.

mergify

Automatically approving tomaka's pull requests. This auto-approval will be removed once more maintainers are active.

github-actions · 2022-03-03T09:31:32Z

twiggy diff report

Difference in .wasm size before and after this pull request.

 Delta Bytes │ Item
─────────────┼────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
       +1080 ┊ hashbrown::raw::RawTable<T,A>::reserve_rehash::h6f932820bd8f7864
        +568 ┊ smoldot::sync::all_forks::sources::AllForksSources<TSrc>::remove_known_block::h7f3bc0383120c363
        -565 ┊ prost::message::Message::merge::h065af8b8a2eab9ba
        -565 ┊ prost::message::Message::merge::h6c488a5b7681e719
        -565 ┊ prost::message::Message::merge::h93b1409e682863f3
        +514 ┊ core::mem::drop::h1ddff9021a0da073
        -374 ┊ core::ptr::drop_in_place<core::future::from_generator::GenFuture<smoldot_light_base::runtime_service::RuntimeService<smoldot_light_wasm::platform::Platform>::new::{{closure}}>>::h26fc98b6914430e4
        -337 ┊ smoldot::sync::all_forks::pending_blocks::PendingBlocks<TBl,TRq,TSrc>::set_block_header_body_known::h48f686aa80496c3b
        +337 ┊ smoldot::sync::all_forks::pending_blocks::PendingBlocks<TBl,TRq,TSrc>::set_unverified_block_header_body_known::h56662fa26b9c61e3
        -335 ┊ smoldot::sync::all_forks::pending_blocks::PendingBlocks<TBl,TRq,TSrc>::set_block_header_known::h2724687081db0087
        +335 ┊ smoldot::sync::all_forks::pending_blocks::PendingBlocks<TBl,TRq,TSrc>::set_unverified_block_header_known::h9eb5c2ba9de6a0ba
        +328 ┊ <core::iter::adapters::map::Map<I,F> as core::iter::traits::iterator::Iterator>::try_fold::h011281a60f7e070d
        +304 ┊ smoldot::sync::all_forks::AllForksSync<TBl,TRq,TSrc>::block_from_source::ha4689e3d733da697
        +299 ┊ smoldot::sync::all_forks::pending_blocks::PendingBlocks<TBl,TRq,TSrc>::unnecessary_unverified_blocks::ha3b2cb5beac741e6
        +284 ┊ smoldot::sync::all_forks::disjoint::DisjointBlocks<TBl>::is_parent_bad::h757308a771c701c9
        +282 ┊ core::mem::drop::h361b7d43ee22eb41
        -277 ┊ smoldot::sync::all_forks::AllForksSync<TBl,TRq,TSrc>::new::hdcfae8b5b8323ec0
        +266 ┊ <core::iter::adapters::map::Map<I,F> as core::iter::traits::iterator::Iterator>::try_fold::h51828dfe1349a26e
        +263 ┊ smoldot::sync::all_forks::AllForksSync<TBl,TRq,TSrc>::new::hd17a0fb2ff12d7e3
        +249 ┊ <core::iter::adapters::map::Map<I,F> as core::iter::traits::iterator::Iterator>::try_fold::ha6a7b9b084ffd66a
       +3730 ┊ ... and 330 more.
       +8175 ┊ Σ [350 Total Rows]

melekes · 2022-03-03T16:11:08Z

@tomaka thanks for the write up 🙏 it's very helpful.

For long range downloads, such as the initial syncing, the so-called "optimistic" syncing should be used instead. In other words, long-range downloads are out of scope of this code.

is there some sort of the automatic switching between optimistic and all-sync options in reality? or the user has to manually choose the method which makes more sense to him/her?

tomaka · 2022-03-03T16:16:50Z

is there some sort of the automatic switching between optimistic and all-sync options in reality?

The situation at the moment is:

The light client does a warp sync to the head of the chain and then immediately switches to all-forks.
The full node only uses optimistic syncing. All-forks is not used at all in the full node, and that's because it doesn't fully support full node capabilities yet.

The idea for the full node is that you would always start in optimistic mode and then transition to all-forks mode.
When to transition from optimistic to all-forks and when to possibly transition back is actually unclear to me.

Substrate currently uses some heuristics that are something along the lines of "if the majority of peers have a best block that is >50 blocks higher than ours, then use optimistic mode" (the equivalent of optimistic mode is actually called "major syncing" in Substrate).
I don't think there's a better solution than heuristics.

melekes

lgtm, but have a few nitpicks

src/sync/all.rs

src/sync/all_forks.rs

src/sync/all_forks/pending_blocks.rs

Co-authored-by: Anton Kaliaev <[email protected]>

melekes

🚀

tomaka added 8 commits March 2, 2022 11:51

Some clarifications to docs of the all_forks submodules

f7bcbc3

Add PendingBlocks::unnecessary_blocks

60de2fa

Docfixes

11e53e5

Rename and tweak the "remove" functions of pending_blocks

1a1a6cb

Remove unnecessary blocks in all_forks

006c630

Implement banned blocks

21575a1

Banned blocks in Config now an iterator

23f330a

Many renames and documentation

e39e2bc

mergify bot approved these changes Mar 3, 2022

View reviewed changes

Docs tweaks

7258aa0

melekes reviewed Mar 3, 2022

View reviewed changes

tomaka and others added 3 commits March 4, 2022 10:49

Update src/sync/all_forks/pending_blocks.rs

bdc39ec

Co-authored-by: Anton Kaliaev <[email protected]>

mark_unverified_block_as_bad

bb30419

Renames

c82248b

melekes approved these changes Mar 4, 2022

View reviewed changes

tomaka added the automerge Automatically merge pull request as soon as possible label Mar 4, 2022

Merge branch 'main' into clarify-all-forks-docs

bf4fb8a

mergify bot merged commit 71128fa into paritytech:main Mar 4, 2022

tomaka deleted the clarify-all-forks-docs branch March 4, 2022 15:57

tomaka mentioned this pull request Mar 22, 2022

Don't remove source knows block after verification #2168

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improvements to all_forks syncing #2114

Improvements to all_forks syncing #2114

tomaka commented Mar 3, 2022 •

edited

Loading

mergify bot left a comment

github-actions bot commented Mar 3, 2022 •

edited

Loading

melekes commented Mar 3, 2022

tomaka commented Mar 3, 2022 •

edited

Loading

melekes left a comment

melekes left a comment

Improvements to all_forks syncing #2114

Improvements to all_forks syncing #2114

Conversation

tomaka commented Mar 3, 2022 • edited Loading

mergify bot left a comment

Choose a reason for hiding this comment

github-actions bot commented Mar 3, 2022 • edited Loading

twiggy diff report

melekes commented Mar 3, 2022

tomaka commented Mar 3, 2022 • edited Loading

melekes left a comment

Choose a reason for hiding this comment

melekes left a comment

Choose a reason for hiding this comment

tomaka commented Mar 3, 2022 •

edited

Loading

github-actions bot commented Mar 3, 2022 •

edited

Loading

tomaka commented Mar 3, 2022 •

edited

Loading