-
Notifications
You must be signed in to change notification settings - Fork 743
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sc-consensus-beefy: pump gossip engine while waiting for initialization conditions #3435
sc-consensus-beefy: pump gossip engine while waiting for initialization conditions #3435
Conversation
This new threshold is inline with the thresholds of the other gossip protocols. On Kusama where there are 1000 validators, each gossiping at least one message per round, the 10k threshold gets hit during node restarts when there is an expected lag between gossip engine initialization and BEEFY worker initialization. During this lag, more than 10k messages build up and an error message is put up. With this commit, that annoying message goes away. Signed-off-by: Adrian Catangiu <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes LGTM, but perhaps it worth to leave a couple of "why?" comments everywhere where we are using that new Action::Ignore
. Something like: "The worker is not initialized yet, so messages may still be valid and we shall not punish senders". Or even better (which may result in a worse code though): add explicit fn is_worker_active()
function to Filter
and if it is active, return Action::Ignore
and otherwise change reputation.
Without it, we may revert it accidentally in the future
// Pump peer reports | ||
_ = &mut beefy_comms.gossip_report_stream.next() => { | ||
continue | ||
}, | ||
// Pump gossip engine. | ||
_ = &mut beefy_comms.gossip_engine => { | ||
error!(target: LOG_TARGET, "🥩 Gossip engine has unexpectedly terminated."); | ||
return | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was wondering if we could just initialize the BeefyComms
including the gossip engine and gossip report stream after BeefyWorkerBuilder::async_initialize
finished in order to avoid pumping them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I spent a few days trying to make that happen a few months ago, as it should work in theory, but in practice it breaks integration tests.
I've not seen any problems in local testing when running local nodes, but integration tests that are using sc_network_test
are not finding local peers when I'd initialize GossipEngine after waiting for BEEFY genesis. I couldn't get to the bottom of it and dropped the idea.
I'd be very happy if someone could make it work :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I see. Would it make sense to add a separate issue for this ?
31546c8
…on conditions (#3435) As part of BEEFY worker/voter initialization the task waits for certain chain and backend conditions to be fulfilled: - BEEFY consensus enabled on-chain & GRANDPA best finalized higher than on-chain BEEFY genesis block, - backend has synced headers for BEEFY mandatory blocks between best BEEFY and best GRANDPA. During this waiting time, any messages gossiped on the BEEFY topic for current chain get enqueued in the gossip engine, leading to RAM bloating and output warning/error messages when the wait time is non-negligible (like during a clean sync). This PR adds logic to pump the gossip engine while waiting for other things to make sure gossiped messages get consumed (practically discarded until worker is fully initialized). Also raises the warning threshold for enqueued messages from 10k to 100k. This is in line with the other gossip protocols on the node. Fixes #3390 --------- Signed-off-by: Adrian Catangiu <[email protected]>
…on conditions (paritytech#3435) As part of BEEFY worker/voter initialization the task waits for certain chain and backend conditions to be fulfilled: - BEEFY consensus enabled on-chain & GRANDPA best finalized higher than on-chain BEEFY genesis block, - backend has synced headers for BEEFY mandatory blocks between best BEEFY and best GRANDPA. During this waiting time, any messages gossiped on the BEEFY topic for current chain get enqueued in the gossip engine, leading to RAM bloating and output warning/error messages when the wait time is non-negligible (like during a clean sync). This PR adds logic to pump the gossip engine while waiting for other things to make sure gossiped messages get consumed (practically discarded until worker is fully initialized). Also raises the warning threshold for enqueued messages from 10k to 100k. This is in line with the other gossip protocols on the node. Fixes paritytech#3390 --------- Signed-off-by: Adrian Catangiu <[email protected]>
As part of BEEFY worker/voter initialization the task waits for certain chain and backend conditions to be fulfilled:
During this waiting time, any messages gossiped on the BEEFY topic for current chain get enqueued in the gossip engine, leading to RAM bloating and output warning/error messages when the wait time is non-negligible (like during a clean sync).
This PR adds logic to pump the gossip engine while waiting for other things to make sure gossiped messages get consumed (practically discarded until worker is fully initialized).
Also raises the warning threshold for enqueued messages from 10k to 100k. This is in line with the other gossip protocols on the node.
Fixes #3390