-
Notifications
You must be signed in to change notification settings - Fork 729
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use dynamic aura slot duration in lookahead collator #3211
Conversation
There is not that much that could be done here. So, I think it is fine that these errors are appearing for this small moment in time. |
While thinking again about this and looking at the code, the error should not happen. This is because we are not using the correct block to determine the slot duration or probably not the correct block. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fix looks sane to me.
However, I'm not familiar with all the internal stuff, so I would like to understand what happens in the following scenarios(I can help with the testing):
-
We upgrade the parachain runtime but the collator nodes is still running the old version, before Enable async backing on all testnet system chains #2949 and Enable async backing on asset-hub-rococo #2826.
-
We upgrade the collator node, but the runtime is still the old ones before Enable async backing on all testnet system chains #2949 and Enable async backing on asset-hub-rococo #2826.
-
We upgrade both runtime and collator node, does the relay chain need async backing enabled or will it work even with a relay chain where async backing is not enabled.
Btw: #2826, was included in 1.7.0 what will happen if they deploy it(node or runtime) without this fix?
Not checked yet, probably it's not a valid scenario. Collators should upgrade before the runtime upgrade (the same way we should wait for the relay chain nodes to upgrade before applying a runtime upgrade with some new features utilized with by nodes)
The parachain will progress normally, but we'll see runtime panics in logs every second block. They are harmless but frightening 😵💫
Not sure, my guess would be the same as 2., but would be good to test.
Again, not 100% sure, my guess is the collators might have to be restarted after the runtime upgrade, otherwise the parachain will stall. |
bfb30e2
to
3e8e268
Compare
It's a follow-up of #2949. It enables the lookahead collator to dynamically adjust the aura slot size, which may change during the runtime upgrade. It also addressed a couple of issues with time constants I missed in the original PR.
Good news: it works. The parachain successfully switches from sync backing with 12s slots to async backing with 6s slots.
Bad news: during the transitional period of a single block in which the actual runtime upgrade is performed, it still gets the old slot duration of 12s (as it gets it from the best block), resulting in a runtime panic (logs follow). That doesn't affect the following block production of the parachain. Ideas on how to improve the situation are appreciated.