Excessive operationInaccessible events #1804

Closed
josepot opened this issue Apr 30, 2024 · 2 comments · Fixed by #1812

Comments

@josepot
Contributor

josepot commented Apr 30, 2024

It happens fairly often that one (or several) of the initial storage operations is interrupted by an operationInaccessible event. Polkadot-API retries later, and it always ends up working eventually, but sometimes it takes several attempts: it can take ~15 seconds (~20 attempts, one every ~750ms) for the operation to actually resolve.

What polkadot-api does internally is retry the same storage query against the next finalized/best block (unless the consumer has specifically requested the storage entry against a particular block). However, sometimes the next finalized/best block also takes a while to arrive, and that translates into a very degraded UX due to very long loading times.
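To make the retry behaviour described above concrete, here is a minimal sketch of the decision it implies. This is not polkadot-api's actual code (which is TypeScript); `Target`, `retry_block` and `worst_case_wait_ms` are hypothetical names invented for illustration, and the block hashes are placeholders.

```rust
/// What the consumer asked the storage query to run against (hypothetical type).
#[derive(Debug, Clone, PartialEq, Eq)]
enum Target {
    /// The consumer pinned the query to one specific block hash.
    Pinned(String),
    /// The consumer only asked for the current finalized/best block.
    Latest,
}

/// Picks the block to retry against after an `operationInaccessible` event.
/// Returns `None` when the query follows the latest block but no newer block
/// has arrived yet, which is the situation that produces the long waits.
fn retry_block<'a>(target: &'a Target, failed: &str, latest: &'a str) -> Option<&'a str> {
    match target {
        // A pinned query can only ever be retried against the same block.
        Target::Pinned(hash) => Some(hash.as_str()),
        // Otherwise move on to the newer finalized/best block, if there is one.
        Target::Latest if latest != failed => Some(latest),
        // Nothing newer yet: keep waiting.
        Target::Latest => None,
    }
}

/// Upper bound on the wait when every attempt but the last is inaccessible.
fn worst_case_wait_ms(attempts: u64, interval_ms: u64) -> u64 {
    attempts * interval_ms
}
```

With the rough numbers from this report, `worst_case_wait_ms(20, 750)` gives 15000 ms, which matches the ~15 seconds observed.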

It's worth pointing out that this only seems to happen right after "initializing" the chain.

The smoldot logs:
sm-logs.txt

The logs of the messages sent over JSON-RPC (I noticed that they are trimmed in the smoldot logs, so just in case):
wire-logs.txt

@josepot josepot changed the title Excesive operationInaccessible events Excessive operationInaccessible events Apr 30, 2024
@tomaka
Contributor

tomaka commented May 2, 2024

In the logs there are the following Polkadot blocks, each the parent of the next:

  • 0x6ec4... (number 20575359): relay chain block whose parahead is 0x70a9...
  • 0xf7dc...: relay chain block whose parahead is 0x70a9...
  • 0x2a37...: relay chain block whose parahead is 0x208b...
  • 0xf702...: relay chain block whose parahead is 0x208b...

On the parachain peer-to-peer network, we see:

  • 0x208b...: parachain block reported as the best block of parachain nodes when we connect to them
  • 0x6c33...: parachain block announced later

The inaccessible block is 0x70a9...


So what I think happens is that on the parachain peer-to-peer network we see blocks that are ahead of what the relay chain has marked as best.
My guess is that I never noticed this happening before because we now have asynchronous backing, although I don't fully understand the consequences of asynchronous backing off the top of my head.

Unfortunately, smoldot doesn't parse parachain blocks, because in principle they don't have to be valid headers. Consequently, smoldot doesn't understand that 0x70a9... is a parent of 0x208b.... In theory, any peer that knows 0x208b... also knows 0x70a9..., but smoldot doesn't understand that and thinks that none of the peers knows 0x70a9....

At some point the relay chain will mark 0x208b... as best block and things work again.

@tomaka
Contributor

tomaka commented May 2, 2024

I think that, given the way parachains work, a simple and correct fix is to assume that by default all parachain nodes know the block that the relay chain has marked as best. If it turns out to not be the case, then we can ban them.
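As a rough illustration of the difference between the current behaviour and the proposed default, here is a simplified sketch. It is not smoldot's actual data structure or API: `PeerKnowledge` and its methods are hypothetical, and the short strings such as `"0x70a9"` are only labels standing in for the truncated hashes above.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical, simplified view of what we believe each parachain peer knows.
#[derive(Default)]
struct PeerKnowledge {
    /// Blocks each peer explicitly reported (handshake best block or announcement).
    announced: HashMap<String, HashSet<String>>,
    /// Peers that failed to serve a block they were assumed to know.
    banned: HashSet<String>,
}

impl PeerKnowledge {
    fn announce(&mut self, peer: &str, block: &str) {
        self.announced
            .entry(peer.to_string())
            .or_default()
            .insert(block.to_string());
    }

    fn ban(&mut self, peer: &str) {
        self.banned.insert(peer.to_string());
    }

    /// Strict rule: only explicit reports count, because parachain headers are
    /// opaque and ancestry between blocks cannot be inferred.
    fn peers_strict(&self, block: &str) -> Vec<String> {
        let mut peers: Vec<String> = self
            .announced
            .iter()
            .filter(|(peer, blocks)| !self.banned.contains(*peer) && blocks.contains(block))
            .map(|(peer, _)| peer.clone())
            .collect();
        peers.sort();
        peers
    }

    /// Proposed default: if `block` is the parahead that the relay chain marks
    /// as best, assume every non-banned peer knows it.
    fn peers_assuming_relay_best(&self, block: &str, relay_best_parahead: &str) -> Vec<String> {
        if block != relay_best_parahead {
            return self.peers_strict(block);
        }
        let mut peers: Vec<String> = self
            .announced
            .keys()
            .filter(|peer| !self.banned.contains(*peer))
            .cloned()
            .collect();
        peers.sort();
        peers
    }
}
```

In the scenario from the logs, peers only ever reported 0x208b..., so the strict rule finds no peer for 0x70a9... and the operation becomes inaccessible, while the proposed default would query every connected peer and ban those that turn out not to have the block.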
