
Compact Trie Proof format #56

Closed
wants to merge 3 commits into from

Conversation


@cheme cheme commented Nov 16, 2022

I am currently making this a draft: it seems like a good format to switch to, but it is not strictly needed.

It comes from a conversation I had with @tomaka, and the realization that the current compact proof format can be a bit awkward.

So I am opening this for initial feedback.

The design is largely inspired by the turbogeth one, but without the unneeded instructions and taking into consideration the possibility of having a sequence of compact proofs.
(The prototype implements the eth variant, but I don't want to describe it here; I just mention that it is not incompatible.)

So I created this PR to start the discussion here.

Some parts are undefined, most notably all the codec questions: these are rather tedious details that need to be defined depending on node frequency and other considerations, but they can be completed in a second step. My prototype just uses the SCALE codec instead.

The intention would be to replace all existing compact proof usages and to extend it to the light client RPC too.

The algorithm given here is mostly the decoding one. The building algorithm could also be added, but it can be inferred from the decoding one, and the current implementation is somewhat tied to the parity trie crate's node iterator internals.


tomaka commented Nov 16, 2022

Correct me if I'm wrong: right now, the existing compact proof format is used for:

  • State requests (I haven't checked, but I think you said that?)
  • Within parachains POVs. However this is purely an implementation detail of specific parachains (i.e. the parachain generates the proof then later decodes it, but the relay chain only sees an opaque blob), and thus isn't part of Polkadot.

So if we modify the format, these are the things that would break/change.


cheme commented Nov 16, 2022

* State requests (I haven't checked, but I think you said that?)

Only warp sync uses it (I mean, not the current light client RPC). But we can probably version the protobuf schema somehow.

* Within parachains POVs. However this is purely an implementation detail of specific parachains (i.e. the parachain generates the proof then later decodes it, but the relay chain only sees an opaque blob), and thus isn't part of Polkadot.

Yes; changing the compact proof in Cumulus would still require collators to update their client to create a PoV compatible with the new PVF (this can be tricky, but we did it in the past when adding the first compact proof format), but on the verification side all decoding is in the PVF.


tomaka commented Nov 17, 2022

For what it's worth, my immediate reaction to reading the draft is: I have no idea what to think of this, it's really too complicated and too big of a leap for me to figure out what the advantages and drawbacks of this format are.


cheme commented Nov 17, 2022

The PSP may go a bit too far.
A short description of the format would be as follows (the complexity comes mostly from the sequence-of-proofs handling, which could stay out of the PSP, and from the rules preventing two different encodings for the same state):

Sequence of instruction:

  • KeyPush(nibbles): append nibbles to the current cursor
  • KeyPop(nb_nibbles): remove nb nibbles from the end of the current cursor
  • Value(bytes): set the value at the current cursor
  • HashChild(index, hash): set the child hash for one child of the current cursor

So it is mostly a description of the proof content, ordered by trie node iteration.
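To make the semantics concrete, the four instructions could be sketched as a Rust enum together with a naive replay loop. Everything here (names, types, and the nibble representation) is an illustrative assumption, not the proposed codec:

```rust
// Illustrative sketch only: names, types and representation are assumptions,
// not the proposed codec.
enum Op {
    /// Append these nibbles (values 0..=15) to the current key cursor.
    KeyPush(Vec<u8>),
    /// Remove this many nibbles from the end of the current key cursor.
    KeyPop(usize),
    /// Set the value at the current cursor position.
    Value(Vec<u8>),
    /// Set the hash of child `index` of the node at the current cursor.
    HashChild(u8, [u8; 32]),
}

/// Replay a proof stream against a nibble cursor, collecting the
/// (key nibbles, value) pairs it carries.
fn replay(ops: &[Op]) -> Vec<(Vec<u8>, Vec<u8>)> {
    let mut cursor: Vec<u8> = Vec::new();
    let mut out = Vec::new();
    for op in ops {
        match op {
            Op::KeyPush(n) => cursor.extend_from_slice(n),
            Op::KeyPop(n) => {
                let keep = cursor.len() - n;
                cursor.truncate(keep);
            }
            Op::Value(v) => out.push((cursor.clone(), v.clone())),
            // Child hashes feed root reconstruction, not key/value recovery.
            Op::HashChild(_, _) => {}
        }
    }
    out
}

fn main() {
    let ops = [
        Op::KeyPush(vec![0x6, 0x4, 0x6, 0xF]), // nibbles of b"do"
        Op::Value(b"verb".to_vec()),
        Op::KeyPop(4),
    ];
    let kv = replay(&ops);
    assert_eq!(kv, vec![(vec![0x6, 0x4, 0x6, 0xF], b"verb".to_vec())]);
    println!("decoded {} key/value pair(s)", kv.len());
}
```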

One debatable point is that the value is written when a node gets accessed, while hashes are only written when a node goes out of scope.

I will try to write the decoding in a recursive pseudo-code form later (which may be clearer than the stack-based algorithm described here).


lamafab commented Nov 21, 2022

This PSP has quite a few typos and is quite hard to follow. Imo, once you officially decide to use your codec2 implementation in Substrate, I will need to spec it for spec.polkadot.network anyway. As of now, this PSP reads mostly like a general overview rather than a specification.

I'll leave this open, but it probably needs to be written differently in the future.


cheme commented Jul 31, 2023

A small note about writing the child hash when popping a node and the child value after a push: it allows saving a key push operation when there are no values.
On the other hand it can be rather awkward when the proof skips some content: due to inline nodes we can then get non-sequenced child accesses, and I ended up struggling a bit with this in some algorithms.
So it now makes sense to change to something strictly sequenced (operations strictly following the child index sequence, with the value being considered as index -1).
The example can become:

KeyPush: b"b" without last nibble (1)
HashChild: SomeOtherHash at 1 (b"alfa" branch)
KeyPush: b"bravo" without first (9)
Value: b"bravo"
KeyPop: 9
KeyPush: b"do" without first nibble (3)
Value: b"verb"
KeyPush: b"g" all nibbles (2)
Value: b"puppy"
KeyPush: b"e" all nibbles (2)
Value: [0; 32]
KeyPop: 7
KeyPush: b"hor" skipping first nibble and last nibble (4)
HashChild: SomeHash at 2 (second nibble of b"r")
KeyPush: b"use" skipping first nibble (5)
Value: b"building" // not queried, but included because it is the inline value of an included node
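As a sanity check of the nibble counts above (assuming the usual high-nibble-first split of the ASCII keys), replaying the sequence against a cursor should recover the keys b"bravo", b"do", b"dog", b"doge" and b"horse". This is an illustrative sketch, not the proposed codec:

```rust
// Hypothetical replay of the example sequence, checking that the
// KeyPush/KeyPop nibble counts reconstruct the expected keys.
// Assumed nibble convention: high nibble first within each byte.
fn nibbles(bytes: &[u8]) -> Vec<u8> {
    bytes.iter().flat_map(|b| [b >> 4, b & 0x0F]).collect()
}

fn nibbles_to_bytes(n: &[u8]) -> Vec<u8> {
    n.chunks(2).map(|c| (c[0] << 4) | c[1]).collect()
}

fn main() {
    let mut cursor: Vec<u8> = Vec::new();
    let mut keys: Vec<Vec<u8>> = Vec::new();
    let b = nibbles(b"b");
    cursor.extend_from_slice(&b[..1]);              // KeyPush: b"b" without last nibble (1)
    // HashChild at 1: no cursor change.
    let bravo = nibbles(b"bravo");
    cursor.extend_from_slice(&bravo[1..]);          // KeyPush: b"bravo" without first (9)
    keys.push(nibbles_to_bytes(&cursor));           // Value: b"bravo"
    let keep = cursor.len() - 9;
    cursor.truncate(keep);                          // KeyPop: 9
    let d = nibbles(b"do");
    cursor.extend_from_slice(&d[1..]);              // KeyPush: b"do" without first nibble (3)
    keys.push(nibbles_to_bytes(&cursor));           // Value: b"verb"
    cursor.extend_from_slice(&nibbles(b"g"));       // KeyPush: b"g" all nibbles (2)
    keys.push(nibbles_to_bytes(&cursor));           // Value: b"puppy"
    cursor.extend_from_slice(&nibbles(b"e"));       // KeyPush: b"e" all nibbles (2)
    keys.push(nibbles_to_bytes(&cursor));           // Value: [0; 32]
    let keep = cursor.len() - 7;
    cursor.truncate(keep);                          // KeyPop: 7
    let hor = nibbles(b"hor");
    cursor.extend_from_slice(&hor[1..hor.len() - 1]); // KeyPush: b"hor" skip first and last (4)
    // HashChild at 2: no cursor change.
    let hse = nibbles(b"use");
    cursor.extend_from_slice(&hse[1..]);            // KeyPush: b"use" skipping first (5)
    keys.push(nibbles_to_bytes(&cursor));           // Value: b"building"
    let expected = vec![
        b"bravo".to_vec(), b"do".to_vec(), b"dog".to_vec(),
        b"doge".to_vec(), b"horse".to_vec(),
    ];
    assert_eq!(keys, expected);
    println!("keys recovered: {}", keys.len());
}
```

Note how the b"doge" to b"hor" transition is a single KeyPop of 7 nibbles followed by a partial push: the shared prefix (the first nibble, 0x6) never leaves the cursor.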

So I am closing this and will open it again when I have the branch rewritten and this written in a simpler way.

@cheme cheme closed this Jul 31, 2023