Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

trie-db: Fetch the closest merkle value #199

Merged
merged 35 commits into from
Sep 11, 2023
Merged
Show file tree
Hide file tree
Changes from 10 commits
Commits
Show all changes
35 commits
Select commit Hold shift + click to select a range
bf44483
trie-db: Add `get_closest_merkle_value` to Trie trait
lexnv Aug 18, 2023
c2e2ecd
trie-db: Extract the merkle value
lexnv Aug 21, 2023
a93733f
trie-db/test: Check merkle value on update key
lexnv Aug 21, 2023
546fdb5
Update trie-db/src/lookup.rs
lexnv Aug 23, 2023
adfd6f1
Update trie-db/src/lookup.rs
lexnv Aug 23, 2023
b7db016
trie-db: Rename look_up_merkle_without_cache function
lexnv Aug 23, 2023
fa10f70
trie-db/tests: Check closest descendant of partial keys
lexnv Aug 23, 2023
5998874
trie-db: Adjust lookups for partial keys
lexnv Aug 23, 2023
fb365bb
trie-db/tests: Check non-existent key and branch nodes
lexnv Aug 23, 2023
77f19e2
trie-db: Ensure recording of `NonExisting` for leaves
lexnv Aug 23, 2023
7e2ae36
trie-db: Ensure the merkle descedent hash is returned
lexnv Aug 23, 2023
40d700c
trie-db/tests: Extend tests with branch nodes and single key db
lexnv Aug 23, 2023
1124af8
trie-db/tests: Check trie modification and merkle propagation
lexnv Aug 24, 2023
bc2d977
trie-db/tests: Use `PrefixedKey` instead of `HashKey`
lexnv Aug 28, 2023
b46eecd
trie-db/tests: Test extra keys for `test_merkle_value`
lexnv Aug 28, 2023
f714033
trie-db: Return the extension node hash
lexnv Aug 28, 2023
82127bb
trie-db: Use `starts_with` method instead of common prefix
lexnv Aug 28, 2023
1b5fcae
trie-db: Return no merkle value on empty node
lexnv Aug 28, 2023
28bb9fa
trie-db/tests: Ensure inline nodes
lexnv Aug 28, 2023
5de55d0
trie-db/tests: Check empty trie with empty keys
lexnv Aug 28, 2023
d6c8a55
trie-db/tests: Add extra keys to check
lexnv Aug 28, 2023
e2ffe4f
trie-db: Use `starts_with` for extension nodes
lexnv Aug 28, 2023
21b7307
trie-db: Rename merkle lookups to lookup_first_descendant
lexnv Aug 29, 2023
5c6ea20
trie-db: Return inline hashes properly
lexnv Aug 29, 2023
1a6fa44
trie-db: Implement starts_with_slice for NibbleVec
lexnv Aug 29, 2023
db496a7
trie-db: Use cache for finding first descendent hash
lexnv Aug 29, 2023
770a5f5
trie-db: Introduce caching for descedent node access
lexnv Aug 29, 2023
9fb3bb3
trie-db: Use rstd::vec::Vec
lexnv Aug 29, 2023
5d29a75
trie-db: Forward merkle value for fatdb and sectriedb
lexnv Sep 6, 2023
bc32ac8
trie-db: Rename `get_closest_merkle_value` to `lookup_first_descendant`
lexnv Sep 6, 2023
d538fe3
trie-db: Introduce MerkleValue to return inline nodes and hashes
lexnv Sep 6, 2023
5c05937
trie-db: Remove inner function for merkle value lookups
lexnv Sep 6, 2023
b8a589f
Update trie-db/src/lib.rs
lexnv Sep 11, 2023
d9af8ce
Update trie-db/src/lib.rs
lexnv Sep 11, 2023
1fa98e5
Apply fmt
lexnv Sep 11, 2023
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
7 changes: 7 additions & 0 deletions trie-db/src/fatdb.rs
Original file line number Diff line number Diff line change
Expand Up @@ -72,6 +72,13 @@ where
self.raw.get_with(L::Hash::hash(key).as_ref(), query)
}

fn get_closest_merkle_value(
&self,
_key: &[u8],
lexnv marked this conversation as resolved.
Show resolved Hide resolved
) -> Result<Option<TrieHash<L>>, TrieHash<L>, CError<L>> {
Ok(None)
lexnv marked this conversation as resolved.
Show resolved Hide resolved
}

fn iter<'a>(
&'a self,
) -> Result<
Expand Down
13 changes: 13 additions & 0 deletions trie-db/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -277,6 +277,12 @@ pub trait Trie<L: TrieLayout> {
query: Q,
) -> Result<Option<Q::Item>, TrieHash<L>, CError<L>>;

/// Returns the merkle value of the closest descendant node of the key.
fn get_closest_merkle_value(
lexnv marked this conversation as resolved.
Show resolved Hide resolved
&self,
key: &[u8],
) -> Result<Option<TrieHash<L>>, TrieHash<L>, CError<L>>;

/// Returns a depth-first iterator over the elements of trie.
fn iter<'a>(
&'a self,
Expand Down Expand Up @@ -404,6 +410,13 @@ impl<'db, 'cache, L: TrieLayout> Trie<L> for TrieKinds<'db, 'cache, L> {
wrapper!(self, get_with, key, query)
}

fn get_closest_merkle_value(
&self,
key: &[u8],
) -> Result<Option<TrieHash<L>>, TrieHash<L>, CError<L>> {
wrapper!(self, get_closest_merkle_value, key)
}

fn iter<'a>(
&'a self,
) -> Result<
Expand Down
148 changes: 148 additions & 0 deletions trie-db/src/lookup.rs
Original file line number Diff line number Diff line change
Expand Up @@ -134,6 +134,19 @@ where
}
}

/// Look up the closest merkle value.
///
/// When the provided key leads to a node, then the merkle value of that node
/// is returned. However, if the key does not lead to a node, then the closest
/// merkle value is returned.
pub fn look_up_merkle(
bkchr marked this conversation as resolved.
Show resolved Hide resolved
lexnv marked this conversation as resolved.
Show resolved Hide resolved
self,
full_key: &[u8],
nibble_key: NibbleSlice,
) -> Result<Option<TrieHash<L>>, TrieHash<L>, CError<L>> {
self.lookup_first_descendent_without_cache(nibble_key, full_key)
}

/// Look up the given `nibble_key`.
///
/// If the value is found, it will be passed to the given function to decode or copy.
Expand Down Expand Up @@ -655,4 +668,139 @@ where
}
Ok(None)
}

/// Look up the merkle value (hash) of the node that is the closest descendant for the provided
/// key.
///
/// When the provided key leads to a node, then the merkle value of that node
/// is returned. However, if the key does not lead to a node, then the merkle value
/// of the closest descendant is returned. `None` if no such descendant exists.
fn lookup_first_descendent_without_cache(
mut self,
nibble_key: NibbleSlice,
full_key: &[u8],
) -> Result<Option<TrieHash<L>>, TrieHash<L>, CError<L>> {
bkchr marked this conversation as resolved.
Show resolved Hide resolved
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The return type is wrong.

enum MerkleValue {
     /// Is returned if `v.len() < MAX_INLINE_VALUE`.
     Node(Vec<u8>),
     Hash(H),
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did propose to use hash for inline node (generally I would prefer if inline node is something users don't need to be aware off). But I did not double check the rpc proposal.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe trie crate api could return this enum, but the rpc return hash of the data value for inline node (this way we can avoid some hashing if this api get use by runtime in the future).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But the spec is speaking about the merkle value and not the hash. The point being that if the data is smaller, we save bandwidth etc.

(this way we can avoid some hashing if this api get use by runtime in the future)

This doesn't add any extra hashing. If you need hashing for the runtime, we can just do it where we use it. There would be no extra hashing involved as we currently also hash the inline node manually.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense! I've added the MerkleValue enum to be explicit about what the trie-db returns.

Then in substrate's RPC layer we could further hash(MerkleValue::Node(..)) such that we return a consistent view of nodes and not care about inline ones. And if this function gets ever called from the runtime we save a Layout:hash() call 🙏

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then in substrate's RPC layer we could further hash(MerkleValue::Node(..)) such that we return a consistent view of nodes and not care about inline ones.

This still goes against the size argument I laid out above. I opened paritytech/json-rpc-interface-spec#88 to be sure that we use the correct value in the RPC.

let mut partial = nibble_key;
let mut hash = self.hash;
let mut key_nibbles = 0;

// this loop iterates through non-inline nodes.
for depth in 0.. {
let node_data = match self.db.get(&hash, nibble_key.mid(key_nibbles).left()) {
cheme marked this conversation as resolved.
Show resolved Hide resolved
Some(value) => value,
None =>
return Err(Box::new(match depth {
0 => TrieError::InvalidStateRoot(hash),
_ => TrieError::IncompleteDatabase(hash),
})),
};

self.record(|| TrieAccess::EncodedNode {
hash,
encoded_node: node_data.as_slice().into(),
});

// this loop iterates through all inline children (usually max 1)
// without incrementing the depth.
let mut node_data = &node_data[..];
loop {
let decoded = match L::Codec::decode(node_data) {
Ok(node) => node,
Err(e) => return Err(Box::new(TrieError::DecoderError(hash, e))),
};

let next_node = match decoded {
Node::Leaf(slice, _) => {
if !partial.starts_with(&slice) {
self.record(|| TrieAccess::NonExisting { full_key });
return Ok(None)
}

if partial.len() != slice.len() {
self.record(|| TrieAccess::NonExisting { full_key });
}

return Ok(Some(hash))
},
Node::Extension(slice, item) =>
if partial.starts_with(&slice) {
partial = partial.mid(slice.len());
key_nibbles += slice.len();
item
} else {
self.record(|| TrieAccess::NonExisting { full_key });

return Ok(None)
},
Node::Branch(children, value) =>
if partial.is_empty() {
if value.is_none() {
self.record(|| TrieAccess::NonExisting { full_key });
}

return Ok(Some(hash))
} else {
match children[partial.at(0) as usize] {
Some(x) => {
partial = partial.mid(1);
key_nibbles += 1;
x
},
None => {
self.record(|| TrieAccess::NonExisting { full_key });

return Ok(Some(hash))
},
}
},
Node::NibbledBranch(slice, children, value) => {
if !partial.starts_with(&slice) {
self.record(|| TrieAccess::NonExisting { full_key });

return Ok(None)
}

if partial.len() == slice.len() {
if value.is_none() {
self.record(|| TrieAccess::NonExisting { full_key });
}

return Ok(Some(hash))
} else {
match children[partial.at(slice.len()) as usize] {
Some(x) => {
partial = partial.mid(slice.len() + 1);
key_nibbles += slice.len() + 1;
x
},
None => {
self.record(|| TrieAccess::NonExisting { full_key });

return Ok(Some(hash))
},
}
}
},
Node::Empty => {
self.record(|| TrieAccess::NonExisting { full_key });

return Ok(Some(hash))
lexnv marked this conversation as resolved.
Show resolved Hide resolved
},
};

// check if new node data is inline or hash.
match next_node {
NodeHandle::Hash(data) => {
hash = decode_hash::<L::Hash>(data)
.ok_or_else(|| Box::new(TrieError::InvalidHash(hash, data.to_vec())))?;
break
},
NodeHandle::Inline(data) => {
node_data = data;
cheme marked this conversation as resolved.
Show resolved Hide resolved
},
}
}
}
Ok(None)
}
}
7 changes: 7 additions & 0 deletions trie-db/src/sectriedb.rs
Original file line number Diff line number Diff line change
Expand Up @@ -75,6 +75,13 @@ where
self.raw.get_with(L::Hash::hash(key).as_ref(), query)
}

fn get_closest_merkle_value(
&self,
_key: &[u8],
) -> Result<Option<TrieHash<L>>, TrieHash<L>, CError<L>> {
Ok(None)
lexnv marked this conversation as resolved.
Show resolved Hide resolved
}

fn iter<'a>(
&'a self,
) -> Result<
Expand Down
17 changes: 17 additions & 0 deletions trie-db/src/triedb.rs
Original file line number Diff line number Diff line change
Expand Up @@ -252,6 +252,23 @@ where
.look_up(key, NibbleSlice::new(key))
}

fn get_closest_merkle_value(
&self,
key: &[u8],
) -> Result<Option<TrieHash<L>>, TrieHash<L>, CError<L>> {
let mut cache = self.cache.as_ref().map(|c| c.borrow_mut());
let mut recorder = self.recorder.as_ref().map(|r| r.borrow_mut());

Lookup::<L, _> {
db: self.db,
query: |_: &[u8]| (),
hash: *self.root,
cache: cache.as_mut().map(|c| &mut ***c as &mut dyn TrieCache<L::Codec>),
recorder: recorder.as_mut().map(|r| &mut ***r as &mut dyn TrieRecorder<TrieHash<L>>),
}
.look_up_merkle(key, NibbleSlice::new(key))
}

fn iter<'a>(
&'a self,
) -> Result<
Expand Down
52 changes: 52 additions & 0 deletions trie-db/test/src/triedb.rs
Original file line number Diff line number Diff line change
Expand Up @@ -644,6 +644,58 @@ fn test_recorder_with_cache_get_hash_internal<T: TrieLayout>() {
}
}

test_layouts!(test_merkle_value, test_merkle_value_internal);
fn test_merkle_value_internal<T: TrieLayout>() {
let mut memdb = MemoryDB::<T::Hash, HashKey<_>, DBValue>::default();
lexnv marked this conversation as resolved.
Show resolved Hide resolved
let mut root = Default::default();

// Data set.
let key_value = vec![
(b"A".to_vec(), vec![1; 64]),
(b"AA".to_vec(), vec![2; 64]),
(b"AAA".to_vec(), vec![3; 64]),
(b"AAB".to_vec(), vec![4; 64]),
(b"AB".to_vec(), vec![5; 4]),
(b"B".to_vec(), vec![6; 64]),
];
lexnv marked this conversation as resolved.
Show resolved Hide resolved
{
let mut t = TrieDBMutBuilder::<T>::new(&mut memdb, &mut root).build();
for (key, value) in &key_value {
t.insert(key, value).unwrap();
}
}

// Ensure we can fetch the merkle values for all present keys.
let trie = TrieDBBuilder::<T>::new(&memdb, &root).build();
for (key, _) in &key_value {
trie.get_closest_merkle_value(key).unwrap().unwrap();
}

// Key is not present in the trie, but the closest descendant is AAA.
let hash = trie.get_closest_merkle_value(b"AAAA").unwrap().unwrap();
let expected = trie.get_closest_merkle_value(b"AAA").unwrap().unwrap();
assert_eq!(hash, expected);

// Key is not present in the trie, but the closest descendant is AB.
let hash = trie.get_closest_merkle_value(b"ABA").unwrap().unwrap();
let expected = trie.get_closest_merkle_value(b"AB").unwrap().unwrap();
assert_eq!(hash, expected);

// Key is not present in the tire.
// At the nibble level (4 bits), b"A" is represented by number 65 in binary: 0100 0001.
// The key 0000 01000 shares no common prefix in the trie.
let not_present = trie.get_closest_merkle_value(&[0b_0000_0100]).unwrap();
assert!(not_present.is_none());

// b"D" should exist since it shares a prefix with b"A" and it returns the branch node
// when the layout is `ExtensionLayout` or a nibble branch otherwise.
let d_key = trie.get_closest_merkle_value(&[0b_0100_0100]).unwrap().unwrap();
// The same is true for b"C".
let c_key = trie.get_closest_merkle_value(&[0b_0100_0011]).unwrap().unwrap();
// Check the same branch is returned.
assert_eq!(d_key, c_key);
}

test_layouts!(iterator_seek_with_recorder, iterator_seek_with_recorder_internal);
fn iterator_seek_with_recorder_internal<T: TrieLayout>() {
let d = vec![b"A".to_vec(), b"AA".to_vec(), b"AB".to_vec(), b"B".to_vec()];
Expand Down