Perform the protobuf encoding and decoding manually #2326

tomaka · 2022-05-30T08:35:46Z

Unfortunately, the state of protobuf in the Rust ecosystem is rather dire. prost is the only proper library available, and it requires an annoying build step that in turn requires cmake to be installed. The bindings generated by prost are also very Vec-heavy and cannot be made zero-cost. Nor is it possible to detect missing fields, because prost silently substitutes a default value when a field is absent.

For these reasons, this PR replaces the prost library with manual encoding and decoding of the protobuf messages.

In terms of code, this makes things harder to read, because field names are replaced with their indices. However, I believe that it's worth it. All the protocols we use are stable, so there is no risk of making a mistake when backporting a change to a protocol, which is the main reason for wanting easy-to-read field names.

mergify

Automatically approving tomaka's pull requests. This auto-approval will be removed once more maintainers are active.

tomaka · 2022-05-30T08:44:11Z

The wasm-node-size-diff step fails because it still needs cmake to build the code as it was before this PR in order to measure the size of the Wasm. 🤦

github-actions · 2022-05-30T08:52:55Z

twiggy diff report

Difference in .wasm size before and after this pull request.

 Delta Bytes │ Item
─────────────┼────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
      -45999 ┊ smoldot::executor::host::ReadyToRun::run_once::h034c05dac545ce7d
      +45999 ┊ smoldot::executor::host::ReadyToRun::run_once::h9143cfdcb862bc50
      -41864 ┊ smoldot::json_rpc::methods::MethodCall::from_defs::h3f7bd17a212d344d
      +41864 ┊ smoldot::json_rpc::methods::MethodCall::from_defs::he627e2c81e43f5f9
      -11616 ┊ <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::h3d717b0b0d0e23c5
      +11616 ┊ <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::hecce9ec513aeb33f
      +11199 ┊ <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::h0e41b4655da4a6de
      -11199 ┊ <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::h65cfbc287eec0090
      +11113 ┊ smoldot_light_base::Client<TChain,TPlat>::add_chain::h3312cbfdcaefe59e
      -11113 ┊ smoldot_light_base::Client<TChain,TPlat>::add_chain::h519ec746a91d74bd
      +10445 ┊ smoldot::network::service::ChainNetwork<TNow>::next_event::h7700155e7591caf1
      +10402 ┊ <parity_wasm::elements::ops::Instruction as parity_wasm::elements::Deserialize>::deserialize::h3c8b846fb00a17fc
      -10402 ┊ <parity_wasm::elements::ops::Instruction as parity_wasm::elements::Deserialize>::deserialize::hcf29e180bf4354ec
      -10386 ┊ smoldot::network::service::ChainNetwork<TNow>::next_event::hbd1d383b35744ebc
      +10168 ┊ smoldot::json_rpc::methods::Response::to_json_response::hc6b81792187cc405
      -10168 ┊ smoldot::json_rpc::methods::Response::to_json_response::heabfd849462b61b5
      +10089 ┊ <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::h2427c60317fdc9c5
      -10089 ┊ <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::h3934ac9bf69c8920
       +9676 ┊ smoldot::libp2p::connection::established::substream::Substream<TNow,TRqUd,TNotifUd>::read_write2::h0e9746a6f8ff98e1
       -9676 ┊ smoldot::libp2p::connection::established::substream::Substream<TNow,TRqUd,TNotifUd>::read_write2::hfca4161fb57e159a
      +47072 ┊ ... and 27203 more.
      +75016 ┊ Σ [27223 Total Rows]

melekes

It would be good to add some tests for the encoder in src/util/protobuf.rs

melekes · 2022-05-30T12:45:53Z

src/libp2p/peer_id.rs

+                for slice in protobuf::bytes_tag_encode(2, key) {
+                    out.extend_from_slice(slice.as_ref());
+                }
+                debug_assert_eq!(out.len(), CAPACITY);


Do you want to add debug_assert_eq! in other similar places as well?

So it's a bit complicated. The expected length of the final buffer is not easy to calculate, because numbers are encoded in LEB128, in other words with a variable length.
For example, if you send a buffer of 30 bytes, the message will be 32 bytes (30 + 2: one byte for the tag and one for the length). But if you send a buffer of 150 bytes, the message will be 153 bytes (150 + 3: the length no longer fits in 7 bits and takes two bytes).

I've done this debug_assert_eq! thing in peer_id.rs because the encoding of a PeerId is fully deterministic and always has the same size, but in other places I feel like hardcoding a capacity would be a bit too hacky.
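The size arithmetic in the reply above can be sketched as follows. This is only an illustration, not the code in src/util/protobuf.rs: the function names `leb128_len`, `leb128_encode`, `bytes_field_len` and `encode_bytes_field` are hypothetical, and the sketch assumes a length-delimited field (protobuf wire type 2) whose tag is `(field << 3) | 2`.

```rust
/// Number of bytes needed to encode `value` as unsigned LEB128
/// (7 payload bits per byte).
fn leb128_len(mut value: u64) -> usize {
    let mut len = 1;
    while value >= 0x80 {
        value >>= 7;
        len += 1;
    }
    len
}

/// Appends the unsigned LEB128 encoding of `value` to `out`.
fn leb128_encode(mut value: u64, out: &mut Vec<u8>) {
    while value >= 0x80 {
        // Low 7 bits, with the continuation bit set.
        out.push((value & 0x7f) as u8 | 0x80);
        value >>= 7;
    }
    out.push(value as u8);
}

/// Total size of a length-delimited field: tag + length prefix + payload.
fn bytes_field_len(field: u64, payload_len: usize) -> usize {
    let tag = (field << 3) | 2; // wire type 2 = length-delimited
    leb128_len(tag) + leb128_len(payload_len as u64) + payload_len
}

/// Hypothetical encoder that sizes its buffer up front instead of
/// hardcoding a capacity.
fn encode_bytes_field(field: u64, payload: &[u8]) -> Vec<u8> {
    let expected = bytes_field_len(field, payload.len());
    let mut out = Vec::with_capacity(expected);
    leb128_encode((field << 3) | 2, &mut out);
    leb128_encode(payload.len() as u64, &mut out);
    out.extend_from_slice(payload);
    debug_assert_eq!(out.len(), expected);
    out
}
```

With these helpers, `bytes_field_len(2, 30)` gives 32 and `bytes_field_len(2, 150)` gives 153, matching the figures quoted above. Computing the length this way costs an extra pass over the numbers, which is the trade-off against a hardcoded constant.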

melekes · 2022-05-30T13:00:57Z

src/network/kademlia.rs

-    let mut buf = Vec::with_capacity(protobuf.encoded_len());
-    protobuf.encode(&mut buf).unwrap();
-    buf
+    let mut out = Vec::new();


what about capacity?

I'll add some capacities everywhere 👍

tomaka · 2022-05-30T16:12:39Z

Pushed some tests.

Through these tests I've found a way to make the LEB128 decoding code panic. Fixed it and added that to the CHANGELOG.
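The comment does not say which input triggered the panic, so the following is only a sketch of how an unsigned LEB128 decoder can report errors instead of panicking on malformed input. The names `Leb128Error` and `leb128_decode` are hypothetical, and the sketch assumes a `u64` target, which is not necessarily what smoldot uses.

```rust
/// Ways in which decoding can fail without panicking.
#[derive(Debug, PartialEq, Eq)]
enum Leb128Error {
    /// The input ended while the continuation bit was still set.
    Truncated,
    /// The encoded number does not fit in a `u64`.
    Overflow,
}

/// Decodes an unsigned LEB128 number from the start of `input`.
/// Returns the value and the number of bytes consumed.
fn leb128_decode(input: &[u8]) -> Result<(u64, usize), Leb128Error> {
    let mut value: u64 = 0;
    for (i, &byte) in input.iter().enumerate() {
        let bits = u64::from(byte & 0x7f);
        // A u64 holds at most ten 7-bit groups, and the tenth group
        // may only contribute a single bit (bit 63).
        if i >= 10 || (i == 9 && bits > 1) {
            return Err(Leb128Error::Overflow);
        }
        // `i` is at most 9 here, so the shift is at most 63 and cannot overflow.
        value |= bits << (7 * i as u32);
        if byte & 0x80 == 0 {
            return Ok((value, i + 1));
        }
    }
    Err(Leb128Error::Truncated)
}
```

Checking the group index before shifting is what keeps the shift amount in range; an unchecked shift of 64 bits or more is one common way for such a decoder to panic in debug builds.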

Perform the protobuf encoding and decoding manually

0e26f26

mergify bot approved these changes May 30, 2022

tomaka added 3 commits May 30, 2022 10:37

Link to noise protobuf

a4b41e5

Remove unused function to fix warning

e66f187

Restore libc6-dev-i386 on 32bits target CI

e78e276

Restore cmake in wasm-node-size-diff step

e5cf2cc

Remove more traces of cmake

48cf472

melekes approved these changes May 30, 2022

tomaka added 4 commits May 30, 2022 17:06

Forgot a protobuf file

942e9d9

Add values for with_capacity

a949e19

Rustfmt

cf96ba8

Add tests

6a86d3f

tomaka added the automerge Automatically merge pull request as soon as possible label May 30, 2022

Rustfmt

066ae1f

mergify bot merged commit 69b11f5 into paritytech:main May 30, 2022

tomaka deleted the protobuf branch May 30, 2022 16:47

tomaka mentioned this pull request Jun 6, 2022

Fix LEB128-related panic again #2337

Merged
