Perform the protobuf encoding and decoding manually #2326

Merged
11 commits merged on May 30, 2022

Conversation

tomaka
Contributor

@tomaka tomaka commented May 30, 2022

Fix #2285
Fix #2312

Unfortunately, the state of protobuf in the Rust ecosystem is rather dire. prost is the only proper library available, and it requires an annoying build step that in turn requires cmake to be installed. The bindings that prost generates are also very Vec-heavy and cannot be made zero-cost. Nor is it possible to detect missing fields, as prost automatically substitutes a default value when a field is missing.

For these reasons, this PR replaces the prost library with manual encoding and decoding of the protobuf messages.
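
To illustrate the missing-field point, here is a minimal sketch of a hand-written decoder that reports an absent field as `None` instead of silently substituting a default. The names `Malformed`, `read_varint` and `find_bytes_field` are hypothetical and are not taken from this PR; only wire types 0 (varint) and 2 (length-delimited) are handled.

```rust
/// Error for input that is not well-formed protobuf (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
struct Malformed;

/// Reads an unsigned LEB128 varint; returns the value and the remaining bytes.
fn read_varint(data: &[u8]) -> Result<(u64, &[u8]), Malformed> {
    let mut value = 0u64;
    // A u64 never needs more than 10 LEB128 bytes.
    for (i, &byte) in data.iter().enumerate().take(10) {
        let chunk = u64::from(byte & 0x7f);
        // The tenth byte may only carry the single remaining bit.
        if i == 9 && chunk > 1 {
            return Err(Malformed);
        }
        value |= chunk << (7 * i as u32);
        if byte & 0x80 == 0 {
            return Ok((value, &data[i + 1..]));
        }
    }
    Err(Malformed) // truncated, or longer than 10 bytes
}

/// Looks up a length-delimited field by number.
/// `Ok(None)` means the field is absent, which is distinct from
/// `Ok(Some(&[]))`, a field that is present but empty.
fn find_bytes_field(mut msg: &[u8], wanted: u64) -> Result<Option<&[u8]>, Malformed> {
    let mut found = None;
    while !msg.is_empty() {
        let (tag, rest) = read_varint(msg)?;
        let (number, wire_type) = (tag >> 3, tag & 0x7);
        // Both supported wire types start with a varint after the tag.
        let (n, rest) = read_varint(rest)?;
        msg = match wire_type {
            0 => rest, // varint field: `n` is the value itself, skip it
            2 => {
                // Length-delimited field: `n` is the payload length.
                if n > rest.len() as u64 {
                    return Err(Malformed);
                }
                let (payload, rest) = rest.split_at(n as usize);
                if number == wanted {
                    found = Some(payload);
                }
                rest
            }
            _ => return Err(Malformed), // other wire types are out of scope here
        };
    }
    Ok(found)
}
```

With this shape, the caller decides whether a missing field is an error, rather than receiving an empty value that looks the same as a field that was sent empty.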

In terms of code, this makes things harder to read, because field names are replaced with their indices. However, I believe that it's worth it. All the protocols we use are stable, so there is no risk of making a mistake when backporting a protocol change, which is the main reason for wanting easy-to-read field names.
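
As a rough illustration of what encoding by field index looks like, the sketch below writes a single `bytes` field by hand. Everything here is hypothetical (the `Example` message, the constant and the function names) and is not the API added by this PR; a named constant is one possible way to recover some of the readability that field names gave.

```rust
/// Appends `value` to `out` as an unsigned LEB128 varint.
fn write_varint(mut value: u64, out: &mut Vec<u8>) {
    loop {
        let byte = (value & 0x7f) as u8;
        value >>= 7;
        if value == 0 {
            out.push(byte); // last byte: continuation bit cleared
            return;
        }
        out.push(byte | 0x80); // more bytes follow
    }
}

/// Protobuf wire type for length-delimited values (bytes, strings, messages).
const WIRE_TYPE_LEN: u64 = 2;

/// Appends a length-delimited field: tag, then length, then the payload.
fn write_bytes_field(field: u64, data: &[u8], out: &mut Vec<u8>) {
    write_varint((field << 3) | WIRE_TYPE_LEN, out);
    write_varint(data.len() as u64, out);
    out.extend_from_slice(data);
}

// Hypothetical schema, for illustration only:
//   message Example { bytes payload = 2; }
const EXAMPLE_PAYLOAD_FIELD: u64 = 2;

/// Encodes the hypothetical `Example` message.
fn encode_example(payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    write_bytes_field(EXAMPLE_PAYLOAD_FIELD, payload, &mut out);
    out
}
```

For instance, `encode_example(b"hi")` produces the four bytes `0x12 0x02 0x68 0x69`: the tag for field 2 with wire type 2, the length, then the payload.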

Contributor

@mergify mergify bot left a comment

Automatically approving tomaka's pull requests. This auto-approval will be removed once more maintainers are active.

@tomaka
Contributor Author

tomaka commented May 30, 2022

The wasm-node-size-diff step fails because it requires cmake to determine the size of the Wasm before this PR. 🤦

@github-actions
Contributor

github-actions bot commented May 30, 2022

twiggy diff report

Difference in .wasm size before and after this pull request.


 Delta Bytes │ Item
─────────────┼────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
      -45999 ┊ smoldot::executor::host::ReadyToRun::run_once::h034c05dac545ce7d
      +45999 ┊ smoldot::executor::host::ReadyToRun::run_once::h9143cfdcb862bc50
      -41864 ┊ smoldot::json_rpc::methods::MethodCall::from_defs::h3f7bd17a212d344d
      +41864 ┊ smoldot::json_rpc::methods::MethodCall::from_defs::he627e2c81e43f5f9
      -11616 ┊ <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::h3d717b0b0d0e23c5
      +11616 ┊ <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::hecce9ec513aeb33f
      +11199 ┊ <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::h0e41b4655da4a6de
      -11199 ┊ <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::h65cfbc287eec0090
      +11113 ┊ smoldot_light_base::Client<TChain,TPlat>::add_chain::h3312cbfdcaefe59e
      -11113 ┊ smoldot_light_base::Client<TChain,TPlat>::add_chain::h519ec746a91d74bd
      +10445 ┊ smoldot::network::service::ChainNetwork<TNow>::next_event::h7700155e7591caf1
      +10402 ┊ <parity_wasm::elements::ops::Instruction as parity_wasm::elements::Deserialize>::deserialize::h3c8b846fb00a17fc
      -10402 ┊ <parity_wasm::elements::ops::Instruction as parity_wasm::elements::Deserialize>::deserialize::hcf29e180bf4354ec
      -10386 ┊ smoldot::network::service::ChainNetwork<TNow>::next_event::hbd1d383b35744ebc
      +10168 ┊ smoldot::json_rpc::methods::Response::to_json_response::hc6b81792187cc405
      -10168 ┊ smoldot::json_rpc::methods::Response::to_json_response::heabfd849462b61b5
      +10089 ┊ <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::h2427c60317fdc9c5
      -10089 ┊ <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::h3934ac9bf69c8920
       +9676 ┊ smoldot::libp2p::connection::established::substream::Substream<TNow,TRqUd,TNotifUd>::read_write2::h0e9746a6f8ff98e1
       -9676 ┊ smoldot::libp2p::connection::established::substream::Substream<TNow,TRqUd,TNotifUd>::read_write2::hfca4161fb57e159a
      +47072 ┊ ... and 27203 more.
      +75016 ┊ Σ [27223 Total Rows]

Contributor

@melekes melekes left a comment

It would be good to add some tests for the encoder in src/util/protobuf.rs

for slice in protobuf::bytes_tag_encode(2, key) {
    out.extend_from_slice(slice.as_ref());
}
debug_assert_eq!(out.len(), CAPACITY);
Contributor

Do you want to add debug_assert_eq! in other similar places as well?

Contributor Author

@tomaka tomaka May 30, 2022

So it's a bit complicated. The expected length of the final buffer is not easy to calculate, because numbers are encoded in LEB128, in other words in a variable-length way.
For example, if you send a buffer of 30 bytes, the message will be 32 bytes (30 + 2: one byte for the tag and one for the length). But if you send a buffer of 150 bytes, the message will be 153 bytes (150 + 3), because a length of 128 or more no longer fits in a single LEB128 byte.

I've done this debug_assert_eq! thing in peer_id.rs because the encoding of a PeerId is fully deterministic and always has the same size, but in other places I feel like hardcoding a capacity would be a bit too hacky.
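
The arithmetic above can be sketched as follows. The helper names `varint_len` and `bytes_field_len` are hypothetical and not part of this PR, and the numbers assume a field number below 16, so that the tag fits in one byte.

```rust
/// Number of bytes needed to encode `value` as unsigned LEB128.
fn varint_len(value: u64) -> usize {
    // Each LEB128 byte carries 7 bits of payload; zero still takes one byte.
    let bits = 64 - value.leading_zeros() as usize;
    std::cmp::max(1, (bits + 6) / 7)
}

/// Encoded size of a length-delimited field: tag + length prefix + payload.
fn bytes_field_len(field: u64, data_len: usize) -> usize {
    // Wire type 2 (length-delimited) occupies the low three bits of the tag.
    varint_len((field << 3) | 2) + varint_len(data_len as u64) + data_len
}
```

Here `bytes_field_len(2, 30)` gives 32 and `bytes_field_len(2, 150)` gives 153, matching the figures above. A helper like this could also be used to compute a capacity up front instead of hardcoding one, at the cost of walking the fields twice.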

let mut buf = Vec::with_capacity(protobuf.encoded_len());
protobuf.encode(&mut buf).unwrap();
buf
let mut out = Vec::new();
Contributor

What about capacity?

Contributor Author

I'll add some capacities everywhere 👍

@tomaka
Contributor Author

tomaka commented May 30, 2022

Pushed some tests.

Through these tests I've found a way to make the LEB128 decoding code panic. Fixed it and added that to the CHANGELOG.
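
The thread does not say which input triggered the panic. As a general illustration only, the sketch below shows one way to write an LEB128 decoder that returns an error on truncated or oversized input instead of panicking. Typical panic sources in this kind of code are shifting by 64 bits or more (which panics in debug builds) and indexing past the end of the slice. The names `Leb128Error` and `decode_leb128` are hypothetical and not taken from this PR.

```rust
/// Reasons why decoding can fail (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
enum Leb128Error {
    /// The input ended before a byte without the continuation bit was seen.
    Truncated,
    /// The encoded value does not fit in a `u64`.
    Overflow,
}

/// Decodes an unsigned LEB128 number from the start of `input`.
/// Returns the value and the bytes that follow it.
fn decode_leb128(input: &[u8]) -> Result<(u64, &[u8]), Leb128Error> {
    let mut value: u64 = 0;
    for (index, &byte) in input.iter().enumerate() {
        let shift = 7 * index as u32;
        let chunk = u64::from(byte & 0x7f);
        // Check the shift amount first: `<<` by 64 or more would panic in debug
        // builds. The second test rejects payload bits shifted past bit 63.
        if shift >= 64 || (chunk << shift) >> shift != chunk {
            return Err(Leb128Error::Overflow);
        }
        value |= chunk << shift;
        if byte & 0x80 == 0 {
            // `index + 1 <= input.len()`, so this slice cannot go out of bounds.
            return Ok((value, &input[index + 1..]));
        }
    }
    Err(Leb128Error::Truncated)
}
```

Iterating over the slice rather than indexing into it removes the out-of-bounds case by construction, and every arithmetic step is guarded before it runs.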

@tomaka tomaka added the automerge Automatically merge pull request as soon as possible label May 30, 2022
@mergify mergify bot merged commit 69b11f5 into paritytech:main May 30, 2022
@tomaka tomaka deleted the protobuf branch May 30, 2022 16:47
Labels
automerge Automatically merge pull request as soon as possible
Development

Successfully merging this pull request may close these issues.

Ensure determinism of PeerId encoding
Figure out a solution for the protobuf files
2 participants