RPC performance decimated #906

emielsebastiaan · 2020-03-17T11:49:37Z

For our Polkascan use-case we extensively query the Substrate RPC endpoints.
We notice a very significant difference in performance between v0.7.20 and v0.7.25.
Kusama v0.7.20 is approximately 8 times faster than v0.7.25.

We have run v0.7.25 with various OPTIONS:

--state-cache-size 8192000000
--max-runtime-instances 256
--db-cache=8192

This does not make a significant difference.
In general our harvester requests the following RPCs for each and every block:

chain_getBlock
state_getStorage (various calls)
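
For readers who want to reproduce this access pattern, here is a minimal sketch of the per-block requests over HTTP JSON-RPC. It is not the Polkascan harvester code. The helper names (`rpc_payload`, `rpc_call`, `fetch_block`) and the `storage_keys` argument are assumptions made for illustration; only the RPC method names come from Substrate.

```python
import json
import urllib.request


def rpc_payload(method, params, request_id=1):
    """Build a JSON-RPC 2.0 request body (hypothetical helper)."""
    return {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}


def rpc_call(url, method, params):
    """POST one JSON-RPC request to the node and return its result field."""
    body = json.dumps(rpc_payload(method, params)).encode()
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        reply = json.load(response)
    if "error" in reply:
        raise RuntimeError(reply["error"])
    return reply["result"]


def fetch_block(url, number, storage_keys):
    """Fetch one block plus a set of storage entries at that block.

    storage_keys is a list of hex-encoded storage keys chosen by the caller;
    which keys the real harvester reads is not stated in this thread.
    """
    block_hash = rpc_call(url, "chain_getBlockHash", [number])
    block = rpc_call(url, "chain_getBlock", [block_hash])
    storage = {
        key: rpc_call(url, "state_getStorage", [key, block_hash])
        for key in storage_keys
    }
    return block, storage
```

Against a node started with the HTTP RPC enabled, `fetch_block("http://127.0.0.1:9933", 1000, [some_key])` would issue one `chain_getBlockHash`, one `chain_getBlock` and one `state_getStorage` per key.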

We would like to get back to the performance we had when we were using v0.7.20.
Please advise.


bkchr · 2020-03-17T12:24:38Z

@tomusdrw did we have any big changes to the rpc crate lately?

tomusdrw · 2020-03-17T13:39:26Z

I presume it's HTTP server, right?

Just checked and we bumped jsonrpc from 14.0.3 to 14.0.5. There is a change that may well affect that:
https://github.com/paritytech/jsonrpc/pull/518/files#diff-b40889955acdf574fe9db43bfc17372dL541

It should still utilise all 4 threads of the runtime spawned by the rpc-servers crate (part of substrate), though.

@emielvanderhoek would you mind checking:

What is the load distribution between cores in these two different versions (i.e. are we using more cores on 0.7.20)?
If you are ok compiling the client, could you check this branch:
https://github.com/paritytech/polkadot/tree/td-old-http to see if it brings the performance back to satisfactory levels?

Also what metric do you use to measure performance? Is it average response time? Or do you mean resource utilisation (as in 0.7.20 taking 8x less resources)?

emielsebastiaan · 2020-03-17T15:28:49Z

Yes we are using RPC over HTTP (not WS).
I have built the branched release (https://github.com/paritytech/polkadot/tree/td-old-http) and ran it. It is performing slightly better but unfortunately not anywhere near the v0.7.20 level.

I currently have a very coarse-grained performance metric (I am aware of that). I'll follow up with a better definition soon.

tomusdrw · 2020-03-17T15:33:19Z

@emielvanderhoek Thanks a lot for running the branch. It suggests that the issue is not fully caused by the jsonrpc change, so we need to dig deeper.
It would be best to collect some performance data, but I'm not entirely sure what the best format would be.

@arkpar is it possible that some DB/cache options were changed and could be causing that? It seems that --db-cache 8G does not really fix the issue; was there anything else? Can you suggest how to best collect performance metrics? Should we try with valgrind --tool=callgrind or do you have better ideas?

emielsebastiaan · 2020-03-17T16:03:27Z

What is the load distribution between cores in these two different versions (i.e. are we using more cores on 0.7.20)?

I will check the difference between v0.7.20 and v0.7.25. I know the machine I am running on has eight cores/threads available.
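
One way to answer the core-distribution question on Linux is to sample `/proc/stat` twice while the harvester is running and compare the per-core counters. The following is only a sketch under that assumption; the function names are made up for illustration, and tools such as `top` or `mpstat` report the same information.

```python
def parse_proc_stat(text):
    """Return {core: (busy_ticks, total_ticks)} from /proc/stat content.

    Only per-core lines (cpu0, cpu1, ...) are kept; the aggregate "cpu"
    line is skipped. Idle time is counted as idle + iowait.
    """
    cores = {}
    for line in text.splitlines():
        parts = line.split()
        if not parts or not parts[0].startswith("cpu") or parts[0] == "cpu":
            continue
        # user, nice, system, idle, iowait, irq, softirq, steal
        ticks = [int(value) for value in parts[1:9]]
        idle = ticks[3] + (ticks[4] if len(ticks) > 4 else 0)
        total = sum(ticks)
        cores[parts[0]] = (total - idle, total)
    return cores


def utilisation(before, after):
    """Fraction of time each core was busy between two samples."""
    result = {}
    for core, (busy_after, total_after) in after.items():
        busy_before, total_before = before.get(core, (0, 0))
        elapsed = total_after - total_before
        result[core] = (busy_after - busy_before) / elapsed if elapsed > 0 else 0.0
    return result


def read_proc_stat(path="/proc/stat"):
    """Read and parse the live counters (Linux only)."""
    with open(path) as handle:
        return parse_proc_stat(handle.read())
```

Taking `read_proc_stat()` before and after a harvester batch on each node version and passing both samples to `utilisation` shows whether one version keeps more cores busy than the other.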

arkpar · 2020-03-17T17:58:29Z

We need a benchmark for this.

chain_getBlock
state_getStorage

These don't invoke the runtime, so it can't be wasm execution.
Would be nice to get some kind of profiling report.

@arkpar is it possible that some DB/cache options were changed and could be causing that? It seems that --db-cache 8G does not really fix the issue; was there anything else? Can you suggest how to best collect performance metrics? Should we try with valgrind --tool=callgrind or do you have better ideas?

valgrind or perf
Here are some instructions for the latter:
https://rust-lang.github.io/packed_simd/perf-guide/prof/linux.html

emielsebastiaan · 2020-03-17T18:27:38Z

Ok new info...
When I ran the branched version (https://github.com/paritytech/polkadot/tree/td-old-http) I did not run it with the --db-cache 8192 option; I omitted this option altogether. Just now I did run it with this added option and it did increase overall performance.

I will see what I can do to get real numbers (benchmark).
For now an indication of performance is:

v0.7.20: "fast"
v0.7.25: "slow" (~1/10 * fast)
v0.7.25 with --db-cache 8192: "slow" (~1/8 * fast)
v0.7.25-90bf7bc (td-old-http): "slow" (~1/6 * fast)
v0.7.25-90bf7bc (td-old-http) with --db-cache 8192: "slow" (~1/3 * fast)

As I said, we currently only have a coarse-grained benchmark.
Our harvester works in batches of 10 blocks and fetches from HTTP-RPC 'chain_getBlock' and various 'state_getStorage'. Then it processes the data and stores it in our relational database.
Fast is defined as what we were used to (~10 blocks per second).
Slow goes all the way down to less than one block per second.
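
To turn "fast" and "slow" into numbers, the blocks-per-second rate can be measured per batch. The sketch below assumes a caller-supplied `fetch(block_number)` function that performs the RPC calls for one block; that function and the name `blocks_per_second` are hypothetical and not part of the harvester.

```python
import time


def blocks_per_second(fetch, start, count, batch_size=10):
    """Time fetch() over `count` blocks and return one rate per batch.

    fetch is any callable taking a block number; the batch size of 10
    mirrors the harvester batches described above.
    """
    rates = []
    end = start + count
    for first in range(start, end, batch_size):
        last = min(first + batch_size, end)
        started = time.perf_counter()
        for number in range(first, last):
            fetch(number)
        elapsed = time.perf_counter() - started
        # Guard against a zero reading from very fast stand-in functions.
        rates.append((last - first) / elapsed if elapsed > 0 else float("inf"))
    return rates
```

Running the same block range against a v0.7.20 node and a v0.7.25 node and comparing the returned lists would give a per-batch baseline instead of a rough ratio.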

arkpar · 2020-03-17T18:51:49Z

@emielvanderhoek Could you provide instructions to run the harvester?
https://github.com/polkascan/polkascan-pre are these up to date?

emielsebastiaan · 2020-03-19T08:46:57Z

@arkpar I would need to check. We are currently wrapping up some grant work and hence most of that is pending a big refactor.

arjanz · 2020-03-31T09:44:16Z

I have attached a simple script that loops through 1000 blocks the same way the harvester would (90% of our RPC calls are for extrinsics and events). I noticed that just after a clean sync the performance of both versions is roughly the same. I suspect the performance drops as the database grows bigger, but I couldn't confirm that yet.

On a Python 3.6+ env run:

pip install substrate-interface
python rpc_perf_test.py http://[ip-address]:9933

rpc_perf_test.py.zip

emielsebastiaan · 2020-03-31T09:49:11Z

Unfortunately I cannot get the v0.7.20 version to sync anymore. So it is hard to get an objective baseline for performance with this script...


bkchr added the I9-footprint label (an enhancement to provide a smaller system load, memory, network or disk footprint) on Mar 17, 2020

smohan-dw mentioned this issue Dec 5, 2021

V0.9.13 dhiway/cord#33

Merged

rphmeier closed this as completed Oct 26, 2022
