Limit smoldot's CPU usage to 50% #900

Merged 6 commits on Apr 6, 2022

Contributor

@tomaka tomaka commented Mar 25, 2022

This change is really there to avoid a worst-case scenario where smoldot has an issue that causes it to never go to sleep.

This will very slightly slow down syncing (verifying the warp sync proof is a pure CPU operation), but avoiding big CPU spikes is more desirable than saving a few milliseconds.
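
As a rough illustration of what bounding CPU usage to a fraction of one core "on average" can mean, the sketch below computes how long a worker would idle after a burst of work so that its busy fraction matches the limit. The helper name `idleMsAfterWork` and its parameters are hypothetical, and this is not smoldot's actual implementation; it only shows the arithmetic under that assumption.

```typescript
// Hypothetical helper, not part of smoldot or substrate-connect.
// Given how long the worker just ran (workMs) and a rate limit in (0, 1],
// return how long it should idle so that work / (work + idle) equals the limit.
function idleMsAfterWork(workMs: number, rateLimit: number): number {
  if (!(rateLimit > 0 && rateLimit <= 1)) {
    throw new RangeError("rateLimit must be in the range (0, 1]");
  }
  if (!(workMs >= 0)) {
    throw new RangeError("workMs must be a non-negative number");
  }
  // busy fraction = workMs / (workMs + idleMs) = rateLimit
  // => idleMs = workMs * (1 - rateLimit) / rateLimit
  return (workMs * (1 - rateLimit)) / rateLimit;
}
```

Under this model, a limit of 0.5 means 10 ms of work is followed by 10 ms of idling, while a limit of 1 never idles.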

@@ -10,6 +10,7 @@
### Changed

- Update @substrate/smoldot-light to [version 0.6.8](https://github.com/paritytech/smoldot/blob/main/bin/wasm-node/CHANGELOG.md#068---2022-03-23) ([#890](https://github.com/paritytech/substrate-connect/pull/890))
- The smoldot background worker will now bound its CPU usage to 50% of one CPU on average. ([#900](https://github.com/paritytech/substrate-connect/pull/900))
Contributor

I'm not sure this is something the end user should know. It's extension-internal, right?

Contributor Author

It's visible to the user that their CPU usage is bounded to 50%.

@wirednkod wirednkod self-requested a review March 30, 2022 22:21
@wirednkod
Contributor

Something other than the uncaught exception is printed when the zombienet tests are running, but it appears only after the 140-second timeout (the timeout fires before the log is written, so the script exits first).
When spawning zombienet locally, the actual error output is:

Timeout duration was set to 1.
(node:27030) TimeoutOverflowWarning: 19327215118 does not fit into a 32-bit signed integer.
Timeout duration was set to 1.
(node:27030) TimeoutOverflowWarning: 19327215118 does not fit into a 32-bit signed integer.
Timeout duration was set to 1.
(node:27030) TimeoutOverflowWarning: 19327215118 does not fit into a 32-bit signed integer.
Timeout duration was set to 1.
(node:27030) TimeoutOverflowWarning: 19327215117 does not fit into a 32-bit signed integer.
Timeout duration was set to 1.
(node:27030) TimeoutOverflowWarning: 19327215117 does not fit into a 32-bit signed integer.
Timeout duration was set to 1.
(node:27030) TimeoutOverflowWarning: 19327215117 does not fit into a 32-bit signed integer.
Timeout duration was set to 1.
(node:27030) TimeoutOverflowWarning: 19327215117 does not fit into a 32-bit signed integer.
Timeout duration was set to 1.
(node:27030) TimeoutOverflowWarning: 19327215117 does not fit into a 32-bit signed integer.
Timeout duration was set to 1.
(node:27030) TimeoutOverflowWarning: 19327215117 does not fit into a 32-bit signed integer.
Timeout duration was set to 1.
uncaughtException
Error: Timeout running custom js-script (140)

This sample is from a run with 0.75 (I think) passed as the cpuRateLimit param. Similar results were seen with 0.5, 0.75, 0.85, and 0.9, though with the param set to 1 everything works as expected.

With my non-existent Rust skills, I can only assume that the issue originates either here or here, but my Rust is too weak to say for sure.
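
For context on the warnings in the log above: Node.js timers only accept delays up to 2147483647 ms (the largest 32-bit signed integer), and a larger value is replaced with 1 ms, which matches the "Timeout duration was set to 1." lines. A minimal sketch of a guard on the JavaScript side follows; the names `MAX_TIMER_DELAY_MS` and `clampTimerDelay` are hypothetical, and this is not necessarily how the problem was addressed in smoldot.

```typescript
// Largest delay (in ms) that fits in a 32-bit signed integer.
const MAX_TIMER_DELAY_MS = 2 ** 31 - 1;

// Hypothetical helper: returns a whole number of milliseconds in
// [0, MAX_TIMER_DELAY_MS] that can be passed to setTimeout without
// triggering TimeoutOverflowWarning.
function clampTimerDelay(delayMs: number): number {
  if (Number.isNaN(delayMs)) {
    return 0;
  }
  // Negative values become 0; oversized values (including Infinity)
  // are capped instead of silently collapsing to a 1 ms delay.
  return Math.min(Math.max(0, Math.floor(delayMs)), MAX_TIMER_DELAY_MS);
}
```

A caller that really needs to wait longer than the cap would have to re-arm the timer after it fires; the clamp only prevents an intended long sleep from degrading into a 1 ms loop.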

@tomaka
Contributor Author

tomaka commented Mar 31, 2022

paritytech/smoldot#2188

@tomaka
Contributor Author

tomaka commented Mar 31, 2022

paritytech/smoldot#2189

@tomaka
Contributor Author

tomaka commented Mar 31, 2022

Despite the bugs in smoldot, it would still be great to figure out why this uncaughtException isn't caught, and to catch it properly.

@wirednkod
Contributor

> Despite the bugs in smoldot, it would still be great to figure out why this uncaughtException isn't caught, and properly catch it

I went through the zombienet code with @pepoviola. The "uncaughtException" is actually just a console.log, as can be seen here, and the "error" was the timeout that occurred:

Error: Timeout running custom js-script (140)

@pepoviola
Contributor

> > Despite the bugs in smoldot, it would still be great to figure out why this uncaughtException isn't caught, and properly catch it
>
> Went through zombienet code with @pepoviola. The uncaughtException is actually a console.log as can be seen here and the "error" was the timeout occurred:
>
> Error: Timeout running custom js-script (140)

Yes, this will be fixed in the next version of zombienet.
Thanks!

@tomaka tomaka merged commit de55bd3 into paritytech:main Apr 6, 2022
@tomaka tomaka deleted the cpu-limit branch April 6, 2022 07:32