Very high CPU usage with empty example using just default plugins #1462
Comments
Similar issue on Windows 10.0.19042 Build 19042 with Bevy 0.4: around 30-40% CPU usage and 40-50% integrated Intel GPU usage (instead of using the dedicated Nvidia GPU) on a blank window, causing the laptop fans to start spinning like crazy:
use bevy::prelude::*;
fn main() {
    App::build().add_plugins(DefaultPlugins).run();
} |
After further testing, I seem to have similar symptoms when just using
use bevy::prelude::*;
fn main() {
    App::build().add_plugins(MinimalPlugins).run();
} |
Can you add the frame time and log diagnostic plugins and respond with your framerate? I have a feeling that you aren't getting frame limited and so the cpu is just spinning:
When bevy is frame limited (and built in release mode) it is much kinder to cpus. |
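To illustrate why a frame-limited loop is kinder to the CPU, here is a minimal standard-library sketch. It is not Bevy's actual mechanism (which relies on vsync presentation); `run_limited` and `spin_for` are hypothetical names made up for this example, and the per-frame work is left empty on purpose.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Frame-limited loop: after each (empty) frame, sleep until the next deadline
/// so the thread is parked instead of burning CPU. Returns elapsed wall time.
fn run_limited(frames: u32, frame_time: Duration) -> Duration {
    let start = Instant::now();
    let mut next_deadline = start + frame_time;
    for _ in 0..frames {
        // Frame work would go here; this sketch does nothing per frame.
        let now = Instant::now();
        if now < next_deadline {
            thread::sleep(next_deadline - now);
        }
        next_deadline += frame_time;
    }
    start.elapsed()
}

/// Unlimited loop: iterates as fast as possible for `window`, keeping one core
/// busy the whole time. Returns how many iterations ("frames") it ran.
fn spin_for(window: Duration) -> u64 {
    let start = Instant::now();
    let mut iterations = 0u64;
    loop {
        iterations += 1;
        if start.elapsed() >= window {
            break;
        }
    }
    iterations
}
```

Calling `run_limited(10, Duration::from_millis(16))` takes at least 160 ms of wall time while mostly sleeping, whereas `spin_for(Duration::from_millis(160))` keeps a core busy for the same window and typically reports a very large iteration count.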
Oh wait you already did that :) |
Just ran the code in this issue's description (using the master branch) and got the following results:
@malj I expect Bevy 0.4 to not be frame limited with the example you posted, because we moved to Mailbox vsync instead of Fifo vsync. We have since reverted that change on master until we can sort out a better frame limiting approach. However, @nehalem501 clearly has frame limiting, so something else is going on there. Maybe it is related to Macs? If I find time I'll try testing that out. |
I actually have similar results on my Linux machine. According to the logs, frame limiting seems to work. Same code as in my initial bug report, running a release build with git bevy revision a895256. Ubuntu 20.04 x86_64 with a 16t/8c AMD CPU, Nvidia proprietary GPU drivers, X11 (Unity DE). Rust version:
What
logs:
|
On @cart's recommendation I am testing performance for Here is my diff on index db98543b..ecc5f457 100644
--- a/crates/bevy_tasks/src/task_pool.rs
+++ b/crates/bevy_tasks/src/task_pool.rs
@@ -205,22 +205,15 @@ impl TaskPool {
// this so we must convert to 'static here to appease the compiler as it is unable to
// validate safety.
let fut: Pin<&mut (dyn Future<Output = Vec<T>>)> = fut;
- let fut: Pin<&'static mut (dyn Future<Output = Vec<T>> + 'static)> =
+ let fut: Pin<&'static mut (dyn Future<Output = Vec<T>> + Send + 'static)> =
unsafe { mem::transmute(fut) };
// The thread that calls scope() will participate in driving tasks in the pool forward
// until the tasks that are spawned by this scope() call complete. (If the caller of scope()
// happens to be a thread in this thread pool, and we only have one thread in the pool, then
// simply calling future::block_on(spawned) would deadlock.)
- let mut spawned = local_executor.spawn(fut);
- loop {
- if let Some(result) = future::block_on(future::poll_once(&mut spawned)) {
- break result;
- };
-
- self.executor.try_tick();
- local_executor.try_tick();
- }
+ let spawned = executor.spawn(fut);
+ future::block_on(spawned)
}
})
}
And performance for default plugins does improve, though not as much as I hoped: from ~130% of CPU to ~75%. Good news is that I will try to find the other offenders considering how much CPU time an empty project is still using. Note: I am not sure of the significance of changing |
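The diff above replaces a loop that repeatedly polls for completion with a single blocking wait. As a rough analogy only, the sketch below contrasts the two waiting styles using standard-library channels. It does not use the `bevy_tasks` executor or the `futures` calls from the diff, and `spawn_worker`, `wait_by_polling`, and `wait_by_blocking` are hypothetical helpers invented for illustration.

```rust
use std::sync::mpsc::{self, Receiver, TryRecvError};
use std::thread;
use std::time::Duration;

/// Spawns a worker thread that produces `value` after `delay`.
fn spawn_worker(value: u32, delay: Duration) -> Receiver<u32> {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        thread::sleep(delay);
        // Ignore the error if the receiver has already been dropped.
        let _ = tx.send(value);
    });
    rx
}

/// Busy-polls the channel. Returns the value and how many polls came back
/// empty; every empty poll is CPU time spent doing no useful work.
fn wait_by_polling(rx: &Receiver<u32>) -> (u32, u64) {
    let mut empty_polls = 0u64;
    loop {
        match rx.try_recv() {
            Ok(value) => return (value, empty_polls),
            Err(TryRecvError::Empty) => empty_polls += 1,
            Err(TryRecvError::Disconnected) => panic!("worker exited without sending"),
        }
    }
}

/// Blocks on the channel: the thread is parked until the value arrives.
fn wait_by_blocking(rx: &Receiver<u32>) -> u32 {
    rx.recv().expect("worker exited without sending")
}
```

Both functions return the same value; the difference is that the polling version keeps a core busy for the whole delay, which is the kind of baseline load this thread is about. Note that the original loop in the diff also ticks executors on each pass, so it is not a pure spin; the analogy only covers the waiting behavior.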
I'm still not fully convinced that "combined os-reported cpu usage percentage" is by itself a good metric to optimize for. Ex: if we do multi-threading each frame, work will be spread out across all cores. That means each frame we're using 800% cpu on an 8 core machine when all cores are being utilized (vs 100% cpu if we only single thread). Depending on how the os samples and aggregates these numbers, we might have a situation where multithreading is faster / less power hungry / consumes fewer resources overall across a given slice of time, but the opposite is reported. Before making design decisions based on these numbers, we might want to determine how these metrics are calculated and whether or not they are a good measure for realtime multithreaded apps. |
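The arithmetic behind that concern can be sketched as follows. This is a simplified model under stated assumptions: the per-core busy fractions are hypothetical sample values, and the function names are invented here. Real process monitors differ in how they sample and whether they sum or normalize across cores.

```rust
/// Sums per-core busy fractions (0.0..=1.0) into a single percentage.
/// With this convention a process can report more than 100%.
fn summed_cpu_percent(per_core_busy: &[f64]) -> f64 {
    per_core_busy.iter().map(|busy| busy * 100.0).sum()
}

/// Same samples, divided by the core count, so the result stays in 0..=100.
fn normalized_cpu_percent(per_core_busy: &[f64]) -> f64 {
    if per_core_busy.is_empty() {
        return 0.0;
    }
    summed_cpu_percent(per_core_busy) / per_core_busy.len() as f64
}

/// Total CPU time consumed in one frame: cores used times busy time per core.
fn cpu_time_ms(cores_used: u32, busy_ms_per_core: f64) -> f64 {
    f64::from(cores_used) * busy_ms_per_core
}
```

For eight fully busy cores, `summed_cpu_percent(&[1.0; 8])` gives 800 while `normalized_cpu_percent(&[1.0; 8])` gives 100. Likewise, `cpu_time_ms(8, 1.0)` and `cpu_time_ms(1, 8.0)` both come to 8 ms of CPU time, even though the multithreaded frame finishes in an eighth of the wall time, so a momentary usage percentage alone does not show which approach is cheaper overall.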
briefly puts on world-weary project owner hat |
How about CPU temps? 😝 I am definitely not getting 60 fps on a high end macbook pro with really simple bevy projects so something is wrong. |
What is "really simple" in this context? Can you share a link or describe your use case? There are a number of known performance limitations that wouldn't surface in "empty projects", which is what this issue is specifically about. |
That is fair, it's likely my performance issues are more related to other problems. I am just rendering text and UI quads. Does that mean that empty projects get something into a pathological state and real projects are taxing enough that this performance problem doesn't arise? |
I agree that using CPU percentage for measuring performance of big multithreaded apps is not very serious. But an empty project with just the default plugins shouldn't make a laptop fan spin at full speed. If I write minimal C code to show a window and call Please don't view this as a rant, I'm just trying to understand what is going on. |
Also it's worth noting that the numbers for |
There's a lot of related questions here, like how much text, how many UI elements, and how deeply nested the UI is (the "flexbox" impl we currently use for layout gets really slow with deeply nested elements). But long story short: there is plenty of room for improvement in these areas (and we do plan on improving them asap). That is also tangential to the current conversation. Fps isn't the problem we're having with the "empty project"; overly engaging the CPU is (at least ... I'm relatively certain). @nehalem501 I'm assuming if you disable vsync, you still get high framerates (in the thousands)?
Yup fans engaging on an empty app is a real indicator of problems (and is a problem in its own regard). Can't chalk that up to "measurement methodology" 😄
Thanks for pointing that out. |
I’ve done some additional testing with the default plugins empty app. Disabling vsync on a release build gives me around 130-150 fps, with 70-80% CPU usage. I didn't expect huge numbers, but this seems low even for an integrated GPU. Next, I’ve tried something different, with just the minimal plugins, so no window, no rendering, …, just the following code:
use bevy::prelude::*;
fn main() {
    App::build()
        .add_plugins(MinimalPlugins)
        .run();
}
This gives me around 130% CPU usage on macOS. I understand I’ve tried the same code on Linux as well, this is the result I have on a Linux machine (same results as on macOS):
|
@cart Just a quick update re Windows: I switched to the master branch instead of the 0.4 release and tested a debug build with the diagnostic plugins:
use bevy::diagnostic::{FrameTimeDiagnosticsPlugin, LogDiagnosticsPlugin};
use bevy::prelude::*;
fn main() {
    App::build()
        .add_plugins(DefaultPlugins)
        .add_plugin(FrameTimeDiagnosticsPlugin::default())
        .add_plugin(LogDiagnosticsPlugin::default())
        .run();
}
The frame rate seems to be capped to the monitor's native 144Hz refresh rate:
...but the high resource usage issue seems to persist: 25-30% on both the Intel i7 8750H CPU and the Nvidia RTX 2060 mobile GPU. The release build seems to reduce the CPU usage to about 10-15%, but the GPU usage remains the same, which seems like a lot for an empty 1920x1080 window. |
@malj Could you try running this code on Windows, to see if you also have high CPU usage without the GPU involved?
use bevy::prelude::*;
fn main() {
    App::build()
        .add_plugins(MinimalPlugins)
        .run();
}
I haven't looked at the GPU usage, but if you have the same problem as me on the |
@nehalem501 Sure, the CPU usage is about the same with minimal plugins: Edit: curiously, in this case the CPU usage remains the same even in the release build. |
I ran into this issue too when running examples on macOS with the latest commit 3e285d5 |
I have the same issue, bevy 0.5 and macOS 11.2.2. Instruments Time Profiler looks the same. Using |
I have also seen some high CPU usage on Windows using the many_sprite example (with vsync disabled, since I only have a 60Hz monitor), reaching ~120 FPS. Pulling a CPU profile, I see that it spends a lot of time clearing shaders: The example has been built with this config:
Resulting in this output |
I consider that to be a separate issue. The many sprites example is intended to be stressful (and we are already in the process of optimizing it). This thread is about "baseline" CPU usage when nothing is happening. |
Is there a solution for this problem? The simple breakout example already needs up to 20% CPU on my PC in release mode, and a 3D game I'm developing needs about 80% CPU for drawing 10 boxes in 3D mode with one light. This is unusable for bigger games. |
I generated a flamegraph and a lot of time was spent in the systems loop. So I thought a short timeout could improve it. I added this line in stage.rs in the bevy_ecs module:
at the start of the |
Which system takes a lot of time? Pretty much everything is done using systems, including rendering. Of course sleeping inside a system will drop the framerate. It makes it take longer before the render system gets a chance to run. |
I exported it to webassembly as well: |
Just wanted to say that I'm having the same issue (~100% CPU usage with the empty app) on macOS. |
Same here, 150%+ and heating up. |
Sorry to throw another anecdote onto the fire, but I ran the |
Just ran the breakout app from the main branch on my MacBook Pro (Retina, 15-inch, Mid 2015) running Monterey. I noticed the fans were going crazy and checked the Activity Monitor. CPU was stuck around 185-200%. Did not expect this! |
I checked that locally: ~30% utilization on release vs 200% on debug. With 100k cubes spawned I had ~100fps on release vs 0.25fps on debug. For me it's all fine on release, but I guess there is room for improvement on debug. |
Enabling high optimizations for dependencies in debug mode lowered the CPU usage substantially for me. |
Closed by #3974. |
Bevy version
Tested on both 0.4 and recent master a895256.
Operating system & version
macOS 10.14 x86_64 (MacBook Pro 2014, 2c/4t 2.8GHz Intel Core i5).
Latest nightly rustc, with the default Rust linker (I haven't installed lld).
What you did
I tested the following code on both versions mentioned above, in debug and release modes.
main.rs:
Cargo.toml:
What you expected to happen
Have the app window show up with low CPU usage.
What actually happened
I have 100% CPU usage with this code when compiled in debug mode, and around 50% CPU usage in release mode. This seems very excessive for drawing an empty window and displaying a few log lines in a terminal.
Am I doing something wrong?
Additional information
These are the logs in debug mode, vsync seems to work:
And these the logs in release mode:
Screenshot of System Monitor with the debug version:
Screenshot of System Monitor with the release version: