-
I also noticed the behavior much earlier in Bevy's development and eventually found something "stable", but it was extremely sensitive to changes. I have a feeling that one of the many changes made since then regressed here. I think part of this is that micro benchmarks will always have different performance characteristics. But I also think the current bevy_hecs is probably more sensitive to changes than "normal". Interestingly, my experimental safe(-er) hecs refactor remains nice and stable when you run the "test" above. It's not ready for prime time / I'm not yet sure it's the right path to take bevy_ecs on, but I pushed the branch anyway in case you want to test. If it makes Bevy perform better in the average case, that's maybe another reason to polish it up and merge it.
-
Found some useful tips for performance-optimization work here: https://github.com/flamegraph-rs/flamegraph#systems-performance-work-guided-by-flamegraphs
-
I created a repo for Bevy benchmark games and added an asteroids-ish game. I still don't know exactly the best metrics to collect or the best tools to use to actually do the benchmarking/profiling, but I'm looking into it. Tools like Linux perf and Valgrind seem useful for this kind of stuff, but I don't know how to use them.

https://github.com/katharostech/bevy_benchmark_games

Just measuring the time it takes to run some set number of frames might be fine enough (a rough sketch of that idea is below). On Linux the benchmark game is able to use the

Anyway, I'll probably put together one more game example and then start figuring out what kind of metrics to actually collect. If anybody has profiling or benchmarking experience and could give me some pointers, that would be great. 😃
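To illustrate the "time N frames" idea, here's a minimal sketch. It isn't code from bevy_benchmark_games; `run_one_frame` is a placeholder for whatever advances the game by one frame (e.g. ticking a headless app):

```rust
// Minimal sketch of "time how long N frames take" -- not from
// bevy_benchmark_games. `run_one_frame` is a stand-in for whatever advances
// the game by one frame (e.g. one headless app update).
use std::time::Instant;

fn run_one_frame() {
    // placeholder: advance the game simulation by one frame
}

fn main() {
    const FRAMES: u32 = 1_000;
    let start = Instant::now();
    for _ in 0..FRAMES {
        run_one_frame();
    }
    let elapsed = start.elapsed();
    println!(
        "{} frames in {:?} ({:.3} ms/frame average)",
        FRAMES,
        elapsed,
        elapsed.as_secs_f64() * 1000.0 / FRAMES as f64
    );
}
```

A single wall-clock number like this is coarse, but it at least measures a whole game loop instead of one isolated system.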
-
Opened a Rust forum topic to see if we can get any pointers from people with benchmarking tips or experience:
-
@cart I started checking out your branch.

Also, it'd be good for me to get more familiar with the ECS internals anyway, so if you wanted I could start cleaning up that code and trying to get it merge-ready. I'm not sure how far it is from the design you were going for, but if the design is essentially there, I could probably fix the remaining failing tests (I think there's one failing, plus a new one I wrote that's failing for SOA) and otherwise clean up the old comments and such.

Edit: actually, if we wanted to benchmark it, I forgot that I can leave the renderer out and run it headless, like I was already doing anyway; I just have to disable the feature.
-
So I was just working on the Bevy ECS trying to prepare it for scripting, and I had been running the ecs_bench to make sure I didn't introduce performance regressions while modifying one of the really "hot" portions of the code. But then I ran into results that are really making me doubt the usefulness of these micro benchmarks.
The Experiment
Take this simple experiment for example:
To highlight the numbers we'll be looking at:
937.92 ns
942.81 ns
OK, now that we've run the benchmark once, edit the `benches/pos_vel/bevy.rs` file and comment out the `bevy_foreach` benchmarks so that the `bench()` function body no longer registers them.
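Concretely, something along these lines. This is just a from-memory sketch of a Criterion-style bench module, not the actual contents of `benches/pos_vel/bevy.rs`; the benchmark names and the closure bodies are placeholders:

```rust
// Sketch only -- not the real benches/pos_vel/bevy.rs. Assumes a
// Criterion-style harness where bench() registers each case; the closure
// bodies (world setup, running the system) are elided. The point is just
// that the bevy_foreach cases are commented out and nothing else changes.
use criterion::{criterion_group, criterion_main, Criterion};

fn bench(c: &mut Criterion) {
    c.bench_function("pos_vel_build/bevy", |b| {
        b.iter(|| { /* build a world and spawn the pos/vel entities */ })
    });
    c.bench_function("pos_vel_update/bevy", |b| {
        b.iter(|| { /* iterate (Position, Velocity) with a query */ })
    });
    // Commented out for the experiment:
    // c.bench_function("pos_vel_foreach/bevy", |b| {
    //     b.iter(|| { /* iterate (Position, Velocity) with for_each */ })
    // });
}

criterion_group!(benches, bench);
criterion_main!(benches);
```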
Now run the bench again and check our iteration times:
1.8639 us (+97.307%)
1.9874 us (+114.15%)

So we did not change any Bevy code. All we did was comment out some of the benchmarks, and it made the remaining ones ~100% slower!
I could be missing something, and maybe this is relatively normal and the benchmarks are still useful as long as the only variable is Bevy's own code, but even in that situation I ran into some really strange performance effects. Again, maybe that is still normal and that code is just sensitive to changes, but the experiment above makes me a lot less sure how useful it is to guide optimization with our current benchmarks, at least.
How Can We Help This?
So I wanted to start this discussion to work out how we might create better benchmarks or profiling solutions. I'm open to anything. I've thought about creating real games as benchmarks and then measuring frames per second. Or maybe an intrusive profiler is more useful (a rough sketch of what I mean is below), but profilers have overhead, so I'm not sure. And maybe we keep the current benchmarks and just need additional strategies for measuring performance. I don't know, but I think it's important to realize that micro-benchmarking probably isn't going to give us a true impression of Bevy's performance, and we're going to want to start figuring out how to track Bevy's performance more reliably.
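For what I mean by an "intrusive profiler", here's a minimal sketch of the idea: a scoped timer that records how long a labelled region takes. None of this exists in Bevy today; the type name and the println-based reporting are placeholders, and the overhead/aggregation questions are exactly the part I'm unsure about:

```rust
// Minimal sketch of an "intrusive profiler": a scope timer that prints how
// long a labelled region took when it goes out of scope. Illustrative only --
// not an existing Bevy API; real use would need cheap aggregation instead of
// printing, which is where the overhead concern comes in.
use std::time::Instant;

struct ScopeTimer {
    label: &'static str,
    start: Instant,
}

impl ScopeTimer {
    fn new(label: &'static str) -> Self {
        Self { label, start: Instant::now() }
    }
}

impl Drop for ScopeTimer {
    fn drop(&mut self) {
        println!("{}: {:?}", self.label, self.start.elapsed());
    }
}

fn main() {
    let _timer = ScopeTimer::new("physics_system");
    // ... the work being measured would run here ...
}
```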
What do you think?