This repository has been archived by the owner on Nov 27, 2022. It is now read-only.

Added planck_ecs. Added violin graphs. #18

Merged
14 commits merged into rust-gamedev:master from the planck branch on Apr 11, 2021

Conversation

AnneKitsune
Contributor

This pull request adds violin graphs to the readme, so the results table no longer has to be edited by hand.
Further, it adds planck_ecs to the benchmarked ECS frameworks.
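
For reference, the violin SVGs come from criterion's benchmark-group reports: every framework registers its implementation under the same group, and criterion emits a comparison violin plot at `target/criterion/<group>/report/violin.svg`. A minimal sketch of that shape (placeholder closure bodies, not the suite's actual code):

```rust
use criterion::{criterion_group, criterion_main, Criterion};

// One group per test; criterion's report for the group includes violin.svg,
// comparing every framework registered below.
fn add_remove_component(c: &mut Criterion) {
    let mut group = c.benchmark_group("add_remove_component");
    group.bench_function("planck_ecs", |b| b.iter(|| { /* add/remove a component */ }));
    group.bench_function("legion", |b| b.iter(|| { /* add/remove a component */ }));
    group.finish();
}

criterion_group!(benches, add_remove_component);
criterion_main!(benches);
```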

@Systemcluster

The graphs are unreadable in GitHub dark mode. Could you add a background instead of leaving it transparent?

(screenshot: transparent violin graph rendered against GitHub's dark background)

@AnneKitsune
Contributor Author

I'm not totally sure if criterion has a config for this, but if I manage to find it I will!

@Systemcluster

I created an issue here: bheisler/criterion.rs#469

I don't think that should stop this PR from being merged, as that can always be changed later.

@cart
Collaborator

cart commented Mar 25, 2021

I'm in favor of this change. The graphs help visualize the differences and automation is a huge plus. The biggest downside is the loss of "additional metadata" about "specialized" benchmarks. I think that sort of thing needs to be captured to make the comparisons meaningful and honest. Imo we should sort out a way to encode this information before removing the current system (which is already not good enough).

Ex:

  • Many frameworks/APIs/tests support batched operations. Having both batched and unbatched tests is important because batching can't always be used. It doesn't make sense to compare a batched API to an unbatched API (without a big asterisk), as they can't always be used interchangeably (see the sketch after this list).
  • Many frameworks have some sort of opt-in "optimized storage". Legion and Shipyard have "packs". Bevy will soon have both sparse and table storage. If we just pick the best storage for each test (without reporting it), that isn't honest.
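
To make the batched/unbatched distinction concrete, here's a rough sketch using legion's `push`/`extend` APIs (the component types are made up for illustration):

```rust
use legion::World;

#[derive(Clone, Copy)]
struct Position(f32);
#[derive(Clone, Copy)]
struct Velocity(f32);

fn main() {
    let mut world = World::default();

    // Unbatched: one insertion (and archetype lookup) per entity.
    for _ in 0..10_000 {
        world.push((Position(0.0), Velocity(1.0)));
    }

    // Batched: a single `extend` call inserts all the entities at once,
    // amortizing the per-entity bookkeeping.
    world.extend(vec![(Position(0.0), Velocity(1.0)); 10_000]);
}
```

Comparing the second form against another framework's per-entity loop is exactly the asterisk-worthy situation described above.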

@cart
Collaborator

cart commented Mar 25, 2021

Eh, I should have cloned the repo and looked at all of the graphs first. The "specialized" tests are displayed in the graphs.

I guess that's good enough for now. Imo we should include more variants, and documentation for those variants, for each test, but that's not something to block this PR on 😄

readme.md Outdated
(*): The values in parentheses are results where per-benchmark storage optimizations were applied. Some of these are mutually exclusive, so results with and without "packing" typically represent the best- and worst-case performance for the ECS.

The best result for each benchmark is marked in bold text. Note that run to run variance for these benchmarks is typically 2-3%, with outliers as much as 10%. All micro-benchmarks should be taken with a grain of salt, and any benchmarks within a few percent of each other should be considered "effectively equal".
![](./target/criterion/add_remove_component/report/violin.svg)
@cart (Collaborator) commented Mar 25, 2021

Maybe we should move the results into their respective benchmark sections (underneath the description)?

@AnneKitsune (Contributor Author)
Fixed! Thanks for the review! ♥️

@AnneKitsune
Contributor Author

I also agree that some benchmarks are very misleading. Things like legion's batched entity creation via world.extend allow for massive performance gains compared to other ECS frameworks that can't, or choose not to, support batched methods.

@cart
Collaborator

cart commented Mar 26, 2021

This is a big enough change to the "result reporting" that we should get at least one more approval before merging. @TomGillen / @Ralith, can one or both of you take a look?

@Ralith
Collaborator

Ralith commented Mar 27, 2021

I'll try to have a look at this tomorrow; thanks for your work so far!

@Ralith (Collaborator) left a comment

I'm a big fan overall, though (as a non-blocking nit) I'd prefer that modifying the readme and adding a new ECS either be separate PRs, or that the commits in this PR be squashed so that each change sits in a single commit suitable for rebasing onto master. I also wish we could adjust the aspect ratio of criterion's output a bit so the text didn't end up so tiny. Still, it's far more readable than a text chart.

Happy to merge once the missing plots are added, regardless. Reducing the friction for updating is awesome.

@Ralith
Collaborator

Ralith commented Mar 28, 2021

> I also agree that some benchmarks are very misleading.

For the sake of discussion, I (from an obviously biased perspective) kinda disagree with this. The README gives a clear specification of the benchmarks, and it's one where batched operations are applicable. If an ECS chooses not to optimize for that case, that's fine, but it's still an important, realistic case, at least as far as anyone's taken it so far; nobody should be getting a free pass for sacrificing it. If unbatchable workloads are underrepresented, that's a different problem.

@cart
Collaborator

cart commented Apr 1, 2021

> For the sake of discussion, I (from an obviously biased perspective) kinda disagree with this. The README gives a clear specification of the benchmarks, and it's one where batched operations are applicable. If an ECS chooses not to optimize for that case, that's fine, but it's still an important, realistic case, at least as far as anyone's taken it so far; nobody should be getting a free pass for sacrificing it. If unbatchable workloads are underrepresented, that's a different problem.

Imo the biggest problem is that I could easily implement and use "optimized" storages/algorithms for almost any benchmark, but the code could end up not looking like "idiomatic" bevy/legion/hecs/shipyard/etc code. Ex: the legion "simple insert" benchmark uses SoA insertion, but you won't find that in their readme/examples. SoA requires the user to "decouple" each entity's components across component-specific arrays, which I wouldn't recommend to anyone (and which I would prefer not to implement in Bevy at all). But now I feel "pressured" to implement it, as otherwise I can't "compete".
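
Roughly, the difference looks like this (a sketch with made-up component types, assuming legion's `IntoSoa` path is what the benchmark uses):

```rust
use legion::storage::IntoSoa;
use legion::World;

#[derive(Clone, Copy)]
struct Position(f32);
#[derive(Clone, Copy)]
struct Velocity(f32);

fn main() {
    let mut world = World::default();

    // AoS: one tuple per entity, the layout most user code naturally has.
    world.extend(vec![
        (Position(0.0), Velocity(1.0)),
        (Position(1.0), Velocity(2.0)),
    ]);

    // SoA: the caller must pre-split the data into one Vec per component
    // before handing it to the ECS.
    world.extend(
        (
            vec![Position(0.0), Position(1.0)],
            vec![Velocity(1.0), Velocity(2.0)],
        )
            .into_soa(),
    );
}
```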

If Bevy starts using "sparse set" components in its add/remove benchmark (which isn't what I would recommend by default for Bevy users), then the only way legion can compete is to implement sparse set storage (which they might not want to do for design reasons).

I consider those two cases to be almost identical. The behavior they encourage for ECS authors and the picture the results paint for users feel "wrong" to me.

I don't think ecs_bench_suite should encourage ECS authors to "play the benchmark game" with hyper-optimized per-benchmark results. I think it should encourage "real world" comparisons of ECS frameworks with "idiomatic" api usage. Anything outside of those bounds (or anything that might cause users to draw the wrong conclusions, such as comparing batched/unbatched benchmarks) should be clearly indicated.

Ex: I think we should have separate batched/unbatched insertion benchmarks. That would clearly indicate which frameworks support batched insertion. Ideally we'd also have AoS and SoA labels. I also think Bevy shouldn't be allowed to silently use sparse-set storage for adds/removes. Instead we should provide both results (and probably indicate that table storage is the default / recommended storage type).
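
For illustration, the sparse-set opt-in in recent bevy_ecs releases looks roughly like this (a sketch with made-up component types, not the benchmark code):

```rust
use bevy_ecs::prelude::*;

// Table storage is the default, so it needs no annotation.
#[derive(Component)]
struct Position(f32);

// Sparse-set storage is an explicit, per-component opt-in, which is why it
// should be reported in the results rather than used silently.
#[derive(Component)]
#[component(storage = "SparseSet")]
struct Marker;

fn main() {
    let mut world = World::new();
    let entity = world.spawn((Position(0.0), Marker)).id();
    // Adding/removing a sparse-set component avoids moving the entity
    // between archetype tables.
    world.entity_mut(entity).remove::<Marker>();
    world.entity_mut(entity).insert(Marker);
}
```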

@Ralith
Collaborator

Ralith commented Apr 1, 2021

Oh, I didn't realize legion was getting columnar data from the user in that case; that's more egregious than batching row-oriented operations, I agree. Hecs has such an API as well, but I don't consider it appropriate for that benchmark.

I absolutely agree that we should focus on benchmarking realistic patterns rather than gamesmanship, and it looks like we've already slipped into that more than I realized.

@AnneKitsune
Contributor Author

@Ralith the two missing plots have been added.

@Ralith (Collaborator) left a comment

Serialization embeds seem to be 404ing?

@AnneKitsune
Contributor Author

Oh, it doesn't generate violin graphs for those, since only legion implemented the serialization benchmarks. Whoops!
I guess benchmarks that only one ECS implements don't mean much. What should we do?

@Ralith
Collaborator

Ralith commented Apr 9, 2021

I went ahead and merged the hecs support for that benchmark; merge master and it should be good to go.

@AnneKitsune
Contributor Author

Completed 👍

@Ralith
Collaborator

Ralith commented Apr 11, 2021

Great, thanks! This is a big improvement.

@Ralith merged commit dd9e00b into rust-gamedev:master on Apr 11, 2021
@AnneKitsune deleted the planck branch on April 12, 2021