Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Experiment to mitigate StorageId union access patterns #16939

Draft
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

omaskery
Copy link
Contributor

Objective

  • In the query iteration code there are some types which rely on comments & programmer discipline to ensure correct access of fields:

    • StorageId - a union of ArchetypeId and TableId
    • QueryIterationCursor - multiple fields are only valid, and some types vary, depending on a boolean value:
      is_dense: bool,
      storage_id_iter: core::slice::Iter<'s, StorageId>,
      table_entities: &'w [Entity],
      archetype_entities: &'w [ArchetypeEntity],
    • QueryState - similar to above:
      // NOTE: we maintain both a bitset and a vec because iterating the vec is faster
      pub(super) matched_storage_ids: Vec<StorageId>,
      // Represents whether this query iteration is dense or not. When this is true
      // `matched_storage_ids` stores `TableId`s, otherwise it stores `ArchetypeId`s.
      pub(super) is_dense: bool,
  • I asked why this wasn't achieved with an enum of some kind on Discord.

  • Somebody replied saying the reasoning for this is historical (is_dense used to be derived from a const generic or similar), and my proposed enum refactoring was worth considering.

  • This PR is my initial draft of that work to get feedback on direction. Further testing may be required, and currently no documentation comments have been written!

Solution

  • The fields whose access patterns must be managed are now largely moved into enums, making it broadly impossible to access the wrong fields at the wrong times.
  • I have tried, where possible, to make the number of checks no more than before - aka where we only checked is_dense before, we now do a single match on the enum.
  • However, I have left in the concept of a StorageId type and made it a normal enum rather than union. I have then tried to only use this in places where the downsides are hopefully minimal, such as:
    • Where I think the discriminator being stored won't be significant (e.g. it's just transient as part of an iterator)
    • Where I think the cost of accessing the variant is minimal (we would've had to check is_dense anyway)
  • This is largely because I didn't want it to be too drastic a refactor in the fairly scary code around fold_over_storage_range, but I'm open to people giving advice on how to proceed there, whether I should be braver (or not).
  • I have put in the concept of accessing the ArchetypeId or TableId via debug_checked_as_x() unsafely, to still preserve the idea that when the QueryIter knows whether the iteration is dense or not, it doesn't want to pay the cost of checking that again when being asked to iterate storage by a StorageId. The old code didn't pay that cost, so I was wary of doing so. If people feel that this is overly cautious perf-wise, then I can make it safe and have it fail in some way (or whatever you suggest).

Testing

  • I've run the bevy_ecs unit tests, but don't know what else to do. Please advise! I'm also asking in #bevy_ecs on discord.
  • I haven't added any new tests as it is a refactor that should have no external effects 🤞

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant