Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Systems should be skipped if their resources cannot be fetched #15265

Closed
alice-i-cecile opened this issue Sep 17, 2024 · 26 comments · Fixed by #15276
Closed

Systems should be skipped if their resources cannot be fetched #15265

alice-i-cecile opened this issue Sep 17, 2024 · 26 comments · Fixed by #15276
Labels
A-ECS Entities, components, systems, and events C-Usability A targeted quality-of-life change that makes Bevy easier to use S-Needs-Design This issue requires design work to think about how it would best be accomplished X-Controversial There is active debate or serious implications around merging this PR

Comments

@alice-i-cecile
Copy link
Member

          > The solution to this I would prefer is to lean into the ECS paradigm more & treat more things as if they were querying the world. If a Res<Foo> doesn't exist the system just doesn't run. Similar to how iterating over a Query<Bar> that matches nothing loops over nothing. If users want a more explicit panicky behavior they can use Option<Res<Foo>> & unwrap.

I do think we should do this, and if we can get this behavior in place I'd prefer to encourage the use of the Single system param from #15264. This would go a long way to addressing the problem and ensuring non-panicking behavior by default without an ergonomics hit.

Originally posted by @iiYese and @alice-i-cecile in #14275 (comment)

@alice-i-cecile alice-i-cecile added A-ECS Entities, components, systems, and events C-Usability A targeted quality-of-life change that makes Bevy easier to use X-Contentious There are nontrivial implications that should be thought through labels Sep 17, 2024
@alice-i-cecile
Copy link
Member Author

I really quite like this idea: I think it's generally a pretty robust solution. The design challenge here is:

a) how do we provide warnings to users that their system isn't running because they forgot to initialize a resource or whatever
b) how do we easily silence that warning for cases where this behavior is expected

@alice-i-cecile alice-i-cecile added S-Needs-Design This issue requires design work to think about how it would best be accomplished X-Controversial There is active debate or serious implications around merging this PR and removed X-Contentious There are nontrivial implications that should be thought through labels Sep 17, 2024
@Rigidity
Copy link

Rigidity commented Sep 17, 2024

+1 for this, this has bitten me before

I guess a warning could tell you to explicitly use Option or Single if it's missing?

@alice-i-cecile
Copy link
Member Author

Discord notes from a conversation on 2024-09-17.

SystemParam::get_param can mutate SystemParam::State, so it'd need to be split into try_get (immutable state) and apply_get (mutable, post access)

Semantically there's not much difference between SystemParam and WorldQuery, they both grab pointers from the world, cache state etc. IMO reaching convergence there so queries and systems are one and the same would be nice, but is so much code churn that I don't know if it's realistic at this point

@james-j-obrien

@jnhyatt
Copy link
Contributor

jnhyatt commented Sep 17, 2024

What about logging a warning on first run (or not-run?) that says more or less "system did not run because XXX resource was not present in the world. To suppress this warning, either insert the resource or add .run_if(resource_exists::<XXX>) to the system"

@alice-i-cecile
Copy link
Member Author

Yeah, I'm happy with that as an initial solution.

@iiYese
Copy link
Contributor

iiYese commented Sep 17, 2024

I'd prefer for this to be an expected default that designs take into consideration cause it's more consistent with the behaviors of ECS. A diagnostic warning about missing resources is mostly going to be for transitioning from the old behavior.

Is there a mechanism to have custom lints? If there is we could have one for all instances of Res & ResMut in 0.16 that people can suppress with #![allow(..)] that just reads:

"Systems will not run when resources can't be fetched from 0.16 onward. Use Option<Res<_>> or Option<ResMut<_>> to change this".

@MiniaczQ
Copy link
Contributor

MiniaczQ commented Sep 17, 2024

Another implication of this is that System::run_unsafe -> Self::Out can sometimes not run, so we need to wrap the return value in option, probably rename the function to try_run_unsafe as well.
Returning None will invalidate all chaining/piping/etc.

Unless we decide to do param validation before run_unsafe

@iiYese
Copy link
Contributor

iiYese commented Sep 17, 2024

We could also add an associated const on the resource trait

pub trait Resource: Send + Sync + 'static {
    const NOISY: bool = false;
}

With a helper attribute for the derive that sets this to true. When checking if a system runs where there is a missing resource that has this set to true it will cause a panic instead of the system not running.

@MiniaczQ
Copy link
Contributor

MiniaczQ commented Sep 17, 2024

Should SystemParam::validate_param run before or inside System::run_unsafe?
The latter requires us to return Option<System::Out> instead of System::Out.

I'd rather it be before instead of inside to avoid unnecessary wrapping in the cases where the validation can (potentially) be elided.

From @james7132 in discord

@alice-i-cecile
Copy link
Member Author

alice-i-cecile commented Sep 17, 2024

I agree with @james7132 there: validation should be distinct from running the system, and be done by schedulers.

@iiYese for custom lints, we'd need https://github.com/theBevyFlock/bevy_cli. I think I like the associated constant idea better though, although we should have various log levels (including panic), rather than just a bool. Defaulting to error feels correct though.

@alice-i-cecile
Copy link
Member Author

@MiniaczQ points out that run conditions are systems too. If a run condition cannot be evaluated, it should be treated as if it returned false.

@alice-i-cecile
Copy link
Member Author

@hymm points out that we should perform the validation checks as soon as possible, before spawning a task for the scheduler.

For piped systems, al params must be checked at the start, and the whole piped collection of systems must be skipped if any are invalid. This is because piped systems are opaque to the schedule.

@MiniaczQ
Copy link
Contributor

MiniaczQ commented Sep 17, 2024

For warnings, should we add a system config flag that decides whether to emit a warning?
Some systems will use this as run conditions, but in others this can be an accidental decision.

By default it would be true.
But then disabling it is a massive pain.

@alice-i-cecile
Copy link
Member Author

Yeah, I was thinking about that in terms of system config initially, but I wasn't happy with the disabling ergonomics. I don't mind it existing, but in most cases I think that adding this at the resource / component level will better match the user intent and result in fewer bugs and boilerplate.

@MiniaczQ
Copy link
Contributor

MiniaczQ commented Sep 17, 2024

Can we perhaps leverage module-filtering for this?
People could then selectively filter modules where they resolved this and no longer receive warnings.
This would play along with plugins too, since they're essentially modules.

Actually no, that doesn't make sense.
There is a solution to be made with disabling this per plugin, but it's a bit more work.

@iiYese
Copy link
Contributor

iiYese commented Sep 17, 2024

although we should have various log levels (including panic), rather than just a bool. Defaulting to error feels correct though.

I'm not sure about this cause for anything other than panic you'll get a wall of errors like WGPU which isn't much more helpful than a panic. Errors by default means you'll get a lot of opting out (which is indicative of a bad default). I think some people might be worried but would be surprised to find how sane this behavior actually is.

@alice-i-cecile
Copy link
Member Author

Hmm. Panic might be a better default, yeah. I worry about adding more causes of "what the heck why isn't my system working": forgetting to add the system and forgetting to add the plugin are both super frustrating currently.

@iiYese
Copy link
Contributor

iiYese commented Sep 17, 2024

Actually error would be fine if we add a bool to Schedule or something to track if first time diagnostics are emitted. It's a bool that would be changed at most once so it's easy for branch predictors to optimize out.

@urben1680
Copy link
Contributor

We could also add an associated const on the resource trait

pub trait Resource: Send + Sync + 'static {
    const NOISY: bool = false;
}

If this is considered, I'd prefer a const generic bool SKIP_SYSTEM_IF_MISSING on Res/ResMut. Though this will probably conflict with Option<Res> if the default is true as that makes no sense in that case.

@alice-i-cecile
Copy link
Member Author

@iiYese we already have a warn_once family of macros for exactly this purpose.

@alice-i-cecile
Copy link
Member Author

Adding an associated type or constant on Resource will break trait objects for it, so that needs to be considered carefully.

@iiYese
Copy link
Contributor

iiYese commented Sep 17, 2024

Adding an associated type or constant on Resource will break trait objects for it, so that needs to be considered carefully.

Why would people have trait objects of an empty trait tho? 🤔

@alice-i-cecile
Copy link
Member Author

Inserting arbitrary components / resources is something I've seen people want to do. That said, we already do this with the Storage associated type, and those users probably actually want to use reflection. So yeah, I think adding associated constants or types is fine.

@SpecificProtagonist
Copy link
Contributor

@iiYese we already have a warn_once family of macros for exactly this purpose.

That's once globally (per callsite), but we want to do it per system, and we want to have all warnings for the first run.

@alice-i-cecile
Copy link
Member Author

Right, we'll need to add some variant of that that checks against a key (probably the system name in this case).

@jnhyatt
Copy link
Contributor

jnhyatt commented Sep 18, 2024

Ah goodness duplicate comment, my browser had a stroke

github-merge-queue bot pushed a commit that referenced this issue Sep 23, 2024
# Objective

The goal of this PR is to introduce `SystemParam` validation in order to
reduce runtime panics.

Fixes #15265

## Solution

`SystemParam` now has a new method `validate_param(...) -> bool`, which
takes immutable variants of `get_param` arguments. The returned value
indicates whether the parameter can be acquired from the world. If
parameters cannot be acquired for a system, it won't be executed,
similarly to run conditions. This reduces panics when using params like
`Res`, `ResMut`, etc. as well as allows for new, ergonomic params like
#15264 or #15302.

Param validation happens at the level of executors. All validation
happens directly before executing a system, in case of normal systems
they are skipped, in case of conditions they return false.

Warning about system skipping is primitive and subject to change in
subsequent PRs.

## Testing

Two executor tests check that all executors:
- skip systems which have invalid parameters:
  - piped systems get skipped together,
  - dependent systems still run correctly,
- skip systems with invalid run conditions:
  - system conditions have invalid parameters,
  - system set conditions have invalid parameters.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-ECS Entities, components, systems, and events C-Usability A targeted quality-of-life change that makes Bevy easier to use S-Needs-Design This issue requires design work to think about how it would best be accomplished X-Controversial There is active debate or serious implications around merging this PR
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants