Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Commands should be infallible whenever possible #10166

Open
alice-i-cecile opened this issue Oct 17, 2023 · 15 comments · May be fixed by #15929
Open

Commands should be infallible whenever possible #10166

alice-i-cecile opened this issue Oct 17, 2023 · 15 comments · May be fixed by #15929
Labels
A-ECS Entities, components, systems, and events C-Usability A targeted quality-of-life change that makes Bevy easier to use S-Ready-For-Implementation This issue is ready for an implementation PR. Go for it! X-Controversial There is active debate or serious implications around merging this PR

Comments

@alice-i-cecile
Copy link
Member

alice-i-cecile commented Oct 17, 2023

Commands in general shouldn't even have errors, and all our built-in commands should be infallible.

The panic that happens when an insert or remove comes after a despawn is something we just arbitrarily do. They should just be ignored (with a debug! warning). It isn't a bug if two different systems from two different plugins want to despawn an entity at the same time for different reasons, or if a plugin system wants to despawn an entity at the same time user code wants to give it a component, assuming it'll still exist. Batching does not add anything new to this equation.

All we need to do is make a PR that (1) changes the panic to a debug! warning and (2) privates all fields of [Insert][1], [Remove][2], and [Despawn][3] so that users can only make these commands through EntityCommands. You can't get an EntityCommands for an entity [unless that entity is alive][4] (or will be), so all commands created through EntityCommands must be considered valid.

Since commands modify the shared state of the world, I don't think they could be reordered safely, unless we know ahead of time the set of all commands and all of their interactions, which is impossible with custom commands and will make adding new commands very difficult and fragile.

The reality is that custom commands should never make implicit assumptions about which entities exist and what components they may or may not have. i.e. If you pushed <custom command 1> assuming that Mass won't have been added to e2 yet, that isn't safe to assume. I don't think we explicitly encouraged assuming things before, but it'd be actively discouraged after this.

Originally posted by @maniwani in #10154 (comment).

Having seen the problems that can arise, I think I generally agree. It's very hard to get a usable error handling solution (#2004, #2241) without severely compromising performance. And in any case, panicking is a very bad default, both for beginners (frustration), prototyping (wasted time) and production games (crashes are very bad).

Steps to fix

  • remove try_insert method
  • swap all panics to a debug warning in commands
  • make fields private
  • get SME-ECS to agree that we should do this
@alice-i-cecile alice-i-cecile added A-ECS Entities, components, systems, and events C-Usability A targeted quality-of-life change that makes Bevy easier to use X-Controversial There is active debate or serious implications around merging this PR labels Oct 17, 2023
@mockersf
Copy link
Member

It isn't a bug if two different systems from two different plugins want to despawn an entity at the same time for different reasons, or if a plugin system wants to despawn an entity at the same time user code wants to give it a component, assuming it'll still exist. Batching does not add anything new to this equation.

In my opinion this is not a fact but an opinion. For me, it's a bug if a system tries to act on an entity that was despawned by another. It can most of the times be ignored without consequences, but it's still a bug.

@JoJoJet
Copy link
Member

JoJoJet commented Oct 17, 2023

despawn in particular needs to be infallible. While trying to insert a component on an entity that no longer exists may be a bug in some cases, I've never heard an argument for why despawn should panic. Despawning is an idempotent operation -- if you try to despawn an entity and it's already been despawned, that's a success case. There's really no reason to panic.

@bushrat011899
Copy link
Contributor

despawn in particular needs to be infallible. While trying to insert a component on an entity that no longer exists may be a bug in some cases, I've never heard an argument for why despawn should panic. Despawning is an idempotent operation -- if you try to despawn an entity and it's already been despawned, that's a success case. There's really no reason to panic.

Could this maybe be an issue of naming more than anything else? despawn (to me) says "I want to perform the action despawn on this Entity", but if it is already despawned, then I couldn't despawn it, something else already did. Whereas, a method ensure_despawned says to me that "I want the state where this Entity is despawned". If something else despawned it first, then I can ensure it is despawned.

I'm not necessarily saying commands like despawn need to be renamed, but I do think their name plays an important role in how an end user will expect it to behave, especially in error conditions.

@mockersf
Copy link
Member

mockersf commented Oct 17, 2023

The "bug" is not necessarily in the double despawn, it's in the conditions that lead to it.

If an entity is despawned at the same time:

  • because the player killed it, which is a win condition
  • because it reaches the end of the map, which is a lose condition

in the end, the entity is despawned, and from the ECS point of view we don't care anymore about it. But there's an issue between those two systems, and we need a way to give that information back.

@maniwani
Copy link
Contributor

maniwani commented Oct 17, 2023

In my opinion this is not a fact but an opinion. For me, it's a bug if a system tries to act on an entity that was despawned by another. It can most of the times be ignored without consequences, but it's still a bug.

I'll speak objectively then.

  • We have been arbitrarily treating potential user error here as something bevy_ecs should internally panic over in all cases.
  • It's not possible to know if two systems produce commands that conflict (from your POV, not mine) until we try to apply them, and even then you won't necessarily be in a position to do anything about it (if the other system came from a plugin).
  • The choice of default behavior can only be "panic" or "don't panic".
  • If bevy_ecs does not panic over commands, that does not constitute a bug in bevy_ecs.
  • The ability to batch commands won't change any of this.

Is there an alternative "don't panic" that I'm not seeing? Many users have legitimate issues with these panics, all of us have a track record of choosing to reduce noisy things (ambiguities, #9822, #10145, etc.), and if apps continue to grow in complexity and number of plugins, then IMO it's inevitable users will encounter an increasing number of "false positives" here.

@mockersf
Copy link
Member

mockersf commented Oct 17, 2023

Is there an alternative "don't panic" that I'm not seeing?

Don't panic and have a proper error management. I'm ok with not panicking if we have a way to handle this from inside the app, I would actually prefer that to panics.

Panics are used as the "easy" error management due to the asynchronous nature of commands. There was a tentative in #2241 to add error handling.

@bushrat011899
Copy link
Contributor

bushrat011899 commented Oct 17, 2023

Don't panic and have a proper error management. I'm ok with not panicking if we have a way to handle this from inside the app.

Could we setup an Event for command errors, even if it was specific to each Command?

impl Command for Despawn {
    fn apply(self, world: &mut World) {
        if !world.despawn(self.entity) {
            world.send_event(CommandDespawnErrorEvent(self.entity));
        }
    }
}

Then a (possibly default) system could listen to those events and choose how to handle them? This would lean towards the "commands should never panic" side of this debate.

@mockersf
Copy link
Member

mockersf commented Oct 17, 2023

An event per command would be very hard to use without a way to link back to the context of the command creation, or at least the system.

I like:

  • a version of Command error handling #2241 without the performance degradation (or acceptable ones), and not panicking by default
  • an overhaul with commandless, command v2, commands as entities, or .... something new?

@bushrat011899
Copy link
Contributor

An event per command would be very hard to use without a way to link back to the context of the command creation, or at least the system.

You could link back to the system by inserting the Event handler system after the possible offending system, since runtime commands should only come from runtime systems anyway. To avoid the added overhead of many different command Event types, we could have a single type which contains a Box<dyn Error>, but I think that has worse ergonomics.

@dmyyy
Copy link
Contributor

dmyyy commented Oct 18, 2023

IIRC @HackerFoo also mentioned he did this for his personal fork of Bevy at one point

@james-j-obrien
Copy link
Contributor

How many use cases are there for needing to handle a duplicate despawn? To be clear I'm not opposed to having a more robust solution for handling fallible commands if it doesn't come with a performance regression but I think panic-ing is definitely the wrong behaviour.

I find it very easy to imagine cases where it's undesirable:
Let's say I have an entity A and an entity B each with component Lifespan. A is a parent of B and I have some system (unaware of the hierarchy) operating on Lifespan that says they should both be despawned. If I happen to submit the despawn A command first it will also despawn it's children and then when I despawn B I hit a panic even though I do want all of them to be despawned.

Similarly I could imagine an entity being despawned because it's too old and also out of bounds or some other combination of factors. You might not even see the panic the majority of the time and hit it unexpectedly when you hit the edge case.

I feel that when running systems in an ECS you are already operating on a snapshot of the world from the last time commands where applied. You might be modifying a component that an entity already has a remove command queued for (even if that is unlikely in practice). If multiple systems agree that an entity should be despawned I don't see that as an error at all.

IMO I agree with @maniwani that there should be no guarantees about the state of the world when you run a command. An entity might not have the component you are trying to remove, it might already have the one you are trying to insert. It might already be despawned when you try to despawn it. I could be convinced otherwise if I was shown how this was restricting some extremely desirable feature but I don't think that's the case at the moment. These assumptions make batching and applying commands much simpler as well as being a simpler mental model.

@MikMikus
Copy link

I have similar observations. We use bevy to visualization large chunks of data with complex multi-level parent-children structures.

In the app we have few systems for synchronizing and checking data related to an external source.
Sync systems can spawn or despawn major parts of data as "root parents" while the user can interact with children.

When the parent intends to recursively despawn, some of systems and events may still handle children entities in the current or next one tick.
In our case this is not problem, in next few ticks data will be synchronized. However, panicking during this process is a real pain.

Currently, to avoid this problem, we need to ensure that the systems are in the right order. Which still does not guarantee work without panic. It's easy to make a mistake and unnecessarily complicates the project.

github-merge-queue bot pushed a commit that referenced this issue Nov 29, 2023
# Objective

avoid panics from `calculate_bounds` systems if entities are despawned
in PostUpdate.

there's a running general discussion (#10166) about command panicking.
in the meantime we may as well fix up some cases where it's clear a
failure to insert is safe.

## Solution

change `.insert(aabb)` to `.try_insert(aabb)`
cart pushed a commit that referenced this issue Nov 30, 2023
# Objective

avoid panics from `calculate_bounds` systems if entities are despawned
in PostUpdate.

there's a running general discussion (#10166) about command panicking.
in the meantime we may as well fix up some cases where it's clear a
failure to insert is safe.

## Solution

change `.insert(aabb)` to `.try_insert(aabb)`
james7132 pushed a commit to james7132/bevy that referenced this issue Dec 1, 2023
# Objective

avoid panics from `calculate_bounds` systems if entities are despawned
in PostUpdate.

there's a running general discussion (bevyengine#10166) about command panicking.
in the meantime we may as well fix up some cases where it's clear a
failure to insert is safe.

## Solution

change `.insert(aabb)` to `.try_insert(aabb)`
@benfrankel
Copy link
Contributor

benfrankel commented Jan 4, 2024

In support of what's already been said, I think this is clearly a bevy bug. It causes sporadic hard-to-debug crashes for correct user logic like despawning an entity when its lifetime expires as well as when it exceeds some despawn radius, and having that rarely occur on the same frame.

EDIT: It seems double despawn is a warning now instead of a panic, which is better. I'm not sure which version changed it. There are examples for other combinations of commands that still panic like despawn + insert anyways.

Every game will run into this bug eventually, so every game has to reinvent the same 40-line DespawnSet resource and then never use commands.despawn. This just adds to the amount of boilerplate needed to make a bevy game.

Additionally, command error handling is something I've never felt the need for. In the vast majority of cases, I want command failure to be ignored. But even in the minority of cases where some command failure is actually causing a bug, I'd want debug logs to identify and fix the logic / scheduling bug in my code, not an error handling system to somehow "handle the command failure gracefully" on the fly.

@MatrixDev
Copy link

It might not be the most ecs-friendly idea, but can't we just have an optional callback for cases when error handling is neccesary?

let some_context = todo!()
let entity_commands = todo!()

// no panic, silent failure (warn!)
entity_commands.insert_new(Transform);

// no panic, failure is handled in the closure
entity_commands.insert_new(Transform).on_error(move |world| {
    println!("failure: {some_context}");
});

Pros:

  • no panics! (IMHO engine must not crash for such simple cases)
  • we could pass any additional context that might be needed to handle error in the closure
  • same approach can be used for all cases in question (insert, remove, despawn)

Cons:

  • we can't chain EntityCommands calls this way but this can be solved in multiple ways
  • needs allocation for a callback (most of the time callback will not be needed, so it should not be a problem)
  • &mut World access in the closure, requires error handling to be very cheap

PS: instead of the builder we could just add more arguments to insert-like functions, but than we'll need to add None in all places.

pub struct Insert<T: Bundle> {
    entity: Entity,
    bundle: T,
    on_error: Option<Box<dyn FnOnce(&mut World) + Send + 'static>>,
}

impl<T: Bundle> Command for Insert<T> {
    fn apply(self, world: &mut World) {
        if let Some(mut entity) = world.get_entity_mut(self.entity) {
            entity.insert(self.bundle);
        } else if let Some(on_error) = self.on_error {
            on_error(world);
        } else {
            warn!("Could not insert a bundle (of type `{}`) for entity {:?}", std::any::type_name::<T>(), self.entity);
        }
    }
}

pub struct InsertBuilder<'a, 'e, T: Bundle> {
    command: ManuallyDrop<Insert<T>>,
    commands: &'a mut EntityCommands<'e>,
}

impl<'a, 'e, T: Bundle> InsertBuilder<'a, 'e, T> {
    pub fn on_error(mut self, f: impl FnOnce(&mut World) + Send + 'static) -> Self {
        self.command.on_error = Some(Box::new(f));
        self
    }
}

impl<'a, 'e, T: Bundle> Drop for InsertBuilder<'a, 'e, T> {
    fn drop(&mut self) {
        // SAFETY: we always take command only during drop
        let command = unsafe { ManuallyDrop::take(&mut self.command) };
        self.commands.add(command);
    }
}

impl<'e> EntityCommands<'e> {
    pub fn insert_new<'a, T: Bundle>(&'a mut self, bundle: T) -> InsertBuilder<'a, 'e, T> {
        let command = Insert {
            bundle,
            entity: self.entity,
            on_error: None,
        };
        InsertBuilder {
            commands: self,
            command: ManuallyDrop::new(command),
        }
    }
}

@PROMETHIA-27
Copy link
Contributor

PROMETHIA-27 commented Nov 29, 2024

Just wanted to add that I also support making all commands infallible. I'm already having to roll try_ variants for every command i want to use because in my case there's not a good solution to prevent erroneous commands other than this. In particular it's UI code that looks roughly like the following: (pseudocode)

for updated_entity in updates {
    // some descendants may be in the updates query
    updated_entity.despawn_descendants();

    // may spawn new children and/or insert components
    apply_update(updated_entity);
}

this creates issues because an early entity in the query might despawn several entities further ahead in the query, causing them to not exist when their commands are processed, crashing the game.

One solution would be to build up a set of every despawned entity to then skip them later in the loop, but that seems suboptimal both performance-wise and code-cleanliness-wise.

@BenjaminBrienen BenjaminBrienen added the S-Ready-For-Implementation This issue is ready for an implementation PR. Go for it! label Dec 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-ECS Entities, components, systems, and events C-Usability A targeted quality-of-life change that makes Bevy easier to use S-Ready-For-Implementation This issue is ready for an implementation PR. Go for it! X-Controversial There is active debate or serious implications around merging this PR
Projects
None yet
Development

Successfully merging a pull request may close this issue.