Fallible systems #16589

NthTensor · 2024-12-01T22:19:08Z

Objective

Error handling in bevy is hard. See for reference #11562, #10874 and #12660. The goal of this PR is to make it better, by allowing users to optionally return Result from systems as outlined by Cart in #14275 (comment).

Solution

This PR introduces a new ScheduleSystem type to represent systems that can be added to schedules. Instances of this type contain either an infallible BoxedSystem<(), ()> or a fallible BoxedSystem<(), Result>. ScheduleSystem implements System<In = (), Out = Result> and replaces all uses of BoxedSystem in schedules. The async executor now receives a result after executing a system, which for infallible systems is always Ok(()). Currently it ignores this result, but more useful error handling could be implemented later.
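As a rough illustration of the idea (not the actual bevy_ecs implementation), the wrapper can be thought of as an enum over the two kinds of system that always reports a Result when run. The names ScheduleSystemSketch, SystemResult, greet and fail below are hypothetical stand-ins, and plain boxed functions take the place of BoxedSystem.

```rust
use std::error::Error;

// Hypothetical stand-ins for illustration; these are not Bevy's real types.
type BoxedError = Box<dyn Error + Send + Sync + 'static>;
type SystemResult = Result<(), BoxedError>;

/// A schedulable system that is either infallible or fallible.
enum ScheduleSystemSketch {
    Infallible(Box<dyn FnMut()>),
    Fallible(Box<dyn FnMut() -> SystemResult>),
}

impl ScheduleSystemSketch {
    /// Runs the wrapped system and always reports a `Result`.
    fn run(&mut self) -> SystemResult {
        match self {
            // An infallible system cannot fail, so its result is always `Ok(())`.
            Self::Infallible(system) => {
                system();
                Ok(())
            }
            // A fallible system's own result is passed through unchanged.
            Self::Fallible(system) => system(),
        }
    }
}

fn greet() {
    println!("infallible system ran");
}

fn fail() -> SystemResult {
    Err("something went wrong".into())
}

fn main() {
    let mut systems = vec![
        ScheduleSystemSketch::Infallible(Box::new(greet)),
        ScheduleSystemSketch::Fallible(Box::new(fail)),
    ];
    for system in &mut systems {
        // Observe the result without acting on it, as the PR's executor does for now.
        let result = system.run();
        println!("system returned ok: {}", result.is_ok());
    }
}
```

The point of the single run method is that the caller never needs to know which variant it holds.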

Aliases for Error and Result have been added to the bevy_ecs prelude, as well as const OK which new users may find more friendly than Ok(()).
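A minimal sketch of what such aliases might look like, assuming the error type described in the release notes below (a boxed dyn Error that is Send + Sync + 'static). The exact definitions in bevy_ecs may differ; parse_positive and check are hypothetical helpers used only to exercise the aliases.

```rust
// Assumed shapes, written from the PR description; the real definitions live in bevy_ecs.
pub type Error = Box<dyn std::error::Error + Send + Sync + 'static>;
pub type Result<T = (), E = Error> = core::result::Result<T, E>;
pub const OK: Result = Ok(());

// Hypothetical helper: any error implementing `std::error::Error` is boxed by `?`.
fn parse_positive(input: &str) -> Result<u32> {
    let value: u32 = input.trim().parse()?;
    if value == 0 {
        // String slices also convert into the boxed error type.
        return Err("expected a positive number".into());
    }
    Ok(value)
}

// With the default type parameters, `Result` alone means `Result<(), Error>`.
fn check(input: &str) -> Result {
    parse_positive(input)?;
    OK
}

fn main() {
    println!("valid input ok: {}", check("42").is_ok());
    println!("invalid input is an error: {}", check("nope").is_err());
}
```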

Testing

Currently there are no actual semantic changes that require new tests, but I added a basic one to make sure we don't break things in the future.
The behavior of existing systems is totally unchanged, including logging.
All of the existing system tests pass, and I have not noticed anything strange while playing with the examples.

Showcase

The following minimal example prints "hello world" once, then completes.

use bevy::prelude::*;

fn main() {
    App::new().add_systems(Update, hello_world_system).run();
}

fn hello_world_system() -> Result {
    println!("hello world");
    Err("string")?;
    println!("goodbye world");
    OK
}

Migration Guide

This change should be pretty much non-breaking, except for users who have implemented their own custom executors. Those users should use ScheduleSystem in place of BoxedSystem<(), ()> and import the System trait where needed. They can choose to do whatever they wish with the result.
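To make "do whatever they wish with the result" concrete, here is a small sketch of an executor-like loop that collects failures instead of ignoring them. It does not use Bevy's executor traits: SketchSystem, run_all, succeeds and fails are hypothetical names, and boxed functions stand in for ScheduleSystem.

```rust
use std::error::Error;

type BoxedError = Box<dyn Error + Send + Sync + 'static>;
type SystemResult = Result<(), BoxedError>;
// Hypothetical stand-in for a scheduled system: every system reports a result.
type SketchSystem = Box<dyn FnMut() -> SystemResult>;

/// Runs every system once and returns a message for each one that failed.
fn run_all(systems: &mut [SketchSystem]) -> Vec<String> {
    let mut failures = Vec::new();
    for (index, system) in systems.iter_mut().enumerate() {
        if let Err(error) = system() {
            failures.push(format!("system {index} failed: {error}"));
        }
    }
    failures
}

fn succeeds() -> SystemResult {
    Ok(())
}

fn fails() -> SystemResult {
    Err("missing window".into())
}

fn main() {
    let mut systems: Vec<SketchSystem> = vec![Box::new(succeeds), Box::new(fails)];
    for failure in run_all(&mut systems) {
        eprintln!("{failure}");
    }
}
```

A real custom executor could just as well log, panic, or discard the error; collecting messages is only one possible policy.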

Current Work

Fix tests & doc comments
Write more tests
Add examples
Draft release notes

Draft Release Notes

As of this release, systems can now return results.

First, a bit of background: Bevy has historically expected systems to return the empty type (). While this makes sense in the context of the ECS, it's at odds with how error handling is typically done in Rust: returning Result::Err to indicate failure, and using the short-circuiting ? operator to propagate that error up the call stack to where it can be properly handled. Users of functional languages will tell you this is called "monadic error handling".
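For readers less familiar with the ? operator, the following standard-library-only sketch shows the short-circuiting it performs. The functions double and double_explicit are illustrative names; the second spells out roughly what the first does implicitly (when the error types already match, so no conversion is needed).

```rust
use std::num::ParseIntError;

// With `?`: on `Err`, return early from the function; on `Ok`, unwrap the value.
fn double(input: &str) -> Result<i32, ParseIntError> {
    let value = input.parse::<i32>()?;
    Ok(value * 2)
}

// Roughly the same logic written out by hand.
fn double_explicit(input: &str) -> Result<i32, ParseIntError> {
    let value = match input.parse::<i32>() {
        Ok(value) => value,
        Err(error) => return Err(error),
    };
    Ok(value * 2)
}

fn main() {
    println!("{:?}", double("21"));
    println!("{:?}", double_explicit("21"));
    println!("parse failure is an error: {}", double("abc").is_err());
}
```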

Not being able to return Results from systems left Bevy users with a quandary. They could add custom error handling logic to every system, or manually pipe every system into an error handler, or perhaps sidestep the issue with some combination of fallible assignments, logging, macros, and early returns. Often, users would just litter their systems with unwraps and possible panics.

While any one of these approaches might be fine for a particular user, each has its own drawbacks, and none makes good use of the language. Serious issues could also arise when two different crates used by the same project made different choices about error handling.

Now, by returning results, systems can defer error handling to the application itself. It looks like this:

// Previous, handling internally
app.add_systems(Update, my_system);
fn my_system(query: Query<&Window>) {
    let Ok(window) = query.get_single() else {
        return;
    };
    // ... do something to the window here
}

// Previous, handling externally
app.add_systems(Update, my_system.pipe(my_error_handler));
fn my_system(query: Query<&Window>) -> Result<(), impl Error> {
    let window = query.get_single()?;
    // ... do something to the window here
    Ok(())
}

// Previous, panicking
app.add_systems(Update, my_system);
fn my_system(query: Query<&Window>) {
    let window = query.single();
    // ... do something to the window here
}

// Now
app.add_systems(Update, my_system);
fn my_system(query: Query<&Window>) -> Result {
    let window = query.get_single()?;
    // ... do something to the window here
    Ok(())
}

There are currently some limitations. Systems must either return () or Result<(), Box<dyn Error + Send + Sync + 'static>>, with no in-between. Results are also ignored by default, and though implementing a custom handler is possible, it involves writing your own custom ECS executor (which is not recommended).
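In practice the fixed error type is less restrictive than it may sound, because ? converts any error that implements std::error::Error (and is Send + Sync + 'static) into the boxed form. The sketch below uses only the standard library; MissingWindow, find_window and system_like are hypothetical names, not Bevy APIs.

```rust
use std::error::Error;
use std::fmt;

type BoxedError = Box<dyn Error + Send + Sync + 'static>;

// Hypothetical custom error type defined by a user or library.
#[derive(Debug)]
struct MissingWindow;

impl fmt::Display for MissingWindow {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "no window found")
    }
}

impl Error for MissingWindow {}

// A helper with its own concrete error type.
fn find_window(present: bool) -> Result<u32, MissingWindow> {
    if present { Ok(1) } else { Err(MissingWindow) }
}

// A function shaped like a fallible system: `?` boxes the concrete error.
fn system_like(present: bool) -> Result<(), BoxedError> {
    let _window_id = find_window(present)?;
    Ok(())
}

fn main() {
    if let Err(error) = system_like(false) {
        // The original type can still be recovered by downcasting.
        let is_missing_window = error.downcast_ref::<MissingWindow>().is_some();
        println!("error: {error} (missing window: {is_missing_window})");
    }
}
```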

Systems should return errors when they cannot perform their normal behavior. In turn, errors returned to the executor while running the schedule will (eventually) be treated as unexpected. Users and library authors should prefer to return errors for anything that disrupts the normal expected behavior of a system, and should only handle expected cases internally.
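One way to read this guidance, sketched with the standard library only: an absent setting is an expected case and is handled internally with a default, while a malformed setting is unexpected and is returned as an error. The function read_volume, the "volume" key and the default of 50 are hypothetical examples.

```rust
use std::collections::HashMap;
use std::error::Error;

type BoxedError = Box<dyn Error + Send + Sync + 'static>;

/// Reads a volume setting from a simple key/value map.
fn read_volume(settings: &HashMap<&str, &str>) -> Result<u8, BoxedError> {
    // Expected case: the key is absent, so fall back to a default without an error.
    let Some(raw) = settings.get("volume") else {
        return Ok(50);
    };
    // Unexpected case: the key is present but not a valid number, so propagate the error.
    let volume: u8 = raw.parse()?;
    Ok(volume)
}

fn main() {
    let mut settings = HashMap::new();
    println!("default: {:?}", read_volume(&settings).ok());
    settings.insert("volume", "loud");
    println!("malformed value is an error: {}", read_volume(&settings).is_err());
}
```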

We have big plans for improving error handling further:

Allowing users to change the error handling logic of the default executors.
Adding source tracking and optional backtraces to errors.
Possibly adding tracing-levels (Error/Warn/Info/Debug/Trace) to errors.
Generally making the default error logging more helpful and intelligent.
Adding monadic system combinators for fallible systems.
Possibly removing all panicking variants from our api.

bushrat011899 · 2024-12-01T22:51:46Z

Even without the actual error handling benefits this provides, just having a more blessed way to use ? in systems will be really nice. I know we can just pipe the Result with current systems, but this will hide that bit of extra boilerplate. This also pairs nicely with making more APIs return Result instead of Option, and also makes panicking variants less important (possibly even removable TBH).

bushrat011899

I agree this probably needs an example, but I like the approach. Opens up the possibility of having error handlers in the future, which would resolve the to-panic or not to-panic debate entirely. This also lays the groundwork for how fallibility in Commands could work. Really nice work!

crates/bevy_ecs/src/lib.rs

crates/bevy_ecs/src/schedule/executor/mod.rs

crates/bevy_ecs/src/schedule/executor/multi_threaded.rs

bushrat011899 · 2024-12-02T03:12:27Z

It was mentioned in Discord, but I'll include it here for posterity: with fallible systems getting first-class treatment, there may be room to consider removing the panicking variants of certain functions (e.g., Query::get_entity and Query::entity), since the choice of behaviour could be controlled by a system error handler. This would be a large DX win, since the "proper" methods would get the shorter names, and it'd reduce the API surface area.

NthTensor · 2024-12-02T04:31:38Z

there may be room to consider removing the panicking variants of certain functions

That's in line with the third point Cart proposed in #14275 (comment). He indicated then that it was important to land all the related changes in a single release cycle, and I agree. This PR provides his (1), what (2) and (3) look like is up to @alice-i-cecile and the other designated ecs experts.

tychedelia

Amazing to see how straightforward this is, all things considered. Very excited!

crates/bevy_ecs/src/result.rs

teohhanhui · 2024-12-02T06:08:15Z

as well as const OK which new users may find more friendly than Ok(()).

Why? This just makes the code more jarring compared to the rest of the Rust ecosystem, and more cognitive load to switch between returning nothing vs. returning some value.

It'd make sense if it's something useful like https://docs.rs/anyhow/latest/anyhow/fn.Ok.html

NthTensor · 2024-12-02T06:11:45Z

This just makes the code more jarring compared to the rest of the Rust ecosystem

In this I am trying to defer to my understanding of Cart's preferences. He uses a const in the linked issue, and I believe has expressed that Ok(()) is sort of confusing and cumbersome. No strong preference here from me really.

NthTensor · 2024-12-02T21:49:34Z

@alice-i-cecile requesting removal of the M-Needs-Release-Note label (added in pr description because we don't have a good place for it yet).

crates/bevy_ecs/src/schedule/executor/mod.rs

crates/bevy_ecs/src/system/schedule_system.rs

alice-i-cecile · 2024-12-03T15:53:13Z

@alice-i-cecile requesting removal of the M-Needs-Release-Note label (added in pr description because we don't have a good place for it yet).

Just like migration guides, we should keep this label around even after they're written for searchability and tooling. We might want to rename that to be more clear though 🤔

bushrat011899 · 2024-12-04T01:11:39Z

examples/ecs/fallible_systems.rs

+    ));
+
+    // Create a new sphere mesh:
+    let mut sphere_mesh = Sphere::new(1.0).mesh().ico(7)?;


What a sight to behold. Once proper (user configurable) handlers are added in a follow-up this will be perfect. Bevy APIs can be simplified and made more reliable without any loss in ergonomics (IMO). Adding Ok(()) at the end of a system is a small price to pay that (hopefully) Rust will solve on its own (since the issue isn't specific to Bevy).

NthTensor · 2024-12-04

Yeah :) this example is really just a minimal placeholder. Once we have handlers hooked up, I intend to go through and update all the examples to use this style (where it makes sense).

BenjaminBrienen · 2024-12-04T01:29:01Z

@alice-i-cecile requesting removal of the M-Needs-Release-Note label (added in pr description because we don't have a good place for it yet).

Just like migration guides, we should keep this label around even after they're written for searchability and tooling. We might want to rename that to be more clear though 🤔

M-#[require(ReleaseNote)]

NthTensor · 2024-12-04T02:53:24Z

Alright, I added the most basic tests and examples in the world. There will be more to do there when handlers are hooked up. Ready for review.

examples/ecs/fallible_systems.rs

crates/bevy_ecs/src/schedule/config.rs

alice-i-cecile

@nth this is good to go once it's merge-conflict free. I do prefer the "everything is fallible" approach by WrongShoe, but that's easily left to a follow-up refactor. Let's get the ball rolling here.

Allow systems to return results

b80265b

NthTensor added the C-Feature (A new feature, making something new possible), A-ECS (Entities, components, systems, and events), C-Usability (A targeted quality-of-life change that makes Bevy easier to use), and S-Needs-Review (Needs reviewer attention (from anyone!) to move forward) labels Dec 1, 2024

Fix typo

1d5f7b9

NthTensor requested review from bushrat011899, tychedelia and alice-i-cecile December 1, 2024 22:21

Fix formatting

f7bbc21

NthTensor and others added 8 commits December 1, 2024 18:35

Cleanup pass

abd9e91

Merge branch 'main' into fallible_systems

63a01b0

Replace ok function with const

add1247

Suppress warning about never

00df090

Fix tests

abe7c3a

Fix formatting

311a30f

Move lint allow to module

fa753a7

Fix doclink

1136dd1

bushrat011899 approved these changes Dec 2, 2024

NthTensor and others added 3 commits December 1, 2024 22:25

Add note to suppressed warning

fde9764

Co-authored-by: Zachary Harrold <[email protected]>

Update crates/bevy_ecs/src/schedule/executor/mod.rs

2ad1b7b

Co-authored-by: Zachary Harrold <[email protected]>

Update crates/bevy_ecs/src/schedule/executor/mod.rs

ecce62b

Co-authored-by: Zachary Harrold <[email protected]>

NthTensor mentioned this pull request Dec 2, 2024

Our API suggests that panicking should be the default #14275

Open

tychedelia approved these changes Dec 2, 2024

crates/bevy_ecs/src/result.rs (resolved)

NthTensor mentioned this pull request Dec 2, 2024

System piping for error/warn/info logging does not show error source #8638

Open

hymm reviewed Dec 3, 2024

crates/bevy_ecs/src/schedule/executor/mod.rs (outdated, resolved)

crates/bevy_ecs/src/system/schedule_system.rs (resolved)

NthTensor added 2 commits December 2, 2024 22:50

Implement first pass of reviewer feadback

0b0d60d

Fix prelude

2bffe65

MiniaczQ self-requested a review December 3, 2024 16:29

NthTensor and others added 3 commits December 3, 2024 19:54

Add basic fallible systems example

7e6c84b

Fix typo

2e884b5

Merge branch 'main' into fallible_systems

193684a

bushrat011899 reviewed Dec 4, 2024

Fix example comments

3636a91

Add basic fallible system test

4786816

NthTensor marked this pull request as ready for review December 4, 2024 02:50

NthTensor added the S-Ready-For-Final-Review (This PR has been approved by the community. It's ready for a maintainer to consider merging it) label and removed the S-Waiting-on-Author (The author needs to make changes or address concerns before this can be merged) label Dec 4, 2024

Add docs for system return type

189ee7d

alice-i-cecile reviewed Dec 5, 2024

examples/ecs/fallible_systems.rs (outdated, resolved)

alice-i-cecile added the M-Needs-Migration-Guide (A breaking change to Bevy's public API that needs to be noted in a migration guide) label Dec 5, 2024

alice-i-cecile reviewed Dec 5, 2024

crates/bevy_ecs/src/schedule/config.rs (resolved)

alice-i-cecile approved these changes Dec 5, 2024

NthTensor and others added 3 commits December 5, 2024 16:29

Hide fallible/infallible marker types

95b64c0

Merge remote-tracking branch 'upstream/main' into fallible_systems

151b7ba

Merge branch 'main' into fallible_systems

5afedeb

alice-i-cecile added this pull request to the merge queue Dec 5, 2024

Merged via the queue into bevyengine:main with commit 0070514 Dec 5, 2024
31 of 32 checks passed

alice-i-cecile mentioned this pull request Dec 8, 2024

Fallible systems need to report failures #16718

Open

ChristopherBiscardi mentioned this pull request Dec 23, 2024

Add TileStorage::drain and return removed entities in remove StarArawn/bevy_ecs_tilemap#586

Merged
