Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ability to UseAfterFree in master branch #1032

Closed
AngelicosPhosphoros opened this issue Dec 8, 2020 · 11 comments · Fixed by #1074
Closed

Ability to UseAfterFree in master branch #1032

AngelicosPhosphoros opened this issue Dec 8, 2020 · 11 comments · Fixed by #1074
Labels
A-ECS Entities, components, systems, and events C-Bug An unexpected or incorrect behavior

Comments

@AngelicosPhosphoros
Copy link
Contributor

AngelicosPhosphoros commented Dec 8, 2020

Bevy version
From Cargo.lock

[[package]]
name = "bevy"
version = "0.3.0"
source = "git+https://github.com/bevyengine/bevy.git#3d386a77b406ee65d6965ad91b29ea3acb8bcf3a"
dependencies = [
 "bevy_internal",
]

Operating system & version

Windows 10 but I guess it is not relevant

What you did

This sample program. Notice 'static lifetime of Res (ResMut or Query works too, I believe).

use std::sync::{Mutex, MutexGuard};

use bevy::prelude::*;
use once_cell::sync::OnceCell;

static INSTANCE: OnceCell<Mutex<Option<Res<'static, String>>>> = OnceCell::new();

fn main() {
   INSTANCE.get_or_init(||Mutex::new(None));
   App::build()
       .add_resource("HelloWorld".to_string())
       .add_system(ub_system.system())
       .run();
   let guard: MutexGuard<Option<Res<'static, String>>> = INSTANCE.get().expect("uninited?").lock().expect("Failed to acquire lock");
   let resource: &Res<'static, String> = guard.as_ref().unwrap();
   println!("Wow, I read freed memory with such data: {:?}", (**resource).as_bytes());
}

fn ub_system(string: Res<'static, String>){
   let mut guard = INSTANCE.get().expect("uninited?").lock().expect("Failed to acquire lock");
   *guard = Some(string);
}

What you expected to happen

This code should not compile because this is UB.

What actually happened

In my case it prints some garbage.

Wow, I read freed memory with such data: [0, 223, 3, 195, 196, 1, 0, 0, 176, 169]
PS D:\UserFolder\poc_bevy_unsound> cargo run
    Finished dev [unoptimized + debuginfo] target(s) in 0.31s
     Running `target\debug\poc_bevy_unsound.exe`
Wow, I read freed memory with such data: [224, 84, 1, 115, 128, 1, 0, 0, 32, 151]
PS D:\UserFolder\poc_bevy_unsound> cargo run --release
    Finished release [optimized] target(s) in 0.29s
     Running `target\release\poc_bevy_unsound.exe`
Wow, I read freed memory with such data: [112, 58, 118, 247, 229, 1, 0, 0, 112, 132]
PS D:\UserFolder\poc_bevy_unsound> cargo run
    Finished dev [unoptimized + debuginfo] target(s) in 0.28s
     Running `target\debug\poc_bevy_unsound.exe`
Wow, I read freed memory with such data: [144, 58, 46, 167, 239, 1, 0, 0, 48, 97]

Additional information

I recently tried to implement own ECS system for my game and bevy was my inspiration. I implement it mostly for fun and I found same bug in my ECS during implementation.
We discussed it on Reddit and I found a solution.
Look at this playground link.

So I suggest:

  1. Add lifetime parameter to the IntoSystem trait Let assume it named as 'app.
  2. Use this lifetime in implementation of system for functions and get_param method. Add 'app constraints to requested types for function.
  3. Make system method in IntoSystem trait unsafe (because it is).
  4. Change add_system method of App to accept IntoSystem trait implementation and use it like
fn my_system(v: Res<T>){}
App::build().add_system(my_system)

And implement it like:

fn add_system<'a, T>(&'a mut self, sys: T) where T: 'static + IntoSystem<'a>{
    let system = unsafe{ sys.system() };
    // old code goes here
}

This would prevent passing function that accepts Res<'static, T> into App unless it is 'static itself but mutation of 'static variable available only in unsafe code so it is fine.

Note: Code example above doesn't compile on Bevy 0.3.0 from crates.io.

@AngelicosPhosphoros
Copy link
Contributor Author

AngelicosPhosphoros commented Dec 8, 2020

Also, I would note, that it isn't code which someone would write reading bevy tutorials but in case when user wants store value into static Mutex, rustc suggest make Res<T> explicitely 'static instead of saying that it is impossible. So this bug would affect people who trusts the compiler suggestions in solving their mistakes.

@memoryruins memoryruins added C-Bug An unexpected or incorrect behavior A-ECS Entities, components, systems, and events labels Dec 8, 2020
@cart
Copy link
Member

cart commented Dec 9, 2020

Thanks for identifying the problem (and a solution!) I definitely agree that we have a problem here. I'll make it my next priority to fix this (after wrapping up the schedule work).

@cart
Copy link
Member

cart commented Dec 14, 2020

Alright I spent some time with this and tried a bunch of different options.

The best solution I've found so far is completely safe:

  • Systems cannot use Res<'static, T> to "break out" of the normal safety guarantees.
  • sys.system() is a safe operation in every context.

Its basically a combination of the Bevy 0.3 approach and the Bevy master approach. The one big downside I've found so far is that it prevents app.add_system(sys) from working. app.add_system(sys.system()) is required. I'm hoping to find a way around it, but so far I haven't.

To test different approaches I built a "minimal" reproduction of the Bevy abstractions (uploaded below)

I included various experiments:

  • A minimal reproduction of the current Bevy master approach
    • unsafe because you can "break out" of system scopes by giving params a static lifetime
  • Add a lifetime to SystemParam, and use for<'a> SystemParam<'a> in IntoSystem
    • This partially compiles (and would prevent 'static lifetimes), but system fns don't match the IntoSystem impl
  • Completely remove lifetimes from systems params (ex: Res<T> instead of Res<'a, T>)
    • This is clearly unsafe because Res no longer has a lifetime and could be moved out of the function call scope
  • An attempt at @AngelicosPhosphoros's solution
    • This protects against the 'static problem (when adding systems via an app builder), but it has two major problems:
      1. It requires putting a lifetime on System, which makes it unboxable
      2. It makes .system() much less versatile, because it is unsafe, and many .system() calls in Bevy don't have access to the App/World/Resource lifetimes.
  • The Bevy 0.3 + Bevy master approach (main.rs in the attached experiments)
    • Does everything we want it to, with the exception of allowing app.add_system(sys), which is a real shame. Removing .system() was a nice ergonomics boost, but safety comes first!

@cart
Copy link
Member

cart commented Dec 14, 2020

rust_play.zip

@cart
Copy link
Member

cart commented Dec 14, 2020

Ah wait that wasn't an accurate summary. The "minimal reproduction of the current Bevy master approach" I mentioned (e2e_with_param_transmute.rs) is actually a variant of @AngelicosPhosphoros's solution. It actually behaves pretty well:

  • 'static hacking won't compile
  • app.add_system(sys) works

The only real downside is that sys.system() is very unsafe / relies on the caller manually anchoring the lifetime to something valid.

edit: also it requires std::mem::transmute to coerce the lifetime to match FuncSystem's requirements (which is not ideal)

@cart
Copy link
Member

cart commented Dec 14, 2020

So I think it comes down to: do we want app.add_system(sys) or sys.system() safety more?

@cart
Copy link
Member

cart commented Dec 14, 2020

Updated examples (and re-added a bevy_master example):

rust_play.zip

@cart
Copy link
Member

cart commented Dec 15, 2020

One more example update (fixed the e2e_with_param_transmute example to actually work and modified main.rs to prove that system-owned values like Commands work)

rust_play.zip

@cart
Copy link
Member

cart commented Dec 15, 2020

Another loss in functionality in the main.rs solution: captured values in system closures are no longer supported:

let hi = vec![1];
App::build()
    .add_system(move || println!("{:?}", &hi))
    .run();

This didn't work in Bevy 0.3 either, but it did work on Bevy master.

@AngelicosPhosphoros
Copy link
Contributor Author

It requires putting a lifetime on System, which makes it unboxable

It is not necessary.
I finished with such code in my project (copypaste with some code removed to show only key points):

pub struct NewTypeOverBoxSystem(pub(crate) Box<dyn System>);
impl <template args> IntoSystem<'app, ... > for F 
        where
            Out: Into<TicksToSkip>,
            F: 'static + Fn(
                $(command_buff_type!($command_buffer; &mut CommandBuffer),)?
                $(Res<'app, $tres>,)*
                $(ResMut<'app, $tresmut>,)*
                $($tquery,)*
            ) -> Out,
            $($tres: Storable,)*
            $($tresmut: Storable,)*
            $($tquery: query_trait::QueryTrait<'app>,)*{
// Safety: Caller must ensure that 'app is not static and result doesn't outlive 'app
unsafe fn make_system(self) -> NewTypeOverBoxSystem{
                 ....  // Some validation here and accesses gathering

                #[allow(unused_unsafe)]
                let tick_func = move |command_buffer_arg: &mut CommandBuffer,
                                    resource_storage: &ResourceStorage,
                                    archetypes: &[&[&Archetype]]| -> TicksToSkip {

                    // Unsafe trick to make resulting System static.
                    // Caller of `make_system` should ensure that 'app is not static
                    #[allow(unused)]
                    let command_buffer_arg: &'app mut CommandBuffer = unsafe{
                        &mut *(command_buffer_arg as *mut _)
                    };
                    #[allow(unused)]
                    let resource_storage: &'app ResourceStorage = unsafe {
                        &*(resource_storage as *const _)
                    };
                    let archetypes: &[&[&'app Archetype]] = unsafe {
                        std::mem::transmute(archetypes)
                    };

                    $(
                        let $tres: Res<$tres> = resource_storage.get_resource();
                    )*
                    $(
                        let $tresmut: ResMut<$tresmut> = resource_storage.get_resource_mut();
                    )*

                    // This assert removes bounds checks for arch_index
                    // Also it validates input
                    assert_eq!(archetypes.len(), count_macro!($($tquery),*));
                    #[allow(unused)]
                    let arch_index = 0usize;

                    $(
                        let $tquery: $tquery = unsafe {
                            $tquery::from_archetypes(Archetype::convert_to_wrappers(archetypes[arch_index]))
                        };
                        #[allow(unused)]
                        let arch_index = arch_index + 1;
                    )*

                    self(
                        $(command_buff_arg!($command_buffer; command_buffer_arg),)?
                        $($tres,)*
                        $($tresmut,)*
                        $($tquery,)*
                    ).into()
                };
           NewTypeOverBoxSystem(Box::new(FunctionSystem {
                    name_func: type_name::<Self>,
                    access_func,
                    tick_func,
          }))
}

I don't need to make System in my project public so I accept with function like:

        struct App {}
        impl Drop for App {
            fn drop(&mut self) { /* dropping systems */ }
        }
        impl App {
            fn add_system<'a, T, Arg>(&'a mut self, system: T)
            where
                T: 'static + IntoSystem<'a, Arg>,
            {
                let system = unsafe { system.make_system() };
                // store system in private fields so user cannot use it.
            }
        }

Also I use newtype primarly to avoid leaking System trait out of ECS bounds.
If you want to allow users to provide manually implemented Systems, you can just implement this trait for System too:

impl<'app, T> IntoSystem<'app, T> for T where T: SpecialSystemTraitForExternalImplementers + 'static{
   unsafe fn make_system(self) -> NewTypeOverBoxSystem{
       ....
   }
}

So finally I have a function that accepts something convertible to System instead of System so all unsafety is hidden inside ECS.
My code is not ready (because I work on it nearly 8 hours on week but it is fun and I learn Rust a lot doing it) so look on it as a concept.

I think, it is not very big problem for Bevy to make this switch:

// old way
App::build()
   .add_system(my_func.system())
   .add_system(MyManualSystem::new().system())
   .run();
// new way
// Safe and maybe even more convenient
App::build()
   .add_system(my_func)
   .add_system(move || {}) // closure system
   .add_system(MyManualSystem::new())
   .run();

Especially, if you explicitly stated that Bevy is not stable.

If you worry that monomorphied add_system would affect compile time badly, you can use such trick

// Not monomorphied: fast compilation
// Cannot be called from outside
fn add_system_impl(&mut self, system: Box<dyn System>){...}
// Very little function: fast compilation
pub fn add_system<'app, T: 'static + IntoSystem<'app, B>, B>(&'app mut self, system: T){
   unsafe{
       // Safe because self and trait is bound by 'app
       // Also cannot be 'static unless unsafe in caller because 'static mut is unsafe
       let boxed_system = system.system();
       self.add_system_impl(boxed_system);
   }
}

@cart
Copy link
Member

cart commented Dec 15, 2020

It is not necessary. [the lifetime on system]

Haha I just realized that the "e2e_with_param_transmute.rs" example doesn't have a lifetime on system. So I guess I sorted that out already then promptly forgot.

I think, it is not very big problem for Bevy to make this switch:
If you worry that monomorphied add_system would affect compile time badly, you can use such trick

Thats literally the exact Bevy ECS master api, so I'm already well convinced that its "worth it" 😄

My biggest hangup with the approach you're suggesting is that it ties the lifetime to the "wrong thing" and severely limits contexts where you can create systems.

Systems shouldn't need to be anchored to App/World/Resource lifetimes because they have no relationship to that state. They are essentially boxed functions with some extra internal state.

By doing so, we are preventing all present and future scenarios that require non-app system construction.

Ex: With your approach its unsafe to allow users to create Schedules / Stages outside of the context of an app, as the Schedule could have any lifetime. Almost everything added in this PR would be unsafe #1021

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-ECS Entities, components, systems, and events C-Bug An unexpected or incorrect behavior
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants