Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Possible data race & use-after-free in std::env::var for unix implementation #114949

Closed
Nekrolm opened this issue Aug 17, 2023 · 7 comments · Fixed by #114968 or #115035
Closed

Possible data race & use-after-free in std::env::var for unix implementation #114949

Nekrolm opened this issue Aug 17, 2023 · 7 comments · Fixed by #114968 or #115035
Assignees
Labels
A-process Area: `std::process` and `std::env` C-bug Category: This is a bug. E-needs-test Call for participation: An issue has been fixed and does not reproduce, but no test has been added. O-unix Operating system: Unix-like T-libs Relevant to the library team, which will review and decide on the PR/issue.

Comments

@Nekrolm
Copy link

Nekrolm commented Aug 17, 2023

Investigating library sources for std::os::env::var/set_var, I have found that the copying of the env variable value is performed not under the lock:

pub fn getenv(k: &OsStr) -> Option<OsString> {
    // environment variables with a nul byte can't be set, so their value is
    // always None as well
    let s = run_with_cstr(k.as_bytes(), |k| {
        let _guard = env_read_lock();
        Ok(unsafe { libc::getenv(k.as_ptr()) } as *const libc::c_char)
       // lock is dropped here
    })
    .ok()?;
    if s.is_null() {
        None
    } else {
        // but access is here
        Some(OsStringExt::from_vec(unsafe { CStr::from_ptr(s) }.to_bytes().to_vec()))
    }
}

So there is a potential data race with setenv.

I tried this code with miri

fn main() {
    let t1 = std::thread::spawn(|| {
        let mut cnt = 0;
        for _ in 0..100000 {
            let var = std::env::var_os("HELLO");
            cnt += var.map(|v| v.len()).unwrap_or_default();
        }
        cnt
    });
    let t2 = std::thread::spawn(|| {
        for i in 0..100000 {
            let value = format!("helloooooooooooooo{i}");
            std::env::set_var("HELLO", &value);
        }
    });

    let cnt = t1.join().expect("ok");
    println!("{cnt}");
    t2.join();
}

And got

error: Undefined Behavior: Data race detected between (1) Read on thread `<unnamed>` and (2) Deallocate on thread `<unnamed>` at alloc3346+0x6. (2) just happened here
   --> /home/dmis/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/sys/unix/os.rs:613:26
    |
613 |             cvt(unsafe { libc::setenv(k.as_ptr(), v.as_ptr(), 1) }).map(drop)
    |                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Data race detected between (1) Read on thread `<unnamed>` and (2) Deallocate on thread `<unnamed>` at alloc3346+0x6. (2) just happened here
    |
help: and (1) occurred earlier here
   --> src/main.rs:30:23
    |
30  |             let var = std::env::var_os("HELLO");
    |                       ^^^^^^^^^^^^^^^^^^^^^^^^^
    = help: this indicates a bug in the program: it performed an invalid operation, and caused Undefined Behavior
    = help: see https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html for further information
    = note: BACKTRACE (of the first span):
    = note: inside closure at /home/dmis/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/sys/unix/os.rs:613:26: 613:65
    = note: inside `std::sys::common::small_c_string::run_with_cstr::<(), [closure@std::sys::unix::os::setenv::{closure#0}::{closure#0}]>` at /home/dmis/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/sys/common/small_c_string.rs:43:18: 43:22
    = note: inside closure at /home/dmis/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/sys/unix/os.rs:611:9: 614:11
    = note: inside `std::sys::common::small_c_string::run_with_cstr::<(), [closure@std::sys::unix::os::setenv::{closure#0}]>` at /home/dmis/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/sys/common/small_c_string.rs:43:18: 43:22
    = note: inside `std::sys::unix::os::setenv` at /home/dmis/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/sys/unix/os.rs:610:5: 615:7
    = note: inside `std::env::_set_var` at /home/dmis/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/env.rs:347:5: 347:31
    = note: inside `std::env::set_var::<&str, &std::string::String>` at /home/dmis/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/env.rs:343:5: 343:43
note: inside closure
   --> src/main.rs:38:13

Have tried to move Some(OsStringExt::from_vec(unsafe { CStr::from_ptr(s) }.to_bytes().to_vec())) under the lock, but still got error with miri.

I was not able to trigger the real crash with it yet.

Meta

rustc --version --verbose:

rustc 1.71.1 (eb26296b5 2023-08-03)
binary: rustc
commit-hash: eb26296b556cef10fb713a38f3d16b9886080f26
commit-date: 2023-08-03
host: x86_64-unknown-linux-gnu
release: 1.71.1
LLVM version: 16.0.5

Bug was introduced by this commit: 86974b8 (PR: #93668)
Before that changes, read was under the lock.

@Nekrolm Nekrolm added the C-bug Category: This is a bug. label Aug 17, 2023
@rustbot rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Aug 17, 2023
@ChrisDenton ChrisDenton added T-libs Relevant to the library team, which will review and decide on the PR/issue. O-unix Operating system: Unix-like A-process Area: `std::process` and `std::env` and removed needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. labels Aug 17, 2023
@ShE3py
Copy link
Contributor

ShE3py commented Aug 17, 2023

@rustbot claim

@saethlin saethlin added the I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness label Aug 18, 2023
@rustbot rustbot added the I-prioritize Issue: Indicates that prioritization has been requested for this issue. label Aug 18, 2023
@apiraino apiraino removed the I-prioritize Issue: Indicates that prioritization has been requested for this issue. label Aug 18, 2023
@the8472
Copy link
Member

the8472 commented Aug 18, 2023

😬

I think this happened because

  • implicit scoping of the lock guard, which makes it easy to shorten its lifetime
  • the rwlock isn't coupled to the resource via a lifetime, so it's easy for the extracted pointer to be used longer than the lock is valid

The first one could be made more visible by linting to add an explicit drop(guard) to make its lifetime more explicit in the code when a critical section ends. Especially when guard is otherwise unused which indicates that the lock doesn't directly own the resource it protects.
The second one could be solved by using a lifetime-tagged raw pointer and coupling that lifetime to the guard.

@ChrisDenton
Copy link
Member

Side note: even when fixed this assumes all other read/writes acquire Rust's lock and is potentially unsound if they do not. This has often been discussed with a view to making set_var unsafe. Arguably if that happens then the lock would be redundant, although it could also be argued it's still useful to help avoid some footguns even if you can't make it totally safe.

@RalfJung
Copy link
Member

Side note: even when fixed this assumes all other read/writes acquire Rust's lock and is potentially unsound if they do not. This has often been discussed with a view to making set_var unsafe. Arguably if that happens then the lock would be redundant, although it could also be argued it's still useful to help avoid some footguns even if you can't make it totally safe.

That's #90308.

@bors bors closed this as completed in 7b66abe Aug 20, 2023
@RalfJung
Copy link
Member

RalfJung commented Aug 20, 2023 via email

@RalfJung RalfJung reopened this Aug 20, 2023
@RalfJung RalfJung added E-needs-test Call for participation: An issue has been fixed and does not reproduce, but no test has been added. and removed I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness labels Aug 20, 2023
@ShE3py
Copy link
Contributor

ShE3py commented Aug 20, 2023

Something like

#[test] // miri shouldn't detect any data race in this fn
#[cfg_attr(all(target_family = "wasm", not(target_os = "wasi")), ignore)] // monothreaded platforms
fn test_env_get_set_multithreaded() {
    let t1 = std::thread::spawn(|| {
        let mut n = 0;
        
        for _ in 0..100 {
            let var = std::env::var_os("foo");
            n += var.map(|v| v.len()).unwrap_or_default();
        }
        
        n
    });
    
    let t2 = std::thread::spawn(|| {
        for i in 0..100 {
            let value = format!("bar{i:03}");
            std::env::set_var("foo", &value);
        }
    });

    let n = t1.join().unwrap();
    assert!(n % 2 == 0, "the sum of even values should be even");
    t2.join().unwrap();
}

?

@RalfJung
Copy link
Member

Yeah something like that. For Miri it's not really necessary to do this "sum of the lengths" computation, not sure why that would be particularly susceptible to a data race.

The one existing concurrency test uses #[cfg_attr(target_os = "emscripten", ignore)], so probably it would make sense to use the same attribute here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-process Area: `std::process` and `std::env` C-bug Category: This is a bug. E-needs-test Call for participation: An issue has been fixed and does not reproduce, but no test has been added. O-unix Operating system: Unix-like T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
8 participants