Miri hangs in spin loops forever #1388

RalfJung · 2020-05-03T10:16:58Z

The following program loops forever in Miri, but terminates as expected with normal rustc:

use std::thread;
use std::sync::atomic::{AtomicUsize, Ordering};

static FLAG: AtomicUsize = AtomicUsize::new(0);

fn main() {
    let j = thread::spawn(|| {
        while FLAG.load(Ordering::Acquire) == 0 {
            // spin and wait
        }
    });
    thread::yield_now();
    FLAG.store(1, Ordering::Release);
    j.join().unwrap();
}

One could argue the spin loop should use hint::spin_loop();, but that is unstable -- and it seems reasonable to expect the scheduler to de-schedule a thread at some point even if it does not yield.

Cc @vakaras

The text was updated successfully, but these errors were encountered:

oli-obk · 2020-05-03T10:21:38Z

We could do round robin, where every terminator causes the next thread to get activated.

RalfJung · 2020-05-03T10:39:39Z

Ah turns out there is a stable version of the same intrinsic, std::sync::atomic::spin_loop_hint().

We could do round robin, where every terminator causes the next thread to get activated.

I was more thinking "switch every N terminators", but N = 1 is certainly a possible choice. ;) We might make this configurable even.

RalfJung · 2020-08-19T09:45:48Z

There's an interesting testcase in once_cell where even adding a spin loop hint does not help. My guess is that the scheduler keeps swapping between two of the spinning threads so that the others are starved.

RalfJung · 2021-04-23T14:25:05Z

I think here is another example where yield_now is insufficient:
https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=f3ee4d476185291766d413ddeaa425ad
This test hangs in Miri; I assume what happens is that the two threads with their while !THREAD2_LAUNCHED.load(Ordering::SeqCst) loops are playing ping-pong and the thread that would set THREAD2_LAUNCHED never gets run.

I assume @JCTyblaidd's PR #1651 would help here.

vakaras · 2021-04-23T14:31:03Z

I think here is another example where yield_now is insufficient:
https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=f3ee4d476185291766d413ddeaa425ad
This test hangs in Miri; I assume what happens is that the two threads with their while !THREAD2_LAUNCHED.load(Ordering::SeqCst) loops are playing ping-pong and the thread that would set THREAD2_LAUNCHED never gets run.

This is the most likely scenario because the threads with the lower ID get priority (simply because we pick them by traversing the list).

Kixunil · 2021-10-08T10:03:35Z

Got bitten by this recently when I forgot to call thread::park() and attributed the spin loop to slowness/experimental nature of Miri. 😅 Tried to look into the docs and didn't find it. Found it when refactoring the code later.

I think it'd be useful to document this for now, I can send a PR if you agree.
Long-term, if some kind of hint is required, would it be feasible to have some heuristics where Miri could detect basic instances of this and warn?

Anyway, thanks for your great work on this tool! I feel much better now that I see my unsafe code is probably correct. :)

RalfJung · 2021-10-09T15:49:32Z

Long-term, if some kind of hint is required, would it be feasible to have some heuristics where Miri could detect basic instances of this and warn?

Seems better to use a different scheduling algorithm instead. One version of that started in #1651, but was not completed. A simpler alternative would be a round-robin scheduler.

Until that happens, I agree documenting this better would be good, and a PR would be definitely welcome. :)

Kixunil · 2021-10-09T20:21:53Z

Sorry for being unclear. It seems to me that a spinloop without spin_loop_hint() is undesirable so warning about that could be nice? Although that's probably something for clippy, but I'm not sure if clippy is powerful enough to notice that.

RalfJung · 2021-10-11T21:44:04Z

It seems to me that a spinloop without spin_loop_hint() is undesirable so warning about that could be nice?

Hm, that might be the case (not sure if everyone would agree) but I don't think Miri is necessarily the tool for detecting such things. None of the solutions I know of that would make Miri better at running such code would provide any avenue towards having such a warning.

Kixunil · 2021-10-12T09:10:25Z

Sure, thanks!

Document threading support a bit more This adds a few known limitations around threading to the README and suggests the users to look into GitHub issues to learn more. Addresses #1388 (comment)

831: Add spin loop hints in tests for Miri r=taiki-e a=cbeuw This is a better way to do #829 Miri does not have a pre-emptive scheduler, so once the execution falls into a spin loop it'll hang forever: rust-lang/miri#1388 Similar measures (`yield_now()`) are already present in [some other tests](https://github.com/crossbeam-rs/crossbeam/blob/master/crossbeam-queue/tests/array_queue.rs), but it's missing here 832: Fix links to macro rules in docs r=taiki-e a=alygin The fix provides explicit links to macro rules in docs because they [are not resolved globally anymore](rust-lang/rust#96676). Co-authored-by: Andy Wang <[email protected]> Co-authored-by: Andrew Lygin <[email protected]>

RalfJung · 2022-06-05T16:35:00Z

There's an interesting testcase in once_cell where even adding a spin loop hint does not help. My guess is that the scheduler keeps swapping between two of the spinning threads so that the others are starved.

Here's a self-contained testcase for that situation:

use std::thread;
use std::sync::atomic::{AtomicUsize, Ordering};

static FLAG: AtomicUsize = AtomicUsize::new(0);

// When a thread yields, Miri's scheduler will pick the thread with the lowest ID
// that can run. IDs are assigned in thread creation order.
// This means we can make 2 threads ping-pong with each other while
// really there is a 3rd thread that we should schedule to make progress.

fn main() {
    let waiter1 = thread::spawn(|| {
        while FLAG.load(Ordering::Acquire) == 0 {
            // spin and wait
            thread::yield_now();
        }
    });
    let waiter2 = thread::spawn(|| {
        while FLAG.load(Ordering::Acquire) == 0 {
            // spin and wait
            thread::yield_now();
        }
    });
    let progress = thread::spawn(|| {
        FLAG.store(1, Ordering::Release);
    });
    // The first `join` blocks the main thread and thus takes it out of the equation.
    waiter1.join().unwrap();
    waiter2.join().unwrap();
    progress.join().unwrap();
}

RalfJung · 2022-06-05T18:25:07Z

#2197 helps with that. So after that PR, at least explicitly yielding spinloops should always work in Miri.

RalfJung · 2022-06-05T18:35:24Z

I think here is another example where yield_now is insufficient: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=f3ee4d476185291766d413ddeaa425ad This test hangs in Miri; I assume what happens is that the two threads with their while !THREAD2_LAUNCHED.load(Ordering::SeqCst) loops are playing ping-pong and the thread that would set THREAD2_LAUNCHED never gets run.

That test (originally by @jethrogb) also works now, but I had to change it a little. Unchanged, it double-panics:

thread '<unnamed>' panicked at 'use of std::thread::current() is not possible after the thread's local data has been destroyed', /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/std/src/thread/mod.rs:677:35
stack backtrace:
   0: std::panicking::begin_panic_handler
             at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/std/src/panicking.rs:584:5
   1: std::rt::panic_fmt
             at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/core/src/panicking.rs:142:14
   2: std::rt::panic_display::<&str>
             at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/core/src/panicking.rs:72:5
   3: core::panicking::panic_str
             at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/core/src/panicking.rs:56:5
   4: std::option::expect_failed
             at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/core/src/option.rs:1874:5
   5: std::option::Option::<std::thread::Thread>::expect
             at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/core/src/option.rs:738:21
   6: std::thread::current
             at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/std/src/thread/mod.rs:677:5
   7: std::sync::mpsc::blocking::tokens
             at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/std/src/sync/mpsc/blocking.rs:30:42
   8: std::sync::mpsc::sync::wait::<()>
             at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/std/src/sync/mpsc/sync.rs:110:38
   9: std::sync::mpsc::sync::Packet::<()>::recv
             at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/std/src/sync/mpsc/sync.rs:280:25
  10: std::sync::mpsc::Receiver::<()>::recv
             at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/std/src/sync/mpsc/mod.rs:1178:47
  11: <launcher::{closure#0}::RecvOnDrop as std::ops::Drop>::drop
             at tests/pass/concurrency/spin_loops.rs:53:21
  12: std::ptr::drop_in_place::<launcher::{closure#0}::RecvOnDrop> - shim(Some(launcher::{closure#0}::RecvOnDrop))
             at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/core/src/ptr/mod.rs:487:1
  13: std::ptr::drop_in_place::<std::option::Option<launcher::{closure#0}::RecvOnDrop>> - shim(Some(std::option::Option<launcher::{closure#0}::RecvOnDrop>))
             at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/core/src/ptr/mod.rs:487:1
  14: std::mem::drop::<std::option::Option<launcher::{closure#0}::RecvOnDrop>>
             at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/core/src/mem/mod.rs:974:24
  15: std::thread::local::fast::destroy_value::<launcher::{closure#0}::RecvOnDrop>
             at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/std/src/thread/local.rs:1033:13
  16: std::sys_common::thread_local_dtor::register_dtor_fallback::run_dtors
             at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/std/src/sys_common/thread_local_dtor.rs:43:17
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

thread '<unnamed>' panicked at 'called `Result::unwrap()` on an `Err` value: PoisonError { .. }', /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/std/src/sync/mpsc/sync.rs:386:42
stack backtrace:
   0:           0x2b728a - std::backtrace_rs::backtrace::miri::trace_unsynchronized::<&mut [closure@std::sys_common::backtrace::_print_fmt::{closure#1}]>
                               at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/std/src/../../backtrace/src/backtrace/miri.rs:99:5
   1:           0x2b6e30 - std::backtrace_rs::backtrace::miri::trace::<&mut [closure@std::sys_common::backtrace::_print_fmt::{closure#1}]>
                               at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/std/src/../../backtrace/src/backtrace/miri.rs:62:14
   2:           0x25a463 - std::backtrace_rs::backtrace::trace_unsynchronized::<[closure@std::sys_common::backtrace::_print_fmt::{closure#1}]>
                               at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
   3:           0x23c7eb - std::sys_common::backtrace::_print_fmt
                               at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/std/src/sys_common/backtrace.rs:66:5
   4:           0x23a1d3 - <std::sys_common::backtrace::_print::DisplayBacktrace as std::fmt::Display>::fmt
                               at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/std/src/sys_common/backtrace.rs:45:22
   5:           0x56a4dd - std::fmt::write
                               at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/core/src/fmt/mod.rs:1196:17
   6:           0x124dc3 - <std::sys::unix::stdio::Stderr as std::io::Write>::write_fmt
                               at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/std/src/io/mod.rs:1654:15
   7:           0x23c4e3 - std::sys_common::backtrace::_print
                               at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/std/src/sys_common/backtrace.rs:48:5
   8:           0x23c389 - std::sys_common::backtrace::print
                               at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/std/src/sys_common/backtrace.rs:35:9
   9:           0x253dd1 - std::panicking::default_hook::{closure#1}
                               at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/std/src/panicking.rs:295:22
  10:           0x2540ae - std::panicking::default_hook
                               at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/std/src/panicking.rs:314:9
  11:           0x257c44 - std::panicking::rust_panic_with_hook
                               at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/std/src/panicking.rs:698:17
  12:           0x256a77 - std::panicking::begin_panic_handler::{closure#0}
                               at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/std/src/panicking.rs:588:13
  13:           0x23d08e - std::sys_common::backtrace::__rust_end_short_backtrace::<[closure@std::panicking::begin_panic_handler::{closure#0}], !>
                               at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/std/src/sys_common/backtrace.rs:138:18
  14:           0x256995 - std::panicking::begin_panic_handler
                               at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/std/src/panicking.rs:584:5
  15:           0x52467e - std::rt::panic_fmt
                               at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/core/src/panicking.rs:142:14
  16:           0x53ff19 - std::result::unwrap_failed
                               at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/core/src/result.rs:1805:5
  17:           0x53ab66 - std::result::Result::<std::sync::MutexGuard<std::sync::mpsc::sync::State<()>>, std::sync::PoisonError<std::sync::MutexGuard<std::sync::mpsc::sync::State<()>>>>::unwrap
                               at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/core/src/result.rs:1098:23
  18:           0x215733 - std::sync::mpsc::sync::Packet::<()>::drop_port
                               at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/std/src/sync/mpsc/sync.rs:386:25
  19:           0x200881 - <std::sync::mpsc::Receiver<()> as std::ops::Drop>::drop
                               at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/std/src/sync/mpsc/mod.rs:1507:36
  20:           0x3d48b8 - std::ptr::drop_in_place::<std::sync::mpsc::Receiver<()>> - shim(Some(std::sync::mpsc::Receiver<()>))
                               at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/core/src/ptr/mod.rs:487:1
  21:            0x7ecf9 - <launcher::{closure#0}::RecvOnDrop as std::ops::Drop>::drop
                               at tests/pass/concurrency/spin_loops.rs:54:17
  22:           0x3d48c7 - std::ptr::drop_in_place::<launcher::{closure#0}::RecvOnDrop> - shim(Some(launcher::{closure#0}::RecvOnDrop))
                               at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/core/src/ptr/mod.rs:487:1
  23:           0x3d48d7 - std::ptr::drop_in_place::<std::option::Option<launcher::{closure#0}::RecvOnDrop>> - shim(Some(std::option::Option<launcher::{closure#0}::RecvOnDrop>))
                               at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/core/src/ptr/mod.rs:487:1
  24:           0x3bcee6 - std::mem::drop::<std::option::Option<launcher::{closure#0}::RecvOnDrop>>
                               at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/core/src/mem/mod.rs:974:24
  25:            0xb05dd - std::thread::local::fast::destroy_value::<launcher::{closure#0}::RecvOnDrop>
                               at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/std/src/thread/local.rs:1033:13
  26:           0x1ec71c - std::sys_common::thread_local_dtor::register_dtor_fallback::run_dtors
                               at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/std/src/sys_common/thread_local_dtor.rs:43:17
thread panicked while panicking. aborting.

I think this might be a true positive; that code does call rx.recv() in the destructor of a thread-local variable, which I am not surprised to see leading to problems. I assume it is pure chance that this works fine outside of Miri.

make Miri's scheduler proper round-robin When thread N blocks or yields, we activate thread N+1 next, rather than always activating thread 0. This should guarantee that as long as all threads regularly yield, each thread eventually takes a step again. Fixes the "multiple loops that yield playing ping-pong" part of #1388. `@cbeuw` I hope this doesn't screw up the scheduler-dependent tests you are adding in your PR.

jethrogb · 2022-06-07T09:39:18Z

@RalfJung The code with rx.recv had other issues, the final test that was upstreamed doesn't use MPSC channels, see rust-lang/rust#84409

RalfJung · 2022-06-07T11:51:27Z

Ah, that looks like a nice test, I'll add it to Miri. :)

RalfJung added C-bug Category: This is a bug. A-concurrency Area: affects our concurrency (multi-thread) support labels May 3, 2020

This was referenced May 6, 2020

Stop ignoring miri leaks jonhoo/flurry#82

Merged

Miri: run liballoc tests with threads rust-lang/rust#71737

Merged

RalfJung mentioned this issue Dec 11, 2020

stacked-borrows very slow #1647

Closed

JCTyblaidd mentioned this issue Dec 15, 2020

Improve yield handling and add yield-based livelock detection. #1651

Closed

RalfJung mentioned this issue Feb 15, 2021

Run Miri on CI crossbeam-rs/crossbeam#578

Merged

Kixunil mentioned this issue Oct 9, 2021

Document threading support a bit more #1898

Merged

Kixunil mentioned this issue Oct 12, 2021

New lint: loop with atomics without other side effects rust-lang/rust-clippy#7809

Closed

RalfJung mentioned this issue Mar 9, 2022

add basic support for android targets #2011

Closed

3 tasks

This was referenced May 17, 2022

Disable Miri weak memory emulation for deque tests crossbeam-rs/crossbeam#829

Closed

Add spin loop hints in tests for Miri crossbeam-rs/crossbeam#831

Merged

RalfJung mentioned this issue May 19, 2022

Weak memory emulation using store buffers #1963

Merged

6 tasks

RalfJung mentioned this issue Jun 5, 2022

make Miri's scheduler proper round-robin #2197

Merged

RalfJung mentioned this issue Jun 6, 2022

Make scheduler preemptive #2208

Merged

bors closed this as completed in ad576d8 Jun 7, 2022

bors closed this as completed in #2208 Jun 7, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Miri hangs in spin loops forever #1388

Miri hangs in spin loops forever #1388

RalfJung commented May 3, 2020

oli-obk commented May 3, 2020

RalfJung commented May 3, 2020

RalfJung commented Aug 19, 2020

RalfJung commented Apr 23, 2021 •

edited

Loading

vakaras commented Apr 23, 2021 •

edited

Loading

Kixunil commented Oct 8, 2021

RalfJung commented Oct 9, 2021

Kixunil commented Oct 9, 2021

RalfJung commented Oct 11, 2021

Kixunil commented Oct 12, 2021

RalfJung commented Jun 5, 2022 •

edited

Loading

RalfJung commented Jun 5, 2022

RalfJung commented Jun 5, 2022 •

edited

Loading

jethrogb commented Jun 7, 2022

RalfJung commented Jun 7, 2022

Miri hangs in spin loops forever #1388

Miri hangs in spin loops forever #1388

Comments

RalfJung commented May 3, 2020

oli-obk commented May 3, 2020

RalfJung commented May 3, 2020

RalfJung commented Aug 19, 2020

RalfJung commented Apr 23, 2021 • edited Loading

vakaras commented Apr 23, 2021 • edited Loading

Kixunil commented Oct 8, 2021

RalfJung commented Oct 9, 2021

Kixunil commented Oct 9, 2021

RalfJung commented Oct 11, 2021

Kixunil commented Oct 12, 2021

RalfJung commented Jun 5, 2022 • edited Loading

RalfJung commented Jun 5, 2022

RalfJung commented Jun 5, 2022 • edited Loading

jethrogb commented Jun 7, 2022

RalfJung commented Jun 7, 2022

RalfJung commented Apr 23, 2021 •

edited

Loading

vakaras commented Apr 23, 2021 •

edited

Loading

RalfJung commented Jun 5, 2022 •

edited

Loading

RalfJung commented Jun 5, 2022 •

edited

Loading