Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stacked Borrows: asserting uniqueness too early? Should we allow the optimizer to add spurious stores? #133

Open
RalfJung opened this issue May 28, 2019 · 45 comments
Labels
A-aliasing-model Topic: Related to the aliasing model (e.g. Stacked/Tree Borrows) A-SB-vs-TB Topic: Design questions where SB and TB are opposite sides of the design axis C-open-question Category: An open question that we should revisit S-pending-design Status: Resolving this issue requires addressing some open design questions

Comments

@RalfJung
Copy link
Member

RalfJung commented May 28, 2019

We have seen a few cases in libstd where the rule that just creating a mutable reference already asserts uniqueness is a problem:

In some cases, even just creating a shared ref is an issue, as it conflicts with a recently created mutable ref:

So maybe we assert uniqueness to aggressively here? Maybe mutable references should only become unique on first write, or so? I am not sure what that would look like though. (In principle this could also happen with shared references but I have not seen that.)

I'll use this issue to collect such cases.

However, there are also cases where people want the extra UB to get more optimizations:

@RalfJung RalfJung added the A-aliasing-model Topic: Related to the aliasing model (e.g. Stacked/Tree Borrows) label May 28, 2019
@RalfJung RalfJung changed the title Stacked Borrows: asserting aliasing guarantees too often? Stacked Borrows: asserting aliasing guarantees too early? May 28, 2019
@RalfJung RalfJung changed the title Stacked Borrows: asserting aliasing guarantees too early? Stacked Borrows: asserting uniqueness too early? May 28, 2019
@RalfJung
Copy link
Member Author

RalfJung commented Jul 10, 2019

Another related case: rust-lang/rust#62553; calling mem::forget asserts uniqueness and invalidates the pointer that we actually care about.

@retep998
Copy link
Member

retep998 commented Jul 17, 2019

This is my opinion based on experience with unsafe code and FFI:

While mutable references should assert that the data is otherwise immutable, I believe that the assertion of uniqueness should only apply to references. It is far too common of an assumption that raw pointers are not invalidated by a mutable reference existing, and that it is sufficient to merely hold off on reading or writing through raw pointers until the mutable reference ceases to exist. At the moment you dereference a raw pointer immutably no other mutable references should exist, and at the moment you dereference a raw pointer mutably no other references should exist at all, but a raw pointer should not be invalidated by mutable references.

@RalfJung
Copy link
Member Author

the assertion of uniqueness should only apply to references

It does only apply to references.

if you have a reference and a raw pointer working on the same thing, the reference is not unique. It's not the raw pointer that is causing issues here, it is the reference.

a raw pointer should not be invalidated by mutable references.

It would help if you could give a concrete self-contained example of something that is UB according to current Stacked Borrows, but which you think should be allowed.

Is it something like this:

fn main() {
    let mut local = 0;
    let raw = &mut local as *mut i32;
    
    let mut_ref = &mut local;
    *mut_ref = 13;
    
    let _val = unsafe { *raw }; // UB (as of Stacked Borrows 2.1).
}

@retep998
Copy link
Member

fn main() {
    let mut x = 0;
    let y = &mut x as *mut _;
    { let _z = &mut x; }
    unsafe { *y = 10; }
}

When run through miri:

error[E0080]: Miri evaluation error: no item granting write access to tag <untagged> found in borrow stack
 --> src/main.rs:5:14
  |
5 |     unsafe { *y = 10; }
  |              ^^^^^^^ Miri evaluation error: no item granting write access to tag <untagged> found in borrow stack
  |
  = note: inside call to `main` at /root/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libstd/rt.rs:64:34
  = note: inside call to closure at /root/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libstd/rt.rs:52:53
  = note: inside call to closure at /root/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libstd/panicking.rs:296:40
  = note: inside call to `std::panicking::try::do_call::<[closure@DefId(1:5891 ~ std[82ff]::rt[0]::lang_start_internal[0]::{{closure}}[0]) 0:&dyn std::ops::Fn() -> i32 + std::marker::Sync + std::panic::RefUnwindSafe], i32>` at /root/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libstd/panicking.rs:292:5
  = note: inside call to `std::panicking::try::<i32, [closure@DefId(1:5891 ~ std[82ff]::rt[0]::lang_start_internal[0]::{{closure}}[0]) 0:&dyn std::ops::Fn() -> i32 + std::marker::Sync + std::panic::RefUnwindSafe]>` at /root/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libstd/panic.rs:394:9
  = note: inside call to `std::panic::catch_unwind::<[closure@DefId(1:5891 ~ std[82ff]::rt[0]::lang_start_internal[0]::{{closure}}[0]) 0:&dyn std::ops::Fn() -> i32 + std::marker::Sync + std::panic::RefUnwindSafe], i32>` at /root/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libstd/rt.rs:52:25
  = note: inside call to `std::rt::lang_start_internal` at /root/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libstd/rt.rs:64:5
  = note: inside call to `std::rt::lang_start::<()>`

error: aborting due to previous error

I was led to believe that this error is by design due to stacked borrows. If it is just an issue with miri that will get fixed and this code example is actually fine then I would be so happy.

@RalfJung
Copy link
Member Author

@retep998 Right, so that is an example involving an unused mutable references. This is definitely forbidden by the current Stacked Borrows (it is not "just" a Miri implementation issue), but I could imagine something like "Tree-shaped borrows" that would open a design space to allow more things here. However, it is unclear how much that would cost in terms of lost optimizations. This issue is about collecting cases where unused mutable references caused UB. ("used"/"unused" is a fuzzy term here, but if the variable is dead immediately after it got created, like in your last example, that's definitely "unused".)

What I am wondering about is how you feel about used mutable references interacting with raw pointers, like in my example above.

@retep998
Copy link
Member

In your example above the used mutable reference still exists when the raw pointer is accessed, which is shaky ground. If we were to modify my example to actually use the mutable reference...

fn main() {
    let mut x = 0;
    let y = &mut x as *mut _;
    {
        let z = &mut x;
        *z = 20;
    }
    unsafe { *y = 10; }
}

then I would still want it to be okay and not UB. Raw pointers should not care if something else mutated the value in the time between accesses.

@taralx
Copy link

taralx commented Jul 24, 2019

So if I understand correctly, miri invalidates other children of x when &mut x is created. Is it possible to simply hold the raw children in abeyance, like one does with x itself? It is a bit surprising that creating z from y is ok, but creating z from x (which we know is the same as y) is not.

@comex
Copy link

comex commented Jul 24, 2019

@retep998 By itself your example is pretty harmless, but it's also useless in that nothing uses the stored value of 10. On the other hand, if we add a use of x...

fn main() {
    let mut x = 0;
    let y = &mut x as *mut _;
    {
        let z = &mut x;
        *z = 20;
    }
    unsafe { *y = 10; }
    println!("{}", x);
}

Then we have a problem, because what if the raw pointer stuff is hidden behind function calls?

fn foo(x: &mut i32){
    some_global = x as *mut _;
}

fn bar() {
    unsafe { *some_global = 10; }
}

fn main() {
    let mut x = 0;
    foo(&mut x);
    {
        let z = &mut x;
        *z = 20;
    }
    bar();
    println!("{}", x);
}

(Note that in reality foo and bar might be trait object methods or from external crates, so the compiler might not be able to inspect their contents.)

Under Stacked Borrows, the compiler can optimize main by changing println!("{}", x); to println!("{}", 20}. If this example is non-UB, it cannot.

In other words, from an implementation perspective, the issue is not how the compiler treats raw pointers, but what assumptions the compiler can make about unknown code.

That said, I agree that this is a footgun. I really wish we had something with miri's level of undefined behavior detection that also worked with FFI, at least to some extent.

@gnzlbg
Copy link
Contributor

gnzlbg commented Jul 25, 2019

I really wish we had something with miri's level of undefined behavior detection that also worked with FFI, at least to some extent.

I don't think we should design things under the assumption that something like this can exist.

@Lokathor
Copy link
Contributor

Lokathor commented Jul 25, 2019

i think that means that the foo/bar example has to be blindly assumed to always be the sort of case if you pass an address to unknown code, because some code does work that way.

@gnzlbg
Copy link
Contributor

gnzlbg commented Jul 25, 2019

In other words, from an implementation perspective, the issue is not how the compiler treats raw pointers, but what assumptions the compiler can make about unknown code.

While the aliasing model determines which optimizations Rust is allowed to do locally (without whole program analysis), it also determines what assumptions users can make about the behavior of unknown code. Can users reason about the code locally? E.g. if I modify this piece of code, I know I can do this and that, because these unknown functions are not allowed to do x and y without introducing UB. Or do users need to keep the whole program in their head to be able to reason about UB and about how to modify code without introducing UB ?

@RalfJung
Copy link
Member Author

RalfJung commented Jul 25, 2019

@retep998

In your example above the used mutable reference still exists when the raw pointer is accessed, which is shaky ground.

Now that is interesting. Stacked Borrows was specifically designed to not care about such cases of "existence" because I was sure that would result in a lot of backlash. ;) So I didn't expect some far on the "max freedom for unsafe code" end of the spectrum to consider this a reasonable restriction.
However, there's also the fact that even the borrow checker considers a locally created reference dead when it is definitely not used/copied/reborrowed any more. I don't think the dynamic model should be even more conservative that the borrow checker.

OTOH, references passed as function argument do have special treatment that keeps them "live" throughout the call (and the borrow checker does assume they live for the entire function).

(Btw let me say that I enjoy this exchange a lot, this is the kind of discussion I was hoping to have when I put out my proposals.)

If we were to modify my example to actually use the mutable reference... [...]
then I would still want it to be okay and not UB.

This is where it gets interesting. I consider this equivalent to my example as the only difference is in the scoping of some variables without drop glue.

Thankfully @comex already came up with a great example for why Stacked Borrows makes this example UB -- an example for what kind of optimizations this UB enables. In my view, the entire point of Stacked Borrows is to enable intraprocedural optimization based on alias information, so losing some of those is a big deal.

For the current version of Stacked Borrows, we have formal machine-checked proofs that some optimizations are possible (I'll share this in a blog post eventually). It does not seem obvious that even those optimizations are still possible in a model that permits as many programs as you are asking. But it might be possible. I will have to think about this more.

Raw pointers should not care if something else mutated the value in the time between accesses.

As stated above, the raw pointer does not care. It's the mutable reference that cares enough to make other pointers invalid, in order to enforce its own discipline.

@retep998 while I have your attention, can I probe your intuition another way? What if we change your example to use x directly instead of creating a new mutable reference and writing to that?

fn main() {
    let x = &mut 0; // changed x to also be a reference
    let y = &mut *x as *mut _;
    *x = 20; // instead of `{ let z = &mut *x; *z = 20; }`
    unsafe { *y = 10; }
    println!("{}", *x); // Does removing this change anything?
}

This program I feel very strongly should have UB, because between two uses of x (the write and the read) there is a conflicting access. If we do not make this UB, I think there is very little to nothing that we can optimize based on reference types.

But OTOH, my intuition also says it makes sense to consider *x = 20 and { let z = &mut *x; *z = 20; } as equivalent. At that point we are very close to your example, the only difference being the println! at the end.

So, I am wondering what your thoughts on this are.

@taralx

Is it possible to simply hold the raw children in abeyance, like one does with x itself?

Maybe. Stacked Borrows is but one point in a gigantic design space. But "simply" doesn't exactly describe the situation. ;)

It is a bit surprising that creating z from y is ok, but creating z from x (which we know is the same as y) is not.

However, here I think your intuition needs some refinement. x and y are not the same. The entire point of an aliasing model is to define which of multiple aliasing pointers you are actually allowed to use to do what in which order. That is, even if two pointers point to the same thing, they are not "the same".

This is somewhat unintuitive, but even C does this. Here is some literature: [1], [2], [3].

I don't think we should design things under the assumption that something like this can exist.

I think it could exist. It's just hard.
Stacked Borrows supports untagged pointers specifically for this case: all accesses done by FFI code would be considered as happening with an untagged pointer.

@gnzlbg

E.g. if I modify this piece of code, I know I can do this and that, because these unknown functions are not allowed to do x and y without introducing UB.

I am afraid I don't understand your question. Do you have an example for "this and that"?
Certainly the same local reasoning principles that the compiler uses, can also be used by the programmer.

@gnzlbg
Copy link
Contributor

gnzlbg commented Jul 25, 2019

I am afraid I don't understand your question. Do you have an example for "this and that"? Certainly the same local reasoning principles that the compiler uses, can also be used by the programmer.

It wasn't a question, more like a criticism that the discussion often focus on which optimizations the aliasing model allows, which programs it bans, how it makes the life of miri/etc. easier, without actually covering how does the aliasing model make it easier for users to reason about their code. I think it should focus more on the latter.

For example, in @comex second example here, if bar cannot write to some_global, then a user modifying / refactoring main can just assume it does not happen, and can actually refactor the whole function without inspecting the code of foo and bar. However, if that is allowed to happen, then a user would need to inspect foo and bar to be able to touch main.

Arguing that making X UB allows users to not have to inspect the source code of your whole program when modifying some function is a stronger argument for most users IMO than saying that it allows some compiler optimization.

@Lokathor
Copy link
Contributor

I agree. People are willing to set a lot of optimization on fire if it makes the universe easier to understand. See Also: all GC languages. The first evaluation of an aliasing model is if it lets a low to average skill programmer follow along with the implications of what they're doing. There are many other metrics too, but that needs to be the first target.

Which is not to say that Stacked Borrows, or any other model, is necessarily hard to follow, I just want to emphasize the importance of gnzlbg's point.

@danielhenrymantilla
Copy link
Contributor

danielhenrymantilla commented Jul 25, 2019

Minor nitpick that is nevertheless relevant to the "(human) reasoning about code" is that @comex's bar function ought to be unsafe: this makes it clearer that bar() may not be as innocent as it may look from the outside without knowing what it does, and that it should thus be moved with care.


But OTOH, my intuition also says it makes sense to consider *x = 20 and `{ let z = &mut *x; *z = 20; } as equivalent.

  • Another nitpick: it's x = 20, or *(&mut x) = 20 when being verbose

I think this equivalence is needed for reborrowing to make sense: one can always imagine that *(&mut x) = 20 goes through an ephemeral reborrow, something like "spurious reborrows" :D


I am on the side of a strict language with enabled optimizations, and an escape hatch for "raw stuff". In this case, Rust's escape hatch is UnsafeCell:

  • fn raw_mut_ref_from_unique<T : ?Sized> (x: &'_ mut T) -> &'_ UnsafeCell<T>
    {
        unsafe {
            // # Safety:
            //
            //   - `UnsafeCell` is `repr(transparent)`
            //
            //   - data-race-free mutation is now the responsibility of the programmer
            //     (requires `unsafe`)
            mem::transmute(x)
        }
    }
  • Contrived patterns such as @comex's would thus require starting from a & UnsafeCell reference instead of a &mut:

    static SOME_GLOBAL: AtomicPtr<i32> = AtomicPtr::new(0 as *mut _);
    
    fn foo (at_x: &'_ UnsafeCell<i32>)
    {
        SOME_GLOBAL.store(at_x.get(), Ordering::SeqCst);
    }
    
    /// # Safety:
    ///
    ///   - `SOME_GLOBAL` must point to a valid i32,
    ///     which must not be read or written to in parallel
    unsafe fn bar ()
    {
         *SOME_GLOBAL.load(Ordering::SeqCst) = 10;
    }
    
    fn main ()
    {
        let mut x = 0;
        {
            let at_x = raw_mut_ref_from_unique(&mut x);
            foo(at_x);
            {
                let z = at_x;
                unsafe {
                    // # Safety: no parallel reads or writes
                    *z.get() = 20;
                }
            }
            unsafe {
                // # Safety:
                //
                //   - the initial raw_mut_ref_from_unique has not been invalidated yet;
                //
                //   - no parallel reads or writes;
                bar();
            }
        } // SOME_GLOBAL is now invalid by the way
        println!("{}", x);
    }

@RalfJung
Copy link
Member Author

RalfJung commented Jul 25, 2019

@gnzlbg I agree. A memory model has two consumers: programmers that code against it, and compiler authors that implement it and optimize under it.

Arguing that making X UB allows users to not have to inspect the source code of your whole program when modifying some function is a stronger argument for most users IMO than saying that it allows some compiler optimization.

Interesting. That is not at all my experience from most discussions around UB, and I have gotten explicit pushback against arguing by "simplicity", but maybe that's not what you mean.

The argument is almost the same though: allowing compiler optimizations is about making program analyses easier, and that also makes it easier for humans to analyze. Though I suppose something like TBAA is rarely useful for a human analyzer, while Stacked Borrows provides some entirely intraprocedural reasoning principles that humans can learn just like compilers can. If I had infinite time, I'd write a blog post or two about those reasoning principles, but in lieu of that I can only point at this section of the Stacked Borrows 1.0 post.

@Lokathor

People are willing to set a lot of optimization on fire if it makes the universe easier to understand.

Not the kind of people that make languages like C and C++ though. Source: have you looked at those languages?^^ And people use them because they are fast, which they are because of all those complications.

See Also: all GC languages

Not really, those languages also statically prohibit many of the things you have to worry about in C/C++. I mean, safe Rust also is a no-brainer when it comes to Stacked Borrows.

@Ixrec
Copy link
Contributor

Ixrec commented Jul 25, 2019

FWIW, everything going on in these Stacked Borrows models and all the UCG discussions is several lightyears ahead of C/C++ in terms of "simplicity" (which for me means roughly "comprehensibility to non-wizards"). I've long since given up having any idea what C/C++ think UB really is, but for the stuff under discussion here, I'm fairly sure that the only reason I can't work through most of the examples myself is simply that all of this stuff is still in flux so I haven't tried to properly memorize the current model. Certainly the basic idea in the last several comments that "while there's a &mut x in scope, using aliasing raw pointers to mutate x is UB" seems perfectly natural and intuitive to me (whether or not we end up accepting that rule).

I also feel like the past discussions on this have typically been taking simplicity / the ability for users to reason about their code very seriously, to the point that pretty much every Stacked Borrows update either gives a very specific reason why a new complexity had to be introduced, or proclaimed some significant simplification of the model, or both. Though perhaps I'm over-focusing on Ralf's blog posts there.

But maybe the point of bringing up simplicity/reasonability here was to imply something like "the complexity cost of introducing tree-shaped borrows is too high to be worth making retep's original case non-UB". In which case I tentatively agree; we'd need to see some big optimization wins to justify that.

@taralx
Copy link

taralx commented Jul 28, 2019

FWIW I retract my suggestion on the basis of @comex's interprocedural example. 👍

@RalfJung
Copy link
Member Author

RalfJung commented Aug 2, 2019

Another instance of "mutable references created but never used breaks stuff": thomcc/rust-typed-arena#27.

@RalfJung RalfJung added the C-open-question Category: An open question that we should revisit label Aug 14, 2019
@goffrie
Copy link

goffrie commented Mar 27, 2020

I ran into this (or something similar) today when using https://github.com/Matthias247/futures-intrusive under miri. In particular, we have some data structure

struct SomeFuture {
    wait_node: ListNode,
    // ...
}

where ListNode is an intrusive linked list node. We stash pointers to wait_node in some external structure. But then if we ever create a mutable reference to SomeFuture, those pointers are invalidated.

I was hoping that wrapping the ListNode in an UnsafeCell was enough, but it isn't - miri sees through the UnsafeCell.

It seems to me that this destroys any possibility of making "encapsulated" intrusive data structures, since we can't stop the user from taking mutable reference to the structure containing the list node - even if that mutable reference will never be used to mutate the node in question.

@saethlin
Copy link
Member

saethlin commented Jan 30, 2022

I am once again running Miri on a lot of crates. It seems that breaking this rule specifically with Box is rather common. A user on the community discord once pointed out that it really seems like Box is the tool to get address stability, but under SB 2.1 that feels like a bait-and-switch. You get address stability until you try to move the Box itself. Over time I'm coming to agree more and more with this perspective. I think I see the alternative perspective, that Box<T> owns the T so of course it gets Unique permission. But I think if that's the case, we should provide a way for people to get the tool they want: An RAII heap allocation that can be aliased. And provide strong lints to force people to migrate.

string_cache: https://github.com/servo/string-cache/blob/72d9737baef70e1c3afb684a77d09cbad2353065/src/dynamic_set.rs#L76-L83

        let mut entry = Box::new(Entry {
            next_in_bucket: self.buckets[bucket_index].take(),
            hash,
            ref_count: AtomicIsize::new(1),
            string: string.into_boxed_str(),
        });
        let ptr = NonNull::from(&mut *entry);
        self.buckets[bucket_index] = Some(entry);

lru: https://github.com/jeromefroe/lru-rs/blob/934f17ee0b744525176e5d2ba8cf5ce92eac97a3/src/lib.rs#L311-L340

                let node = /* snip */
                } else {
                    // if the cache is not full allocate a new LruEntry
                    Box::new(LruEntry::new(k, v))
                };

                let node_ptr: *mut LruEntry<K, V> = &mut *node;
                self.attach(node_ptr);

                let keyref = unsafe { (*node_ptr).key.as_ptr() };
                self.map.insert(KeyRef { k: keyref }, node);

owning_ref https://github.com/Kimundi/owning-ref-rs/blob/677c5bbe33210e65ababfd46eeb2b7aaab41f94d/src/lib.rs#L326-L333

    pub unsafe fn new_assert_stable_address(o: O) -> Self
        where O: Deref<Target = T>,
    {
        OwningRef {
            reference: &*o,
            owner: o,
        }
    }

@saethlin
Copy link
Member

saethlin commented Jan 30, 2022

Also, it's unclear to me if the aforementioned code is already UB in practice due to the way rustc applies LLVM's noalias. If someone knows better than I, or has some kind of way to demonstrate this, or has some kind of checking interpreter for LLVM IR, that would be awesome.

@danielhenrymantilla
Copy link
Contributor

danielhenrymantilla commented Jan 30, 2022

Such a pointer is already featured in the ::aliasable crate; I think it currently "just" needs more publicity.

Related to it, and to owning-ref's UB, I wrote a pretty lengthy post to teach and raise awareness about these things: https://users.rust-lang.org/t/how-can-we-teach-people-about-self-referential-types/65362/9?u=yandros#warning-careful-with-unsafe-warning-1

@comex
Copy link

comex commented Jan 30, 2022

Also, it's unclear to me if the aforementioned code is already UB in practice due to the way rustc applies LLVM's noalias

Yes, it is. This code succeeds in debug mode but panics in release mode:

#[inline(never)]
fn noalias_test(mut a: Box<i32>, b: *mut i32) -> i32 {
    *a = 4;
    unsafe { *b = 5; }
    *a // <- LLVM assumes this must be 4
}

fn main() {
    let mut a = Box::new(10);
    let b = &mut *a as *mut i32;
    assert_eq!(noalias_test(a, b), 5);
}

I haven't heard of a checking interpreter for LLVM IR, but this kind of "write one thing, write another thing, read back the first thing" pattern is a good way to test aliasing assumptions in general.

@saethlin
Copy link
Member

That's a slick demo, but I'm not totally sure it applies to the particular examples I linked. Particularly for a string interner or entries in an LRU cache, there shouldn't be any mutation. If there were mutation through the pointer I think that code might not apply to this issue in particular, because asserting uniqueness at mutation would just be UB through another route (I'd say later, but time travel).

I can't think up with any way that a surprising optimization would appear in the absence of mutation, but the LangRef's section on noalias only mentions access instead of specifically calling out mutation so I'm not sure of much here.

@comex
Copy link

comex commented Jan 30, 2022

The LangRef does call out mutation:

noalias
This indicates that memory locations accessed via pointer values based on the argument or return value are not also accessed, during the execution of the function, via pointer values not based on the argument or return value. This guarantee only holds for memory locations that are modified, by any means, during the execution of the function. [..]

Emphasis added. Also, noalias was originally designed to implement C restrict, which has a similar rule:

if some object that is accessible through P (directly or indirectly) is modified, by any means, then all accesses to that object (both reads and writes) in that block must occur through P (directly or indirectly), otherwise the behavior is undefined

(source)

So if there really is no mutation, then you're safe from LLVM's perspective; perhaps that's an indication that Stacked Borrows should be loosened in that case.

That said:

I was able to translate my example to use owning_ref in combination with interior mutability:

use std::cell::Cell;
use owning_ref::BoxRef;
#[inline(never)]
fn noalias_test(x: BoxRef<Cell<i32>>) -> i32 {
    x.as_owner().set(4);
    x.set(5);
    x.as_owner().get()
}
fn main() {
    let x: BoxRef<Cell<i32>> = BoxRef::new(Box::new(Cell::new(10)));
    assert_eq!(noalias_test(x), 5);
}

(Incidentally, owning_ref unfortunately has multiple other soundness issues even without compiler optimizations... [1] [2])

As for lru, I believe there is mutation involved. The Box points to an LruEntry<K, V> which contains prev and next pointers, forming a doubly linked list; these pointers will be mutated when adjacent entries are added or removed. It's also possible that the key or value could have interior mutability. Similarly, in string-cache, the next_in_bucket field is used to form a singly linked list, and an entry's next_in_bucket field will be mutated if the entry following it is removed.

That said, I suspect it would be difficult to actually trigger a miscompilation in either of those two crates.

@saethlin
Copy link
Member

Big thanks for correcting me so thoroughly! I've run into a subtle case with interior mutability before, maybe sprinkling Cell around should just be part of looking into these findings.

bors-servo added a commit to servo/rust-smallvec that referenced this issue Feb 1, 2022
Fix all problems encounted with Miri -Ztag-raw-pointers

I poked at this crate with `-Zmiri-tag-raw-pointers` before, and I was unable to fix what I found (I just added a test case that ruled out one of my wrong ideas #271). I tried again just now and I guess I just understand better this time.

This PR fixes 3 separate pointer invalidation problems, which are detected by running `MIRIFLAGS=-Zmiri-tag-raw-pointers cargo miri test`.

Depending on how you squint, 2 or 3 of these are rust-lang/unsafe-code-guidelines#133. The last one is _probably_ still present even with late invalidation, because `set_len` does a write through a `&mut`.

It's unclear to me if any of these things that Miri complains about are potentially a miscompilation in rustc due to the use of LLVM `noalias`. But perhaps given how subtle this codebase is overall, it would be best to run the tools on their pickiest settings, even if there are a few things more like a false positive than a real problem.
bors-servo added a commit to servo/rust-smallvec that referenced this issue Feb 7, 2022
Fix all problems encounted with Miri -Ztag-raw-pointers

I poked at this crate with `-Zmiri-tag-raw-pointers` before, and I was unable to fix what I found (I just added a test case that ruled out one of my wrong ideas #271). I tried again just now and I guess I just understand better this time.

This PR fixes 3 separate pointer invalidation problems, which are detected by running `MIRIFLAGS=-Zmiri-tag-raw-pointers cargo miri test`.

Depending on how you squint, 2 or 3 of these are rust-lang/unsafe-code-guidelines#133. The last one is _probably_ still present even with late invalidation, because `set_len` does a write through a `&mut`.

It's unclear to me if any of these things that Miri complains about are potentially a miscompilation in rustc due to the use of LLVM `noalias`. But perhaps given how subtle this codebase is overall, it would be best to run the tools on their pickiest settings, even if there are a few things more like a false positive than a real problem.
nico-abram added a commit to nico-abram/utils that referenced this issue Mar 30, 2022
See rust-lang/unsafe-code-guidelines#133
Calling as_mut_ptr creates a unique reference, and as_ptr a shared
reference. These 2 conflict, and one invalidates the other. Both
ptr need to be reborrows of or be basically the same pointer.
newpavlov pushed a commit to RustCrypto/utils that referenced this issue Mar 30, 2022
See rust-lang/unsafe-code-guidelines#133
Calling as_mut_ptr creates a unique reference, and as_ptr a shared
reference. These 2 conflict, and one invalidates the other. Both
ptr need to be reborrows of or be basically the same pointer.
@JakobDegen JakobDegen added the S-pending-design Status: Resolving this issue requires addressing some open design questions label Aug 1, 2023
@JakobDegen
Copy link
Contributor

Visiting for UCG. Issue is mostly fine, but editing some Box-related questions out of the main issue in an attempt to keep things managable

@RalfJung RalfJung added the A-SB-vs-TB Topic: Design questions where SB and TB are opposite sides of the design axis label Apr 1, 2024
@RalfJung
Copy link
Member Author

RalfJung commented Apr 1, 2024

FWIW, Tree Borrows does things different here -- mutable references are two-phase, and uniqueness is only asserted when they are written to. That seems to be a fairly elegant solution, though it also has its downsides.

@RalfJung
Copy link
Member Author

And of course now people are asking for optimizations that are only correct if we do aggressively assert uniqueness early. ;)

rust-lang/rust#124346

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-aliasing-model Topic: Related to the aliasing model (e.g. Stacked/Tree Borrows) A-SB-vs-TB Topic: Design questions where SB and TB are opposite sides of the design axis C-open-question Category: An open question that we should revisit S-pending-design Status: Resolving this issue requires addressing some open design questions
Projects
None yet
Development

No branches or pull requests