Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

noalias is not enough #53105

Open
gnzlbg opened this issue Aug 6, 2018 · 12 comments
Open

noalias is not enough #53105

gnzlbg opened this issue Aug 6, 2018 · 12 comments
Labels
A-codegen Area: Code generation A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-enhancement Category: An issue proposing an enhancement or a PR with one. C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@gnzlbg
Copy link
Contributor

gnzlbg commented Aug 6, 2018

Somebody on the internet (https://blog.dend.ro/rust-and-the-case-of-the-redundant-comparison/) complained that something like this:

fn vec_clear(x: &mut i32) {
    if *x != 0 {
        *x = 0;
    }
}

generates a conditional store:

    cmpl    $0, (%rdi)
    je      .LBB0_2
    movl    $0, (%rdi)
.LBB0_2:  
    retq

on x86_64 instead of just an unconditional store movl $0, (%rdi); retq.

Taking a look at the optimized LLVM-IR:

define void @vec_clear(i32* noalias nocapture dereferenceable(4) %x) {
start:
  %0 = load i32, i32* %x, align 4
  %1 = icmp eq i32 %0, 0
  br i1 %1, label %bb2, label %bb1

bb1:
  store i32 0, i32* %x, align 4
  br label %bb2

bb2:
  ret void
}

shows the issue.

The LLVM-IR generated by rustc is loosing critical information. It marks i32* as noalias, which means, that no other pointers in vec_clear's scope will alias it. However, outside vec_clear scope, other pointers are allowed to alias that memory. That is, if *x is zero, other threads could be concurrently reading the memory and if LLVM would generate an unconditional store here, that would introduce a data-race, which means that this optimization is not safe on the LLVM-IR generated by rustc. OTOH, &mut i32` means that the pointer has unique access to the memory, that is, no other pointer can access the memory behind it as long as that pointer is alive. Therefore, transforming the code to an unconditional store does not introduce a data-race.

Therefore, I think that noalias is not enough to perform this optimization and that we would need something stronger for LLVM to be able to perform it.


This also shows that &mut T is stronger than C's restrict keyword.

@RalfJung
Copy link
Member

RalfJung commented Aug 6, 2018

That is, if *x is zero, other threads could be concurrently reading the memory and if LLVM would generate an unconditional store here, that would introduce a data-race, which means that this optimization is not safe on the LLVM-IR generated by rustc.

Ah, good point. Read-write data races "just" make the read yield undef, but even that would clearly be a misoptimization.

OTOH, &mut i32 means that the pointer has unique access to the memory, that is, no other pointer can access the memory behind it as long as that pointer is alive. Therefore, transforming the code to an unconditional store does not introduce a data-race.

Correct. AFAIK, noalias was never meant to express the full set of properties. It's just the strongest thing LLVM provides.

This also shows that &mut T is stronger than C's restrict keyword.

Oh yes, it is very much stronger in various ways.

@varkor
Copy link
Member

varkor commented Aug 6, 2018

Isn't this the sort of thing the noalias and alias.scopes metadata (#16515) allows one to express?

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Aug 6, 2018

@varkor what would be the scopes for the load and stores in the example?

@varkor
Copy link
Member

varkor commented Aug 6, 2018

In this example, as you point out, the aliasing is important with regards to memory accesses outside the function. So if in theory you could mark all the others... I doubt that's sufficient for LLVM though.

@leonardo-m
Copy link

Is it a good idea to write a LLVM enhancement request?

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Aug 9, 2018

As @varkor says, we could mark all others, and we would have to mark all others for every &mut that the programs creates, and even then, this is not something that alias analysis would take into account because no sane language front-end will do this.

Extending LLVM to support this won't be easy either. Currently LLVM hoists memory ops from functions when profitable, but:

// T: Copy
fn foo(x: &mut T) {  
   // in this scope there is only one 
   // pointer to the value behind x
   let y = *x; 
    ... 
}
{
   let mut z = T;
   let ptr = &mut z as *mut T;
   // hoist the load from foo out here
   foo(&mut z);     
   *ptr = T;
}

so when hoisting the load (or store) from foo to the outer scope, the "invariant" that that's the only pointer to the data doesn't hold any more, because in the outer scope there might be other pointers to the data.

So all the optimizations that currently move memory across scope would need to update and be extremely careful with any attribute/metadata that we might want to use.

Maybe a minimal extension to alias analysis that allow us to specify the "opposite" / "negative" aliasing groups would be enough, but one would need to teach many pieces of the pipeline about this for the new information to result in better code gen.

@steveklabnik
Copy link
Member

Triage: no idea what the current status of this is, to be honest. I imagine that this was never suggested upstream.

@jonas-schievink jonas-schievink added I-slow Issue: Problems and improvements with respect to performance of generated code. A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-enhancement Category: An issue proposing an enhancement or a PR with one. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Sep 25, 2020
@RalfJung
Copy link
Member

Therefore, I think that noalias is not enough to perform this optimization and that we would need something stronger for LLVM to be able to perform it.

If noalias really means "does not alias for the duration of this function call", then I think in fact it would be enough. Unfortunately, noalias scoping in LLVM is basically undocumented, and ti is also buggy so one cannot just go from the implementation.

However, a new round of noalias/restrict patches has been landing recently (https://lists.llvm.org/pipermail/llvm-dev/2019-March/131127.html, https://reviews.llvm.org/D69542#change-veawD9rpruA2), so maybe that new infrastructure is powerful enough to express the desired guarantee here.

FWIW, Stacked Borrows does allow the optimization.

@workingjubilee workingjubilee added the C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such label Oct 8, 2023
@gonzalobg
Copy link

gonzalobg commented Sep 24, 2024

If noalias really means "does not alias for the duration of this function call", then I think in fact it would be enough.

It seems noalias means "If there is a modification then does not alias for the duration of this function call". In this example, there is an execution in which there is no modification, so we can't do the optimization.

EDIT: Agreed.

@nikic
Copy link
Contributor

nikic commented Sep 24, 2024

This is a duplicate of all the other "is spurious store on &mut valid?" issues, like #124346 and #114886. LLVM provides a way to convey this nowadays (writable), but Rust is leaning towards not considering this a valid optimization anymore.

@felix91gr
Copy link
Contributor

This might be a useful link for those interested: LLVM Alias Analysis Technical Call - Notes.

The document is long. For the older entries I have done some cleanup and formatting to help with reading. I hope that helps, since I'm not really capable to help with the noalias issue directly.


@nikic: one sec, are you the Nikita mentioned in the Meeting Notes? I should perhaps add your name to the meetings where you participated.


This is a duplicate of all the other "is spurious store on &mut valid?" issues. LLVM provides a way to convey this nowadays (writable), but Rust is leaning towards not considering this a valid optimization anymore.

What optimization do you mean? The LLVM alias analysis passes? :o

I thought that might eventually happen (and it makes sense, I think), but I'm asking just in case that's not what you meant.

@RalfJung
Copy link
Member

RalfJung commented Sep 24, 2024

What optimization do you mean? The LLVM alias analysis passes?

No, specifically inserting spurious writes -- having a function that originally might not have written to a given location at all, perform a write to that location.

Stacked Borrows allows that optimization, Tree Borrows does not. This is one of the main reasons why a lot of real-world code that has UB under Stacked Borrows is fine under Tree Borrows.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-codegen Area: Code generation A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-enhancement Category: An issue proposing an enhancement or a PR with one. C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests