Use intrinsics::debug_assertions in debug_assert_nounwind #120863

Merged
merged 2 commits into from
Feb 20, 2024

Conversation

saethlin
Member

@saethlin saethlin commented Feb 10, 2024

This is the first item in #120848.

Based on the benchmarking in this PR, it looks like, for the programs in our benchmark suite, enabling all these additional checks does not introduce significant compile-time overhead, with the single exception of Alignment::new_unchecked. Therefore, I've added #[cfg(debug_assertions)] to that one call site, so that it remains compiled out in the distributed standard library.
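To make "compiled out" concrete, here is a minimal sketch of a check gated with `#[cfg(debug_assertions)]`. The function names `check_alignment` and `new_alignment` are hypothetical, and this is not the standard library's implementation of `Alignment::new_unchecked`. It only illustrates that a `#[cfg]`-gated item does not exist in a build without debug assertions, so a library compiled that way contains no code for the check.

```rust
// Hypothetical stand-in for a precondition check; not the real
// `Alignment::new_unchecked` implementation.
#[cfg(debug_assertions)]
fn check_alignment(align: usize) {
    // Only compiled when this crate is built with debug assertions.
    assert!(align.is_power_of_two(), "alignment must be a power of two");
}

#[cfg(not(debug_assertions))]
fn check_alignment(_align: usize) {
    // Built without debug assertions: the check is compiled out entirely.
}

/// Returns `align` unchanged after running whatever check this build contains.
fn new_alignment(align: usize) -> usize {
    check_alignment(align);
    align
}

fn main() {
    // 8 is a power of two, so this passes in every configuration.
    println!("alignment = {}", new_alignment(8));
    // `cfg!` reports which configuration this crate was compiled with.
    println!("debug assertions enabled: {}", cfg!(debug_assertions));
}
```

Because the decision is made when the crate containing the check is compiled, a precompiled library built without debug assertions never runs the check, regardless of how the downstream crate is built.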

The trailing commas in the previous calls to debug_assert_nounwind! were causing the macro to expand to panic_nounwind_fmt, which requires more work to set up its arguments, and that overhead alone is measured between this perf run and the next: #120863 (comment)
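The trailing-comma effect can be reproduced with a toy macro. This is a sketch under assumptions: `which_arm!` and its two arms are invented for illustration and do not reproduce the earlier definition of `debug_assert_nounwind!`, which is not shown in this thread. It demonstrates only the general `macro_rules!` behavior: if the first arm does not accept a trailing comma, a call with one falls through to the arm that takes a message.

```rust
// Toy macro with two arms; the names and return values are illustrative only
// and do not reproduce the real `debug_assert_nounwind!` definition.
macro_rules! which_arm {
    // Condition only; this arm does not accept a trailing comma.
    ($cond:expr) => {
        "plain"
    };
    // Condition, a comma, then zero or more message tokens.
    ($cond:expr, $($arg:tt)*) => {
        "fmt"
    };
}

fn without_trailing_comma() -> &'static str {
    which_arm!(1 + 1 == 2)
}

fn with_trailing_comma() -> &'static str {
    // The lone comma cannot match the first arm, so the second arm is chosen
    // even though no message was supplied.
    which_arm!(1 + 1 == 2,)
}

fn main() {
    println!("{}", without_trailing_comma()); // prints "plain"
    println!("{}", with_trailing_comma()); // prints "fmt"
}
```

Writing the first arm as `($cond:expr $(,)?)`, as in the macro snippet quoted in the review further down, lets it accept the optional trailing comma so such calls stay on the cheaper path.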

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Feb 10, 2024
@saethlin
Member Author

@bors try @rust-timer queue


@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Feb 10, 2024
@bors
Contributor

bors commented Feb 10, 2024

⌛ Trying commit a715c46 with merge 8389ea2...

bors added a commit to rust-lang-ci/rust that referenced this pull request Feb 10, 2024
Improve precondition checks for unchecked slice indexing

r? `@ghost`
@bors
Contributor

bors commented Feb 10, 2024

☀️ Try build successful - checks-actions
Build commit: 8389ea2 (8389ea2f75983e82925f2ddecb240a9712fc052d)


@rust-timer
Collaborator

Finished benchmarking commit (8389ea2): comparison URL.

Overall result: ✅ improvements - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -2.1% | [-2.1%, -2.1%] | 1 |
| All ❌✅ (primary) | - | - | 0 |

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 7.1% | [0.2%, 16.9%] | 3 |
| Regressions ❌ (secondary) | 3.9% | [3.9%, 3.9%] | 1 |
| Improvements ✅ (primary) | -3.3% | [-4.7%, -2.0%] | 2 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 3.0% | [-4.7%, 16.9%] | 5 |

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 4.8% | [4.7%, 5.0%] | 2 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 4.8% | [4.7%, 5.0%] | 2 |

Binary size

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.2% | [0.0%, 0.5%] | 33 |
| Regressions ❌ (secondary) | 0.0% | [0.0%, 0.1%] | 7 |
| Improvements ✅ (primary) | -0.1% | [-0.6%, -0.0%] | 27 |
| Improvements ✅ (secondary) | -0.0% | [-0.0%, -0.0%] | 3 |
| All ❌✅ (primary) | 0.0% | [-0.6%, 0.5%] | 60 |

Bootstrap: 666.679s -> 665.456s (-0.18%)
Artifact size: 308.00 MiB -> 307.97 MiB (-0.01%)

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Feb 10, 2024

@saethlin
Member Author

@bors try @rust-timer queue


@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Feb 11, 2024
@bors
Contributor

bors commented Feb 11, 2024

⌛ Trying commit 9c1c07b with merge f759bae...

bors added a commit to rust-lang-ci/rust that referenced this pull request Feb 11, 2024
Improve precondition checks for unchecked slice indexing

r? `@ghost`
@bors
Contributor

bors commented Feb 11, 2024

☀️ Try build successful - checks-actions
Build commit: f759bae (f759baeb67d6de14ca220ef2e0f359b7ae95b0ac)


@rust-timer
Collaborator

Finished benchmarking commit (f759bae): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.9% | [0.3%, 2.5%] | 27 |
| Regressions ❌ (secondary) | 1.6% | [0.4%, 4.3%] | 13 |
| Improvements ✅ (primary) | -1.1% | [-1.4%, -0.8%] | 2 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 0.8% | [-1.4%, 2.5%] | 29 |

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 5.0% | [0.7%, 9.2%] | 4 |
| Regressions ❌ (secondary) | 4.3% | [4.3%, 4.3%] | 1 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -3.9% | [-3.9%, -3.9%] | 1 |
| All ❌✅ (primary) | 5.0% | [0.7%, 9.2%] | 4 |

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 1.6% | [0.7%, 2.9%] | 14 |
| Regressions ❌ (secondary) | 3.1% | [2.2%, 4.5%] | 5 |
| Improvements ✅ (primary) | -1.6% | [-1.6%, -1.6%] | 1 |
| Improvements ✅ (secondary) | -2.1% | [-2.1%, -2.1%] | 1 |
| All ❌✅ (primary) | 1.3% | [-1.6%, 2.9%] | 15 |

Binary size

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.4% | [0.0%, 0.9%] | 96 |
| Regressions ❌ (secondary) | 0.8% | [0.0%, 4.9%] | 55 |
| Improvements ✅ (primary) | -0.2% | [-1.4%, -0.1%] | 19 |
| Improvements ✅ (secondary) | -0.1% | [-0.2%, -0.0%] | 4 |
| All ❌✅ (primary) | 0.3% | [-1.4%, 0.9%] | 115 |

Bootstrap: 666.058s -> 667.485s (0.21%)
Artifact size: 308.32 MiB -> 308.14 MiB (-0.06%)

@rustbot rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Feb 11, 2024
@saethlin saethlin force-pushed the slice-get-checked branch 2 times, most recently from d3a5063 to 8561678 Compare February 12, 2024 00:44
@saethlin
Member Author

@bors try @rust-timer queue


@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Feb 12, 2024
@bors
Contributor

bors commented Feb 12, 2024

⌛ Trying commit 8561678 with merge cdb40e7...

@bors
Contributor

bors commented Feb 20, 2024

📌 Commit 4a12f82 has been approved by the8472

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Feb 20, 2024
@bors
Contributor

bors commented Feb 20, 2024

⌛ Testing commit 4a12f82 with merge 2b43e75...

@bors
Contributor

bors commented Feb 20, 2024

☀️ Test successful - checks-actions
Approved by: the8472
Pushing 2b43e75 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Feb 20, 2024
@bors bors merged commit 2b43e75 into rust-lang:master Feb 20, 2024
12 checks passed
@rustbot rustbot added this to the 1.78.0 milestone Feb 20, 2024
@saethlin saethlin deleted the slice-get-checked branch February 20, 2024 16:27
@rust-timer
Collaborator

Finished benchmarking commit (2b43e75): comparison URL.

Overall result: ❌ regressions - ACTION NEEDED

Next Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please open an issue or create a new PR that fixes the regressions, add a comment linking to the newly created issue or PR, and then add the perf-regression-triaged label to this PR.

@rustbot label: +perf-regression
cc @rust-lang/wg-compiler-performance

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.8% | [0.3%, 1.7%] | 13 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -0.4% | [-0.4%, -0.4%] | 1 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 0.7% | [-0.4%, 1.7%] | 14 |

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 4.1% | [0.3%, 11.1%] | 5 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -4.3% | [-5.9%, -2.1%] | 5 |
| Improvements ✅ (secondary) | -2.7% | [-3.7%, -2.0%] | 5 |
| All ❌✅ (primary) | -0.1% | [-5.9%, 11.1%] | 10 |

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 1.4% | [1.0%, 1.7%] | 3 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 1.4% | [1.0%, 1.7%] | 3 |

Binary size

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.2% | [0.0%, 0.9%] | 54 |
| Regressions ❌ (secondary) | 0.1% | [0.0%, 1.8%] | 35 |
| Improvements ✅ (primary) | -0.2% | [-0.5%, -0.0%] | 15 |
| Improvements ✅ (secondary) | -0.1% | [-0.2%, -0.0%] | 16 |
| All ❌✅ (primary) | 0.1% | [-0.5%, 0.9%] | 69 |

Bootstrap: 640.822s -> 641.255s (0.07%)
Artifact size: 308.66 MiB -> 308.58 MiB (-0.02%)

@rustbot rustbot added the perf-regression Performance regression. label Feb 20, 2024
@saethlin
Member Author

Not what we saw before. I'll look into this later.

bors added a commit to rust-lang-ci/rust that referenced this pull request Feb 20, 2024
Tweak inlining attributes for slice indexing

Doing some experiments in response to this unexpected regression: rust-lang#120863 (comment)

I expect the opt changes to be addressed by something like reviving rust-lang#91222. The debug changes are what I'm interested in.

Codegen tests will probably fail from time to time in this PR, I will fix them up later but also I don't trust the opt-level-z one: rust-lang#119878 (comment)

r? `@ghost`
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Feb 24, 2024
…iler-errors

Ignore less tests in debug builds

Since rust-lang#120594 and rust-lang#120863, nearly all UB-detecting debug assertions get compiled out of code that is monomorphized by a crate built with debug assertions disabled.

Which means that if we default all our codegen tests to `-Cdebug-assertions=no`, most of them work just fine against a sysroot built with debug assertions.

I also tried to explain a bit better why some tests need to be skipped, for those that still need to be skipped.
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request Feb 24, 2024
Rollup merge of rust-lang#121531 - saethlin:ignore-less-debug, r=compiler-errors

#[rustc_macro_transparency = "semitransparent"]
pub macro debug_assert_nounwind {
    ($cond:expr $(,)?) => {
        if $crate::cfg!(debug_assertions) {
        if $crate::intrinsics::debug_assertions() {
Member

@RalfJung RalfJung Feb 25, 2024


I think this violates the contract for intrinsics::debug_assertions. We said this should only be used if there is UB anyway (meaning: language UB) when the condition is violated, but debug_assert_nounwind is used in places where there is only library UB.

EDIT: #121583 adds some FIXMEs.

bors pushed a commit to rust-lang/miri that referenced this pull request Feb 25, 2024
@rylev
Member

rylev commented Feb 27, 2024

@saethlin I'm doing perf triage - it seems the perf regression here hasn't yet been addressed. Looking at the profiles, it seems that most of the regressions are in codegen which I suppose makes sense given the nature of the change. Would you be able to either open an issue or a PR addressing the perf concerns in some form or perhaps argue for why the regression is acceptable at the current time?

@saethlin
Member Author

@rylev The perf run in this PR that lacks the inline(always) attributes (which are required to make the opt-level=z codegen test pass) doesn't show these regressions: #120863 (comment). So I created a PR, linked above, to demonstrate that inline(always) is causing the regressions: #121369 (comment). Based on my investigation of the codegen test in question, I don't think it actually tests what it is supposed to: #119878 (comment). I was also shocked to learn that inline(always) is honored when optimizations are not enabled, so I created a PR to investigate the effect of removing that property: #121417 (comment), but that PR probably can't make progress for a long time.

But most of the code in this PR is going to be overwritten by #121662. The primary cause of compile-time overhead in this PR is that the checks here are never outlined, which is how I mitigated the compile-time overhead of the first round of checks I added in #120594. That will be fixed by #121662.

All PRs with these checks as they are currently written interact strongly with tweaks to codegen like #121421 and #120650 because these checks insert IR patterns like br i1 {true,false}, as well as inducing creation of post-mono goto chains which are normally relatively rare in codegen because we have a MIR optimization to remove them.


In terms of "what to do next", I would like #121662 to land because it significantly changes what IR we generate. Then I intend to create a perf experiment PR that cfg's out all these checks to measure their compile-time impact. This whole feature has landed in relatively small parts across somewhere between 5 and 7 PRs, and I'm not sure their perf impact is separable.

Also last night I came up with an idea for how to fix the impact that these checks have on MIR inlining, which might fix all the compile-time overhead in the opt benchmarks.
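For readers unfamiliar with the term, "outlining" a check means moving its body into a separate function that is not inlined, so each call site holds a call rather than a copy of the branch and panic path. The sketch below is an illustration under assumptions and not the code from #120594 or #121662. `precondition_check` and `get_after_check` are hypothetical names, and the stable `cfg!(debug_assertions)` macro stands in for the unstable intrinsic discussed in this PR.

```rust
/// Outlined check: `#[inline(never)]` asks the compiler to keep one copy of
/// the comparison and panic path instead of duplicating it in every caller.
#[inline(never)]
fn precondition_check(index: usize, len: usize) {
    if index >= len {
        panic!("index {index} out of bounds for length {len}");
    }
}

/// Hypothetical accessor that validates its precondition before an unchecked read.
///
/// # Safety
/// `index` must be less than `slice.len()`.
unsafe fn get_after_check(slice: &[u32], index: usize) -> u32 {
    // Stand-in condition: the check runs only when this crate is built with
    // debug assertions enabled.
    if cfg!(debug_assertions) {
        precondition_check(index, slice.len());
    }
    // SAFETY: the caller guarantees `index < slice.len()`.
    unsafe { *slice.get_unchecked(index) }
}

fn main() {
    let data = [10, 20, 30];
    // SAFETY: 1 is a valid index into a three-element slice.
    let value = unsafe { get_after_check(&data, 1) };
    println!("{value}");
}
```

The trade-off in this sketch is an extra call at each site in exchange for less code for the compiler to process per caller.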

adpaco-aws added a commit to model-checking/kani that referenced this pull request Feb 29, 2024
Upgrades the Rust toolchain to `nightly-2024-02-25`. The Rust compiler
PRs that triggered changes in this upgrade are:
 * rust-lang/rust#121209
 * rust-lang/rust#121309
 * rust-lang/rust#120863
 * rust-lang/rust#117772
 * rust-lang/rust#117658

With rust-lang/rust#121309 some intrinsics
became inlineable so their names became qualified. This caused our `match`
on the intrinsic name to fail in those cases, leaving them as
unsupported constructs as in this example:

```
warning: Found the following unsupported constructs:
             - _RNvNtCscyGW2MM2t5j_4core10intrinsics8unlikelyCs1eohKeNmpdS_5arith (3)
             - caller_location (1)
             - foreign function (1)
         
         Verification will fail if one or more of these constructs is reachable.
         See https://model-checking.github.io/kani/rust-feature-support.html for more details.

[...]

Failed Checks: _RNvNtCscyGW2MM2t5j_4core10intrinsics8unlikelyCs1eohKeNmpdS_5arith is not currently supported by Kani. Please post your example at https://github.com/model-checking/kani/issues/new/choose
 File: "/home/ubuntu/.rustup/toolchains/nightly-2024-02-18-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/num/mod.rs", line 25, in core::num::<impl i8>::checked_add
```

We use `trimmed_name()` to work around this, but that may include type
arguments if the intrinsic is defined on generics. So in those cases, we
just take the first part of the name so we can keep the rest as before.

Resolves #3044
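The name-trimming workaround described in the commit message above can be sketched as follows. This is an assumption-laden illustration, not Kani's code: `intrinsic_base_name` is a hypothetical helper, and the `name::<Type>` shape used for names that carry type arguments is assumed for the example, since the commit message does not show the actual format.

```rust
/// Hypothetical helper: keep only the part of an intrinsic name that precedes
/// any generic arguments, assuming they are written as `name::<Type>`.
fn intrinsic_base_name(name: &str) -> &str {
    match name.find("::<") {
        // Drop everything from the start of the generic argument list.
        Some(pos) => &name[..pos],
        // No generic arguments: the name is already the base name.
        None => name,
    }
}

fn main() {
    println!("{}", intrinsic_base_name("size_of::<u8>"));
    println!("{}", intrinsic_base_name("unlikely"));
}
```

Matching on the base name keeps an existing `match` over intrinsic names working whether or not type arguments are present.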
lnicola pushed a commit to lnicola/rust-analyzer that referenced this pull request Apr 7, 2024
RalfJung pushed a commit to RalfJung/rust-analyzer that referenced this pull request Apr 27, 2024
Labels
merged-by-bors This PR was explicitly merged by bors. perf-regression Performance regression. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs Relevant to the library team, which will review and decide on the PR/issue.

8 participants