-
Notifications
You must be signed in to change notification settings - Fork 12.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Insufficient inlining of PartialEq::eq #59375
Comments
Unconditionally putting Is there a reason you can't use opt-level 0 for development and opt-level 3 (aka Cargo's release profile) for benchmarks and tests? (I'm assuming you want this because your tests take a long time?) |
Okay nevermind me, apparently it already puts
|
Yup, likewise the libcore implementations are also #[inline]. The implementation of partialeq is always just byte compare (usually 1 instruction), so I can't think of any case you would not want it inlined. I found this bug by converting an integer field to an enum in the project, and suddenly seeing CI take over 2x the time (as it runs in debug mode). It actually makes it impractical to use an enum unless we use opt-level 3 for debug (and that inlines so aggressively otherwise that it's not desired for debugging). |
I tried changing the implementation to the following:
Unfortunately the resulting compiler really struggles to compile the syn crate - I'm not sure if it's just really slow or it's actually hung, I'll let it run overnight. |
Mixed artifacts from two different compilers, deriving Eq doesn't solve the problem. |
It might be interesting to try to change the |
I played around with this and have a WIP branch (dotdash@a5dc7e3) that teaches the deriving code about trivial cases, so the generated code turns from: fn eq(&self, other: &RefType) -> bool {
{
let __self_vi =
unsafe { ::std::intrinsics::discriminant_value(&*self) } as
isize;
let __arg_1_vi =
unsafe { ::std::intrinsics::discriminant_value(&*other) } as
isize;
if true && __self_vi == __arg_1_vi {
match (&*self, &*other) { _ => true, }
} else { false }
}
} into fn eq(&self, other: &RefType) -> bool {
{
let __self_vi =
unsafe { ::std::intrinsics::discriminant_value(&*self) } as
isize;
let __arg_1_vi =
unsafe { ::std::intrinsics::discriminant_value(&*other) } as
isize;
if true && __self_vi == __arg_1_vi { true } else { false }
}
} Which saves 32 bytes of stack in the eq function in this case (it depends on the repr type for the enum). Additionally, the derive for PartialEq emits But there's one problem left: the discriminant intrinsic. The codegen for intrinsics still requires a stack slot for the return value, even for immediate values. Thus both discriminants get spilled onto the stack for no good reason. If we could get that fixed we'd have good enough codegen to always inline these simple cases, even without optimizations. Unfortunately the codegen code has changed quite a bit since I last touched it, so I didn't quite get that part done, and I'm not sure how long it might take. |
@tdaede could you try and see whether this helps with your code in rav1e? |
I've been trying to get it to reproduce, and now despite using the same compiler version and code, I see it suddenly decide to inline now :( I wonder if it's dependent on what codegen unit boundaries rustc decides to use at a given time? My smaller testcases in the bug still reproduce. FWIW in rav1e I was using the following to try to reproduce it: |
https://godbolt.org/z/rKb5vYqWb It looks to me like this was fixed in 1.59, the PartialEq implementation is now inlined at |
In rav1e, an enum that implements PartialEq fails to get its eq() method inlined at any opt-level lower than 3, despite being a one-instruction test. This destroys performance on anything other than opt-level 3.
I made a godbolt test case, however it only exhibits the behavior between 1 and 2, not 2 and 3. Not sure what makes rav1e different, but regardless even for opt-level 1 it probably should be inlined. Possibly the standard library should have #[inline(always)] for this derive implementation?
https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=0ca77d0df01dc401fec111d64aa9eb0f
The text was updated successfully, but these errors were encountered: