Reduced performance when using question mark operator instead of try!
#37939
I didn't read the implementation until now, I thought they expanded to the exact same thing as try!() |
Might just need to stick a couple of |
I'm curious why this issue doesn't even get a label like I-Slow. |
@sfackler I tried this ( |
I've found out why this is slow. Suppose you have an expression
match Carrier::translate(res) {
    Ok(val) => val,
    Err(err) => return Carrier::from_error(From::from(err)),
}
The real culprit is
impl<U, V> Carrier for Result<U, V> {
    type Success = U;
    type Error = V;
    fn from_success(u: U) -> Result<U, V> {
        Ok(u)
    }
    fn from_error(e: V) -> Result<U, V> {
        Err(e)
    }
    fn translate<T>(self) -> T
        where T: Carrier<Success=U, Error=V>
    {
        match self {
            Ok(u) => T::from_success(u),
            Err(e) => T::from_error(e),
        }
    }
}
In
// Carrier::translate(<expr>)
let discr = {
    // expand <expr>
    let sub_expr = self.lower_expr(sub_expr);
    let path = &["ops", "Carrier", "translate"];
    let path = P(self.expr_std_path(unstable_span, path, ThinVec::new()));
    P(self.expr_call(e.span, path, hir_vec![sub_expr]))
};
Suppose we remove translation and implement the piece simply like this:
// Carrier::translate(<expr>)
let discr = {
    // expand <expr>
    P(self.lower_expr(sub_expr))
};
If we do that, there will be no slowdown in ratel-core due to question mark operators (I benchmarked it). I created a simple example that demonstrates the issue: https://is.gd/Q7PGzm
identity:
    .cfi_startproc
    xorl %eax, %eax
    cmpq $0, (%rsi)
    movq 8(%rsi), %rcx
    setne %al
    movq %rax, (%rdi)
    movq %rcx, 8(%rdi)
    movq %rdi, %rax
    retq
You can see there are unnecessary I'm a total beginner at rustc, so no idea how to proceed further. Should we perhaps detect identity functions within a MIR optimization pass? |
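For readers who want to poke at this outside of rustc, here is a minimal sketch that redefines the `Carrier` trait locally (the `: Sized` bound and the helper name `identity_via_translate` are assumptions added for this sketch, not part of libcore). It only shows that a `Result` to `Result` translation is semantically an identity that takes the value apart and rebuilds it; it makes no claim about what any particular compiler version emits for it.

```rust
// Local stand-in for the unstable `Carrier` trait quoted above, so the
// sketch compiles on stable Rust. `: Sized` is an assumption of this sketch.
trait Carrier: Sized {
    type Success;
    type Error;
    fn from_success(u: Self::Success) -> Self;
    fn from_error(e: Self::Error) -> Self;
    fn translate<T>(self) -> T
    where
        T: Carrier<Success = Self::Success, Error = Self::Error>;
}

impl<U, V> Carrier for Result<U, V> {
    type Success = U;
    type Error = V;
    fn from_success(u: U) -> Result<U, V> {
        Ok(u)
    }
    fn from_error(e: V) -> Result<U, V> {
        Err(e)
    }
    fn translate<T>(self) -> T
    where
        T: Carrier<Success = U, Error = V>,
    {
        // Deconstructs the value and rebuilds it in the target carrier.
        match self {
            Ok(u) => T::from_success(u),
            Err(e) => T::from_error(e),
        }
    }
}

// Hypothetical helper: a Result -> Result translation, i.e. an identity
// function expressed as a match plus re-wrapping of each variant.
fn identity_via_translate(r: Result<u64, u64>) -> Result<u64, u64> {
    r.translate()
}
```

Compiling a function like `identity_via_translate` with optimizations and comparing its assembly against a plain `fn(r) { r }` is one way to check whether the optimizer sees through the rebuild.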
I just switched a couple inner loop ?s back to try!(...) to work around rust-lang/rust#37939
Lower `?` to `Try` instead of `Carrier` The easy parts of rust-lang/rfcs#1859, whose FCP completed without further comments. Just the trait and the lowering -- neither the error message improvements nor the insta-stable impl for Option nor exhaustive docs. Based on a [github search](https://github.com/search?l=rust&p=1&q=question_mark_carrier&type=Code&utf8=%E2%9C%93), this will break the following: - https://github.com/pfpacket/rust-9p/blob/00206e34c680198a0ac7c2f066cc2954187d4fac/src/serialize.rs#L38 - https://github.com/peterdelevoryas/bufparse/blob/b1325898f4fc2c67658049196c12da82548af350/src/result.rs#L50 The other results appear to be files from libcore or its tests. I could also leave Carrier around after stage0 and `impl<T:Carrier> Try for T` if that would be better. r? @nikomatsakis Edit: Oh, and it might accidentally improve perf, based on rust-lang#37939 (comment), since `Try::into_result` for `Result` is an obvious no-op, unlike `Carrier::translate`.
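A rough sketch of why `into_result` is a no-op for `Result`, using a local trait shaped like the one in rust-lang/rfcs#1859. The trait name `TrySketch` and the function `parse_and_double` are hypothetical names for this illustration, and the hand-written `match` only approximates the lowering described in the PR text.

```rust
use std::num::ParseIntError;

// Local trait modelled on the RFC 1859 shape; not the real std::ops::Try.
trait TrySketch: Sized {
    type Ok;
    type Error;
    fn into_result(self) -> Result<Self::Ok, Self::Error>;
    fn from_error(e: Self::Error) -> Self;
    fn from_ok(v: Self::Ok) -> Self;
}

impl<T, E> TrySketch for Result<T, E> {
    type Ok = T;
    type Error = E;
    // For Result, converting into a Result returns the value unchanged,
    // in contrast to the match-and-rebuild done by `Carrier::translate`.
    fn into_result(self) -> Result<T, E> {
        self
    }
    fn from_error(e: E) -> Self {
        Err(e)
    }
    fn from_ok(v: T) -> Self {
        Ok(v)
    }
}

// Hand-written approximation of `Ok(s.parse::<i32>()? * 2)` under this scheme.
fn parse_and_double(s: &str) -> Result<i32, ParseIntError> {
    let n = match TrySketch::into_result(s.parse::<i32>()) {
        Ok(v) => v,
        Err(e) => return TrySketch::from_error(From::from(e)),
    };
    TrySketch::from_ok(n * 2)
}
```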
I think this issue is fixed. The generated asm is the same for try! and ?. https://play.rust-lang.org/?gist=625d88df305ace951a088e1cee2ec13a Edit: Typo |
Even if this is fixed: Is there a unit test which prevents regression? |
We could have a Someone tag this as |
It looks like they both have the same bad code generation now, needs looking into. |
Example where it is compared with the identity function. Returning But it seems this example is not a good condensation of the issue, because I can't find any previous Rust version (in rust.godbolt) that has the desired identity function code gen even for the It doesn't show the difference between try and ?, but it shows something we can fix to improve them both. Edited: Update to another example code link (configurable Result type). |
I think https://reviews.llvm.org/D37216 should fix this, but it's a little stuck in the LLVM review queue. |
@arielb1 it was reverted llvm-mirror/llvm@c87c1c0. |
Still not fixed (tested nightly on playpen) |
This seems to be fixed (in stable). Tried the play link above.
; Function Attrs: noinline norecurse nounwind readnone uwtable
define { i64, i64 } @try_op(i64, i64) unnamed_addr #2 {
  %3 = tail call { i64, i64 } @try_macro(i64 %0, i64 %1) #2
  ret { i64, i64 } %3
}
try_op:
    jmp try_macro
try_macro:
    xorl %eax, %eax
    testq %rdi, %rdi
    setne %al
    movq %rsi, %rdx
    retq |
You can still provoke it to make a difference and introduce conditionals or more copies for the
type T = (i32, i32);
type E = T;
type R = Result<T, E>;
#[no_mangle]
pub fn try_op(a: R) -> R {
    Ok(a?)
}
#[no_mangle]
pub fn try_macro(a: R) -> R {
    Ok(try!(a))
} |
Looks like it regressed: https://godbolt.org/z/awKD_U |
We are fast again (at least for the examples given above) since Rust 1.52, which contains the LLVM 12 upgrade! Once somebody adds a test (or confirms that a similar test already exists), this issue could be closed. |
There's this codegen test, but I'm not sure if it's enough to check for this. |
According to this blog post, Rust still has performance problems with the question mark operator as of Rust 1.62.1, resulting in a 4% performance loss. |
Definitely not -- you'll notice it's both But there's good news! LLVM 15 merged a few days ago, bringing the fix mentioned in #85133 (comment). With that, both of these are nops now:
#![feature(try_blocks)]
pub fn result_nop_match(x: Result<i32, u32>) -> Result<i32, u32> {
    match x {
        Ok(x) => Ok(x),
        Err(x) => Err(x),
    }
}
pub fn result_nop_traits(x: Result<i32, u32>) -> Result<i32, u32> {
    try {
        x?
    }
}
https://rust.godbolt.org/z/71dYnrMf6
example::result_nop_match:
    mov rax, rdi
    ret
example::result_nop_traits:
    mov rax, rdi
    ret
For comparison, on 1.63 even the
example::result_nop_match:
    xor ecx, ecx
    test edi, edi
    setne cl
    movabs rax, -4294967296
    and rax, rdi
    or rax, rcx
    ret
EDIT: I opened #100693 to have a codegen test here. |
This was reported on the users forum, and I don't want it to get lost. Basically, replacing try! by ? resulted in ~20% performance loss in benchmarks: ratel-rust/ratel-core#48 (comment)
I've reproduced, but not further investigated, these findings. Is that expected right now? It's not a good argument for adopting the question mark :)
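To make the comparison concrete, here is a small sketch of the two spellings side by side. Because `try` is a reserved word in newer editions, the sketch uses a hypothetical macro `try_like!` whose body follows the shape of the old `try!` macro; the function names are likewise made up for illustration. Both functions are meant to behave identically, so any performance gap between them comes down to how the compiler lowers and optimizes `?`.

```rust
use std::num::ParseIntError;

// Hypothetical stand-in for the old `try!` macro: unwrap on Ok, otherwise
// return early with the error converted through `From`.
macro_rules! try_like {
    ($expr:expr) => {
        match $expr {
            Ok(val) => val,
            Err(err) => return Err(From::from(err)),
        }
    };
}

// Macro-based early return.
fn sum_with_macro(a: &str, b: &str) -> Result<i32, ParseIntError> {
    let x = try_like!(a.parse::<i32>());
    let y = try_like!(b.parse::<i32>());
    Ok(x + y)
}

// The same logic written with the question mark operator.
fn sum_with_question_mark(a: &str, b: &str) -> Result<i32, ParseIntError> {
    let x = a.parse::<i32>()?;
    let y = b.parse::<i32>()?;
    Ok(x + y)
}
```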