Large performance problem due to @fastmath on many operations #22275
Comments
Is this actually a regression? |
Oops, my bad. I noticed this before on v0.5 but never reported it. Changed the title. These tests are run on v0.6-rc2. |
Just tested the latest nightly binaries and it looks like it's still the case:
julia> @time f1(a,b,c,d,e,f,g,h,j,k,l,m,n,o,p)
0.014693 seconds (4.06 k allocations: 225.578 KiB)
julia> @time f2(a,b,c,d,e,f,g,h,j,k,l,m,n,o,p)
0.024622 seconds (10.12 k allocations: 568.695 KiB)
julia> @time f1(a,b,c,d,e,f,g,h,j,k,l,m,n,o,p)
0.000004 seconds (4 allocations: 160 bytes)
julia> @time f2(a,b,c,d,e,f,g,h,j,k,l,m,n,o,p)
0.000043 seconds (294 allocations: 5.313 KiB)
|
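A minimal sketch of how one might separate compile time from steady-state cost when comparing the two variants, as the repeated `@time` calls above do. The names `sum17_plain` and `sum17_fast` and the arbitrary 17-term sum are hypothetical stand-ins, not the `f1`/`f2` from this issue, and the numbers will vary by Julia version:

```julia
# Hypothetical kernels: the same long sum, with and without @fastmath.
function sum17_plain(x)
    return x[1] + x[2] + x[3] + x[4] + x[5] + x[6] + x[7] + x[8] + x[9] +
           x[10] + x[11] + x[12] + x[13] + x[14] + x[15] + x[16] + x[17]
end

function sum17_fast(x)
    return @fastmath(x[1] + x[2] + x[3] + x[4] + x[5] + x[6] + x[7] + x[8] + x[9] +
                     x[10] + x[11] + x[12] + x[13] + x[14] + x[15] + x[16] + x[17])
end

x = rand(17)

# Warm up both functions so that compilation is not included in the measurement.
sum17_plain(x)
sum17_fast(x)

# Compare bytes allocated per call after compilation.
alloc_plain = @allocated sum17_plain(x)
alloc_fast  = @allocated sum17_fast(x)
println("plain: ", alloc_plain, " bytes, fastmath: ", alloc_fast, " bytes")
```

If the `@fastmath` variant allocates noticeably more after warm-up, that would point to the same kind of overhead as the 294 allocations reported above.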
This depends on the number of operations:
function f1(a,b,c,d,e,f,g,h,j,k,l,m,n)
    aidx = eachindex(a)
    for i in aidx
        @inbounds a[i] = b[i]+c*(d*e[i]+f*g[i]+h*j[i]+k*l[i]+m*n[i])
    end
end
function f2(a,b,c,d,e,f,g,h,j,k,l,m,n)
    aidx = eachindex(a)
    @fastmath for i in aidx
        @inbounds a[i] = b[i]+c*(d*e[i]+f*g[i]+h*j[i]+k*l[i]+m*n[i])
    end
end
a = rand(10)
b = rand(10)
c = 0.1
d = 0.1
e = rand(10)
f = 0.1
g = rand(10)
h = 0.1
j = rand(10)
k = 0.1
l = rand(10)
m = 0.1
n = rand(10)
julia> @benchmark f1($a,$b,$c,$d,$e,$f,$g,$h,$j,$k,$l,$m,$n)
BenchmarkTools.Trial:
memory estimate: 0 bytes
allocs estimate: 0
--------------
minimum time: 35.421 ns (0.00% GC)
median time: 36.593 ns (0.00% GC)
mean time: 39.635 ns (0.00% GC)
maximum time: 299.767 ns (0.00% GC)
--------------
samples: 10000
evals/sample: 1000
julia> @benchmark f2($a,$b,$c,$d,$e,$f,$g,$h,$j,$k,$l,$m,$n)
BenchmarkTools.Trial:
memory estimate: 0 bytes
allocs estimate: 0
--------------
minimum time: 34.836 ns (0.00% GC)
median time: 36.300 ns (0.00% GC)
mean time: 39.278 ns (0.00% GC)
maximum time: 129.684 ns (0.00% GC)
--------------
samples: 10000
evals/sample: 1000
But the issue occurs even though the parentheses enclose fewer than 16 values? |
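To see what the macro actually generates for a chain of operations, the expansion can be inspected directly. This is a rough sketch: the variable names `t1`..`t4` are placeholders, and the exact printed form may differ between Julia versions.

```julia
# Expand @fastmath on an expression shaped like the loop body above.
# The symbols need not be defined; only the rewritten expression is inspected.
ex = @macroexpand @fastmath b + c * (d * e + f * g)
println(ex)

# Julia parses a chain like t1 + t2 + t3 + t4 as one n-ary call,
# and @fastmath keeps that shape while swapping in the fast operator.
ex_nary = @macroexpand @fastmath t1 + t2 + t3 + t4
println(ex_nary)
println("number of operands: ", length(ex_nary.args) - 1)
```

The operand count shows how many values end up in a single `add_fast` call, which is the quantity the question above is asking about.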
The examples in this issue seem to perform much better on master, e.g. the example in the OP:
|
Looks fixed indeed. |
Found in the same code as #22255