Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

3-argument dot method #173

Open
devmotion opened this issue Aug 21, 2020 · 5 comments
Open

3-argument dot method #173

devmotion opened this issue Aug 21, 2020 · 5 comments

Comments

@devmotion
Copy link
Member

return dot(z, Q * z)
and probably also dot_percol could benefit from the 3-argument method dot(x, A, y). However, since this version was only introduced in Julia 1.4, Compat >= 3.2.0 would be required to support earlier Julia versions.

Of course, this would increase loading times on Julia < 1.4 (by hiding the import behind a @static if VERSION < v"1.4.0-DEV.92" check it should be possible to avoid this with recent Julia versions, I assume). As an example, on my computer (with Julia 1.5 though) I get on master

julia> @time using Distances
  0.040426 seconds (32.08 k allocations: 2.444 MiB)

whereas with an additional using Compat I get

julia> @time using Distances
  0.055971 seconds (63.66 k allocations: 4.458 MiB)

Would it still be OK to add Compat as a dependency?

Alternatively, one could maybe define and use internally

function dot3args(x, A, y)
    @static if VERSION < v"1.4.0-DEV.92"
        dot(x, A * y)
    else
        dot(x, A, y)
    end
end

which would just fall back to the current implementation for older Julia versions (in contrast to the implementation in https://github.com/JuliaLang/Compat.jl/blob/773db6a10c299a9af68eae5bf0f238117517bd1b/src/Compat.jl#L127-L144 that avoids the intermediate creation of an array).

@dkarrasch
Copy link
Member

IIRC, the 3-arg dot is slower than using BLAS twice (which I just confirmed with a quick benchmark for Symmetric matrices). Its main purpose is to avoid intermediate allocations. Or do you have concrete benchmarks that show the opposite?

@devmotion
Copy link
Member Author

I get quite different benchmark results, for the example in Distances, the 3-argument version seems to be faster (and, of course, avoid the allocation):

julia> using LinearAlgebra, BenchmarkTools

julia> z = rand(5);

julia> Q = rand(5, 5);

julia> f(z, Q) = dot(z, Q * z)
f (generic function with 1 method)

julia> g(z, Q) = dot(z, Q, z)
g (generic function with 1 method)

julia> @btime f($z, $Q);
  90.175 ns (1 allocation: 128 bytes)

julia> @btime g($z, $Q);
  28.354 ns (0 allocations: 0 bytes)

julia> z = rand(100);

julia> Q = rand(100, 100);

julia> @btime f($z, $Q);
  4.842 μs (1 allocation: 896 bytes)

julia> @btime g($z, $Q);
  1.670 μs (0 allocations: 0 bytes)

@dkarrasch
Copy link
Member

Oh, I was benchmarking large matrices (dimension several hundreds). I wasn't sure what the typical use case for Mahalanobis is.

@devmotion
Copy link
Member Author

Yes, I can confirm that at some point the current implementation becomes faster. Would it make sense to add an empirical cutoff in the dot3args function suggested above?

@devmotion
Copy link
Member Author

Although it sounds difficult to come up with good threshold values for arbitrary vector and matrix (not possible currently, but I guess the type constraints of SqMahalanobis could be relaxed?) types.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants