Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speed of sum(BroadcastArray) #16

Open
mcabbott opened this issue Jan 13, 2019 · 12 comments
Open

Speed of sum(BroadcastArray) #16

mcabbott opened this issue Jan 13, 2019 · 12 comments

Comments

@mcabbott
Copy link
Contributor

Here's something I was trying to do recently, to avoid materializing a huge array just to sum it:

V = rand(500);
V3 = reshape(V, 1,1,:);
@time sum(V .* V' .* V3)                                 # 0.57 seconds, 953.675 MiB
@time sum(BroadcastArray(*, V, V', V3))                  # 0.36 seconds
@time sum(BroadcastArray(*, V, V', V3), dims=(1,2,3))[1] # 0.014 seconds

Might it be worth making sum() simply call the dims=... version here?

Before I saw this package I was messing around with broadcasting to do this by hand, and could get in the ballpark of 0.014s here, but no better. Doing that seems likely to be more fragile.

@dlfivefifty
Copy link
Member

Happy to accept a PR fixing this!

@mcabbott
Copy link
Contributor Author

Will do.

But first, one more puzzle which just bit me: why is the first of these 5x quicker?

V2 = reshape(V, 1,:)
@time sum(BroadcastArray(*, V, V', V3), dims=(2,3));    # 0.025 seconds
@time sum(BroadcastArray(*, V, V2, V3), dims=(2,3));    # 0.132 seconds

Same times here:

W = similar(V);
@time sum!(W, BroadcastArray(*, V, V', V3));
@time sum!(W, BroadcastArray(*, V, V2, V3));

@dlfivefifty
Copy link
Member

I looked at the profile and the second one spends more time in getindex. I can't seem to think of a good reason why.

@mcabbott
Copy link
Contributor Author

Thanks for taking a look, it is indeed strange. The biggest difference in an isolated getindex I can see is... far from a factor of 5:

B1 = BroadcastArray(*, V, V', V3).broadcasted
B2 = BroadcastArray(*, V, V2, V3).broadcasted
@btime $B1[3,3,3]    # 3.816 ns
@btime $B2[3,3,3]    # 4.205 ns

@dlfivefifty
Copy link
Member

Remember order of access of data makes a big difference, but I’m not sure how to test that

@mcabbott
Copy link
Contributor Author

Yes that could be what's happening. I was trying to dig into what CartesianIndices etc. give, but couldn't find any difference. Here's another attempt, but I don't know what this means...

V = collect(11:12)/1;

W = collect(21:23)/1;
V2 = reshape(collect(21:23)/1, 1,:);

V3 = reshape(collect(31:34)/1, 1,1,:);

function Base.getindex(A::AbstractArray, I...) 
    Base.error_if_canonical_getindex(IndexStyle(A), A, I...)
    out = Base._getindex(IndexStyle(A), A, to_indices(A, I)...)
    println("  ",out,"  ",I)
    out
end

using LazyArrays

sum(BroadcastArray(*, V, W', V3), dims=(2,3));

sum(BroadcastArray(*, V, V2, V3), dims=(2,3));

The fast one ends with this, the slow one is the same but skips the lines starting 23.0:

  23.0  (CartesianIndex(1, 3),)
  8602.0  (1, CartesianIndex(3, 4))
  23.0  (CartesianIndex(1, 3),)
  9384.0  (2, CartesianIndex(3, 4))

@dlfivefifty
Copy link
Member

Ohhhhh, the answer is simple! V and V’ share memory, while reshape(V,1,:) allocates new memory. The compiler might be smart enough to only load memory, or if not it’s because it doesn’t have to jump to a new location.

@mcabbott
Copy link
Contributor Author

mcabbott commented Jan 19, 2019

But if I set V2[2] = 99 then V is altered. However typeof(V2) and dump(V2) show no evidence of this, so perhaps at some level it's ignorant of the connection, and thus reads it twice? (But 5x?)

I also just tried making all three factors different random numbers, but one adjointed, one reshaped. And the speed difference is the same.

@dlfivefifty
Copy link
Member

Hmm, you are right. Here's another data point:

V4 = Base.ReshapedArray(V, (1, 500), ())
@time sum(BroadcastArray(*, V, V4, V3), dims=(2,3));    # 0.132 seconds

@mcabbott
Copy link
Contributor Author

mcabbott commented Feb 6, 2019

BTW, I just saw JuliaLang/julia#30973 and wonder if that issue (28126) might be what is going on here. Haven't tried it out yet.

@dlfivefifty
Copy link
Member

Can we close this?

@mcabbott
Copy link
Contributor Author

mcabbott commented Feb 8, 2019

I remain puzzled by the V2 = reshape(V, 1,:) issue, although it's not obviously this package's fault.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants