better reducedim type inference #6994

Merged: 1 commit, May 28, 2014

Conversation

stevengj
Member

Fixes #6672.

@timholy
Member

timholy commented May 27, 2014

#6672 was filed for reductions other than those in reducedim, so we probably wouldn't want to close #6672 just yet. I also don't think this will work for an example like sum(Vector{Int}[[1,2],[4,3]], 1) because, IIUC, it still relies on having zero or one defined.

On my todo-list is to completely eliminate the need to initialize the output. Basically generalize reductions like

julia> function mysum(x)
           xs = x[1]
           for i = 2:length(x)
               xs += x[i]
           end
           xs
       end

to more than one dimension. The logic here is a little tricky, but I hope it's doable (though it might possibly hurt efficiency). If we can do that, then reducedim will be more flexible than it ever has been.

But if we need a stopgap, one good option would be to at least allow the user to initialize the output manually and be able to run sum! or prod! without triggering an error. I believe that's where we were before the init keyword got added.
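For reference, a minimal sketch of the "user supplies the output" pattern as it looks in current Julia, where `sum!` and `prod!` take a preallocated result whose singleton dimensions mark the reduced dimensions. This is written against today's API rather than the 2014 code under review, and the array names are only illustrative.

```julia
A = [1 2; 3 4]

# Output shaped 1x2: reduce over dimension 1 (sum each column).
r = zeros(Int, 1, 2)
sum!(r, A)

# Output shaped 2x1: reduce over dimension 2 (multiply along each row).
p = ones(Int, 2, 1)
prod!(p, A)

println(r)
println(p)
```

The element type of the result is whatever the caller chose for `r` or `p`, so no `zero` or `one` method has to be guessed from the input's element type.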

@stevengj
Member Author

@timholy, the problems in #6672 boil down to the limitations of reducedim, and are indeed fixed by this patch. e.g. std(FloatingPoint[1,2,3], 1) now gives [1.0] and Base.sumabs2(FloatingPoint[1,2,3],1) now gives [14.0].

@stevengj
Member Author

sum(Vector{Int}[[1,2],[4,3]], 1) should work but doesn't here because I erroneously checked for method_exists(zero, (T,)) rather than method_exists(zero, (Type{T},)). I'll post an updated patch shortly.

@stevengj
Member Author

Unfortunately, your mysum function is not type-stable; see #6069 and #6116. You need to handle three cases separately: empty sums (return zero if possible, otherwise throw an error), 1-element sums (return x[1]+zero(x[1]) if possible, otherwise return x[1]), and sums of two or more elements (initialize to x[1] + x[2]).
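The three cases can be sketched as follows in current Julia syntax. `stable_sum` is a hypothetical name, not a Base function, and it assumes a 1-based vector. Returning `z + z` rather than `z` in the empty case is an assumption made here so that all three branches agree on the promoted type.

```julia
# Hypothetical helper (not Base code): the result type does not depend on
# the length of the input when the element type is concrete.
function stable_sum(x::AbstractVector{T}) where {T}
    n = length(x)
    if n == 0
        # Empty sum: only possible when zero(T) is defined for the eltype.
        hasmethod(zero, Tuple{Type{T}}) ||
            throw(ArgumentError("cannot sum an empty collection without zero($T)"))
        z = zero(T)
        return z + z
    elseif n == 1
        # One element: add a zero so the value is promoted like a real sum.
        # If zero is not defined for the element, return it unchanged.
        x1 = x[1]
        return hasmethod(zero, Tuple{typeof(x1)}) ? x1 + zero(x1) : x1
    else
        # Two or more elements: seed the accumulator with x[1] + x[2],
        # so it already has the promoted type before the loop starts.
        s = x[1] + x[2]
        for i in 3:n
            s += x[i]
        end
        return s
    end
end

println(typeof(stable_sum(Bool[])))
println(typeof(stable_sum([true])))
println(typeof(stable_sum([true, true, true])))
```

The runtime `hasmethod` checks keep the sketch short; whether the compiler can infer a single return type through them is a separate question, and a real implementation would more likely use dispatch.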

Doing this for reducedim-type reductions is straightforward in principle but a little hairy in practice, which is why I didn't attempt it in #6116.

@timholy
Member

timholy commented May 27, 2014

> @timholy, the problems in #6672 boil down to the limitations of reducedim

I hadn't caught that back when it was filed & discussed. Rats.

> Unfortunately, your mysum function is not type-stable

Right, but isn't that what you give up with type-flexibility? For example, for

A = FloatingPoint[1.0f0 2.0f0; 3.0 4.0]

shouldn't we really have sum(A, 1) == FloatingPoint[4.0 6.0] and

sum(A, 2) == reshape(FloatingPoint[3.0f0;  7.0], 2, 1)

?

Anyway, the revised patch looks better than where we are now (I didn't even realize zero([1,2]) == [0,0], that's quite handy). Fine with me if this gets merged, even if there may be yet another iteration someday.

@stevengj
Member Author

@timholy, obviously if you have an Array{Real} where every element is a different real type, then you aren't going to get any benefits of type stability. However, in the special case where all your elements have the same type then you want the sum function to be fast and type-stable. Your implementation does not have the second property.

@timholy
Member

timholy commented May 27, 2014

It sounds great to get type-stability in cases where the type of the container is less specific than it could have been, but I'm being a little slow in understanding how you achieve it. For example, arguably we shouldn't have zero(FloatingPoint) defined at all. If you don't buy that, then convert everything I'm about to say to Vector{SIQuantity} (from the SIUnits package), for which it's pretty clear that you can't define zero(SIQuantity) (is it 0Meter? 0KiloGram?)

In such cases, IIUC with this patch the element type of sum(a, 1) is determined from sum(a). But

julia> Base.return_types(sum, (Vector{FloatingPoint},))
1-element Array{Any,1}:
 Any

and type-stability of Any is not, as far as I am aware, something that helps very much. Now, it might so happen that you get a Float64 back, and then you can do your reducedim using Float64 as the element type, and everything in the reduction is type-stable. But in the meantime, didn't you have to call sum and have it operate in what I assume must be a type-unstable fashion? Both sum(a) and sum(a, 1) are O(N), where N is the number of elements of a, so I'm not yet seeing what this really buys you.

But you seem to have thought about this, so I bet I'm just not getting it.

@stevengj
Member Author

@timholy, sorry I wasn't clear. sum will never be type-stable in a useful way for Vector{FloatingPoint}. However, you do want it to be type-stable in cases where it can be usefully type-stable.

For example, your mysum function is not type-stable for Vector{Int8}, where we certainly should be able to get type-stability.
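A small illustration of the instability, with one caveat: in current Julia `Int8 + Int8` stays `Int8`, so the `Vector{Int8}` case no longer shows the problem. `Bool` still promotes to `Int` under `+`, so it is used below as a stand-in. The function body is the single-seed `mysum` from earlier in the thread.

```julia
# Single-seed reduction: the accumulator starts as x[1] with no promotion.
function mysum(x)
    xs = x[1]
    for i = 2:length(x)
        xs += x[i]
    end
    xs
end

one_elem = mysum([true])        # loop body never runs, xs keeps the eltype
two_elem = mysum([true, true])  # Bool + Bool promotes the accumulator

println(typeof(one_elem))
println(typeof(two_elem))
```

The same argument type, `Vector{Bool}`, yields different result types depending only on the length, which is what type stability rules out.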

@timholy
Member

timholy commented May 28, 2014

Got it. Of course, that's fixed simply by changing the function to

function mysum(x)
    xs = x[1] + x[2]
    for i = 3:length(x)
        xs += x[i]
    end
    xs
end

and has the advantage of doing its job in a single pass.

(I also think it's basically inevitable that someday, Stefan will agree that Int8 + Int8 = Int8 😄.)

@timholy
Member

timholy commented May 28, 2014

(But I'll add that this is all vaporware right now, and I have no objections to your version being merged, particularly if there is not a better way of getting the same effect.)

@stevengj
Member Author

@timholy, right, but you also have to special-case 0 and 1 elements as I mentioned above. This is what sum does now; it is just a pain to generalize to reducedim.

@timholy
Member

timholy commented May 28, 2014

Agreed, the logic for avoiding double-counting in reductions that involve more than one dimension simultaneously is not going to be entirely trivial. Adding the possibility of 0- and unit-length arrays on top of that makes it a bit worse, although I suspect that's going to be relatively minor ("do it on a separate code path") compared to the first.

@eval function $f{T}(A::AbstractArray{T}, region)
    if method_exists($init, (Type{T},))
        z = $op($init(T), $init(T))
        Tr = typeof(z) == typeof($init(T)) ? T : typeof(z)
Contributor

Why not simply set Tr = typeof(z)?

Member Author

If T is Real, then typeof(z) will be Int, and then you will get errors if there are non-integer values in the array.

On the other hand, if T is Int8, then z will have type Int, and I think we want Tr to be Int too. Hence the conditional.
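The conditional can be tried in isolation with a sketch in current Julia syntax. `result_eltype` is a hypothetical helper, with `op` and `init` standing in for `$op` and `$init` from the diff. Because `Int8 + Int8` no longer promotes in current Julia, `Bool` plays the role that `Int8` plays in the explanation above.

```julia
# Hypothetical helper mirroring the conditional under review.
function result_eltype(op, init, ::Type{T}) where {T}
    z0 = init(T)
    z = op(z0, z0)
    # If op leaves the type of init(T) unchanged, keep the declared eltype T,
    # which may be abstract. Otherwise use the promoted type.
    return typeof(z) == typeof(z0) ? T : typeof(z)
end

println(result_eltype(+, zero, Real))     # zero(Real) is an Int, but T is kept
println(result_eltype(+, zero, Bool))     # Bool + Bool promotes
println(result_eltype(+, zero, Float64))  # concrete type that + preserves
```

Setting `Tr = typeof(z)` unconditionally would turn the `Real` case into `Int`, which is the failure described above for arrays holding non-integer values.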

@JeffBezanson
Member

I get the sense we should merge this?

@lindahua
Contributor

I think it is good to merge.

JeffBezanson added a commit that referenced this pull request May 28, 2014
better reducedim type inference
JeffBezanson merged commit 916b2f5 into JuliaLang:master May 28, 2014
@IainNZ
Member

IainNZ commented May 28, 2014

Was the perf test suite run before-after?

@stevengj
Member Author

@IainNZ, I didn't run the perf-test suite. This patch shouldn't really impact performance in the common case of sum for Array{T} where T is a concrete Number type, though.


Successfully merging this pull request may close these issues.

some reductions over dimensions depend too heavily on zero()
5 participants