Functions that produce results in pre-defined outputs #1115

Closed
timholy opened this issue Aug 4, 2012 · 15 comments
Labels: needs decision, speculative

Comments

@timholy
Member

timholy commented Aug 4, 2012

Operations like matrix multiplication, C = A*B, have historically been implemented as C = mult(A, B), and this seems very natural. Below, I argue that there are numerous benefits to habitually writing our core algorithms as mult(C, A, B), where C is a pre-allocated output. Note that this issue goes far beyond matrix multiplication: taken to the extreme, one might say that the majority of functions that produce any kind of "nontrivial" output (especially ones that allocate storage for that output) are candidates for refactoring under this proposal. Of course, many areas of Julia already follow this paradigm; e.g., the I/O functions take as their first input io, a stream representing the output.

Here are the reasons I think we should consider doing things this way as a matter of habit:

  1. Flexibility #1: first, note that C = mult(A, B) can be trivially written in terms of mult(C, A, B):
function mult(A, B)
    T = promote_type(eltype(A), eltype(B))   # element type of the result
    C = Array(T, size(A, 1), size(B, 2))
    mult(C, A, B)
end

The converse is not true: you can't write mult(C, A, B) in terms of mult(A, B), because mult(A, B) allocates the storage for its output. Putting the actual core code into the three-argument version is therefore an increase in API flexibility.

The only disadvantage is lots of little "stub" functions when we also want the two-argument version to be available. But in general we have a lot of those anyway, and they are a strength of Julia.

  2. Flexibility #2: better control over output type. For example, for handling big data, I might want the output to be a memory-mapped array. But how is a matrix-multiplication routine supposed to know that? It happens automagically for me if the output is one of the inputs.* Likewise, consider a case where an algorithm might produce results in one of two formats, say a string or a numeric representation of a date: if I don't want to have to re-format, I might as well have the output be an input to the function that figures out what day it is.
  3. Performance: often one has to do the same computation a ridiculous number of times. Consider a case where you have to compute det(A_i * B_i) for millions of small matrices A_i and B_i (applications where this arises: simulations, calculations on a grid, etc.). With the two-argument form of matrix multiplication, malloc has to be called for each intermediate value. One might imagine speed improvements using a single preallocated temporary for each intermediate result. In a test of 2x2 matrix multiplication, this trivially gets you a 3-fold speed improvement. (The fact that it's not larger is a triumph of Julia's memory management.)
  4. Extensibility: there is more than a little interest in using Julia for GPU computing and in SMP multithreading. In such cases, you want to avoid calls from a thread to Julia's allocator functions. Presumably it will be easier to build kernels for such applications if more of our core computations already follow this paradigm.
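To illustrate the performance point, here is a minimal sketch, written with the since-adopted mul! name from modern Julia's LinearAlgebra (the function dets_preallocated is a hypothetical example, not part of any API): one temporary is allocated up front and reused across every iteration, so the inner loop performs no allocation for the products.

```julia
using LinearAlgebra

# Compute det(A_i * B_i) for many small matrices, reusing one
# preallocated temporary C instead of allocating per iteration.
function dets_preallocated(As, Bs)
    C = similar(first(As))                   # the single temporary, allocated once
    out = Vector{Float64}(undef, length(As))
    for i in eachindex(As, Bs)
        mul!(C, As[i], Bs[i])                # C ← A_i * B_i, written into C
        out[i] = det(C)
    end
    return out
end

As = [rand(2, 2) for _ in 1:3]
Bs = [rand(2, 2) for _ in 1:3]
@assert dets_preallocated(As, Bs) ≈ [det(As[i] * Bs[i]) for i in 1:3]
```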

Aside from encouraging this as a matter of programming style, let me bring up some of the issues worthy of discussion. We use the ! convention to indicate a function that mutates its inputs, such as sort!. However, this is not exactly what's happening here: in a matrix multiplication routine, A and B are not being mutated. There are also subtleties: for plus(C, A, B) it's perfectly safe to call this as plus(A, A, B), meaning the matrix version of A += B. However, you can't safely do matrix multiplication in-place: the output storage has to be independent of the inputs.
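The aliasing distinction can be made concrete with a sketch (the names plus! and mult_noalias! are hypothetical illustrations): elementwise addition reads each input element immediately before writing the corresponding output element, so aliasing is harmless, whereas naive matrix multiplication re-reads rows of A after the output has started overwriting them.

```julia
# Safe even when C aliases A or B: each element is read just before it is written.
function plus!(C, A, B)
    for i in eachindex(C, A, B)
        C[i] = A[i] + B[i]
    end
    return C
end

# Wrong if C aliases A or B: entries of A get overwritten while later
# output elements still need their old values.
function mult_noalias!(C, A, B)
    for i in axes(A, 1), j in axes(B, 2)
        s = zero(eltype(C))
        for k in axes(A, 2)
            s += A[i, k] * B[k, j]
        end
        C[i, j] = s
    end
    return C
end

A = [1.0 2.0; 3.0 4.0]
B = [5.0 6.0; 7.0 8.0]
correct = A * B                  # [19 22; 43 50]
mult_noalias!(A, A, B)           # aliased output: silently corrupted
@assert A != correct             # e.g. A[1,2] is 130.0, not 22.0

A2 = [1.0 2.0; 3.0 4.0]
plus!(A2, A2, B)                 # aliased output: still correct
@assert A2 == [6.0 8.0; 10.0 12.0]
```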

In other words, we might need more conventions or notation. I'm not good at naming, but to get the discussion started, let me make a few proposals.

A. We could reserve ! for its "purest" current meaning, things like in-place sort!.

B. We might not need special notation for a version that allows you to supply outputs, e.g., we could have both C = plus(A, B) and plus(C, A, B), where the distinction is handled by multiple dispatch. Alternatively, would it be desirable/feasible to use function names like mult=(C, A, B), where the = indicates that the first (or first several) arguments are outputs?

C. When functions need their output storage to be independent of their inputs, they could be given names like mult_noalias (following notation used in Eigen and Boost) to indicate to the user that it's a bug to call such functions like this: mult_noalias(A, B, A).

D. Finally, there has been a proposal for operator versions like C .= A*B. This kind of thing is clearly a good idea. But by now it should be clear that operators are only the tip of the iceberg---this issue permeates the entire function API.

References to previous related discussions:
#249
https://groups.google.com/d/topic/julia-dev/bWGWeIdhsnI/discussion
https://groups.google.com/d/topic/julia-dev/NaPxnrHtut8/discussion
Commit fc14052, which began the process for some of the linear algebra routines, but was horribly inconsistent in its naming scheme and needs fixing.

*Currently the calling function couldn't determine whether an array is in-memory or mapped to a file. This could potentially lead to problems from too much automagical behavior, when, for example, the best algorithm might depend upon the storage representation. But it should be clear that in general Julia's type system usually allows you to resolve such things very cleanly via multiple dispatch; memory-mapped arrays are a bit of an exception in this regard.

@JeffBezanson
Member

One idea is to use a keyword argument output as the convention for specifying an output argument. That way the feature can be added to any function without disturbing normal arguments.

A simple rewrite would be A := f(...) becomes A = f(..., output=A). I like := since it is traditionally used for mutating or updating assignment.

Then a more far-out enhancement would be to allow the compiler to speculatively insert output= itself to reuse temporaries:

a = b+2*c+d

could become

tmp = 2*c
tmp2 = b+tmp
+(tmp2, d, output=tmp)
a = tmp

@timholy
Member Author

timholy commented Aug 4, 2012

To me, putting output in the function declaration seems very clear. How would this proposal interact with this situation?

val := optimize_me(x)
val, grad := optimize_me(x)
val, grad, hess := optimize_me(x)

function optimize_me(output=grad, output=hess, x)

Since val is usually a scalar (in this type of example), it would probably be returned directly. Could this be flexible enough to determine whether it's necessary to bother computing the gradient?

@timholy
Member Author

timholy commented Aug 4, 2012

Also, this by itself doesn't handle the noalias issue, although of course that's something that documentation could provide.

@JeffBezanson
Member

I think a,b,c := f(x) would have to be a,b,c = f(x,output=(a,b,c)).

We'd also have to decide whether output is mandatory or advisory --- whether the specified output must be used, or just can be used. This affects the case where output might not be the right type. It could be an error, or just result in allocation.

@timholy
Member Author

timholy commented Aug 5, 2012

If it's not of the right size or type, in general I'd lean towards an error---for use in kernels, it might be nice to have a guarantee that allocation won't be performed. It should be easy enough to write a wrapper around it that does the allocation, if that's the behavior someone actually wants.
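One sketch of that split, under hypothetical names (mult_strict! and mult are illustrations, not an existing API): a strict kernel that errors rather than allocating, plus a thin wrapper that performs the allocation for callers who want the convenient behavior.

```julia
# Strict kernel: guaranteed never to allocate the output; errors on mismatch.
function mult_strict!(C, A, B)
    size(C) == (size(A, 1), size(B, 2)) ||
        error("output has wrong size; refusing to allocate")
    for i in axes(A, 1), j in axes(B, 2)
        s = zero(eltype(C))
        for k in axes(A, 2)
            s += A[i, k] * B[k, j]
        end
        C[i, j] = s
    end
    return C
end

# Allocating wrapper: the trivial "stub" built on top of the kernel.
function mult(A, B)
    T = promote_type(eltype(A), eltype(B))
    mult_strict!(Matrix{T}(undef, size(A, 1), size(B, 2)), A, B)
end

A = rand(3, 2)
B = rand(2, 4)
@assert mult(A, B) ≈ A * B
```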

@timholy
Member Author

timholy commented Aug 6, 2012

Another question: presumably output doesn't restrict something from also being used as an input, right? I'm thinking of routines like the BLAS daxpy.
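For reference, that pattern is exactly what daxpy does: y ← α·x + y, with y both read and written. In modern Julia this is exposed as LinearAlgebra.axpy! (a small sketch for illustration; the thread predates this exact API).

```julia
using LinearAlgebra

x = [1.0, 2.0]
y = [10.0, 20.0]
axpy!(2.0, x, y)          # y ← 2.0 * x + y; y is simultaneously input and output
@assert y == [12.0, 24.0]
```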

@dmbates
Member

dmbates commented Aug 6, 2012

I wrote a comment on commit fff7d42 which probably should have been posted here instead. It seems to me that trying to find a syntax for linear algebra operations with preallocated results is overkill. The Lapack and BLAS code already uses preallocated results, and I think exposing these capabilities to programmers would be best accomplished by having the immediate Julia wrappers of Lapack or BLAS subroutines use preallocated storage for the results. The user-facing Julia functions would allocate the result, and the sophisticated programmer who wanted to save the malloc/free overhead could call the low-level wrappers directly.

@timholy
Member Author

timholy commented Aug 6, 2012

I think we mostly agree on this point, although I think we should provide one level up that "Julia-izes" Lapack but without allocating the outputs. Right now the linalg code tree seems not to provide this reliably.

While filing this issue was prompted by what I found in the linalg files (I hadn't looked at them until very recently), as a matter of "Julia programming style" it goes beyond that. I'm really thinking ahead to the day when Krys will deliver multithreadable kernels from native Julia code :-), and I want to be able to use those with a minimum of reading documentation about the ccall interface to external packages. So part of the plan in filing this issue was to collect comments from existing Julia developers, and to be able to direct newcomers to Julia to a document that describes why this might be important.

@o-jasper

o-jasper commented Aug 7, 2012

Hmm, what about return type declarations? Basically, also pattern-match for the T there: function foo{T}(x)::T, and then select T with y = foo(x)::TypeYouWant. Not sure how hard that is to implement.

:= could be useful. For instance, with the distinction you could allow for Common Lisp's setf functions, because you can tell the difference between get(dict,key) := value (that's assign(dict,value,key)) and get(dict,key) = value, which defines a function. (I like setf functions because they take the convention out of getters and setters, and make the language aware of the idea.)

Another instance is special variables, where without the distinction you don't know if you're changing its value or locally defining it. Well, not quite, because = is still assignment for variables, unless we're willing to change all the code. I guess the local keyword covers it.

@timholy
Member Author

timholy commented Aug 9, 2012

Return type declarations are definitely interesting and have overlap with this proposal, but also have some important differences. The most obvious is that they don't solve the problem of temporary allocation. If I want a 2x2 matrix back, but don't want the algorithm to allocate fresh memory to store it, then I have to tell it which blob of memory I want it to use. That's easy if the output is one of the inputs.

@o-jasper

o-jasper commented Aug 9, 2012

I agree that it would be good if the user can be sure that it will do mult(C, A, B), but it would also be good if it could be inferred when you type C = A*B in an assignment and C already holds a compatible matrix value.

@timholy
Member Author

timholy commented Aug 31, 2012

For those who don't follow every last detail of Julia development, I realized I should give a status update: the process proposed here has been completed (some time ago) for matrix multiplication. On some not-too-small problems (20x20) we've seen 1.5-fold speed increases, in addition to the benefit of extra flexibility that it permits (e.g., writing the results directly to memory-mapped storage).

The pull request referenced above for tridiagonal and Woodbury is the first case where it might be implemented for inversion, too. That pull request also contains a specific proposal (the last bullet point) that could be the basis for spreading this more widely. So, anyone who cares about this issue may want to check that discussion.

@andreasnoack
Member

I am working on some functionality for Toeplitz matrices where I define e.g. SymmetricToeplitz, Circulant and TriangularToeplitz and only store the first column. The reason is that I want fast matrix-vector multiplication via fft and therefore define methods for each type that do the right circulant embedding and so forth. Next, I wanted to write some iterative solvers for Toeplitz problems and after writing them I realised that they were agnostic to the Toeplitz structure. They only need matrix-vector multiplication and a solver when preconditioning and could be defined for AbstractMatrix. However, the multiplication and solver API would need to be standardised over the different subtypes of AbstractMatrix. In my Toeplitz code I found it convenient to define e.g.

A_mul_B{T}(α::T,A::Circulant{T},x::StridedVector{T},β::T,y::StridedVector{T})

for the operation y = αAx + βy, similarly to gemv in BLAS. I also defined solve!(A, b), which writes the solution into b. Now I am ready to follow up on the question of this thread. Would it make sense to have this structure generally, and then derive all the usual allocating * and \ from non-allocating workhorse functions of the form just presented?
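This gemv-like shape is essentially the five-argument form that modern Julia's LinearAlgebra eventually adopted as mul!(y, A, x, α, β), computing y ← αAx + βy into a preallocated y (a brief sketch for context; this comment predates that API).

```julia
using LinearAlgebra

A = [1.0 2.0; 3.0 4.0]
x = [1.0, 1.0]
y = [1.0, 1.0]
mul!(y, A, x, 2.0, 3.0)   # y ← 2*A*x + 3*y, written into the preallocated y
@assert y == [9.0, 17.0]
```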

For some more code context have a look at

https://github.com/andreasnoackjensen/ToeplitzMatrices.jl

@cdsousa
Contributor

cdsousa commented Sep 28, 2013

Let me suggest something like:

  • mult!(C, A, B) for the C<-A*B multiplication;
  • add!(C, A, B) for the C<-A+B addition;
  • add!(A, B) for the in-place A<-A+B addition.

Then possibly the in-place operators .= *, .= + and .+= of #249 could be defined to map to these functions.

@simonster
Member

This is a very old issue. We have settled on the !/first argument convention. Perhaps this can be closed?

@timholy timholy closed this as completed Sep 25, 2015