
A_mul_B and all that jazz #57

Closed
jiahao opened this issue Jan 8, 2014 · 40 comments
Assignees
Labels
breaking This change will break code

Comments

@jiahao
Member

jiahao commented Jan 8, 2014

The purpose of this issue is to discuss the future of the dozens of functions in Julia that are slight variations of matrix multiplication, matrix division and backslash. I presume that the point was to have fine-grained control over the use of matrix variables and temporaries when needed. However, I don't understand why the availability of these functions is so inconsistent: is this intentional design or a consequence of organic growth with methods being implemented as needed?

List of all the functions (existing and plausibly existing in the future)

Here are the 3×3×3×2 = 54 possible functions that can be constructed out of A, its transpose At, and its Hermitian conjugate Ac; similarly for B, Bt, and Bc; mul, rdiv, and ldiv; and mutating and non-mutating variants. (### marks functions that do not currently exist.)

A_mul_B #deprecated synonym of A*B
A_mul_B!
A_mul_Bt
A_mul_Bt!
A_mul_Bc
A_mul_Bc!

At_mul_B
At_mul_B!
At_mul_Bt
At_mul_Bt!
At_mul_Bc   ###
At_mul_Bc!  ###

Ac_mul_B
Ac_mul_B!
Ac_mul_Bt #deprecated
Ac_mul_Bt!  ###
Ac_mul_Bc
Ac_mul_Bc!


A_ldiv_B    ### synonym of A\B
A_ldiv_B!
A_ldiv_Bt
A_ldiv_Bt!  ###
A_ldiv_Bc
A_ldiv_Bc!  ###

At_ldiv_B
At_ldiv_B!  ###
At_ldiv_Bt
At_ldiv_Bt! ###
At_ldiv_Bc  ###
At_ldiv_Bc! ###

Ac_ldiv_B
Ac_ldiv_B!  ###
Ac_ldiv_Bt  ###
Ac_ldiv_Bt! ###
Ac_ldiv_Bc
Ac_ldiv_Bc! ###


A_rdiv_B    ### synonym of A/B
A_rdiv_B!   ###
A_rdiv_Bt
A_rdiv_Bt!  ###
A_rdiv_Bc
A_rdiv_Bc!  ###

At_rdiv_B
At_rdiv_B!  ###
At_rdiv_Bt
At_rdiv_Bt! ###
At_rdiv_Bc  ###
At_rdiv_Bc! ###

Ac_rdiv_B
Ac_rdiv_B!  ###
Ac_rdiv_Bt  ###
Ac_rdiv_Bt! ###
Ac_rdiv_Bc
Ac_rdiv_Bc! ###
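
For orientation, a hedged sketch of how two of these names related to operator syntax at the time (pre-0.7 lowering; these names have since been removed, so this is historical illustration rather than current API):

```julia
A = randn(4, 4); B = randn(4, 4)

# The parser lowered transposed products directly to these names:
#   A*B'   became  A_mul_Bc(A, B)
#   A'*B   became  Ac_mul_B(A, B)
C = A_mul_Bc(A, B)      # allocates a fresh result, same as A*B'

# The ! variants wrote into a preallocated output instead:
D = similar(C)
A_mul_Bc!(D, A, B)      # fills D with A*B' without a fresh allocation
```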

I'm rocking this boat because it's come up as a natural consequence of systematically making methods available for special matrix types.

Specific issues

  1. Documentation. The existence, motivation and use of these functions is sparsely documented. Not all the functions even have defined help strings.

  2. Growth and proliferation. Do we foresee an eventual future where all 54−3 = 51 functions (omitting the synonyms for *, / and \) become defined and have methods implemented? There would be a great many methods to implement to cover all the matrix types that currently exist.

  • Can we eliminate any of these functions on the basis of being useless? There are no functions defined that mix Ac
    and Bt, for example; however, this is a somewhat rare but entirely plausible combination to operate on. Do we care?
  • How many of these functions can be implemented with just a fallback method?
  • Having more things in Base without reason is not ideal, but conversely these functions form a coherent grouping for which it doesn't make sense to have some but not others.
  • How many of these functions can be defined trivially in terms of lazy array views (#5003) and rewriting rules (Label QRCompactWY.T as Triangular #122)?

cc: @JeffBezanson @StefanKarpinski @andreasnoackjensen @dmbates @lindahua

@johnmyleswhite
Member

If we get rid of these functions, am I right to assume we'll be pushing information into the type system using transpose types and conjugate types?

@jiahao
Member Author

jiahao commented Jan 8, 2014

I think that would depend quite strongly on how #42 is resolved.

@dmbates
Member

dmbates commented Jan 8, 2014

Because I develop iterative methods that require a lot of numerical linear algebra within each iteration, I want to be careful about reusing existing storage when possible. Hence I find these mutating functions worthwhile. I could always write them as direct calls to functions in the BLAS or LAPACK modules, but that makes the code harder to read for those who didn't grow up needing to know LAPACK calling sequences as I did.

However, I would be amenable to hiding all these functions in the Base namespace and only exporting the non-mutating versions that are called when expressions like C = A'*B are parsed. The use of mutating functions to save memory allocations is a "consenting adults" activity in my opinion. If you are going to engage in it you should be aware of the potential problems.
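
This trade-off can be illustrated with today's spellings (mul! is the modern name for A_mul_B!; the thread predates the LinearAlgebra stdlib split):

```julia
using LinearAlgebra

n = 4
A = randn(n, n); B = randn(n, n); C = zeros(n, n)

# Readable mutating form: overwrite C with A*B, reusing C's storage.
mul!(C, A, B)

# The equivalent direct BLAS call is terser but assumes you know the
# gemm calling sequence: C = 1.0*A*B + 0.0*C.
BLAS.gemm!('N', 'N', 1.0, A, B, 0.0, C)
```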

@ViralBShah
Member

I think exporting only the non-mutating versions is the right choice. What about the transposed multiply operators? Should those also be unexported by default?

I would be ok with removing the c and t combinations too. It wasn't fun dealing with all of those in a recent PR.

For the stuff with lazy array views, we should wait until we get there, or else we will see a performance degradation. At that point, I would expect that we can get rid of the transposed multiply and divide operators.

@johnmyleswhite
Member

To me, it seems like it would be better to wait until we have a general tool for expressing mutating operations before removing these functions.

@ViralBShah
Member

I am only giving a +1 to removing the Ac_mul_Bt combinations that mix conjugate and plain transposes, and to not exporting all of these from Base by default.

@johnmyleswhite
Member

Ok. That makes sense to me, although I'd think there's something to be said for keeping an eyesore around as a reminder that we need a cleaner solution.

@ViralBShah
Member

There is some value to an eyesore, but keeping it around longer means more people will use it thinking it is a supported API, which makes it harder to remove in the future.

@johnmyleswhite
Member

Very true.

@jiahao
Member Author

jiahao commented Jan 9, 2014

Time for an @Embarrassing macro?

@andreasnoack
Member

Yesterday I found out that

julia> n=1000;
julia> A=randn(n,n);A=A+A';Aeig=eigfact(Symmetric(A));
julia> V=Aeig[:vectors];D=diagm(Aeig[:values]);
julia> @time V*D*V';
elapsed time: 0.263635665 seconds (24000512 bytes allocated)
julia> @time V*(D*V');
elapsed time: 0.235679586 seconds (16000272 bytes allocated)
julia> @time (V*D)*V';
elapsed time: 0.237066579 seconds (16000272 bytes allocated)

because

julia> @which V*D*V'
*(a,b,c) at operators.jl:71

causing the transpose to be calculated before the multiplication.

I think that @johnmyleswhite's proposal sounds like the right one. It could eliminate all the A_x_B functions and require only mul!, *, ldiv!, rdiv!, \ and /. The rest is dispatch. I think all the function bodies would be unaffected. Should we try this change? I can do the coding, but I would like to know whether it could fly before spending the time.

@johnmyleswhite
Member

I believe the problem with the transpose type (which somebody else proposed before me and I just brought up again) is that we're not sure how to implement it for higher-order tensors.

@Jutho
Contributor

Jutho commented Apr 20, 2014

Another suggestion would be to follow more closely the BLAS approach and have a single function with additional arguments 'N', 'T' and 'C' specifying the multiplication pattern. This of course only shifts the problem to branching within the method, and probably calling all of the above functions anyway, but the advantage is that only a single function needs to be exported and documented.

Occasionally I regret that the mutating multiplication functions in Julia do not more closely resemble the BLAS design: they only allow storing the result of the multiplication in a given array, rather than adding the result to it. I guess there has been some thought behind the BLAS design, and it would be great if Julia could generalise this functionality (i.e. to matrices without unit stride, different types of dense and sparse matrices, ...) without sacrificing the existing functionality.

I know that gemm and gemv are available within Base.LinAlg.BLAS, but suppose you define a new type hierarchy, say AbstractLinearOperator, for sparse linear operators to use in an iterative linear solver or eigensolver; then you want to implement an A_mul_B! method for its concrete subtypes. If that method were allowed to add the result of the matrix-vector multiplication to a given vector, you could also define a SumOfLinearOperators <: AbstractLinearOperator type, define + for any AbstractLinearOperator to return a SumOfLinearOperators that just keeps track of the different AbstractLinearOperator terms, and implement A_mul_B! for an input vector by calling A_mul_B! sequentially on all the terms, accumulating the result without creating any copies.
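
That pattern can be sketched as follows; every name here (AbstractLinearOperator, MatrixOperator, SumOfLinearOperators, the accumulating mul_into!) is hypothetical, chosen for illustration rather than taken from any package:

```julia
abstract type AbstractLinearOperator end

# A concrete wrapper around a dense matrix, for illustration.
struct MatrixOperator <: AbstractLinearOperator
    A::Matrix{Float64}
end

# A sum that only records its terms; nothing is ever materialized.
struct SumOfLinearOperators <: AbstractLinearOperator
    terms::Vector{AbstractLinearOperator}
end

Base.:+(A::AbstractLinearOperator, B::AbstractLinearOperator) =
    SumOfLinearOperators(AbstractLinearOperator[A, B])

# BLAS-style accumulating multiply: y .+= op * x (illustrative name).
mul_into!(y, op::MatrixOperator, x) = (y .+= op.A * x; y)
function mul_into!(y, op::SumOfLinearOperators, x)
    for term in op.terms
        mul_into!(y, term, x)   # each term adds into y; no copies
    end
    return y
end
```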

@JeffBezanson
Member

Getting rid of these is one of my top wishlist items. We can even have mul!, ldiv!, and rdiv!, just no underscores :) Let's move ahead with this.

@johnmyleswhite
Member

I was just telling someone today how these methods are one of the main weak points of Julia's otherwise amazing design for linear algebra.

@tkelman

tkelman commented Mar 18, 2015

JuliaLang/julia#6837 is step one, right?

@JeffBezanson
Member

@andreasnoack Any update here?

@andreasnoack
Member

Not yet. I'll give an update on this after the weekend.

@andreasnoack
Member

andreasnoack commented Apr 29, 2016

I've pushed a version that eliminates all the Ax_mul_Bx functions to JuliaLang/julia#6837. I'm using a Transpose type with a conjugated field. I think it works fairly well. It is also possible to avoid quite a few explicit transposes.

However, this hurts compilation time significantly. With the new version, the triangular.jl tests take more than 15 minutes on my machine. The total number of allocations is 3× that of master, and the total memory allocated is more than 2×. Running the tests a second time is still as fast as on master, so this is about compilation.
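
A minimal sketch of the wrapper idea, assuming a lazy type with a conjugated flag (the name LazyAdjoint and its exact fields are illustrative, not the actual PR's implementation):

```julia
# Records that the parent should be read transposed, and whether its
# elements should also be conjugated, without copying any data.
struct LazyAdjoint{T,S<:AbstractMatrix{T}} <: AbstractMatrix{T}
    parent::S
    conjugated::Bool
end

Base.size(A::LazyAdjoint) = reverse(size(A.parent))
Base.getindex(A::LazyAdjoint, i::Int, j::Int) =
    A.conjugated ? conj(A.parent[j, i]) : A.parent[j, i]

# With such a wrapper, one generic mul! method can dispatch on whether
# either argument is wrapped, replacing At_mul_B!, Ac_mul_B!, and friends.
```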

@kmsquire
Member

Anything you could add to base/precompile.jl that might help?

@andreasnoack
Member

andreasnoack commented Apr 29, 2016

My impression is that precompilation is actually slower than "normal" compilation, so I'd expect the total compile time to be longer even though the time in the tests might be shorter.

@ViralBShah
Member

It would certainly also be helpful to understand the needs of the ML folks before taking a final call on many of the open linalg issues. That would put these improvements squarely past 1.0, and that may be ok.

cc @MikeInnes

@MikeInnes
Member

I will link the CUBLAS wrappers. You can see that there's a fair bit of redundancy with Base; given that CUBLAS has an API identical to BLAS, ideally we'd just swap out the final gemm! calls without ever touching the higher-level API.

That said it does work fine at present, and I expect that things will be improved with whatever changes are made – reducing the redundancy would just be icing.

I think it's actually a bigger deal for AD, where we have to code a derivative for every Ax_mul_Bx combination. Having a single mul! would be a huge improvement there.
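
The AD point can be sketched under the usual reverse-mode rule for matrix products (mul_pullback is an illustrative name, not any AD package's API): with lazy transposes, this single rule covers every Ax_mul_Bx combination, since transposed arguments are just wrapped matrices.

```julia
using LinearAlgebra

# Reverse-mode rule for C = A*B: given the output cotangent ΔC,
#   ΔA = ΔC * B'    and    ΔB = A' * ΔC
mul_pullback(ΔC, A, B) = (ΔC * B', A' * ΔC)
```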

@StefanKarpinski
Member

Is there some way we can isolate Base from this change? The special lowering to these function names is an issue, but we could provide the fallbacks now and override them later; that would leave the unfortunate lowering hack in place but still allow it to be fixed in the future.

@StefanKarpinski
Member

Related: JuliaLang/julia#23424.

@StefanKarpinski
Member

Our options at this point are:

  1. Keep A_mul_B and all that jazz and live with it until 2.0.

  2. Move a' etc. out of Base into LinAlg and fix this post-1.0; this means you'd have to do using LinAlg before you can write x'y for a dot product.

  3. Hassle @andyferris about finishing the lazy transpose and conjugate work that this requires.

@andyferris
Member

Hassle @andyferris about finishing the lazy transpose and conjugate work that this requires.

Noted. :)

@StefanKarpinski
Member

For what it's worth, I think this is more crucial than getting the indexing with associatives stuff sorted. The main action item there would simply be to iterate dictionaries by value. The most conservative potential change for 0.7 would be to just deprecate dict iteration altogether and force explicit choice of iteration by pairs(d), keys(d) or values(d). That leaves us in a position to choose any means of iteration we want in the future, including going back to pair iteration.

@JeffBezanson
Member

The more I think about it, the more I like the idea of requiring pairs(d), keys(d), or values(d) to iterate over dicts. Anything else just leads to surprises --- other popular languages do keys by default, and on the other hand many operations make the most sense on values or pairs. There just isn't one obvious right way to do it.

@vtjnash
Member

vtjnash commented Nov 21, 2017

What does that have to do with A_mul_B and all that jazz?

@quinnj
Member

quinnj commented Nov 21, 2017

What does that have to do with A_mul_B and all that jazz?

Because we're collectively prioritizing @andyferris's discretionary time and open-source efforts 😆

@andyferris
Member

@StefanKarpinski I came to the same conclusion - if someone else wants to volunteer for a ValueIterator for dicts and deprecating next(::Associative), I'd love to work on this stuff ;)

@JeffBezanson Getting even more off topic, I still feel the answer lies in what you want map(f, dict) and reduce(op, dict) to do (my answer: the most useful behavior for map (for manipulating data, especially after a group/groupby function) is to preserve keys and map values, just like for arrays (including offset arrays), and to reduce over values). These functions seem subservient to the iteration API - especially reduce.

@vtjnash Yes, it's a bit convoluted how this conversation ended up here! 😄

@vtjnash
Member

vtjnash commented Nov 22, 2017

Please take off-topic remarks elsewhere. This thread is about making implementations of A_mul_B better, not about making Associative worse.

@StefanKarpinski
Member

StefanKarpinski commented Nov 22, 2017

Please take off-topic remarks elsewhere.

As explained, these are not off-topic and have to do with work prioritization which affects this issue.

@Sacha0
Member

Sacha0 commented Nov 28, 2017

@JeffBezanson
Member

These functions have been removed 🎉 so I think we can close this.
