Multidimensional arrays #399

pkofod · 2017-04-21T07:21:06Z

Fix up (L-)BFGS NelderMead, and PSO code such that we can pass multidimensional arrays as input.

cortner · 2017-05-22T20:40:41Z

How about making the input an abstract type? At the end of the day the state of the optimisation just needs to be an element of a hilbert space

cortner · 2017-05-22T20:53:12Z

Also see @ChrisRackauckas comments in #326. I really see no barrier to admitting a general abstract type as the state of optimisation - only advantages.

pkofod · 2017-05-22T21:01:32Z

How abstract? AbstractArray?

cortner · 2017-05-22T21:05:38Z

Not even an AbstractArray. You actually need elements of a Hilbert space over some type T, i.e., you need x + y, t * x, dot(x, y) and I think that is it already. (I'm not even sure it needs to be a real Hilbert space?)

This is really not a necessity, BUT: it might actually make the code more readable, and it could have some unexpected applications. E.g., if I minimise over a space of functions, then I don't have to represent functions in subsequent optimisation steps w.r.t. the same basis.

ChrisRackauckas · 2017-05-22T21:13:00Z

This is really not a necessity, BUT: it might actually make the code more readable, and it could have some unexpected applications. E.g., if I minimise over a space of functions, then I don't have to represent functions in subsequent optimisation steps w.r.t. the same basis.

Yes, ApproxFun would be a very interesting application which takes the generality to the max here.

Not even an AbstractArray. You actually need elements of a Hilbert space, i.e., you need x + y, t * x, dot(x, y) and I think that is it already.

From BFGS, it looks like you need A_mul_B!, scale!, indexing, *, +, and vecdot. So if the indexing operation was replaced with a broadcast, then this all boils down to *, +, vecdot, and broadcast overloads for the type (guaranteed by indexing, but necessarily needing indexing). I believe Fun types may already satisfy this.

cortner · 2017-05-22T21:16:10Z

everything in BFGS-type algorithms can in principle be reduced to the operations I mentioned - even if the current implementation possibly doesn't. The problem here is that that LBFGS stores the outer product x * y' instead of the operator it defines

ChrisRackauckas · 2017-05-22T21:19:47Z

everything in BFGS-type algorithms can in principle be reduced to the operations I mentioned - even if the current implementation possibly doesn't.

Mathematically yes. But in reality, what I was trying to point out is that you're missing that either indexing or broadcast needs to be defined. Right now that's using indexing, hence an AbstractArray subtype. But that definitely has an easy change to broadcasting, in which case an AbstractArray isn't needed. Other than that we agree, since A_mul_B! and vecdot have generic fallbacks that use *, +, and hopefully broadcast.

cortner · 2017-05-22T21:22:46Z

and in addition CG of L-BFGS or a hessian-free TR or the various accelerated steepest descent methods really just need the small set I mentioned.

pkofod · 2017-05-23T01:14:56Z

Okay, I'm sold. If we need to do something that sacrifices performance for arrays, we can just have tightly and loosely typed methods

cortner · 2017-05-23T08:38:33Z

is there a performance benchmark suite? I'd be quite interested to do a prototype (say, CG or LBFGS) and confirm that the strict typing we use at the moment is not needed at all.

pkofod · 2017-05-23T08:50:26Z

is there a performance benchmark suite? I'd be quite interested to do a prototype (say, CG or LBFGS) and confirm that the strict typing we use at the moment is not needed at all.

Not tagged, but the code can be found at OptimTests.jl ..I should really get that fixed up and merged (will do after Optim is tagged)

ChrisRackauckas · 2017-05-23T14:29:41Z

I would be heavily surprised if, when you did this, the machine code changed one bit. It would only change if you change the indexing to broadcasting, in which case it's already pretty well documented what the (current) cost is there.

cortner · 2017-05-23T14:35:46Z

I agree, but there is lots of stuff like this in the code:

        @simd for i in 1:n
            @inbounds state.y[i] = gradient(d, i) - state.g_previous[i]
        end

which I think is appalling, so this will also be a good opportunity to get rid of it.

pkofod · 2017-05-23T16:07:12Z

which I think is appalling, so this will also be a good opportunity to get rid of it.

Agree, and I the time to calculate y .= gradient(d, i) .- state.g_previous would never dominate ldivs or matrix multiplications anyway.

schlichtanders · 2017-05-30T19:41:54Z

Stumbled upon the following (is this related?)

when trying to optimize a sum over multidimensional Rosenbrock function

f(x) = sum((1.0 .- x[1]).^2 .+ 100.0 .* (x[2] .- x[1].^2).^2)

I get the following error

optimize(f, [[12.0,1.0],[1.0,2.0]])
ERROR: MethodError: no method matching zero(::Type{Array{Float64,1}})
Closest candidates are:
  zero(::Type{Base.LibGit2.Oid}) at libgit2\oid.jl:88
  zero(::Type{Base.Pkg.Resolve.VersionWeights.VWPreBuildItem}) at pkg/resolve/versionweight.jl:80
  zero(::Type{Base.Pkg.Resolve.VersionWeights.VWPreBuild}) at pkg/resolve/versionweight.jl:120
  ...
 in initial_state(::Optim.NelderMead{Optim.AffineSimplexer,Optim.AdaptiveParameters}, ::Optim.Options{Void}, ::#f, ::Array{Array{Float64,1},1}) at C:\Users\s.sahm\.julia\v0.5\Optim\src\nelder_mead.jl:132
 in optimize(::Function, ::Array{Array{Float64,1},1}, ::Optim.NelderMead{Optim.AffineSimplexer,Optim.AdaptiveParameters}, ::Optim.Options{Void}) at C:\Users\s.sahm\.julia\v0.5\Optim\src\optimize.jl:172
 in #optimize#74(::Array{Any,1}, ::Function, ::#f, ::Array{Array{Float64,1},1}) at C:\Users\s.sahm\.julia\v0.5\Optim\src\optimize.jl:24
 in optimize(::#f, ::Array{Array{Float64,1},1}) at C:\Users\s.sahm\.julia\v0.5\Optim\src\optimize.jl:23

My final goal is to regard the subcomponents x[1], x[2], ... as completely independent, i.e. for instance x::Array{Array{T,1},1}, is this possible with Optim.jl in general?
And is this possible with autodiff in particular?

thanks a lot

pkofod · 2017-05-30T19:46:50Z

I don't quite get what you're trying to do. For example, do you realize that your provided point cannot be evaluated by f ?

julia> f( [[12.0,1.0],[1.0,2.0]])
ERROR: DimensionMismatch("Cannot multiply two vectors")
Stacktrace:
 [1] power_by_squaring(::Array{Float64,1}, ::Int64) at ./intfuncs.jl:166
 [2] f(::Array{Array{Float64,1},1}) at ./REPL[161]:1

ChrisRackauckas · 2017-05-30T19:52:13Z

Supporting arrays of arrays is a whole different ballgame. You need to make sure most internal calls will automatically be recursive. The repo

https://github.com/JuliaDiffEq/RecursiveArrayTools.jl

is the tools that are used in DiffEq to support this kind of stuff, but there's a substantial amount of "being careful with types" that has to be done to get recursive arrays working.

pkofod · 2017-05-30T20:46:01Z

You need to make sure most internal calls will automatically be recursive.

You also need to make sure you agree what it's supposed to mean... I mean, in the example above I'm a bit confused.

schlichtanders · 2017-05-31T08:49:46Z

@pkofod I am sorry for the confusing, it should be
f(x) = sum((1.0 .- x[1]).^2 .+ 100.0 .* (x[2] .- x[1].^2).^2) and I now also changed it above.
The error stays the same.

I would like to have an argument of type x::Array{Array{T,1},1} supported, because when using the alternative of a single flat vector, I would need to know the shapes of my single arguments in advance. If I define a custom function, I can have multiple parameters and auto-differentiate the complete function with respect to all parameters. Such multiple independend parameters are kind of what I wanted to regain with x::Array{Array{T,1},1}.
Such a setting allows for easily changing sizes of parameters during iteration for instance.

It would be really helpful if I can have something like multiple parameters (i.e. arguments) to my optimizable function. Is this still this issue, or rather something else? Looking forward to your thoughts.

timholy · 2017-05-31T09:01:46Z

If each array in the "outer" container is of the same size (as in this example), you almost surely want to use StaticArrays and make it an Vector{SVector{L,T}}. At least the call to zero will succeed if you do that; you might have to make some additional changes, though.

pkofod · 2017-05-31T09:02:14Z

It would be really helpful if I can have something like multiple parameters (i.e. arguments) to my optimizable function. Is this still this issue, or rather something else? Looking forward to your thoughts.

I think it is this issue, yes.

cortner · 2017-10-18T13:10:32Z

CC @ettersi

pkofod · 2017-10-18T13:12:32Z

Does @ettersi have a use case here?

ChrisRackauckas · 2017-10-18T13:31:03Z

I see I never answered about the recursive arrays. Let me be a little more clear about that.

If you do something like a .= b with arrays of arrays, then it's like a[i] = b[i] and thus the elements of the two arrays of arrays will now share references. The same happens with copy!. So you have to be very careful here. recursivecopy! from RecursiveArrayTools.jl and the VectorOfArrays broadcast implementation take care of these kinds of issues. What @timholy suggests is another way of handling it since SArrays are immutable and thus you don't need recursion because a[i] = b[i] will set the value of a[i] to the static array value of b[i] and not copy a reference. So in short, if everything is generic enough, Vector{SVector{L,T}} should already work without any changes, while Vector{Vector{T}} would need a few extra changes but the tools to do this are already created an well-tested.

I'm not sure if the full mutability makes too much sense for an optimization problem, diffeq does it because things "change over time" but I'm not sure that has much meaning here. But it would be something nice to handle correctly down the road.

As for other array types, things like GPUArrays (and support for internal multithreading via JLArrays) comes free if you A_mul_B! and change your loops to broadcasting as above. In DiffEq we found this has essentially no cost as long as you avoid JuliaLang/julia#22255 , so I would recommend just changing every element-wise loop whenever you have time. Looking at the code, it shouldn't be too difficult to get this all working with GPUArrays, which is nice because it's different than what other packages have. The "marketing" is like this. Most optimization packages allow you to "use" GPUs because you can GPU the call to the objective function. However, when setting this up the user has to always write a step that updates the parameters on the GPU given the update from the optimizer, i.e. there is a GPU load/unload required. But if you setup everything to work with GPUArrays, then internally everything important from the optimizer can stay on the GPU, leading to an optimizer which compiles itself to be "copy-free" and require no extra transferring arrays to/from the GPU. I think this is a huge win and something that can be really hammered home in a blog post, and I could see that getting a lot of interest.

One thing that is missing from the last point though is that GPUArrays don't fully support matrix factorizations and \ yet, so you may want to do all of these changes but then only test it out for now on gradient decent.

I can start over the weekend helping get this all setup. Maybe if you guys are also setup with GPUArrays my "please help me get factorizations working with GPUArrays!" call will be stronger.

ettersi · 2017-10-18T15:28:29Z

My use-case is optimising an (approximate) lattice of atoms, which I store as a Matrix{SVector{2,Float64}}. Currently, when I call

optimize(
         total_energyy, 
         forces!, 
         y, 
         ConjugateGradient()
        )

I get the following error:

ERROR: AssertionError: typeof(f_x) == T
 in initial_state(::Optim.ConjugateGradient{Void,Optim.##29#31,LineSearches.#hagerzhang!}, ::Optim.Options{Void}, ::Optim.OnceDifferentiable, ::Array{StaticArrays.SVector{2,Float64},2}) at /home/simon/.julia/v0.5/Optim/src/cg.jl:105
 in optimize(::Optim.OnceDifferentiable, ::Array{StaticArrays.SVector{2,Float64},2}, ::Optim.ConjugateGradient{Void,Optim.##29#31,LineSearches.#hagerzhang!}, ::Optim.Options{Void}) at /home/simon/.julia/v0.5/Optim/src/optimize.jl:172
 in optimize(::T.##6#9{CartesianIndex{2}}, ::T.##7#10{CartesianIndex{2}}, ::Array{StaticArrays.SVector{2,Float64},2}, ::Optim.ConjugateGradient{Void,Optim.##29#31,LineSearches.#hagerzhang!}) at /home/simon/.julia/v0.5/Optim/src/optimize.jl:58

I haven't really dug into the details of this error yet, though. It could well be that there is some subtle issue in my code, or that it's an issue that I am still on 0.5.

pkofod · 2017-10-18T16:59:08Z

Being on v05 will at least mean that you're not getting the latest updates, but supporting static arrray types are high on my priority list

ChrisRackauckas · 2017-10-18T17:02:17Z

Vectors of Static Arrays will have other issues though, since indexing gets you one static array each time, so defining a Jacobian and factorizing it won't work very well making it work with some algorithms but not too many. You'll want to wrap the Vector{SArray} in something like RecursiveArrayTools' VectorOfArray to really fix that (because then it will have proper indexing, so you can vec to get a view of the whole thing as a Vector and treat it likewise), unless Optim does a bunch of special handling.

ettersi · 2017-10-19T09:20:18Z

Solved the problem using reinterpret. Works kind of all right, although a solution without this extra typing would of course be nice. But I understand that this is open source code and if I want a feature I would have to do it myself.

pkofod · 2017-10-19T09:51:44Z

You don't necessarily have to do it yourself. A precise description of what you want goes a long way :)

pkofod · 2020-08-18T11:51:28Z

Fixed the original issue, if anything we can do comes up we can open a new issue.

pkofod mentioned this issue Jun 16, 2017

Fix input for (L)BFGS and PSO #431

Merged

cortner mentioned this issue Jun 20, 2017

Manifold optimization #433

Closed

This was referenced Jun 23, 2017

Complex optimization #438

Closed

Accept multidimensional arrays as input JuliaNLSolvers/LineSearches.jl#55

Merged

cortner mentioned this issue Dec 17, 2017

New NLSolversBase JuliaNLSolvers/NLSolversBase.jl#27

Merged

8 tasks

pkofod mentioned this issue Aug 16, 2020

Generic domains for inputs master issue pkofod/NLSolvers.jl#21

Open

pkofod closed this as completed Aug 18, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Multidimensional arrays #399

Multidimensional arrays #399

pkofod commented Apr 21, 2017 •

edited

Loading

cortner commented May 22, 2017 •

edited

Loading

cortner commented May 22, 2017

pkofod commented May 22, 2017

cortner commented May 22, 2017 •

edited

Loading

ChrisRackauckas commented May 22, 2017

cortner commented May 22, 2017

ChrisRackauckas commented May 22, 2017

cortner commented May 22, 2017

pkofod commented May 23, 2017

cortner commented May 23, 2017

pkofod commented May 23, 2017

ChrisRackauckas commented May 23, 2017

cortner commented May 23, 2017

pkofod commented May 23, 2017

schlichtanders commented May 30, 2017 •

edited

Loading

pkofod commented May 30, 2017

ChrisRackauckas commented May 30, 2017

pkofod commented May 30, 2017

schlichtanders commented May 31, 2017

timholy commented May 31, 2017

pkofod commented May 31, 2017

cortner commented Oct 18, 2017

pkofod commented Oct 18, 2017

ChrisRackauckas commented Oct 18, 2017

ettersi commented Oct 18, 2017

pkofod commented Oct 18, 2017

ChrisRackauckas commented Oct 18, 2017

ettersi commented Oct 19, 2017

pkofod commented Oct 19, 2017

pkofod commented Aug 18, 2020

Multidimensional arrays #399

Multidimensional arrays #399

Comments

pkofod commented Apr 21, 2017 • edited Loading

cortner commented May 22, 2017 • edited Loading

cortner commented May 22, 2017

pkofod commented May 22, 2017

cortner commented May 22, 2017 • edited Loading

ChrisRackauckas commented May 22, 2017

cortner commented May 22, 2017

ChrisRackauckas commented May 22, 2017

cortner commented May 22, 2017

pkofod commented May 23, 2017

cortner commented May 23, 2017

pkofod commented May 23, 2017

ChrisRackauckas commented May 23, 2017

cortner commented May 23, 2017

pkofod commented May 23, 2017

schlichtanders commented May 30, 2017 • edited Loading

pkofod commented May 30, 2017

ChrisRackauckas commented May 30, 2017

pkofod commented May 30, 2017

schlichtanders commented May 31, 2017

timholy commented May 31, 2017

pkofod commented May 31, 2017

cortner commented Oct 18, 2017

pkofod commented Oct 18, 2017

ChrisRackauckas commented Oct 18, 2017

ettersi commented Oct 18, 2017

pkofod commented Oct 18, 2017

ChrisRackauckas commented Oct 18, 2017

ettersi commented Oct 19, 2017

pkofod commented Oct 19, 2017

pkofod commented Aug 18, 2020

pkofod commented Apr 21, 2017 •

edited

Loading

cortner commented May 22, 2017 •

edited

Loading

cortner commented May 22, 2017 •

edited

Loading

schlichtanders commented May 30, 2017 •

edited

Loading