[RFC/WIP] Rework *DifferentiableFunction #337
Conversation
Codecov Report
@@ Coverage Diff @@
## master #337 +/- ##
==========================================
- Coverage 88.54% 87.54% -1.01%
==========================================
Files 29 28 -1
Lines 1633 1574 -59
==========================================
- Hits 1446 1378 -68
- Misses 187 196 +9
Continue to review full report at Codecov.
src/objective_types.jl
Outdated
n_x = length(x_seed)
f_calls = [0]
g_calls = [0]
function g!(x::Array, storage::Array)
Why not something like:
function g!(x::Array, storage::Array)
    calls = 0   # local counter; a `calls::Int` annotation may be needed to avoid Core.Box
    Calculus.finite_difference!(y -> (calls += 1; f(y)), x, storage, g_method.method)
    f_calls[1] += calls   # add the actual number of f evaluations to the shared counter
    return
end
to avoid having to predict how many times the gradient estimator evaluates the function. For automatic differentiation this is not always obvious, so this suggestion would remove that difficulty.
I will certainly do this! Thanks for that comment :)
In this PR you have to give the
Should this be part of Optim, or some separate package?
I think I'll put it in here first, but if it fits with NLsolve and LineSearches we can take it out and have it as a dependency.
I think LineSearches needs this, or will have to replicate the
I'm not removing or renaming the old fields, so I think it still works? Certainly the line searches still work in the tests here.
Okay, as long as we can still communicate the number of f- and g-calls back to Optim, it should be fine.
I think Optim.jl will keep track of that itself.
Currently, LineSearches spits out the additional calls to f and g and then we increment the counter in Optim. I would love for LineSearches to use the value and value_grad methods. So I am for the idea put forth, I just want this PR to be a bit more complete before doing it. It does require NLsolve to agree though. If not, we can just keep it as it is.
Ah, I see. Will this PR make it easier for us to prevent multiple evaluations of functions or gradients at the same points? (As discussed in JuliaNLSolvers/LineSearches.jl#10 and #288)
Of course, there's the simpler way where we just use the anonymous function trick as in the finite differences constructor for all incoming functions, and then LineSearches doesn't even have to worry about calls at all. Then we're back to
Yes, I will get to that soon. It will check internally if the last x is input again.
test/nelder_mead.jl
Outdated
@@ -2,7 +2,7 @@
# Test Optim.nelder_mead for all functions except Large Polynomials in Optim.UnconstrainedProblems.examples
for (name, prob) in Optim.UnconstrainedProblems.examples
f_prob = prob.f
res = Optim.optimize(f_prob, prob.initial_x, NelderMead(), Optim.Options(iterations = 10000))
@show res = Optim.optimize(f_prob, prob.initial_x, NelderMead(), Optim.Options(iterations = 10000))
Did you intend to commit the @show?
oops, thanks..
src/objective_types.jl
Outdated
g!(x_seed, g_x)
Differentiable(f, g!, fg!, f(x_seed), g_x, copy(x_seed), [1], [1])
end
function Differentiable{T}(f::Function, x_seed::Array{T}; method = :finitediff)
Is it important that typeof(f) <: Function? Would the code work if the type of f simply implements (f::MyType)(x) = ...? If so, we could consider not being overly restrictive here.
Not at all, and it will be removed. I know @ChrisRackauckas would appreciate them gone as well :)
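To illustrate (a minimal sketch; the Rosenbrock type below is hypothetical and not part of this PR), dropping the restriction would let any callable object serve as the objective, even though it is not a subtype of Function:
immutable Rosenbrock
    a::Float64
    b::Float64
end
# call overloading makes an instance behave like a function
(r::Rosenbrock)(x) = (r.a - x[1])^2 + r.b * (x[2] - x[1]^2)^2

r = Rosenbrock(1.0, 100.0)
r([0.0, 0.0])      # callable
isa(r, Function)   # false, so an f::Function restriction would reject it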
docs/src/algo/precondition.md
Outdated
@@ -39,7 +39,7 @@ using ForwardDiff
plap(U; n = length(U)) = (n-1)*sum((0.1 + diff(U).^2).^2 ) - sum(U) / (n-1)
plap1 = ForwardDiff.gradient(plap)
precond(n) = spdiagm((-ones(n-1), 2*ones(n), -ones(n-1)), (-1,0,1), n, n)*(n+1)
df = DifferentiableFunction(x -> plap([0; X; 0]),
df = Differentiable(x -> plap([0; X; 0]),
Does this also need the x_seed now?
Yes, that's why the docs box is not ticked yet :) But thanks for looking through the changes. This was just a search and replace :)
docs/src/user/tipsandtricks.md
Outdated
@@ -81,7 +81,7 @@ using Optim
initial_x = ...
buffer = Array(...) # Preallocate an appropriate buffer
last_x = similar(initial_x)
df = TwiceDifferentiableFunction(x -> f(x, buffer, initial_x),
df = TwiceDifferentiable(x -> f(x, buffer, initial_x),
x_seed?
note to self: maybe we should start adding some actual doctests
is this "check |
Thank you, it looks like a good way to deal with the double evaluation issues as far as I'm concerned. Are there situations where people would want to define such objective objects without an initial evaluation at |
well it depends on the general interface. Generally, I'm thinking that people in optim provide f g and h for example, and then this object is used internally. Then there is automatically a seed: the first x used for evaluation. |
Do these changes handle cases where the previous evaluation was only of the objective, or only of the gradient, but not both?
If not, maybe we would have to store last_x_f, last_x_g and last_x_h for the evaluation of f, g and h respectively.
Then for value_grad, there are different options for how to handle cases where last_x_g != last_x_f.
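For illustration, here is a minimal sketch of that bookkeeping (type, field, and function names are illustrative, not necessarily the PR's actual layout): the last input is cached separately for f and g, and re-evaluation is skipped when the same x comes in again.
type CachedObjective
    f
    g!
    f_x::Float64
    g_x::Vector{Float64}
    last_x_f::Vector{Float64}
    last_x_g::Vector{Float64}
    f_calls::Vector{Int}
    g_calls::Vector{Int}
end

function value(obj::CachedObjective, x)
    if x != obj.last_x_f              # only re-evaluate f if x changed since the last f-call
        copy!(obj.last_x_f, x)
        obj.f_x = obj.f(x)
        obj.f_calls[1] += 1
    end
    return obj.f_x
end

function gradient!(obj::CachedObjective, x)
    if x != obj.last_x_g              # the gradient cache is tracked independently of the f cache
        copy!(obj.last_x_g, x)
        obj.g!(x, obj.g_x)
        obj.g_calls[1] += 1
    end
    return obj.g_x
end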
src/objective_types.jl
Outdated
g_x, Array{T}(n_x, n_x), copy(x_seed), [1], [1], [0])
end

function value(obj, x)
Are there situations where one would want to call f(x) without updating last_x?
> If not, maybe we would have to store last_x_f, last_x_g and last_x_h for the evaluation of f, g and h respectively.

I certainly need to handle this!
Force-pushed from a5c025a to 67e3bb3.
src/bfgs.jl
Outdated
linesearch!::L
initial_invH::H
resetalpha::Bool
immutable BFGS <: Optimizer
Just curious, why was this changed from being parameterized to leaving them all as type Any?
(I usually recommend always using concrete types for type/immutable fields, parameterized if necessary.)
Probably because they don't want to recompile for each new function? But is the linesearch! function dependent on the user's function? If not, and it is usually the same function (some default), then it should be more strictly typed. Anyway, resetalpha should still be typed as a Bool.
I'd like to stop this comment thread now because @ScottPJones should probably be banned from the JuliaOpt org's repos in the same way that he is currently banned from the JuliaLang org's repos. For me (as the original creator of this specific project), it is very important to respect the Julia community stewards' decision to ban Scott.
src/objective_types.jl
Outdated
h!
f_x
g_x
h_storage
You have this parameterized in the constructor, maybe it should also be parameterized here, for better performance?
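As a rough illustration of the performance point (the two type names below are hypothetical, not from this PR): untyped fields are implicitly Any, so every access is dynamically dispatched, whereas parameterized fields stay concrete for each instance.
type LooselyTypedObjective        # fields default to Any
    f
    g!
    f_x
    g_x
end

type TightlyTypedObjective{T, F, G}   # concrete per instance, at the cost of extra specialization
    f::F
    g!::G
    f_x::T
    g_x::Vector{T}
end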
Sure
Makes sense.
I was just going to merge this actually, and then copy+paste the code to NLSolversBase.jl, just to have the change in the git history (so it is easier to follow what happened prior to moving code to another package).
I'm pushing some bug fixes tonight, and adding
src/objective_types.jl
Outdated
n_x = length(x_seed)
f_calls = [1]
g_calls = [1]
if method == :finitediff
If we add these methods here, maybe we can call *Differentiable(f, x_init; method) directly from the optimize functions instead of having separate code for generating g!, fg! (and h!) there? I'm referring to e.g.
Line 98 in 9d90b5a
else
good catch, I forgot that was also done there.
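A rough sketch of that idea (the Differentiable constructor and :finitediff keyword follow this PR; the exact optimize signatures here are assumed for illustration only):
function optimize(f, initial_x::Array, opt::Optimizer, options; method = :finitediff)
    d = Differentiable(f, initial_x; method = method)   # generates g! (and fg!) internally
    return optimize(d, initial_x, opt, options)          # dispatch to the objective-based path
end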
test/gradient_descent.jl
Outdated
@@ -2,18 +2,21 @@
for use_autodiff in (false, true)
Is use_autodiff used anywhere now?
Force-pushed from 1a399fd to c075c97.
Right, so just to update everyone, this is approaching its end. With this PR I am no longer targeting v0.4 (yet to be changed) and I won't carry the deprecations forward either. This means that this PR is heavily breaking, so tagging will have to be done with care (limits in metadata). I removed all deprecations as well. V0.4 was so long ago, v0.6 is coming soon, and shortly after we have v1.0. Post v1.0 I will be very happy to be very careful about backwards compatibility, but as far as I'm concerned, we're approaching a sprint towards v1.0 of JuliaLang, and some things will have to be done a little faster. The main reason is that we don't have 1-2 people sitting here full time, making sure that everything works across v0.2-v0.6, Mac, Linux, Windows, Amiga, and Raspberry Pi.
The main point is to create dispatch-based objective evaluations, and use them in a smarter way. For example, f_calls += 1 now follows a value(df, x) call. This actually caught a few places where we forgot to increment these counters even if a call was made. It's nowhere near done, but I'm just putting it up here for people to see. I didn't put [RFC] because I don't consider it anywhere near done, so I don't expect people to spend time reviewing it. However, I do accept comments and discussion at this point as well, if people want to dive in.

- (last_x_h, g_previous, etc.)
- Checking input x before calculating

I now have a way to avoid recalculating f for the same x. I'm not sure if this is too much, or if it addresses the things relevant to LineSearches.jl. This PR is introducing quite some machinery, and I am not exactly sure if it's a bit too much... But let's see where it takes us!

fixes/closes: #329 #306 #305 #287 #241 #219 #163
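For concreteness, a hypothetical usage sketch of the counting and caching behaviour described above (the constructor keyword matches the finite-difference constructor in this PR; exact counter values depend on the seed evaluation in the constructor):
rosenbrock(x) = (1.0 - x[1])^2 + 100.0 * (x[2] - x[1]^2)^2
df = Differentiable(rosenbrock, zeros(2); method = :finitediff)

value(df, [0.5, 0.5])   # evaluates f and increments df.f_calls
value(df, [0.5, 0.5])   # same x again: the cached f_x can be reused, no extra f call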