[RFC/WIP] Rework *DifferentiableFunction #337
Conversation
Codecov Report
@@ Coverage Diff @@
## master #337 +/- ##
==========================================
- Coverage 88.54% 87.54% -1.01%
==========================================
Files 29 28 -1
Lines 1633 1574 -59
==========================================
- Hits 1446 1378 -68
- Misses 187 196 +9
Continue to review full report at Codecov.
src/objective_types.jl
Outdated
n_x = length(x_seed)
f_calls = [0]
g_calls = [0]
function g!(x::Array, storage::Array)
Why not something like:
function g!(x::Array, storage::Array)
    calls = 0   # local counter; a `calls::Int` annotation may be needed to avoid Core.Box
    Calculus.finite_difference!(y -> (calls += 1; f(y)), x, storage, g_method.method)
    f_calls[1] += calls   # add the actual number of f evaluations to the shared counter
    return
end
to avoid having to predict how many times the gradient estimator evaluates the function. For automatic differentiation this is not always obvious, so this suggestion would remove that difficulty.
I will certainly do this! Thanks for that comment :)
In this PR you have to give the
Should this be part of Optim, or some separate package?
I think I'll put it in here first, but if it fits with NLsolve and LineSearches we can take it out and have it as a dependency.
I think LineSearches needs this, or will have to replicate the
I'm not removing or renaming the old fields, so I think it still works? Certainly the line searches still work in the tests here.
Okay, as long as we can still communicate the number of f- and g-calls back to Optim, it should be fine.
I think Optim.jl will keep track of that itself.
Currently, LineSearches spits out the additional calls to f and g and then we increment the counter in Optim. I would love for LineSearches to use the value and value_grad methods. So I am for the idea put forth, I just want this PR to be a bit more complete before doing it. It does require NLsolve to agree though. If not, we can just keep it as it is.
Ah, I see. Will this PR make it easier for us to prevent multiple evaluations of functions or gradients at the same points? (As discussed in JuliaNLSolvers/LineSearches.jl#10 and #288)
Of course, there's the simpler way where we just use the anonymous function trick as in the finite differences constructor for all incoming functions, and then LineSearches doesn't even have to worry about calls at all. Then we're back to
Yes, I will get to that soon. It will check internally if the last x is input again.
test/nelder_mead.jl
Outdated
@@ -2,7 +2,7 @@
# Test Optim.nelder_mead for all functions except Large Polynomials in Optim.UnconstrainedProblems.examples
for (name, prob) in Optim.UnconstrainedProblems.examples
f_prob = prob.f
res = Optim.optimize(f_prob, prob.initial_x, NelderMead(), Optim.Options(iterations = 10000))
@show res = Optim.optimize(f_prob, prob.initial_x, NelderMead(), Optim.Options(iterations = 10000))
Did you intend to commit the @show?
oops, thanks..
src/objective_types.jl
Outdated
g!(x_seed, g_x)
Differentiable(f, g!, fg!, f(x_seed), g_x, copy(x_seed), [1], [1])
end
function Differentiable{T}(f::Function, x_seed::Array{T}; method = :finitediff)
Is it important that typeof(f) <: Function? Would the code work if the type of f simply implements (f::MyType)(x) = ...? If so, we could consider not being overly restrictive here.
Not at all, and it will be removed. I know @ChrisRackauckas would appreciate them gone as well :)
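To illustrate (a minimal sketch; the Rosenbrock type below is hypothetical and not part of this PR), dropping the restriction would let any callable object serve as the objective, even though it is not a subtype of Function:
immutable Rosenbrock
    a::Float64
    b::Float64
end
# call overloading makes an instance behave like a function
(r::Rosenbrock)(x) = (r.a - x[1])^2 + r.b * (x[2] - x[1]^2)^2

r = Rosenbrock(1.0, 100.0)
r([0.0, 0.0])      # callable
isa(r, Function)   # false, so an f::Function restriction would reject it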
docs/src/algo/precondition.md
Outdated
@@ -39,7 +39,7 @@ using ForwardDiff
plap(U; n = length(U)) = (n-1)*sum((0.1 + diff(U).^2).^2 ) - sum(U) / (n-1)
plap1 = ForwardDiff.gradient(plap)
precond(n) = spdiagm((-ones(n-1), 2*ones(n), -ones(n-1)), (-1,0,1), n, n)*(n+1)
df = DifferentiableFunction(x -> plap([0; X; 0]),
df = Differentiable(x -> plap([0; X; 0]),
Does this also need the x_seed now?
Yes, that's why the docs box is not ticked yet :) But thanks for looking through the changes. This was just a search and replace :)
docs/src/user/tipsandtricks.md
Outdated
@@ -81,7 +81,7 @@ using Optim
initial_x = ...
buffer = Array(...) # Preallocate an appropriate buffer
last_x = similar(initial_x)
df = TwiceDifferentiableFunction(x -> f(x, buffer, initial_x),
df = TwiceDifferentiable(x -> f(x, buffer, initial_x),
x_seed?
note to self: maybe we should start adding some actual doctests
is this "check |
Thank you, it looks like a good way to deal with the double evaluation issues as far as I'm concerned. Are there situations where people would want to define such objective objects without an initial evaluation at |
well it depends on the general interface. Generally, I'm thinking that people in optim provide f g and h for example, and then this object is used internally. Then there is automatically a seed: the first x used for evaluation. |
Do these changes handle cases where the previous evaluation was only of the objective, or only of the gradient, but not both?
If not, maybe we would have to store last_x_f, last_x_g and last_x_h for the evaluation of f, g and h respectively.
Then for value_grad, there are different options for how to handle cases where last_x_g != last_x_f.
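For illustration, here is a minimal sketch of that bookkeeping (type, field, and function names are illustrative, not necessarily the PR's actual layout): the last input is cached separately for f and g, and re-evaluation is skipped when the same x comes in again.
type CachedObjective
    f
    g!
    f_x::Float64
    g_x::Vector{Float64}
    last_x_f::Vector{Float64}
    last_x_g::Vector{Float64}
    f_calls::Vector{Int}
    g_calls::Vector{Int}
end

function value(obj::CachedObjective, x)
    if x != obj.last_x_f              # only re-evaluate f if x changed since the last f-call
        copy!(obj.last_x_f, x)
        obj.f_x = obj.f(x)
        obj.f_calls[1] += 1
    end
    return obj.f_x
end

function gradient!(obj::CachedObjective, x)
    if x != obj.last_x_g              # the gradient cache is tracked independently of the f cache
        copy!(obj.last_x_g, x)
        obj.g!(x, obj.g_x)
        obj.g_calls[1] += 1
    end
    return obj.g_x
end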
src/objective_types.jl
Outdated
g_x, Array{T}(n_x, n_x), copy(x_seed), [1], [1], [0])
end

function value(obj, x)
Are there situations where one would want to call f(x) without updating last_x?
> If not, maybe we would have to store last_x_f, last_x_g and last_x_h for the evaluation of f, g and h respectively.

I certainly need to handle this!
Force-pushed from a5c025a to 67e3bb3.
src/bfgs.jl
Outdated
linesearch!::L
initial_invH::H
resetalpha::Bool
immutable BFGS <: Optimizer
Just curious, why was this changed from being parameterized to leaving them all as type Any?
(I usually recommend always using concrete types for type/immutable fields, parameterized if necessary.)
Probably because they don't want to recompile for each new function? But is the linesearch! function dependent on the user's function? If not, and it is usually the same function (some default), then it should be more strictly typed. Anyway, resetalpha should still be typed as a Bool.
I'd like to stop this comment thread now because @ScottPJones should probably be banned from the JuliaOpt org's repos in the same way that he is currently banned from the JuliaLang org's repos. For me (as the original creator of this specific project), it is very important to respect the Julia community stewards' decision to ban Scott.
src/objective_types.jl
Outdated
h!
f_x
g_x
h_storage
You have this parameterized in the constructor, maybe it should also be parameterized here, for better performance?
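As a rough illustration of the performance point (the two type names below are hypothetical, not from this PR): untyped fields are implicitly Any, so every access is dynamically dispatched, whereas parameterized fields stay concrete for each instance.
type LooselyTypedObjective        # fields default to Any
    f
    g!
    f_x
    g_x
end

type TightlyTypedObjective{T, F, G}   # concrete per instance, at the cost of extra specialization
    f::F
    g!::G
    f_x::T
    g_x::Vector{T}
end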
Sure
Makes sense.
I was just going to merge this actually, and then copy+paste the code to NLSolversBase.jl, just to have the change in the git history (so it is easier to follow what happened prior to moving code to another package).
I'm pushing some bug fixes tonight, and adding
src/objective_types.jl
Outdated
n_x = length(x_seed)
f_calls = [1]
g_calls = [1]
if method == :finitediff
If we add these methods here, maybe we can call *Differentiable(f, x_init; method) directly from the optimize functions instead of having separate code for generating g!, fg! (and h!) there? I'm referring to e.g.
Line 98 in 9d90b5a
else
good catch, I forgot that was also done there.
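A rough sketch of that idea (the Differentiable constructor and :finitediff keyword follow this PR; the exact optimize signatures here are assumed for illustration only):
function optimize(f, initial_x::Array, opt::Optimizer, options; method = :finitediff)
    d = Differentiable(f, initial_x; method = method)   # generates g! (and fg!) internally
    return optimize(d, initial_x, opt, options)          # dispatch to the objective-based path
end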
test/gradient_descent.jl
Outdated
@@ -2,18 +2,21 @@
for use_autodiff in (false, true)
Is use_autodiff used anywhere now?
Force-pushed from 1a399fd to c075c97.
Right, so just to update everyone, this is approaching its end. With this PR I am no longer targeting v0.4 (yet to be changed) and I won't carry the deprecations forward either. This means that this PR is heavily breaking, so tagging will have to be done with care (limits in metadata). I removed all deprecations as well. V0.4 was so long ago, v0.6 is coming soon, and shortly after we have v1.0. Post v1.0 I will be very happy to be very careful about backwards compatibility, but as far as I'm concerned, we're approaching a sprint towards v1.0 of JuliaLang, and some things will have to be done a little faster. The main reason is that we don't have 1-2 people sitting here full time, making sure that everything works across v0.2-v0.6, Mac, Linux, Windows, Amiga, and Raspberry Pi.
The main point is to create dispatch-based objective evaluations, and use them in a smarter way. For example, f_calls += 1 now follows a value(df, x) call. This actually caught a few places where we forgot to increment these counters even if a call was made. It's nowhere near done, but I'm just putting it up here for people to see. I didn't put [RFC] because I don't consider it anywhere near done, so I don't expect people to spend time reviewing it. However, I do accept comments and discussion at this point as well, if people want to dive in.

- (last_x_h, g_previous, etc.)
- Checking input x before calculating

I now have a way to avoid recalculating f for the same x. I'm not sure if this is too much, or if it addresses the things relevant to LineSearches.jl. This PR is introducing quite some machinery, and I am not exactly sure if it's a bit too much... But let's see where it takes us!

fixes/closes: #329 #306 #305 #287 #241 #219 #163
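For concreteness, a hypothetical usage sketch of the counting and caching behaviour described above (the constructor keyword matches the finite-difference constructor in this PR; exact counter values depend on the seed evaluation in the constructor):
rosenbrock(x) = (1.0 - x[1])^2 + 100.0 * (x[2] - x[1]^2)^2
df = Differentiable(rosenbrock, zeros(2); method = :finitediff)

value(df, [0.5, 0.5])   # evaluates f and increments df.f_calls
value(df, [0.5, 0.5])   # same x again: the cached f_x can be reused, no extra f call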