Backtracking line search with interpolation #7
Current coverage is 48.67% (diff: 94.73%)
@@ master #7 diff @@
==========================================
Files 7 7
Lines 591 606 +15
Methods 0 0
Messages 0 0
Branches 0 0
==========================================
+ Hits 281 295 +14
- Misses 310 311 +1
Partials 0 0
# provided that c1 < 1/2; the backtrack_condition at the beginning
# of the function guarantees at least a backtracking factor rho.
alpha1 = - (gxp * alpha) / ( 2.0 * ((f_x_scratch - f_x)/alpha - gxp) )
alpha = max(alpha1, alpha * min(0.25, rho)) # avoid minuscule steps
Is there a heuristic for choosing 0.25 over other numbers?
Only personal experience - do you want to make it a parameter?
Yes, I think a parameter `rhomin` defaulted to 0.25 can be useful.
EDIT: I'm not sure if `rhomin` is the best name :p
I'll call the parameter mindecrease
mindecfact
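For readers following this thread, here is a minimal sketch of the safeguarded step with the constant 0.25 exposed as a keyword argument. It is an illustration in current Julia, not the code merged in this PR: the function name `interp_step`, the argument names `f0`, `g0` and `falpha`, and the default `rho = 0.9` are assumptions. Only the interpolation formula and the name `mindecfact` come from the diff and the discussion above.

```julia
# Hypothetical helper (not the PR's code): one safeguarded interpolation step.
#   alpha  : trial step that has just been rejected
#   f0     : objective value at step 0            (f_x in the diff)
#   g0     : directional derivative at step 0     (gxp in the diff), assumed < 0
#   falpha : objective value at the rejected step (f_x_scratch in the diff)
function interp_step(alpha, f0, g0, falpha; rho = 0.9, mindecfact = 0.25)
    # Minimiser of the quadratic matching f0 and g0 at 0 and falpha at alpha
    alpha1 = -(g0 * alpha) / (2.0 * ((falpha - f0) / alpha - g0))
    # Never shrink the step by more than min(mindecfact, rho) in one go
    return max(alpha1, alpha * min(mindecfact, rho))
end
```

A smaller `mindecfact` lets the interpolation shrink the step more aggressively in a single iteration; a larger one keeps the behaviour closer to plain backtracking.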
c1 = backtrack_condition
end
if rho <= 0.25
warn("rho <= 0.25; revert to standard backtracking")
It seems to me that we only revert back to standard backtracking if `alpha1 < alpha * min(0.25, rho)`. Is that correct, or have I misunderstood?
`interpbacktrack` ensures that at each step `alpha` is decreased by at least a factor `rho`. So if `rho < 0.25` (or, `rho <= mindecrease`) then it will be standard backtracking with factor `rho`.
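The guaranteed decrease can be checked from the interpolation formula in the diff. The following short derivation is a sketch; the symbols $\varphi$ and $q$ are notation introduced here, not names from the PR. Write $\varphi(\alpha) = f(x + \alpha s)$, so that $\varphi(0)$ is `f_x`, $\varphi'(0) < 0$ is `gxp`, and $\varphi(\alpha)$ is `f_x_scratch`, and assume the Armijo test has just failed at $\alpha$.

$$
q(t) = \varphi(0) + \varphi'(0)\,t + \frac{\varphi(\alpha) - \varphi(0) - \varphi'(0)\,\alpha}{\alpha^{2}}\,t^{2},
\qquad
\alpha_1 = \arg\min_t q(t) = \frac{-\varphi'(0)\,\alpha}{2\left(\dfrac{\varphi(\alpha) - \varphi(0)}{\alpha} - \varphi'(0)\right)}
$$

$$
\varphi(\alpha) > \varphi(0) + c_1\,\alpha\,\varphi'(0)
\;\Longrightarrow\;
\frac{\varphi(\alpha) - \varphi(0)}{\alpha} - \varphi'(0) > (1 - c_1)\,\lvert\varphi'(0)\rvert
\;\Longrightarrow\;
\alpha_1 < \frac{\alpha}{2\,(1 - c_1)}
$$

Hence $\alpha_1 < \rho\,\alpha$ whenever $c_1 \le 1 - \frac{1}{2\rho}$, which is the `backtrack_condition` bound used in the diff; as $\rho \to 1$ this bound approaches the $c_1 < 1/2$ mentioned in the code comment.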
Thanks! I have found the current backtracking algorithm to be poor-performing, so this is great. An alternative to removing backtracking could be to make the interp flag true by default?
The problem with just setting interp=true is that it then gets awkward to pass backtracking as an option. (In reply to Asbjørn Nilsen Riseth's comment of 9 Oct 2016, 19:12.)
Funnily enough I've not seen this anywhere, hence I mentioned it in an appendix of a paper where I used it. But this certainly doesn't mean others haven't done it; it is an extremely simple and natural idea. Maybe if you run into Nick Gould at some point, ask him? P.S.: I meant I'd feel a bit embarrassed citing my own paper for this. But if you want I can add more documentation.
if interp # this means we are coming from interpbacktrack_linesearch!
backtrack_condition = 1.0 - 1.0/(2*rho) # want guaranteed backtrack factor
if c1 >= backtrack_condition
warning("""The Armijo constant c1 is too large; replacing it with
Should this be `warn`? I couldn't find a `warning` function in `Base` on Julia 0.5.
right - thank you for catching this.
# Store angle between search direction and gradient
gxp = vecdot(gr_scratch, s)
# read f_x and slope from LineSearchResults
This is useful. It seems like `Optim` is always passing in an up-to-date `LineSearchResults`.
@KristofferC: Will taking `f_x` and `gxp` from `lsr` break `NLOpt`'s or `NLSolve`'s use of line searches, or is this fine?
Btw, I think this needs to be cleaned up in other linesearches as well. According to @pkofod only HZ uses this so far.
It should probably be OK. Might make our hack for this unnecessary in NLsolve.jl.
I had a look in Nocedal and Wright - Numerical Optimization. They go one step further:
great - I didn’t realise they discussed this. I specifically don’t want to use the cubic though. I’ve found this quadratic interpolation to be extremely robust. But if we can introduce the linesearch options, then we can just let the user choose which interpolation to use.
I added the reference now. It would be good to add the cubic interpolation as well. Hopefully we can do that in a separate pull request after moving to LineSearchOptions.
Great, I'll merge later today.
* added a backtracking line search with interpolation
* mindecfact and some documentation
* changed `warning to
* added reference
This PR adds a second backtracking line search, `interpbacktrack_linesearch!`, which performs an interpolation step instead of an `a *= rho` update. This is highly efficient for some problems; see QCG and QLBFGS in the following figure (from my own research). It also outperforms the HZ and MT line searches for some more standard model problems. In general, it is no worse than standard backtracking, which I left only for the sake of compatibility. I would actually recommend removing standard backtracking altogether and replacing it with `interpbacktrack`.
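To make the description concrete, here is a minimal, self-contained sketch of a backtracking line search with quadratic interpolation, written in current Julia. It is not the PR's `interpbacktrack_linesearch!`: the function name `interp_backtrack`, its signature, the defaults, and the restriction `0.5 < rho < 1` are assumptions made for illustration. The step formula, the cap on `c1`, and the `mindecfact` safeguard follow the diff excerpts quoted in the review threads.

```julia
using LinearAlgebra: dot

# Hypothetical sketch: Armijo backtracking where a rejected step is replaced
# by the minimiser of a quadratic interpolant instead of alpha *= rho.
function interp_backtrack(f, x, s, fx, g; alpha = 1.0, c1 = 0.2, rho = 0.9,
                          mindecfact = 0.25, maxiter = 100)
    0.5 < rho < 1 || throw(ArgumentError("this sketch assumes 0.5 < rho < 1"))
    gxp = dot(g, s)                        # slope of f along s at x
    gxp < 0 || throw(ArgumentError("s must be a descent direction"))
    # Cap c1 so every interpolated step shrinks alpha by at least a factor rho
    c1 = min(c1, 1.0 - 1.0 / (2 * rho))
    for _ in 1:maxiter
        falpha = f(x .+ alpha .* s)
        # Armijo (sufficient decrease) test
        falpha <= fx + c1 * alpha * gxp && return alpha, falpha
        # Minimiser of the quadratic through fx, gxp and falpha
        alpha1 = -(gxp * alpha) / (2.0 * ((falpha - fx) / alpha - gxp))
        # Safeguard against very small steps
        alpha = max(alpha1, alpha * min(mindecfact, rho))
    end
    error("no step satisfying the Armijo condition was found")
end

# Usage on a quartic, searching along the steepest-descent direction
f(x) = sum(x .^ 4)
x = [1.0, -2.0]
g = 4 .* x .^ 3
alpha, falpha = interp_backtrack(f, x, -g, f(x), g)
```

The returned pair is the accepted step and the objective value there, so the caller does not need to re-evaluate `f` at the new point.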