Modified Newton using PositiveFactorizations #177
Conversation
iteration += 1

# Search direction is always the negative gradient divided by
# the "closest" positive definite matrix to H. This step is
"closest positive definite matrix to H" isn't quite accurate (the point of the discussion in #153 (comment)). More accurate would be to say "a matrix encoding the absolute values of the curvatures represented by H."
I copied your formulation and directed interested readers to #153.
Like you, I don't see a good reason to have an algorithm that breaks on a large fraction of problems, so I agree this should just be the default for our Newton's method.
Yes, it seems kind of odd. As long as we document that it does not just take the Hessian and perform
@johnmyleswhite what is your stance on simply using PositiveFactorizations in our "regular" Newton's method, re the discussions above and elsewhere in the issues?
I'll completely defer to Tim here, but it seems clear that we shouldn't default to a version of Newton's method that doesn't work on lots of problems.
Force-pushed from cd650cc to 3178a1e
Okay; switched gears and simply patched the original Newton's method. As it were, it was not useful for much more than illustrating the original idea behind the algorithm. In practice, there is really no good reason not to handle concave regions intelligently, and I think
Converging is better than not converging. 👍
Since there are no objections, I'm merging this. Currently, it seems to be our best bet for ensuring positive definiteness, and it fixes something that has generated quite a few issues in the past.
Isn't this the same as the "saddle-free Newton method" suggested in this paper: https://arxiv.org/abs/1406.2572?
I have no idea where I pulled the other request from in #123. Made a branch here instead.
Thanks for the wonderful tool @timholy!
I'm thinking we really just do this by default, but that's of course something we need to discuss.