New NLSolversBase #27
Conversation
Codecov Report

```diff
@@            Coverage Diff             @@
##           master      #27      +/-   ##
===========================================
- Coverage   84.41%   70.54%   -13.88%
===========================================
  Files           3        8        +5
  Lines          77      129       +52
===========================================
+ Hits           65       91       +26
- Misses         12       38       +26
```

Continue to review the full report at Codecov.
At some point I had
Might as well rename this package. (edit: this is not an actual suggestion)
I like this change

(I mean the PR, not sure about renaming)
```julia
f     # objective
df    # (partial) derivative of objective
fdf   # objective and (partial) derivative of objective
F::TF # cache for f output
```
Should `f`, `df`, `fdf` have type parameters? Otherwise this causes a type instability?
Same for `TwiceDifferentiable`.
They could, but they weren't. It would require recompilation for each new instance, which might be an annoyance if people create many anonymous functions. We could allow for both, though.
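To make the tradeoff concrete, here is a minimal sketch; these struct definitions are hypothetical, not the actual NLSolversBase types:

```julia
# Hypothetical sketch of the two designs discussed above.

# Untyped field: one wrapper type covers every objective, but calls through
# `d.f` are not inferable (dynamic dispatch on each evaluation).
mutable struct UntypedDifferentiable{TF}
    f         # objective; field type is Any
    F::TF     # cache for f output
end

# Parameterized field: calls through `d.f` are fully inferable, but a fresh
# specialization is compiled for each new (anonymous) function passed in.
mutable struct TypedDifferentiable{Tf, TF}
    f::Tf
    F::TF
end
```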
Personally, I don't care - my objectives are very expensive. This was more targeted towards the use-case of many cheap optimisations. But you are right, if one has to recompile each time then this is not helpful.
Exactly. It's fine to compile for say each input type (Float64, Vector ...) but I do get the impression that a lot of people are iteratively/repeatedly creating anonymous functions that they minimize in fractions of seconds (but maybe thousands of times), and this would seriously hurt them. As I said, we can have a special type for the non-default type if necessary.
Anyway, the fact that you cannot infer the output type isn't really too bad, as we always know exactly what the type of the cache variable is. This means that even if the function ends up spitting out Float32's, it'll automatically be promoted to Float64's, and the majority of the code will be compiled under the assumption that it is indeed Float64s.
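A small sketch of that mechanism (the `Cache` struct here is hypothetical, just to show the point): storing into a typed field converts, so the seeded cache type wins over the objective's return type.

```julia
# Hypothetical sketch: the cache's type pins down what downstream code sees.
mutable struct Cache{TF}
    F::TF                        # cache for the objective's output
end

f(x) = Float32(sum(abs2, x))     # objective happens to return Float32

c = Cache(0.0)                   # cache seeded as Float64
c.F = f([1.0, 2.0])              # setproperty! converts the Float32 result
typeof(c.F)                      # Float64
```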
As far as I can see the type restrictions are sufficiently general to allow Optim.jl#399 to be eventually implemented?
Finally, if I see correctly, this still requires that I pass in
Yes, there should be quite wide input possibilities. It's only some of the convenience constructors that have restricted annotations (to capture most normal users' use cases).
Yes. I didn't realize it was such a big "feature request" to have this baked in. I guess I figured people would just do the copying themselves.
Personally, I find it quite annoying to write out the copying. It could really be hidden so easily in
Only because you (unexpectedly to me) don't use the storage
it just doesn't cost anything

that's really all I am hoping for
I'm not sure it's true that allocations are always "free", but if the majority of users want simple `f(x)`, `g(x)`, `h(x)` default interfaces, I'm 💯 comfortable having that as the default.
I was only referring to my use case.
My bad, I read it as a general observation. But sure, I do realize that the more work being done per call, the smaller the advantage. But I agree, both versions should be easy to use.
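For context, a hedged sketch of the copying being discussed (the wrapper is illustrative, not an actual NLSolversBase internal): an allocating `g(x)` can be adapted to the in-place `g!(G, x)` form without the user ever writing the copy.

```julia
# Illustrative only: hide the copy by wrapping an allocating gradient.
g(x) = 2 .* x                   # user's allocating gradient
g!(G, x) = copyto!(G, g(x))     # wrapper a package could generate internally

G = zeros(3)
g!(G, [1.0, 2.0, 3.0])          # G is now [2.0, 4.0, 6.0]
```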
Great, all unified and ready to go 👍 Sorry about the late feedback...
I remember there was a discussion about caching (on a lower level) on gitter or in some issue. Is that still up for grabs?
```julia
function TwiceDifferentiable(f, g!, h!, x::AbstractVector, F = real(zero(eltype(x))), G = similar(x))
```
Does this mean that our x-input needs to be a `Vector` if we don't supply `fg!`?
```julia
                       [0,], [0,], [0,])
end

function TwiceDifferentiableHV(f, fdf, h, x::AbstractVector{T}) where T
```
AbstractVector vs AbstractArray?
You're right. I was lazy and just wanted to move it here. It's still unexported in Optim so we can do whatever we want :) But yes, that should be generalized (and tested on the generalization!)
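As a sketch of that generalization (constructor body elided), only the annotation needs to change:

```julia
# Sketch: relax AbstractVector to AbstractArray so e.g. matrix-shaped inputs
# work too; the body already relies only on generic operations such as
# similar(x) and eltype(x).
function TwiceDifferentiableHV(f, fdf, h, x::AbstractArray{T}) where T
    # ... same body as the AbstractVector method ...
end
```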
Cache of what? I can't remember the specific discussion.
I believe it was in the sense of memoization. It was either @cortner or @ChrisRackauckas who was talking about this, but I may have misinterpreted what they were requesting.
I wonder if we can cater to people who want that in a sufficiently generic fashion, or if that should just be part of their actual objective function definition, probably easiest as a callable type. We have it in the sense of the "last" evaluation.

edit: I should give a small remark about the fact that I had considered splitting the caches out from the NDifferentiable functions. I do not intend to do that anymore. I tried it, and it just complicated the API too much. The NDifferentiable was basically useless without the caches, and just ended up as a glorified tuple. So my new view is that people should just ignore the NDifferentiable stuff UNLESS they want the caches specifically.
I think that might have been the point. If you keep more history, then it would remove the need to store intermediate values in, say, linesearches. But I'm not convinced this makes the code so much more readable...
We could do something along those lines, I just wasn't aware of that discussion or forgot. Shouldn't be difficult to implement, we just use a queue instead of a vector/scalar for
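A hedged sketch of that queue idea; `HistoryCache` and `push_eval!` are hypothetical names, not proposed API:

```julia
# Hypothetical sketch: keep the last few evaluations instead of only the
# most recent one, queue-style (oldest entry evicted first).
struct HistoryCache{Tx, TF}
    xs::Vector{Tx}    # inputs of recent evaluations
    Fs::Vector{TF}    # corresponding outputs
    maxlen::Int       # how much history to retain
end
HistoryCache{Tx, TF}(maxlen::Int) where {Tx, TF} = HistoryCache(Tx[], TF[], maxlen)

function push_eval!(c::HistoryCache, x, F)
    push!(c.xs, copy(x))
    push!(c.Fs, F)
    if length(c.xs) > c.maxlen
        popfirst!(c.xs)           # evict the oldest evaluation
        popfirst!(c.Fs)
    end
    return c
end
```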
What
This PR brings in the OnceDifferentiable constructors from Optim and the DifferentiableMultivariateFunction type (now also just OnceDifferentiable) into NLSolversBase.
Why
I wanted to get NLsolve up-to-date, and one thing that annoyed me was that `*DifferentiableMultivariateFunction` has an annoying name: the special thing is that it's a multivalued function (Optim objectives are also multivariate), but we can just have dispatch handle that; it's a very long name, and the sparse versions are even longer. I think it will make Optim and NLsolve seem much more like something coming from the same org if we avoid type/functionality duplication. Pulling it here is needed in order to have `using Optim, NLsolve` play nice (as they're exported).
As I said, I think it's very valuable to share types among Optim, NLsolve, and even LsqFit (LM would be using the same structure as NLsolve, for example) in the future where it makes sense. Constructors need only be defined in NLSolversBase if they are not type restricted; only autodiff functionality is defined in each package (mainly to avoid unnecessary dependencies).

It allows for conventions and sharing of ideas between the packages much better, and we only have to test this stuff in one place.
This is Julia, we don't need a separate type for each Jacobian type, each gradient type, or each array type in general.
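To illustrate with a hedged sketch (the `Wrapper` type and `describe` function are hypothetical): one parametric type can hold a dense or a sparse Jacobian cache, and sparsity-specific behavior falls out of dispatch rather than a separate `Sparse*` type name.

```julia
using SparseArrays

# Hypothetical sketch: a single parametric wrapper instead of per-Jacobian types.
struct Wrapper{TJ <: AbstractMatrix}
    J::TJ    # Jacobian cache; dense or sparse, same wrapper
end

describe(w::Wrapper{<:SparseMatrixCSC}) = "sparse Jacobian cache"
describe(w::Wrapper)                    = "dense Jacobian cache"

describe(Wrapper(zeros(2, 2)))    # "dense Jacobian cache"
describe(Wrapper(spzeros(2, 2)))  # "sparse Jacobian cache"
```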
What's new
The syntax is different now as we need to handle both R^N -> R and R^N -> R^N cases. We used to seed with `x`, but obviously we need to seed `F` as well now, and not through the type name. This is handled with positional arguments.
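As a hedged usage sketch of that seeding convention (exact signatures may differ from what was finally merged):

```julia
using NLSolversBase

# R^N -> R (Optim-style objective): F seeded as a scalar
od_scalar = OnceDifferentiable(x -> sum(abs2, x), zeros(3), 0.0)

# R^N -> R^N (NLsolve-style residual): F seeded as a vector
f!(F, x) = (F .= x .^ 2 .- 1)
od_vector = OnceDifferentiable(f!, zeros(3), zeros(3))
```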
What's missing
This required some restructuring, but I actually think it works quite well. I have a PR for Optim and NLsolve as well (that I will make after this one). Those PRs work and have passing tests, but will break people's code quite hard if they use `OnceDifferentiable`, `DifferentiableMultivariateFunction`, and very much the sparse multivariates. The great thing is that they used to be called different things, so we can actually deprecate this quite smoothly!

Another positional...
... isn't that annoying? Well, no, I don't really think so. I think of these types as two things: internal mechanisms for passing around caches and objective-related functions, and a mechanism for advanced users to keep caches between repeated optimizations/solves. If you really need and want that, I'm sure you can handle writing everything out.
If it isn't clear, it is not my intention that the normal user who just minimizes something once should ever see these types. We should only promote these as advanced tools. That's why I intend to bring back the `autodiff` keyword in Optim. My guess is that most people use `OnceDifferentiable` to get access to the ForwardDiff functionality, and exposing an implementation detail just to have users do this is a mistake I made when we moved to `OnceDifferentiable`. So I am undoing my mistake.

Worklist
edit: I know something is weird in the OnceDifferentiable main constructor, but it's fixed.