Add support for weighted StatsBase methods #28
This looks good, thanks. I have one design question. In some sense a natural place for ...
Yeah, I feel like having the keys be statistical weights would be a bit confusing... at least for our use cases. We typically want to do things like compute the exponentially weighted mean along a ...
That could still be useful for seeing how weights map to each key along that dimension.
fa1bfc3 to c9cc879
Alright, I've rebased and added support for KeyedArray weight vectors. I've also opted to wrap the covariance estimation methods, which relates to a NamedDims.jl issue: invenia/NamedDims.jl#120
The queued job seems to have completed, but GitHub wasn't notified.
I dropped the ball here too, sorry. Re using the key vector as weights, I'm starting to agree that this isn't such a great idea. Is CovarianceEstimation a fairly lightweight package? I guess if invenia/NamedDims.jl#122 loads it, then there's no cost to loading it here too. Whereas if NamedDims uses Requires.jl, then possibly this should follow that?
Yeah, CovarianceEstimation.jl would only add StatsBase.jl, which we're considering adding here anyway. My issue with Requires.jl is that the load-time performance hasn't been great (it might be better on Julia >= 1.5, though). It also doesn't let you specify dependency requirements, which is particularly problematic for packages like StatsBase and CovarianceEstimation that are still pre-1.0.
OK, I think Requires has got faster, but I haven't timed things exhaustively. It is loaded by NamedDims already, but I also don't know whether that influences things. I am a little surprised how many things StatsBase depends on, despite its name: https://juliahub.com/ui/Packages/StatsBase/EZjIG/0.33.1?t=1 But that's not the end of the world.
Yeah, that's fair. I think the plan is for a lot of the core functionality to move into the Statistics stdlib? That being said, it's only 4 non-stdlibs, and those packages are minimal enough to not have further indirect dependencies. I guess my comparison is packages like Flux.jl https://juliahub.com/ui/Packages/Flux/QdkVy/0.11.1?t=1
Sorry, I got confused. Is there anything in particular you wanted me to fix with this PR (e.g., use Requires.jl)?
Have just been busy, sorry. Will look properly soon.
Sorry this took forever.
I think this is fine but we should compact some things, by making some more loops, by shortening some bodies, and by deleting some methods.
Alright, looks like that managed to remove about 100 lines of largely duplicated code.
Co-authored-by: Michael Abbott <[email protected]>
This PR doesn't wrap everything, but it wraps most of the main weighted functions that we use at Invenia. Closes #26
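For readers unfamiliar with the weighted methods being discussed, here is a minimal sketch of the kind of call involved, using plain StatsBase on an ordinary matrix. It is not the code from this PR and does not show how keys are forwarded; the data, the variable names, and the decay factor are all made up for illustration.

```julia
using Statistics, StatsBase

# Hypothetical data: 4 time steps (rows) by 2 series (columns).
x = [1.0 10.0; 2.0 20.0; 3.0 30.0; 4.0 40.0]

# Illustrative exponentially decaying weights, with the most recent row
# weighted highest. The decay factor 0.5 is an arbitrary choice for this sketch.
w = weights([0.5^(4 - t) for t in 1:4])

# Weighted mean along the first dimension: one value per column.
m = mean(x, w; dims=1)
```

The wrapped methods in this PR are meant to let calls of this shape work on keyed arrays as well; see the diff for the actual signatures.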