Remove optional use of statsmodels within EMOS #1563

Merged

Contributor

@gavinevans gavinevans commented Sep 22, 2021

Addresses part of #1537

Description
This PR removes the optional use of statsmodels, meaning that it is now a required module. This aids the further development in #1537 for using a static additional predictor.

Further information in https://github.com/metoppv/mo-blue-team/issues/73#issuecomment-925092996.

Testing:

  • Ran tests and they passed OK
  • Added new tests for the new feature(s)

codecov bot commented Sep 22, 2021

Codecov Report

Merging #1563 (93a4803) into master (f8e2ce4) will increase coverage by 0.03%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master    #1563      +/-   ##
==========================================
+ Coverage   97.93%   97.97%   +0.03%     
==========================================
  Files         106      107       +1     
  Lines        9463     9562      +99     
==========================================
+ Hits         9268     9368     +100     
+ Misses        195      194       -1     
Impacted Files Coverage Δ
improver/calibration/utilities.py 98.93% <ø> (+0.95%) ⬆️
improver/calibration/ensemble_calibration.py 99.73% <100.00%> (ø)
improver/threshold.py 100.00% <0.00%> (ø)
improver/utilities/rescale.py 100.00% <0.00%> (ø)
improver/utilities/temporal.py 100.00% <0.00%> (ø)
improver/synthetic_data/set_up_test_cubes.py 100.00% <0.00%> (ø)
improver/lightning.py 100.00% <0.00%> (ø)
improver/utilities/spatial.py 98.78% <0.00%> (+0.12%) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@tjtg tjtg added the "BoM review required" label (PRs opened by non-BoM developers that require a BoM review) Sep 27, 2021
Contributor

tjtg commented Sep 30, 2021

I've created a PR-to-PR which changes this to make statsmodels required only for estimate EMOS functionality. That change avoids a hard dependency on statsmodels for all of IMPROVER but still allows for a lot of the simplification in this PR.

I've also had a read through this PR. All looks OK to me, and avoiding two code paths is a nice simplification.

Statsmodels required only for estimate EMOS
Contributor

@fionaRust fionaRust left a comment

A few questions

@gavinevans gavinevans assigned fionaRust and unassigned gavinevans Oct 4, 2021
Contributor

@fionaRust fionaRust left a comment

Thanks Gavin

@fionaRust fionaRust assigned bayliffe and MoseleyS and unassigned fionaRust and bayliffe Oct 4, 2021
Contributor

@MoseleyS MoseleyS left a comment

I've got a few questions about this PR; hopefully it is just my lack of understanding.

@@ -508,6 +484,7 @@ def setUp(self):
halo surrounding the original data.
Set up expected results for different situations.
"""
pytest.importorskip("statsmodels")
Contributor

If statsmodels is no longer optional, why are we allowed to skip it?

Contributor Author

Thanks @MoseleyS. I think your queries are all related to @tjtg's PR to this PR (gavinevans#13). You're correct that this PR makes statsmodels no longer optional. However, because statsmodels is still only used in one particular place in the codebase, rather than having widespread usage like numpy or Iris, @tjtg has recommended that statsmodels be regarded as an optional module. I think this is a similar set-up to pysteps. Therefore, if someone wants to use components of improver but isn't interested in EMOS, they don't need to have statsmodels in their environment.

Comment on lines -24 to +27
- statsmodels
# Optional
- numba
- pysteps=1.4.1
- statsmodels
Contributor

I think I need this explaining. The statsmodels dependency has moved from "Required" to "Optional", but the PR removes the optional status of this module, which I think makes it "Required".

Contributor Author

My previous understanding of "Required" and "Optional" was just as you suggest, which is why I made this PR: #1556. However, the preference is to treat statsmodels more like pysteps: even though it is required for EMOS, it is optional for the improver codebase, as someone might want to use improver but not the EMOS component.

Comment on lines -24 to -28
- statsmodels
# Optional
- numba
- pysteps=1.4.1
- timezonefinder=4.1.0
Contributor

Why are these removed?

Contributor Author

There is interest in having the different environments represent different things, with this environment being slightly slimmer. This is the approach chosen by @tjtg in gavinevans#13.

@@ -994,6 +993,8 @@ def compute_initial_guess(
List of coefficients to be used as initial guess.
Order of coefficients is [alpha, beta, gamma, delta].
"""
import statsmodels.api as sm
Contributor

Why is this import statement not at the top of the file with all the other import statements?

Contributor Author

As mentioned above, this is because statsmodels is being treated as optional for the codebase, even though it is required for EMOS. Therefore, if someone isn't using the EMOS component of improver, then they can use a slimmer environment.
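
The deferred import described here can be sketched as follows. This is an illustrative sketch, not the code in improver/calibration/ensemble_calibration.py: the helper require_optional and the function fit_linear_initial_guess are hypothetical names, and the use of an ordinary least squares fit is an assumption made only to give the imported module something to do.

```python
import importlib


def require_optional(module_name: str, feature: str):
    """Import an optional dependency on demand, with a clearer error message."""
    try:
        return importlib.import_module(module_name)
    except ImportError as err:
        raise ImportError(
            f"{module_name} is required for {feature} but is not installed"
        ) from err


def fit_linear_initial_guess(truths, predictors):
    """Hypothetical stand-in for an EMOS step that needs statsmodels."""
    # Deferred import: this line only runs when the function is called,
    # so importing the enclosing module never requires statsmodels.
    sm = require_optional("statsmodels.api", "EMOS coefficient estimation")
    design = sm.add_constant(predictors)  # prepend an intercept column
    return sm.OLS(truths, design).fit().params
```

With this layout, a user who never calls the EMOS-related function can import the rest of the package in an environment that lacks statsmodels, and a user who does call it without statsmodels gets an error naming the missing module.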

@MoseleyS MoseleyS assigned gavinevans and unassigned MoseleyS Oct 5, 2021
@gavinevans gavinevans assigned MoseleyS and unassigned gavinevans Oct 5, 2021
@@ -21,11 +21,7 @@ dependencies:
- scipy=1.6
- sigtools
- sphinx
- statsmodels
Contributor

Hi @tjtg. Can we add a comment to this environment stating what is NOT supported, so that we know what to expect? For example:

# This environment does not support the following CLIs:
- estimate-emos-coefficients
- generate-timezone-mask-ancillary
- nowcast-accumulate
- nowcast-extrapolate
- nowcast-optical-flow-from-winds
- generate-advection-velocities-from-winds
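
A list like the one above could also be derived at run time by checking which optional modules are importable. The sketch below is hypothetical and not part of IMPROVER. The mapping from CLI name to module is an assumption for illustration: only the link between estimate-emos-coefficients and statsmodels is stated in this thread, and the other entries are guesses based on the optional packages named in the environment files.

```python
from importlib.util import find_spec

# ASSUMED mapping from CLI name to the optional module it needs.
# Only the statsmodels entry is confirmed by this PR discussion.
ASSUMED_CLI_REQUIREMENTS = {
    "estimate-emos-coefficients": "statsmodels",
    "generate-timezone-mask-ancillary": "timezonefinder",
    "nowcast-accumulate": "pysteps",
    "nowcast-extrapolate": "pysteps",
    "nowcast-optical-flow-from-winds": "pysteps",
    "generate-advection-velocities-from-winds": "pysteps",
}


def unsupported_clis(requirements):
    """Return the CLI names whose required top-level module cannot be found."""
    return sorted(
        cli for cli, module in requirements.items() if find_spec(module) is None
    )
```

Calling unsupported_clis(ASSUMED_CLI_REQUIREMENTS) in a given environment would return the CLIs that environment cannot run, under the assumed mapping.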

Contributor

@MoseleyS MoseleyS left a comment

Approved. I will follow up on a comment in the environment as a separate PR when Tom has had a chance to respond to my question.

@MoseleyS MoseleyS merged commit 516e9fc into metoppv:master Oct 6, 2021
@gavinevans gavinevans deleted the improver1537_nonoptional_statsmodels branch October 6, 2021 09:21
MoseleyS pushed a commit to MoseleyS/improver that referenced this pull request Aug 22, 2024
* Modifications to remove optional use of statsmodels. statsmodels is now a required module.

* Run isort and black.

* Flake8 corrections.

* Flake8 correction.

* Code and test changes to make statsmodels optional

* Remove optional libraries from py38 environment

* Move statsmodels to optional in py37 environment

* Fix black

* Re-add warning messages to be ignored.

Co-authored-by: Tom Gale <[email protected]>
Labels
BoM review required PRs opened by non-BoM developers that require a BoM review FY21/22 Temperature calibration Owned by Gavin
5 participants