Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RainForests apply calibration skeleton #1708

Merged

Conversation

benowen-bom
Copy link
Contributor

@benowen-bom benowen-bom commented May 2, 2022

This PR adds a skeleton for RainForest apply_rainforests_calibration CLI and associated plugin.

Future PRs following on from this will add in functionality and tests, however following the review of the CLI interface through this PR the interface is not expected to change in subsequent PRs.

Open questions (@tjtg, I'd be interested in your thoughts):

  • Additional packages are required RainForest calibration. Which envs should these be added to (currently they have been added to all envs except conda-forge.yml)?
  • Where's the best place to call initialise_model_config? Currently being called in the CLI prior to calling process. Other options could be that it is called in the __init__ for ApplyRainForestsCalibration or within the associated process function, however both these options will change the interface for the process function.

Still to-do:

  • Add test for initialise_model_config __init__.

Testing:

  • Ran tests and they passed OK
  • Added new tests for the new feature(s)

@codecov
Copy link

codecov bot commented May 2, 2022

Codecov Report

Merging #1708 (6924fa6) into master (b18f795) will decrease coverage by 0.15%.
The diff coverage is 39.28%.

@@            Coverage Diff             @@
##           master    #1708      +/-   ##
==========================================
- Coverage   98.13%   97.98%   -0.16%     
==========================================
  Files         111      112       +1     
  Lines       10190    10272      +82     
==========================================
+ Hits        10000    10065      +65     
- Misses        190      207      +17     
Impacted Files Coverage Δ
improver/calibration/rainforest_calibration.py 39.28% <39.28%> (ø)
improver/feels_like_temperature.py 100.00% <0.00%> (ø)
improver/calibration/dataframe_utilities.py 100.00% <0.00%> (ø)
improver/utilities/spatial.py 98.86% <0.00%> (+0.04%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b18f795...6924fa6. Read the comment docs.

@tjtg
Copy link
Contributor

tjtg commented May 4, 2022

Additional packages are required RainForest calibration. Which envs should these be added to (currently they have been added to all envs except conda-forge.yml)?

I'd suggest adding these packages to environment_b and latest.

The current CI error on the conda-forge environment will need the lightgbm import to be moved inside the functions where it is used rather than being at the top of the file. This is similar to other optional packages in IMPROVER.

Please also update the dependencies documentation in doc/source/Dependencies.rst

Where's the best place to call initialise_model_config? Currently being called in the CLI prior to calling process. Other options could be that it is called in the __init__ for ApplyRainForestsCalibration or within the associated process function, however both these options will change the interface for the process function.

The contents of initialise_model_config look like they should become the __init__ method of ApplyRainForestsCalibration. Many other IMPROVER plugin classes don't have anything useful to do in the __init__ method, so __init__ is either tiny or doesn't exist. This plugin does have some useful setup that can be done in the __init__ method. I don't think there's any other realistic use of initialise_model_config outside this class, so no reason to have it at top level outside the classes.

@benowen-bom benowen-bom force-pushed the rainforest_apply_calibration_skeleton branch 2 times, most recently from a6b6d4d to d2763f6 Compare May 9, 2022 05:57
@benowen-bom benowen-bom changed the title WIP: Rainforest apply calibration skeleton Rainforest apply calibration skeleton May 9, 2022
@benowen-bom benowen-bom requested a review from tjtg May 9, 2022 23:07
Copy link
Contributor

@tjtg tjtg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good start on adding the RainForests calibration into IMPROVER. The comments I've made are a bunch of minor items and should be pretty easy to address.
It would be worthwhile getting a second review of this PR as it's adding a new CLI.

improver/calibration/rainforest_calibration.py Outdated Show resolved Hide resolved
improver/calibration/rainforest_calibration.py Outdated Show resolved Hide resolved
improver/calibration/rainforest_calibration.py Outdated Show resolved Hide resolved
improver/calibration/rainforest_calibration.py Outdated Show resolved Hide resolved
improver/calibration/rainforest_calibration.py Outdated Show resolved Hide resolved
improver/calibration/rainforest_calibration.py Outdated Show resolved Hide resolved
improver/calibration/rainforest_calibration.py Outdated Show resolved Hide resolved
@tjtg tjtg added the MO review required PRs opened by non-Met Office developers that require a Met Office review label May 11, 2022
@tjtg tjtg changed the title Rainforest apply calibration skeleton RainForests apply calibration skeleton May 11, 2022
@MoseleyS MoseleyS self-assigned this May 11, 2022
improver/cli/apply_rainforests_calibration.py Outdated Show resolved Hide resolved
improver/cli/apply_rainforests_calibration.py Outdated Show resolved Hide resolved
@MoseleyS MoseleyS assigned benowen-bom and unassigned MoseleyS May 11, 2022
@benowen-bom benowen-bom force-pushed the rainforest_apply_calibration_skeleton branch from d505faf to 8a6bf66 Compare May 12, 2022 05:22
@MoseleyS MoseleyS assigned MoseleyS and unassigned benowen-bom May 12, 2022
@MoseleyS MoseleyS assigned benowen-bom and unassigned MoseleyS May 12, 2022
@benowen-bom
Copy link
Contributor Author

Can I ask you not to rebase your branch each time to push new changes? It changes the history and forces me to re-review the whole PR, rather than incrementally reviewing your updates. You should only need to rebase if Github tells you there are conflicts with master, and even then, a merge is sometimes the better choice.

And sorry about this. I wasn't aware of this behaviour on your end (and can appreciate how annoying this would be). I'm relatively a novice when it comes to git, so feedback like this is really useful!

@MoseleyS MoseleyS assigned MoseleyS and unassigned benowen-bom May 16, 2022
Copy link
Contributor

@MoseleyS MoseleyS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed - I think the sorted dict needs to be used in place of the unsorted dict. You probably need a unit test to show that you're getting the sorting you desire too.

improver/calibration/rainforest_calibration.py Outdated Show resolved Hide resolved
@MoseleyS MoseleyS assigned benowen-bom and unassigned MoseleyS May 16, 2022
@benowen-bom
Copy link
Contributor Author

As discussed - I think the sorted dict needs to be used in place of the unsorted dict.

Sorry about this @MoseleyS, I wasn't careful enough on the previous update (I clearly rushed it and was to focused on the error thresholds). I've gone with your suggested change, and updated the unit tests to ensure we get what is expected.

@MoseleyS MoseleyS assigned MoseleyS and unassigned benowen-bom May 17, 2022
Copy link
Contributor

@MoseleyS MoseleyS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy with the functionality now, but now that I understand the code, I can't resist asking for the tests to be parametrized as there is a lot of repetition and overlap. To show you what I mean, I've created a PR to this PR: benowen-bom#1

@MoseleyS
Copy link
Contributor

I hope you don't mind me making so many suggestions! I also hope you agree that the end result is better code! Our working hours don't seem to overlap, so it is taking longer to progress than is normal. I think the end is in sight though.

@MoseleyS MoseleyS assigned benowen-bom and unassigned MoseleyS May 17, 2022
@benowen-bom
Copy link
Contributor Author

I'm happy with the functionality now, but now that I understand the code, I can't resist asking for the tests to be parametrized as there is a lot of repetition and overlap. To show you what I mean, I've created a PR to this PR: benowen-bom#1

Thanks @MoseleyS for this suggestion. I think this does a fantastic job at removing the repetition in the tests and means that the same unit test is applied for calls to __init__, so I have adopted your change, with a couple of minor tweaks. These changes are to the testing regime, rather than the test itself and relate to running tests when the treelite dependency is missing.

@benowen-bom
Copy link
Contributor Author

I hope you don't mind me making so many suggestions! I also hope you agree that the end result is better code!

@MoseleyS: I really appreciate your input and interest in this work! I think the quality of the code is much better through this process, and it's also been a learning experience for me. So there is no need to apologise with making so many suggestions.

Our working hours don't seem to overlap, so it is taking longer to progress than is normal. I think the end is in sight though.

The timeframe to get this done has been longer than I expected, but as you allude to, if time zones overlapped nicely this would have been a couple of days rather than a week.

Copy link
Contributor

@MoseleyS MoseleyS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy with this now!

@benowen-bom
Copy link
Contributor Author

@tjtg: Are you happy with this to merged in?

Copy link
Contributor

@tjtg tjtg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've had another check through after the review changes. I'm happy for this to go ahead and get merged.

@MoseleyS MoseleyS merged commit d3598a6 into metoppv:master May 19, 2022
@benowen-bom
Copy link
Contributor Author

Thanks @tjtg and @MoseleyS!

@benowen-bom benowen-bom mentioned this pull request May 23, 2022
4 tasks
MoseleyS added a commit to MoseleyS/improver that referenced this pull request Aug 22, 2024
* Placeholder for RainForests calibration functionality.

* Interface for ApplyRainForestsCalibration process function.

* Placeholder for apply_rainforests_calibration.

* Fill out apply_rainforests_calibration cli, and add function to load model objects.

* Add tree-model packages to improver envs.

* Update types in interface.

* Remove additions to environment_a.yml.

* Move initialise_model_config to ApplyRainForestsCalibration __init__ method.

* Update dependencies to include tree-model packages.

* Add placeholder for unit tests.

* Formatting.

* Move fixtures to conftest and add test for missing treelite module.

* Minor updates to comments.

* Reduce default output realization count to more reasonable size.

* Included update to default output-realizations into CLI.

* Incorporate review comments: update docstrings, imports and variable naming.

* Cast features as CubeList and add comment on nature of error_thresholds in __init__ method.

* Update license on added files.

* Update handling of model_filenames, and add check for missing lightgbm files prior to initialisation.

* Add function to enforce error-threshold is ordered, with tree_models ordered accordingly.

* Simplify sorting of error_thresholds and associated tree_models.

* Fix use of sorted_model_config_dict, and test expected consistency between thresholds and tree_models.

* Moves boolean setting out of try block

* Parametrizes tests

* Drop mark.skipif and update test to run when treelite dependency missing.

Co-authored-by: Stephen Moseley <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
MO review required PRs opened by non-Met Office developers that require a Met Office review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants