
Implement expected value via integration over probability thresholds #1734

Merged
merged 9 commits into from
Jun 30, 2022

Conversation

tjtg
Contributor

@tjtg tjtg commented Jun 3, 2022

As a follow-up to #1719, this implements processing of threshold data by numerical integration over probability thresholds, replacing the quick-to-implement but poorly performing approach that used ConvertProbabilitiesToPercentiles.

The acceptance test data is unchanged for this PR.
The time to run the acceptance test on threshold data dropped from ~1.5 seconds to ~0.2 seconds on my workstation. The performance improvement is even larger on bigger datasets, such as whole-country sized grids.

Testing:

  • Ran tests and they passed OK
  • Added new tests for the new feature(s)
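The core idea described above can be sketched in a few lines of NumPy. This is a hypothetical standalone function for illustration, not the actual IMPROVER implementation: given exceedance probabilities P(X > t) at a set of thresholds, the drop in exceedance probability between consecutive thresholds is the probability mass of that interval, and the expected value is approximated by summing interval midpoints weighted by that mass.

```python
import numpy as np

def expected_value_from_exceedance(thresholds, exceedance_probs):
    """Estimate E[X] from exceedance probabilities P(X > t) at fixed thresholds.

    The decrease in exceedance probability between consecutive thresholds is
    the probability mass of that interval; each interval contributes its
    midpoint weighted by that mass.
    """
    t = np.asarray(thresholds, dtype=float)
    p = np.asarray(exceedance_probs, dtype=float)
    cdf = 1.0 - p                        # cumulative distribution at each threshold
    mass = np.diff(cdf)                  # probability mass in each interval
    midpoints = 0.5 * (t[:-1] + t[1:])
    return float(np.sum(midpoints * mass))
```

For a uniform distribution on [0, 10] sampled at five evenly spaced thresholds this returns exactly 5.0. Note that any mass lying outside the outermost thresholds is ignored, which is why the choice of endpoints raised in review matters.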

@tjtg tjtg marked this pull request as draft June 3, 2022 08:56
@codecov

codecov bot commented Jun 3, 2022

Codecov Report

Merging #1734 (7762939) into master (730ed80) will increase coverage by 0.04%.
The diff coverage is 95.12%.

@@            Coverage Diff             @@
##           master    #1734      +/-   ##
==========================================
+ Coverage   98.17%   98.22%   +0.04%     
==========================================
  Files         113      114       +1     
  Lines       10336    10678     +342     
==========================================
+ Hits        10147    10488     +341     
- Misses        189      190       +1     
Impacted Files Coverage Δ
improver/expected_value.py 96.00% <95.12%> (-4.00%) ⬇️
improver/constants.py 100.00% <0.00%> (ø)
improver/regrid/landsea.py 99.21% <0.00%> (ø)
improver/utilities/solar.py 100.00% <0.00%> (ø)
improver/developer_tools/metadata_interpreter.py 99.35% <0.00%> (ø)
improver/wind_calculations/vertical_updraught.py 100.00% <0.00%> (ø)
improver/calibration/reliability_calibration.py 98.84% <0.00%> (+<0.01%) ⬆️
...ometric_calculations/psychrometric_calculations.py 99.22% <0.00%> (+0.02%) ⬆️
improver/utilities/spatial.py 98.91% <0.00%> (+0.04%) ⬆️
improver/utilities/cube_manipulation.py 99.14% <0.00%> (+0.07%) ⬆️
... and 2 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 730ed80...7762939.

@tjtg tjtg marked this pull request as ready for review June 7, 2022 03:24
@tjtg tjtg requested a review from btrotta-bom June 7, 2022 23:35
Contributor

@btrotta-bom btrotta-bom left a comment


The integration logic looks ok to me. I made a suggestion about the choice of endpoints; this won't affect results much provided the original thresholds are sensibly chosen, but I think it would be good to have consistency.
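The endpoint question can be illustrated with a small sketch. One way to handle tail mass (a hypothetical illustration, not necessarily the endpoint choice the PR adopts) is to extend the threshold axis by one interval on each side, assuming full exceedance below the lowest threshold and zero exceedance above the highest:

```python
import numpy as np

def pad_thresholds(thresholds, exceedance_probs):
    """Extend the threshold axis by one interval on each side.

    The added lower threshold is assumed to always be exceeded (probability 1)
    and the added upper threshold never exceeded (probability 0), so residual
    tail mass is assigned to the padded intervals. The spacing of the nearest
    original interval is reused for the padding.
    """
    t = np.asarray(thresholds, dtype=float)
    p = np.asarray(exceedance_probs, dtype=float)
    lower = t[0] - (t[1] - t[0])
    upper = t[-1] + (t[-1] - t[-2])
    t_ext = np.concatenate([[lower], t, [upper]])
    p_ext = np.concatenate([[1.0], p, [0.0]])
    return t_ext, p_ext
```

As the review notes, with sensibly chosen original thresholds the tail mass is small, so the exact padding rule changes the result very little; the value of a consistent rule is mainly reproducibility.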

Review threads on improver/expected_value.py (resolved)
btrotta-bom
btrotta-bom previously approved these changes Jun 23, 2022
Contributor

@btrotta-bom btrotta-bom left a comment


Thanks, this looks good now.

Contributor

@benowen-bom benowen-bom left a comment


This PR provides a good implementation for evaluating expected value directly from the probability data. Overall I am happy with the method, but I have put forward a couple of questions; I expect these should be easy enough to address.

Review threads on improver/expected_value.py and improver_tests/expected_value/test_expected_value.py (resolved)
benowen-bom
benowen-bom previously approved these changes Jun 29, 2022
Contributor

@benowen-bom benowen-bom left a comment


Thanks for responding to my comments. As I expected, the resolution was likely to be "leave as is", but I figured the questions were worth raising.

I've left one suggestion about adding a comment on the +/- np.inf bounds, but I don't think it is necessary given the issue you mentioned in your reply, so I would be happy to leave it out (I'll leave this to your discretion).

Either way, I'm happy with what is here. Great work!

Contributor

@benowen-bom benowen-bom left a comment


Thanks for adding in the comment. I'm happy for this to be merged in now.

@tjtg tjtg merged commit 98691c8 into metoppv:master Jun 30, 2022
MoseleyS added a commit to MoseleyS/improver that referenced this pull request Jul 1, 2022
* master:
  Calc temperature after latent heat release (metoppv#1739)
  Fixed broken links (metoppv#1745)
  Vicinity processing CLI (metoppv#1749)
  Rainforest minor fixes (metoppv#1751)
  Implement expected value via integration over probability thresholds (metoppv#1734)
  Exclude hidden directories and their sub-directories from the init check test. This accommodates IDEs that store information in hidden directories, i.e. vscode. (metoppv#1748)

# Conflicts:
#	improver/psychrometric_calculations/psychrometric_calculations.py
#	improver_tests/acceptance/test_vicinity.py
@tjtg tjtg deleted the expectedvalue3 branch July 25, 2022 02:29
MoseleyS pushed a commit to MoseleyS/improver that referenced this pull request Aug 22, 2024
…etoppv#1734)

* Implement expected value over thresholds

* Add tests for non-monotonic data and thresholds lt/gt

* Update acceptance test docstring

* min/max of threshold spacing and ECC bounds

* Fix interpolation mismatched with thresholds

Also add parameterized tests to pick up this issue

* Add unit test for unequally spaced thresholds

* Fix black

* Remove duplicated data equality check

* Update comment explaining extra thresholds