Implement expected value via integration over probability thresholds #1734
Conversation
Codecov Report
@@ Coverage Diff @@
## master #1734 +/- ##
==========================================
+ Coverage 98.17% 98.22% +0.04%
==========================================
Files 113 114 +1
Lines 10336 10678 +342
==========================================
+ Hits 10147 10488 +341
- Misses 189 190 +1
Continue to review full report at Codecov.
The integration logic looks ok to me. I made a suggestion about the choice of endpoints; this won't affect results much provided the original thresholds are sensibly chosen, but I think it would be good to have consistency.
Thanks, this looks good now.
This PR provides a good implementation for evaluating expected value directly from the probability data. Overall I am happy with the method, but I have put forward a couple of questions; I expect these should be easy enough to address.
Thanks for responding to my comments. As I expected, the resolution was likely to be "leave as is", but I figured it was worthwhile raising the questions.
I've left one suggestion for adding a comment on the +/- np.inf bounds, but I don't think it is necessary given the issue you mentioned in the comment, so I would be happy to leave it out (I'll leave this to your discretion).
Either way, I'm happy with what is here. Great work!
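The endpoint discussion above (extra thresholds and the +/- np.inf bounds) can be sketched as follows. This is an illustrative helper, not the project's actual API: it extends the threshold grid by one interval at each end, using the spacing of the outermost intervals, so that the integration can capture probability mass beyond the original thresholds.

```python
import numpy as np

def pad_thresholds(thresholds):
    """Hypothetical helper: extend a threshold grid by one interval
    at each end, mirroring the spacing of the outermost intervals.

    Illustrates the endpoint choice discussed in review; the real
    implementation also considers ECC bounds when picking endpoints.
    """
    t = np.asarray(thresholds, dtype=float)
    lower = t[0] - (t[1] - t[0])      # extrapolate below by first spacing
    upper = t[-1] + (t[-1] - t[-2])   # extrapolate above by last spacing
    return np.concatenate([[lower], t, [upper]])
```

For example, an evenly spaced grid `[0, 1, 2]` would be padded to `[-1, 0, 1, 2, 3]`.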
Thanks for adding in the comment. I'm happy for this to be merged in now.
* master:
  * Calc temperature after latent heat release (metoppv#1739)
  * Fixed broken links (metoppv#1745)
  * Vicinity processing CLI (metoppv#1749)
  * Rainforest minor fixes (metoppv#1751)
  * Implement expected value via integration over probability thresholds (metoppv#1734)
  * Exclude hidden directories and their sub-directories from the init check test. This accommodates IDEs that store information in hidden directories, e.g. vscode. (metoppv#1748)

  # Conflicts:
  # improver/psychrometric_calculations/psychrometric_calculations.py
  # improver_tests/acceptance/test_vicinity.py
…etoppv#1734)
* Implement expected value over thresholds
* Add tests for non-monotonic data and thresholds lt/gt
* Update acceptance test docstring
* min/max of threshold spacing and ECC bounds
* Fix interpolation mismatched with thresholds. Also add parameterized tests to pick up this issue
* Add unit test for unequally spaced thresholds
* Fix black
* Remove duplicated data equality check
* Update comment explaining extra thresholds
As a follow-up to #1719, implement processing of threshold data by numerical integration over probability thresholds, replacing the quick-to-implement but poorly performing implementation that used ConvertProbabilitiesToPercentiles.
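The core idea can be sketched with a minimal example (this is a simplification, not the improver implementation, which also handles threshold padding and grids of data): exceedance probabilities P(X > t) sampled at thresholds define a piecewise CDF, and the expected value is approximated by summing interval midpoints weighted by the probability mass in each interval.

```python
import numpy as np

def expected_value(thresholds, exceedance_probs):
    """Approximate E[X] from P(X > t) sampled at increasing thresholds.

    The probability mass in [t_i, t_{i+1}] is
    P(X > t_i) - P(X > t_{i+1}); each interval contributes its
    midpoint weighted by that mass. A sketch only.
    """
    t = np.asarray(thresholds, dtype=float)
    p = np.asarray(exceedance_probs, dtype=float)
    mass = p[:-1] - p[1:]             # probability mass per interval
    mid = 0.5 * (t[:-1] + t[1:])      # interval midpoints
    return float(np.sum(mid * mass))
```

For a uniform distribution on [0, 1], `expected_value([0, 0.5, 1], [1, 0.5, 0])` returns 0.5, as expected.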
The acceptance test data is unchanged for this PR.
The time to run the acceptance test on threshold data reduced from ~1.5 seconds to ~0.2 seconds on my workstation. There are significant performance improvements on larger data sets, such as whole-country sized grids.
Testing: