Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IM-1787: Improve memory efficiency of threshold plugin #1913

Merged
merged 35 commits into from
Oct 20, 2023
Merged
Show file tree
Hide file tree
Changes from 28 commits
Commits
Show all changes
35 commits
Select commit Hold shift + click to select a range
8b68ae7
Making threshold more efficient.
bayliffe Jan 3, 2023
83513a4
Getting the contributions right when using masked data.
bayliffe Jan 4, 2023
589ae1a
Correct unit setting bug.
bayliffe Jan 10, 2023
c3b9670
Beginning to restructure.
bayliffe Jan 11, 2023
8e2dc97
Changing interfaces, simplifying CLI, and updating unit tests. Incomp…
bayliffe Jan 11, 2023
9689768
Break up the occurence within vicinity plugin to allow use of compone…
bayliffe Jan 12, 2023
174258f
Figuring out where the vicinity processing differs.
bayliffe Jan 16, 2023
037fdc5
Add new exception. May be temporary.
bayliffe Jan 17, 2023
5005977
more incremental changes.
bayliffe Jan 19, 2023
6910855
Final fixes to vicinity processing from within the threshold plugin.
bayliffe Feb 1, 2023
3a9a669
Format changes.
bayliffe Feb 1, 2023
c9934f3
Remove defunct test. Add array slicing to avoid swelling memory when …
bayliffe Feb 2, 2023
1d87141
Use a slice iterator rather than list to reduce memory usage. As sugg…
bayliffe Jun 16, 2023
5e371f8
Set self.fill_masked which was lost in rebase.
bayliffe Jun 16, 2023
193bca2
Remove assumption of realization coordinate added when changing to us…
bayliffe Jun 16, 2023
ba3e291
Modify threshold such that it can once again accept a single threshol…
bayliffe Jun 16, 2023
aff1499
Remove doc-strings relating to fuzzy-bounds, the setting of which usi…
bayliffe Jun 16, 2023
33c3529
Add unit tests to cover the vicinity processing applied within the th…
bayliffe Jun 16, 2023
80f96c8
Format fixes.
bayliffe Jun 16, 2023
4fa3df7
Merge remote-tracking branch 'upstream/master' into im1787
bayliffe Jul 27, 2023
19083bf
Replicate percentile collapse functionality.
bayliffe Jul 28, 2023
9cbfccb
Added back land mask without vicinity exception.
bayliffe Aug 2, 2023
1bfc213
Rewriting threshold unit tests using pytest and extending coverage. I…
bayliffe Aug 2, 2023
94a7885
Unit test replacement complete. Doc-strings and formatting changes to…
bayliffe Aug 3, 2023
e63df9f
Add docstrings and typing to the conftest functions.
bayliffe Aug 3, 2023
5ae1450
Formatting corrections.
bayliffe Aug 3, 2023
f926509
Remove broken new test added to BasicThreshold tests which is not nee…
bayliffe Aug 4, 2023
24f24ee
Add vicinity tests using a landmask.
bayliffe Aug 4, 2023
5f54444
Partial review response.
bayliffe Aug 25, 2023
7d84c96
Further review changes.
bayliffe Aug 30, 2023
2e6bb36
Further review tweaks.
bayliffe Sep 12, 2023
8767a61
Tweak to type checking.
bayliffe Sep 14, 2023
6cdcb10
Isort reorder
bayliffe Sep 14, 2023
ab7edfa
Review changes.
bayliffe Oct 17, 2023
ffd3fb9
Fix my test for bounds set to None.
bayliffe Oct 17, 2023
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion doc/source/Code-Style-Guide.rst
Original file line number Diff line number Diff line change
Expand Up @@ -342,7 +342,7 @@ Plugins (classes) should be an example of a non-trivial algorithm or set
of algorithms for a particular purpose. They should be set up via the
``__init__`` method and then invoked on a particular iris Cube ``cube``
using a ``process`` method - e.g. using ``process(cube)``. See
e.g. `BasicThreshold <https://github.com/metoppv/improver/blob/master/lib/improver/threshold.py>`_
e.g. `Threshold <https://github.com/metoppv/improver/blob/master/lib/improver/threshold.py>`_
class. In some limited cases an iris ``CubeList`` may be preferable.
Avoid writing code that can do both. Class names use
`PascalCase <https://en.wikipedia.org/wiki/PascalCase>`_ whilst
Expand Down
70 changes: 11 additions & 59 deletions improver/cli/threshold.py
Original file line number Diff line number Diff line change
Expand Up @@ -58,6 +58,10 @@ def process(
Args:
cube (iris.cube.Cube):
A cube to be processed.
land_sea_mask (Cube):
Binary land-sea mask data. True for land-points, False for sea.
Restricts in-vicinity processing to only include points of a
like mask value.
threshold_values (list of float):
Threshold value or values about which to calculate the truth
values; e.g. 270,300. Must be omitted if 'threshold_config'
Expand Down Expand Up @@ -104,10 +108,6 @@ def process(
List of distances in metres used to define the vicinities within
which to search for an occurrence. Each vicinity provided will
lead to a different gridded field.
land_sea_mask (Cube):
Binary land-sea mask data. True for land-points, False for sea.
Restricts in-vicinity processing to only include points of a
like mask value.
fill_masked (float):
If provided all masked points in cube will be replaced with the
provided value before thresholding.
Expand All @@ -119,16 +119,8 @@ def process(
Raises:
ValueError: If threshold_config and threshold_values are both set
ValueError: If threshold_config is used for fuzzy thresholding
ValueError: Cannot apply land-mask cube without in-vicinity processing.
ValueError: Can only collapse over a realization coordinate or a percentile
coordinate that has been rebadged as a realization coordinate.
"""
from improver.ensemble_copula_coupling.ensemble_copula_coupling import (
RebadgePercentilesAsRealizations,
)
from improver.threshold import BasicThreshold
from improver.utilities.cube_manipulation import collapse_realizations
from improver.utilities.spatial import OccurrenceWithinVicinity
from improver.threshold import Threshold

if threshold_config and threshold_values:
raise ValueError(
Expand All @@ -137,56 +129,16 @@ def process(
)
if threshold_config and fuzzy_factor:
raise ValueError("--threshold-config cannot be used for fuzzy thresholding")
bayliffe marked this conversation as resolved.
Show resolved Hide resolved
if threshold_config:
thresholds = []
fuzzy_bounds = []
for key in threshold_config.keys():
# Ensure thresholds are float64 to avoid rounding errors during
# possible unit conversion.
thresholds.append(float(key))
# If the first threshold has no bounds, fuzzy_bounds is
# set to None and subsequent bounds checks are skipped
if threshold_config[key] == "None":
fuzzy_bounds = None
continue
fuzzy_bounds.append(tuple(threshold_config[key]))
else:
# Ensure thresholds are float64 to avoid rounding errors during possible
# unit conversion.
thresholds = [float(x) for x in threshold_values]
fuzzy_bounds = None

each_threshold_func_list = []

if vicinity is not None:
vicinity = [float(x) for x in vicinity]
# smooth thresholded occurrences over local vicinity
each_threshold_func_list.append(
OccurrenceWithinVicinity(radii=vicinity, land_mask_cube=land_sea_mask)
)
elif land_sea_mask:
raise ValueError("Cannot apply land-mask cube without in-vicinity processing")

if collapse_coord == "realization":
each_threshold_func_list.append(collapse_realizations)
elif collapse_coord == "percentile":
cube = RebadgePercentilesAsRealizations()(cube)
each_threshold_func_list.append(collapse_realizations)
elif collapse_coord is not None:
raise ValueError(
"Can only collapse over a realization coordinate or a percentile "
"coordinate that has been rebadged as a realization coordinate."
)

if fill_masked is not None:
fill_masked = float(fill_masked)

return BasicThreshold(
thresholds,
return Threshold(
gavinevans marked this conversation as resolved.
Show resolved Hide resolved
threshold_values=threshold_values,
threshold_config=threshold_config,
fuzzy_factor=fuzzy_factor,
fuzzy_bounds=fuzzy_bounds,
threshold_units=threshold_units,
comparison_operator=comparison_operator,
each_threshold_func=each_threshold_func_list,
collapse_coord=collapse_coord,
vicinity=vicinity,
fill_masked=fill_masked,
)(cube)
)(cube, land_sea_mask)
6 changes: 3 additions & 3 deletions improver/precipitation_type/shower_condition_probability.py
Original file line number Diff line number Diff line change
Expand Up @@ -44,7 +44,7 @@
create_new_diagnostic_cube,
generate_mandatory_attributes,
)
from improver.threshold import BasicThreshold
from improver.threshold import Threshold
from improver.utilities.cube_manipulation import collapse_realizations

from .utilities import make_shower_condition_cube
Expand Down Expand Up @@ -171,10 +171,10 @@ def process(self, cubes: CubeList) -> Cube:
cloud, convection = self._extract_inputs(cubes)

# Threshold cubes
cloud_thresholded = BasicThreshold(
cloud_thresholded = Threshold(
self.cloud_threshold, comparison_operator="<="
).process(cloud)
convection_thresholded = BasicThreshold(self.convection_threshold).process(
convection_thresholded = Threshold(self.convection_threshold).process(
bayliffe marked this conversation as resolved.
Show resolved Hide resolved
convection
)

Expand Down
6 changes: 3 additions & 3 deletions improver/regrid/landsea.py
Original file line number Diff line number Diff line change
Expand Up @@ -45,7 +45,7 @@
from improver.metadata.constants.attributes import MANDATORY_ATTRIBUTE_DEFAULTS
from improver.metadata.constants.mo_attributes import MOSG_GRID_ATTRIBUTES
from improver.regrid.landsea2 import RegridWithLandSeaMask
from improver.threshold import BasicThreshold
from improver.threshold import Threshold
from improver.utilities.cube_checker import spatial_coords_match
from improver.utilities.spatial import OccurrenceWithinVicinity

Expand Down Expand Up @@ -288,9 +288,9 @@ def _get_matches(
# Identify nearby points on regridded input_land that match the
# selector_value
if selector_val > 0.5:
thresholder = BasicThreshold(0.5)
thresholder = Threshold(0.5)
else:
thresholder = BasicThreshold(0.5, comparison_operator="<=")
thresholder = Threshold(0.5, comparison_operator="<=")
bayliffe marked this conversation as resolved.
Show resolved Hide resolved
in_vicinity = self.vicinity(thresholder(self.input_land))

# Identify those points sourced from the opposite mask that are
Expand Down
Loading