Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add function which normalises input cubes according to a reference #1919

Merged
merged 19 commits into from
Jul 14, 2023

Conversation

brhooper
Copy link
Contributor

@brhooper brhooper commented Jun 27, 2023

Addresses https://github.com/metoppv/mo-blue-team/issues/520
Acceptance test data: metoppv/improver_test_data#20

Description
Adds plugin and CLI which takes a cubelist and a reference cube as inputs and adjusts the data in the input cubelist so that the sum total of this data is equal to the data in reference cube. Also adds unit tests for the new function.

Testing:

  • Ran tests and they passed OK
  • Added new tests for the new feature(s)

@brhooper brhooper changed the title Mobt 520 normalise to reference Add function which normalises input cubes according to a reference Jun 27, 2023
@codecov
Copy link

codecov bot commented Jun 27, 2023

Codecov Report

Merging #1919 (efefeab) into master (8b218e0) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##           master    #1919   +/-   ##
=======================================
  Coverage   98.36%   98.37%           
=======================================
  Files         122      122           
  Lines       11643    11707   +64     
=======================================
+ Hits        11453    11517   +64     
  Misses        190      190           
Impacted Files Coverage Δ
...semble_copula_coupling/ensemble_copula_coupling.py 99.21% <100.00%> (+<0.01%) ⬆️
...prover/utilities/forecast_reference_enforcement.py 100.00% <100.00%> (ø)

Copy link
Contributor

@bayliffe bayliffe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking this on Ben. I think there is quite a lot more to do yet.

  • Can you explain how you envisage multi-percentile data being processed by this, e.g. so that the 20th, 50th, and 80th percentiles of the cubelist diagnostics are all normalised relative to the matching percentiles in the reference cube?
  • How will it handle the case in which these percentiles don't match between cubelist diagnostics and the reference cube?
  • We need a way to call this function / plugin (whatever it ends up being), so it needs a CLI.
  • Can you outline the chain of calls in the suite that will be required, including to this code, to normalise the percentiles and regenerate the probabilities?
  • Is there overlap with the consistency code you've recently written? i.e. could consistency either be capping a value, or normalising a value? (I've not looked at that code, but just a thought).

Copy link
Contributor

@Katie-Howard Katie-Howard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've looked through all these changes, and they seem generally fine to me. There are just a few minor comments that I have made on the code. It all seems to work fine, and the new unit tests/acceptance tests also run through without any problems. I've had a little look at the new acceptance test data and that too looks ok 👍

improver/cli/enforce_consistent_forecasts.py Show resolved Hide resolved
improver/cli/normalise_to_reference.py Outdated Show resolved Hide resolved
improver/cli/normalise_to_reference.py Outdated Show resolved Hide resolved
improver_tests/utilities/test_normalise_to_reference.py Outdated Show resolved Hide resolved
@brhooper brhooper assigned Katie-Howard and unassigned brhooper Jul 11, 2023
Katie-Howard
Katie-Howard previously approved these changes Jul 12, 2023
Copy link
Contributor

@Katie-Howard Katie-Howard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've looked through the changes and I'm happy with them now, thanks 👍

@Katie-Howard
Copy link
Contributor

I've looked again and the acceptance tests fail for the checksums and wx-modal. I think it's because of the weather symbols work that has been merged. So if you update your code it should work hopefully.

Copy link
Contributor

@bayliffe bayliffe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of comments and now I can't read anymore (migraine aura's are trippy). Bailing.


desired_cubes = iris.cube.CubeList()
other_cubes = iris.cube.CubeList()

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder whether it is tidier to make this function return all the cubes as desired_cubes if cube_names == None. It would remove the logic in the CLI in relation to return_name, and allow this function to be called in other places where the status of return_name is similarly unknown.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


return normalise_to_reference(cubes, reference, ignore_zero_total)
cubes = cubes[0]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should not be here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I understand why this was put here, but it's not doing what you want when called from the command line; actually the behaviour of the acceptance tests is rather weird. From the command line, calling this CLI with multiple cubes, the cubes variable becomes something like this:

(
    [
        <iris 'Cube' of probability_of_lwe_thickness_of_precipitation_amount_above_threshold / (1) (lwe_thickness_of_precipitation_amount: 27; projection_y_coordinate: 970; projection_x_coordinate: 1042)>
    ],
    [
        <iris 'Cube' of probability_of_thickness_of_rainfall_amount_above_threshold / (1) (thickness_of_rainfall_amount: 27; projection_y_coordinate: 970; projection_x_coordinate: 1042)>
    ],
    [
        <iris 'Cube' of probability_of_lwe_thickness_of_sleetfall_amount_above_threshold / (1) (lwe_thickness_of_sleetfall_amount: 27; projection_y_coordinate: 970; projection_x_coordinate: 1042)>
    ],
    [
        <iris 'Cube' of probability_of_lwe_thickness_of_snowfall_amount_above_threshold / (1) (lwe_thickness_of_snowfall_amount: 27; projection_y_coordinate: 970; projection_x_coordinate: 1042)>
    ]
)

Meaning that your cubes = cubes[0] line just gives you:

cubes = [
    <iris 'Cube' of probability_of_lwe_thickness_of_precipitation_amount_above_threshold / (1) (lwe_thickness_of_precipitation_amount: 27; projection_y_coordinate: 970; projection_x_coordinate: 1042)>
],

However, your acceptance tests create a single list wrapped in a tuple:

(
    [
        <iris 'Cube' of probability_of_lwe_thickness_of_precipitation_amount_above_threshold / (1) (lwe_thickness_of_precipitation_amount: 27; projection_y_coordinate: 970; projection_x_coordinate: 1042)>,
        <iris 'Cube' of probability_of_thickness_of_rainfall_amount_above_threshold / (1) (thickness_of_rainfall_amount: 27; projection_y_coordinate: 970; projection_x_coordinate: 1042)>,
        <iris 'Cube' of probability_of_lwe_thickness_of_sleetfall_amount_above_threshold / (1) (lwe_thickness_of_sleetfall_amount: 27; projection_y_coordinate: 970; projection_x_coordinate: 1042)>,
        <iris 'Cube' of probability_of_lwe_thickness_of_snowfall_amount_above_threshold / (1) (lwe_thickness_of_snowfall_amount: 27; projection_y_coordinate: 970; projection_x_coordinate: 1042)>
    ]
)

Hence your command yields a simple list that can be iterated over.
I have absolutely no idea why calling the CLI from the acceptance test is behaving differently to the command line and it perturbs me. However, the solution is to use the flatten utility which is used in some other CLIs that accept a cubelist. This will enable the acceptance tests and command line calls to behave in the same way.

from improver.utilities.flatten import flatten

cubes = flatten(cubes)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this suggestion @bayliffe, I've implemented it and everything is working again.

improver/cli/normalise_to_reference.py Outdated Show resolved Hide resolved
improver/cli/normalise_to_reference.py Outdated Show resolved Hide resolved
improver/cli/normalise_to_reference.py Outdated Show resolved Hide resolved

# check cube compatibility
reference_dim_coords = reference.coords(dim_coords=True)
n_dims_mismatch = False
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can replace the next ~40 lines with

for cube in cubes:
    cube = check_cube_coordinates(reference, cube)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure that I can, because that chunk of code looks at the names of the dimension coordinates and then compares each coordinate between cubes, whereas I think check_cube_coordinates() only checks that the same dimension coordinates exist on each cube.

I do agree that this section of code could be tidied up, though, but my brain is currently failing me on how to do it.

I've not made any change here for now, though I'm happy to revisit it. I think as any future change wouldn't affect the input/output for the plugin that it can safely be made later without changing the CLI.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed.

improver/utilities/forecast_reference_enforcement.py Outdated Show resolved Hide resolved
improver_tests/acceptance/test_normalise_to_reference.py Outdated Show resolved Hide resolved
improver_tests/utilities/test_split_cubes_by_name.py Outdated Show resolved Hide resolved
improver_tests/utilities/test_split_cubes_by_name.py Outdated Show resolved Hide resolved
improver_tests/utilities/test_normalise_to_reference.py Outdated Show resolved Hide resolved
improver_tests/utilities/test_normalise_to_reference.py Outdated Show resolved Hide resolved
Copy link
Contributor

@MoseleyS MoseleyS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one more thing...


@pytest.mark.parametrize(
"forecast_type,ignore_zero_total",
(("probability", False), ("percentile", False), ("percentile", True)),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As ("probability", True) isn't included, I don't think the probability/kgo.nc is used.

Copy link
Contributor Author

@brhooper brhooper Jul 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The probability inputs don't contain any instances where the total of the inputs is zero but the reference is non-zero. Due to this the CLI can happily run with ignore_zero_total=False (the default), so I don't think that a change is required here.

Later on if forecast_type == "probability" is checked before checking ignore_zero_total, so for probability inputs the test runs regardless of the value of ignore_zero_total

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry - I had misread the if..elif..else block


@pytest.mark.parametrize(
"forecast_type,ignore_zero_total",
(("probability", False), ("percentile", False), ("percentile", True)),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry - I had misread the if..elif..else block

Copy link
Contributor

@Katie-Howard Katie-Howard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes look good to me.

Copy link
Contributor

@bayliffe bayliffe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've not reviewed the unit tests, but you seem to have ample reviewers, so I think that's okay. I'm happy with this, thanks Ben.

Merging, along with the acceptance test data.

@bayliffe bayliffe merged commit 523ae6f into metoppv:master Jul 14, 2023
bayliffe added a commit to bayliffe/improver that referenced this pull request Jul 28, 2023
* upstream/master:
  Fix to the wind vertical displacement adjustment implementation (metoppv#1927)
  Add function which normalises input cubes according to a reference (metoppv#1919)
  Skip ECC bounds usage when converting probabilities to percentiles (metoppv#1926)
  Add CLIs to support rescaling of the forecast based on altitude difference (metoppv#1917)
  Changes to the modal code to increase the percentage to 30% and change the groupings to provide a more representative daily summary symbol. (metoppv#1925)
  Add plugins to support rescaling of the forecast based on altitude difference (metoppv#1916)
  Support conversion from percentiles to probabilities (metoppv#1924)
  Correct handling of reference time in weather_code plugin (metoppv#1920)
  Add CLI for clipping cubes (metoppv#1918)
  Update cbh ecc name (metoppv#1922)
  Updates Broadcast and expand_bounds in Combine Plugin (metoppv#1914)
  Mobt515 cloud base height spot extraction (metoppv#1911)
  MOBT-494: Cube title setting in weather symbol code (metoppv#1912)
  MOBT512-masking percentiles for cloud base height (metoppv#1908)
  Mobt 496 enforce forecast between references (metoppv#1907)
MoseleyS pushed a commit to MoseleyS/improver that referenced this pull request Aug 22, 2024
…etoppv#1919)

* Add new function to normalisetotal of input cubes to a reference forecast. Add unit tests for new function.

* Add docstrings.

* Formatting.

* Minor change to existing code.

* Add functionality to convert between percentile sets. Move function to more sensible location.

* Fixes to plugin. Adds unit tests, a CLI, and acceptance tests. Updates checksums.

* Adds CLI and acceptance tests.

* Updates formatting. Removes unnecessary module imports.

* Update formatting of module imports.

* Remove obsolete file.

* Update checksums

* Add handling of probability forecasts. Add probability forecast unit tests. Move unit tests file to correct directory.

* Correct formatting. Remove obsolete module import.

* Minor changes following review.

* Correct typo.

* Change to CLI to make compatible with suite. Add new function and corresponding unit tests. Change to acceptance test for changed CLI.

* Changes following review. Restrict CLI to only output exactly one cube.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants