-
Notifications
You must be signed in to change notification settings - Fork 89
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add rtol for calculating grid spacing #1545
Conversation
Codecov Report
@@ Coverage Diff @@
## master #1545 +/- ##
=======================================
Coverage 97.93% 97.93%
=======================================
Files 106 106
Lines 9447 9447
=======================================
Hits 9252 9252
Misses 195 195
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this PR, @zfan001 👍
I've suggested some modifications to this PR in the comments below. Primarily I've recommended to pass in the relative tolerance that you want from the CLI, rather than change the relative tolerance in the function calls. This is particularly the case for the centralised utilities, which are used in a number of different places and therefore it is hard to gauge the impact of the change to the relative threshold. There might also be a concern that this change could mask unintended changes to our grid spacing that would have previously caused an error. These changes will increase the size of the PR but will hopefully make this change more appropriate for different users.
There are a few things going on here. I'll try and pull it apart into separate questions/issues, then give my opinion on solutions to those. Some functionality in IMPROVER expects equally spaced or equal area grids. This is a useful simplifying assumption about the input which makes calculations simpler to implement and faster to run, as opposed to doing all calculations in a grid-generic way. Equal spacing is determined by the Questions:
My opinions on those questions:
I don't think exposing these tolerances in the CLI is needed if we can get agreement on sensible values. I suspect that some of the coordinates metadata in files @zfan001 has been using as input are problematic (as in, differences are larger than unavoidable float rounding differences). Implementing that would mean:
I'd suggest its worthwhile getting another person's opinion on this before proceeding as there are some complex questions about meanings and intention here. Perhaps @gavinevans or @MoseleyS on the MO side or @thbom001 on the BOM side? |
Thanks for the detailed write-up, @tjtg. I've pointed @MoseleyS at this, as I'm unfamiliar with the details of some of the usages of the |
@tjtg @gavinevans Thank you for these very good comments and suggestions. I add more explanation on my comment "There is an inherent problem in the grid-spacing rtol, which is related to dtype(float32) and grid resolution". For example, generate an equal-space lat/lon grid : longitude: 160:170degree, and latitude: 35:45 degree. grid spacing: 1/3 degree. create grid cube using float32. If the grid spacing has recurring decimals, the problem will probably turn up (e.g. BOM' MSAS grid ). I am happy to implement what Tom suggested next week, if there is no objection. I recommend to use default rtol=1.e-5. |
I thought I'd respond to a couple of the earlier comments, although I should make clear that I do not know the details of the code, so I'm responding the the information provided in those comments only. I agree with the breakdown and assessment by @tjtg and with the specific proposals around implementing a (fairly tight) tolerance - although I'd want @MoseleyS to confirm this does not impact the nowcast - the re-naming / clarification of the The last comment by @zfan001 slightly concerned me, but I may have misunderstood. This concerns latitude-longitude (Plate Carrée projection) grid for which is reasonable to check for equal grid spacing (to a fairly strict tolerance), but not for an equal area grid (which I assume is the intent of |
We expected this error to come up some time, and I agree that the solution is to specify an rtol value. Ideally, we should specify the smallest rtol value that achieves what we want. For the Met Office grids, which are all defined as having intervals expressable as non-recurring base-10 values, a tolerance of zero is fine. I agree that for an interval of 1/3 degrees, this is not fine. However, a tolerance of 1e-5 is much too large. 1e-7 is more reasonable (see https://stackoverflow.com/questions/39134956/accuracy-of-float32 and remember that our values are in the range 1e0 to 1e+6). There is a second issue entangled here, which really ought to be considered separately - that of when a grid is equal-area. As @tjtg pointed out, the method is specific to the only use-case we have so far, which is to detect a Lambert Azimuthal Equal Area grid with equal spacing in x and y coordinates, while excluding a regular Plate Carree grid, which is regular in x and y, but is not equal area. The assumption that a coordinate that can be expressed in metres is acceptable is probably not safe. The example of the 1/3 degree grid is very helpful in understanding the purpose of this PR. It ought to be included in a unit-test so that we don't lose sight of it in the future. |
@bjwheltor @MoseleyS Thank you very much for your comments. np.allclose is used to check equal spacing in " calculate_grid_spacing". Its original default is rtol=1.e-5. "If the following equation is element-wise True, then allclose returns True. I recommended rtol=1.0e-5 because coord(float32) has 8 significant digits, but grid spacing will normally lose 3 significant digits in np.diff(). In my opionon, It should eliminate vast majority of problems without causing other problems. I support that "check_if_grid_is_equal_area" issue is treated separately, though it calls "calculate_grid_spacing". It is seemingly a function only for equal-area-project grids. To proceed this PR, I suggest to remove all of my changes int/improver/improver/utilities/spatial.py and only change default rtol to 1.0e-5 (or 1.0e-6 ?,1.0e-7 ?) in " calculate_grid_spacing". |
@tjtg @gavinevans @MoseleyS @bjwheltor I changed the code. Basically three are 3 changes in this PR. Unit tests/acceptance tests passed OK. Please review it and let me know if you want any change. Thanks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @zfan001
I was wrong about specifying a tolerance of 1e-8 as I had forgotten that some of the precision is taken up by the actual longitude of the domain (e.g. 152), so I now agree that a value around 1e-5 is appropriate.
There's still a couple of things to discuss, but I think we're nearly there.
afe2da2
to
4e1d387
Compare
@MoseleyS @gavinevans changes done as suggested. Thanks. |
Concerns have been overcome and others have approved the result. Gavin is happy.
@MoseleyS @tjtg @gavinevans Many thanks! |
* add rtol for calculating grid spacing * update default rtol and add test * fixing done as suggested * fix flake8 erro * minor fixing * minor changes * more minor changes
I had several running errors during running neighbourhood processing and regridding.
"ValueError: Coordinate longitude points are not equally spaced"
It is from calculate_grid_spacing function. It happened for several BOM-commonly-used grids.
Grid spacing is from np.diff(coord). I think that it is reasonable to add rtol=7.0e-5 which made all failure grids passed.
Testing: