CF global attribute "Conventions": expose to users, and include CF standard name table version #5255

Open
larsbarring opened this issue Apr 18, 2023 · 7 comments

Comments

@larsbarring
Contributor

✨ Feature Request

Currently, the global attribute Conventions is difficult to change. The default is hardcoded deep in the code:

CF_CONVENTIONS_VERSION = "CF-1.7"
Changing the default is not as easy as one (at least I) would have hoped, because the relevant setting is "hidden" as iris.config.netcdf.conventions_override = True, which is difficult to find when searching the documentation for "global attribute Conventions" (or similar).

For our own use cases we would like to have better control of which CF version is actually written into the files, and moreover we would like to include additional conventions, like ACDD, Clix-meta version, as well as the CF Standard name Table version. Hence, this feature request has three main components:

  • In iris.save allow users to pass the kwarg Conventions='....'.
  • Change tools/generate_std_names.py so that lib/std_names.py includes the version number (and possibly related information such as the creation date).
  • New functionality (in iris.util?) to get the default Conventions string, and to get the Standard name table version number from lib/std_names.py.
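
As a rough illustration of the string we would like to build, here is a minimal sketch. The helper name join_conventions is hypothetical and not part of Iris, and "ACDD-1.3" is only an example value. It assumes the separator rule from CF section 2.6.1: names separated by blanks, or by commas if any name contains a blank.

```python
def join_conventions(base, *extras):
    """Join convention names into one Conventions attribute value.

    Hypothetical helper for illustration only, not an Iris function.
    """
    # Drop empty entries and surrounding whitespace.
    names = [name.strip() for name in (base, *extras) if name and name.strip()]
    # Blank-separated by default; commas if any name itself contains a blank.
    separator = "," if any(" " in name for name in names) else " "
    return separator.join(names)


print(join_conventions("CF-1.7", "ACDD-1.3"))  # CF-1.7 ACDD-1.3
```

With a default string available from Iris, the first argument would come from there instead of being typed by hand.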

Motivation

A more general motivation for giving the user (easier) access to setting the CF version is given by CF itself in Section 2.6.1:

When CF is listed with other conventions, this asserts the same full compliance with CF requirements and interpretations as if CF was the sole convention. It is the responsibility of the data-writer to ensure that all common metadata is used with consistent meaning between conventions.

@pp-mo
Member

pp-mo commented Apr 18, 2023

Not to speak against the proposal, but I just thought it would be worth explaining here why we control this at all, and what it really means -- which I don't think is written down anywhere in user docs ...

In my view, at least, the only meaning that you can attach to the CF version which Iris writes in output files is "whatever we put in this file should be CF-compliant up to this CF version" :
Effectively, that means "the file does not include any CF encoding from CF versions later than this" -- and that's really all.

You can probably infer that an application interpreting all aspects of that CF version can understand everything that we have tried to state about the data.
But it doesn't (even) mean that the output is definitely valid under later CF versions, though to date I think CF has totally avoided any backwards-invalidating changes.
It means only that the file may contain information which is not understood by earlier versions.

@larsbarring
Contributor Author

Yes, I understand this and do agree that this is a very reasonable Iris developer perspective. But from our side, as users of iris and building applications using the Iris api the perspective is a slightly different one. The files we produce never make use of all CF/Iris mechanisms, and most of them are fairly standard in most respects, which means that they are equally valid in later CF versions. But occasionally there are improvements, clarifications, and even corrections to the CF Conventions text that are either taken on board by Iris or handled by us, which means that it is meaningful to indicate this in the files. Alternatively, we have projects or users that want the data to follow a specific CF version.

The main motivation behind the request is however that we want to append additional conventions to the Conventions string, and then we have no easy/nice/clean way to get the Iris default string to start with. Add to this the other part of the request, that we would like to have access to the Standard name table version.

@pp-mo
Member

pp-mo commented Apr 18, 2023

as users of iris and building applications using the Iris api the perspective is a slightly different one.

Thanks, understood !

I have been working in this area lately, in the context of #3325 and the project for that (which hopefully will make it into Iris 3.6).
The way Iris handles Conventions attributes is certainly a bit odd, and we probably want to fix some things.
I will take care over how this might interact with that work.

@pp-mo
Member

pp-mo commented Aug 15, 2023

#5423 records the table version in the header of iris.std_names, and in a public global variable, iris.std_names.CF_STANDARD_NAME_TABLE_VERSION.
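
A minimal sketch of reading that variable defensively. The function name table_version_or_none is hypothetical; the sketch assumes only the variable name given above, makes no assumption about its type, and falls back to None when Iris is missing or predates the change.

```python
def table_version_or_none():
    """Return the CF standard name table version known to Iris, or None.

    Hypothetical helper for illustration only.
    """
    try:
        from iris import std_names
    except ImportError:
        # Iris is not installed in this environment.
        return None
    # Older Iris versions do not define the variable.
    return getattr(std_names, "CF_STANDARD_NAME_TABLE_VERSION", None)


version = table_version_or_none()
if version is None:
    print("standard name table version not available")
else:
    print(f"standard name table version: {version}")
```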

@scitools-ci scitools-ci bot removed this from 🚴 Peloton Dec 15, 2023
@ESadek-MO
Contributor

@SciTools/peloton We need to discuss this further amongst ourselves, but in the meantime, have you thought about ncdata for post-save modification?

@trexfeathers
Contributor

Good news: we discovered a secret feature :godmode: !


This gets used very rarely, and isn't even documented, so it has faded from developers' memory, but it is there. Please let us know if it works for you 😊


iris.site_configuration is designed to modify certain aspects of Iris based on a config file. This allows for a more global modification versus an argument in a function:

  • A permanent setting for a piece of software downstream of Iris
  • Can share a single config file between a team
  • Even if just writing scripts: don't need identical lines in multiple calls

Modifying Conventions was one of the imagined use cases, since it would often be something like an institution-wide phrase that needs to be added.

Permanent setting - create a file: iris/site_config.py

def cf_patch_conventions(conventions: str) -> str:
    return f"{conventions} extra" if conventions else conventions


def update(config: dict) -> None:
    config["cf_profile"] = lambda cube: None
    config["cf_patch_conventions"] = cf_patch_conventions

How this works (iris/lib/iris/__init__.py, lines 249 to 258 in 46d2cf6):

# Initialise the site configuration dictionary.
#: Iris site configuration dictionary.
site_configuration = {}
try:
    from iris.site_config import update as _update
except ImportError:
    pass
else:
    _update(site_configuration)

Temporary setting - modify iris.site_configuration

import iris

def cf_patch_conventions(conventions: str) -> str:
    return f"{conventions} extra" if conventions else conventions

iris.site_configuration = {
    "cf_profile": lambda cube: None,
    "cf_patch_conventions": cf_patch_conventions,
}

iris.save(something, "something.nc")

You could even put this into a context manager:

from collections.abc import Iterator
from contextlib import contextmanager

import iris


@contextmanager
def convention_addition(addition: str) -> Iterator[None]:
    def cf_patch_conventions(conventions: str) -> str:
        return f"{conventions} {addition}" if conventions else conventions

    original_config = iris.site_configuration.copy()

    iris.site_configuration = {
        "cf_profile": lambda cube: None,
        "cf_patch_conventions": cf_patch_conventions,
    }
    try:
        yield
    finally:
        # Restore the original settings even if the save raises.
        iris.site_configuration = original_config


with convention_addition("extra"):
    iris.save(something, "something.nc")

More info

All modifications currently apply only to Iris saving. This feature is currently undocumented, so if you'd like to know the full extent of what you can do, please browse the code at the points below, and perhaps search the Iris tests:

profile = iris.site_configuration["cf_profile"](cube)

cf_patch = iris.site_configuration.get("cf_patch")
if cf_patch is not None:
    # Perform a CF patch of the dataset.
    cf_patch(profile, self._dataset, cf_var_cube)

conventions_patch = iris.site_configuration.get("cf_patch_conventions")
if conventions_patch is not None:
    conventions = conventions_patch(conventions)
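
The Conventions lookup in the last snippet can be tried out without Iris. This is only a sketch: a plain dict stands in for iris.site_configuration, and apply_conventions_patch is a hypothetical wrapper around the two lines quoted above, not an Iris function.

```python
def apply_conventions_patch(conventions, config):
    """Mimic the hook lookup from the saver on a plain dict (illustration only)."""
    conventions_patch = config.get("cf_patch_conventions")
    if conventions_patch is not None:
        conventions = conventions_patch(conventions)
    return conventions


def cf_patch_conventions(conventions):
    # Same hook as in the examples above: append a phrase when a value exists.
    return f"{conventions} extra" if conventions else conventions


# Stand-in for iris.site_configuration (assumption: a plain dict behaves the same here).
site_configuration = {"cf_patch_conventions": cf_patch_conventions}

print(apply_conventions_patch("CF-1.7", site_configuration))  # CF-1.7 extra
print(apply_conventions_patch("CF-1.7", {}))  # CF-1.7
```

With no hook registered the string passes through unchanged, which is why an empty site configuration leaves the default Conventions value alone.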
