udunits supports ppm, but documentation states it does not #260

Closed
mathiasbockwoldt opened this issue Apr 22, 2020 · 50 comments · Fixed by #390 or #413
Labels
change agreed, enhancement

Comments

@mathiasbockwoldt

Title

udunits supports ppm, but documentation states it does not

Moderator

None at the moment

Requirement Summary

The documentation states that udunits does not support dimensionless ratios like parts-per-million. But udunits does support such units.

Technical Proposal Summary

Change sentence to state that ppm, ppb etc are supported, or at least remove the statement that these units are not supported.

Benefits

Every reader working with dimensionless ratios will benefit from the correction.

Status Quo

The current working draft (1.9) states that ppm etc. are not supported by udunits.

Detailed Proposal

The chapter "Description of the Data" - "Units" (ch03.adoc) states:

The Udunits package defines a few dimensionless units, such as percent, but is lacking commonly used units such as ppm (parts per million).

The last part is wrong. UDUNITS-2 does support ppm, ppb, ppt, and ppq, and percent is also supported. Other similar units, like permille, are not supported. In fact, ppm and the related units have been supported since udunits2 version 2.1.24 from August 2011. The change occurred here:
Unidata/UDUNITS-2@789147f

Please update the documentation to reflect the support of ppm etc.
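
For illustration, the scale factors behind these units can be written as a small lookup table. This is a minimal Python sketch, not the UDUNITS-2 API: the dictionary `PARTS_PER` and the helper `to_fraction` are hypothetical names, and the values for ppb, ppt and ppq assume the short-scale meanings (billion = 1e9, trillion = 1e12, quadrillion = 1e15).

```python
# Hypothetical lookup table for illustration; not part of UDUNITS-2 itself.
# Each entry is the pure number that one unit of the given kind represents.
PARTS_PER = {
    "percent": 1e-2,
    "ppm": 1e-6,
    "ppb": 1e-9,   # assumes short-scale "billion"
    "ppt": 1e-12,  # assumes short-scale "trillion"
    "ppq": 1e-15,  # assumes short-scale "quadrillion"
}


def to_fraction(value, unit):
    """Convert a value given in a parts-per unit to a plain fraction (unit "1")."""
    try:
        return value * PARTS_PER[unit]
    except KeyError:
        raise ValueError(f"unsupported dimensionless unit: {unit!r}") from None


co2 = to_fraction(415.0, "ppm")  # roughly 4.15e-4
print(co2)
```

A unit that is missing from the table, such as permille, raises `ValueError` in this sketch, which mirrors the point that not every parts-per unit is defined.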

mathiasbockwoldt added the defect label Apr 22, 2020
@JonathanGregory
Contributor

Dear @mathiasbockwoldt

Thanks for raising this. I am glad to learn that udunits has been updated! Since CF1.8 gives https://www.unidata.ucar.edu/software/udunits/ as its reference for udunits, which means udunits 2 in effect, I agree that the text should be changed. We need to agree a specific change. Quoting a bit more of the current text:

The Udunits package defines a few dimensionless units, such as percent, but is lacking commonly used units such as ppm (parts per million). This convention does not support the addition of new dimensionless units that are not udunits compatible. The conforming unit for quantities that represent fractions, or parts of a whole, is "1". The conforming unit for parts per million is "1e-6".

I would suggest we replace that with

The Udunits package defines a few dimensionless units, such as percent and ppm (parts per million). This convention does not support the addition of new dimensionless units that are not udunits compatible. The conforming unit for quantities that represent fractions, or parts of a whole, is "1".

That is, we remove the recommendation to use 1e-6 for ppm. What do you and others think?

I am going to change the label of this issue to Enhancement because I would say that this is actually a material change to the convention (although a minor one) and thus deserves more scrutiny than a defect correction.

Cheers, Jonathan

JonathanGregory added the enhancement label and removed the defect label Apr 22, 2020
@sethmcg
Contributor

sethmcg commented Apr 22, 2020

I think the proposed change sounds good, and I support it.

@taylor13

Thanks to @mathiasbockwoldt for pointing this out and to @JonathanGregory for the proposed wording, which I support.

@martinjuckes
Contributor

Does this mean that we accept both ppm and ppmv (parts per million by volume), which are equivalent in udunits2 (i.e. they conform and 1 ppm = 1 ppmv)? I ask because this appears to stretch the concept of physical equivalence of units.

@cameronsmith1

Does this mean that we accept both ppm and ppmv (parts per million by volume), which are equivalent in udunits2 (i.e. they conform and 1 ppm = 1 ppmv)? I ask because this appears to stretch the concept of physical equivalence of units.

Hi @martinjuckes . I am not sure I understand your comment. Are you pointing out that ppm is not the same as ppmv, even though they are usually so close that most people ignore the difference?

That is, we remove the recommendation to use 1e-6 for ppm.

Hi @JonathanGregory . I certainly prefer ppm to 1e-6 because it helps distinguish the quantity from other dimensionless units, such as kg/kg (a common source of mistakes). However, what should a user do for a concentration that is smaller than udunits allows? There is also the downside that programs reading the data need to be told to recognize ppm/ppb/etc. and multiply by 1e-6/1e-9/etc., rather than simply multiplying by the units. Personally, I would sacrifice this convenience to avoid confusion between molar concentrations and mass concentrations.

@martinjuckes
Contributor

Hi @cameronsmith1 : I'm asking whether we accept them as defined in udunits2. In udunits2, ppm and ppmv are exactly equivalent and interchangeable. My opinion is closer to yours: these two units are similar, but not equivalent. Hence, it might be worth excluding ppmv and other "by-volume" units on the grounds that the "by-volume" information is really information about the variable, which should be in the standard name.

@cameronsmith1

Hi @martinjuckes . I think you are correct, and I agree with you. Fortunately, if people don't follow what you say the consequence will probably be negligible.

@steingod

I support the rewording suggested by @JonathanGregory with the modifications suggested by @cameronsmith1 and @martinjuckes.

@dopplershift

Suggestion from the peanut gallery: Might this be an opportunity to move away from having CF's documentation of supported units as simply "whatever UDUnits supports"? It has always struck me as odd that a standard that takes such pains to be precise about certain aspects of what constitutes valid metadata just completely punts (to an entirely separate tool with no standards body behind it) with regards to valid units. And having just spent significant time trying to understand whether "Celsius" is and should be a valid unit (it is), some verbiage describing valid unit strings would IMO be much better than having to install and run a separate tool.

@roy-lowry

I have always had a problem with the ambiguity of units like ppm. An oxygen measurement in the atmosphere in ppm could have two different values, one in mass/mass and the other in volume/volume. Hence the use of ppmv by some, but my preference has always been to be semantically explicit with units such as microlitres/litre or mg/kg. Consequently, I was disappointed when ppm was added to UDUNITS. @JonathanGregory 's suggested change of words is an improvement in my view, but I would like to see it accompanied by a suggestion to use unambiguous, semantically explicit units.

To @dopplershift I would say that UDUNITS was incorporated into the initial CF standard because it was a part of the NetCDF community standard upon which the CF community standard was based. Replacing UDUNITS by a units vocabulary under CF governance is something that has been discussed by the CF community several times, usually during discussions of the units to be used for salinity in oceanography, but it wasn't and isn't a feasible proposition within available resources, and so solutions such as "1e-3" and "1e-6" were written into the conventions.

@JonathanGregory
Contributor

I agree with what has been said by @martinjuckes, @cameronsmith1 and @roy-lowry. I don't think dimensionless units should be used to distinguish between dimensionless quantities, since that's the purpose of standard names, but Roy's preference for explicitly describing the dimensionless ratios makes sense to me. Thus we have these two proposals. We could add both of them to the text:

  • CF does not allow ppmv, ppbv and pptv, because the distinction between ratios of volume and other dimensionless ratios should be made by the standard_name, not the units.

  • When a dimensionless quantity is a ratio of dimensional quantities, data-writers are encouraged to consider specifying its units as a ratio of units, for instance mg kg-1 or equivalents instead of ppm for a mass ratio, and ul l-1, microlitre litre-1 or equivalents for ppmv, for clarity, while also providing a standard_name.

I have phrased the first as a negative requirement (a prohibition), meaning that these units would cause the CF checker to report an error. It could instead be a negative recommendation (a deprecation), so that the CF checker would give a warning. What do you think?

On @dopplershift's point, I agree with Roy. I would add that CF relies on udunits only for the definition of possible syntax and legal units, not for the udunits software. I should think that Unidata would be happy to have suggestions for improving the udunits documentation of what is allowed. Unidata have provided a useful resource and CF doesn't have much human resource available, so it is better not to duplicate. The "delegation" to udunits has served us well, I would say.
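
The equivalence behind the second proposal above (a ratio of units instead of ppm) is simple arithmetic: mg kg-1 and microlitre litre-1 both reduce to the pure number 1e-6. A minimal sketch in Python using exact fractions; the table `SI_FACTOR` and the function `ratio_factor` are hypothetical helpers for illustration, not an existing units library.

```python
from fractions import Fraction

# Scale of each unit relative to a reference unit of the same kind
# (kg for mass, litre for volume). Hypothetical table for illustration.
SI_FACTOR = {
    "kg": Fraction(1),
    "g": Fraction(1, 10**3),
    "mg": Fraction(1, 10**6),
    "litre": Fraction(1),
    "microlitre": Fraction(1, 10**6),
}


def ratio_factor(numerator, denominator):
    """Pure number represented by one `numerator` unit per `denominator` unit.

    The caller is responsible for passing two units of the same kind;
    this sketch does not check dimensions.
    """
    return SI_FACTOR[numerator] / SI_FACTOR[denominator]


assert ratio_factor("mg", "kg") == Fraction(1, 10**6)             # same scale as ppm
assert ratio_factor("microlitre", "litre") == Fraction(1, 10**6)  # same scale as ppmv
assert ratio_factor("g", "kg") == Fraction(1, 10**3)
```

The numbers are identical, so the choice between ppm and mg kg-1 changes only how much the units string tells a human reader about the kind of ratio.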

@JonathanGregory
Contributor

The last point, raised by Ryan @dopplershift, is of course a reasonable question and has been raised before. If others agree with my answer, we should put it in the FAQ.

@cameronsmith1

I agree with the text by @JonathanGregory . Use of ppmv instead of ppm is very common, and I don't think it is a big deal in practice, but I expect it will generally be easy to fix for users, so I don't have a strong opinion between whether it gives an error or a warning.

@Dave-Allured
Contributor

I support the original request. CF should accept well known units such as ppm and ppmv without error, warning, or prejudice. These only improve the information content, not detract. Such units are more meaningful to humans than current recommendations like "1" and "1e-6".

@erget
Member

erget commented Oct 6, 2022

Merged via unnecessarily complicated #390

erget closed this as completed Oct 6, 2022
@JonathanGregory
Contributor

Please could you clarify what change has been made here? Although there was agreement on some points in this issue, I don't see a conclusion that we agreed should be implemented. In particular, the last two comments are in disagreement. Thanks. Jonathan

@erget
Member

erget commented Oct 6, 2022

Hi @JonathanGregory - I was going off of #385 (comment), which stated that the agreed changes in #390 would close this issue. Sorry for prematurely closing this one; I haven't been involved in this discussion but thought this one was dormant as the last comment was from 2020.

@larsbarring
Contributor

During the hackathon we also discussed this part of the text, i.e. that UDUNITS now includes ppm and others. But obviously the suggested changes did not make it into the final PR. My apologies, and thanks to @JonathanGregory for spotting this.

From the discussion in this issue I take it that there are two parts: (1) the current CF text needs to be updated because UDUNITS now accepts ppm, ppb, ppmv, and ppbv, etc. (2) There is no consensus regarding whether the "by volume" variants should be allowed or not.

@JonathanGregory
Contributor

Dear Daniel @erget, @larsbarring and all

You're right that this was dormant, and thanks for reawakening it. From Daniel's comment I understand that the changes which have been implemented are those of issue 385, which corrects various references to UDUNITS. Thanks for implementing those corrections. I agree with Lars's summary that there are two things still to be addressed in this issue:

(1) the current CF text needs to be updated because UDUNITS now accepts ppm, ppb, ppmv, and ppbv, etc. (2) There is no consensus regarding whether the "by volume" variants should be allowed or not.

On (2), the majority opinion was that CF should either prohibit or deprecate the "by volume" units (ppmv etc.) The reason is that the distinction between fraction by mass or mole and fraction by volume relates to the quantity being described, and therefore should be made by the standard_name, not by the units. In my opinion, this is consistent with dimensional units, where CF insists that quantities with different units also have different standard names. But there isn't a consensus about this, as Lars says. Are there any more views on this point?

On (1), we have to decide whether to say anything about the non-v dimensionless units (ppm etc.), or keep silent (implying that CF thinks they are fine). In previous discussion, views were expressed that they are acceptable in CF, but we should encourage people to code something more informative:

When a dimensionless quantity is a ratio of dimensional quantities, data-writers are encouraged to consider specifying its units as a ratio of units, for instance mg kg-1 or equivalents for a mass ratio of 1e-6, and ul l-1, microlitre litre-1 or equivalents for a volume ratio of 1e-6, while also providing a standard_name.

Best wishes

Jonathan

@larsbarring
Contributor

Dear @JonathanGregory

I fully agree with the reasons in your comment on point (2). But I think the "by volume" units as such should not be prohibited or deprecated. In connection with standard names I do agree they should be prohibited. But without a standard name I argue that the "by volume" units have their uses; after all, they are widely used.

Regarding your comment on point (1) I like your proposed text, possibly with the minor modification to delete the two "of 1e-6" because the unit ratios are just examples. Furthermore, possibly a sentence could be added to clarify that "by volume" units are not allowed for use with standard names.

@cameronsmith1

cameronsmith1 commented Oct 7, 2022 via email

I think many users find it helpful when plotting packages automatically add the units to a plot, so there is an advantage to having a units string in the metadata, in accordance with the comment of @Dave-Allured , so I am in favor of deprecating rather than prohibiting ppmv, as @JonathanGregory suggests.

I also think we are supporting, or at least allowing, ppm. Correct? The text and logic need to be parsed carefully to see this. To make it more obvious, I suggest we revise the last sentence of the first paragraph to eliminate the double negative, e.g. "This convention allows dimensionless units that are UDUNITS compatible". This allows ppmv, which is consistent with the second paragraph if we deprecate it rather than prohibit it.

@JonathanGregory
Contributor

Dear @larsbarring and Philip @cameronsmith1

I think perhaps we're being inconsistent. On the one hand, we suggest that microlitre litre-1 for a volume ratio would be better than ppm because it's more informative, but on the other hand we deprecate ppmv because it's too informative (and hence might be inconsistent) if there's a standard name as well!

Cheers

Jonathan

@larsbarring
Contributor

I think that what we now are discussing is how to strike the right balance between formalism and pragmatism. The SI Brochure Section 5.4.7 is a good starting point:

**5.4.7 Stating quantity values being pure numbers**
As discussed in Section 2.3.3, values of quantities with unit one, are expressed simply as
numbers. The unit symbol 1 or unit name “one” are not explicitly shown. SI prefix symbols
can neither be attached to the symbol 1 nor to the name “one”, therefore powers of 10 are
used to express particularly large or small values.

Quantities that are ratios of quantities of the same kind (for example length ratios and
amount fractions) have the option of being expressed with units (m/m, mol/mol) to aid the
understanding of the quantity being expressed and also allow the use of SI prefixes, if this
is desirable (μm/m, nmol/mol). Quantities relating to counting do not have this option, they
are just numbers.

The internationally recognized symbol % (percent) may be used with the SI. When it is
used, a space separates the number and the symbol %. The symbol % should be used rather
than the name “percent”. In written text, however, the symbol % generally takes the
meaning of “parts per hundred”. Phrases such as “percentage by mass”, “percentage by
volume”, or “percentage by amount of substance” shall not be used; the extra information
on the quantity should instead be conveyed in the description and symbol for the quantity.

The term “ppm”, meaning 10^−6 relative value, or 1 part in 10^6, or parts per million, is also
used. This is analogous to the meaning of percent as parts per hundred. The terms “parts per
billion” and “parts per trillion” and their respective abbreviations “ppb” and “ppt”, are also
used, but their meanings are language dependent. For this reason the abbreviations ppb and
ppt should be avoided.

Here the use of % and ppm is clearly supported, and reasons related to language differences are given for avoiding ppb and ppt. Also, good arguments are given for allowing unit ratios, like microlitre litre-1 (as in Jonathan's example). In contrast, the SI Brochure specifically disallows "per volume" (etc.) units.

I think that part of the current discussion is that ppm is implicitly taken to be a "per mass" unit, which it is not (the trailing "m" in ppm, or just historical dominance??). And this leads to the false perception that "per volume" quantities have to have different units. However, if one is really keen to fold the quantity type into the unit then Wikipedia on "Parts-per notation" helpfully points at ppmw etc. for "by-weight" units, and refers to a separate page detailing a different way to express "per-mole" fractions.

With all the different standard names (and in particular the meticulous work in establishing their precise link to the quantity they represent) I would be reluctant to fold the quantity meaning into the unit. In a quick search in the standard name table I found 11 standard names including the phrase "volume_fraction" and quite a few involving either "mass_fraction" or "mole_fraction". All have canonical unit "1". I fail to see that allowing ppmv would add anything to the "volume_fraction" ones, especially in the light of the clear statement in the SI Brochure, and the anguish expressed at having to add them to the UDUNITS database (see link in top post).

From my perspective I think that it is important to make a distinction between quantity and unit. And this is basically what the standard names and their canonical units are doing. I do see an advantage in allowing unit ratios like, for example microlitre litre-1 for the standard name volume_fraction_of_oxygen_in_sea_water, or unit g kg-1 (or kg kg-1) for specific_humidity. This is supported both by SI and UDUNITS. But when there is no standard name (for whatever reason) I do think that we should be pragmatic and allow "per volume" units like ppmv.

@larsbarring
Contributor

Hi all,

With Jonathan's suggestion I think that we were nearly converging towards something that everyone could live with, but maybe not meeting anyone's every requirement. I think that it would be a pity to not move this forward. So, (esp. @JonathanGregory, @cameronsmith1, @Dave-Allured) could we proceed with Jonathan's suggestion?

Kind regards,
Lars

@Dave-Allured
Contributor

Per above, I abstain. That means you may proceed.

@JonathanGregory
Contributor

Dear @larsbarring

Thanks for getting back to this. It seems like we have a consensus except on the point of whether ppmv etc. should be prohibited or deprecated if there is a standard name. In most recent contributions, you favour prohibition, @cameronsmith1 favours deprecation, and I was equivocal.

In order to deprecate it, we would have to add an indication in the standard name table of which standard names would trigger the deprecation viz. any whose canonical unit is 1 and which are not volume ratios. I think that's too complicated, which makes me favour prohibition. Without any extra information, we can prohibit units of ppmv etc. in combination with any standard name.

That means we would also prohibit ppmv when used with standard names for quantities which aren't volume ratios, which is right to prohibit. It is clearly incorrect to have units of ppmv with mass_fraction_of_carbon_dioxide_in_air, although it's dimensionally correct.

Best wishes

Jonathan
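
The prohibition described in this comment needs no information beyond the variable's own attributes, which is what makes it simpler to check than a deprecation tied to particular standard names. A minimal sketch of such a check in Python; the function `check_by_volume_units` and the set `BY_VOLUME_UNITS` are hypothetical names, the set lists only the by-volume units named in this thread, and this is not the actual CF checker code.

```python
# By-volume parts-per units named in this discussion (illustrative, not exhaustive).
BY_VOLUME_UNITS = {"ppmv", "ppbv", "pptv"}


def check_by_volume_units(attrs):
    """Return a list of error messages for one variable's attribute dict.

    Rule sketched here: a by-volume unit is an error whenever a
    standard_name is present, whatever that standard_name is.
    """
    errors = []
    units = attrs.get("units")
    if units in BY_VOLUME_UNITS and "standard_name" in attrs:
        errors.append(
            f"units {units!r} must not be combined with standard_name "
            f"{attrs['standard_name']!r}"
        )
    return errors


# Error: by-volume unit together with a standard_name.
print(check_by_volume_units(
    {"units": "ppmv", "standard_name": "mass_fraction_of_carbon_dioxide_in_air"}
))
# No error: by-volume unit without a standard_name.
print(check_by_volume_units({"units": "ppmv", "long_name": "CO2 by volume"}))
```

Because the rule never inspects the value of `standard_name`, no extra column in the standard name table is needed to drive it.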

@cameronsmith1

Given all the discussion, I am OK with either prohibiting or deprecating ppmv.

@larsbarring
Contributor

larsbarring commented Nov 2, 2022

Yes, I favour prohibition of "per-volume" units in connection with standard names, for reasons I have expressed in earlier posts. This is only strengthened by @JonathanGregory's insight that if they were deprecated (or allowed altogether), a mechanism would have to be in place to distinguish the standard names for which they are allowed from those for which they are not. That would just add another level of complication to deal with.

Kind regards,
Lars

@JonathanGregory
Contributor

Dear all

In view of the discussion, and thanks to flexibility by @Dave-Allured and @cameronsmith1, this is the currently proposed new text for Section 3.1. Note that I have reordered it, to group the statements about UDUNITS, but without changing it otherwise. I hope this is fine for all, but please say if not.

Units are not required for dimensionless quantities. A variable with no units attribute is assumed to be dimensionless. However, a units attribute specifying a dimensionless unit may optionally be included. The canonical unit (see also 3.3, Standard Name, below) for dimensionless quantities that represent fractions, or parts of a whole, is "1". Descriptive information about dimensionless quantities, such as sea-ice concentration, cloud fraction, probability, etc., should be given in the long_name or standard_name attributes (see Sections 3.2 and 3.3, below) rather than the units. CF recommends that a standard_name or a long_name should be provided for all data variables. When a dimensionless quantity is a ratio of dimensional quantities, CF suggests that it may be informative to users of data if the units are given as a ratio of dimensional units, for instance mg kg-1 for a mass ratio of 1e-6, or microlitre litre-1 for a volume ratio of 1e-6.

The UDUNITS package defines a few dimensionless units, such as percent and ppm (parts per million). The CF convention supports dimensionless units that are UDUNITS compatible, with one exception, concerning the dimensionless units defined by UDUNITS for volume ratios, such as ppmv and ppbv. These units are allowed in the units attribute by CF only if the data variable has no standard_name. These units are prohibited by CF if there is a standard_name, because the standard_name defines whether the quantity is a volume ratio, so the units are needed only to indicate a dimensionless number.

to replace this existing text:

Units are not required for dimensionless quantities. A variable with no units attribute is assumed to be dimensionless. However, a units attribute specifying a dimensionless unit may optionally be included. The UDUNITS package defines a few dimensionless units, such as percent, ppm (parts per million), and others. This convention does not support the addition of new dimensionless units that are not UDUNITS compatible. The conforming unit for quantities that represent fractions, or parts of a whole, is "1". The conforming unit for parts per million is "1e-6". Descriptive information about dimensionless quantities, such as sea-ice concentration, cloud fraction, probability, etc., should be given in the long_name or standard_name attributes (see below) rather than the units.

Are there any further concerns or comments? If none are expressed in the next three weeks (by 23rd November), this change will be accepted.

Best wishes

Jonathan

@taylor13

taylor13 commented Nov 2, 2022

I stumbled over the sentence "Descriptive information about dimensionless quantities, such as sea-ice concentration, cloud fraction, probability, etc., should be given in ... " (retained unedited from the original) because it seems to have nothing to do with units, and I don't know why/how one would include this information in the units attribute. Could this be moved to the end of the paragraph and, to make it just a bit clearer, could the sentence be reworded along the lines of "Information describing the dimensionless quantity itself (as opposed to its units) should be given in the long_name or standard_name attributes (see Sections 3.2 and 3.3, below), where one could learn, for example, whether a fraction represented cloud_area_fraction or land_area_fraction."?

@JonathanGregory
Contributor

Dear Karl @taylor13

I believe that this remark is included because actually (at least in the past) people have sometimes used units to describe dimensionless quantities. The next paragraph in the existing text of 3.1 is also concerned with this issue. Perhaps it would make more sense to put this sentence on which you stumbled into that paragraph instead. The text would be as follows:

Units are not required for dimensionless quantities. A variable with no units attribute is assumed to be dimensionless. However, a units attribute specifying a dimensionless unit may optionally be included. The canonical unit (see also 3.3, Standard Name, below) for dimensionless quantities that represent fractions, or parts of a whole, is "1". When a dimensionless quantity is a ratio of dimensional quantities, CF suggests that it may be informative to users of data if the units are given as a ratio of dimensional units, for instance mg kg-1 for a mass ratio of 1e-6, or microlitre litre-1 for a volume ratio of 1e-6.

The UDUNITS package defines a few dimensionless units, such as percent and ppm (parts per million). The CF convention supports dimensionless units that are UDUNITS compatible, with one exception, concerning the dimensionless units defined by UDUNITS for volume ratios, such as ppmv and ppbv. These units are allowed in the units attribute by CF only if the data variable has no standard_name. These units are prohibited by CF if there is a standard_name, because the standard_name defines whether the quantity is a volume ratio, so the units are needed only to indicate a dimensionless number.

Descriptive information about dimensionless quantities, such as sea-ice concentration, cloud fraction, probability, etc., should be given in the long_name or standard_name attributes (see Sections 3.2 and 3.3, below) rather than the units. The units level, layer, and sigma_level are allowed for dimensionless vertical coordinates to maintain backwards compatibility with COARDS. These units are not compatible with UDUNITS and are deprecated by this standard because conventions for more precisely identifying dimensionless vertical coordinates are introduced (see Section 4.3.2, "Dimensionless Vertical Coordinate").

Is that OK? All of the final paragraph here is existing text.

Cheers

Jonathan

@cameronsmith1

@JonathanGregory , @taylor13 , For the sentence 'Descriptive information about dimensionless quantities,...' I think it would help if we included examples of what we mean by 'Descriptive information', which I think are things like: 'volume_fraction, mole_fraction, mass_fraction, area_fraction, optical_thickness, salinity'.

I propose that we change that sentence to something like:

Descriptive information about dimensionless quantities (eg, volume_fraction, mole_fraction, mass_fraction, area_fraction, optical_thickness, salinity), should be incorporated into the long_name or standard_name attributes (see Sections 3.2 and 3.3, below) rather than the units.

This will replace

Descriptive information about dimensionless quantities, such as sea-ice concentration, cloud fraction, probability, etc., should be given in the long_name or standard_name attributes (see Sections 3.2 and 3.3, below) rather than the units.

@taylor13

taylor13 commented Nov 2, 2022

The text proposed by @JonathanGregory is o.k.; I like @cameronsmith1 's minor edit slightly better. I still don't think it ideal to say "Descriptive information about dimensionless quantities" because I count the units as helping to describe the quantity too (i.e., it would be included in the term "descriptive information"). That's why I proposed an alternative in #260 (comment). Another possibility is: "Information describing a physical quantity itself does not belong in the units attribute, but should be given in the long_name or standard_name attributes (see Sections 3.2 and 3.3, below)," and then perhaps also including some examples.

@JonathanGregory
Contributor

Dear Karl and Philip

Yes, I believe the sentence means to say, "Information describing a physical quantity itself does not belong in the units attribute, but should be given in the long_name or standard_name attributes," in Karl's words. The reason for the sentence is that people have sometimes put things like units="sea ice coverage". In the case of dimensional quantities, they won't do that, because it's obvious that the unit belongs in there, but when the quantity is dimensionless and has "no unit", the units can seem a tempting place to put information to describe the nature of the number supplied.

Best wishes

Jonathan
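
As a concrete illustration of the misuse described in this comment, compare two hypothetical attribute sets for a sea-ice variable, written here as Python dictionaries. The variable itself is invented for the example; `sea_ice_area_fraction` is used as the standard name, with "1" as its units.

```python
# Misuse: descriptive text placed in the units attribute.
bad_attrs = {"units": "sea ice coverage"}

# Preferred: the quantity is described by standard_name and long_name,
# and units only states that the value is a dimensionless fraction.
good_attrs = {
    "standard_name": "sea_ice_area_fraction",
    "long_name": "sea ice coverage",
    "units": "1",
}

for name, attrs in (("bad", bad_attrs), ("good", good_attrs)):
    print(name, attrs)
```

In the second form the description has a proper home, so nothing tempts the data-writer to put it into `units`.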

@cameronsmith1

Thanks, Jonathan for explaining why it is an important point to make. FWIW, I think that Karl's version of the text is clearer too.

@larsbarring
Contributor

larsbarring commented Nov 4, 2022

Thank you @JonathanGregory, @taylor13 and @cameronsmith1 for these final wordsmithing efforts. I cannot contribute to this, and will be happy to accept the outcome.

@JonathanGregory
Contributor

Dear all

Thanks for your comments. Here's another suggested version, incorporating Karl's sentence. The first two paragraphs are the same as before.

Best wishes

Jonathan

Units are not required for dimensionless quantities. A variable with no units attribute is assumed to be dimensionless. However, a units attribute specifying a dimensionless unit may optionally be included. The canonical unit (see also 3.3, Standard Name, below) for dimensionless quantities that represent fractions, or parts of a whole, is "1". When a dimensionless quantity is a ratio of dimensional quantities, CF suggests that it may be informative to users of data if the units are given as a ratio of dimensional units, for instance mg kg-1 for a mass ratio of 1e-6, or microlitre litre-1 for a volume ratio of 1e-6.

The UDUNITS package defines a few dimensionless units, such as percent and ppm (parts per million). The CF convention supports dimensionless units that are UDUNITS compatible, with one exception, concerning the dimensionless units defined by UDUNITS for volume ratios, such as ppmv and ppbv. These units are allowed in the units attribute by CF only if the data variable has no standard_name. These units are prohibited by CF if there is a standard_name, because the standard_name defines whether the quantity is a volume ratio, so the units are needed only to indicate a dimensionless number.

Information describing a dimensionless physical quantity itself (e.g. "area fraction" or "probability") does not belong in the units attribute, but should be given in the long_name or standard_name attributes (see Sections 3.2 and 3.3, below), in the same way as for physical quantities with dimensional units. There is an exception to this rule, that the units level, layer, and sigma_level are allowed for dimensionless vertical coordinates to maintain backwards compatibility with COARDS. These units are not compatible with UDUNITS and are deprecated by this standard because conventions for more precisely identifying dimensionless vertical coordinates are introduced (see Section 4.3.2, "Dimensionless Vertical Coordinate").
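As a quick check of the arithmetic in the first paragraph of the proposal, here is a minimal sketch showing that both example ratio units reduce to the pure number 1e-6. The lookup tables and the `ratio_scale` helper are made up for this illustration; they are not part of CF or UDUNITS.

```python
# Hypothetical lookup tables (assumptions for illustration only):
# how many kilograms, or litres, one of each unit represents.
TO_KG = {"kg": 1.0, "g": 1e-3, "mg": 1e-6}
TO_LITRE = {"litre": 1.0, "millilitre": 1e-3, "microlitre": 1e-6}


def ratio_scale(numerator, denominator, table):
    """Return the pure number represented by one `numerator` per `denominator`."""
    return table[numerator] / table[denominator]


# "mg kg-1" and "microlitre litre-1" from the proposed text
mass_ratio = ratio_scale("mg", "kg", TO_KG)
volume_ratio = ratio_scale("microlitre", "litre", TO_LITRE)
print(mass_ratio, volume_ratio)
```

Both values are numerically the same as the dimensionless unit ppm; what the ratio form adds is a reminder to the reader of which kind of ratio (mass or volume) is meant.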

@taylor13

taylor13 commented Nov 4, 2022

I'm happy with that. I didn't pay attention to the last 2 sentences until now. Could the first of these be slightly edited as follows:

To maintain backwards compatibility with COARDS, however, the text strings level, layer, and sigma_level can appear in the units attribute to indicate dimensionless vertical coordinates. These strings, however, are not compatible ...

Sorry for suggesting things in bits and pieces.

@JonathanGregory
Contributor

How about this:

Information describing a dimensionless physical quantity itself (e.g. "area fraction" or "probability") does not belong in the units attribute, but should be given in the long_name or standard_name attributes (see Sections 3.2 and 3.3, below), in the same way as for physical quantities with dimensional units. As an exception, to maintain backwards compatibility with COARDS, the text strings level, layer, and sigma_level are allowed in the units attribute, in order to indicate dimensionless vertical coordinates. This use of units is not compatible with UDUNITS, and is deprecated by this standard because conventions for more precisely identifying dimensionless vertical coordinates are available (see Section 4.3.2, "Dimensionless Vertical Coordinate").

@JonathanGregory
Contributor

Here's the current complete proposal to replace the para starting "Units are not required for dimensionless quantities" in Sect 3.1:

Units are not required for dimensionless quantities. A variable with no units attribute is assumed to be dimensionless. However, a units attribute specifying a dimensionless unit may optionally be included. The canonical unit (see also 3.3, Standard Name, below) for dimensionless quantities that represent fractions, or parts of a whole, is "1". When a dimensionless quantity is a ratio of dimensional quantities, CF suggests that it may be informative to users of data if the units are given as ratio of dimensional units, for instance mg kg-1 for a mass ratio of 1e-6, or microlitre litre-1 for a volume ratio of 1e-6.

The UDUNITS package defines a few dimensionless units, such as percent and ppm (parts per million). The CF convention supports dimensionless units that are UDUNITS compatible, with one exception, concerning the dimensionless units defined by UDUNITS for volume ratios, such as ppmv and ppbv. These units are allowed in the units attribute by CF only if the data variable has no standard_name. These units are prohibited by CF if there is a standard_name, because the standard_name defines whether the quantity is a volume ratio, so the units are needed only to indicate a dimensionless number.

Information describing a dimensionless physical quantity itself (e.g. "area fraction" or "probability") does not belong in the units attribute, but should be given in the long_name or standard_name attributes (see Sections 3.2 and 3.3, below), in the same way as for physical quantities with dimensional units. As an exception, to maintain backwards compatibility with COARDS, the text strings level, layer, and sigma_level are allowed in the units attribute, in order to indicate dimensionless vertical coordinates. This use of units is not compatible with UDUNITS, and is deprecated by this standard because conventions for more precisely identifying dimensionless vertical coordinates are available (see Section 4.3.2, "Dimensionless Vertical Coordinate").
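To make the dimensionless units in the second paragraph concrete, here is a rough sketch of how they relate to the canonical unit "1". The table is hand-written for illustration and is not read from UDUNITS; the values for ppb, ppt and ppq are assumed to follow the short-scale pattern (10^-9, 10^-12, 10^-15), and the function names are hypothetical.

```python
import math

# Assumed scale factors relative to the canonical unit "1".
# percent and ppm are named in the proposed text; ppb, ppt and ppq
# are assumptions following the short scale, not values read from UDUNITS.
DIMENSIONLESS_SCALE = {
    "1": 1.0,
    "percent": 1e-2,
    "ppm": 1e-6,
    "ppb": 1e-9,
    "ppt": 1e-12,
    "ppq": 1e-15,
}


def to_fraction(value, units):
    """Express `value` given in `units` as a plain fraction (units "1")."""
    return value * DIMENSIONLESS_SCALE[units]


def convert(value, from_units, to_units):
    """Convert between two of the dimensionless units in the table."""
    return value * DIMENSIONLESS_SCALE[from_units] / DIMENSIONLESS_SCALE[to_units]


# Floating-point results are compared with a tolerance rather than exactly.
assert math.isclose(to_fraction(50, "percent"), 0.5)
assert math.isclose(convert(420, "ppm", "ppb"), 420000.0)
```

In real software the factors would come from UDUNITS itself rather than from a table like this one.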

If there are no further concerns or suggestions, we can accept this in three weeks from now (on 5 Dec). Thanks.

@cameronsmith1

Hi @JonathanGregory. I see only one detail to clarify. The previous discussion noted that the meanings of ppb, ppt, and ppq are language-dependent. The current text doesn't make the CF position clear. However, I infer that the value of these language-dependent quantities is whatever UDUNITS says it is. In which case, perhaps we can clarify this by amending the first sentence of the second paragraph to be:

"The UDUNITS package defines a few dimensionless units, such as percent, ppm (parts per million), and ppb (parts per billion = 10^-9)."

@JonathanGregory linked a pull request on Nov 18, 2022 that will close this issue
@JonathanGregory
Contributor

I have created a pull request to implement this change, including the clarification just suggested by Philip @cameronsmith1, for which thanks; see modified text. Since that's a minor change, I won't reset the three-week countdown, unless someone objects. Note that there's also a new rule in the conformance document:

Dimensionless units for volume fractions defined by UDUNITS (ppv, ppmv, ppbv, pptv, ppqv) are not allowed in the units attribute of any variable which also has a standard_name attribute.
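For anyone implementing this in a checker, here is a minimal sketch of how the rule might be tested, assuming a variable's attributes are available as a plain dict. The function name `check_units` and the message wording are hypothetical and not taken from any existing CF checker; the warning for level, layer and sigma_level reflects the deprecation in the proposed text for Sect 3.1.

```python
# Volume-ratio units named in the conformance rule
VOLUME_RATIO_UNITS = {"ppv", "ppmv", "ppbv", "pptv", "ppqv"}
# Strings kept only for backwards compatibility with COARDS
COARDS_VERTICAL_UNITS = {"level", "layer", "sigma_level"}


def check_units(attrs):
    """Return a list of messages for one variable's attribute dict."""
    messages = []
    units = attrs.get("units")
    if units is None:
        # No units attribute: the variable is assumed to be dimensionless.
        return messages
    if units in VOLUME_RATIO_UNITS and "standard_name" in attrs:
        messages.append(
            f"ERROR: units '{units}' not allowed together with standard_name"
        )
    if units in COARDS_VERTICAL_UNITS:
        messages.append(
            f"WARNING: units '{units}' is deprecated (kept for COARDS compatibility)"
        )
    return messages


# ppmv with a standard_name is flagged; ppmv with only a long_name is not.
flagged = check_units({"units": "ppmv", "standard_name": "mole_fraction_of_ozone_in_air"})
allowed = check_units({"units": "ppmv", "long_name": "ozone volume mixing ratio"})
print(flagged, allowed)
```

The sketch only compares exact strings; a real checker would presumably also need to recognise equivalent spellings that UDUNITS accepts.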

Cheers, Jonathan

@davidhassell
Contributor

Hello - I have just read this whole thread, not having been paying attention previously, and I'd like to thank you all for an interesting discussion and for creating the new text, which I am very happy with.

@JonathanGregory
Contributor

Thanks for reading, checking and adding your support, @davidhassell. Three weeks have passed with no further comments requiring attention, and enough support has been expressed according to the rules, so this change is thus agreed. Thanks for collaborating on the modified text for ppm etc., @larsbarring @taylor13 @cameronsmith1. Please could one of you or @davidhassell merge the pull request and attribute it to milestone 1.11. It's not proper for me to do it, since I wrote it. Thanks.

@JonathanGregory added the "change agreed" label (Issue accepted for inclusion in the next version and closed) on Dec 5, 2022