Sedimentary logs in CF-NetCDF #309
-
Hello Luke @lhmarsden, I think this is an excellent point, and thank you for opening the discussion topic. NetCDF is an interesting suggestion for this data "type". From my perspective, it fits some of the useful features/capabilities of the format, i.e. 3-dimensional quantitative data, but not others: a sedimentary log is recorded at a specific location, so there is no latitude/longitude grid to make use of those coordinate dimensions. I don't think it's necessarily incompatible, but it would need some careful thought about how to adapt it. One possible sticking point is the date/calendar system used for time with regard to the stratigraphical record. Geology of course deals very substantially with paleo/prehistoric timescales, which would be a little tricky to capture in the time dimension using the CF standard name There is now a " These are some first thoughts I had when reading, so they may or may not be relevant. Looking forward to talking more about this.
-
Dear Luke @lhmarsden Thanks for raising this possibility. I strongly agree with you and Ellie @efisher008 that this is an attractive idea. I think CF-netCDF would be very suitable for geological records of this kind. As you say, new standard names would very likely be needed. String-valued ones are possible, either from controlled vocabularies or not, and they can be stored in files as numbers using a self-documenting encoding with Do you have any geological colleagues who might be interested in joining the first two days of the CF 2024 workshop, in September in Sweden or online? One aim of those days is to bring in interested parties from domains that are new to CF, I believe. It occurs to me that ice-core data is of the same kind. We should try to engage someone's interest from that field too. Sediment cores are one-dimensional. As the stackexchange reply suggests, they could be regarded as discrete sampling geometry profiles following chapter 9, but those arrangements are really space-saving measures for storing many such profiles in a file. Even without chapter 9, it's CF-like to have a structure of the kind illustrated in stackexchange, essentially:
A chapter 9 Best wishes Jonathan
-
Hi, thanks for your positive responses and thoughts. The example you provided @JonathanGregory is close to what I had in mind. Correct me if I am wrong, but my understanding was that latitude and longitude are no longer required coordinate variables in more recent versions of CF, and that this information can be provided using global attributes from ACDD. This is how I have been doing it with, for example, CTD profiles or other vertical profiles. It is also great that this could be used for ice cores, sediment cores, etc. I think using flags is an elegant solution for the lithologies, and it would simplify the process for users. Great that you are hosting this workshop. I was not aware of it before today and will circulate it around my network. Luke
-
I did not see a sign-up link for this workshop. Does one exist yet? I would certainly like to attend and could pick out a few people who might also be interested. FYI I am a data manager on a large multidisciplinary marine project that includes geologists, biologists, chemists, physical oceanographers, engineers... so I have spent quite a lot of time thinking about how to fit more complicated scientific data into CF-NetCDF.
-
Hello Luke, it'd be great to see you at the meeting. The official meeting website, with registration, should be live by the end of the week, I think ... we'll post here when it is. David
-
Dear Luke I don't believe CF has changed in this respect:
In general no coordinate variable is mandatory in CF. If you don't want to record the lat and lon of the sample, CF doesn't oblige you to do so. Section 5 preamble says that if lat and lon are dimensions with size more than one, they must be coordinate variables, but your case has size-one coordinates, for points. For size-one coordinates, CF doesn't require a size-one dimension; you can use a scalar coordinate variable instead (Section 5.7). Section 9, for discrete sampling geometries, has further requirements. All feature types currently defined require lat and lon variables to be supplied (Table 9.1). ACDD is "discovery" metadata, isn't it, for finding the dataset you want. CF is "use" metadata, for processing it. These are different purposes. CF-aware analysis software doesn't generally refer to ACDD metadata, so it would be helpful to analysts to provide lat and lon as netCDF coordinate variables to each data variable, even if you provide them as global attributes as well. If all data variables apply to the same location, they can share these single-valued coordinate variables. Putting the same information in both ACDD and CF metadata is redundancy, which we generally avoid, but I don't think we can in this case. Best wishes Jonathan
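To make the scalar coordinate variable suggestion concrete, here is a minimal sketch in plain Python that writes out a CDL description of a one-dimensional log. Everything specific in it is an assumption for illustration only: the function name `sediment_log_cdl`, the variable name `grain_size` with its `long_name` and units, the use of the `depth` standard name for depth within the log, and the coordinates and data values. It is not an agreed CF layout for sedimentary logs.

```python
def sediment_log_cdl(lat, lon, depth, grain_size):
    """Render a CDL sketch of a 1-D log with scalar lat/lon coordinate variables."""
    if len(depth) != len(grain_size):
        raise ValueError("depth and grain_size must have the same length")

    def join(values):
        # CDL data lists are comma-separated.
        return ", ".join(str(v) for v in values)

    lines = [
        "netcdf sediment_log {",
        "dimensions:",
        f"  depth = {len(depth)} ;",
        "variables:",
        "  double depth(depth) ;",
        '    depth:standard_name = "depth" ;',  # assumed choice of standard name
        '    depth:units = "m" ;',
        '    depth:positive = "down" ;',
        # Scalar coordinate variables: no size-one dimension is declared.
        "  double lat ;",
        '    lat:standard_name = "latitude" ;',
        '    lat:units = "degrees_north" ;',
        "  double lon ;",
        '    lon:standard_name = "longitude" ;',
        '    lon:units = "degrees_east" ;',
        # Hypothetical data variable; no standard name is assumed to exist for it.
        "  double grain_size(depth) ;",
        '    grain_size:long_name = "grain size" ;',
        '    grain_size:units = "mm" ;',
        # The coordinates attribute ties the scalar lat/lon to the data variable.
        '    grain_size:coordinates = "lat lon" ;',
        "data:",
        f"  depth = {join(depth)} ;",
        f"  lat = {lat} ;",
        f"  lon = {lon} ;",
        f"  grain_size = {join(grain_size)} ;",
        "}",
    ]
    return "\n".join(lines)


# Hypothetical location and values, for illustration only.
cdl = sediment_log_cdl(78.2, 15.6, [0.25, 1.0, 2.0], [0.06, 0.2, 0.5])
print(cdl)
```

Any further data variables at the same location could list the same `lat lon` pair in their own `coordinates` attribute, so the position is stored once and shared.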
-
Topic for discussion
Geologists collect a lot of data that are useful but often not published in a machine-readable way.
I have been discussing sedimentary logs with several geologist colleagues. I am not aware of any standardised formats used for these data; I think people mostly just include an image of their log in their paper. I wonder if CF-NetCDF would be a suitable format for these data. I found this on Stack Exchange:
https://earthscience.stackexchange.com/questions/19513/storing-borehole-interval-data-logs-in-netcdf
In brief, the netCDF file would have a single dimension (depth). Different variables could be used for the different parameters that geologists are interested in (age, grain size, temperature, etc.). Standard names would need to be proposed for some of these, but I think that discussion should take place elsewhere.
Cell bounds can be used for each layer in the sedimentary log.
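As a rough sketch of the cell bounds idea, the helper below turns a list of layer boundary depths into one depth value per layer (the midpoint) plus a (top, bottom) pair per layer, which is the shape a CF bounds variable with two vertices per cell would take. The function name `layer_bounds` and the boundary depths are hypothetical, and using the midpoint as the coordinate value is an assumption rather than a requirement.

```python
def layer_bounds(boundaries):
    """Turn N+1 ordered boundary depths into N midpoints and N (top, bottom) pairs."""
    if len(boundaries) < 2:
        raise ValueError("need at least two boundary depths")
    if any(b <= a for a, b in zip(boundaries, boundaries[1:])):
        raise ValueError("boundary depths must be strictly increasing")
    # Each layer spans from one boundary to the next.
    bounds = [(top, bottom) for top, bottom in zip(boundaries, boundaries[1:])]
    # Assumed convention: the coordinate value is the middle of the layer.
    depth = [(top + bottom) / 2 for top, bottom in bounds]
    return depth, bounds


# Hypothetical log: boundaries in metres below the top of the section.
depth, depth_bnds = layer_bounds([0.0, 0.5, 1.5, 2.5])
print(depth)
print(depth_bnds)
```

In a file, `depth` would be the coordinate variable and `depth_bnds` would be stored as a two-column variable named by the `bounds` attribute of `depth`.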
Lithology is another important variable, but this is of course a text string. Perhaps this could be handled in a similar way to how taxon names are handled in this example:
https://cfconventions.org/Data/cf-conventions/cf-conventions-1.11/cf-conventions.html#taxon-names-and-identifiers
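A different option from the taxon-style string variables, and the one picked up in the replies as "flags", is to store lithology as integer codes described by the CF `flag_values` and `flag_meanings` attributes. The sketch below shows the encoding step only. The function name `encode_lithology`, the three-term vocabulary and the choice to number codes from 1 are all assumptions for illustration; a real file would presumably draw its terms from an agreed controlled vocabulary.

```python
import re


def encode_lithology(labels, vocabulary):
    """Map lithology labels to integer codes plus CF flag attributes."""
    # flag_meanings is a blank-separated list, so each term must be a single
    # word built from alphanumerics and the characters _ - . + @
    for term in vocabulary:
        if not re.fullmatch(r"[A-Za-z0-9_.+@-]+", term):
            raise ValueError(f"not usable as a flag meaning: {term!r}")
    # Assumed convention: codes start at 1, in vocabulary order.
    codes = {term: i for i, term in enumerate(vocabulary, start=1)}
    unknown = [label for label in labels if label not in codes]
    if unknown:
        raise ValueError(f"labels not in vocabulary: {unknown}")
    values = [codes[label] for label in labels]
    attrs = {
        "flag_values": list(codes.values()),
        "flag_meanings": " ".join(vocabulary),
    }
    return values, attrs


# Hypothetical vocabulary and log, one label per layer from top to bottom.
vocabulary = ["mudstone", "siltstone", "sandstone"]
values, attrs = encode_lithology(["sandstone", "mudstone", "sandstone"], vocabulary)
print(values)
print(attrs)
```

The integer list would be the data of a lithology variable along the depth dimension, with `attrs` written as its attributes, so a reader can recover the names without a separate lookup table.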
If it is agreed that this is a sensible format for these data, I would love to see an example of how to structure them, perhaps as an appendix in a later version of the CF conventions documentation, to encourage geologists to publish these important data in a machine-readable way.