SAWG MTG Minutes 2010-07-16
- Dan Kowal
- Phil Jones
- Anna Milan
- Scott McCormick
- Jeremy Throwe
- Chris Paver
Data Center Activities
- Number of cost estimates they're working on:
- NPP cal/val data.
- Official request for blended TPW product.
- Phil: Would it be possible in the product charter to compare the support needed for the different products? Nancy would like things split out.
- Jeremy: How formal do these requests need to be? An email from Rick or Bob is needed to kick off the estimate. In terms of details, Scott will give directions to Bob. Image file factors still need to be figured out.
- VIPIR ROM is in final review. It is split into three phases and is close to finishing. Target: September COWG meeting.
- Release 5.4 is going into Integration and Test. Target is September. Anna mentioned the CORS metadata effort and asked whether there are others. Scott is not aware of any.
- Number of cost estimates they're working on:
- The bTPW product. A few products are being archived internally. CDR program: 3 products should be done by the end of the year, including HIRS and the blended sea winds product.
- ICD work for CORS. NCDC example. One ICD with appendices is preferred. Keep the body the same. Hopefully the vast majority will be the same across the interface, with only a few exceptions. This will kick off another round of signatures.
- Missing data discussions. Dan sent out an email regarding action items from a meeting initiated by GOES GS with the Data Centers. Jeremy advised that we take the lessons learned for NPP and make the capture of this information more robust. What's available now from NPP is not in a useful format. Although a "change process" will be initiated (Ananth has been contacted about it), the format or the mission information is still TBD. The Data Centers have indicated that they want to be involved in this process.
- NODC. Welcome Chris. Dan will send out info about the SAWG to him.
CLASS Metadata and CRUD
- Anna presented on the matrix of actors and activities each is engaged in with CRUD.
- Jeremy: CLASS is only inventorying the descriptive info, which constitutes less than 1% of the metadata; it's a subset to support search and access.
- Clarified that the CRUD matrix is the way things are done today.
- CLASS is looking for assurances that the inventory database and the files will not be modified. Jeremy reviewed the update process through file resubmissions. He also pointed out the risk that updating the metadata could leave things out of sync. There needs to be some kind of integrity checking.
- Jeremy: CLASS could support changes with the file name, but it's more complicated to update the source of descriptive information when it's in the file itself.
- Scott: What would help is to look at some use cases where there is a mistake in the file name or with an associated metadata file.
- Anna: This happens a lot with the collection-level metadata; she gave the "dates" example. She said we need a way for data managers to discover when things are out of sync and a mechanism for fixing it.
- Clarified that Use Cases can be both for things (data sets) we're dealing with now as well as hypothetical scenarios.
- Jeremy discussed the DMSP time error problem.
- Anna presented the AVHRR and JASON-2 CRUD examples for granule and collection metadata and use of the Rich Inventory. For the granule scenarios, she distinguished between using a "companion" metadata file and using ancillary metadata.
- Jeremy: CLASS prefers not having to open up a data file; it prefers to deal with the file name and/or a companion metadata file in XML. AVHRR was an exception where they are reading from the file.
- Anna: explained that JASON-2 is a proper model for data stewardship; however, one of the catches for NODC to run the Rich Inventory is that they have to have a complete set of the data also residing at NODC to run it. This is not an issue given that it's a small volume (size) data set.
- Jeremy: CLASS is doing cross-referencing for ancillary and mission data for JASON-2. To do this, CLASS needs the related files for cross-referencing and is using the XML metadata files -- for ODGR. Anna said this has application for the AIP description in Phil's DI XML sent out earlier for evaluation.
- Anna: We need to be clear, as in the case of GOES-R, in defining what CLASS does and what the Data Centers' responsibilities are in managing the metadata; it often gets lumped together.
- Phil: we need descriptive use cases to help this process along. Anna's matrix is good context and a good start.
- Jeremy: Kicked off discussion about the need to handle station-level metadata.
- Dan: Discussed ISIS, which is being used by NCDC.
- Anna: FGDC doesn't really handle station-level metadata very well. ISO does; it can handle multiple extents.
- Chris: Discussed how NODC considers station-level metadata as granules. They have one metadata record per accession.
- The group discussed broader labels for station histories to capture a broader segment. Jeremy settled on Observing Network History Metadata.
- Action Plan:
- Develop a list of Use Cases.
- Work up examples.
- Submit to a broader audience.
- Consider this a grassroots effort for now.
- Anna will post her draft matrix.
- August 20, 2010