SAWG MTG Minutes 2010-06-18
Attendees
- Dan Kowal
- Phil Jones
- Tess Brandon
- Anna Milan
- Scott McCormick
- Jeremy Throwe
Data Center Activities
- NODC (Tess)
- APSCO product approved. Will do an SA for it now; Crosswalk from the SIF.
- NODC wrapping up with their CLASS reqs person; Trying to get a permanent person to fill this role. 1st draft of CLASS reqs done.
- Coral reef products (acidification (OAPS) and thermal product suites): looking to move them into their archive in an automated way; will be a duplicate of what exists in CLASS. The data provider wants data to be served up through LAS; it will take less time and fewer resources to do it at NODC. Discussed some similarities with the JASON project.
- NCDC (Phil)
- There was a request for "total precip" product by OSDPD, but it got withdrawn as it's not going operational.
- Preparing for the next round of model data - 20th century.
- Idea for new way to send descriptive information to CLASS.
- Historically, when a data center has drafted an SA, it provides file name conventions, metadata, etc.
- We are planning to migrate 600 to 800 data collections to CLASS in about 3 years (one PB of data; a good chunk is NEXRAD). Need to hand documentation over to CLASS: data types, in what family, support needed, etc.
- Want to capture this information in XML to give CLASS as a jumpstart on the planning. XML will simplify the process of getting info into a database.
- Scott talked briefly with Lee Crandall. There's a plan to spend a day talking about CLASS SE evolution -- Phil's thoughts are in line with where CLASS SE is heading to configure the system. This would streamline the requirements process so there aren't so many hoops to jump through for search and access capabilities.
- Nancy wants Phil to come up with a draft to share with the SAWG.
- Don't know where CLASS SE is heading with Data Families. Phil asked Tino about it. Tino said they are looking at dynamic families with various data types. Still in development.
- Anna: Have you taken a look at any standard schemas to be used? Phil: Not sure there's one that exists for defining file naming conventions. Anna: If you can find an existing one, it will facilitate the schema development process. Suggest "RDF" (Resource Description Framework) as an example.
- NGDC (Dan)
- With the CORS migration into CLASS, what are the implications for updating the SA?
- If it's a post-provider view, the SA doesn't necessarily need to reflect CLASS-archive backend logistics. If it's not relevant to the provider, then don't put it in.
- Provider would like to know where the data is being stored and accessed. So if the support for the data is changing (e.g. new search/access arrangements), then it should be reflected in the SA.
- There's no hard-and-fast rule. Update the SA as needed. Perhaps make reference to the ICD and add some notation about when the migration occurred and where the data is stored.
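Illustration of the NCDC item above on capturing collection descriptions in XML. This is a minimal sketch only: the element names (`collection`, `dataType`, `family`, `fileNamePattern`, `supportNeeded`), the sample values, and the file name pattern are all hypothetical and are not an agreed SAWG or CLASS schema. It shows one collection being written as XML and then flattened into a row that could be loaded into a database.

```python
import re
import xml.etree.ElementTree as ET


def build_collection_descriptor(collection_id, data_type, family,
                                filename_pattern, support_needed):
    """Build an XML element describing one data collection.

    The element layout is a hypothetical example, not a CLASS schema.
    """
    root = ET.Element("collection", attrib={"id": collection_id})
    ET.SubElement(root, "dataType").text = data_type
    ET.SubElement(root, "family").text = family
    ET.SubElement(root, "fileNamePattern").text = filename_pattern
    support = ET.SubElement(root, "supportNeeded")
    for item in support_needed:
        ET.SubElement(support, "item").text = item
    return root


def descriptor_to_row(xml_text):
    """Parse a descriptor and flatten it into a dict (e.g. for a database insert)."""
    root = ET.fromstring(xml_text)
    return {
        "id": root.get("id"),
        "data_type": root.findtext("dataType"),
        "family": root.findtext("family"),
        "file_name_pattern": root.findtext("fileNamePattern"),
        "support_needed": [e.text for e in root.findall("supportNeeded/item")],
    }


# Hypothetical collection; every value here is made up for illustration.
descriptor = build_collection_descriptor(
    collection_id="example-radar-level2",
    data_type="radar",
    family="example-family",
    filename_pattern=r"^[A-Z]{4}\d{8}_\d{6}$",
    support_needed=["search", "access"],
)
xml_text = ET.tostring(descriptor, encoding="unicode")
row = descriptor_to_row(xml_text)

# The file naming convention travels with the descriptor as a regular expression,
# so a receiving system could check incoming file names against it.
matches = bool(re.match(row["file_name_pattern"], "ABCD20100618_120000"))
print(xml_text)
print(row)
print(matches)
```

Carrying the file naming convention as a regular expression is just one option; as noted in the discussion, an existing standard schema would be preferable if one can be found.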
Metadata Management in CLASS
- Scott: In functional reqs for GOES-R, there's information about managing descriptive information. They have been trying to draft reqs. At CLASS CCB, presented a Level 2 req: "CLASS should have the capability to do CRUD" (create, read, update, delete). There was a question about changing "manage" to "control" in the req. CLASS wants to get a better understanding of the functionality for managing metadata of two kinds:
- Collection level: updating data or pulling/receiving data from that source.
- Dynamic/granule level: what's needed.
- Anna: That's a big question.
- Jeremy: some of this metadata management was discussed at NCDC and CLASS meetings this past week. No one from SAWG was present, but it would be good to get the minutes and have data center participation.
- Need input on accessing and influencing the metadata.
- Anna: Right now, it all happens outside of the CLASS metadata.
- Jeremy: Granule level example: changes can occur through the standard ingest process. If a data steward recognized a deficiency in the metadata, the producer/provider would have to create and resubmit a file to CLASS.
- Anna: Should also require a series of discussions because this is unusual.
- Can Data Centers suggest a change?
- Need a description of metadata; CLASS has the capability to extract metadata, but that's another topic. Whatever needs to be supported for search and display is important.
- Dan: SWPC/GOES-R example.
- Anna: Editing metadata directly from a CLASS interface? There is a need to edit metadata when it changes, and a need to publish it. Data managers/providers need this mechanism to keep the metadata up-to-date.
- Jeremy: There's a parallel between collection and granule; we are re-ingesting updated records.
- Anna: Scenario: "Now I've updated my metadata, now I want to send it to CLASS." CLASS just reflects it in their inventory. Do you log into CLASS to use a tool to change it?
- Jeremy: Concern over the case where it's desired to change all of the granules because time info was wrong. How to update? The information could be wrong in multiple places. Had a DMSP issue with start times rounded to five minutes; the disconnect was that the filename did not reflect the real time.
- Dan: Can we suggest infrastructure support needs like the Rich Inventory?
- Anna: Users may want a graph of quality about their data.
- Scott: Flesh out use cases: an anomaly with an instrument and what metadata has to be updated; reprocessing needs and why; errors found that someone wants to document.
- Anna: This is important for CLASS SE:
- Collection level;
- Inventory for search and access,
- Rich Inventory for data quality
- Inherent metadata that's not turned into metadata (undocumented) - Jeremy's contribution.
- Anna: CORS example - statistical analysis over the data (CORS) for a given time period.
- Scott: There are things that data managers want to do that require an update of the metadata. Metrics analysis -- data missing, or a set of files; not really metadata management per se.
- GOES-R Use Case Side Bar:
- Next week: Product distribution (PD) to CLASS: what happens if the file doesn't get sent or the system goes down; that's the meeting next Monday. Tuesday through Thursday are face-to-face meetings with GSP, Harris, Boeing, and CLASS.
- Anna: Are there designated data managers for these data sets? NG - yes, but education needed as to stewardship responsibilities for data residing in CLASS. NC - still an open question.
- No specific deadline for getting metadata use cases. For the purposes of defining L2 reqs, need to come to some resolution on the meaning of managing metadata. Develop specific use cases. Dan and Anna will take a crack at it and pass it around to the other DC reps.
- Anna: Does this support need to happen under the core-base infrastructure for CLASS?
- Phil: Do you want this functionality in CLASS or in middleware for something like NEAATT?
- Scott: This is for CLASS.
- Data Management Toolkit still tied up in management levels. The DCs have proposed some additional work for the CLASS contract year, but still under discussion. The work is focused on a DM toolkit. This is part of CLASS.
- Next step: write up some use-cases. Each use-case should be utilizing some piece of CLASS. Publish vs create scenarios.
- Scott: The NMMR would do most of the collection-level updates for CLASS...
- Anna: It would help Chris Fox out to clarify these scenarios.
- Jeremy: As a producer makes a collection-level update, you can still make a case that CLASS is updating a record. Doesn't matter who came up with the content for the update; CLASS needs the capability to update the records that hold that content.
- Semantics and role thing. What is the scope?
- Scott: From his perspective, regardless of who updates the metadata, if it still needs to make its way into CLASS for searching, they are interested in it.
- Actors: Authors and the system need to be spelled out.
- Phil will solicit another round of feedback for ATRAK.
- July 16, 2010.