SATPROD Database Design

From NGDCWiki

Jump to: navigation, search

Contents

SATPROD Table Display

SATPROD Tables with Data ( All)

Recommendations

  1. Rename fields in the SATPROD to be consistent with NGDC field and table naming conventions.
  2. Drop the SYSTEM field from the DATAFILE Table. Done 20080910
  3. Drop the CREATED_BY_SYSTEM_ID, STORAGE_LINK_ID, FORMAT, DUMMY fields from the DATAFILE Table. Done 20080910
  4. Rename and populate the DATAFILE_LEVEL field.
  5. Drop SATPROD.GROUP_INSTRUMENT, migrate SATPROD.INSTR_PRODUCT to SATPROD.INSTRUMENT.
  6. Figure out the relationships between DATAFILE, PRODUCT_GROUP, INSTRUMENT, and CHANNEL.

Questions

  1. Snapshot of current status or history?
  2. What defines a productGroup?
  3. Instrument or Instrument Instance?

Field Name Conventions

A set of field naming conventions greatly simplifies working with a database at the SQL level. Conventions that we use at NGDC include:

  1. Table names/abbreviations are not repeated in the field names
  2. Table and field names are singular
  3. Foreign keys column names are child_table_ID

We have renamed fields in the SATPROD to be consistent with these conventions. This will not effect the web user interfaces.

Relationships and Fields

Datafiles and Processing Systems

The present SATPROD design supports two relationships between Datafiles and Processing Systems:

  1. The DATAFILE Table includes a SYSTEM field that gives a single Processing System associated with a Datafile. At this point the SYSTEM field is null for 86% of the Datafiles.
  2. The GROUP_DATAFILE Table provides a junction table between DATAFILE and PROCESSING_SYSTEM that supports a many-to-many relationship between Datafiles, Product Groups, Processing Systems and Roles. In this case, the Processing Systems are null in for 5% of the datafiles. In those cases the table joins Product Groups to Processing Systems.

Datafile Formats

The DATAFILE.FORMAT field allows the specification of a single format for each datafile. This field contains no information in all 858 records. The SATPROD.DATAFILE_FORMAT Table provides a many-to-many relationship between DATAFILE and FORMAT. This Table is populated and appears to be the up-to-date source of information on datafile formats.

Datafile Instruments

The DATAFILE.INSTRUMENT_ID field allows the specification of a single instrument for each datafile. This field contains values between 1 and 15 and is NULL for 795/858 records. The SATPROD.INSTRUMENT Table includes 86 instruments. The SATPROD.GROUP_INSTRUMENT and SATPROD.GROUP_DATAFILE Tables provide a many-to-many relationship between DATAFILE/PRODUCT_GROUP and INSTRUMENT. The SATPROD.GROUP_INSTRUMENT.INSTRUMENT_ID again has values only between 1 and 15. The SATPROD.INSTR_PRODUCT Table provides a many-to-may relationship between DATAFILE and INSTRUMENT. It also includes short and long names for instruments. This appears to be the operative Table.

Datafile Comment

This field appears to contain mostly instrument names, not comments. What is the purpose of this field? This field was initially used to distinguish datafiles with similar names that were created from different instruments.

Datafile Type

This field has a few entries of intermed product, product, and qc. It is NULL for 633 / 858 rows. I suspect that it has been replaced by the DATAFILE_GROUP.ROLE field.

Datafiles, Product Groups, Channels, Instruments, and Satellites

Datafiles

Datafiles are the heart of SATPROD and have relationships to many other entities. These can be identified as Tables that include the field DATAFILE_ID:

TableNote
DATAFILE_FORMATSimple foreign key for formats
DATAFILE_USERSimple foreign key for users
GROUP_DATAFILEThis junction table defines the relationships between datafiles, productGroups, and ProcessingSystems
INSTR_PRODUCTThis is the junction table between Datafiles and Instruments. It appears to be up-to-date. The table should probably be renamed to DATAFILE_INSTRUMENT or it should be dropped in favor of a DATAFILE_CHANNEL table. Note that the current PRODUCT_CHANNEL table joins PRODUCT_GROUP to Channel rather than Datafile to Channel.
PRODUCT_CAT_OPS_PROGThis table looks like it was created to address some NOSA reporting requirement. It includes DATAFILE_ID, DATAFILE_NAME, COVERAGE, OPERATIONAL_STATUS which could introduce a significant maintenance task (these fields are in, and should only be in, the DATAFILE Table). This table should be dropped and implemented as a view
SAT_PRODUCT_PROGThis table joins SATELLITE, DATAFILE, and PROGRAM. It might make sense to drop the DATAFILE_ID from this table as that relationship is represented in the datafile > channel > satellite chain. It is worth noting that this approach supports a many-to-many relationship between satellites and programs.
V_DATAFILE_1This looks like it was originally a view which is good. It contains DATAFILE_ID and DATAFILE_NAME and STORAGE_PLATFORM_ID. Note that all three of these fields are in the DATAFILE table. This should probably be dropped.

Datafile-centric Design

Datafile-centric Design
Datafile-centric Design
Datafile-centric Design 2
Datafile-centric Design 2
  1. Simplify DATAFILE_GROUP by migrating ROLE and PROCESSING_SYSTEM_ID to DATAFILE_PROCESSING_SYSTEM (new table)
  2. Create DATAFILE_PROCESSING_SYSTEM with DATAFILE_GROUP.ROLE, DATAFILE_GROUP.PROCESSING_SYSTEM_ID and PRODUCT_CHANNEL_USE (?)
  3. Change PRODUCT_CHANNEL into DATAFILE_CHANNEL: change PRODUCT_GROUP_ID to DATAFILE_ID, drop INSTRUMENT_ID, migrate USE, NOTE?
  4. Drop PRODUCT_GROUP_ID from DATAFILE_USER, may need PRODUCT_GROUP_USER table
  5. Move INSTRUMENT_STATUS and INSTRUMENT_STATUS_REMARK to INSTRUMENT.

ProductGroups

Originally ProductGroups were groups of Datafiles that share some set of properties and, therefore, that should be editable as a single item. The role of ProductGroups in SATPROD seems to have expanded considerably.

TableNote
DATAFILE_USERThis table joins Datafiles and ProductGroups to users. The inclusion of DATAFILE_ID and PRODUCT_GROUP_ID in this table creates a redundant relationship between datafile and productGroup. This is not generally a good thing. It could be that this table could describe relationships between users and productGroups or users and datafiles. In that case there would be nulls in the table. In fact, there are none, so this scenario seems unlikely. Seems likely that the PRODUCT_GROUP_ID should be dropped from this table. Was it there to simplify the editing process?
GROUP_DATAFILEThis junction table defines the relationships between datafiles, productGroups, and ProcessingSystems
GROUP_INSTRUMENTThis junction table defines the relationships between datafiles, productGroups, and ProcessingSystems
PRODUCT_CAT_OPS_PROGThis table looks like it was created to address some NOSA reporting requirement. It includes DATAFILE_ID, DATAFILE_NAME, COVERAGE, OPERATIONAL_STATUS which could introduce a significant maintenance task (these fields are in, and should only be in, the DATAFILE Table). This table should be dropped and implemented as a view
PRODUCT_CHANNELThis associates an instrument with a product group and a channel and appears to be up-to-date. It also includes a USE and a NOTE.
PRODUCT_GROUP_INSTRUMENTThis junction table defines the relationships between productGroups, and instruments. It appears to be the up-to-date version of the GROUPS_INSTRUMENT table. In this case an orbit range was added, but it was not populated.
SAT_INSTR_PGThis junction table defines the relationships between satellites, instruments, and productGroups. It appears to be up-to-date, but it is unclear why it is needed. Seems reasonable that the satellite table is only connected to the rest of the database by the SATELLITE_INSTRUMENT table and that datafiles are connected to channels.
V_NEW_PROD_CHANNELThis looks like a view which serves the same purpose as the PRODUCT_GROUP_INSTRUMENT and the (vestigial) GROUP_INSTRUMENT table. This view should be dropped.

Channels

TableNote
PRODUCT_CHANNELThis is the only junction table for channels and it relates channels to product groups. Seems like it should becalled DATAFILE_CHANNEL and relate datafiles to channels.

Unused Tables and Fields

Empty Fields

In DATAFILE Table: CREATED_BY_SYSTEM_ID, STORAGE_LINK_ID, DATAFILE_LEVEL, FORMAT, DUMMY

Empty Tables

The SATPROD Database includes a number of tables that do not contain data. These include:

  • SATPROD.CHANNEL_LIST
  • SATPROD.CHANNEL_USE
  • SATPROD.CONTACT
  • SATPROD.DATAFILE_AUX
  • SATPROD.DATA_USE
  • SATPROD.INSTRUMENT_AUX
  • SATPROD.KEYWORD
  • SATPROD.LINK
  • SATPROD.ORGANIZATION
  • SATPROD.ORG
  • SATPROD.ORG_CONTACT
  • SATPROD.OVERALL_SYSTEM
  • SATPROD.POP_A
  • SATPROD.SATELLITE_AUX
  • SATPROD.SAT_SYS
  • SATPROD.SYS_ORG

PowerBuilder Tables

The master of the original SATPROD Database, Tom Passin, used PowerBuilder to create and manage the database design. This tool requires a number of support tables. The content of these tables is not part of the SATPROD.

  • PBCATCOL
  • PBCATEDT
  • PBCATFMT
  • PBCATTBL
  • PBCATVLD
  • PROPERTY (The SATPROD table with the most rows)
Personal tools