NOAA/NASA Pathfinder AVHRR Oceans
Sea Surface Temperature Data Set
User's Guide
Version 1.0
December 15, 1994

TABLE OF CONTENTS

1.0 INTRODUCTION
    1.1 Scope of this document
    1.2 Introduction
2.0 SATELLITE AND INSTRUMENT
    2.1 Platform description
    2.2 AVHRR instrument description
    2.3 Level-1B data
3.0 ALGORITHM AND PROCESSING
    3.1 Computation of sea-surface temperature
    3.2 Initial assignment of quality flags
    3.3 Declouding
    3.4 Binning and Mapping
4.0 QUALITY ASSURANCE
    4.1 Temporal analysis
    4.2 Spatial analysis
    4.3 Visual inspection by an operator
5.0 DATA SET DETAILS
    5.1 Equal Area Product
    5.2 9km Equal Angle All SST
    5.3 9km Equal Angle Best SST
    5.4 18km Equal Angle All SST
    5.5 18km Equal Angle Best SST
    5.6 All pixel Equal Angle 0.5 degree SST
    5.7 Best pixel Equal Angle 0.5 degree SST
6.0 ARCHIVE AND DATA ACCESS
    6.1 Using a Web browser to access the data
    6.2 Downloading the data using anonymous ftp
7.0 READING AND USING PATHFINDER SST DATA
8.0 FREQUENTLY ASKED QUESTIONS
9.0 References

1.0 INTRODUCTION

1.1 Scope of this document

This document describes the production, quality assurance, and archival of the Advanced Very-High Resolution Radiometer (AVHRR) Oceans Pathfinder sea-surface temperature (SST) data set (hereafter Pathfinder SST), and the methods of acquiring and using it. It briefly discusses the AVHRR instrument and National Oceanic and Atmospheric Administration (NOAA) satellite platforms, the history of satellite-derived SST, and the refinement of the nonlinear correction algorithm and cloud clearing techniques used in reprocessing AVHRR Level-1B data to produce the Pathfinder SST data set. It provides information on the various Pathfinder SST products and details the distribution of the data by the Physical Oceanography Distributed Active Archive Center at the Jet Propulsion Laboratory (PO.DAAC).

1.2 Introduction

The Earth Observing System (EOS), centerpiece of NASA's Mission to Planet Earth, is expected to generate about 2 trillion bytes of new data per day.
The Pathfinder projects were initiated as preparation for the projected volume of data that will be returned from EOS. The mandate of the Pathfinder SST task is to produce, validate, and evaluate a long time-series of AVHRR-derived SST as a precursor to EOS data sets, and for use in climatologic investigations and modeling.

Historically, radiometer data have been used with several algorithms to produce estimates of SST (see McClain et al., 1985, for a review). Multichannel sea-surface temperatures (MCSST) have been computed from AVHRR radiances operationally since 1981. As part of the Pathfinder SST task, a detailed re-analysis of the calibration procedures for the NOAA AVHRR, based on thermal vacuum test data, was performed by researchers at the University of Miami (Brown et al., 1993). New procedures were derived which improve the overall calibration accuracy and increase the number of temperature retrievals by a factor of approximately two.

NASA's Jet Propulsion Laboratory is tasked with reprocessing historical AVHRR data to produce a prototype satellite SST database for global climate studies. This will provide a consistent SST time series of greater than 10 years, with known statistics across the various AVHRR platforms. Comprehensive calibration and validation information is included, and access to the data set is through the use of new technologies in storage and retrieval.

2.0 SATELLITE AND INSTRUMENT

A more detailed description of the NOAA-series satellites, the AVHRR instrument, and the AVHRR Global Area Coverage (GAC) Level-1B data can be found in the Polar Orbiter Users Guide (Kidwell 1991), which can be obtained from NOAA/NESDIS, and from which the following information is reproduced.

2.1 Platform description

Each of the series of NOAA satellites operates in a near-polar, sun-synchronous orbit. The orbital period is ~102 minutes, giving 14.1 orbits per day.
Because the number of orbits/day is not an integer, the suborbital tracks do not repeat daily, although the local solar time of the satellite's passage is essentially unchanged for any latitude (Kidwell 1991). The 110.8 deg. cross-track scan equates to a swath width of about 2700 km. This swath width is greater than the 25.3 deg. separation between successive orbital tracks, and provides overlapping coverage. Table 1 lists the operational dates and nominal ascending node equator-crossing times for all platforms, although the data described in this User's Guide are derived only from the NOAA-7, -9, and -11 satellites, which carry the 5-channel instrument.

Table 1. NOAA-Series Satellites (from the Polar Orbiter Users Guide, Kidwell 1991)

SATELLITE    DATES OPERATIONAL                CROSSING TIME
---------------------------------------------------------------------
TIROS-N      OCT 19, 1978 - JAN 30, 1980      15:00 LST
NOAA-6       JUN 27, 1979 - MAR 5, 1983       19:30 LST
NOAA-7       AUG 19, 1981 - JUN 7, 1986       14:30 LST
NOAA-8       JUN 20, 1983 - OCT 31, 1985      19:30 LST
NOAA-9       FEB 25, 1985 - NOV 7, 1988       14:30 LST
NOAA-10      NOV 17, 1986 - present           19:30 LST
NOAA-11      NOV 8, 1988 - present            14:30 LST
NOAA-12      MAY 14, 1991 - present           19:30 LST

2.2 AVHRR instrument description

Each of the NOAA polar-orbiting satellites has carried an AVHRR as one of three sensors aboard the spacecraft. AVHRR was designed for multispectral investigations of meteorological, oceanographic, and hydrologic parameters, measuring emitted and reflected radiance in four or five spectral bands (Table 2), spanning the visible portion of the spectrum to the thermal infrared. Coverage is global, twice daily, at an instantaneous field of view (IFOV) of ~1.4 milliradians, giving a ground field of view of ~1.1 km at nadir for a nominal altitude of 833 km.

Table 2.
AVHRR Spectral Bands

                     Channel Position (microns)
Platform             Ch. 1       Ch. 2        Ch. 3       Ch. 4       Ch. 5
=================    =========================================================
Tiros-N              0.55-0.90   0.725-1.10   3.55-3.93   10.3-11.3
NOAA-7,-9,-11,-12    0.55-0.68   0.725-1.10   3.55-3.93   10.3-11.3   11.5-12.5
NOAA-6,-8,-10        0.55-0.68   0.725-1.10   3.55-3.93   10.3-11.3

The AVHRR has a cross-track scanning system which uses an elliptical beryllium mirror rotating at 360 RPM about an axis parallel to the Earth. The instrument is designed to maintain a constant operating temperature for the IR detectors and provide a signal-to-noise ratio (SNR) of 3:1 at 0.5% albedo. Each AVHRR scan views Earth for 51.282 milliseconds, during which time each channel of the analog data output is digitized. Scans occur at the rate of 6 per second, and the sampling rate of the AVHRR sensors is 39,936 samples per second per channel.

During a scan, the detectors view an internal target, cold space, and the external scene. The temperature of the internal target is monitored, and space is assumed to have a black-body temperature of 3K. In this way, a simple two-point linear calibration is done internally (Schwalb, 1978). The nonlinear modification to this calibration is achieved at the time of postprocessing, and takes into account sensor nonlinearities, measurement of internal target temperature, calculation of target radiance, internal reflections and emissions, etc.

2.3 Level-1B data

Full resolution AVHRR data are continuously transmitted and recorded in High Resolution Picture Transmission (HRPT) format. The Global Area Coverage (GAC) data are subsampled to approximately 4 km IFOV, recorded internally, and downlinked daily. These data are the starting point for the Pathfinder SST processing.

The Level-1B data are defined as radiometrically-corrected and calibrated data in physical units at full instrument resolution as acquired.
To produce the NOAA GAC Level-1B data, the Level-0 (unprocessed) instrument data are quality controlled, assembled into discrete data sets, and have calibration and Earth location information appended. Data are then stored as full orbits consisting of both ascending (daytime) and descending (nighttime) data.

3.0 ALGORITHM AND PROCESSING

The history of SST computation from AVHRR radiances is discussed at length by McClain et al. (1985). Briefly, radiative transfer theory is used to correct for the effects of the atmosphere on the observations by utilizing "windows" of the electromagnetic spectrum where little or no atmospheric absorption occurs. Channel radiances are transformed (through the use of the Planck function) to units of temperature, then compared to a-priori temperatures measured at the surface. This comparison yields coefficients which, when applied to the global AVHRR data, give estimates of surface temperature which have been nominally accurate to 0.3 deg. C.

Recently, the AVHRR thermal vacuum test data have been examined in detail (Brown et al., 1993) in order to quantify drift in the calibration coefficients of the channels. Through this work, the nonlinear SST algorithm first developed by Walton (1988) has been modified with a time-dependent term. Processing has been further modified by dividing the earth into three regimes of atmospheric water vapor. Regression coefficients are computed independently for each of these regimes, to compensate for the well-known limitations of AVHRR SST retrievals in tropical areas, which are an artifact of high humidity. A detailed description of the Pathfinder SST data processing is attached as Appendix A.

3.1 Computation of sea-surface temperature

The AVHRR Level-1B sensor counts in the visible channels (1 and 2) are first converted to Rayleigh-corrected radiances and then to optical depth for use in removing the effects of the atmosphere and viewing and illumination geometry.
Channels 3-5 are transformed to units of "brightness temperature", using the Planck black body function and a newly determined (Brown et al., 1993) correction for sensor calibration non-linearity in the longer-wavelength channels.

The algorithm used is essentially the nonlinear SST (NLSST: Walton, 1988), with a modification for sensor calibration drift with time. The algorithm is also conditioned in three regimes of atmospheric water vapor, and separate regression coefficients are applied. The form of the algorithm is:

    SST = a + b*T4 + c*(T4-T5) + d*(sec(q)-1)*(T4-T5) + e*time        (1)

where q is the zenith angle of the instrument, T4 and T5 are the brightness temperatures from AVHRR channels 4 and 5, respectively, and time is the variable of the time-dependent term that accounts for sensor calibration drift. The empirical coefficients a, b, c, d, and e were determined through a multiple regression of AVHRR radiances with a database of in-situ temperatures, measured using moored and drifting buoys. In order to be considered a match, the pixel location and in-situ measurement must differ by no more than 0.1 degree spatially, and temporally by no more than 30 minutes.

Two databases were compiled for this purpose, one for operational processing and another for experimentation and validation of temperature retrievals. Data were contributed to these databases from the U.S. National Data Buoy Center (NDBC), the Japanese Meteorological Agency, and the TOGA/TAO project. Drifter data are from the NOAA Atlantic Oceanographic and Meteorological Laboratory, and the Canadian Marine Environmental Data Service (MEDS). Both of these databases are also available through the PO.DAAC, for use in verification and experimentation (JPL Physical Oceanography Distributed Active Archive Center (PO.DAAC) Data Availability, version 1-94).

3.2 Initial assignment of quality flags

Temperature retrievals as detailed above are determined for all pixels.
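
As an illustration of how equation (1) in section 3.1 is evaluated for a single pixel, the following is a minimal Python sketch. The function name, argument names, and the coefficient values are hypothetical placeholders chosen for this example; they are not the Pathfinder regression coefficients, which differ by water vapor regime and are not listed in this guide. The units of the time variable are likewise not specified here.

```python
import math


def sst_equation_1(t4, t5, zenith_deg, time, coeffs):
    """Evaluate equation (1) for one pixel.

    t4, t5      brightness temperatures from AVHRR channels 4 and 5
    zenith_deg  zenith angle q, in degrees
    time        variable of the calibration-drift term (units assumed)
    coeffs      tuple (a, b, c, d, e) of regression coefficients
    """
    a, b, c, d, e = coeffs
    sec_q = 1.0 / math.cos(math.radians(zenith_deg))
    diff = t4 - t5
    return a + b * t4 + c * diff + d * (sec_q - 1.0) * diff + e * time


# Placeholder coefficients for illustration only, NOT the Pathfinder values.
demo_coeffs = (0.5, 1.0, 2.0, 1.0, 0.0)

# At nadir (q = 0) the sec(q)-1 term vanishes; off nadir it grows
# with the channel 4 minus channel 5 difference.
sst_nadir = sst_equation_1(290.0, 289.0, 0.0, 0.0, demo_coeffs)
sst_slant = sst_equation_1(290.0, 289.0, 60.0, 0.0, demo_coeffs)
```

In practice one set of coefficients would be selected per water vapor regime before the equation is applied; that selection step is not shown because its criteria are not given in this section.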
Several tests are then performed to assign an estimate of the goodness of each retrieval, in the form of a flag value with four possible values. The "satellite" test is a channel 4/5 threshold (used to detect how "bright" a pixel is), combined with a spatial homogeneity test. The "Reynolds" test is a comparison of the initial temperature retrieval to the Reynolds blended SST climatology. If a pixel passes all of these, it is considered "best", and assigned a quality flag of 3. Passing the Reynolds test but failing the satellite test generates a 2 (or "mediocre" quality), failing the Reynolds test and passing the satellite test generates a 1, and failing all tests gives a quality flag of zero.

3.3 Declouding

The next phase is declouding, effected through the creation of composite images over 3 weeks before and after the target week, and a mean computed from these. The composite means are used to fill a central weekly mean image which contains the day being declouded (if the central mean image is missing values, and if there is a mean pixel of sufficient quality). If the weekly means from week(n-1) or week(n+1) cannot be used to fill empty values in the central (week n) mean, a spatial interpolation is done. The completely-filled weekly image is then compared to the daily image, and simple thresholding is used to indicate partial or complete cloudiness. Repeating this process generates a cloud mask for every day of data.

3.4 Binning and Mapping

The AVHRR Ocean Pathfinder data are processed in an equal-area grid (Appendix A) based on one developed by the International Satellite Cloud Climatology Project (ISCCP). The bin size is approximately 9.28 km on a side, which gives 5,940,422 bins over the globe. An advantage of this grid is that it can be easily combined into grids with different zonal resolutions because the number of bins per row is always an integer. Since the GAC data were originally sampled at approximately 4 km resolution, bin values are averages.
The number of retrievals which were averaged into each bin is a standard data product, and can be obtained to correctly perform any special weighted averaging.

After processing, the data are remapped into an equal-angle projection (4096x2048 rectangular grid), in order to facilitate visualization and extraction of regional subsets. The data are stored in the Hierarchical Data Format (HDF). Several products are generated from these data, including 18 km and 0.5 degree equal-angle projections, 7 and 8-day weekly composites, etc. See Appendix B for details of the binning procedure.

4.0 QUALITY ASSURANCE

A semi-automated quality-assurance (QA) scheme has been developed which examines AVHRR SST retrievals for temporal and spatial consistency. This is carried out in a two-part statistical post-processor, followed by a visual inspection. The automated portion of the analysis serves to guide an operator in the visual inspection phase, greatly reducing the time necessary to characterize spurious findings. The data quality information thus gained is passed to the end-user in the form of additions to the processing flags, as well as comments which are included in the metadata of each image. This combination forms a qualitative and quantitative description of anomalies found in the data.

4.1 Temporal analysis

Phase 1 is a time-series examination, using a centered running mean over a window 21 days in length and comparing each retrieval (cloudfree pixels with highest-quality SST estimates) to this mean and a threshold. This examination will potentially generate 3 flag values. If the retrieval is outside an envelope defined by the running mean plus-or-minus 2 deg., it is flagged as anomalous (too high or too low). The 2 deg. threshold was selected to be consistent with that used in the autoprocessing.
If the retrieval was the only cloudfree day in the 21-day window, it is flagged for a secondary spatial examination, since there was effectively no time series with which to compare (Figure 1).

The length of the window was determined after a sensitivity analysis, which determined the effect of varying this length on the number of flags generated in each category. As the window increases, it becomes less probable that a retrieval will be the only one that was free of clouds. The 21-day length was selected as it is close to a temporal mesoscale in most of the world ocean, and a mean determined over this period will smooth over large changes in temperature that are reasonable with respect to physical oceanography. This then avoids setting spurious flags that must be cleared by visual inspection. Further, the number of flags generated as a function of window length (from the sensitivity analysis) begins to asymptote at 21 days, so that using a larger window will not result in significantly fewer flags being set.

4.2 Spatial analysis

The result of the phase 1 time series examination is that many flags are set, mainly as a result of retrievals being the only cloudfree day within their 21-day window. Most of these are perfectly reasonable temperatures; however, since there was no time series with which to compute a running mean, there is no way to know this without further examination. Phase 2 is therefore a spatial test, which compares each previously-flagged retrieval to a spatial mean plus-or-minus a 1.5 deg. threshold. A sensitivity analysis for this test examined the effect of varying the temperature threshold, as well as different spatial radii for computing the local mean. Since we are examining an oceanographic quantity, we use the minimum bin radius (1.5 pixels = 13.5 km) to avoid biasing the mean where there may be a large change in temperature over a small distance (for example on the edge of a front, Gulf Stream, etc.).
If the value of a flagged retrieval is within the accepted range of a spatial mean computed in this fashion, the flag is cleared. If the pixel is still out of range, it is left flagged for visual inspection.

4.3 Visual inspection by an operator

Using the locations determined by the temporal and spatial examinations to guide the search, we use visualization software developed for this task. The package allows an operator to browse an image in segments, toggle the display of nonflagged data locations, and display scalable zoomed subsets of the image in detail. The zoomed subsets are displayed along with a histogram of temperature distributions in the subset area, a contour map overlay of the locations that remain flagged, and a display of all temperatures in the row or column in which the cursor is located. This allows the user to zoom in on problem areas, carefully examine the area around flagged pixels, and make a determination as to whether the retrieval should remain flagged or not. If the flag is to remain set, the operator adds the location of the pixel to a file. This file is later merged with the processing flags.

The operator also makes notes as to anomalous patterns which may be apparent in the gross appearance of the image. For example, missing orbits which are not due to instrument malfunction or nonoperation, linear features which may be due to the suturing-together of scan lines, single pixels marked as cloudy which are not near any other masked pixels, and enigmatic temperature gradients near feature boundaries are all noted, and later added to the metadata in a comment field.

5.0 DATA SET DETAILS

The Pathfinder SST data are distributed in a variety of resolutions, projections, and temporal averages, to accommodate researchers with varying processing capabilities and needs. Each data product is produced as either an ascending (daytime) or descending (nighttime) image.
These products are produced as daily composites, which are defined as spatial bins of all temperature retrievals at a maximum resolution of 9 km. Auxiliary information includes quality and sampling data, as well as simple statistics. From the daily products, weekly, monthly, and yearly composites are formed.

5.1 Equal Area Product

The equal area product is based on a gridding scheme where the number of bins per longitude is dependent on the latitude. For details of this binning scheme refer to Appendix A. One of the data sets generated for distribution is a 9km equal-area product with 6 different bands or extractable parameters describing the sea surface temperature in a given bin. These are distributed as HDF files, and are approximately 120 MB in size. The sum squared and number of observations per bin are included for proper resampling, should a researcher have special spatial or temporal requirements. Pixel quality and mask bits are determined during processing, and based on a variety of tests. Details of the pixel quality and mask bits and the method of their determination are given in Appendix B. The 6 bands included in an equal-area product are as follows:

1) bin_number: a unique number assigned to a particular bin based on the equal-area grid. This bin_number is then associated with a specific geographical (latitude, longitude) coordinate.

2) # of observations per bin: because the 9km bins are based on an average of 4km Level-1B data, this parameter indicates the number of observations averaged into each bin.

3) pixel_quality: a quality flag generated during processing, which indicates the quality of the temperature estimate at each pixel. Values can be between 0 and 3 depending on a series of statistical tests and comparisons with other sources of data.

4) mask_bits: this band contains different image masks that are used, such as cloud or ice masks.

5) sum_sst: for a given 9km bin this number is the sum of the sea surface temperature values in that bin.
This number, along with the number of observations per bin, can then be used to derive the average sst value.

6) sum_squared sst: for a given 9km bin this number is the sum of the squared sst values, to be used in computing higher-order statistical moments.

5.2 9km Equal Angle All SST

A significant part of the processing is in mapping the equal-area product into a format suitable for image display purposes. Thus the 9km equal-angle product consists of the equal-area grids mapped onto an equal-angle grid with an image size of 4096 x 2048. This product contains all pixels regardless of data quality flag and will be available in the Hierarchical Data Format (HDF) developed by the National Center for Supercomputing Applications. In addition to the daily day and night passes available, 8-day, monthly and yearly composites will be produced. The product contains 3 bands or image planes of data:

a) SST (2 bytes)
b) pixel-quality: flag between 0 and 3 as defined in Appendix B.
c) # of observations per bin: # of SST values that were averaged in the 9km bin.

5.3 9km Equal Angle Best SST

This product has the same dimensions as the 9km equal-angle product (5.2), except that only those SST values with a pixel quality of 3 are kept. This product is produced in the HDF format but in addition is also available in a raw binary image format. It is also available in the daily as well as the 8-day, monthly, and yearly composites. It contains two bands or image planes of data:

a) Best SST value
b) # of observations per bin.

5.4 18km Equal Angle All SST

This is the same as 5.2 except the spatial bin size is now 18km instead of 9km. Thus the dimensions of the image are now 2048 x 1024. Like 5.2, this product is available in the HDF format with 3 bands or image planes of data: the SST value, pixel quality, and number of observations per bin. It is also available in the daily (day or night) image size as well as the 8-day, monthly and yearly composites.
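
The use of the sum_sst, sum_squared sst, and number-of-observations bands described in section 5.1 can be sketched as follows. This is a minimal Python illustration, not PO.DAAC software: the function names and the sample numbers are hypothetical, and it assumes the three bands have already been read into plain numbers for one bin. It also shows why the sums are distributed instead of only the means: sums and counts from two bins (or two time periods) can simply be added before the statistics are recomputed.

```python
import math


def bin_statistics(n_obs, sum_sst, sum_sq_sst):
    """Return (mean, standard deviation) for one bin, or None if empty.

    Uses the population form: variance = E[x^2] - (E[x])^2.
    """
    if n_obs <= 0:
        return None
    mean = sum_sst / n_obs
    # Guard against tiny negative values caused by floating-point rounding.
    variance = max(sum_sq_sst / n_obs - mean * mean, 0.0)
    return mean, math.sqrt(variance)


def merge_bins(bin_a, bin_b):
    """Combine two (n_obs, sum_sst, sum_sq_sst) tuples by adding each field."""
    return tuple(x + y for x, y in zip(bin_a, bin_b))


# Hypothetical bin built from three retrievals: 18.0, 19.0, and 20.0.
bin_a = (3, 57.0, 1085.0)
# Hypothetical bin built from a single retrieval of 21.0.
bin_b = (1, 21.0, 441.0)

mean_a, std_a = bin_statistics(*bin_a)
mean_ab, std_ab = bin_statistics(*merge_bins(bin_a, bin_b))
```

Merging the sums weights each bin by its number of observations automatically, which is the weighted averaging referred to in section 3.4.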

5.5 18km Equal Angle Best SST

This product is identical to 5.3 except that the spatial bin size is 18km, with image dimensions of 2048 x 1024. This product is available in the HDF or raw binary formats with the same 2 bands or image planes as 5.3: best SST and number of observations per bin. It is also available in the daily (day or night passes), 8-day, monthly and yearly composites.

5.6 All pixel Equal Angle 0.5 degree SST

The purpose of the browse images is to allow the researcher to quickly view the all pixel data and decide if the data are suitable for his/her purposes. In addition these data are also available for scientific studies of global scale ocean phenomena. The data consist of a 0.5 degree product, 720 x 360. As in previous cases (5.2, 5.4) this data set contains all pixels, not just those with a quality flag of 3. It is also available in the standard day or night daily, 8-day, monthly and yearly composites. It is available in the HDF format and consists of 1 band of SST values.

5.7 Best pixel Equal Angle 0.5 degree SST

This data set is the same as 5.6 except that only the best SST values, with a quality flag of 3, are kept. The same day or night daily, 8-day, monthly and yearly composites are produced.

6.0 ARCHIVE AND DATA ACCESS

The Pathfinder SST data are available through the Physical Oceanography Distributed Active Archive Center (PO.DAAC) at the Jet Propulsion Laboratory. Because the processing of the Level-1B AVHRR data to SST is also at JPL, the data can be browsed during the processing phase nearly as soon as the images are complete and have been checked for quality. The data may be accessed using a WWW browser such as Mosaic or Lynx, downloaded using anonymous ftp, or requested through electronic mail (or by telephone) from the staff at the PO.DAAC.

6.1 Using a Web browser to access the data

Using a tool such as Mosaic, the http protocol may be used to access the Pathfinder SST data as the images are produced, as well as to learn more about the Pathfinder SST project, the Jet Propulsion Laboratory, and NASA. The uniform resource locator (URL) for the Pathfinder SST home page is http://sst-www.jpl.nasa.gov. The home page contains details of the AVHRR instrument, an overview of the Pathfinder SST project, and a description of the image resolutions, file formats, and projections that are available. A user may also browse the latest SST files added to the site. This is a dynamic process, and new files are added each day at the approximate rate of 20 days of data (day and night) per day of processing. Selecting an image to browse will display the image with colortable as well as the metadata header. This includes information on which satellite platform and AVHRR instrument were used to collect the channel radiances, calibration information, start and end times of the data collection, and other details.

The web site also has an online order form, so that a researcher may acquire any or all of the Pathfinder SST data through the PO.DAAC. Upon completing and submitting the order form, an electronic mail confirmation of the order will be sent back, with an order reference number. The data will arrive through U.S. Mail on the selected media (8mm video, DAT, etc.).

6.2 Downloading the data using anonymous ftp

For small-to-medium sized data subsets, anonymous ftp may be used to obtain the Pathfinder SST data. Connect to sst-www.jpl.nasa.gov using ftp, and enter 'anonymous' for a username. Please use your full e-mail address for a password, as you will be placed on the Pathfinder SST mailing list. The names and contents of the data subdirectories are:

map09_all: contains 9km (4096X2048) equal-angle (rectangular) projections which contain all retrievals.
This includes temperatures, clouds, bad data, etc.; virtually everything before the data quality and cloud screening processes mask the retrievals of dubious quality. These are provided for researchers who may want to develop cloud detection procedures of their own, or use the full set of retrievals for some other purpose.

map09_best: contains the same resolution as above, but with only the "best" (see section on pixel quality) temperature retrievals.

map18_all: contains 18 km (2048X1024) equal-angle projections. These data are included as they are the same resolution as the MCSST data set, and can be used as a replacement for ongoing research without the necessity of rebinning by the researcher. These files contain images with all retrievals.

map18_best: contains the same projection and resolution as above, but with clouds and poor-quality temperature retrievals masked.

map54_all: contains 54 km (720X360) equal-angle "browse" images.

map54_best: contains the same images, with poor-quality retrievals and clouds masked.

software: this directory contains routines for reading the HDF data files, dumping an HDF image to a flat binary (byte, no header) formatted file, and a package written in IDL for browsing the full resolution images in detail. Use of the IDL routines and package requires that you have IDL version 3.5 or better installed on your system, and use of the FORTRAN routines requires that you acquire, compile, and install the HDF library (available via anonymous ftp from ftp.ncsa.uiuc.edu).

Each of the ftp directories contains the same time-span of data; only the resolution of the images and the pixel quality vary between them.

7.0 READING AND USING PATHFINDER SST DATA

The JPL PO.DAAC is distributing these data in the Hierarchical Data Format (HDF). HDF was developed at the National Center for Supercomputing Applications at the University of Illinois at Urbana-Champaign.
The PO.DAAC is supplying FORTRAN drivers to read and write HDF data; however, these must be linked with the HDF subroutine library. The library is available via anonymous ftp at ftp.ncsa.uiuc.edu (141.142.3.135). We also supply example IDL code for reading and writing HDF data; if you use IDL the HDF library is already included, but IDL version 3.5 or better must be installed at your site.

8.0 FREQUENTLY ASKED QUESTIONS

1. What is the naming convention used for the data sets?

An example file name is 88020h09da-gdm.hdf, where:

    88   - year
    020  - day
    h    - data type (h=hdf, p=postage stamp (or equal-area projection), d=dsp (binary with header and trailer), and b=binary (byte, no header))
    09   - resolution (ground field-of-view, km)
    d    - daily (w=weekly, m=monthly)
    a    - ascending (daytime; d=descending (nighttime))
    gdm  - good declouded mean (highest-quality retrievals; adm=all declouded mean)
    hdf  - file format (hdf; dsp, bin=binary)

2. Where can I learn more about the HDF data format?

Use a web browser (e.g. Mosaic) and make an http connection to the URL http://www.ncsa.uiuc.edu/SDG/Software/HDF/HDF-FAQ.html. This document describes the details of the format, new features, visualization and analysis tools (both free and commercial), how to obtain source code, how to make a bug report, etc.

3. How do I report a problem with the data, or contact someone at the JPL PO.DAAC?

The Pathfinder SST web page is maintained by Andy Tran and Rosanna Sumagaysay. They can be reached as andy@grumpy.jpl.nasa.gov and rosanna@haydn.jpl.nasa.gov. Problems with the data, orders for data (without using the WWW interface), and questions of a general nature should be directed to podaac@podaac.jpl.nasa.gov.

9.0 References

Brown, J. W., O. B. Brown, and R. H. Evans, 1993. Calibration of AVHRR infrared channels: a new approach to non-linear correction, Journal of Geophysical Research, 98(C10), 18257-18268.
JPL Physical Oceanography Distributed Active Archive Center (PO.DAAC) Data Availability, Version 1-94, JPL Publication 90-49, rev. 5.

Kidwell, K., 1991. NOAA Polar Orbiter User's Guide. NCDC/NESDIS, National Climatic Data Center, Washington, D.C.

McClain, E. P., W. G. Pichel, and C. C. Walton, 1985. Comparative performance of AVHRR based multichannel sea surface temperatures, Journal of Geophysical Research, 90, 11587-11601.

McMillin, L. M., and D. S. Crosby, 1984. Theory and validation of the multiple window sea surface temperature technique, Journal of Geophysical Research, 89(C3), 3655-3661.

Stowe, L. L., E. P. McClain, R. Carey, P. Pellegrino, G. G. Gutman, P. Davis, C. Long, and S. Hart, 1991. Global distribution of cloud cover derived from NOAA/AVHRR operational satellite data, Adv. Space Research, 3, 51-54.

Appendix A. Processing details

*OVERVIEW OF PRODUCT PROCESSING

The generation of the NOAA/NASA AVHRR OCEANS PATHFINDER SST products is a multistep process. Specific details are outlined in Appendix E; however, a brief synopsis is presented here.

Level-1B data are first ingested from optical disk, then converted to a standard image format. Data are then navigated from line/pixel (image) coordinates to latitude/longitude coordinates and subsetted. Next the non-linear correction algorithm adjusts for the calibrations of the AVHRR channels, and SST is computed from predetermined regression coefficients which are specific to 3 regimes of water vapor.

Once SST retrievals are determined, the data subsets are binned to produce the 9.28 km equal-area orbitals for both ascending and descending nodes, and an initial quality flag is assigned to each retrieval. After the orbitals for an entire day have been completely processed, they are composited to a single daily file.

The next phase is declouding. This is effected through the creation of composite images over 3 weeks, from each of which a mean is computed.
These means are then used to fill missing values in the central weekly mean image, which contains the day being declouded. If the weekly means from week(n-1) or week(n+1) cannot be used to fill empty values in the central (week n) mean, a spatial interpolation is done. The completely filled weekly image is then compared to the central daily image, and simple thresholding is used to indicate partial or complete cloudiness. This process generates a cloud mask for every day of data.

**INGESTION

Data are processed with DSP, a software system for oceanographic satellite and image processing developed by the remote sensing group at the University of Miami/RSMAS. In DSP (see the DSP user's manual), the first processing step is called ingestion. Level 1B data are a sensor-level data set consisting of the raw sensor data organized as one scan line per record, with quality control information, calibration coefficients for each channel, and earth locations for selected data spots appended to each scan line. The data are read in from optical disk and converted to standard image format. A standard-format image consists of 1200 scan lines of data read in from the medium. The conversion is done with a program called GET_SCAN, which locates particular points in each pass: the times of the pole crossings for the NOAA-9 satellite, at +90 (north pole) and -90 (south pole). These splitting points (the north/south crossings) chop the pass into pieces (1200-scan-line files in standard image format). The files are then ready for navigation.

**NAVIGATION/SECTOR

Adjustments and calibrations of the mapping from line/pixel (image) space to latitude/longitude space occur next.
(Refer to the DSP User's Guide.) To determine the actual location of the line/pixel image, the time and attitude parameters are corrected using navigation ephemeris files, so that the coastlines visible in the actual image match a reference map outline. This generates a more refined set of parameters, most importantly the earth location of the image in latitude and longitude. Sectors are subsets of the data, specified by a latitude/longitude center, that are extracted from the ingested files. The selection of the latitude and longitude center depends on the accuracy of the navigation process. The navigation file is updated for this piece or sector. It uses the same file but modifies the ingested file. For subsequent processing, the volume of data in a sector is far smaller than that in a typical ingested file, which makes processing more efficient.

**APPLICATION OF NON-LINEAR CORRECTION ALGORITHM

A non-linear correction algorithm (see section 3.0) is applied to the modified ingested files to calculate the sea-surface temperature.

**BINNING

After the SST retrievals are determined, two binning processes occur. Because the Global Area Coverage (GAC) data were originally sampled at approximately 4 km resolution, the data are averaged into bins of approximately 9.28 km. This step is called spatial binning and gives 5,940,422 bins over the globe. An advantage of this grid is that it can easily be combined into grids with coarser zonal resolutions, because the number of bins per row is always an integer. The number of retrievals which were averaged into each bin is a standard product, and can be obtained to correctly perform any special weighted averaging. The spatial binning produces ascending and descending binned data, including day splitting. Time binning begins once the files that cover one orbit have been spatially binned. These spatially binned files are composited into two files, the descending and ascending Equal Area Binned Orbitals.
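As an illustration of the weighted averaging mentioned above, the following Python sketch combines several bin means using the number of retrievals in each bin as the weight. It is a minimal example under stated assumptions, not PO.DAAC code: the function name and the sample values are hypothetical.

```python
def weighted_mean_sst(bin_means, bin_counts):
    """Combine per-bin mean SSTs into one mean, weighting each bin by
    the number of retrievals that were averaged into it."""
    if len(bin_means) != len(bin_counts):
        raise ValueError("bin_means and bin_counts must have the same length")
    total = sum(bin_counts)
    if total == 0:
        return None  # no retrievals in any of the bins
    return sum(m * n for m, n in zip(bin_means, bin_counts)) / total


# Hypothetical values: three neighbouring bins (degrees C) and the number
# of retrievals in each. The empty third bin does not affect the result.
means = [20.0, 22.0, 21.0]
counts = [1, 3, 0]
print(weighted_mean_sst(means, counts))  # prints 21.5
```

A plain average of the three bin means would give 21.0; weighting by the retrieval counts gives 21.5 because the second bin holds three of the four retrievals.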
**EQUAL AREA DAILY

When all of the Equal Area Binned Orbitals are completed for a particular day, they are composited into a single daily file for each of the ascending and descending passes. This file, called Equal Area Daily, contains 6 channels of information:

   1) Bin number
   2) Number of observations per bin
   3) Pixel quality
   4) Mask bits
   5) sum_sst for a given 9km bin
   6) Sum squared sst for a given 9km bin

**DECLOUDING

In the declouding process, weekly ascending and descending files are generated from the Equal Area Daily files of the given week. These weekly files are used to create the weekly reference map. A reference file or reference map is created by pathtiming (time binning) the previous week(n-1) and the next week(n+1). The pathtiming process compares the differences between channels 4 and 5 with those of neighboring pixels (refer to the time binner and scenario diagram) within a 21-day moving reference, 7 days before and 7 days after the central week. It then compares the quality flag values of the three weeks with one another. The best quality value out of the three weeks and its corresponding SST value are assigned to the reference file of week(n). If two of the weeks have the same best quality value, the average of their corresponding SST values is calculated and assigned to the reference file of week(n), along with the quality flag value.

Next, the reference file goes through a data filling process. If a pixel has a quality value of less than 3 (3 being the best SST), a linear interpolation is done over the 5x5 pixel array around that pixel. The new quality value then replaces the original value of that pixel. Once the modified reference map is created, the Equal Area Daily files of week(n) are cloud-masked by thresholding against the reference map on a pixel-by-pixel basis, using separate reference maps for ascending and descending data. The output is the same set of daily files in the weekly directory, but with modification dates that reflect the cloud-masking time.
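The rule used to build the reference map from three weeks of data, as described in this section, can be sketched in Python as follows. This is a hedged illustration rather than the DSP implementation: the function name, the use of None for a week with no data, and the sample values are assumptions, and the sketch averages over however many weeks share the best quality value.

```python
def composite_reference(weeks):
    """Pick the reference-map value for one pixel.

    weeks: (quality, sst) pairs for week(n-1), week(n) and week(n+1);
           None marks a week with no data at this pixel (an assumption).
    Returns (quality, sst) for the reference map, or None if all weeks
    are empty.
    """
    valid = [w for w in weeks if w is not None]
    if not valid:
        return None
    best_quality = max(q for q, _ in valid)
    # If several weeks share the best quality value, average their SSTs.
    best_ssts = [sst for q, sst in valid if q == best_quality]
    return best_quality, sum(best_ssts) / len(best_ssts)


# Hypothetical values: weeks n and n+1 tie at quality 3, so their SSTs
# are averaged; in the second pixel week n has no data.
print(composite_reference([(2, 18.0), (3, 19.0), (3, 20.0)]))  # (3, 19.5)
print(composite_reference([(1, 15.0), None, (2, 16.5)]))       # (2, 16.5)
```

The data filling and thresholding steps are left out because the text does not state the interpolation weights or the threshold value.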
**REMAPPING

The first output product, the Equal Area Declouded Daily 9km File, is written to disk and is remapped to an equal-angle projection to generate 72 different products for distribution (see product list). After processing, the data are remapped into an equal-angle projection (rectangular grid) of 4096 x 2048 pixels, in order to facilitate visualization and extraction of regional subsets. The data are stored as Hierarchical Data Format (HDF) files, and as flat binary.

APPENDIX B - GRIDDING SCHEME

Equal-area gridding scheme proposed for Pathfinder/SeaWiFS ocean products

Introduction

This document describes the equal-area gridding scheme proposed by the RSMAS Remote Sensing Group for the binned sea surface temperature fields produced by the AVHRR Pathfinder project. The same approach is being adopted for SeaWiFS binned ocean color products. The gridding scheme is based on that adopted by the International Satellite Cloud Climatology Project (ISCCP). This document does not motivate the need for an equal-area grid for SeaWiFS or other oceanographic products. Such motivation can be found in a paper by W. Rossow and L. Garder (Selection of a map grid for data analysis and archival, Journal of Climate and Applied Meteorology, 1984, 23: 1253-1257). Furthermore, this document describes only the design of the proposed equal-area grid, and does not discuss other related topics such as rules for spatially or temporally combining observations into the equal-area bins.

Overview

The gridding scheme proposed consists of rectangular bins or tiles, arranged in zonal rows. A compromise between data processing and storage capabilities, on one side, and the potential geophysical applications of satellite data, on the other side, suggests that a suitable minimum bin size would be approximately 8-10 km on a side. In the scheme proposed here, the tiles are approximately 9.28 km on a side.
This size (9.28 km) was chosen because (a) it has approximately the desired minimum resolution, and (b) it results in 2160 zonal rows of tiles from pole to pole (i.e., 1080 in each hemisphere). This particular number of rows (2160) has some advantages which will be discussed in more detail below. Because the total number of rows is even, the bins will never straddle the Equator (i.e., there will be an equal number of rows above and below the Equator). This avoids possible situations where the Coriolis factor is zero, a characteristic that numerical modellers expect from any gridding scheme adopted. The total number of approximately 9-km bins is 5,940,422.

The bins or tiles are arranged in a series of zonal rows; the number of tiles per row varies. The rows immediately above and below the Equator have 4320 tiles. This number is derived by dividing the perimeter of the Earth at the Equator by the standard tile size (i.e., 2πRe/9.28), where Re is the equatorial radius of the Earth (Re = 6378.145 km). (The quoted 9.28 km is a rounded value: 4320 tiles around a perimeter of about 40,075 km correspond to a tile size of about 9.277 km.) The number of tiles per row decreases approximately as a cosine function as the rows get closer to each pole (rigorously, there should be an adjustment for the ellipticity of the Earth, as the equatorial radius decreases progressively to the smaller polar radius; this adjustment is not applied in the current implementation). At the poles, the number of tiles is always three. This special situation will be discussed in detail below.

The number of bins in each zonal row is always an integer. To ensure an integer number of bins, the width of each bin (the size of a bin along a parallel, or x-length) must vary slightly from row to row. However, the bins are always 9.28 km long along the meridians. That is, only one of the bin dimensions changes. The size of the bins at each zonal row is established in the following manner.
First, a preliminary value for the number of tiles (Np) at a given latitude (L) is computed as Np = 2πr / X, where X is the x-size of a bin at the Equator (9.28 km) and r is the radius of the circle produced by slicing the Earth with a plane parallel to the Equator at latitude L. The radius r can be calculated as r = Req cos(L), where Req is the equatorial radius of the Earth. If the fractional part of Np is greater than or equal to 0.5, then Np is rounded up to the nearest integer (i.e., the final number of tiles will be the integer portion of Np plus one); otherwise Np is rounded down (the final number of tiles is the integer portion of Np). Once the final integer number of tiles along a row is calculated, the x-size of the tiles must be adjusted. This is done by dividing the perimeter of the row (2πr) by the integer number of tiles. The result is the x-length of a tile (width) for a given row.

Because the x-length of the tiles is adjusted to ensure an integer number at each row, the "equal area" characteristics of this binning scheme are not rigorously preserved. However, variations in tile size are negligible throughout most of the globe, and only become relevant at very high latitudes, where there are fewer tiles per row and, thus, any adjustments are more noticeable. As the number of tiles per row increases with distance from the poles, the difference between tile sizes rapidly becomes practically unnoticeable. To provide an idea of the magnitude of the fluctuations in tile size, the worst possible case occurs when half a tile remains "uncovered" after filling a zonal row with an integer number of tiles. Once a row has 100 bins (approximately 16 rows, or 148 km from the poles), the worst possible difference between the actual tile x-length and the standard x-length is of the order of 0.5% (i.e., half a tile's length redistributed among about 100 tiles). For a tile of about 9 km a side, this represents a difference in the x-length of about 45 m.
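The row calculation described above (Np = 2*pi*r / X, rounded to the nearest integer, followed by the x-length adjustment) can be sketched in Python as follows. This is an illustration under stated assumptions, not the RSMAS implementation: L is taken as the latitude of the row centre, rows are numbered from the South Pole, and X is taken as the equatorial perimeter divided by 4320 (about 9.277 km, the nominal 9.28 km) so that the equatorial rows hold 4320 tiles. The function and constant names are hypothetical.

```python
import math

NUM_ROWS = 2160           # zonal rows from pole to pole (from the text)
TILES_AT_EQUATOR = 4320   # tiles in the rows next to the Equator (from the text)
R_EQ_KM = 6378.145        # equatorial radius of the Earth, km (from the text)

# Assumption: standard x-size X = equatorial perimeter / 4320 (about 9.277 km).
X_KM = 2.0 * math.pi * R_EQ_KM / TILES_AT_EQUATOR


def tiles_in_row(row):
    """Return (number of tiles, adjusted x-length in km) for a zonal row.

    Rows are numbered 0..NUM_ROWS-1 starting at the South Pole.
    Assumption: L is the latitude of the centre of the row.
    """
    lat_deg = -90.0 + (row + 0.5) * 180.0 / NUM_ROWS
    r = R_EQ_KM * math.cos(math.radians(lat_deg))   # r = Req cos(L)
    perimeter = 2.0 * math.pi * r
    n_prelim = perimeter / X_KM                     # Np = 2*pi*r / X
    n_tiles = int(math.floor(n_prelim + 0.5))       # fraction >= 0.5 rounds up
    return n_tiles, perimeter / n_tiles             # adjusted x-length


counts = [tiles_in_row(row)[0] for row in range(NUM_ROWS)]
print(counts[0], counts[NUM_ROWS // 2], sum(counts))
```

Under these assumptions the polar rows come out with 3 tiles and the rows next to the Equator with 4320; the printed total can be compared with the 5,940,422 bins quoted above.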
Through a similar calculation, a row with 50 bins (about 80 km away from the poles) has a 1% variation with respect to the standard bin size.

The gridding scheme described here has an extremely useful feature: the number of rows of 9.28 km tiles in each hemisphere (1080) is divisible by many numbers (e.g., 2, 3, 4, 5, 6), and therefore it is extremely easy to generate an integer number of rows at many useful spatial resolutions. For instance, 12 rows of 9.28 km tiles can be combined to generate zonal bands of approximately one degree (one degree of latitude is equal to 111.12 km; 12 rows would form a band 111.36 km wide). Another example is the use of 30 rows to generate zonal bands of approximately 2.5° (a typical output resolution of atmospheric circulation models).

The poles

Both the North and South poles are special cases in the gridding scheme presented here. The pole areas are always covered by three tiles, shaped like pie sectors. While the meridional size of the polar bins (the y-length) will be the usual 9.28 km, the length of the bins along the arc of the sectors will be slightly larger. Neglecting sphericity, the area encompassed by the last row of tiles is πX², where X = 9.28 km. If we express the area of the circle as a rectangle of height X, the remaining dimension is πX. If we divide this length by three (to yield three tiles), each tile will have dimensions X by πX/3 (approximately 1.05X). That is, the bases of the triangular polar tiles are about 5% larger than the x-length of the equatorial tiles.

APPENDIX C - FLAG DETAILS

Two tests are done initially to assign a quality flag to the data.

Level 1 testing

1) Comparison between channel differences. This test involves comparing differences between channels 4 and 5 with those of neighboring pixels within a 3 x 3 box. Channels 4 and 5 are used because they are cleaner than channel 3 and, unlike the visible channels, they work during both day and night.
Furthermore, the visible channels were not found to improve the quality flagging over the 4/5 comparison.

2) Comparison with reference field. This test involves comparing brightness temperatures with a reference field. Both spatial and temporal homogeneity tests are done to see whether the values are within + or - 2 degrees of the reference. The 2 degree threshold is chosen because at the mesoscale (100 km) ocean temperatures, except in frontal regions like the Gulf Stream, are not found to change by more than 2 degrees.

A quality flag is assigned according to which of the tests pass or fail:

1) If both tests pass, a quality flag of 3 is assigned, indicating the highest quality.
2) If the satellite test (test #1) passes but the reference test (test #2) fails, a mediocre quality flag of 2 is assigned.
3) If the satellite test fails and the reference test passes, a mediocre quality flag of 2 is assigned.
4) If both tests fail, a quality flag of 1 is assigned.

Level 2 testing

SST fields are now compared with Reynolds optimally interpolated data. Direct comparisons are done over different time periods to see whether they are within a threshold. If the data are bad, the time period can be extended up to 35 days. If the test fails, a new reference field is found and the test is repeated.
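The Level 1 flag assignment listed above can be sketched in Python as follows. This is a minimal illustration, not the processing code: the function names and the sample brightness temperatures are hypothetical, and the outcome of the channel 4/5 neighbor comparison is passed in as a boolean because its threshold is not given in this appendix.

```python
def within_reference(brightness_temp, reference_temp, threshold=2.0):
    """Reference test (test #2): the value must lie within + or -
    `threshold` degrees of the reference field."""
    return abs(brightness_temp - reference_temp) <= threshold


def level1_quality_flag(channel_test_passed, reference_test_passed):
    """Map the two Level 1 test results to a quality flag (3 is best)."""
    if channel_test_passed and reference_test_passed:
        return 3   # both tests pass: highest quality
    if channel_test_passed or reference_test_passed:
        return 2   # exactly one test passes: mediocre quality
    return 1       # both tests fail


# Hypothetical values: the first pixel is 1.2 degrees from the reference
# and passes both tests; the second is 5 degrees away and fails both.
print(level1_quality_flag(True, within_reference(291.2, 290.0)))   # prints 3
print(level1_quality_flag(False, within_reference(295.0, 290.0)))  # prints 1
```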