Data Checks and Processing Performed by CDIAC
An important part of the NDP process at the Carbon Dioxide Information Analysis Center
(CDIAC) involves the quality assurance (QA) of data before distribution. Data received at
CDIAC are rarely in a condition that would permit immediate distribution, regardless of the
source. To guarantee data of the highest possible quality, CDIAC conducts extensive QA reviews.
Reviews involve examining the data for completeness, reasonableness, and accuracy. Although
they have common objectives, these reviews are tailored to each data set, often requiring extensive
programming efforts. In short, the QA process is a critical component in the value-added concept
of supplying accurate, usable data for researchers.
The following summarizes the data QA checks and processing performed by CDIAC on the
data obtained during the NOAA/PMEL R/V Malcolm Baldrige CGC-90 Cruise in the Southwest
Pacific.
- These data were provided to CDIAC as two ASCII-formatted files and accompanying printed
documentation (NOAA Data Report ERL PMEL-42) (Lamb et al. 1993). A FORTRAN 77
retrieval code was written and used to reformat the original files.
- To check for obvious outliers, all data were plotted using the PLOTNEST.C program
written by Stewart C. Sutherland of the Lamont-Doherty Earth Observatory. The program
plots a series of nested profiles, using the station number as an offset; the first station is
defined at the beginning, and subsequent stations are offset by a fixed interval (Fig. 3, Fig. 4,
Fig. 5, Fig. 6, Fig. 7, and Fig. 8)1.
Several outliers were identified and flagged after consultation with the principal investigators.
- To generate a section profile plot of TCO2 concentrations along the 170° W meridian, the
SURFER for Windows program (version 5.0), developed by Golden Software, Inc., was used (Fig. 9).
- To identify "noisy" data and possible systematic methodological errors, property-property
plots for all parameters were generated (Fig. 10), carefully examined, and compared with
plots from previous expeditions in the Southwest Pacific.
- To identify possible instrument drift and methodological errors, an intercomparison
of data from reoccupied stations was performed (Fig. 11).
- All variables were checked for values exceeding physical limits, such as sampling depth
values that are greater than the given bottom depths.
- Station locations (latitudes and longitudes) and sampling dates were examined for consistency
with maps and with cruise information supplied by Lamb et al. (1993).
- The designation for missing values, given as "-99.00" in the original files, was changed to
"-999.90".
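
Several of the checks listed above are simple enough to sketch in a few lines of code. The Python example below is an illustration only, not the FORTRAN 77 or PLOTNEST.C code actually used by CDIAC. The function names, the record layout, and the sample values are all hypothetical. Only the two missing-value codes, the depth-versus-bottom-depth rule, and the fixed-interval station offset are taken from the text.

```python
# Illustrative sketch only; not the code used by CDIAC.
# Missing-value codes as stated in the text.
MISSING_OLD = -99.00    # designation in the original files
MISSING_NEW = -999.90   # designation after processing


def recode_missing(value, old=MISSING_OLD, new=MISSING_NEW, tol=1e-6):
    """Return the new missing-value code if value matches the old one."""
    return new if abs(value - old) < tol else value


def depth_exceeds_bottom(sample_depth, bottom_depth, missing=MISSING_NEW, tol=1e-6):
    """Return True when a sampling depth is greater than the bottom depth.

    Records in which either depth is missing are not judged.
    """
    if abs(sample_depth - missing) < tol or abs(bottom_depth - missing) < tol:
        return False
    return sample_depth > bottom_depth


def nested_offset(values, station_index, interval):
    """Shift one station's profile by a fixed interval per station.

    The first station (index 0) is plotted as is; each later station is
    shifted by one more interval. This only imitates the idea of a nested
    profile plot and is an assumption about how such an offset could work.
    """
    return [v + station_index * interval for v in values]


# Hypothetical records: (station, sampling depth in m, bottom depth in m, TCO2)
records = [
    (1, 10.0, 4500.0, 2001.5),
    (1, 4600.0, 4500.0, -99.00),   # depth below the bottom, TCO2 missing
    (2, 250.0, 5200.0, 2150.3),
]

# Recode missing TCO2 values, then list records that fail the depth check.
cleaned = [(stn, depth, bottom, recode_missing(tco2))
           for stn, depth, bottom, tco2 in records]
suspect = [rec for rec in cleaned if depth_exceeds_bottom(rec[1], rec[2])]

for rec in suspect:
    print("sampling depth exceeds bottom depth:", rec)
```

In this sketch a failed check only lists the record for review; as described above, the actual flagging of outliers was done after consultation with the principal investigators.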
1 All data were plotted with questionable measurements (those assigned quality flag 3) excluded.
akozyr 04/30/97