December 23, 2010

Reef Check California – What happens with RCCA data after they are collected in the field, and how is their quality ensured?

Each month, Reef Check will answer a technical question regarding the monitoring protocol of our coral reef or rocky reef programs. If you have a question you would like answered, please email

With the survey season behind us, we are busy processing the data that our volunteers collected this year. Processing includes six steps of data quality assurance. Why is this so important? Our results are used and analyzed by people, such as scientists and resource managers, who were not involved in the data collection process. Since data users have no way of checking data quality for themselves, Reef Check California (RCCA) has to ensure that the utmost care has been taken to eliminate data errors and that the data are collected, entered and processed in a way that ensures their quality.

The first quality control step happens right after the dive, as soon as the data written on the diver’s slate are brought to the surface. Every diver checks their own datasheet for completeness and readability, and then a second person checks it again to make sure it is ready for data entry.

Next, the datasheets are collected to be entered into RCCA’s online Ecological Nearshore Database (NED). The data entry process includes further checks to ensure data quality. When the data are entered online, two types of data validation are performed. First, invalid characters or impossible numbers are flagged as they are typed, and the system requests a valid number to prevent typos. Second, once a survey is entered, all data are automatically checked against expected species-specific data values. For example, if a large number is entered for a fish species for which we typically only see one or two individuals on a transect, the observation is flagged and the person entering the data is asked to verify the number. If the number is confirmed to be correct, it remains in the data; if it was a mistake, it is corrected before the data are submitted.
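The two validation checks described above can be sketched in a few lines of Python. This is only an illustration and not NED's actual code: the function names, the species listed and the expected maximum counts are all assumptions made up for the example.

```python
# Hypothetical expected maximum counts per transect.
# The species and numbers are illustrative placeholders, not RCCA's real values.
EXPECTED_MAX = {"giant sea bass": 2, "blue rockfish": 200}


def parse_count(raw: str) -> int:
    """Check 1: reject invalid characters and impossible numbers."""
    text = raw.strip()
    # Only plain digits are accepted, so letters, minus signs and decimals fail.
    if not (text.isascii() and text.isdigit()):
        raise ValueError(f"please enter a valid whole number, got {raw!r}")
    return int(text)


def needs_verification(species: str, count: int) -> bool:
    """Check 2: flag counts above the expected range for this species."""
    limit = EXPECTED_MAX.get(species)
    # Species without a known limit are never flagged in this sketch.
    return limit is not None and count > limit


if __name__ == "__main__":
    count = parse_count("40")
    if needs_verification("giant sea bass", count):
        print("Unusually high count: please verify against the datasheet.")
```

A flagged value is not rejected. As in the process described above, it is only sent back to the person entering the data, who either confirms or corrects it.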

Once the survey data are submitted, an RCCA staff scientist checks the entered data by comparing the values in the database to the values on the underwater datasheets. After this step the survey data are finalized and can be viewed on NED’s interactive Map Viewer. But before the data can be analyzed or used in other ways, there is one more step. At the end of the year, RCCA’s database manager runs the entire database of surveys through data checking programs that, for example, remove surveys that have not been completed or label missing values so that they can be treated correctly in data analyses. The data are then combined with RCCA’s data from previous years and are ready to be distributed and analyzed. Along with the rigorous training of our volunteers, these procedures ensure that mistakes are caught and corrected, and that RCCA data are of high quality and can be trusted by people not involved in the data collection process.
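As a rough illustration of the end-of-year step, the sketch below drops surveys that were not completed and labels missing values so they are not mistaken for a count of zero. The record layout, the "completed" field and the "NA" label are assumptions made for this example. They do not describe RCCA's actual database or programs.

```python
MISSING = "NA"  # hypothetical label for a value that was never recorded


def clean_surveys(surveys):
    """Drop unfinished surveys and label blank entries as missing."""
    cleaned = []
    for survey in surveys:
        if not survey["completed"]:
            continue  # surveys that were not completed are removed
        counts = {
            # A blank entry (None or "") is labeled; a true zero is kept as 0.
            species: MISSING if value in (None, "") else value
            for species, value in survey["counts"].items()
        }
        cleaned.append({**survey, "counts": counts})
    return cleaned


if __name__ == "__main__":
    surveys = [
        {"site": "A", "completed": True, "counts": {"kelp bass": 0, "sheephead": None}},
        {"site": "B", "completed": False, "counts": {"kelp bass": 3}},
    ]
    print(clean_surveys(surveys))
```

Keeping "not recorded" separate from "zero observed" matters in later analyses, because averaging a blank as zero would understate abundance.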