Quality control for ocean observations: From present to future

Published in Scientia Sinica Terrae, 2022

The quality control (QC) of ocean observational data is essential to establishing a high-quality global ocean database, and is one of the basic data pre-processing steps in oceanographic research, marine monitoring, and forecasting. With the introduction of various advanced instruments in recent decades, oceanographic surveys have expanded from coastal regions to the open ocean, bringing marine science into an era of big data.

However, because ocean in-situ observations are obtained with different instruments that deliver heterogeneous data quality, it is paramount that bad data be identified accurately and efficiently via QC in order to provide a reliable global ocean database.

A new study, led by the Institute of Atmospheric Physics, Chinese Academy of Sciences (IAP, CAS), in cooperation with the CAS Center for Ocean Mega-Science, the Institute of Oceanology, Chinese Academy of Sciences (IOCAS), the Second Institute of Oceanography, Ministry of Natural Resources of China (SIO, MNR), and the National Marine Data and Information Service (NMDIS) of China, reviews QC for ocean observations from present to future. It was released online in Scientia Sinica Terrae on September 17, 2021.

“QC is a holistic check of data quality by using manual or computer-assisted methods, with the aim of identifying bad data and enhancing the accuracy, robustness, and usability of the data,” said Zhetao Tan, the first author of this review. “QC is a system that usually includes many sub-modules such as range check, consistency check, continuity check, temperature inversion check, spike check, gradient check, maximum depth check and climatology check (Figure 1).” The review introduces the latest progress in QC for oceanic in-situ observations (mainly focusing on temperature and salinity data) and the similarities and differences between QC schemes developed by various ocean organizations. These organizations include the U.S. National Centers for Environmental Information of the National Oceanic and Atmospheric Administration, the Met Office Hadley Centre, the Global Temperature and Salinity Profile Programme, the Argo program, the World Ocean Circulation Experiment, the China Argo Real-time Data Center, NMDIS and IOCAS.
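As a rough illustration of how two of the sub-modules named above might be combined, the following Python sketch applies a range check and a spike check to a single temperature profile. It is not taken from the review or from any of the listed QC systems: the function names, the flag values (1 for good, 4 for bad), the temperature limits, and the spike threshold are all assumptions chosen for the example.

```python
from typing import List

GOOD, BAD = 1, 4  # assumed flag values for this sketch only


def range_check(values: List[float], lower: float, upper: float) -> List[bool]:
    """Return True for each value that lies outside [lower, upper]."""
    return [not (lower <= v <= upper) for v in values]


def spike_check(values: List[float], threshold: float) -> List[bool]:
    """Return True for interior points that stand out from both neighbours.

    Test value: |v2 - (v1 + v3) / 2| - |(v3 - v1) / 2|.
    End points have only one neighbour and are left unflagged.
    """
    failed = [False] * len(values)
    for i in range(1, len(values) - 1):
        v1, v2, v3 = values[i - 1], values[i], values[i + 1]
        test_value = abs(v2 - (v1 + v3) / 2) - abs((v3 - v1) / 2)
        failed[i] = test_value > threshold
    return failed


def run_qc(temperatures: List[float]) -> List[int]:
    """Combine the checks: a point is BAD if any check fails."""
    out_of_range = range_check(temperatures, lower=-2.5, upper=40.0)  # assumed limits, deg C
    spikes = spike_check(temperatures, threshold=2.0)  # assumed threshold, deg C
    return [BAD if (r or s) else GOOD for r, s in zip(out_of_range, spikes)]


# Hypothetical profile: one spike (31.5) and one out-of-range value (45.0).
temperatures = [25.0, 24.8, 31.5, 24.1, 23.9, 45.0]
flags = run_qc(temperatures)
print(flags)  # [1, 1, 4, 1, 1, 4]
```

An operational system would add the other sub-modules (for example the climatology and maximum depth checks) and would typically vary thresholds with depth and region.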

“Although various automatic QC systems have been developed, this review found that each QC scheme has its own strengths and weaknesses, and thus there is no recognized ‘gold standard’,” Tan added. “We also discussed possible methods for evaluating the performance of QC, and we used benchmark datasets for a comprehensive evaluation of several well-known but independent QC systems.”
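This summary does not state which metrics the review used for its evaluation. One common way to score a QC scheme against a benchmark dataset with known bad data is to compare the fraction of truly bad points it catches with the fraction of good points it wrongly rejects. The sketch below assumes that approach; the function name, the metric names, and the sample flags are hypothetical.

```python
from typing import Dict, List


def evaluate_qc(predicted_bad: List[bool], truly_bad: List[bool]) -> Dict[str, float]:
    """Score QC decisions against benchmark labels.

    true_positive_rate: share of truly bad points that the scheme flagged.
    false_positive_rate: share of good points that the scheme flagged.
    """
    if len(predicted_bad) != len(truly_bad):
        raise ValueError("flag lists must have the same length")
    pairs = list(zip(predicted_bad, truly_bad))
    tp = sum(1 for p, t in pairs if p and t)
    fn = sum(1 for p, t in pairs if not p and t)
    fp = sum(1 for p, t in pairs if p and not t)
    tn = sum(1 for p, t in pairs if not p and not t)
    return {
        "true_positive_rate": tp / (tp + fn) if (tp + fn) else 0.0,
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
    }


# Hypothetical benchmark labels and the decisions of two hypothetical schemes.
truly_bad = [False, False, True, False, False, True, False, True]
scheme_a = [False, False, True, False, False, True, False, False]  # misses one bad point
scheme_b = [False, True, True, False, False, True, False, True]    # rejects one good point

result_a = evaluate_qc(scheme_a, truly_bad)
result_b = evaluate_qc(scheme_b, truly_bad)
print(result_a)
print(result_b)
```

In this toy case scheme A never rejects good data but misses a bad point, while scheme B catches every bad point at the cost of one false rejection, which is the kind of trade-off that makes a single "gold standard" hard to name.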

The researchers also recommended further action to advance the development of QC for ocean observations and so obtain high-quality ocean in-situ data. “We should improve QC to ensure a real-time data flow for real-time ocean monitoring and the early warning system,” said Tan. “For example, more research on the comprehensive evaluation of the QC schemes and the use of artificial intelligence (machine learning) is recommended. In addition, additional resources are required for QC development practices for dissolved oxygen, the partial pressure of CO2, pH, the chlorophyll concentration, and the concentration of nutrient salts.”

Reference:

Tan, Z., B. Zhang, X. Wu, M. Dong, L. Cheng*, 2021: Quality control for ocean observations: From present to future. Science China Earth Science, https://doi.org/10.1360/SSTe-2021-0096

[Chinese Version]

Link: http://www.ocean.iap.ac.cn/pages/detail/detail.html?type=p&id=9ca6b7170c4b41609a210b51e13a8f72&languageType=cn&navAnchor=publications

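Quality control of ocean observational data is the foundation for building a high-quality ocean science database, and it is of great significance for advancing ocean science and interdisciplinary research, prediction and forecasting, and disaster early warning. In recent decades, with the emergence of various automated observing platforms, the depth and breadth of ocean surveys have continued to expand, and ocean science has entered the era of big data. Obtaining high-quality in-situ observational data is receiving growing attention both in China and abroad.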

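However, because data are acquired by diverse means, data quality varies widely, and the types of data errors are numerous, efficiently and accurately detecting these quality problems and applying quality control to them is a difficult task, and also a core technology in data processing.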

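Recently, a research team led by the Institute of Atmospheric Physics, Chinese Academy of Sciences, together with researchers from the Institute of Oceanology, Chinese Academy of Sciences, the CAS Center for Ocean Mega-Science, the Second Institute of Oceanography, Ministry of Natural Resources, and the National Marine Data and Information Service, reviewed the history and current status of quality control for ocean observational data and offered an outlook on its future.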

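The review systematically surveys the main quality control techniques, together with their principles and methods, currently used in China and abroad for physical oceanographic observations such as temperature and salinity. These include the range check, consistency check, continuity check, constant value check, gradient check, temperature inversion check, spike check, maximum depth check, and climatology check (Figure 1). The paper also summarizes and compares the mainstream QC systems developed by different research organizations in China and abroad (such as the U.S. National Centers for Environmental Information of the National Oceanic and Atmospheric Administration, the Met Office Hadley Centre, the Global Temperature and Salinity Profile Programme, and the Argo global ocean observing array program), and discusses the differences among them. The comparison shows that QC systems differ in many respects in the selection and definition of modules and in the classification of QC flags.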

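“At present, there is a solid foundation in China and abroad for the development of automated quality control. However, based on a brief performance evaluation of several existing QC systems, the paper found that each system has its own strengths and weaknesses, and that no scheme is yet recognized in China or abroad as the best,” said Zhetao Tan, the first author of the paper. “On this basis, we also discussed possible methods and approaches for evaluating the performance of QC systems, and used benchmark datasets to carry out an independent, comprehensive evaluation of several existing mainstream QC systems.”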

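In addition, the article points out that quality control of ocean observational data still has considerable room for development, including: (1) carrying out broader evaluations of QC system performance; (2) actively advancing QC development for variables such as dissolved oxygen, the partial pressure of CO2, pH, chlorophyll, and nutrients; (3) developing QC based on artificial intelligence (machine learning); and (4) integrating QC deeply into China's real-time, interconnected operational ocean database work, providing solid support for real-time ocean monitoring and early warning in China.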

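Finally, the authors note that scientific, comprehensive, sustainable, and high-quality data management is the foundation of data quality control. Only by providing high-quality data can ocean science and related interdisciplinary fields be better advanced.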

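The review was published online in Scientia Sinica Terrae (《中国科学:地球科学》) on September 17, 2021. The research was supported by the Strategic Priority Research Program (Category B) of the Chinese Academy of Sciences (XDB42040402), a funded project of the State Key Laboratory of Satellite Ocean Environment Dynamics, Second Institute of Oceanography, Ministry of Natural Resources (QNHX2133), the Global Change and Response special program of the National Key Research and Development Program of China (2017YFA0603202), an independently deployed project of the CAS Center for Ocean Mega-Science (COMS2019Q01), a General Program grant of the National Natural Science Foundation of China (42076202), and the Youth Innovation Promotion Association of the Chinese Academy of Sciences.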

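Paper (Chinese version): Tan, Z., B. Zhang, X. Wu, M. Dong, L. Cheng*, 2021: 海洋观测数据质量控制技术研究现状及展望 (Quality control for ocean observations: From present to future). Scientia Sinica Terrae (《中国科学:地球科学》). https://doi.org/10.1360/SSTe-2021-0096.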

Recommended citation: Tan, Z., B. Zhang, X. Wu, M. Dong, L. Cheng*, 2021: Quality control for ocean observations: From present to future. Scientia Sinica Terrae