OceanBench: The Sea Surface Height Edition

J. Emmanuel Johnson, Quentin Febvre, Anastasia Gorbunova, Sammy Metref, Maxime Ballarotta, Julien Le Sommer, Ronan Fablet
2023-09-27
Abstract: The ocean profoundly influences human activities and plays a critical role in climate regulation. Our understanding has improved over the last decades with the advent of satellite remote sensing data, allowing us to capture essential quantities over the globe, e.g., sea surface height (SSH). However, ocean satellite data present challenges for information extraction due to their sparsity and irregular sampling, signal complexity, and noise. Machine learning (ML) techniques have demonstrated their capabilities in dealing with large-scale, complex signals. Therefore, we see an opportunity for ML models to harness the information contained in ocean satellite data. However, data representation and relevant evaluation metrics can be the defining factors when determining the success of applied ML. The processing steps from the raw observation data to an ML-ready state and from model outputs to interpretable quantities require domain expertise, which can be a significant barrier to entry for ML researchers. OceanBench is a unifying framework that provides standardized processing steps that comply with domain-expert standards. It provides plug-and-play data and pre-configured pipelines for ML researchers to benchmark their models, and a transparent, configurable framework for researchers to customize and extend the pipeline for their tasks. In this work, we demonstrate the OceanBench framework through a first edition dedicated to SSH interpolation challenges. We provide datasets and ML-ready benchmarking pipelines for the long-standing problem of interpolating observations from simulated ocean satellite data, multi-modal and multi-sensor fusion issues, and transfer learning to real ocean satellite observations.
The OceanBench framework is available at http://github.com/jejjohnson/oceanbench, and the dataset registry is available at http://github.com/quentinf00/oceanbench-data-registry.
Machine Learning, Atmospheric and Oceanic Physics
What problem does this paper attempt to address?
The paper addresses key challenges in processing ocean observation data, particularly the interpolation of sea surface height (SSH). Specifically:

1. **Data Processing and Evaluation**: The paper introduces the OceanBench framework, which provides standardized data preprocessing and evaluation steps that comply with domain-expert standards. This lowers the barrier to entry for machine learning researchers who want to work with ocean observation data.
2. **SSH Interpolation Problem**: Extracting information from SSH data is difficult because satellite observations are sparse, irregularly sampled, and noisy. OceanBench provides datasets and benchmarking pipelines for this long-standing interpolation problem.
3. **Multimodal Fusion and Transfer Learning**: The paper also covers multi-modal and multi-sensor data fusion, and the transfer of models trained on simulated data to real ocean satellite observations.

Through these efforts, the OceanBench framework aims to promote the application of machine learning techniques in ocean science and to give researchers a transparent, configurable platform that they can customize and extend for their own tasks.
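
To make the interpolation task concrete, the following is a minimal toy sketch, not OceanBench's API, data, or evaluation protocol. Everything in it is an assumption made for illustration: the synthetic field `synthetic_ssh`, the straight-line "tracks" from `sample_tracks`, and the use of inverse-distance weighting (`idw_interpolate`) as a stand-in baseline. It samples a smooth field along a few sparse tracks, reconstructs the field on a regular grid, and scores the reconstruction with RMSE against the known truth.

```python
import math
import random


def synthetic_ssh(x, y):
    """Smooth toy field standing in for sea surface height (metres). Hypothetical."""
    return 0.1 * math.sin(2 * math.pi * x) * math.cos(2 * math.pi * y)


def sample_tracks(n_tracks, n_points, rng):
    """Sample the toy field along straight lines, loosely mimicking sparse tracks."""
    obs = []
    for _ in range(n_tracks):
        # Start and slope are chosen so that x stays inside the unit square.
        x0 = rng.uniform(0.25, 0.75)
        slope = rng.uniform(-0.25, 0.25)
        for k in range(n_points):
            y = k / (n_points - 1)
            x = x0 + slope * y
            obs.append((x, y, synthetic_ssh(x, y)))
    return obs


def idw_interpolate(obs, x, y, power=2.0, eps=1e-12):
    """Inverse-distance-weighted estimate at (x, y) from scattered observations."""
    num = 0.0
    den = 0.0
    for xo, yo, value in obs:
        d2 = (x - xo) ** 2 + (y - yo) ** 2
        if d2 < eps:
            # The query point coincides with an observation: return it directly.
            return value
        weight = 1.0 / d2 ** (power / 2.0)
        num += weight * value
        den += weight
    return num / den


def rmse(pred, truth):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(truth))


rng = random.Random(0)
observations = sample_tracks(n_tracks=8, n_points=25, rng=rng)

# Regular 20 x 20 grid of cell centres on the unit square.
n = 20
grid = [((i + 0.5) / n, (j + 0.5) / n) for i in range(n) for j in range(n)]
truth = [synthetic_ssh(x, y) for x, y in grid]
estimate = [idw_interpolate(observations, x, y) for x, y in grid]

# Trivial reference: fill the whole grid with the mean of the observations.
obs_mean = sum(v for _, _, v in observations) / len(observations)
baseline = [obs_mean] * len(grid)

print(f"IDW RMSE:       {rmse(estimate, truth):.4f} m")
print(f"Mean-fill RMSE: {rmse(baseline, truth):.4f} m")
```

Because the truth is known everywhere in a simulated setting, the reconstruction can be scored on the full grid; with real satellite observations no such gridded truth exists, which is one reason the choice of evaluation data and metrics matters for this task.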