Abstract: We perform an out-of-distribution analysis of ~12,000,000 semi-independent 128x128 pixel^2 SST regions, which we define as cutouts, from all nighttime granules in the MODIS R2019 Level-2 public dataset to discover the most complex or extreme phenomena at the ocean surface. Our algorithm (Ulmo) is a probabilistic autoencoder, which combines two deep learning modules: (1) an autoencoder, trained on ~150,000 random cutouts from 2010, to represent any input cutout with a 512-dimensional latent vector akin to a (non-linear) EOF analysis; and (2) a normalizing flow, which maps the autoencoder's latent space distribution onto an isotropic Gaussian manifold. From the latter, we calculate a log-likelihood value for each cutout and define outlier cutouts to be those in the lowest 0.1% of the distribution. These exhibit large gradients and patterns characteristic of a highly dynamic ocean surface, and many are located within larger complexes whose unique dynamics warrant future analysis. Without guidance, Ulmo consistently locates the outliers where the major western boundary currents separate from the continental margin. Buoyed by these results, we begin the process of exploring the fundamental patterns learned by Ulmo, identifying several compelling examples. Future work may find that algorithms like Ulmo hold significant potential/promise to learn and derive other, not-yet-identified behaviors in the ocean from the many archives of satellite-derived SST fields. As important, we see no impediment to applying them to other large, remote-sensing datasets for ocean science (e.g., sea surface height, ocean color).

What problem does this paper attempt to address?

This paper attempts to analyze sea-surface temperature (SST) patterns through deep-learning techniques in order to identify extreme phenomena on the ocean surface. Specifically, the authors developed an unsupervised machine-learning algorithm named Ulmo to analyze approximately 12 million semi-independent 128x128-pixel SST regions (called cutouts) in the nighttime MODIS L2 SST dataset. The Ulmo algorithm combines two deep-learning modules, an autoencoder and a normalizing flow, aiming to discover the most complex or extreme phenomena on the ocean surface.

### Main Objectives

1. **Identify Extreme Phenomena**: By analyzing a large amount of SST data, identify those ocean-surface areas with highly dynamic characteristics. These areas may contain large ocean-current separation points or other complex ocean-dynamic processes.
2. **Explore Unknown Physical Processes**: Through unsupervised-learning methods, discover previously unnoticed ocean-surface physical processes that may be rare or not fully studied.
3. **Verify the Effectiveness of the Algorithm**: By applying Ulmo to actual data, verify its effectiveness and reliability in identifying abnormal phenomena on the ocean surface.

### Method Overview

- **Data Pre-processing**: Pre-process the SST data, including steps such as cloud removal, median filtering, size adjustment, and mean subtraction, to improve the performance of the algorithm.
- **Model Architecture**: Ulmo is a Probabilistic Autoencoder (PAE) that combines an autoencoder and a normalizing flow. The autoencoder compresses the input SST image into a 512-dimensional latent space, and the normalizing flow maps the distribution of this latent space to an isotropic Gaussian distribution.
- **Training Process**: First, train the autoencoder so that it can reconstruct the input SST image; then train the normalizing flow so that it can estimate the probability of each sample in the latent space.
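The way a normalizing flow turns a latent vector into a log-likelihood can be illustrated with a deliberately tiny example. The sketch below is not the flow used by Ulmo: it assumes a toy element-wise affine map with hypothetical `shift` and `scale` parameters and a 4-dimensional latent vector in place of the 512-dimensional one. It only shows the change-of-variables rule, in which the log-likelihood is the Gaussian log density of the mapped vector plus the log-determinant of the map's Jacobian.

```python
import math


def gaussian_log_prob(u):
    """Log density of vector u under a standard isotropic Gaussian."""
    d = len(u)
    return -0.5 * (d * math.log(2.0 * math.pi) + sum(x * x for x in u))


def affine_flow_log_likelihood(z, shift, scale):
    """Log-likelihood of latent vector z under a toy element-wise affine flow.

    The flow maps u_i = (z_i - shift_i) / scale_i, so the log-determinant
    of its Jacobian is -sum(log(scale_i)).
    """
    u = [(zi - sh) / sc for zi, sh, sc in zip(z, shift, scale)]
    log_det = -sum(math.log(sc) for sc in scale)
    return gaussian_log_prob(u) + log_det


# Hypothetical flow parameters and latent vectors, for illustration only.
shift = [0.0, 1.0, -1.0, 0.5]
scale = [1.0, 2.0, 0.5, 1.5]
typical = [0.1, 1.2, -0.9, 0.4]   # close to the bulk of the distribution
unusual = [4.0, -6.0, 2.0, 5.0]   # far from the bulk

ll_typical = affine_flow_log_likelihood(typical, shift, scale)
ll_unusual = affine_flow_log_likelihood(unusual, shift, scale)
print(ll_typical > ll_unusual)
```

A latent vector far from the bulk of the distribution receives a lower log-likelihood, which is the property the outlier ranking relies on. A real flow stacks many learned, non-linear invertible layers, but the same two-term sum applies.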
### Results

- **Anomaly Detection**: By calculating the log-likelihood (LL) value of each cutout, define the 0.1% of cutouts with the lowest LL values as abnormal samples. These abnormal samples are usually located near the major western-boundary ocean currents, such as along the coasts of Japan, North America, South America, and South Africa.
- **Seasonal Trends**: The number of abnormal samples is higher in the northern-hemisphere winter, which may be related to the larger temperature differences in the northern-hemisphere winter.
- **Geographical Distribution**: Abnormal samples are mainly concentrated near the western-boundary ocean currents, where the ocean dynamics are very active.

### Significance

- **Verify Known Phenomena**: The Ulmo algorithm has successfully rediscovered the known ocean-dynamic hot-spot areas, which verifies the effectiveness of the algorithm.
- **Discover New Phenomena**: Although Ulmo has rediscovered the known hot-spot areas, it also has the potential to discover new, unrecognized ocean behaviors, especially when applied to other large-scale remote-sensing datasets.

In conclusion, this paper has successfully identified extreme phenomena on the ocean surface through deep-learning techniques and provided new tools and methods for further research on ocean dynamics.
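The 0.1% outlier cut described under Results amounts to ranking cutouts by log-likelihood and keeping the lowest tail. The following is a minimal sketch of that selection, assuming the LL values are already available as a plain list. The function name `lowest_fraction_indices` and the synthetic LL values are hypothetical and are not taken from the paper or its code.

```python
import random


def lowest_fraction_indices(log_likelihoods, fraction=0.001):
    """Return indices of the lowest `fraction` of log-likelihood values.

    At least one index is always returned so that small inputs still
    yield an outlier.
    """
    n_outliers = max(1, round(len(log_likelihoods) * fraction))
    order = sorted(range(len(log_likelihoods)),
                   key=lambda i: log_likelihoods[i])
    return order[:n_outliers]


# Synthetic LL values standing in for per-cutout results (illustrative only).
rng = random.Random(0)
lls = [rng.gauss(200.0, 150.0) for _ in range(10_000)]

outliers = lowest_fraction_indices(lls)
threshold = max(lls[i] for i in outliers)  # highest LL still counted as outlier
print(len(outliers))
```

With 10,000 synthetic values, the 0.1% cut selects 10 cutouts. Applied to roughly 12 million cutouts, the same fraction would correspond to about 12,000 outliers.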

Deep Learning of Sea Surface Temperature Patterns to Identify Ocean Extremes
