Deep Learning of Sea Surface Temperature Patterns to Identify Ocean Extremes

J. Xavier Prochaska, Peter C. Cornillon, David M. Reiman
DOI: https://doi.org/10.3390/rs13040744
2023-05-10
Abstract: We perform an out-of-distribution analysis of ~12,000,000 semi-independent 128x128 pixel^2 SST regions, which we define as cutouts, from all nighttime granules in the MODIS R2019 Level-2 public dataset to discover the most complex or extreme phenomena at the ocean surface. Our algorithm (Ulmo) is a probabilistic autoencoder, which combines two deep learning modules: (1) an autoencoder, trained on ~150,000 random cutouts from 2010, to represent any input cutout with a 512-dimensional latent vector akin to a (non-linear) EOF analysis; and (2) a normalizing flow, which maps the autoencoder's latent space distribution onto an isotropic Gaussian manifold. From the latter, we calculate a log-likelihood value for each cutout and define outlier cutouts to be those in the lowest 0.1% of the distribution. These exhibit large gradients and patterns characteristic of a highly dynamic ocean surface, and many are located within larger complexes whose unique dynamics warrant future analysis. Without guidance, Ulmo consistently locates the outliers where the major western boundary currents separate from the continental margin. Buoyed by these results, we begin the process of exploring the fundamental patterns learned by Ulmo, identifying several compelling examples. Future work may find that algorithms like Ulmo hold significant potential/promise to learn and derive other, not-yet-identified behaviors in the ocean from the many archives of satellite-derived SST fields. As important, we see no impediment to applying them to other large, remote-sensing datasets for ocean science (e.g., sea surface height, ocean color).
Atmospheric and Oceanic Physics
What problem does this paper attempt to address?
This paper uses deep learning to analyze sea surface temperature (SST) patterns and identify extreme phenomena at the ocean surface. Specifically, the authors developed an unsupervised machine-learning algorithm named Ulmo and applied it to approximately 12 million semi-independent 128x128-pixel SST regions (called cutouts) from the nighttime MODIS Level-2 SST dataset. Ulmo combines two deep-learning modules, an Autoencoder and a Normalizing Flow, to discover the most complex or extreme phenomena at the ocean surface.

### Main Objectives

1. **Identify Extreme Phenomena**: Analyze a large volume of SST data to identify ocean-surface regions with highly dynamic characteristics. These regions may include the separation points of major ocean currents or other complex dynamical processes.
2. **Explore Unknown Physical Processes**: Use unsupervised learning to discover ocean-surface physical processes that are rare or have not been fully studied.
3. **Verify the Effectiveness of the Algorithm**: Apply Ulmo to real data to assess how effectively and reliably it identifies anomalous phenomena at the ocean surface.

### Method Overview

- **Data Pre-processing**: The SST data are pre-processed with steps that include cloud removal, median filtering, resizing, and mean subtraction to improve the performance of the algorithm.
- **Model Architecture**: Ulmo is a Probabilistic Autoencoder (PAE) that combines an Autoencoder and a Normalizing Flow. The Autoencoder compresses each input SST image into a 512-dimensional latent vector, and the Normalizing Flow maps the distribution of these latent vectors onto an isotropic Gaussian distribution.
- **Training Process**: The Autoencoder is trained first so that it can reconstruct the input SST image. The Normalizing Flow is then trained so that it can estimate the probability of each sample in the latent space.
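
How a normalizing flow turns a latent vector into a log-likelihood can be illustrated with a deliberately tiny example. The sketch below is not Ulmo's flow: it assumes a toy element-wise affine transform in place of a trained deep flow, and the names `standard_normal_logpdf`, `affine_flow_log_likelihood`, `shift`, and `log_scale` are hypothetical. It only shows the change-of-variables rule that any such flow relies on: the log-density under the isotropic Gaussian plus the log-determinant of the transform's Jacobian.

```python
import math


def standard_normal_logpdf(z):
    """Log-density of an isotropic unit Gaussian evaluated at vector z."""
    d = len(z)
    return -0.5 * (sum(v * v for v in z) + d * math.log(2.0 * math.pi))


def affine_flow_log_likelihood(x, shift, log_scale):
    """Log-likelihood of latent vector x under a toy element-wise affine flow.

    Assumed forward map (illustrative only):
        z_i = (x_i - shift_i) * exp(-log_scale_i)
    Change of variables:
        log p(x) = log N(z; 0, I) + log|det dz/dx|
    and for this map log|det dz/dx| = -sum(log_scale).
    """
    z = [(xi - mi) * math.exp(-si) for xi, mi, si in zip(x, shift, log_scale)]
    log_det = -sum(log_scale)
    return standard_normal_logpdf(z) + log_det


if __name__ == "__main__":
    shift = [0.0, 0.0]
    log_scale = [0.0, 0.0]
    typical = affine_flow_log_likelihood([0.1, -0.2], shift, log_scale)
    unusual = affine_flow_log_likelihood([4.0, 5.0], shift, log_scale)
    # A vector far from the bulk of the distribution gets a lower value.
    print(typical, unusual)
```

In a real probabilistic autoencoder the input `x` would be the 512-dimensional latent vector produced by the Autoencoder, and the transform would be a learned, invertible network rather than a fixed shift and scale.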
### Results

- **Anomaly Detection**: A log-likelihood (LL) value is calculated for each cutout, and the 0.1% of cutouts with the lowest LL values are defined as outliers. These outliers are usually located near the major western boundary currents, such as those off the coasts of Japan, North America, South America, and South Africa.
- **Seasonal Trends**: The number of outliers is higher in the Northern Hemisphere winter, which may be related to the larger temperature contrasts during that season.
- **Geographical Distribution**: Outliers are concentrated near the western boundary currents, where the ocean dynamics are very active.

### Significance

- **Verify Known Phenomena**: Ulmo rediscovered known hot spots of ocean dynamics, which supports the effectiveness of the algorithm.
- **Discover New Phenomena**: Beyond rediscovering known hot spots, Ulmo has the potential to discover new, unrecognized ocean behaviors, especially when applied to other large-scale remote-sensing datasets.

In conclusion, this paper identifies extreme phenomena at the ocean surface through deep-learning techniques and provides new tools and methods for further research on ocean dynamics.
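
The 0.1% outlier cut described under Results amounts to ranking cutouts by log-likelihood and keeping the lowest fraction. The following is a minimal sketch of that selection step, assuming the LL values are already available as a plain list. The function name `lowest_fraction_indices` and the rounding rule for the outlier count are assumptions made for illustration, not details taken from the paper.

```python
def lowest_fraction_indices(log_likelihoods, fraction=0.001):
    """Return indices of the entries in the lowest `fraction` by LL.

    Indices are ordered from the lowest LL upward. The number of
    outliers is round(n * fraction), with a minimum of one for any
    non-empty input (an assumption of this sketch).
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    n = len(log_likelihoods)
    if n == 0:
        return []
    n_outliers = max(1, round(n * fraction))
    # Sort indices by ascending LL so the least likely cutouts come first.
    order = sorted(range(n), key=lambda i: log_likelihoods[i])
    return order[:n_outliers]


if __name__ == "__main__":
    # Synthetic LL values: index i has LL = -i, so high indices are outliers.
    lls = [-float(i) for i in range(2000)]
    print(lowest_fraction_indices(lls))
```

With roughly 12 million cutouts, a 0.1% cut of this kind would flag on the order of 12,000 of them, which is why a fixed fraction rather than a fixed LL threshold is a convenient definition.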