Broadband Ground Motion Synthesis by Diffusion Model with Minimal Condition

Jaeheun Jung, Jaehyuk Lee, Chang-Hae Jung, Hanyoung Kim, Bosung Jung, Donghun Lee
2024-12-23
Abstract: Earthquakes are rare. Hence there is a fundamental call for reliable methods to generate realistic ground motion data for data-driven approaches in seismology. Recent GAN-based methods fall short of the call, as they either require special information such as geological traits or generate subpar waveforms that fail to satisfy seismological constraints such as phase arrival times. We propose a specialized Latent Diffusion Model (LDM) that reliably generates realistic waveforms after learning from real earthquake data with minimal conditions: location and magnitude. We also design a domain-specific training method that exploits the traits of earthquake datasets: multiple observed waveforms, time-aligned and paired to each earthquake source, that are tagged with seismological metadata comprising earthquake magnitude, depth of focus, and the locations of the epicenter and seismometers. We construct the time-aligned earthquake dataset using the Southern California Earthquake Data Center (SCEDC) API, and train our model on this dataset with our proposed training method for performance evaluation. Our model surpasses all comparable data-driven methods on various test criteria drawn not only from the waveform generation domain but also from seismology, such as phase arrival time, GMPE analysis, and spectrum analysis. Our result opens new research directions for deep learning applications in seismology.
Machine Learning, Artificial Intelligence, Geophysics
### What problems does this paper attempt to solve?

This paper addresses several key challenges in generating seismic waveform data for data-driven seismology. Specifically, the authors focus on the following problems:

1. **Data scarcity**: Large earthquakes are very rare, so little data is available for training and testing. This makes it difficult to generate realistic seismic waveforms.
2. **Limitations of existing generative models**: Methods based on generative adversarial networks (GANs) can generate seismic waveforms, but they usually require additional geological information (such as fault mechanisms or local geological features), and the generated waveforms often fail to meet seismological constraints such as phase arrival times and ground-motion amplitudes.
3. **Authenticity of generated waveforms**: Existing generative models often fail to capture the key features of seismic waveforms, such as the arrival times of P- and S-waves and the amplitude of ground motion. These features are crucial for seismic analysis.

To solve these problems, the authors propose a new method based on a latent diffusion model. The method generates broadband ground-motion data with high seismological authenticity using only minimal conditions (location and magnitude). The authors aim to provide more reliable and realistic seismic waveform data for data-driven seismology research.

### Main contributions

- **Designed a new diffusion model**: The model generates realistic seismic waveforms conditioned only on the earthquake magnitude and the locations of the source and the observation station.
- **Proposed a domain-specific training framework**: Training exploits paired seismic waveforms, that is, multiple time-aligned recordings of the same earthquake, to make learning more efficient.
- **Constructed a new evaluation dataset**: Paired seismic waveform data are extracted and organized from public seismic datasets for model training and evaluation.
- **Demonstrated the effectiveness of the model**: The model is evaluated with multiple metrics (such as GMPE analysis, phase arrival time, and spectral analysis) and outperforms the existing baseline models it is compared against.

### Method overview

The method uses a diffusion model to generate seismic waveforms through the following steps:

1. **Data preparation**: Paired waveform data from multiple seismic events are collected and organized from the SCEDC dataset.
2. **Model architecture**: A U-Net backbone with a cross-attention mechanism is adopted, and an amplitude correction module (ACM) is introduced to improve the quality of the generated waveforms.
3. **Training process**: The diffusion model is trained on paired data so that the generated waveforms are consistent with real waveforms in phase arrival time and amplitude.
4. **Evaluation and verification**: The authenticity and reliability of the generated waveforms are verified through quantitative and qualitative analysis.

In conclusion, this paper proposes a diffusion-based method that generates high-quality seismic waveform data under minimal conditions, providing new tools and directions for seismology research.
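The training step in the method overview can be illustrated with a minimal sketch of the standard DDPM-style forward noising process and noise-prediction loss. This is not the authors' implementation. The function names, the linear noise schedule, the latent shape, and the layout of the conditioning vector (magnitude, depth, source and station coordinates) are all assumptions made for illustration. The U-Net, cross-attention, and amplitude correction module are omitted.

```python
import numpy as np

def make_condition(magnitude, depth_km, src_latlon, sta_latlon):
    """Pack event/station metadata into a flat vector (hypothetical layout)."""
    return np.array([magnitude, depth_km, *src_latlon, *sta_latlon], dtype=np.float64)

def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear noise schedule, a common default; the paper's schedule is not stated here."""
    return np.linspace(beta_start, beta_end, num_steps)

def forward_noise(x0, t, alpha_bar, rng):
    """Sample x_t from q(x_t | x_0) = N(sqrt(abar_t) * x0, (1 - abar_t) * I)."""
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return xt, eps

def denoising_loss(eps_pred, eps):
    """Mean squared error between predicted and true noise."""
    return float(np.mean((eps_pred - eps) ** 2))

rng = np.random.default_rng(0)
betas = linear_beta_schedule(1000)
alpha_bar = np.cumprod(1.0 - betas)  # cumulative signal retention per step

# Stand-in for a 3-component latent waveform; the shape is an assumption.
x0 = rng.standard_normal((3, 256))
cond = make_condition(4.5, 10.0, (34.0, -118.0), (34.2, -117.5))

t = 500
xt, eps = forward_noise(x0, t, alpha_bar, rng)

# A trained denoiser would predict eps from (xt, t, cond). With the true eps,
# the forward process can be inverted exactly, which checks the algebra.
x0_rec = (xt - np.sqrt(1.0 - alpha_bar[t]) * eps) / np.sqrt(alpha_bar[t])
```

In an actual training loop, a conditional network would take `xt`, `t`, and `cond` and return `eps_pred`, and `denoising_loss(eps_pred, eps)` would be minimized over randomly sampled time steps.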