Deep generative model conditioned by phase picks for synthesizing labeled seismic waveforms with limited data

Guoyi Chen, Junlun Li, Hao Guo
2023-10-02
Abstract: The shortage of labeled seismic field data poses a significant challenge for deep-learning applications in seismology. One approach to mitigating this issue is to use synthetic waveforms as a complement to field data. However, traditional physics-driven methods for synthesizing data are computationally expensive and often fail to capture complex features in real seismic waveforms. In this study, we develop a deep-learning-based generative model, PhaseGen, for synthesizing realistic seismic waveforms dictated by provided P- and S-wave arrival labels. In contrast to previous generative models, which require a large amount of training data, the proposed model can be trained with only 100 seismic events recorded by a single seismic station. The fidelity, diversity and alignment of waveforms synthesized by PhaseGen with diverse P- and S-wave arrival labels are quantitatively evaluated. PhaseGen is also used to augment a labeled seismic dataset for training a deep neural network for the phase picking task, and the picking capability of the network trained with the augmented dataset is found to be unambiguously improved. It is expected that PhaseGen can offer a valuable alternative for synthesizing realistic waveforms and provide a promising solution for the lack of labeled seismic data.
Geophysics
What problem does this paper attempt to address?
### Problems the Paper Aims to Solve

This paper addresses the shortage of labeled seismic field data for deep-learning applications in seismology. Specifically, the authors propose a deep generative model, PhaseGen, that synthesizes realistic seismic waveforms when only limited data is available.

#### Main Contributions:

1. **Synthesizing Realistic Seismic Waveforms**: Traditional physics-driven methods for synthesizing seismic waveforms are computationally expensive and often fail to capture the complex features of real seismic waveforms. PhaseGen generates realistic waveforms dictated by the provided P-wave and S-wave arrival labels.
2. **Training with Small Datasets**: Unlike previous generative models, which require large amounts of training data, PhaseGen can be trained with only 100 seismic events recorded by a single seismic station.
3. **Quantitative Evaluation**: The paper quantitatively evaluates the fidelity, diversity, and alignment (consistency with the conditioning P- and S-wave arrival labels) of the waveforms generated by PhaseGen.
4. **Data Augmentation**: PhaseGen is used to augment a labeled seismic dataset for training a deep neural network on the phase picking task, and the network trained with the augmented dataset shows unambiguously improved picking capability.

In summary, PhaseGen offers a valuable alternative for synthesizing realistic seismic waveforms and a promising remedy for the lack of labeled seismic data.
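The conditioning and augmentation ideas can be illustrated with a minimal sketch. The abstract does not state how PhaseGen encodes the arrival labels or how synthetic and field data are combined, so everything here is an assumption: `picks_to_condition` is a hypothetical helper that turns P and S arrival sample indices into two Gaussian-shaped traces (a label shape common in phase-picking work), and `mix_real_and_synthetic` is a hypothetical helper that concatenates and shuffles real and synthetic examples. The generative model itself is not shown.

```python
import numpy as np


def picks_to_condition(p_idx, s_idx, n_samples, sigma=10.0):
    """Encode P and S arrival sample indices as two Gaussian-shaped traces.

    Hypothetical helper: the encoding PhaseGen actually uses is not given
    in the source; this shape is only an assumed, illustrative choice.
    """
    if not (0 <= p_idx < s_idx < n_samples):
        raise ValueError("expected 0 <= p_idx < s_idx < n_samples")
    t = np.arange(n_samples)
    p_trace = np.exp(-0.5 * ((t - p_idx) / sigma) ** 2)
    s_trace = np.exp(-0.5 * ((t - s_idx) / sigma) ** 2)
    return np.stack([p_trace, s_trace])  # shape (2, n_samples)


def mix_real_and_synthetic(real_x, real_c, synth_x, synth_c, seed=0):
    """Concatenate real and synthetic (waveform, label) pairs and shuffle them.

    Hypothetical helper: the source only says the labeled dataset is
    augmented with synthetic waveforms, not how the two are mixed.
    """
    x = np.concatenate([real_x, synth_x], axis=0)
    c = np.concatenate([real_c, synth_c], axis=0)
    order = np.random.default_rng(seed).permutation(len(x))
    return x[order], c[order]


# Usage with placeholder arrays (zeros stand in for actual waveforms).
cond = picks_to_condition(p_idx=300, s_idx=700, n_samples=3000)
real_x = np.zeros((100, 3, 3000))    # 100 events, 3 components (assumed layout)
real_c = np.tile(cond, (100, 1, 1))
synth_x = np.zeros((50, 3, 3000))    # stand-in for generator output
synth_c = np.tile(cond, (50, 1, 1))
train_x, train_c = mix_real_and_synthetic(real_x, real_c, synth_x, synth_c)
```

Because the labels are supplied as the condition, each synthetic waveform comes with its P and S picks already known, which is what makes such output usable as labeled training data for a phase picker.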