Typhoon disaster state information extraction for Chinese texts

Peng Ye, Chunju Zhang, Mingzhu Chen, Shengcai Li
DOI: https://doi.org/10.1038/s41598-024-58585-8
IF: 4.6
2024-04-05
Scientific Reports
Abstract: Typhoon disasters undergo a complex evolutionary process influenced by temporal changes, and investigating this process constitutes the central focus of geographical research. As a key node within the typhoon disaster process, the state serves as the foundation for gauging the dynamics of the disaster. The majority of current approaches to disaster information extraction rely on event extraction methods to acquire fundamental elements, including disaster-causing factors, disaster-bearing bodies, disaster-pregnant environment and the extent of damage. Because disaster information is dispersed and varies in temporal and spatial granularity, these approaches struggle to support analysis of the typhoon disaster process. In this paper, a typhoon disaster state information extraction (TDSIE) method for Chinese texts is proposed, which aims to facilitate the systematic integration of fragmented typhoon disaster information. First, the integration of part-of-speech tagging with spatio-temporal information extraction is employed to achieve the tagging of typhoon disaster texts. Second, within the framework of spatio-temporal semantic units, the typhoon disaster semantic vector is constructed to facilitate the identification of information elements of typhoon disaster states. Third, co-referential state information fusion is performed based on spatio-temporal cues. Experimental analysis, conducted using online news as the data source, reveals that the TDSIE achieves precision and recall rates consistently surpassing 85%. The typhoon disaster state information derived from the TDSIE allows for the analysis of spatio-temporal patterns, evolutionary characteristics, and activity modes of typhoon disasters across various scales. Therefore, TDSIE serves as valuable support for investigating the inherent process properties of typhoon disasters.
multidisciplinary sciences
What problem does this paper attempt to address?
This paper addresses the extraction and integration of information during typhoon disasters. Specifically, it focuses on how to systematically extract and integrate typhoon disaster state information from Chinese texts to support the analysis of multi-scale spatio-temporal patterns, evolution characteristics and activity patterns of typhoon disaster processes. The paper identifies the following challenges for existing disaster information extraction methods:

1. **Dispersion of information and diversity of spatio-temporal granularity**: Because disaster information is dispersed and expressed at varying spatio-temporal granularities, traditional extraction methods struggle to support the analysis of typhoon disaster processes.
2. **Lack of standardized information models**: Typhoon disaster processes currently lack an information model, which makes it difficult to standardize the various kinds of typhoon disaster information.
3. **Multi-scale spatio-temporal dependence**: The typhoon disaster process is scale-dependent, with different spatio-temporal ranges and evolution sequences at different scales, so it must be understood at multiple spatio-temporal scales.
4. **Dynamically changing disaster elements**: During a typhoon disaster, disaster elements (such as infrastructure damage, house collapse and airport closure) change with time and location, and existing event extraction methods have difficulty capturing these dynamic changes.

To address these challenges, the paper proposes a typhoon disaster state information extraction method (TDSIE), implemented in three steps:

1. **Part-of-speech tagging and spatio-temporal information extraction**: Perform part-of-speech tagging, independently extract time and location information, and then merge the results to achieve comprehensive tagging of Chinese texts.
2. **Text segmentation into spatio-temporal semantic units**: Based on the time and location tags, divide the Chinese text into spatio-temporal semantic units, expand the embedding features of the word vectors, and use vector clustering to identify the typhoon disaster state elements in each unit.
3. **State coreference relation identification**: Use the time and location elements of each state as clues to identify coreference relations between different spatio-temporal semantic units, and fuse the related information.

The experimental results show that TDSIE exceeds 85% in both precision and recall, and that it can effectively extract and integrate typhoon disaster state information, providing support for emergency management and disaster research.
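The segmentation and fusion steps described above can be illustrated with a minimal Python sketch. It is not the paper's implementation. It assumes the text has already been tagged, uses a hypothetical tag set in which `TIME` and `LOC` mark spatio-temporal tokens, and treats two units as co-referential only when their time and location strings match exactly. The paper's vector clustering for identifying state elements is omitted, and the names `Unit`, `segment` and `fuse` are invented for illustration. English tokens stand in for segmented Chinese words.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Unit:
    """One spatio-temporal semantic unit (hypothetical structure)."""
    time: Optional[str]
    location: Optional[str]
    words: list = field(default_factory=list)


def segment(tokens):
    """Split (word, tag) pairs into units at TIME/LOC boundaries.

    Assumption: a TIME or LOC token that follows content words opens a
    new unit, which inherits whichever of time/location is not restated.
    """
    units = []
    current = Unit(time=None, location=None)
    for word, tag in tokens:
        if tag in ("TIME", "LOC"):
            if current.words:
                # Close the running unit and carry its context forward.
                units.append(current)
                current = Unit(current.time, current.location)
            if tag == "TIME":
                current.time = word
            else:
                current.location = word
        else:
            current.words.append(word)
    if current.words:
        units.append(current)
    return units


def fuse(units):
    """Merge units that share the same (time, location) key.

    Assumption: exact string equality stands in for the paper's
    spatio-temporal coreference cues.
    """
    merged = defaultdict(list)
    for unit in units:
        merged[(unit.time, unit.location)].extend(unit.words)
    return dict(merged)


# Toy, pre-tagged input (invented for illustration).
tokens = [
    ("Aug-10", "TIME"), ("Wenling", "LOC"), ("houses", "n"), ("collapsed", "v"),
    ("Taizhou", "LOC"), ("airport", "n"), ("closed", "v"),
    ("Wenling", "LOC"), ("roads", "n"), ("flooded", "v"),
]

units = segment(tokens)
fused = fuse(units)
for key, words in fused.items():
    print(key, words)
```

In this toy input the text mentions Wenling twice on the same date, so segmentation yields three units and fusion merges the two Wenling units into one state. A realistic pipeline would also need to normalize time expressions and resolve place names at different administrative levels before comparing keys, which this sketch does not attempt.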