Semantic Framework-Based Defect Text Mining Technique and Application in Power Grid

CAO Jing, CHEN Lushen, QIU Jian, WANG Huifang, YING Gaoliang, ZHANG Bo
DOI: https://doi.org/10.13335/j.1000-3673.pst.2016.1044
2017-01-01
Abstract: Power grid enterprises hold large volumes of equipment defect texts in Chinese that contain important reliability information. Mining this information manually is inefficient and of uncertain accuracy. Taking transformer defect texts as the study object, this paper analyzes their characteristics and establishes a defect text mining model based on a semantic framework. The model offers a new technique for unstructured data mining in the power grid domain because it solves the problems of segmenting the sentence elements of defect texts and precisely extracting numerical information. First, defect texts are preprocessed (segmentation and feature extraction) using an established ontology dictionary. Then, the power semantic framework and its semantic slots are defined, a procedure for slot filling and semantic framework construction is proposed, and the ontology dictionary is automatically refined by merging word sequences. Finally, the application of the defect text mining results to statistical reliability analysis is studied. A case study shows that the proposed mining technique is feasible and effective for automatic classification and statistics of grid defects.
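The abstract does not spell out the segmentation or slot-filling algorithms, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes forward maximum matching against a small dictionary (a common baseline for Chinese segmentation, swapped in here for illustration) and a regular expression for numeric values. The dictionary entries, slot names, unit list, and the sample defect sentence are all hypothetical.

```python
import re

# Hypothetical ontology dictionary mapping terms to semantic slots.
# Entries and slot names are illustrative, not taken from the paper.
ONTOLOGY = {
    "主变压器": "equipment",   # main transformer
    "油枕": "component",       # conservator
    "油位": "attribute",       # oil level
    "油温": "attribute",       # oil temperature
    "渗油": "phenomenon",      # oil seepage
    "偏低": "phenomenon",      # too low
}
MAX_TERM_LEN = max(len(term) for term in ONTOLOGY)

# A number with an optional unit, e.g. "85℃" or "0.2MPa" (assumed unit list).
NUMBER = re.compile(r"\d+(?:\.\d+)?\s*(?:℃|MPa|kV|%)?")


def segment(text):
    """Forward maximum matching; numeric expressions are kept as one token."""
    tokens, i = [], 0
    while i < len(text):
        match = NUMBER.match(text, i)
        if match:
            tokens.append((match.group().replace(" ", ""), "value"))
            i = match.end()
            continue
        # Try the longest dictionary term first, then shorter ones.
        for size in range(min(MAX_TERM_LEN, len(text) - i), 0, -1):
            term = text[i:i + size]
            if term in ONTOLOGY:
                tokens.append((term, ONTOLOGY[term]))
                i += size
                break
        else:
            # Character not covered by the dictionary (e.g. punctuation): skip.
            i += 1
    return tokens


def fill_slots(tokens):
    """Group segmented terms by slot to form one semantic frame."""
    frame = {}
    for term, slot in tokens:
        frame.setdefault(slot, []).append(term)
    return frame


frame = fill_slots(segment("主变压器油枕渗油，油位偏低，油温85℃"))
```

In this sketch, characters that match no dictionary term are simply skipped. A system along the lines the abstract describes would presumably use such unmatched sequences as candidates for refining the dictionary, but how that is done is not stated here.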