The Research and Realization about Automatic Abstracting Based on Text Clustering and Natural Language Understanding

Guo Qing-lin, Fan Xiao-zhong, Liu Chang-an
DOI: https://doi.org/10.1007/s11460-006-0088-y
2006-01-01
Frontiers of Electrical and Electronic Engineering in China
Abstract: A method for automatic abstracting based on text clustering and natural language understanding is explored, aimed at overcoming the shortcomings of some current methods. The method makes use of text clustering and can produce automatic abstracts of multiple documents. A two-pass word segmentation algorithm based on the title and the first sentences of paragraphs is investigated; its precision and recall are both above 95%. For the specific domain of plastics, an automatic abstracting system named TCAAS is implemented. The precision and recall of its multi-document automatic abstracting are both above 75%. The experiments also show that it is feasible to use the method to develop a domain-specific automatic abstracting system, which is valuable for further in-depth study.
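The abstract reports precision and recall but does not say how they were computed. As a minimal sketch, assuming an abstract is scored by comparing the sentences a system selects against those a human annotator selects, the two measures could be calculated as below. The function name `precision_recall` and the sentence indices are hypothetical illustrations, not taken from the paper or from TCAAS.

```python
def precision_recall(selected, reference):
    """Return (precision, recall) of selected items against a reference set.

    Assumption: both arguments are collections of sentence indices
    (or segmented words); this scoring scheme is illustrative only.
    """
    selected, reference = set(selected), set(reference)
    if not selected or not reference:
        return 0.0, 0.0
    hits = len(selected & reference)  # items chosen by both system and human
    return hits / len(selected), hits / len(reference)


# Hypothetical data: sentence indices chosen by a system and by a human.
system_abstract = [0, 3, 5, 8, 9]
human_abstract = [0, 3, 4, 8]

p, r = precision_recall(system_abstract, human_abstract)
print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.60 recall=0.75
```

Precision penalizes sentences the system included that the human did not, while recall penalizes sentences the human chose that the system missed, which is why the abstract reports both figures for each task.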