Automatic Recognition of Chinese Scientific and Technological Terms Using Integrated Linguistic Knowledge

ZF Sui, YR Chen, ZC Wei
DOI: https://doi.org/10.1109/nlpke.2003.1275948
2003-01-01
Abstract: We present our research on using integrated linguistic knowledge to automatically recognize Chinese scientific and technological terms, based on a careful analysis of the characteristics of such terms. The automatic term recognition system has two stages: a learning stage and an application stage. In the learning stage, we use a series of machine learning methods to acquire various kinds of knowledge for automatic term recognition from a large-scale corpus and a term bank. This knowledge includes the inner structural knowledge of terms, the statistical domain features of term components, the statistical mutual information between the components of terms, the outer environment features of terms, and the distinct text-level features of term recognition. In the application stage, an efficient model combines all of these types of knowledge to perform automatic term recognition. Experiments show that the system can be of great help to term standardization experts in discovering new terms.
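To make one of the knowledge sources above concrete, the following is a minimal sketch of how mutual information between adjacent term components might be estimated from a word-segmented corpus. The abstract does not give the paper's formulation, so this uses standard pointwise mutual information as an assumption; the function name `pmi_table` and the toy `corpus` are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def pmi_table(sequences):
    """Return pointwise mutual information (base 2) for adjacent component pairs.

    `sequences` is a list of word-segmented sentences (lists of strings).
    Assumption: standard PMI, not necessarily the measure used in the paper.
    """
    unigrams = Counter()
    bigrams = Counter()
    for seq in sequences:
        unigrams.update(seq)
        bigrams.update(zip(seq, seq[1:]))  # adjacent pairs only

    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())

    scores = {}
    for (a, b), count in bigrams.items():
        p_ab = count / n_bi
        p_a = unigrams[a] / n_uni
        p_b = unigrams[b] / n_uni
        scores[(a, b)] = math.log2(p_ab / (p_a * p_b))
    return scores


# Toy, hand-made corpus (illustrative only, not data from the paper).
corpus = [
    ["神经", "网络", "模型"],
    ["神经", "网络", "训练"],
    ["统计", "模型", "训练"],
    ["网络", "训练", "方法"],
]
scores = pmi_table(corpus)

# Print pairs from strongest to weakest association.
for pair, value in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(pair, round(value, 3))
```

Under this sketch, pairs whose components co-occur more often than their individual frequencies would predict receive higher scores, which is the kind of signal that could mark a candidate multi-component term. A real system would need far more data and some smoothing for rare pairs.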