Towards Bilingual Term Extraction in Comparable Patents.

Bin Lu, Benjamin K. Tsou
2009-01-01
Abstract: We present a novel framework for extracting bilingual terms from a corpus of comparable patents. The framework has four major steps: 1) extract monolingual single-word and multi-word term candidates from the monolingual patents; 2) find parallel sentences in the comparable patents; 3) extract bilingual single-word and multi-word term candidates; 4) identify correct bilingual terms with an SVM classifier that integrates both linguistic and statistical information. The experimental results show that the framework identifies correct bilingual terms in comparable patents effectively, and that the SVM classifier further improves its performance.
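To make step 3 concrete, the following is a minimal sketch of how bilingual term candidates might be scored once parallel sentences and monolingual term candidates are available. It is not the authors' method: the Dice coefficient is swapped in here as one common statistical association measure, the function name `dice_scores` is hypothetical, and the English-German sentences and term lists are invented toy data. Scores of this kind could plausibly serve as one statistical feature for the classifier in step 4, though the abstract does not say which features are used.

```python
from collections import Counter
from itertools import product


def dice_scores(sentence_pairs, src_terms, tgt_terms):
    """Score (source, target) term pairs by Dice coefficient.

    Assumption: a term "occurs" in a sentence if it appears as a substring.
    Dice = 2 * cooccurrences / (source occurrences + target occurrences),
    counted over aligned sentence pairs.
    """
    src_count, tgt_count, pair_count = Counter(), Counter(), Counter()
    for src_sent, tgt_sent in sentence_pairs:
        src_found = {t for t in src_terms if t in src_sent}
        tgt_found = {t for t in tgt_terms if t in tgt_sent}
        src_count.update(src_found)
        tgt_count.update(tgt_found)
        # Every source term co-occurs with every target term in this pair.
        pair_count.update(product(src_found, tgt_found))
    return {
        (s, t): 2 * c / (src_count[s] + tgt_count[t])
        for (s, t), c in pair_count.items()
    }


# Invented toy data, for illustration only.
pairs = [
    ("a fuel cell comprises an anode", "eine Brennstoffzelle umfasst eine Anode"),
    ("the fuel cell is operated", "die Brennstoffzelle wird betrieben"),
    ("the anode is thick", "die Anode ist dick"),
]
scores = dice_scores(pairs, ["fuel cell", "anode"], ["Brennstoffzelle", "Anode"])

# Rank candidate pairs from strongest to weakest association.
for (src, tgt), score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{src!r} <-> {tgt!r}: {score:.2f}")
```

On this toy data the correct pairs co-occur in every sentence where either term appears, so they score higher than the crossed pairs. A real system would need tokenization suited to each language and far more data before such scores become reliable.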