Abstract: In recent years, the explosive growth of patent applications has drawn extensive attention to patent mining. An important problem in patent mining is recognizing the technologies contained in patents, which serves as fundamental preparation for deeper analysis. To this end, in this paper we present a focused study on constructing a technology portrait for each patent, i.e., recognizing the technical phrases it contains, which summarize and represent the patent from a technical perspective. Along this line, a critical challenge is how to analyze the unique characteristics of technical phrases and describe them precisely. We therefore first derive detailed descriptions of the technical phrases found in a large body of patents, based on several criteria: previous work, practical experience, and statistical analysis. Then, considering the unique characteristics of technical phrases and the complex structure of patent documents, such as multi-aspect semantics and multi-level relevance, we propose a novel unsupervised model, TechPat, which automatically recognizes technical phrases in massive patent collections without the need for expensive human labeling. We then evaluate the extraction results from several aspects. Specifically, we propose a novel evaluation metric, Information Retrieval Efficiency (IRE), to quantify the quality of the extracted technical phrases from a new perspective. Extensive experiments on real-world patent data demonstrate that TechPat effectively discriminates technical phrases in patents and greatly outperforms existing methods. We further apply the extracted technical phrases to two practical tasks, patent search and patent classification, where the experimental results confirm the broad application prospects of technical phrases. Finally, we discuss the generalization ability of the proposed methods.

Chinese Patent Mining Based on Sememe Statistics and Key-Phrase Extraction

The patent mining analysis method based on Chinese word segmentation

Exploiting Semantic Knowledge Base for Patent Retrieval

A Semantic Query Expansion-Based Patent Retrieval Approach

An Ontology-Based Automatic Semantic Annotation Approach for Patent Document Retrieval in Product Innovation Design

An SDN Architecture for Patent Prior Art Search System Based on Phrase Embedding

Technical Phrase Extraction for Patent Mining: A Multi-level Approach

TechPat: Technical Phrase Extraction for Patent Mining

Automatic Abstraction of Long Chinese Patent Texts Based on P-Bertsum Model

Patent Keyword Extraction Algorithm Based on Distributed Representation for Patent Classification

PatentMiner: topic-driven patent analysis and mining

Mining Parallel Knowledge from Comparable Patents

Towards Accurate Word Segmentation for Chinese Patents

A Patent Keyword Extraction Method Based on Corpus Classification

Mining Technical Topic Networks from Chinese Patents

Chinese technical terminology extraction based on DC-value and information entropy

An Automatic Generation Method of Patent Specification Abstract Based on "Extraction-Abstraction" Model

PatentMiner: Patent Vacancy Mining via Context-enhanced and Knowledge-guided Graph Attention

Automatic summarization of long text of Chinese patents based on PatBertsum model

Experimental Study of Patent Information Content Mining

Patent Mining by Extracting Functional Analysis Information Modelled As Graph Structure: A Patent Knowledge-base Collaborative Building Approach