Abstract: Patent classification aims to assign multiple International Patent Classification (IPC) codes to a given patent. Recent methods for automatically classifying patents mainly focus on analyzing the text descriptions of patents. However, apart from the texts, each patent is also associated with some assignees, and the knowledge of their applied patents is often valuable for classification. Furthermore, the hierarchical taxonomy formulated by the IPC system provides important contextual information and enables models to leverage the correlations between IPC codes for more accurate classification. However, existing methods fail to incorporate the above aspects. In this paper, we propose an integrated framework that comprehensively considers the information on patents for patent classification. To be specific, we first present an IPC codes correlations learning module to derive their semantic representations via adaptively passing and aggregating messages within the same level and across different levels along the hierarchical taxonomy. Moreover, we design a historical application patterns learning component to incorporate the corresponding assignee's previous patents by a dual channel aggregation mechanism. Finally, we combine the contextual information of patent texts that contains the semantics of IPC codes, and assignees' sequential preferences to make predictions. Experiments on real-world datasets demonstrate the superiority of our approach over the existing methods. Besides, we present the model's ability to capture the temporal patterns of assignees and the semantic dependencies among IPC codes.
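To make the "hierarchical taxonomy" mentioned in the abstract concrete, the following is a minimal sketch of how a full IPC symbol can be unfolded into its chain of ancestors (section, class, subclass, main group, subgroup). The function name `ipc_ancestors` and the regular expression are illustrative assumptions, not part of the paper. The sketch also simplifies the real scheme: subgroups within a main group have their own nesting in the official IPC, which cannot be recovered from the symbol alone and is ignored here.

```python
import re

# Layout of a full IPC symbol, e.g. "G06F 17/30":
#   section (letter A-H), class (2 digits), subclass (letter),
#   main group (digits), "/", subgroup (digits).
# The digit counts are deliberately permissive (an assumption of this sketch).
_IPC_PATTERN = re.compile(r"^([A-H])(\d{2})([A-Z])\s*(\d{1,4})/(\d{2,6})$")


def ipc_ancestors(code: str) -> list:
    """Return the symbol's ancestors from section down to the symbol itself."""
    match = _IPC_PATTERN.match(code.strip().upper())
    if match is None:
        raise ValueError(f"not a full IPC symbol: {code!r}")
    section, cls, subclass, group, subgroup = match.groups()
    prefix = section + cls + subclass
    levels = [section, section + cls, prefix, f"{prefix} {group}/00"]
    # A "/00" suffix denotes the main group itself, so only add a
    # subgroup level when the suffix differs.
    if subgroup != "00":
        levels.append(f"{prefix} {group}/{subgroup}")
    return levels


print(ipc_ancestors("G06F 17/30"))
# ['G', 'G06', 'G06F', 'G06F 17/00', 'G06F 17/30']
```

Each entry in the returned list is a candidate label at one level of the taxonomy, which is the structure a hierarchy-aware classifier can exploit.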

What problem does this paper attempt to address?

The paper attempts to address several key issues in patent classification, specifically:

1. **Limitations of existing methods**: Current automatic patent classification methods mainly focus on analyzing the textual description of patents, while neglecting other relevant information, such as the applicant's historical patent records and the hierarchical relationships between International Patent Classification (IPC) codes. This information is crucial for improving classification accuracy.
2. **Utilization of applicant behavior patterns**: The paper points out that applicants exhibit certain behavior patterns when applying for patents, especially in the continuous application behavior within specific technical fields. By analyzing these historical behaviors, it is possible to better predict the current patent's IPC code, thereby improving classification accuracy.
3. **Semantic dependencies between IPC codes**: IPC codes are organized in a hierarchical structure, and there are complex semantic relationships between codes at different levels. Existing methods often fail to fully utilize the information in these hierarchical structures, leading to limited classification performance.

To address the above challenges, the paper proposes an integrated framework that comprehensively considers various information related to patents, including:

- **Patent text embedding module**: Converts the patent text description into semantic vectors, generating meaningful contextual representations.
- **IPC code correlation learning module**: Establishes semantic relationships between IPC codes by horizontally and vertically propagating messages within the IPC hierarchy.
- **Historical application pattern learning module**: Learns the applicant's higher-order temporal preferences by aggregating contextual and label information from historical patents.
- **Prediction module**: Combines contextual embedding information (including the semantics of IPC codes and the applicant's application preferences) to predict the classification probability of patent documents.

Through the comprehensive use of these modules, the paper aims to provide a more comprehensive and accurate patent classification solution. Experimental results show that this method outperforms existing methods on real datasets and effectively captures the applicant's historical behavior patterns and the semantic dependencies between IPC codes.
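The horizontal and vertical message propagation and the multi-label prediction step described above can be illustrated with a toy sketch. This is not the paper's model: the summary describes adaptive (learned) aggregation, whereas this sketch swaps in a fixed, equal-weight average, and the function names `propagate` and `predict`, the two-dimensional embeddings, and the 0.5 threshold are all assumptions made for illustration.

```python
import math


def mean(vectors, dim):
    """Element-wise mean of a list of vectors; zero vector if the list is empty."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def propagate(emb, parent):
    """One round of message passing over a label tree.

    emb:    code -> embedding vector
    parent: code -> parent code (root codes are simply absent)
    Fixed equal weights stand in for the learned, adaptive weights.
    """
    dim = len(next(iter(emb.values())))
    children = {}
    for child, par in parent.items():
        children.setdefault(par, []).append(child)

    updated = {}
    for code, vec in emb.items():
        par = parent.get(code)
        # Horizontal neighbours: codes on the same level sharing a parent.
        siblings = [c for c in children.get(par, []) if c != code] if par else []
        # Vertical neighbours: the parent above and the children below.
        vertical = ([par] if par else []) + children.get(code, [])
        h = mean([emb[c] for c in siblings], dim)
        v = mean([emb[c] for c in vertical], dim)
        updated[code] = [(vec[i] + h[i] + v[i]) / 3.0 for i in range(dim)]
    return updated


def predict(patent_vec, label_emb, threshold=0.5):
    """Score every code with sigmoid(dot product) and keep those above threshold."""
    scores = {}
    for code, vec in label_emb.items():
        logit = sum(p * q for p, q in zip(patent_vec, vec))
        scores[code] = 1.0 / (1.0 + math.exp(-logit))
    chosen = sorted(c for c, s in scores.items() if s >= threshold)
    return scores, chosen


# Toy hierarchy: section G with two classes beneath it.
emb = {"G": [1.0, 0.0], "G06": [0.0, 1.0], "G01": [1.0, 1.0]}
parent = {"G06": "G", "G01": "G"}

refined = propagate(emb, parent)
scores, chosen = predict([2.0, -1.0], emb)
print(chosen)  # ['G', 'G01']
```

Because every code receives an independent sigmoid score, a patent can be assigned several IPC codes at once, which matches the multi-label setting the paper targets. A fuller model would also feed in the assignee's historical patents, which this sketch leaves out.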

Adaptive Taxonomy Learning and Historical Patterns Modelling for Patent Classification
