Abstract: The technology disclosure document of a patent application is one of the main bases for judging a patent's novelty and uniqueness. Automated evaluation can address the long turnaround time and strong subjectivity of human evaluation. However, when applied to similarity judgment of patent application technology disclosure documents, text similarity algorithms based on corpora and deep learning suffer from insufficient cross-library learning data and insufficient focus on core content, which limits their performance and practical use. In this paper, we propose a similarity evaluation method for patent application technology disclosure documents based on a multi-dimensional fusion strategy. First, in the text preprocessing stage, we propose word segmentation reconstruction and a similarity evaluation optimization strategy based on the weighted fusion of word frequency and part-of-speech scores. Then, we propose a similarity calculation method based on two new mapping spaces, dot matrix and image, to achieve a more diversified comprehensive evaluation. The algorithm was evaluated on four published text similarity matching datasets (with 0–5 or 0/1 labels) and a set of patent application technology disclosure documents. Experimental results show that, on the published datasets, the proposed method improves discrimination accuracy by about 10% over the traditional vector semantic model and matches the discriminative ability of lightweight deep learning models without requiring training.
On the sample dataset of patent application technology disclosure documents, the discrimination accuracy of the proposed method exceeds that of the traditional vector semantic model (by 20%) and of various deep learning models (by 1%-8%), with relatively balanced precision and recall. Visual analysis on the same dataset further supports the effectiveness and reliability of the similarity calculation in the dot matrix and image spaces, which offers a new approach to similarity evaluation between patent application technology disclosure documents.
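To make the idea of weighting words by both frequency and part of speech more concrete, the following is a minimal sketch, not the algorithm evaluated in this paper. It assumes pre-segmented, pre-tagged tokens, a hypothetical `POS_SCORE` table, a hypothetical mixing parameter `alpha`, and a plain weighted cosine as the similarity measure; all names and values are illustrative assumptions.

```python
import math
from collections import Counter

# Hypothetical part-of-speech scores (assumed values, not taken from the paper).
POS_SCORE = {"NOUN": 1.0, "VERB": 0.8, "ADJ": 0.5, "OTHER": 0.2}


def fused_weights(tagged_tokens, alpha=0.5):
    """Return {word: weight}, fusing relative word frequency and a POS score.

    tagged_tokens: list of (word, pos_tag) pairs from any segmenter/tagger.
    alpha: assumed mixing factor between frequency and POS score.
    """
    counts = Counter(word for word, _ in tagged_tokens)
    total = sum(counts.values())
    # If a word appears with several tags, the last tag seen is used.
    pos_of = {word: pos for word, pos in tagged_tokens}
    return {
        word: alpha * (n / total)
        + (1 - alpha) * POS_SCORE.get(pos_of[word], POS_SCORE["OTHER"])
        for word, n in counts.items()
    }


def weighted_cosine(w_a, w_b):
    """Cosine similarity between two sparse word-weight dictionaries."""
    dot = sum(w_a[t] * w_b[t] for t in w_a.keys() & w_b.keys())
    norm_a = math.sqrt(sum(v * v for v in w_a.values()))
    norm_b = math.sqrt(sum(v * v for v in w_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    doc_a = [("battery", "NOUN"), ("charges", "VERB"), ("fast", "ADJ")]
    doc_b = [("battery", "NOUN"), ("heats", "VERB"), ("slowly", "OTHER")]
    print(weighted_cosine(fused_weights(doc_a), fused_weights(doc_b)))
```

In this sketch, nouns and verbs contribute more to the score than function words, which is one simple way to bias a similarity measure toward core technical content without any training.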

A multi-dimensional fusion strategy similarity measure method for patent application technology disclosure document
