LLM-based Extraction of Contradictions from Patents

Stefan Trapp, Joachim Warschat

2024-03-21

Abstract: Since the 1950s, TRIZ has shown that patents and the technical contradictions they solve are an important source of inspiration for developing innovative products. However, TRIZ is a heuristic based on a historic patent analysis and does not draw on the ever-increasing number of recent technological solutions in current patents. Because of the huge number of patents, their length, and, not least, their complexity, modern patent retrieval and patent analysis need to go beyond keyword-oriented methods. Recent advances in patent retrieval and analysis mainly focus on dense vectors based on neural Transformer language models such as Google BERT. These are used, for example, for dense retrieval, question answering, summarization, and key concept extraction. Within patent summarization and key concept extraction, one research focus is on generic inventive concepts, or TRIZ concepts, such as problems, solutions, advantages of the invention, parameters, and contradictions. Having succeeded rule-based approaches, fine-tuned BERT-like language models for sentence-wise classification represent the state of the art in inventive concept extraction. While they work comparatively well for basic concepts such as problems or solutions, contradictions, as a more complex abstraction, remain a challenge for these models. This paper goes one step further and presents a method to extract TRIZ contradictions from patent texts based on prompt engineering with a generative Large Language Model (LLM), namely OpenAI's GPT-4. Contradiction detection, sentence extraction, contradiction summarization, parameter extraction, and assignment to the 39 abstract TRIZ engineering parameters are all performed in a single prompt using the LangChain framework. Our results show that "off-the-shelf" GPT-4 is a serious alternative to existing approaches.
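The single-prompt idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' prompt or their LangChain code: the prompt wording, the JSON keys, the helper names (`build_prompt`, `extract_contradiction`, `fake_llm`), and the five-entry subset of the 39 TRIZ engineering parameters are all assumptions made for illustration. The model is represented by any callable that maps a prompt string to a response string, so a stand-in is used here in place of GPT-4.

```python
import json
from typing import Callable

# Illustrative subset of the 39 TRIZ engineering parameters (number -> name).
# A real prompt would list all 39; this shortened table is an assumption.
TRIZ_PARAMETERS = {
    1: "Weight of moving object",
    9: "Speed",
    14: "Strength",
    27: "Reliability",
    39: "Productivity",
}

# Hypothetical prompt wording that bundles all five tasks into one request.
PROMPT_TEMPLATE = """You are a TRIZ expert. Read the patent background below.
1. Decide whether it states a technical contradiction.
2. Quote the sentences that express it.
3. Summarize the contradiction in one sentence.
4. Name the improving and the worsening parameter.
5. Map each one to one of these TRIZ engineering parameters: {parameters}
Answer with a JSON object using the keys has_contradiction, sentences,
summary, improving_parameter, worsening_parameter, improving_triz,
worsening_triz.

Patent background:
{text}
"""

REQUIRED_KEYS = {
    "has_contradiction", "sentences", "summary",
    "improving_parameter", "worsening_parameter",
    "improving_triz", "worsening_triz",
}


def build_prompt(text: str) -> str:
    """Fill the single prompt with the parameter list and the patent text."""
    params = "; ".join(
        f"{number}. {name}" for number, name in sorted(TRIZ_PARAMETERS.items())
    )
    return PROMPT_TEMPLATE.format(parameters=params, text=text)


def extract_contradiction(text: str, llm: Callable[[str], str]) -> dict:
    """Send one prompt to the model and validate the JSON it returns."""
    result = json.loads(llm(build_prompt(text)))
    missing = REQUIRED_KEYS - result.keys()
    if missing:
        raise ValueError(f"response is missing keys: {sorted(missing)}")
    return result


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call; returns a fixed, made-up answer."""
    return json.dumps({
        "has_contradiction": True,
        "sentences": ["Thicker walls increase strength but add weight."],
        "summary": "Raising strength increases the weight of the part.",
        "improving_parameter": "strength",
        "worsening_parameter": "weight",
        "improving_triz": 14,
        "worsening_triz": 1,
    })


result = extract_contradiction(
    "Thicker walls increase strength but add weight.", fake_llm
)
print(result["summary"], result["improving_triz"], result["worsening_triz"])
```

Passing the model as a plain callable keeps the sketch independent of any particular client library; in practice `fake_llm` would be replaced by a function that calls the chosen model and returns its text response.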

Computation and Language

What problem does this paper attempt to address?

This paper addresses how to effectively extract technical contradictions (TRIZ contradictions) from patent texts. Traditional patent retrieval and analysis methods are keyword-based, and they become insufficient as the number, length, and complexity of patents increase. The paper points out that, although some methods based on Transformer language models (such as BERT) exist for identifying inventive concepts in patents, extracting contradictions remains a challenge for them. The paper proposes a new approach that uses a large language model (LLM), specifically OpenAI's GPT-4, to extract TRIZ contradictions through prompt engineering. The study uses the existing patent dataset "PaGAN" to demonstrate GPT-4's ability to extract TRIZ contradictions from the "background" section of patents from the United States Patent and Trademark Office (USPTO). By comparing the GPT-4 output with the annotated sentences from PaGAN, the paper evaluates GPT-4's performance in extracting contradictions and reports an F1 score of 0.93. The paper also reviews existing techniques for extracting inventive concepts, especially contradictions: rule-based NLP methods, fine-tuned BERT models, and complex multi-stage approaches such as PaTRIZ. Although PaTRIZ performs well in certain respects, it still faces challenges in extracting contradictions. In contrast, the untuned GPT-4 model is shown to be a viable alternative. Overall, the paper aims to show how recent AI technologies, particularly LLMs, can automatically detect and extract technical contradictions from patent literature more effectively, overcoming the limitations of traditional methods.
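For readers unfamiliar with the metric, the following sketch shows how an F1 score can be computed when extracted sentences are compared against annotated ones. The exact evaluation protocol of the paper is not given in this summary, so the set-based matching by sentence index, the function name `precision_recall_f1`, and the example indices are assumptions for illustration only.

```python
from typing import Iterable, Tuple


def precision_recall_f1(
    predicted: Iterable[int], annotated: Iterable[int]
) -> Tuple[float, float, float]:
    """Compare predicted and annotated sentence indices as sets."""
    pred, gold = set(predicted), set(annotated)
    true_positives = len(pred & gold)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up example: the model selects sentences 1 and 2,
# while the annotators marked sentences 1, 2, 3, and 4.
precision, recall, f1 = precision_recall_f1([1, 2], [1, 2, 3, 4])
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

F1 is the harmonic mean of precision and recall, so a high value requires that the extraction is both accurate and complete.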

PaTRIZ: A framework for mining TRIZ contradictions in patents

TRIZ-GPT: An LLM-augmented method for problem-solving

AutoTRIZ: Artificial Ideation with TRIZ and Large Language Models

Latent Semantic Extraction and Analysis for TRIZ-based Inventive Design

A contradiction solving method for complex product conceptual design based on deep learning and technological evolution patterns

Comparing Complex Concepts with Transformers: Matching Patent Claims Against Natural Language Text

Accelerating Science TRIZ inventive methodology in illustrations

Extraction and linking of motivation, specification and structure of inventions for early design use

A TRIZ-based Trimming Method for Patent Design Around

The process for individuating TRIZ Inventive Principles: deterministic, stochastic or domain-oriented?

PatentGPT: A Large Language Model for Patent Drafting Using Knowledge-based Fine-tuning Method

A Framework of Product Innovative Design Process Based on TRIZ and Patent Circumvention

Large-Scale Text Analysis Using Generative Language Models: A Case Study in Discovering Public Value Expressions in AI Patents

Automated requirement contradiction detection through formal logic and LLMs

TRIZ and technical epistemology

Can Large Language Models Generate High-quality Patent Claims?

Automated patent extraction powers generative modeling in focused chemical spaces

A new method for extracting knowledge from patents to inspire designers during the problem-solving phase

InstructPatentGPT: training patent language models to follow instructions with human feedback