An empirical evaluation of using ChatGPT to summarize disputes for recommending similar labor and employment cases in Chinese

Po-Hsien Wu, Chao-Lin Liu, Wei-Jie Li
2024-09-14
Abstract: We present a hybrid mechanism for recommending similar cases in labor and employment litigation. The classifier determines the similarity of two cases from their itemized disputes, as prepared by the courts. We cluster the disputes, compute the cosine similarity between them, and use the results as features for the classification task. Experimental results indicate that this hybrid approach outperformed our previous system, which considered only the cluster information of the disputes. We then replaced the court-prepared disputes with itemized disputes generated by GPT-3.5 and GPT-4 and repeated the same experiments. Of the two models, GPT-4 led to better results. Although our classifier did not perform as well with the ChatGPT-generated disputes as with the court-prepared ones, the results were satisfactory. Hence, we hope that future large language models will become practically useful.
Computation and Language, Artificial Intelligence
What problem does this paper attempt to address?
The paper addresses two problems:

1. **Generating points of contention in labor and employment cases using ChatGPT**
   - The paper explores whether ChatGPT can summarize and list the points of contention in labor and employment litigation. Not all judgment documents contain clearly listed points of contention, which limits the scale of the research. The authors therefore use ChatGPT to generate these points of contention and expand the scale of the experiments.
2. **Improving the performance of the similar-case recommendation system**
   - The authors propose a hybrid mechanism for recommending similar labor and employment cases. The system judges the similarity of two cases using clustering and a convolutional neural network (CNN). It first clusters the points of contention, then computes the cosine similarity between the points of contention of the two cases, and uses the results as features for the classification task. The experimental results show that this hybrid method outperforms the authors' previous method, which relied only on the clustering of points of contention.

### Specific problem description

#### 1. Generating points of contention using ChatGPT

- **Background**: In the legal field, finding similar previous cases matters for judges' rulings and lawyers' arguments. However, many judgment documents do not clearly list the points of contention, which limits the applicability of a similar-case recommendation system.
- **Objective**: Use ChatGPT to generate points of contention so that more cases can be included in the experiments, alleviating the shortage of usable data.
- **Method**: The authors design a multi-step prompting strategy that guides ChatGPT to extract and list the points of contention from the statements of the plaintiff and the defendant.

#### 2. Improving the performance of the similar-case recommendation system

- **Background**: The previous recommendation system relied mainly on the clustering results of the points of contention to judge case similarity, and its effectiveness was limited.
- **Objective**: Develop a hybrid method that combines clustering with a convolutional neural network to improve the performance of the similar-case recommendation system.
- **Method**:
  - **Data preparation**: Obtain judgment documents from the Taiwan Judicial Yuan and select the labor and employment cases that contain points of contention.
  - **Text embedding**: Use Sentence-BERT to vectorize the points of contention, with Lawformer or Chinese RoBERTa as the pre-trained model.
  - **Feature extraction**: Compute the cosine similarity between the points of contention of the two cases, arrange the values in a matrix, and convert the matrix into a grayscale image.
  - **Classification training**: Use a convolutional neural network with a max-pooling layer to classify the grayscale image and decide whether the two cases are similar.

### Experimental results

- **Effect of ChatGPT in generating points of contention**: The points of contention generated by GPT-4 lead to better results than those generated by GPT-3.5, but both fall short of the points of contention taken from the original judgment documents. The results are nevertheless satisfactory, suggesting that future large language models may become practically useful.
- **Classifier performance**: The hybrid method improves the performance of the similar-case recommendation system over the clustering-only baseline, especially when using fine-tuned Lawformer and Chinese RoBERTa.

### Conclusion

- The authors show that ChatGPT can generate points of contention, which widens the set of cases a similar-case recommendation system can cover, although court-prepared points of contention still give better results.
- The hybrid method combining clustering and a convolutional neural network outperforms the earlier clustering-only recommendation system.
- Future research can further explore how to make better use of large language models in the legal field.
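
The feature-extraction step described above (cosine similarities between the points of contention of two cases, arranged as a matrix and rendered as a grayscale image) can be illustrated with a minimal sketch. This is not the authors' implementation: random vectors stand in for Sentence-BERT embeddings, and the function names, the 8×8 image size, the zero-padding, and the clipping of negative similarities to zero are all assumptions made for illustration.

```python
import numpy as np


def cosine_similarity_matrix(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity between the rows of a (m, d) and b (n, d)."""
    a_unit = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_unit = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a_unit @ b_unit.T


def to_grayscale(sim: np.ndarray, size: int = 8) -> np.ndarray:
    """Turn a similarity matrix into a fixed-size uint8 grayscale image.

    Assumptions (not from the paper): negative similarities are clipped to 0,
    the range [0, 1] is scaled to [0, 255], and the result is zero-padded
    (or cropped) to size x size so every case pair yields the same shape.
    """
    pixels = np.rint(np.clip(sim, 0.0, 1.0) * 255.0).astype(np.uint8)
    image = np.zeros((size, size), dtype=np.uint8)
    rows = min(size, pixels.shape[0])
    cols = min(size, pixels.shape[1])
    image[:rows, :cols] = pixels[:rows, :cols]
    return image


# Hypothetical stand-ins for sentence embeddings of the disputes:
# case A lists 3 points of contention, case B lists 5.
rng = np.random.default_rng(0)
case_a = rng.normal(size=(3, 16))
case_b = rng.normal(size=(5, 16))

sim = cosine_similarity_matrix(case_a, case_b)  # shape (3, 5)
image = to_grayscale(sim)                       # shape (8, 8), dtype uint8
```

An image of this kind would then be the input to a CNN classifier. Padding to a fixed size is one simple way to handle cases with different numbers of disputes; the summary does not say how the authors handle this.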