Large Language Models Enable Few-Shot Clustering

Vijay Viswanathan, Kiril Gashteovski, Carolin Lawrence, Tongshuang Wu, Graham Neubig

2023-07-02

Abstract: Unlike traditional unsupervised clustering, semi-supervised clustering allows users to provide meaningful structure to the data, which helps the clustering algorithm to match the user's intent. Existing approaches to semi-supervised clustering require a significant amount of feedback from an expert to improve the clusters. In this paper, we ask whether a large language model can amplify an expert's guidance to enable query-efficient, few-shot semi-supervised text clustering. We show that LLMs are surprisingly effective at improving clustering. We explore three stages where LLMs can be incorporated into clustering: before clustering (improving input features), during clustering (by providing constraints to the clusterer), and after clustering (using LLMs for post-correction). We find that incorporating LLMs in the first two stages can routinely provide significant improvements in cluster quality, and that LLMs enable a user to make trade-offs between cost and accuracy to produce desired clusters. We release our code and LLM prompts for the public to use.

Computation and Language

What problem does this paper attempt to address?

This paper asks how large language models (LLMs) can enable few-shot semi-supervised text clustering, reducing the amount of expert feedback required while improving cluster quality. Traditional unsupervised clustering often fails to meet the specific needs of domain experts because it organizes the data without any guidance about the user's intent. Existing semi-supervised clustering methods do accept expert feedback, but they typically require a large amount of it, which is costly and inefficient in practice. The paper therefore proposes using LLMs to amplify a small amount of expert guidance, so that the clustering algorithm can produce high-quality clusters with minimal feedback.

The paper explores integrating LLMs at three stages of the clustering process:

1. **Pre-clustering**: Enhancing text representations by generating key phrases.
2. **During clustering**: Guiding the clustering algorithm by providing pairwise constraints.
3. **Post-clustering**: Improving clustering results by correcting low-confidence cluster assignments.

Experimental results show that using LLMs in the first two stages (pre-clustering and during clustering) can significantly improve clustering quality and, on certain tasks, achieve results close to traditional semi-supervised clustering methods at a much lower cost. Post-clustering correction, however, yields only limited gains.
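To make the "during clustering" stage more concrete, the following is a minimal sketch of how yes/no pairwise judgements could be turned into must-link and cannot-link constraints. It is an illustration under stated assumptions, not the authors' released code: `collect_constraints`, `must_link_groups`, and `toy_oracle` are hypothetical names, and `toy_oracle` is a keyword rule standing in for a real LLM prompt that would ask whether two texts belong to the same cluster.

```python
from itertools import combinations
from typing import Callable, Dict, List, Set, Tuple

Pair = Tuple[int, int]


def collect_constraints(
    texts: List[str],
    oracle: Callable[[str, str], bool],
    candidate_pairs: List[Pair],
) -> Tuple[List[Pair], List[Pair]]:
    """Ask the oracle about each candidate pair and split the answers
    into must-link (same cluster) and cannot-link (different cluster)."""
    must_link: List[Pair] = []
    cannot_link: List[Pair] = []
    for i, j in candidate_pairs:
        if oracle(texts[i], texts[j]):
            must_link.append((i, j))
        else:
            cannot_link.append((i, j))
    return must_link, cannot_link


def must_link_groups(n: int, must_link: List[Pair]) -> List[Set[int]]:
    """Take the transitive closure of must-link pairs with union-find,
    so that linked points can be treated as one group by a clusterer."""
    parent = list(range(n))

    def find(x: int) -> int:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in must_link:
        parent[find(i)] = find(j)

    groups: Dict[int, Set[int]] = {}
    for idx in range(n):
        groups.setdefault(find(idx), set()).add(idx)
    return list(groups.values())


def toy_oracle(a: str, b: str) -> bool:
    """ASSUMPTION: stand-in for an LLM call. A real oracle would prompt
    an LLM with both texts plus a few expert-labelled example pairs."""
    def topic(s: str) -> str:
        return "bank" if ("account" in s or "card" in s) else "travel"
    return topic(a) == topic(b)


texts = [
    "How do I close my account?",
    "My card was declined",
    "Book a flight to Paris",
    "Cancel my hotel reservation",
]
# Querying every pair is only feasible for tiny inputs; in practice a
# small budget of informative pairs would be selected instead.
pairs = list(combinations(range(len(texts)), 2))
must_link, cannot_link = collect_constraints(texts, toy_oracle, pairs)
groups = must_link_groups(len(texts), must_link)
```

The resulting constraint lists are the kind of input a constrained clustering algorithm consumes. Which algorithm to use, and how to choose the pairs to query, are design decisions this sketch leaves open.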
