AdaptGCD: Multi-Expert Adapter Tuning for Generalized Category Discovery

Yuxun Qu, Yongqiang Tang, Chenyang Zhang, Wensheng Zhang
2024-10-29
Abstract: Unlike the traditional semi-supervised learning paradigm, which is constrained by the closed-world assumption, Generalized Category Discovery (GCD) presumes that the unlabeled dataset contains new categories that do not appear in the labeled set, and aims not only to classify old categories but also to discover new categories in the unlabeled data. Existing studies on GCD are typically devoted to transferring the general knowledge of a self-supervised pretrained model to the target GCD task via fine-tuning strategies such as partial tuning and prompt learning. Nevertheless, these fine-tuning methods fail to strike a sound balance between the generalization capacity of the pretrained backbone and the adaptability to the GCD task. To fill this gap, we propose a novel adapter-tuning-based method named AdaptGCD, which is the first work to introduce adapter tuning into the GCD task and provides some key insights expected to enlighten future research. Furthermore, considering the discrepancy in supervision information between the old and new classes, a multi-expert adapter structure equipped with a route assignment constraint is devised, such that the data from old and new classes are separated into different expert groups. Extensive experiments are conducted on 7 widely used datasets. The remarkable improvements in performance highlight the effectiveness of our proposals.
Computer Vision and Pattern Recognition, Artificial Intelligence
What problem does this paper attempt to address?
This paper addresses **the challenges of the Generalized Category Discovery (GCD) task**. GCD assumes that the unlabeled dataset contains new categories that do not appear in the labeled dataset, and aims both to classify the old categories and to discover the new categories in the unlabeled data. Existing GCD research usually transfers the general knowledge of self-supervised pretrained models to the target GCD task through fine-tuning strategies such as partial tuning and prompt learning, but these methods fail to balance the generalization ability of the pretrained backbone against its adaptability to the GCD task.

### Main problem description in the paper

1. **Limitations of traditional semi-supervised learning**:
   - The traditional semi-supervised learning paradigm is limited by the closed-world assumption, that is, the unlabeled data are assumed to contain only categories that already appear in the labeled set.
   - The GCD task relaxes this assumption and allows the unlabeled dataset to contain new categories.
2. **Deficiencies of existing methods**:
   - Existing GCD methods mainly transfer general knowledge by fine-tuning self-supervised pretrained models, without a good balance between preserving the pretrained model's generalization ability and adapting to the GCD task.
   - Partial tuning may destroy the knowledge of the pretrained model, while prompt learning performs poorly when the data quality is poor and has limited adaptability.
3. **Information imbalance problem**:
   - In the GCD task, there is more supervision information for old categories than for new categories, which biases the model towards old categories and hinders the discovery of new categories.
### Solutions

To solve the above problems, the paper proposes a new method named **AdaptGCD**, which introduces adapter tuning into GCD and designs a multi-expert adapter (MEA) structure together with a route assignment constraint. The specific contributions are as follows:

1. **First introduction of adapter tuning into the GCD task**:
   - The parameters of the pretrained model are kept unchanged and only a small number of task-specific parameters are added, which enhances adaptability to the GCD task while retaining the pretrained knowledge.
2. **Multi-expert adapter structure (MEA)**:
   - A routing layer dispatches old-class and new-class data to different experts, reducing the interference between the two types of data.
3. **Route assignment constraint**:
   - A route assignment constraint ensures that old-class and new-class data are assigned to different expert groups, further alleviating the information imbalance problem.

### Experimental results

The paper reports extensive experiments on 7 widely used datasets. The results show that AdaptGCD improves performance on the GCD task, supporting the effectiveness of the method.

### Formula summary

- **Forward propagation formula of the bottleneck module**:
  \[
  \Delta x^{l,i} = s \cdot \text{ReLU}\left(e^{x^{l,i}} W_{\text{down}}^{\top} + b_{\text{down}}\right) W_{\text{up}}^{\top} + b_{\text{up}}
  \]
- **Routing function of the multi-expert adapter**:
  \[
  \omega^{l,i} = \text{Softmax}\left(\frac{e^{x^{l,i}} W_{\text{route}}^{\top}}{\tau_r}\right)
  \]
- **Total loss function**:
  \[
  L_{\text{overall}} = L_{\text{gcd}} + L_{\text{ra}}
  \]
  where \(L_{\text{ra}}\) is the route assignment loss, which is composed of the balanced load loss \(L_{\text{bl}}\) and the partial balanced load loss \(L_{\text{pbl}}\).
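The bottleneck and routing formulas above can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The function names (`linear`, `softmax`, `bottleneck`, `multi_expert_adapter`, `init_expert`), the toy dimensions, and the random initialization are all hypothetical. The input term of both formulas is treated here as a plain feature vector `x`, and the expert outputs are combined by a routing-weighted sum. Both choices are assumptions, since the summary gives only the routing weights and does not say how the expert outputs are merged.

```python
import math
import random


def linear(x, W, b=None):
    """Return x W^T (+ b) for a vector x and a matrix W given as a list of rows."""
    out = [sum(xi * wi for xi, wi in zip(x, row)) for row in W]
    if b is not None:
        out = [o + bi for o, bi in zip(out, b)]
    return out


def softmax(z, tau=1.0):
    """Temperature-scaled softmax; subtracting the max keeps exp() stable."""
    scaled = [v / tau for v in z]
    m = max(scaled)
    exps = [math.exp(v - m) for v in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def bottleneck(x, expert, s=1.0):
    """Delta x = s * ReLU(x W_down^T + b_down) W_up^T + b_up."""
    hidden = [max(0.0, h) for h in linear(x, expert["W_down"], expert["b_down"])]
    up = linear(hidden, expert["W_up"])
    # The scale s applies to the projected term only; b_up is added unscaled.
    return [s * u + b for u, b in zip(up, expert["b_up"])]


def multi_expert_adapter(x, experts, W_route, tau_r=1.0, s=1.0):
    """Compute routing weights omega and mix the expert outputs.

    Assumption: outputs are merged by an omega-weighted sum.
    """
    omega = softmax(linear(x, W_route), tau=tau_r)
    deltas = [bottleneck(x, e, s) for e in experts]
    mixed = [sum(w * d[j] for w, d in zip(omega, deltas)) for j in range(len(x))]
    return mixed, omega


def init_expert(d, r, rng):
    """Hypothetical random initializer for one expert (d -> r -> d)."""
    return {
        "W_down": [[rng.gauss(0.0, 0.1) for _ in range(d)] for _ in range(r)],
        "b_down": [0.0] * r,
        "W_up": [[rng.gauss(0.0, 0.1) for _ in range(r)] for _ in range(d)],
        "b_up": [0.0] * d,
    }


rng = random.Random(0)
d, r, num_experts = 8, 2, 4  # toy sizes, chosen only for illustration
experts = [init_expert(d, r, rng) for _ in range(num_experts)]
W_route = [[rng.gauss(0.0, 0.1) for _ in range(d)] for _ in range(num_experts)]
x = [rng.gauss(0.0, 1.0) for _ in range(d)]

delta, omega = multi_expert_adapter(x, experts, W_route, tau_r=0.5, s=0.1)
print("routing weights:", [round(w, 3) for w in omega])
print("adapter output length:", len(delta))
```

Lowering `tau_r` sharpens the softmax, so more of the routing mass concentrates on a single expert, while a large `tau_r` pushes the weights towards uniform.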
Through these designs, AdaptGCD targets the key challenges of the GCD task, and the authors expect its insights to inform future research.