Differentially Private Parameter-Efficient Fine-tuning for Large ASR Models

Hongbin Liu, Lun Wang, Om Thakkar, Abhradeep Thakurta, Arun Narayanan
2024-10-03
Abstract: Large ASR models can inadvertently leak sensitive information, which can be mitigated by formal privacy measures like differential privacy (DP). However, traditional DP training is computationally expensive, and can hurt model performance. Our study explores DP parameter-efficient fine-tuning as a way to mitigate privacy risks with smaller computation and performance costs for ASR models. Through extensive experimentation and progressive optimization, we achieve 4.6%/8.1% word error rate on LibriSpeech clean/other test-sets, setting a new performance benchmark while maintaining (10, 3.52e-6)-DP in fine-tuning a large ASR model with over 600M parameters.
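For reference, the guarantee written as (10, 3.52e-6)-DP in the abstract uses the standard (ε, δ) notation for differential privacy, with ε = 10 and δ = 3.52e-6. The usual textbook definition is sketched below; the symbols M, D, D', and S are introduced here for illustration, and the abstract does not say which notion of "neighboring" datasets the authors use.

$$
\Pr[M(D) \in S] \;\le\; e^{\varepsilon} \, \Pr[M(D') \in S] + \delta
\quad \text{for all neighboring datasets } D, D' \text{ and all measurable sets } S,
$$

where M is the randomized training procedure and S ranges over sets of possible outputs. Smaller ε and δ mean a stronger guarantee.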
Cryptography and Security
What problem does this paper attempt to address?
This paper addresses the problem that large automatic speech recognition (ASR) models may leak sensitive information from their training data. Traditional differential privacy (DP) training can protect privacy, but it is computationally costly and may degrade model performance. The paper therefore explores differentially private parameter-efficient fine-tuning (DP-PEFT) as a way to protect privacy in ASR models while reducing both the computational cost and the performance loss.

### Main Research Contents

1. **Background and Motivation**:
   - **Development of ASR Models**: In recent years, ASR technology based on large pre-trained models has made remarkable progress. These models perform well on speech recognition tasks and are widely used in multi-modal understanding and large language models.
   - **Privacy Issues**: Research shows that large ASR models may inadvertently memorize rare or unique samples in their fine-tuning data, which creates the need for privacy-protection techniques.
   - **Limitations of Traditional DP Methods**: Traditional DP training methods such as DP-SGD reduce model performance and increase computational cost when applied to large models.
2. **Research Methods**:
   - **DP-PEFT Method**: This paper presents the first comprehensive study of DP-PEFT for ASR model fine-tuning. Through extensive experiments and optimization, it compares different DP-PEFT methods (such as DP-BitFit, DP-LoRA, and DP-Compacter).
   - **Optimization Strategies**: The authors conduct detailed ablation studies and optimize how existing DP-PEFT methods are applied to ASR models, for example by choosing which bias terms to tune and by comparing parameter initialization strategies.
   - **Application of Synthetic Data**: The paper proposes using low-quality synthetic audio data to improve the DP-BitFit initialization, which further improves model performance.
3. **Experimental Results**:
   - **Performance Comparison**: On the LibriSpeech test sets, models fine-tuned with DP-PEFT achieve a low word error rate (WER) while satisfying (10, 3.52e-6)-DP. Among them, DP-BitFit performs best, reaching a WER of 4.6% (clean) and 8.1% (other).
   - **Computational Efficiency**: Compared with the traditional DP-FT method, DP-PEFT achieves better performance under the same computational resources.

### Conclusion

Through extensive experiments and optimization, this paper demonstrates that DP-PEFT is effective for fine-tuning large ASR models. In particular, DP-BitFit provides a formal privacy guarantee while keeping computational cost low and performance high. In addition, pre-training on low-quality synthetic data further improves model performance, which suggests a practical direction for privacy protection in real applications.

### Limitations

Although the paper achieves strong results for differentially private ASR models, a performance gap remains compared with models trained without privacy protection. Future research needs to explore how to further reduce the performance loss while preserving the privacy guarantee.
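To make the DP-BitFit idea from the research-methods summary more concrete, here is a minimal NumPy sketch of one DP-SGD-style update that touches only a bias vector and leaves the weight matrix frozen. It is an illustration under stated assumptions, not the authors' implementation: the toy linear model, the squared-error loss, and the names `clip_per_example`, `dp_bitfit_step`, `clip_norm`, and `noise_multiplier` are all hypothetical. The sketch also does no privacy accounting, so it does not compute the resulting (ε, δ).

```python
import numpy as np


def clip_per_example(grads: np.ndarray, clip_norm: float) -> np.ndarray:
    """Scale each row of `grads` so that its L2 norm is at most `clip_norm`."""
    norms = np.linalg.norm(grads, axis=1, keepdims=True)
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    return grads * scale


def dp_bitfit_step(W, b, x, y, lr, clip_norm, noise_multiplier, rng):
    """One noisy, clipped gradient step on the bias `b` only (W is frozen).

    Toy model (an assumption for illustration): pred = x @ W.T + b,
    with per-example loss ||pred - y||^2.
    """
    residual = x @ W.T + b - y              # shape (batch, out_dim)
    per_example_grads = 2.0 * residual      # gradient of the loss w.r.t. b
    clipped = clip_per_example(per_example_grads, clip_norm)
    # Gaussian noise with standard deviation noise_multiplier * clip_norm
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=b.shape)
    noisy_mean = (clipped.sum(axis=0) + noise) / x.shape[0]
    return b - lr * noisy_mean              # only the bias is updated


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W = rng.normal(size=(3, 5))             # frozen "pre-trained" weights
    b = np.zeros(3)                         # trainable bias
    x = rng.normal(size=(8, 5))
    y = rng.normal(size=(8, 3))
    b = dp_bitfit_step(W, b, x, y, lr=0.1, clip_norm=1.0,
                       noise_multiplier=1.1, rng=rng)
    print(b.shape)
```

Because only the bias receives a gradient, the per-example gradients that must be stored and clipped have the size of the bias vector rather than of the full model, which is the intuition behind the lower cost of bias-only DP fine-tuning compared with DP training of all parameters.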