Stable Preference: Redefining Training Paradigm of Human Preference Model for Text-to-Image Synthesis

Hanting Li, Hongjing Niu, Feng Zhao
DOI: https://doi.org/10.1007/978-3-031-73390-1_15
2024-01-01
Abstract: In recent years, deep generative models have developed rapidly and can generate high-quality images based on input texts. Assessing the quality of generated images in a way consistent with human preferences is critical for both generative model evaluation and preferred image selection. Previous works aligned models with human preferences by training scoring models on image pairs with preference annotations (e.g., ImageReward and HPD). These carefully annotated image pairs describe human preferences for choosing images well. However, the current training paradigm of these preference models is to directly maximize the score of the preferred image while minimizing the score of the non-preferred image in each pair through a cross-entropy loss. This simple and naive training paradigm has two main problems: 1) For image pairs of similar quality, blindly minimizing the score of the non-preferred image is unreasonable and can easily lead to overfitting. 2) Human robustness to small visual perturbations is not taken into account, so the final model is unable to make stable choices. Therefore, we propose Stable Preference to redefine the training paradigm of human preference models, together with an anti-interference loss to improve robustness to visual disturbances. Our method achieves state-of-the-art performance on two popular text-to-image human preference datasets. Extensive ablation studies and visualizations demonstrate the rationality and effectiveness of our method.
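The baseline paradigm criticized in the abstract can be illustrated with a minimal sketch. It assumes the common formulation in which the two scores of a pair are passed through a two-way softmax and the preferred image is the target class; the abstract does not give the exact form, and the function name `pairwise_cross_entropy` and the example scores are hypothetical. This is not the authors' code, and it does not implement Stable Preference or the anti-interference loss.

```python
import math


def pairwise_cross_entropy(score_preferred: float, score_rejected: float) -> float:
    """Cross-entropy over a two-way softmax of the pair's scores,
    with the preferred image as the target class (assumed baseline form)."""
    margin = score_preferred - score_rejected
    # -log softmax(preferred) = log(1 + exp(-margin)), evaluated in a
    # numerically stable way for both signs of the margin.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Hypothetical scores: the loss keeps shrinking as the gap between the two
# scores grows, so it keeps pushing the non-preferred score down regardless
# of how similar the two images actually are.
for gap in (0.0, 1.0, 2.0, 4.0):
    print(f"gap={gap:.1f}  loss={pairwise_cross_entropy(gap, 0.0):.4f}")
```

Because this loss is strictly decreasing in the score gap, it never stops rewarding a larger separation, which is the behavior the abstract's first problem refers to for pairs of similar quality.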