Human-in-the-Loop Mixup

Katherine M. Collins, Umang Bhatt, Weiyang Liu, Vihari Piratla, Ilia Sucholutsky, Bradley Love, Adrian Weller
2023-07-30
Abstract: Aligning model representations to humans has been found to improve robustness and generalization. However, such methods often focus on standard observational data. Synthetic data is proliferating and powering many advances in machine learning; yet, it is not always clear whether synthetic labels are perceptually aligned to humans -- rendering it likely model representations are not human aligned. We focus on the synthetic data used in mixup: a powerful regularizer shown to improve model robustness, generalization, and calibration. We design a comprehensive series of elicitation interfaces, which we release as HILL MixE Suite, and recruit 159 participants to provide perceptual judgments along with their uncertainties, over mixup examples. We find that human perceptions do not consistently align with the labels traditionally used for synthetic points, and begin to demonstrate the applicability of these findings to potentially increase the reliability of downstream models, particularly when incorporating human uncertainty. We release all elicited judgments in a new data hub we call H-Mix.
Machine Learning, Computer Vision and Pattern Recognition, Human-Computer Interaction
What problem does this paper attempt to address?
The core question this paper addresses is **whether the synthetic data used in machine learning is consistent with human perception**. Specifically, the authors ask whether the labels assigned to synthetic examples in mixup match human perceptual judgments. Mixup generates new training data by linearly combining pairs of training samples and their labels, and has been shown to improve model robustness, generalization, and calibration. However, the labels of these synthetic examples may not match how humans actually perceive the synthetic images, which may leave model representations misaligned with humans.

To explore this issue, the authors designed a series of elicitation experiments, collected perceptual judgments from 159 participants, and used those judgments to evaluate how well synthetic labels align with human perception. The main research questions are:

1. **Which synthetic image x̃ do participants think best matches a given label ỹ?**
2. **Given a synthetic image x̃, how do humans perceive its label ỹ?**

The authors collected these data in two ways:

- **Constructing midpoints**: Participants select, from a set of pre-constructed linearly interpolated images, the image that best matches a specific mixing ratio (for example, 50/50).
- **Inferring labels**: Participants directly assign labels to synthetic images and report their uncertainty about each judgment.

The results show that, although human perception is broadly aligned with the process that generates the synthetic data, there are large differences among individuals, especially for certain category combinations. In addition, human estimates of the mixing coefficient often follow a non-linear, sigmoid-shaped relationship with the coefficient used to generate the image, which is inconsistent with the linear labels that mixup traditionally assumes.
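The contrast between linear mixup labels and a sigmoid-shaped perceptual response can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function names, the logistic form centered at 0.5, and the `steepness` value are all assumptions chosen for the example and are not taken from the paper or from H-Mix.

```python
import numpy as np


def mixup_pair(x_a, x_b, y_a, y_b, lam):
    """Standard mixup: one coefficient `lam` mixes both inputs and labels."""
    x_mix = lam * x_a + (1.0 - lam) * x_b
    y_mix = lam * y_a + (1.0 - lam) * y_b
    return x_mix, y_mix


def sigmoid_relabel(lam, steepness=8.0):
    """Hypothetical perceptual mapping (an assumption for illustration).

    Maps the generating coefficient to a 'perceived' coefficient with a
    logistic curve centered at 0.5. The steepness is not a fitted value.
    """
    return 1.0 / (1.0 + np.exp(-steepness * (lam - 0.5)))


def perceptual_mixup_pair(x_a, x_b, y_a, y_b, lam, steepness=8.0):
    """Mix inputs linearly, but mix labels with the relabeled coefficient."""
    x_mix = lam * x_a + (1.0 - lam) * x_b
    lam_label = sigmoid_relabel(lam, steepness)
    y_mix = lam_label * y_a + (1.0 - lam_label) * y_b
    return x_mix, y_mix


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x_a, x_b = rng.random((32, 32, 3)), rng.random((32, 32, 3))
    y_a, y_b = np.array([1.0, 0.0]), np.array([0.0, 1.0])  # one-hot labels

    _, y_linear = mixup_pair(x_a, x_b, y_a, y_b, lam=0.8)
    _, y_percept = perceptual_mixup_pair(x_a, x_b, y_a, y_b, lam=0.8)
    print("linear label:    ", y_linear)
    print("relabeled (toy): ", y_percept)
```

In standard mixup training the coefficient is usually sampled from a Beta(α, α) distribution rather than fixed. Under the toy mapping here, an image generated with a coefficient of 0.8 receives a label weight above 0.8 for the dominant class, which is one way a sigmoid-shaped response would depart from the linear label.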
These findings suggest that adjusting the labels of synthetic data to better reflect human perception may improve downstream models. In conclusion, the paper provides experimental evidence that the labels traditionally used for mixup examples do not consistently match human perception, and it begins to explore how human perceptual judgments and uncertainty can be incorporated into training to make downstream models more reliable.