AutoML in The Wild: Obstacles, Workarounds, and Expectations

Yuan Sun, Qiurong Song, Xinning Gui, Fenglong Ma, Ting Wang
DOI: https://doi.org/10.1145/3544548.3581082
2024-04-04
Abstract: Automated machine learning (AutoML) is envisioned to make ML techniques accessible to ordinary users. Recent work has investigated the role of humans in enhancing AutoML functionality throughout a standard ML workflow. However, it is also critical to understand how users adopt existing AutoML solutions in complex, real-world settings from a holistic perspective. To fill this gap, this study conducted semi-structured interviews of AutoML users (N=19) focusing on understanding (1) the limitations of AutoML encountered by users in their real-world practices, (2) the strategies users adopt to cope with such limitations, and (3) how the limitations and workarounds impact their use of AutoML. Our findings reveal that users actively exercise user agency to overcome three major challenges arising from customizability, transparency, and privacy. Furthermore, users make cautious decisions about whether and how to apply AutoML on a case-by-case basis. Finally, we derive design implications for developing future AutoML solutions.
Human-Computer Interaction, Artificial Intelligence
What problem does this paper attempt to address?
This paper examines the challenges and limitations of AutoML (Automated Machine Learning) in practical applications. Through semi-structured interviews with 19 AutoML users from different fields, it aims to understand three aspects:

1. **Limitations of AutoML**: The limitations users encounter in real-world practice, including insufficient customization capabilities, lack of transparency, and privacy issues.
2. **Coping Strategies**: The strategies users adopt to cope with these limitations.
3. **Impact on Use**: How these limitations and coping strategies affect users' use of AutoML.

### Main Findings

#### 1. Insufficient Customization Capabilities

- **Contextualizing Background Data**: Users improve the adaptability of AutoML by adding "contextual cues" to the input data. For example, in a user experience study of voice self-tracking applications, P7 improved the system's performance by having participants provide more specific contextual information (such as "7 to 9 AM" instead of "7 to 9").
- **Integrating Domain Knowledge**: Users incorporate the knowledge of industry experts into the optimization goals of AutoML. For example, P13, working with enterprises in traditional industries, translated domain knowledge such as inventory levels in supply chain management into model design goals.
- **Building In-house AutoML Tools**: Users develop their own AutoML tools to support specific data types. For example, P11's company has developed an AutoML platform that supports tabular data, while mainstream platforms usually do not support this data type.

#### 2. Lack of Transparency

- **Manually Verifying AutoML Results**: Users manually verify the results generated by AutoML to ensure their accuracy. For example, P2, P10, and P16 all manually check the output of AutoML.
- **Tracking the AutoML Process**: Users improve transparency by tracking the training process of AutoML. For example, P3 and P8 monitor changes in the learning curves during training, validation, and testing.
- **Creating Custom Visualization Tools**: Users develop custom visualization tools to better understand how AutoML operates. For example, P1, P2, P3, P5, P13, and P15 all use custom-made visualization tools.

#### 3. Privacy Issues

- **Eradicating Privacy Leaks**: Users take measures to prevent data leakage. For example, P6, P10, P13, P15, P17, P18, and P19 all emphasize this point.
- **Applying Privacy-Protection Technologies**: Users apply privacy-protection technologies to sensitive data. For example, P3 and P9 use technologies such as differential privacy.
- **Entrusting Legal Supervision**: Users rely on legal and compliance teams to handle privacy issues. For example, P1 and P13 hand privacy issues over to the legal team.
- **Selecting Trustworthy Platforms**: Users choose well-reputed AutoML platforms to reduce privacy risks. For example, P2, P4, P5, P8, P9, and P17 all choose trustworthy platforms.

### Conclusion

Through in-depth analysis of users' actual usage, the paper reveals the main challenges of AutoML in practical applications and documents the coping strategies users adopt. These findings help explain how users apply AutoML in complex real-world environments and inform the design of future AutoML solutions.