The Importance of Generalizability in Machine Learning for Systems

Varun Gohil, Sundar Dev, Gaurang Upasani, David Lo, Parthasarathy Ranganathan, Christina Delimitrou
DOI: https://doi.org/10.1109/lca.2024.3384449
IF: 2.3
2024-05-03
IEEE Computer Architecture Letters
Abstract: Using machine learning (ML) to tackle computer systems tasks is gaining popularity. One of the shortcomings of such ML-based approaches is the inability of models to generalize to out-of-distribution data, i.e., data whose distribution differs from that of the training dataset. We showcase that this issue exists in cloud environments by analyzing various ML models used to improve resource balance in Google's fleet. We discuss the trade-offs associated with different techniques used to detect out-of-distribution data. Finally, we propose and demonstrate the efficacy of using Bayesian models to detect the model's confidence in its output when used to improve cloud server resource balance.
computer science, hardware & architecture
What problem does this paper attempt to address?