Modeling and Predicting Power Consumption of High Performance Computing Jobs

Curtis Storlie, Joe Sexton, Scott Pakin, Michael Lang, Brian Reich, William Rust
DOI: https://doi.org/10.48550/arXiv.1412.5247
2014-12-17
Applications
Abstract: Power is becoming an increasingly important concern for large supercomputing centers. Due to cost concerns, data centers are becoming increasingly limited in their ability to enhance their power infrastructure to support increased compute power on the machine-room floor. At Los Alamos National Laboratory it is projected that future-generation supercomputers will be power-limited rather than budget-limited. That is, it will be less costly to acquire a large number of nodes than it will be to upgrade an existing data-center and machine-room power infrastructure to run that large number of nodes at full power. In the power-limited systems of the future, machines will in principle be capable of drawing more power than they have available. Thus, power capping at the node/job level must be used to ensure the total system power draw remains below the available level. In this paper, we present a statistically grounded framework with which to predict (with uncertainty) how much power a given job will need and use these predictions to provide an optimal node-level power capping strategy. We model the power drawn by a given job (and subsequently by the entire machine) using hierarchical Bayesian modeling with hidden Markov and Dirichlet process models. We then demonstrate how this model can be used inside a power-management scheme to minimize the effect of power capping on user jobs.