Beyond Desktop Computation: Challenges in Scaling a GPU Infrastructure

Martin Uray, Eduard Hirsch, Gerold Katzinger, Michael Gadermayr
DOI: https://doi.org/10.1007/978-3-658-36295-9_11
2022-01-01
Abstract: Enterprises and labs running computationally expensive data science applications sooner or later face the problem of scaling an infrastructure of unconnected machines. For this upscaling process, an IT service provider can be hired, or in-house personnel can attempt to implement a software stack. The first option can be quite expensive if the task is merely to connect several machines. For the second option, the data science staff often lack the experience needed to navigate the software jungle. In this technical report, we illustrate the decision process towards an on-premises infrastructure, our implemented system architecture, and the transformation of the software stack towards a scalable Graphics Processing Unit (GPU) cluster system.