Scalable Systems and Software Architectures for High-Performance Computing on cloud platforms

Risshab Srinivas Ramesh
2024-08-19
Abstract: High-performance computing (HPC) is essential for tackling complex computational problems across many domains. As HPC applications grow in scale and complexity, the need for scalable systems and software architectures becomes paramount. This paper gives an overview of on-premises HPC architecture, covering both hardware and software, and details the challenges of building an on-premises HPC cluster. It then explores design principles, challenges, and emerging trends in building scalable HPC systems and software on various cloud platforms, addressing issues such as parallelism, memory hierarchy, communication overhead, and fault tolerance. By synthesizing research findings and technological advances, the paper aims to offer insight into scalable solutions for meeting the evolving demands of HPC applications in the cloud.
Distributed, Parallel, and Cluster Computing; Performance