Finally, how many efficiencies supercomputers have? And, what do they measure?

János Végh

DOI: https://doi.org/10.48550/arXiv.2001.01266

2022-07-11

Abstract: Using an extremely large number of processing elements in computing systems leads to unexpected phenomena, such as different efficiencies of the same system for different tasks, that cannot be explained in the frame of the classical computing paradigm. The simple non-technical model introduced here, which nevertheless considers the temporal behavior of the components, enables us to set up the frame and formalism needed to explain those unexpected experiences around supercomputing. Introducing temporal behavior into computer science also explains why only extreme-scale computing enabled us to reveal the experienced limitations. The paper shows that the degradation of efficiency of parallelized sequential systems is a natural consequence of the classical computing paradigm rather than an engineering imperfection. The workload that supercomputers run is largely responsible for wasting energy, as well as for limiting the size and type of tasks. Case studies provide insight into how different contributions compete for dominating the resulting payload performance of a computing system, and how enhancing the interconnection technology made computing+communication dominate in defining the efficiency of supercomputers. Our model also enables us to derive predictions about supercomputer performance limitations for the near future, and it provides hints for enhancing supercomputer components. Phenomena experienced in large-scale computing show interesting parallels with phenomena experienced in science more than a century ago, through the study of which modern science was developed.

Performance

What problem does this paper attempt to address?

This paper addresses the performance efficiency of supercomputers running large-scale parallel tasks. Specifically, it examines why the same supercomputer exhibits different efficiencies for different tasks, a phenomenon that the classical computing paradigm cannot explain. The author proposes a simple, non-technical model based on the temporal behavior of components, and uses it to establish a framework and formalism for explaining these unexpected phenomena. Introducing temporal behavior into computer science also explains why these limitations only became visible at extreme scale.

The main contributions of the paper are:

1. **Proposing a model**: Introducing a simplified model that accounts for the temporal behavior of components to explain unexpected phenomena in supercomputing.
2. **Efficiency degradation**: Showing that the efficiency degradation of parallelized sequential systems is a natural consequence of the classical computing paradigm rather than an engineering defect.
3. **Impact of workload**: Analyzing how the workloads that supercomputers run contribute to wasted energy and limit the size and type of tasks.
4. **Predicting the future**: Using the model to predict near-future limitations of supercomputer performance and to offer hints for enhancing supercomputer components.
5. **Historical analogy**: Pointing out that phenomena emerging in large-scale computing have interesting parallels with phenomena observed in science more than a century ago, the study of which led to the development of modern science.

Through case studies, the paper shows how different contributions compete to dominate the resulting payload performance of a computing system, and how improvements in interconnect technology made computing + communication the dominant factor in defining supercomputer efficiency.
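The efficiency degradation described above can be illustrated with the classical Amdahl's law. The following is a minimal sketch that assumes the textbook formulation, not the paper's own temporal model: `alpha` (the parallelizable fraction) and the function names are illustrative choices, not taken from the paper. It shows that even a very small sequential portion makes efficiency fall as the number of processing elements grows.

```python
def amdahl_speedup(alpha: float, n: int) -> float:
    """Textbook Amdahl speedup for parallelizable fraction alpha on n processors.

    Assumption: this is the classical formula, S = 1 / ((1 - alpha) + alpha / n),
    used only for illustration; it is not the paper's own model.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1.0 / ((1.0 - alpha) + alpha / n)


def amdahl_efficiency(alpha: float, n: int) -> float:
    """Efficiency is speedup per processor: E = S / n."""
    return amdahl_speedup(alpha, n) / n


if __name__ == "__main__":
    alpha = 0.999999  # hypothetical value: all but one millionth is parallelizable
    for n in (10, 1_000, 100_000, 10_000_000):
        s = amdahl_speedup(alpha, n)
        e = amdahl_efficiency(alpha, n)
        print(f"n={n:>10,d}  speedup={s:14.1f}  efficiency={e:.4f}")
```

Under this formula the speedup can never exceed 1 / (1 - alpha), so adding processors beyond a certain point mostly lowers efficiency. Different tasks have different effective sequential portions, which is one simple way to see why a single system can show more than one efficiency.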

Supercomputers as a Continuous Medium

Comments on the parallelization efficiency of the Sunway TaihuLight supercomputer

How Amdahl's law restricts supercomputer applications and building ever bigger supercomputers

Conceptual and Technical Challenges for High Performance Computing

Statistical considerations on limitations of supercomputers

High Performance Optimization at the Door of the Exascale

Speedup and efficiency of computational parallelization: A unifying approach and asymptotic analysis

Energy-Efficient Superconducting Computing—Power Budgets and Requirements

Opportunities and Challenges for Next Generation Computing

The performance wall of parallelized sequential computing: the dark performance and the roofline of performance gain

Is stochastic thermodynamics the key to understanding the energy costs of computation?

Full Lifecycle Data Analysis on a Large-scale and Leadership Supercomputer: What Can We Learn from It?

Energy Wall for Exascale Supercomputing

Trends in Energy Estimates for Computing in AI/Machine Learning Accelerators, Supercomputers, and Compute-Intensive Applications

Recent trends in the marketplace of high performance computing

Stochastic thermodynamics of computation

Practical Strategies for Power-Efficient Computing Technologies

Computing Just What You Need: Online Data Analysis and Reduction at Extreme Scales

Failure Analysis and Quantification for Contemporary and Future Supercomputers