Scalable Delivery of Scalable Libraries and Tools: How ECP Delivered a Software Ecosystem for Exascale and Beyond

Michael A. Heroux
DOI: https://doi.org/10.1109/mcse.2024.3384937
2024-01-01
Computing in Science & Engineering
Abstract: The Exascale Computing Project (ECP) was one of the largest open-source scientific software development projects ever. It supported approximately 1,000 staff from US Department of Energy laboratories, and university and industry partners. About 250 staff contributed to 70 scientific libraries and tools to support applications on multiple exascale computing systems that were also under development. Funded as a formal construction project, ECP was required to use earned-value management, based on milestones, and a key performance parameter system based, in part, on integrations. With accelerated delivery schedules and significant project risk, we also emphasized software quality using community policies, automated testing, and continuous integration. Software Development Kit teams provided cross-team collaboration, and products were delivered via E4S, a curated portfolio of libraries and tools. In this paper, we discuss the organizational and management elements of ECP that enabled the delivery of libraries and tools, our lessons learned, and our next steps.
computer science, interdisciplinary applications