ORBIT: Oak Ridge Base Foundation Model for Earth System Predictability

Xiao Wang, Siyan Liu, Aristeidis Tsaris, Jong-Youl Choi, Ashwin Aji, Ming Fan, Wei Zhang, Junqi Yin, Moetasim Ashfaq, Dan Lu, Prasanna Balaprakash
2024-08-19
Abstract: Earth system predictability is challenged by the complexity of environmental dynamics and the multitude of variables involved. Current AI foundation models, although advanced by leveraging large and heterogeneous data, are often constrained by their size and data integration, limiting their effectiveness in addressing the full range of Earth system prediction challenges. To overcome these limitations, we introduce the Oak Ridge Base Foundation Model for Earth System Predictability (ORBIT), an advanced vision transformer model that scales up to 113 billion parameters using a novel hybrid tensor-data orthogonal parallelism technique. As the largest model of its kind, ORBIT is a thousand times larger than current climate AI foundation models. Performance scaling tests conducted on the Frontier supercomputer demonstrate that ORBIT achieves 684 petaFLOPS to 1.6 exaFLOPS sustained throughput, with scaling efficiency maintained at 41% to 85% across 49,152 AMD GPUs. These results advance AI-driven climate modeling and show promise for significantly improving Earth system predictability.
Atmospheric and Oceanic Physics; Artificial Intelligence; Distributed, Parallel, and Cluster Computing; Image and Video Processing; Geophysics