TabulaROSA: Tabular Operating System Architecture for Massively Parallel Heterogeneous Compute Engines
Jeremy Kepner, Ron Brightwell, Alan Edelman, Vijay Gadepally, Hayden Jananthan, Michael Jones, Sam Madden, Peter Michaleas, Hamed Okhravi, Kevin Pedretti, Albert Reuther, Thomas Sterling, Mike Stonebraker
DOI: https://doi.org/10.1109/HPEC.2018.8547577
2018-07-14
Abstract: The rise in computing hardware choices is driving a reevaluation of operating systems. The traditional role of an operating system controlling the execution of its own hardware is evolving toward a model whereby the controlling processor is distinct from the compute engines that are performing most of the computations. In this context, an operating system can be viewed as software that brokers and tracks the resources of the compute engines and is akin to a database management system. To explore the idea of using a database in an operating system role, this work defines key operating system functions in terms of rigorous mathematical semantics (associative array algebra) that are directly translatable into database operations. These operations possess a number of mathematical properties that are ideal for parallel operating systems because they guarantee correctness over a wide range of parallel operations. The resulting operating system equations provide a mathematical specification for a Tabular Operating System Architecture (TabulaROSA) that can be implemented on any platform. Simulations of forking in TabulaROSA are performed using an associative array implementation and compared to Linux on a 32,000+ core supercomputer. Using over 262,000 forkers managing over 68,000,000,000 processes, the simulations show that TabulaROSA has the potential to perform operating system functions on a massively parallel scale. The TabulaROSA simulations show 20x higher performance than Linux while managing 2000x more processes in fully searchable tables.
Distributed, Parallel, and Cluster Computing; Databases; Operating Systems; Performance
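
The abstract's central idea, that an operating system function such as forking can be written as an operation on a table, can be illustrated with a small sketch. The code below is not the paper's formulation or its associative array implementation. It assumes a hypothetical process table stored as a sparse mapping from (process ID, attribute) pairs to values, a hypothetical `fork` helper that produces the rows for a new child process, and a hypothetical `add` helper that merges tables element by element. Because element-wise addition is associative and commutative, updates produced by independent forkers can be merged in any order and yield the same table, which is the kind of property the abstract says helps guarantee correctness for parallel operation.

```python
# Illustrative sketch only: names, keys, and layout are assumptions,
# not the paper's equations or implementation.

def add(a, b):
    """Element-wise sum of two sparse tables keyed by (row, column)."""
    out = dict(a)
    for key, value in b.items():
        out[key] = out.get(key, 0) + value
    return out


def fork(table, parent, child):
    """Return an update table that copies the parent's row to a new child row."""
    return {
        (child, col): val
        for (row, col), val in table.items()
        if row == parent
    }


# Hypothetical process table: one process with two attributes.
procs = {("pid1", "mem|4096"): 1, ("pid1", "user|alice"): 1}

# Two forkers each produce an update independently of the other.
update_a = fork(procs, "pid1", "pid2")
update_b = fork(procs, "pid1", "pid3")

# Merging the updates in either order gives the same process table.
merged_ab = add(add(procs, update_a), update_b)
merged_ba = add(add(procs, update_b), update_a)
```

Since the merged table is an ordinary keyed collection, a query such as "all processes owned by alice" is a filter over its keys, which loosely mirrors the abstract's description of processes managed in fully searchable tables.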