Abstract: High performance computing (HPC) and cloud have traditionally been separate and are often presented in an adversarial light. The conflict arises from disparate beginnings that led to two drastically different cultures, incentive structures, and communities that now compete directly for resources, talent, and speed of innovation. Converged computing is an emerging paradigm that advocates bringing together the best of both worlds, technologically and culturally. The movement is driven by economic and practical needs. Emerging heterogeneous, complex scientific workloads that require orchestration of services, simulation, and reaction to state can no longer be served by traditional HPC paradigms. However, while cloud offers automation, portability, and orchestration, as it stands now it cannot deliver the network performance, fine-grained resource mapping, or scalability that these same simulations require. These novel requirements call for change not just in workflow software or design, but also in the underlying infrastructure that supports them, which is one of the goals of converged computing. While the future of traditional HPC and commercial cloud cannot be entirely known, a reasonable approach is one that focuses on new models of convergence and a collaborative mindset. In this paper, we introduce a new paradigm for compute: a traditional HPC workload manager, Flux Framework, running seamlessly with a user-space Kubernetes, "Usernetes", to bring a service-oriented, modular, and portable architecture directly to on-premises HPC clusters. We present experiments that assess HPC application performance and networking between the environments, and provide a reproducible setup so that the larger community can do the same.
