TALICS$^3$: Tape Library Cloud Storage System Simulator

Suayb S. Arslan, James Peng, Turguy Goker
DOI: https://doi.org/10.1016/j.simpat.2024.102947
2024-06-12
Abstract: High-performance computing data is growing rapidly toward the exabyte scale, where tape libraries are the main platform for long-term durable data storage alongside high-cost DNA storage. Tape libraries are extremely hard to model, but accurate modeling is critical for system administrators to obtain valid performance estimates for their designs. This research introduces a discrete-event tape simulation platform that realistically models tape library behavior in a networked cloud environment by incorporating real-world phenomena and effects. The platform addresses several challenges, including precise estimation of data access latency, rates of robot exchange, data collocation, deduplication/compression ratio, and attainment of durability goals through replication or erasure coding. Using the proposed simulator, one can compare a single enterprise library configuration with configurations of multiple commodity libraries. This makes the simulator a valuable tool for system administrators and reliability engineers, enabling them to acquire practical and dependable performance estimates for their enduring, cost-efficient cold data storage architecture designs.
Distributed, Parallel, and Cluster Computing; Systems and Control
What problem does this paper attempt to address?
The paper introduces and describes TALICS$^3$ (Tape Library Cloud Storage System Simulator), which is designed to address the challenges of modeling the complex behavior of tape library systems in cloud environments. The key points regarding the problem the paper aims to solve are:

1. **Complexity of Tape Library Systems**: Tape library systems are critical for long-term durable data storage, especially in high-performance computing (HPC) environments. However, they are difficult to model accurately due to their intricate design and the interplay between components such as robots, drives, and cartridges.
2. **Performance Estimation**: Accurate modeling is essential for system administrators to estimate the performance of their designs, including data access latency, robot exchange rates, data collocation, deduplication/compression ratios, and durability goals.
3. **Comparative Analysis**: There is a need for a simulation platform that allows comparative analyses between centralized single-library systems and distributed systems comprising multiple commodity libraries.
4. **Design Tool for Reliability Engineers**: The paper aims to provide a valuable design tool for reliability engineers, enabling them to acquire practical and dependable performance estimates for their enduring, cost-efficient cold data storage architecture designs.

In summary, the paper seeks to solve the problem of accurately modeling the behavior of tape library systems in cloud environments.
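
To make the idea of discrete-event tape simulation concrete, the following is a minimal Python sketch of the general technique, not the TALICS$^3$ implementation. Everything in it is an assumption made for illustration: the function name `simulate`, the FIFO drive queue, the exponential interarrival and read times, the fixed robot exchange time, and all default parameter values. It assumes every request needs one cartridge mount, so it counts one robot exchange per request, and it reports two of the quantities the abstract mentions: mean data access latency and the robot exchange rate.

```python
import heapq
import random
from collections import deque


def simulate(num_requests=200, num_drives=2, mean_interarrival_s=60.0,
             robot_exchange_s=10.0, mean_read_s=90.0, seed=0):
    """Toy discrete-event model of a single tape library (illustrative only).

    Requests wait in a FIFO queue for a free drive. Serving a request costs
    one robot exchange (fixed time) plus an exponentially distributed read.
    """
    rng = random.Random(seed)
    events = []  # min-heap of (time, sequence number, kind, request id)
    seq = 0      # unique tie-breaker so heap entries never compare by kind

    # Schedule all arrivals up front with exponential interarrival times.
    t = 0.0
    for rid in range(num_requests):
        t += rng.expovariate(1.0 / mean_interarrival_s)
        heapq.heappush(events, (t, seq, "arrival", rid))
        seq += 1

    free_drives = num_drives
    waiting = deque()   # request ids waiting for a drive, in arrival order
    arrival_time = {}   # request id -> arrival timestamp
    latencies = []      # completion time minus arrival time, per request
    exchanges = 0       # number of robot exchanges performed
    now = 0.0

    while events:
        # Advance the clock to the next scheduled event.
        now, _, kind, rid = heapq.heappop(events)
        if kind == "arrival":
            arrival_time[rid] = now
            waiting.append(rid)
        else:  # "done": the drive finished reading and becomes free
            free_drives += 1
            latencies.append(now - arrival_time[rid])

        # Dispatch waiting requests to any free drives.
        while free_drives > 0 and waiting:
            nxt = waiting.popleft()
            free_drives -= 1
            exchanges += 1
            service = robot_exchange_s + rng.expovariate(1.0 / mean_read_s)
            heapq.heappush(events, (now + service, seq, "done", nxt))
            seq += 1

    return {
        "completed": len(latencies),
        "exchanges": exchanges,
        "mean_latency_s": sum(latencies) / len(latencies),
        "exchanges_per_hour": exchanges / (now / 3600.0),
    }


if __name__ == "__main__":
    # Compare a lightly and a more heavily provisioned toy library.
    for drives in (2, 4):
        print(drives, simulate(num_drives=drives))
```

A realistic simulator would replace each assumption here with a measured or calibrated model, for example tape seek and rewind times, cartridge collocation, and replication or erasure coding across libraries. The event-queue structure is the part that carries over.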