CloudHeatMap: Heatmap-Based Monitoring for Large-Scale Cloud Systems

Sarah Sohana, William Pourmajidi, John Steinbacher, Andriy Miranskyy
2024-10-28
Abstract: Cloud computing is essential for modern enterprises, requiring robust tools to monitor and manage Large-Scale Cloud Systems (LCS). Traditional monitoring tools often miss critical insights due to the complexity and volume of LCS telemetry data. This paper presents CloudHeatMap, a novel heatmap-based visualization tool for near-real-time monitoring of LCS health. It offers intuitive visualizations of key metrics such as call volumes, response times, and HTTP response codes, enabling operators to quickly identify performance issues. A case study on the IBM Cloud Console demonstrates the tool's effectiveness in enhancing operational monitoring and decision-making. A demonstration is available at https://www.youtube.com/watch?v=3u5K1qp51EA.
Distributed, Parallel, and Cluster Computing; Software Engineering
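
To make the idea of a metric heatmap concrete, the following is a minimal sketch of how telemetry records could be binned into heatmap cells. It is not the CloudHeatMap implementation: the `Call` record, the `build_heatmap` function, the 60-second bucket width, and the choice to count only 5xx codes as errors are all assumptions made for illustration. Each cell is keyed by (service, time bucket) and carries the three metrics the abstract names: call volume, response time, and HTTP response codes (summarized here as an error rate).

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Call:
    """One telemetry record (hypothetical schema, not from the paper)."""
    service: str        # component that handled the call
    timestamp: float    # seconds since some reference point
    response_ms: float  # response time in milliseconds
    status: int         # HTTP response code


def build_heatmap(calls, bucket_seconds=60):
    """Aggregate calls into cells keyed by (service, time bucket)."""
    cells = defaultdict(lambda: {"count": 0, "total_ms": 0.0, "errors": 0})
    for c in calls:
        bucket = int(c.timestamp // bucket_seconds)
        cell = cells[(c.service, bucket)]
        cell["count"] += 1
        cell["total_ms"] += c.response_ms
        # Assumption: only server-side (5xx) codes count as errors.
        if c.status >= 500:
            cell["errors"] += 1
    return {
        key: {
            "call_volume": v["count"],
            "mean_response_ms": v["total_ms"] / v["count"],
            "error_rate": v["errors"] / v["count"],
        }
        for key, v in cells.items()
    }


calls = [
    Call("api", 0.0, 100.0, 200),
    Call("api", 30.0, 300.0, 500),
    Call("api", 61.0, 50.0, 200),
    Call("ui", 10.0, 20.0, 404),
]
heatmap = build_heatmap(calls)
```

Each value in `heatmap` could then be mapped to a color for one cell of the grid, with services on one axis and time buckets on the other. A real deployment would stream records and update cells incrementally rather than rebuild the grid from a list.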