Relevant Backtracking: an Efficient Telemetry Data Collection Method for Data Center Networks

Jiaqi Liu, Peng Xun, Baosheng Wang
DOI: https://doi.org/10.1109/iccsn57992.2023.10297330
2023-01-01
Abstract: As an important bridge between network measurement and management, telemetry data collection directly affects the efficiency of data center network (DCN) operation and maintenance. The biggest challenge for collection methods such as in-band measurement is controlling overhead while maintaining high efficiency. To address this problem, we propose a “Relevant Backtracking” method, which collects telemetry data by retracing a small amount of event-related data hop by hop along the real path to the edge switch, without involving the controller. Upstream switches can also obtain information about potential remote anomalies and decide what to report instead of uploading data indiscriminately. We further optimize the method in several respects to improve its reliability and scalability. To cope with the data scale of real-world traffic traces, we use a custom Cuckoo filter, which supports convenient insertion and deletion of flow data and effectively relieves storage pressure. Moreover, by relying on a “Fast-Slow” packet mechanism and a self-designed backtracking packet format, alarm information can be returned in time and the method can be easily extended to different troubleshooting algorithms. Finally, simulations verify the effectiveness and reliability of the method: across several realistic traffic traces, it achieves a high data recall rate, lower event response time, and low overhead.
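The abstract does not describe the design of the custom Cuckoo filter. As a rough illustration of why this kind of structure suits flow data that must be both inserted and deleted, the following is a minimal sketch of a generic Cuckoo filter in Python. The class name, bucket count, bucket size, 16-bit fingerprint width, and the flow-key string format are all assumptions made for illustration; none of them comes from the paper.

```python
import hashlib
import random


class CuckooFilter:
    """Generic Cuckoo filter sketch (hypothetical; not the paper's custom design)."""

    def __init__(self, num_buckets=1024, bucket_size=4, max_kicks=500, seed=0):
        # A power-of-two bucket count keeps the XOR-derived alternate index
        # in range and makes the mapping between the two buckets symmetric.
        assert num_buckets > 0 and num_buckets & (num_buckets - 1) == 0
        self.num_buckets = num_buckets
        self.bucket_size = bucket_size
        self.max_kicks = max_kicks
        self.buckets = [[] for _ in range(num_buckets)]
        self.rng = random.Random(seed)

    @staticmethod
    def _hash(data: bytes) -> int:
        return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")

    def _fingerprint(self, item: str) -> int:
        # Assumed 16-bit fingerprint; zero is mapped to 1.
        return (self._hash(b"fp:" + item.encode()) & 0xFFFF) or 1

    def _index(self, item: str) -> int:
        return self._hash(item.encode()) % self.num_buckets

    def _alt_index(self, index: int, fp: int) -> int:
        # Partial-key cuckoo hashing: the other bucket depends only on the
        # current bucket index and the stored fingerprint.
        return (index ^ self._hash(fp.to_bytes(2, "big"))) % self.num_buckets

    def _candidates(self, item: str):
        fp = self._fingerprint(item)
        i1 = self._index(item)
        return fp, i1, self._alt_index(i1, fp)

    def insert(self, item: str) -> bool:
        fp, i1, i2 = self._candidates(item)
        for i in (i1, i2):
            if len(self.buckets[i]) < self.bucket_size:
                self.buckets[i].append(fp)
                return True
        # Both buckets are full: evict a resident fingerprint and relocate it.
        i = self.rng.choice((i1, i2))
        for _ in range(self.max_kicks):
            slot = self.rng.randrange(len(self.buckets[i]))
            fp, self.buckets[i][slot] = self.buckets[i][slot], fp
            i = self._alt_index(i, fp)
            if len(self.buckets[i]) < self.bucket_size:
                self.buckets[i].append(fp)
                return True
        # Filter is considered full; the last evicted fingerprint is dropped.
        return False

    def contains(self, item: str) -> bool:
        fp, i1, i2 = self._candidates(item)
        return fp in self.buckets[i1] or fp in self.buckets[i2]

    def delete(self, item: str) -> bool:
        # Only delete items that were actually inserted; otherwise a
        # colliding fingerprint of another item could be removed.
        fp, i1, i2 = self._candidates(item)
        for i in (i1, i2):
            if fp in self.buckets[i]:
                self.buckets[i].remove(fp)
                return True
        return False


# Hypothetical flow key; the paper's actual key format is not given.
flow_filter = CuckooFilter()
flow_key = "10.0.0.1:1234->10.0.0.2:80/tcp"
flow_filter.insert(flow_key)
print(flow_filter.contains(flow_key))
flow_filter.delete(flow_key)
print(flow_filter.contains(flow_key))
```

Unlike a plain Bloom filter, this structure stores short fingerprints that can be removed again, which is the property the abstract relies on when it mentions inserting and deleting flow data. Lookups can still return false positives when two different flows share a fingerprint and a bucket.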