KeySight: Troubleshooting Programmable Switches Via Scalable High-Coverage Behavior Tracking

Yu Zhou, Jun Bi, Tong Yang, Kai Gao, Cheng Zhang, Jiamin Cao, Yangyang Wang
DOI: https://doi.org/10.1109/icnp.2018.00045
2018-01-01
Abstract: The rise of programmable switches and P4 brings much flexibility to networks, but this flexibility comes with an increased risk of bugs. Diagnosing these bugs is essential for network operation but non-trivial. A potential approach is to track packet behaviors through postcards, but existing tools either generate a large number of postcards (limited scalability) or track only a small proportion of packet behaviors (low coverage). In this paper, we present KeySight, a platform that troubleshoots programmable switches with high scalability and high coverage. The key idea is the Packet Equivalence Class (PEC) abstraction, which aggregates packets with identical behaviors and generates one postcard per behavior. The PEC abstraction minimizes the number of postcards while still tracking all packet behaviors. We design novel algorithms to analyze the PECs of P4 programs and to implement the PEC abstraction on programmable switches. We deploy KeySight on Tofino and SmartNIC, and evaluate it against 80 P4 programs and over 5 TB of real packet traces. Results show that, while tracking over 99.9% of packet behaviors, KeySight reduces the number of postcards by one to two orders of magnitude compared with NetSight.
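The one-postcard-per-behavior idea can be illustrated with a minimal software sketch. This is not KeySight's algorithm or its switch implementation. It assumes, purely for illustration, that a packet's behavior can be summarized as a tuple of (table, action) pairs, and every name below (Behavior, postcards_per_packet, postcards_per_behavior) is hypothetical.

```python
from typing import Iterable, List, Set, Tuple

# Assumption: a "behavior" is the sequence of (table, action) pairs a packet hits.
Behavior = Tuple[Tuple[str, str], ...]


def postcards_per_packet(behaviors: Iterable[Behavior]) -> List[Behavior]:
    """Baseline: emit one postcard for every packet."""
    return list(behaviors)


def postcards_per_behavior(behaviors: Iterable[Behavior]) -> List[Behavior]:
    """Emit a postcard only the first time a behavior is observed."""
    seen: Set[Behavior] = set()
    postcards: List[Behavior] = []
    for behavior in behaviors:
        if behavior not in seen:
            seen.add(behavior)
            postcards.append(behavior)
    return postcards


# Toy trace: five packets that exhibit only two distinct behaviors.
forward: Behavior = (("ipv4_lpm", "set_nhop"), ("acl", "allow"))
drop: Behavior = (("ipv4_lpm", "set_nhop"), ("acl", "deny"))
trace = [forward, forward, drop, forward, drop]

baseline = postcards_per_packet(trace)
deduplicated = postcards_per_behavior(trace)
print(len(baseline), len(deduplicated))
```

In this toy trace every distinct behavior is still reported, while the postcard count drops from the number of packets to the number of behaviors.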