When In-Network Computing Meets Distributed Machine Learning

Haowen Zhu, Wenchao Jiang, Qi Hong, Zehua Guo
DOI: https://doi.org/10.1109/mnet.2024.3368138
IF: 10.294
2024-01-01
IEEE Network
Abstract: The emerging In-Network Computing (INC) technique provides a new opportunity to improve application performance by using the network programmability, computational capability, and storage capacity enabled by programmable switches. One typical application is Distributed Machine Learning (DML), which accelerates machine learning training by employing multiple workers to train a model in parallel. This paper introduces INC-based DML systems, analyzes the performance improvement gained from using INC, and surveys current studies of INC-based DML systems. We also propose potential research directions for applying INC to DML systems.
computer science, information systems; telecommunications; engineering, electrical & electronic; hardware & architecture