Cascade: A Platform for Delay-Sensitive Edge Intelligence

Weijia Song, Thiago Garrett, Yuting Yang, Mingzhao Liu, Edward Tremel, Lorenzo Rosa, Andrea Merlina, Roman Vitenberg, Ken Birman
2023-11-29
Abstract: Interactive intelligent computing applications are increasingly prevalent, creating a need for AI/ML platforms optimized to reduce per-event latency while maintaining high throughput and efficient resource management. Yet many intelligent applications run on AI/ML platforms that optimize for high throughput even at the cost of high tail latency. Cascade is a new AI/ML hosting platform intended to untangle this puzzle. Innovations include a legacy-friendly storage layer that moves data with minimal copying and a "fast path" that collocates data and computation to maximize responsiveness. Our evaluation shows that Cascade reduces latency by orders of magnitude with no loss of throughput.
Artificial Intelligence, Operating Systems