Efficient Orchestrated AI Workflows Execution on Scale-out Spatial Architecture
Jinyi Deng, Xinru Tang, Zhiheng Yue, Guangyang Lu, Qize Yang, Jiahao Zhang, Jinxi Li, Chao Li, Shaojun Wei, Yang Hu, Shouyi Yin
DOI: https://doi.org/10.1109/tcasai.2024.3476237
2024-01-01
Abstract: Given the increasing complexity of AI applications, traditional spatial architectures frequently fall short. Our analysis identifies a pattern of interconnected, multi-faceted tasks encompassing both AI and general computational processes. In response, we have conceptualized "Orchestrated AI Workflows," an approach that integrates various tasks with logic-driven decisions into dynamic, sophisticated workflows. Specifically, we find that the intrinsic Dual Dynamicity of Orchestrated AI Workflows, namely the dynamic execution times and frequencies of Task Blocks, can be effectively represented using the Orchestrated Workflow Graph. Furthermore, the intrinsic Dual Dynamicity poses challenges to existing spatial architectures, namely Indiscriminate Resource Allocation, Reactive Load Rebalancing, and Contagious PEA Idleness. To overcome these challenges, we present Octopus, a scale-out spatial architecture and a suite of advanced scheduling strategies optimized for executing Orchestrated AI Workflows, such as the Discriminate Dual-Scheduling Mechanism, Adaptive TBU Scheduling Strategy, and Proactive Cluster Scheduling Strategy. Our evaluations demonstrate that Octopus significantly outperforms traditional architectures in handling the dynamic demands of Orchestrated AI Workflows, and scales robustly on large-scale hardware such as wafer-scale chips.