A Blueprint Architecture of Compound AI Systems for Enterprise

Eser Kandogan, Sajjadur Rahman, Nikita Bhutani, Dan Zhang, Rafael Li Chen, Kushan Mitra, Sairam Gurajada, Pouya Pezeshkpour, Hayate Iso, Yanlin Feng, Hannah Kim, Chen Shen, Jin Wang, Estevam Hruschka
2024-06-02
Abstract: Large Language Models (LLMs) have showcased remarkable capabilities surpassing conventional NLP challenges, creating opportunities for use in production use cases. Towards this goal, there is a notable shift to building compound AI systems, wherein LLMs are integrated into an expansive software infrastructure with many components like models, retrievers, databases and tools. In this paper, we introduce a blueprint architecture for compound AI systems to operate in enterprise settings cost-effectively and feasibly. Our proposed architecture aims for seamless integration with existing compute and data infrastructure, with "stream" serving as the key orchestration concept to coordinate data and instructions among agents and other components. Task and data planners, respectively, break down, map, and optimize tasks and data to available agents and data sources defined in respective registries, given production constraints such as accuracy and latency.
Databases, Artificial Intelligence
What problem does this paper attempt to address?
This paper discusses how to build a compound AI system suited to enterprise environments, addressing the challenges of integrating large language models (LLMs) into production. In existing approaches, LLMs play a central role in task planning and data retrieval, but actual deployment must also account for latency, accuracy, and cost. The paper proposes a blueprint architecture that integrates a compound AI system with existing compute and data infrastructure in a cost-effective manner.

Key design factors include:
1. Ensuring smooth integration with existing infrastructure through appropriate touchpoints and interfaces.
2. Coordinating internal and external components efficiently through proper resource allocation and workflow coordination.
3. Maximizing system utilization while reducing costs.

The key components of this architecture include:
1. Agents: Computational entities that perform tasks and can interact with service APIs, LLMs, and other tools.
2. Agent and data registries: Stores that organize metadata about deployed models, APIs, databases, and tools for integration purposes.
3. Streams: The primary concept for coordinating data and instructions across components.
4. Task and data planners: Components that break down tasks and data requests, map them to the agents and data sources defined in the registries, and optimize them under production constraints such as accuracy and latency.

Event-driven orchestration is achieved through streams and sessions, while the task and data planners optimize the execution of tasks and data operations under those production constraints. The paper argues that this blueprint architecture helps in developing reliable, efficient, and user-friendly AI applications, and it encourages interdisciplinary research.
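
To make the interplay of these components concrete, the following is a minimal Python sketch of an agent registry, a stream, and a task planner that picks the cheapest registered agent meeting an accuracy constraint. It is an illustration under stated assumptions, not the paper's implementation: every class, function, field name, message format, and the cost and accuracy figures (AgentSpec, AgentRegistry, Stream, plan, attach_planner, demo, "small-model", "large-model") are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class AgentSpec:
    """Registry metadata for one agent (all fields are hypothetical)."""
    name: str
    capability: str
    cost: float        # illustrative cost per call
    accuracy: float    # illustrative expected accuracy in [0, 1]
    run: Callable[[str], str]


class AgentRegistry:
    """Stores agent metadata so a planner can look agents up by capability."""

    def __init__(self) -> None:
        self._agents: Dict[str, AgentSpec] = {}

    def register(self, spec: AgentSpec) -> None:
        self._agents[spec.name] = spec

    def find(self, capability: str) -> List[AgentSpec]:
        return [a for a in self._agents.values() if a.capability == capability]


class Stream:
    """Append-only message log that notifies subscribers on each publish."""

    def __init__(self) -> None:
        self.messages: List[dict] = []
        self._subscribers: List[Callable[[dict], None]] = []

    def subscribe(self, handler: Callable[[dict], None]) -> None:
        self._subscribers.append(handler)

    def publish(self, message: dict) -> None:
        self.messages.append(message)
        for handler in list(self._subscribers):
            handler(message)


def plan(registry: AgentRegistry, capability: str,
         min_accuracy: float) -> Optional[AgentSpec]:
    """Return the cheapest agent that meets the accuracy constraint, if any."""
    candidates = [a for a in registry.find(capability)
                  if a.accuracy >= min_accuracy]
    return min(candidates, key=lambda a: a.cost, default=None)


def attach_planner(stream: Stream, registry: AgentRegistry) -> None:
    """React to instruction messages by running the planned agent."""

    def on_message(msg: dict) -> None:
        if msg.get("type") != "instruction":
            return  # ignore data and error messages, including our own output
        agent = plan(registry, msg["capability"], msg["min_accuracy"])
        if agent is None:
            stream.publish({"type": "error",
                            "reason": "no agent meets constraints"})
            return
        stream.publish({"type": "data", "agent": agent.name,
                        "payload": agent.run(msg["payload"])})

    stream.subscribe(on_message)


def demo(min_accuracy: float = 0.8) -> List[dict]:
    """Publish one instruction and return everything seen on the stream."""
    registry = AgentRegistry()
    registry.register(AgentSpec("small-model", "summarize", 1.0, 0.70,
                                lambda text: text[:10]))
    registry.register(AgentSpec("large-model", "summarize", 10.0, 0.90,
                                lambda text: text[:20]))
    stream = Stream()
    attach_planner(stream, registry)
    stream.publish({"type": "instruction", "capability": "summarize",
                    "min_accuracy": min_accuracy,
                    "payload": "Quarterly revenue grew in all regions."})
    return stream.messages
```

Calling demo(0.8) selects the hypothetical "large-model" because the cheaper agent misses the accuracy bound, while demo(0.6) falls back to "small-model". The design choice worth noting is that the planner never calls an agent directly from the caller's side: instructions and results both travel over the stream, which is one plausible reading of event-driven orchestration.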