Large Language Models as Software Components: A Taxonomy for LLM-Integrated Applications

Irene Weber
2024-06-14
Abstract: Large Language Models (LLMs) have become widely adopted recently. Research explores their use both as autonomous agents and as tools for software engineering. LLM-integrated applications, on the other hand, are software systems that leverage an LLM to perform tasks that would otherwise be impossible or require significant coding effort. While LLM-integrated application engineering is emerging as a new discipline, its terminology, concepts and methods need to be established. This study provides a taxonomy for LLM-integrated applications, offering a framework for analyzing and describing these systems. It also demonstrates various ways to utilize LLMs in applications, as well as options for implementing such integrations. Following established methods, we analyze a sample of recent LLM-integrated applications to identify relevant dimensions. We evaluate the taxonomy by applying it to additional cases. This review shows that applications integrate LLMs in numerous ways for various purposes. Frequently, they comprise multiple LLM integrations, which we term "LLM components". To gain a clear understanding of an application's architecture, we examine each LLM component separately. We identify thirteen dimensions along which to characterize an LLM component, including the LLM skills leveraged, the format of the output, and more. LLM-integrated applications are described as combinations of their LLM components. We suggest a concise representation using feature vectors for visualization. The taxonomy is effective for describing LLM-integrated applications. It can contribute to theory building in the nascent field of LLM-integrated application engineering and aid in developing such systems. Researchers and practitioners explore numerous creative ways to leverage LLMs in applications. Though challenges persist, integrating LLMs may revolutionize the way software systems are built.
Software Engineering, Machine Learning
What problem does this paper attempt to address?
This paper addresses the following problem: as large language models (LLMs) are used ever more widely in software development, how can applications that integrate LLMs be systematically classified and analyzed? Specifically, the paper aims to provide a classification framework that helps researchers and practitioners better understand and design these applications.

### Background of the Paper

LLMs such as GPT-3.5 and GPT-4 have already had a significant impact in various fields, including medicine, law, marketing, education, and human resources. These models are widely adopted for their capabilities in text understanding, creative work, communication, knowledge work, and code writing. However, despite the increasing use of LLMs, research on how to integrate them into applications as software components is still in its early stages. Existing research mainly focuses on LLMs as software development tools and pays less attention to their function as software components.

### Research Objectives

The goal of this paper is to develop a taxonomy that provides a structured framework for LLM-integrated applications. This taxonomy aims to:

1. **Classify and Analyze**: Provide a framework to classify and analyze LLM-integrated applications across various domains.
2. **Theory Building**: Contribute to theory building in the emerging field of LLM-integrated application engineering.
3. **Practical Guidance**: Inspire practitioners by showcasing potential uses of LLMs in applications and help identify challenges and solutions.

### Methodology

To develop and evaluate the taxonomy, the authors adopted the following methods:

1. **Sample Collection**: Collected a sample of LLM-integrated applications from technical and industrial fields.
2. **Taxonomy Design**: Starting from the classic three-layer software architecture, refined the taxonomy over multiple iterations.
3. **Evaluation**: Applied the taxonomy to new instances to verify its effectiveness and stability.
4. **Visualization**: Developed a compact visualization method based on feature vectors to better present the results of the taxonomy.

### Main Contributions

1. **Taxonomy**: Proposed a taxonomy with 13 dimensions to describe and analyze LLM components.
2. **Application Examples**: Showcased multiple real-world application examples to illustrate the use of LLM components in different scenarios.
3. **Visualization Method**: Developed a compact visualization method to facilitate the understanding and application of the taxonomy.

### Conclusion

The taxonomy provides an effective framework for describing and analyzing LLM-integrated applications. It not only aids in theory building but also offers practical guidance for practitioners, promoting the development of LLM-integrated applications. Despite some challenges, integrating LLMs into applications has the potential to revolutionize the way software systems are built.
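
The idea of describing each LLM component as a feature vector, and an application as a combination of such vectors, can be illustrated with a minimal Python sketch. It is not the paper's implementation. The abstract names "LLM skills" and "output format" among the thirteen dimensions; the dimension values, the component names (`planner`, `coder`), and the `LLMComponent` and `render` helpers below are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical dimensions and values, for illustration only. Only the two
# dimension names come from the abstract; the values are assumptions.
DIMENSIONS = {
    "skills": ["rewrite", "summarize", "generate_code"],
    "output_format": ["free_text", "structured", "code"],
}


@dataclass
class LLMComponent:
    """One LLM integration within an application (hypothetical helper)."""

    name: str
    features: dict[str, set[str]] = field(default_factory=dict)

    def to_vector(self) -> list[int]:
        """Multi-hot encode the component over every (dimension, value) pair."""
        return [
            1 if value in self.features.get(dim, set()) else 0
            for dim, values in DIMENSIONS.items()
            for value in values
        ]


def render(app: list[LLMComponent]) -> str:
    """Render an application as one text row per LLM component."""
    rows = []
    for component in app:
        cells = "".join("#" if bit else "." for bit in component.to_vector())
        rows.append(f"{component.name:<12} {cells}")
    return "\n".join(rows)


# A made-up application consisting of two LLM components.
app = [
    LLMComponent("planner", {"skills": {"summarize"}, "output_format": {"structured"}}),
    LLMComponent("coder", {"skills": {"generate_code"}, "output_format": {"code"}}),
]
print(render(app))
```

Each row corresponds to one LLM component and each column to one (dimension, value) pair, so components of the same application can be compared at a glance.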