SLEGO: A Collaborative Data Analytics System with LLM Recommender for Diverse Users

Siu Lung Ng, Hirad Baradaran Rezaei, Fethi Rabhi
2024-08-19
Abstract: This paper presents the SLEGO (Software-Lego) system, a collaborative analytics platform that bridges the gap between experienced developers and novice users using a cloud-based platform with modular, reusable microservices. These microservices enable developers to share their analytical tools and workflows, while a simple graphical user interface (GUI) allows novice users to build comprehensive analytics pipelines without programming skills. Supported by a knowledge base and a Large Language Model (LLM) powered recommendation system, SLEGO enhances the selection and integration of microservices, increasing the efficiency of analytics pipeline construction. Case studies in finance and machine learning illustrate how SLEGO promotes the sharing and assembly of modular microservices, significantly improving resource reusability and team collaboration. The results highlight SLEGO's role in democratizing data analytics by integrating modular design, knowledge bases, and recommendation systems, fostering a more inclusive and efficient analytical environment.
Software Engineering, Artificial Intelligence
What problem does this paper attempt to address?
The main problem that this paper attempts to solve is the gap between users of different technical levels in current data analysis tools. Specifically:

1. **Skill gap between proficient developers and non-technical personnel**: Existing data analysis tools are either too complex and require programming knowledge (such as Python, Jupyter Notebook), or have limited functions and lack flexibility (such as Microsoft Excel). This has led to a significant skill gap between proficient developers and non-technical personnel, limiting the widespread use of data analysis.
2. **Lack of reusability and modularity of tools**: Many existing data analysis tools lack a modular design, resulting in redundant development and waste of resources. Reusable analysis functions or programs created by developers are often not fully utilized, increasing the development cost of new projects.
3. **Difficulty in sharing data analysis resources across teams and departments**: Due to differences in software and hardware configurations, sharing data analysis resources across different teams and departments is challenging. For example, a trend-prediction tool in the financial field may not be directly applicable to retail-sales prediction.
4. **Interoperability and standardization issues in collaborative data analysis**: Collaborative data analysis (CDA) is crucial in multiple fields, but current technical frameworks have deficiencies in data management, privacy protection, integrity, and interoperability, and it is difficult to efficiently support cross-platform data exchange and analysis.

To solve these problems, the paper introduces the SLEGO (Software-Lego) system, a cloud-platform-based collaborative data analysis system. SLEGO bridges the above-mentioned gaps in the following ways:

- **Modular microservice architecture**: Developers can publish analysis tools and workflows as microservices on the cloud platform, promoting modularity and reusability.
- **Intuitive graphical user interface (GUI)**: Non-technical personnel can build complex analysis pipelines through simple drag-and-drop operations without programming skills.
- **Recommendation system driven by large language models (LLM)**: A recommendation system based on the knowledge base and LLM helps users select and integrate appropriate microservices, improving the efficiency of building analysis pipelines.
- **Cloud-side storage and collaboration**: All analysis components and data are stored in the cloud, ensuring secure access and seamless collaboration.

Through these innovations, SLEGO aims to democratize data analysis, enabling users of different technical levels to conduct data analysis efficiently, thereby promoting broader collaboration and innovation.

### Key Formulas

This article does not involve specific mathematical, physical, or chemical formulas, mainly focusing on the system architecture and design concept. Therefore, there are no formulas that need to be specifically presented.

### Summary

The SLEGO system combines modular design, a low-code platform, a knowledge base, and a recommendation system to solve the skill-gap problem of data analysis tools among different user groups, improving resource reusability and team-collaboration efficiency.
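
To make the modular microservice idea more concrete, the following is a minimal Python sketch of how small, reusable analysis functions might be registered by developers and then chained into a pipeline by name, without the pipeline author writing any analysis code. It is an illustration only and not SLEGO's actual implementation: the `REGISTRY` dictionary, the `microservice` decorator, `run_pipeline`, and the three example functions are all hypothetical names introduced here.

```python
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical in-memory registry standing in for a shared, cloud-hosted
# catalogue of microservices (an assumption, not SLEGO's real storage).
REGISTRY: Dict[str, Callable[..., Any]] = {}


def microservice(func: Callable[..., Any]) -> Callable[..., Any]:
    """Register a function under its own name so pipelines can refer to it."""
    REGISTRY[func.__name__] = func
    return func


@microservice
def load_prices(values: List[float]) -> List[float]:
    # Stand-in for a data-import step; here it simply copies the input.
    return list(values)


@microservice
def moving_average(data: List[float], window: int = 2) -> List[float]:
    # Simple moving average over a sliding window of the given size.
    return [
        sum(data[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(data))
    ]


@microservice
def last_value(data: List[float]) -> float:
    # Return the most recent value of the series.
    return data[-1]


def run_pipeline(steps: List[Tuple[str, Dict[str, Any]]], initial: Any) -> Any:
    """Run the named steps in order, feeding each output into the next step."""
    result = initial
    for name, params in steps:
        result = REGISTRY[name](result, **params)
    return result


# A pipeline is plain data (step names plus parameters), which is the kind of
# description a GUI could assemble on behalf of a non-programmer.
pipeline = [
    ("load_prices", {}),
    ("moving_average", {"window": 2}),
    ("last_value", {}),
]
result = run_pipeline(pipeline, [10.0, 12.0, 14.0, 16.0])
print(result)
```

Because the pipeline is a list of names and parameters rather than code, a recommender could in principle suggest the next entry to append; how SLEGO's LLM-based recommender does this is not detailed in this summary.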