A Feature Dataset of Microservices-based Systems

Weipan Yang, Yongchao Xing, Yiming Lyu, Zhihao Liang, Zhiying Tu
2024-04-02
Abstract: Microservice architecture has become a dominant architectural style in the service-oriented software industry. Poor practices in the design and development of microservices are called microservice bad smells. Detecting these bad smells relies on feature data from microservices, yet an appropriate open-source microservice feature dataset is lacking. The availability of such datasets may contribute to the detection of microservice bad smells in unexpected ways. To address this research gap, this paper collects a number of open-source microservice systems built with Spring Cloud, establishes feature metrics based on the architecture and interactions of Spring Boot style microservices, and develops an extraction program. The program is applied to the collected systems to extract the necessary information, and the results are manually verified to create an open-source feature dataset specific to microservice systems using Spring Cloud. The dataset is made available as a CSV file. We believe that both the extraction program and the dataset have the potential to contribute to the study of microservice bad smells.
Software Engineering
What problem does this paper attempt to address?
This paper addresses anti-patterns, or "bad smells", in microservice architecture. Research on microservice anti-patterns currently lacks a suitable open-source dataset of microservice characteristics, which hinders effective detection of such problems. The goal of the paper is therefore to collect a set of open-source microservice systems based on Spring Cloud, establish a set of microservice characteristic metrics, develop an extraction program to obtain the necessary information from these systems, and create an open-source microservice characteristic dataset through manual verification.
Specifically, the research poses three questions:
1. How to collect and organize Spring Cloud-style microservice systems as data sources?
2. How to identify and extract the fundamental elements of Spring Cloud-style microservice systems?
3. How to validate the extracted data to build a reliable dataset?
To answer these questions, the paper proceeds in three steps:
1. Select Spring Cloud-based microservice systems on GitHub, excluding third-party libraries, frameworks, and low-code development tools, and create an open-source directory consisting of mature projects.
2. Analyze Spring Boot-style microservices and define 23 key metrics based on their three-tier architecture to evaluate the granularity, design, and interactions of microservices, and implement an extraction program.
3. Manually verify the extracted data to ensure accuracy, and publish the result as a dataset in CSV format.
This work aims to provide foundational resources for research on microservice anti-patterns, so that bad practices within and between microservices can be explored through machine learning, heuristic algorithms, and other methods.
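To make the extraction step more concrete, the following is a minimal sketch of how per-layer class counts might be gathered from Java source text. It is not the paper's extraction program: the 23 metrics are not listed here, so the layer names, the mapping `LAYER_ANNOTATIONS`, and the function `count_layers` are hypothetical, and the sketch assumes that the three tiers correspond to classes annotated with Spring's `@RestController`/`@Controller`, `@Service`, and `@Repository` stereotypes.

```python
import re
from collections import Counter

# Assumption: the three tiers map onto Spring stereotype annotations.
# This mapping is illustrative and is not taken from the paper.
LAYER_ANNOTATIONS = {
    "RestController": "controller",
    "Controller": "controller",
    "Service": "service",
    "Repository": "repository",
}

# Naive pattern: captures the identifier following '@'. It does not skip
# comments or string literals, so a real extractor would use a Java parser.
ANNOTATION_RE = re.compile(r"@(\w+)\b")


def count_layers(java_sources):
    """Count stereotype annotations per layer over an iterable of Java source strings."""
    counts = Counter()
    for source in java_sources:
        for name in ANNOTATION_RE.findall(source):
            layer = LAYER_ANNOTATIONS.get(name)
            if layer is not None:
                counts[layer] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical source snippets standing in for files of one microservice.
    sources = [
        "@RestController\npublic class OrderController {}",
        "@Service\npublic class OrderService {}",
        "@Service\npublic class PaymentService {}",
        "@Repository\npublic interface OrderRepository {}",
    ]
    print(dict(count_layers(sources)))
```

A regular expression keeps the sketch short, but it would miscount annotations that appear inside comments, which is one reason a manual verification step such as the one described above remains useful.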
The paper also notes limitations of the dataset: the data are sourced from open-source microservice systems, so excellent projects may have been omitted because of the search conditions, and data extraction faced challenges such as updates in inter-service communication and issues with static variable references. With this dataset, different detection methods can be evaluated and compared, for example the detection of nanoservices (services with overly fine granularity), in order to improve the quality and maintainability of microservice architectures.
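As an illustration of how a CSV feature dataset could feed a simple heuristic detector, the sketch below flags nanoservice candidates. The column names (`service`, `endpoint_count`, `class_count`), the sample rows, and the thresholds are all hypothetical and chosen only for demonstration; the actual dataset's schema and any detection thresholds would have to be taken from the published CSV file and the relevant literature.

```python
import csv
import io

# Hypothetical sample data; the real dataset's columns may differ.
SAMPLE_CSV = """service,endpoint_count,class_count
order-service,12,40
ping-service,1,3
user-service,8,25
"""


def flag_nanoservice_candidates(csv_text, max_endpoints=2, max_classes=5):
    """Return names of services whose size metrics fall at or below both thresholds.

    The thresholds are arbitrary placeholders, not values from the paper.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    flagged = []
    for row in reader:
        endpoints = int(row["endpoint_count"])
        classes = int(row["class_count"])
        if endpoints <= max_endpoints and classes <= max_classes:
            flagged.append(row["service"])
    return flagged


if __name__ == "__main__":
    print(flag_nanoservice_candidates(SAMPLE_CSV))
```

A fixed-threshold rule like this is only one option; the same feature rows could equally serve as input to a learned classifier, which is the kind of comparison a shared dataset makes possible.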