Abstract: Microservice architecture has become a dominant architectural style in the service-oriented software industry. Poor practices in the design and development of microservices are called microservice bad smells. Detecting these bad smells relies on feature data from microservices, but no appropriate open-source microservice feature dataset is currently available. The availability of such a dataset could contribute to the detection of microservice bad smells. To address this research gap, this paper collects a number of open-source microservice systems that use Spring Cloud. Feature metrics are then established based on the architecture and interactions of Spring Boot style microservices, and an extraction program is developed. The program is applied to the collected open-source microservice systems to extract the necessary information, and the results are manually verified to create an open-source feature dataset specific to microservice systems using Spring Cloud. The dataset is made available as a CSV file. We believe that both the extraction program and the dataset have the potential to contribute to the study of microservice bad smells.

What problem does this paper attempt to address?

This paper addresses anti-patterns, or "bad smells", in microservice architecture. Research on microservice anti-patterns currently lacks a suitable open-source dataset of microservice characteristics, which hinders effective detection of such problems. The goal of the paper is therefore to collect a set of open-source microservice systems based on Spring Cloud, establish a set of microservice characteristic metrics, develop an extraction program to obtain the necessary information from these systems, and create an open-source microservice characteristic dataset through manual verification.

Specifically, the research poses three questions:
1. How to collect and organize Spring Cloud-style microservice systems as data sources?
2. How to identify and extract the fundamental elements that need to be extracted from Spring Cloud-style microservice systems?
3. How to validate the extracted data for building a reliable dataset?

To answer these questions, the paper takes the following steps:
1. Select Spring Cloud-based microservice systems on GitHub, excluding third-party libraries, frameworks, and low-code development tools, and create an open-source directory consisting of mature projects.
2. Analyze Spring Boot-style microservices and define 23 key metrics based on their three-tier architecture to evaluate the granularity, design, and interactions of microservices, and implement an extraction program.
3. Perform manual verification on the extracted data to ensure accuracy, and finally form a publicly available dataset in CSV format.

This work aims to provide foundational resources for research on microservice anti-patterns, so that bad practices within and between microservices can be explored through machine learning, heuristic algorithms, and other methods.
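The extraction step can be illustrated with a minimal sketch. The paper's 23 metrics are not listed here, so the four metrics below (class counts per tier and an endpoint count) and every column name are hypothetical stand-ins. The sketch also assumes that the three tiers can be recognized from the standard Spring stereotype annotations `@RestController`/`@Controller`, `@Service`, and `@Repository`, which may not match how the authors' program identifies them.

```python
import csv
import io
import re

# Assumed mapping from Spring stereotype annotations to the three tiers.
LAYER_ANNOTATIONS = {
    "controller": ("RestController", "Controller"),
    "service": ("Service",),
    "repository": ("Repository",),
}

# Matches an annotation name such as @RestController or @Service("name").
ANNOTATION_RE = re.compile(r"@(\w+)")

# Matches Spring MVC handler-method mapping annotations (e.g. @GetMapping).
ENDPOINT_RE = re.compile(r"@(?:Get|Post|Put|Delete|Patch)Mapping\b")


def extract_features(service_name, sources):
    """Count tier classes and endpoints for one microservice.

    `sources` maps a file name to the Java source text of that file.
    A file counts toward a tier if it contains one of that tier's annotations.
    """
    counts = {layer: 0 for layer in LAYER_ANNOTATIONS}
    endpoints = 0
    for text in sources.values():
        found = set(ANNOTATION_RE.findall(text))
        for layer, names in LAYER_ANNOTATIONS.items():
            if found.intersection(names):
                counts[layer] += 1
        endpoints += len(ENDPOINT_RE.findall(text))
    # Hypothetical column names; the real dataset's schema may differ.
    return {
        "service": service_name,
        "controller_classes": counts["controller"],
        "service_classes": counts["service"],
        "repository_classes": counts["repository"],
        "endpoints": endpoints,
    }


def to_csv(rows):
    """Serialize feature rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

A regex scan like this is only a rough approximation: it would also match annotations inside comments, which is one reason a manual verification pass, as in step 3, remains useful.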
The paper also notes limitations of the dataset. Because the data comes from open-source microservice systems found through specific search conditions, some excellent projects may have been omitted. Data extraction also faced challenges, such as updates in inter-service communication and issues with static variable references. With this dataset, different detection methods can be further evaluated and compared, for example the detection of nanoservices (services with overly fine granularity), in order to improve the quality and maintainability of microservice architectures.
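As an illustration of how such a CSV dataset might feed a heuristic detector, the sketch below flags candidate nanoservices. The column names (`service`, `controller_classes`, `service_classes`, `repository_classes`, `endpoints`) and both thresholds are hypothetical assumptions chosen for the example; they are not taken from the paper's dataset or from any published detection rule.

```python
import csv
import io


def flag_nanoservices(csv_text, max_endpoints=1, max_classes=3):
    """Return names of services that look overly fine-grained.

    A service is flagged when both its endpoint count and its total
    tier-class count are at or below the given (assumed) thresholds.
    """
    flagged = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        classes = (
            int(row["controller_classes"])
            + int(row["service_classes"])
            + int(row["repository_classes"])
        )
        if int(row["endpoints"]) <= max_endpoints and classes <= max_classes:
            flagged.append(row["service"])
    return flagged
```

Because the thresholds are parameters, the same function can be rerun with different cut-offs to compare how sensitive a heuristic is, which is the kind of evaluation a shared dataset makes possible.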

A Feature Dataset of Microservices-based Systems

A curated Dataset of Microservices-Based Systems

Benchmarking Microservice Systems for Software Engineering Research

The PetShop Dataset -- Finding Causes of Performance Issues across Microservices

An Open-Source Benchmark Suite for Cloud and IoT Microservices

Benchmarking Data Management Systems for Microservices

No Free Lunch: Microservice Practices Reconsidered in Industry

A Feature Table approach to decomposing monolithic applications into microservices

A Dataflow-Driven Approach to Identifying Microservices from Monolithic Applications

An Open-Source Benchmark Suite for Microservices and Their Hardware-Software Implications for Cloud & Edge Systems

Fault Analysis and Debugging of Microservice Systems: Industrial Survey, Benchmark System, and Empirical Study

A Novel Method for Identifying Microservices by Considering Quality Expectations and Deployment Constraints

An Empirical Study on Underlying Correlations Between Runtime Performance Deficiencies and “bad Smells” of Microservice Systems

An Intelligent Anomaly Detection Scheme for Micro-Services Architectures with Temporal and Spatial Data Analysis

An Empirical Study of Security Practices for Microservices Systems

A Microservice-Based Big Data Analysis Platform for Online Educational Applications

Environmental sanitation operation vehicle supervision system based on SpringCloud microservices

VECROsim: A Versatile Metric-oriented Microservice Fault Simulation System (tools and Artifact Track)

Multi-task federated learning-based system anomaly detection and multi-classification for microservices architecture

Approach to Anomaly Detection in Microservice System with Multi-Source Data Streams

Root-Cause Metric Location for Microservice Systems Via Log Anomaly Detection