Software effort estimation using convolutional neural network and fuzzy clustering

Mohammad Azzeh, Abedalrhman Alkhateeb, Ali Bou Nassif
DOI: https://doi.org/10.1007/s00521-024-09855-z
2024-05-08
Neural Computing and Applications
Abstract: Adopting an efficient software process model is critical for building high-quality software applications. An important factor in the software development process is an accurate estimate of the human effort required to complete a software project. While machine learning methods have long been used to develop estimation models, there has been little investigation into the potential of deep convolutional neural networks (DCNNs) for improving software effort estimation. One of the biggest obstacles to using DCNNs for this purpose is that software datasets typically consist of vectorized samples rather than matrices. To overcome this obstacle and reduce vagueness in software attribute measurement, this study uses fuzzy theory to generate an appropriate two-dimensional representation of each data point. Fuzzy clustering is commonly used to split dataset samples into separate clusters, which can help to generate fuzzy membership functions. This approach makes it easier to generate a two-dimensional array representation for each data sample based on its membership values, allowing it to be used as input to the DCNN model. The proposed model was evaluated on PROMISE benchmark datasets. The findings, based on mean absolute error and standardized accuracy, show that the proposed model achieved low error rates and outperformed several current state-of-the-art effort estimation models. Nonetheless, further research is needed to determine the impact of different cluster numbers and features on the performance of the model. In conclusion, this study emphasizes the possibility of incorporating DCNNs into software effort estimation and highlights the viability of using fuzzy modeling and clustering techniques to enhance the data representation of software datasets.
computer science, artificial intelligence
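The representation step described in the abstract (fuzzy clustering, then membership values, then a two-dimensional array per sample) can be illustrated with a minimal sketch. The abstract does not specify how the array is laid out, so the layout used here is an assumption: each feature is clustered independently with standard fuzzy c-means, and each sample becomes a matrix with one row per feature and one column per cluster. The function names, the number of clusters, the fuzzifier `m = 2`, and the synthetic data are all hypothetical and are not taken from the paper.

```python
import numpy as np

def memberships_1d(x, centers, m=2.0):
    """Fuzzy c-means membership of each value in x to each center."""
    # Small epsilon avoids division by zero when a value sits on a center.
    d = np.abs(x[:, None] - centers[None, :]) + 1e-12
    inv = d ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)   # each row sums to 1

def fuzzy_cmeans_1d(x, n_clusters=3, m=2.0, n_iter=100, tol=1e-6):
    """Standard fuzzy c-means on a 1-D array; returns sorted centers."""
    # Deterministic start (an assumption): centers placed at spread quantiles.
    centers = np.quantile(x, np.linspace(0.1, 0.9, n_clusters))
    for _ in range(n_iter):
        w = memberships_1d(x, centers, m) ** m
        new_centers = (w * x[:, None]).sum(axis=0) / w.sum(axis=0)
        converged = np.max(np.abs(new_centers - centers)) < tol
        centers = new_centers
        if converged:
            break
    return np.sort(centers)

def to_membership_images(X, n_clusters=3, m=2.0):
    """Map (n_samples, n_features) vectors to (n_samples, n_features, n_clusters)."""
    n_samples, n_features = X.shape
    images = np.empty((n_samples, n_features, n_clusters))
    for j in range(n_features):
        centers = fuzzy_cmeans_1d(X[:, j], n_clusters, m)
        images[:, j, :] = memberships_1d(X[:, j], centers, m)
    return images

# Synthetic stand-in for a project dataset (not PROMISE data).
rng = np.random.default_rng(0)
X = rng.gamma(2.0, 50.0, size=(20, 4))
images = to_membership_images(X, n_clusters=3)
print(images.shape)
```

Each resulting matrix could then be passed to a convolutional network as a single-channel input. Whether the authors cluster per feature or over whole samples, and how many clusters they use, would need to be checked against the full paper.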
What problem does this paper attempt to address?