An interactive model based on regression tree and K-nearest neighbor for storage device performance prediction

Guo Chang-Hui, Liu Gui-Quan, Zhang Lei
DOI: https://doi.org/10.13232/j.cnki.jnju.2012.02.001
2012-01-01
Abstract: Storage device performance prediction is a significant element of self-managed storage systems and of application planning tasks such as data assignment. Traditional methods for storage device performance prediction, such as accurate simulations and analytic models, need substantial expertise about the storage devices being modelled. As storage devices become more high-end and complex, accurate simulations and analytic models are not available. By contrast, machine learning methods treat storage devices as black boxes and need no information about their internal components or algorithms, so they are better suited to the current trend in storage device development. The classification and regression tree (CART) method offers a simple way to model storage devices. This work explores an interactive model based on the regression tree and the K-nearest neighbor (KNN) algorithm to improve on that machine learning method. Experiments show that the proposed model has higher prediction precision and better stability than either a regression tree or KNN alone. In our experiments, we also found that the caching effect is very important. We improved the method of workload characterization to take the caching effect into account, which makes a substantial difference to prediction accuracy.
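The abstract does not say how the regression tree and KNN interact, so the following is only a minimal sketch of one plausible combination, not the authors' model: a small regression tree partitions the workload feature space, and a query is answered by averaging its k nearest training samples inside the leaf it falls into. The feature layout (request size in KB, fraction of random accesses), the synthetic latency values, and the names `Node`, `build_tree`, and `predict` are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

# Hypothetical sample layout: (workload features, measured latency in ms).
Sample = Tuple[Sequence[float], float]


@dataclass
class Node:
    samples: List[Sample]
    feature: Optional[int] = None   # None marks a leaf
    threshold: float = 0.0
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def sse(samples: List[Sample]) -> float:
    """Sum of squared errors of the targets around their mean."""
    if not samples:
        return 0.0
    mean = sum(y for _, y in samples) / len(samples)
    return sum((y - mean) ** 2 for _, y in samples)


def build_tree(samples: List[Sample], min_leaf: int = 4) -> Node:
    """Grow a regression tree by greedily minimising the children's SSE."""
    node = Node(samples)
    best = None
    for f in range(len(samples[0][0])):
        values = sorted({x[f] for x, _ in samples})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [s for s in samples if s[0][f] <= t]
            right = [s for s in samples if s[0][f] > t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            cost = sse(left) + sse(right)
            if best is None or cost < best[0]:
                best = (cost, f, t, left, right)
    # Split only if it reduces the error; otherwise this node stays a leaf.
    if best is not None and best[0] < sse(samples):
        _, node.feature, node.threshold, left, right = best
        node.left = build_tree(left, min_leaf)
        node.right = build_tree(right, min_leaf)
    return node


def predict(tree: Node, x: Sequence[float], k: int = 3) -> float:
    """Route x to a leaf, then average the k nearest samples in that leaf."""
    node = tree
    while node.feature is not None:
        node = node.left if x[node.feature] <= node.threshold else node.right
    nearest = sorted(
        node.samples,
        key=lambda s: sum((a - b) ** 2 for a, b in zip(s[0], x)),
    )[:k]
    return sum(y for _, y in nearest) / len(nearest)


# Synthetic training data (assumed, not from the paper): random workloads
# pay a fixed penalty, and latency grows linearly with request size.
train: List[Sample] = [
    ((size, rnd), base + 0.1 * size)
    for rnd, base in ((0.0, 1.0), (1.0, 8.0))
    for size in (4, 8, 16, 32, 64, 128)
]
tree = build_tree(train)
print(predict(tree, (20, 1.0)))
```

In this sketch the tree supplies coarse structure and the KNN step refines the estimate locally. A real model would also need feature scaling before computing distances, and workload features that capture caching behaviour, which the abstract reports as important.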