DataTalk-V: time series visualisation for internet of things based on clustering and dimension reduction for an IoT platform

Jiun-Yi Lin, Yun-Wei Lin, Yi-Bing Lin
DOI: https://doi.org/10.1504/ijsnet.2024.136688
2024-02-18
International Journal of Sensor Networks
Abstract: Making sense of ever-growing collections of time-series data is challenging. Data mining approaches have been developed to process and analyse such data and extract valuable insights from it, and dimension reduction (DR) is commonly employed for this purpose. Selecting appropriate hyperparameter values and measuring visualisation quality are critical to ensuring that a DR visualisation is useful. To enhance DR further, we propose integrating it with pseudo labels generated by clustering techniques. This paper designs DataTalk Visualisation (DataTalk-V), an algorithm for visualising time-series data. DataTalk-V automatically performs clustering and selects hyperparameters for the DR method on high-dimensional data, producing two-dimensional data. DataTalk-V is built on IoTtalk, an IoT application development platform, and uses a cost function in Bayesian optimisation to optimise the DR hyperparameters. We demonstrate that the two-dimensional data produced by DataTalk-V not only facilitates data visualisation but also enhances the prediction accuracy of the k-nearest neighbours (k-NN) algorithm. We also apply the DR model generated by DataTalk-V to analyse the sensitivity of features from soil samples, where it successfully predicts the correlation of these features with their respective machine learning models.
computer science, information systems, telecommunications
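
The pipeline outlined in the abstract (pseudo labels from clustering, then a cost-driven choice of DR settings) can be illustrated with a minimal sketch. This is not the DataTalk-V implementation; everything in it is an assumption made for illustration. A tiny k-means produces the pseudo labels, a Gaussian random projection stands in for the DR method with its seed as the only tunable setting, the cost is assumed to be the leave-one-out 1-NN error against the pseudo labels, and a plain exhaustive search over candidate seeds replaces Bayesian optimisation. All function names are hypothetical.

```python
import math
import random


def kmeans_labels(points, k=2, iters=20):
    """Tiny k-means (illustrative); returns one pseudo label per point."""
    # Deterministic init: first point, then the point farthest from the chosen centres.
    centres = [points[0]]
    while len(centres) < k:
        centres.append(max(points, key=lambda p: min(math.dist(p, c) for c in centres)))
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest centre, then recompute the centres.
        labels = [min(range(k), key=lambda j: math.dist(p, centres[j])) for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centres[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def project_2d(points, seed):
    """Gaussian random projection to 2-D; `seed` is the assumed tunable setting."""
    rng = random.Random(seed)
    dim = len(points[0])
    axes = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(2)]
    return [tuple(sum(a * x for a, x in zip(axis, p)) for axis in axes) for p in points]


def loo_1nn_error(embedded, labels):
    """Assumed cost: leave-one-out 1-NN error rate on the 2-D points."""
    wrong = 0
    for i, p in enumerate(embedded):
        nearest = min((j for j in range(len(embedded)) if j != i),
                      key=lambda j: math.dist(p, embedded[j]))
        wrong += labels[nearest] != labels[i]
    return wrong / len(embedded)


def select_projection(points, candidate_seeds, k=2):
    """Cluster first, then keep the candidate setting with the lowest cost."""
    labels = kmeans_labels(points, k)
    # Exhaustive search here; the abstract describes Bayesian optimisation instead.
    best_seed = min(candidate_seeds,
                    key=lambda s: loo_1nn_error(project_2d(points, s), labels))
    return best_seed, project_2d(points, best_seed), labels


# Synthetic 4-D data: two well-separated groups of ten points each.
rng = random.Random(0)
points = [[c + rng.gauss(0, 0.1) for _ in range(4)] for c in (0.0,) * 10 + (5.0,) * 10]
best_seed, embedded, labels = select_projection(points, candidate_seeds=range(10))
```

Because the cost is computed against pseudo labels, no ground-truth annotation is needed to compare candidate settings. In a realistic setting the exhaustive loop would be replaced by an optimiser that proposes new hyperparameter values based on previously observed costs.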