Significance and methodology: Preprocessing the big data for machine learning on TBM performance

Hao-Han Xiao, Wen-Kun Yang, Jing Hu, Yun-Pei Zhang, Liu-Jie Jing, Zu-Yu Chen
DOI: https://doi.org/10.1016/j.undsp.2021.12.003
IF: 5.327
2022-01-01
Underground Space
Abstract: This paper addresses the significance of preprocessing big data collected during tunnel boring machine (TBM) excavation before it is used for machine learning on various TBM performance predictions. The research work is based on two water diversion tunneling projects that cover 29.52 km and 17 051 boring cycles. It has been found that the penetration rate calculated from the raw measured penetration distances exhibits more random behavior owing to the percussive and vibratory behavior of the cutterhead. A moving average method to process the negative instantaneous velocities and a noise reduction filter to deal with signals with abnormal frequencies have been recommended. An index called the drilling efficiency index is introduced to assess the relationships between the mechanical parameters in a boring cycle, whose linear regression coefficient R² is taken for a preliminary investigation of possible problems requiring preprocessing. The research work defines irrelevant data as data whose errors are caused by human or mechanical mistakes and which therefore should be cleaned or amended. These irrelevant data can be divided into five categories: (1) premature cycles, (2) sensor defects, (3) mechanical defects, (4) human interruption, and (5) missing files. A program, TBM-Processing, has been coded for the recognition and classification of these categories. PDF books generated by the program have been uploaded to GitHub to encourage discussions, collaboration, and upgrading of the data processing work with our peers.
engineering, civil
What problem does this paper attempt to address?
This paper addresses how to preprocess the big data collected during tunnel boring machine (TBM) excavation before it is used for machine learning predictions of TBM performance. Using data from two water diversion tunnel projects, the study examines problems in the raw data, namely the random behavior of the calculated penetration rate, negative instantaneous velocities, and signals with abnormal frequencies, and proposes corresponding processing methods: a moving average method and a noise reduction filter. It also introduces a new index, the drilling efficiency index, to evaluate the relationships between mechanical parameters within a boring cycle. Finally, the paper defines irrelevant data caused by human or mechanical errors and proposes methods for recognizing and classifying them. In summary, the study aims to lay the foundation for machine learning applications by providing a clean, high-quality database.
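As an illustration of two of the steps named above, the following minimal Python sketch smooths instantaneous velocities with a moving average and computes the R² of a simple linear fit between two parameter series. It is not the authors' TBM-Processing program: the function names, the centered window that shrinks at the edges, the window length of 3, and the sample displacement values are all assumptions made for the example. The summary above does not give the definition of the drilling efficiency index, so `r_squared` is shown only for a generic pair of series.

```python
def instantaneous_velocity(displacement, dt):
    """Finite-difference velocity between consecutive displacement samples."""
    return [(b - a) / dt for a, b in zip(displacement, displacement[1:])]


def moving_average(values, window):
    """Centered moving average; the window shrinks near both ends (assumption)."""
    if window < 1:
        raise ValueError("window must be at least 1")
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def r_squared(x, y):
    """R^2 of a least-squares line y = a + b*x (squared Pearson correlation)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        return 0.0  # a constant series carries no linear relationship
    return sxy ** 2 / (sxx * syy)


# Hypothetical penetration distances (mm) sampled once per second; the
# back-and-forth pattern mimics cutterhead vibration.
displacement = [0.0, 2.0, 1.5, 4.0, 3.5, 6.0, 5.5, 8.0]
raw_velocity = instantaneous_velocity(displacement, dt=1.0)
smooth_velocity = moving_average(raw_velocity, window=3)
```

With this sample, the raw velocities contain negative values while the smoothed series does not. On real data the window length would need tuning against the sampling rate and the vibration period.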