Predicting possible recommendations related to causes and consequences in the HAZOP study worksheet using natural language processing and machine learning: BERT, clustering, and classification

Ali Ekramipooya, Mehrdad Boroushaki, Davood Rashtchian
DOI: https://doi.org/10.1016/j.jlp.2024.105310
IF: 3.916
2024-04-03
Journal of Loss Prevention in the Process Industries
Abstract: A set of recommendations is one of the most valuable outputs of the hazard and operability (HAZOP) study. The HAZOP study team provides recommendations when deficiencies are detected in the chemical process plant. These deficiencies can cause chemical process accidents and operability issues. This study employed a data-driven approach using natural language processing (NLP) and machine learning (ML) to predict potential recommendations based on causes and consequences. The dataset was unlabeled; thus, clustering was used to label it. First, bidirectional encoder representations from transformers (BERT) converted recommendation sentences into vectors. Second, uniform manifold approximation and projection (UMAP) and hierarchical density-based spatial clustering of applications with noise (HDBSCAN) were used to determine recommendation categories and label the dataset. Then, BERT was used to convert causes and consequences into vectors. Finally, a multi-layer perceptron (MLP) classifier was employed to predict possible recommendations based on causes and consequences. The class imbalance problem was handled by random over-sampling. The prediction accuracy for possible recommendations was 93.7% based on causes and 89.5% based on consequences. Predicting potential recommendations from causes and consequences helps ensure that major recommendations are not overlooked during the HAZOP study. This can further expand NLP and ML applications in HAZOP study automation.
engineering, chemical
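The abstract states that class imbalance was handled by random over-sampling before classification. A minimal sketch of that one step follows, using only the Python standard library. The function name `random_oversample`, the toy feature vectors, and the category names are illustrative assumptions, not the authors' code or data; in the study the features would be BERT vectors of causes or consequences and the labels would come from the HDBSCAN clusters.

```python
import random
from collections import Counter, defaultdict


def random_oversample(features, labels, seed=0):
    """Duplicate randomly chosen minority-class samples until every
    class has as many samples as the largest class."""
    rng = random.Random(seed)

    # Group the samples by their class label.
    by_class = defaultdict(list)
    for x, y in zip(features, labels):
        by_class[y].append(x)

    # Every class is padded up to the size of the largest class.
    target = max(len(items) for items in by_class.values())

    out_x, out_y = [], []
    for label, items in by_class.items():
        # Draw the missing samples with replacement from the same class.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for x in items + extra:
            out_x.append(x)
            out_y.append(label)
    return out_x, out_y


# Toy stand-ins (made up) for sentence embeddings and cluster-derived labels.
X = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.9, 0.8], [0.8, 0.9], [0.5, 0.5]]
y = ["add alarm", "add alarm", "add alarm",
     "revise procedure", "revise procedure", "install valve"]

X_bal, y_bal = random_oversample(X, y)
print(Counter(y_bal))
```

After balancing, each of the three toy categories has three samples. Over-sampling of this kind is usually applied to the training split only, so that duplicated samples do not leak into the evaluation data; the abstract does not say how the authors split their dataset.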