Text mining and natural language processing in construction

Alireza Shamshiri, Kyeong Rok Ryu, June Young Park
DOI: https://doi.org/10.1016/j.autcon.2023.105200
IF: 10.3
2024-02-01
Automation in Construction
Abstract: Text mining (TM) and natural language processing (NLP) have stirred interest within the construction field, as they offer enhanced capabilities for managing and analyzing text-based information. This highlights the need for a systematic review to identify the status quo, gaps, and future directions from the perspective of construction management. A review was conducted by aligning the objectives of 205 publications with the specific domains, areas, tasks, and processes outlined in construction management practices. This review reveals multiple facets of the construction sector empowered by TM/NLP approaches and highlights essential voids demanding consideration for automation possibilities and minimizing manual tasks. Ultimately, following identified obstacles, the review results indicate potential research opportunities: (1) strengthening overlooked construction aspects, (2) coupling diverse data formats, and (3) leveraging pre-trained language models and reinforcement learning. The findings will provide vital insights, fostering further progress in TM/NLP research and its applications in academia and industry.
construction & building technology; engineering, civil
What problem does this paper attempt to address?
This paper reviews how text mining (TM) and natural language processing (NLP) are applied in construction, and examines the current status, gaps, and future directions of these technologies from a construction management perspective. Through a systematic review of 205 publications, it maps TM/NLP applications to the domains, areas, tasks, and processes of construction management and identifies three research opportunities:

1. **Strengthening Neglected Aspects of Construction**: Many aspects of construction have not been fully explored, and further research is needed to increase automation and reduce manual work.
2. **Integrating Diverse Data Formats**: Current research mostly focuses on a single type of data source, whereas construction projects typically involve several formats (such as emails, drawings, and contracts). Integrating these different types of data is therefore a key challenge.
3. **Utilizing Pre-trained Language Models and Reinforcement Learning**: Existing research has demonstrated the potential of pre-trained models in certain tasks, but their application in construction management remains limited, especially in combination with reinforcement learning to optimize decision-making.

Based on this analysis, the paper outlines future research opportunities and offers insights for academia and industry to advance TM/NLP in construction management. In areas such as cost management, schedule management, quality management, and advanced work packaging, existing research has made some progress, but many issues remain unresolved and call for new methods and technologies.