An Intelligent Industrial Visual Monitoring and Maintenance Framework Empowered by Large-Scale Visual and Language Models

F. Tsung, Yan-Fu Li, Chenxi Li, Huan Wang
DOI: https://doi.org/10.1109/TICPS.2024.3414292
Abstract: Industrial visual monitoring (IVM) is crucial for operation and maintenance, and artificial intelligence (AI) has excelled in this domain. As a major breakthrough in AI, large models are set to reshape IVM by advancing comprehensive automation and intelligence. This paper proposes an intelligent IVM and maintenance framework (IVMMF) empowered by large-scale visual and language models. First, the proposed large-scale visual model comprehensively understands industrial images, providing accurate defect identification and descriptions. Next, a large language model built on local knowledge bases is proposed to understand technical knowledge in specific fields, provide professional suggestions for engineers, and enable intelligent information interaction between the system and engineers. IVMMF thus makes the entire process intelligent, including industrial image understanding, text dialogue, maintenance suggestions, and information communication. Finally, we construct a large-scale image-text IVM dataset, and experiments demonstrate the exceptional performance of IVMMF, indicating its potential to transform the application paradigm in IVM.
Computer Science, Engineering