A hybrid transformer with domain adaptation using interpretability techniques for the application to the detection of risk situations

Rupayan Mallick, Jenny Benois-Pineau, Akka Zemmari, Kamel Guerda, Boris Mansencal, Helene Amieva, Laura Middleton
DOI: https://doi.org/10.1007/s11042-024-18687-x
IF: 2.577
2024-03-12
Multimedia Tools and Applications
Abstract: Multimedia approaches are strongly required in multi-modal data processing for the detection and recognition of specific events in the data. Hybrid architectures with time series and image/video inputs in the framework of twin CNNs have shown increased performances compared to mono-modal approaches. Pre-trained models have been used in transfer learning to fine-tune the last few layers in the network. This often leads to distribution shifts in the domain. In a real-world scenario, the distribution shifts between the source and target domains can yield poor classification results. With interpretable techniques used in deep neural networks, important features can be highlighted not only for trained models but also reinforced in the training process. Hence the initialization of the target domain model can be performed with improved weights. During data transfer between datasets, the dimensions of the data are also different. We propose a method for model transfer with the adaptation of data dimension and improved initialization with interpretability approaches.
computer science, information systems, theory & methods, engineering, electrical & electronic, software engineering