Transformer-Based Domain Adaptation for Event Data Classification

Junwei Zhao, Shiliang Zhang, Tiejun Huang
DOI: https://doi.org/10.1109/icassp43922.2022.9747832
2022-01-01
Abstract: Event cameras encode changes in brightness as events, unlike conventional frame cameras. This working principle gives them greater potential in high-speed applications. However, the scarcity of labeled event data limits the use of such cameras in deep learning frameworks, making it appealing to study more efficient deep learning algorithms and architectures. This paper devises the Convolutional Transformer Network (CTN) for processing event data. The CTN combines the advantages of convolutional networks and transformers, showing stronger capability in event-based classification tasks than existing models. To address the shortage of annotated event data, we propose to train the CTN via a source-free Unsupervised Domain Adaptation (UDA) algorithm that leverages large-scale labeled image data. Extensive experiments verify the effectiveness of the UDA algorithm. Our CTN also outperforms recent state-of-the-art methods on event-based classification tasks, suggesting that it is an effective model for this task. To the best of our knowledge, this is an early attempt to employ vision transformers with a source-free UDA algorithm to process event data.