Behavior Recognition in Mouse Videos Using Contextual Features Encoded by Spatial-temporal Stacked Fisher Vectors.

Zheheng Jiang, Danny Crookes, Brian Desmond Green, Shengping Zhang, Huiyu Zhou
DOI: https://doi.org/10.5220/0006244602590269
2017-01-01
Abstract: Manual measurement of mouse behavior is highly labor intensive and prone to error. This investigation aims to recognize individual mouse behaviors efficiently and accurately, both in short action videos and in continuous videos. In our system, each mouse action video is represented as a set of interest points. We extract both appearance and contextual features from the interest points collected from the training datasets, and then learn two Gaussian Mixture Model (GMM) dictionaries, one for the appearance (visual) features and one for the contextual features. Our spatial-temporal stacked Fisher Vector (FV) uses the two GMM dictionaries to represent each mouse action video. A neural network classifies each mouse action and is then applied to annotate continuous videos. The novelty of our approach is threefold: (i) our method exploits contextual features from spatio-temporal interest points, leading to enhanced performance, (ii) we encode contextual features and then fuse them with appearance features, and (iii) location information of a mouse is extracted from spatio-temporal interest points to support mouse behavior recognition. We evaluate our method on the database of Jhuang et al. [7], and the results show that it outperforms several state-of-the-art approaches.
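
To make the encoding step concrete, the following is a minimal sketch of a standard (improved) Fisher Vector computed from a diagonal-covariance GMM, followed by fusion of an appearance FV and a contextual FV by concatenation. It is not the authors' spatial-temporal stacked FV, whose stacking scheme the abstract does not detail. The function names, the toy GMM parameters, the toy descriptors, and the choice of concatenation as the fusion rule are all illustrative assumptions, and the GMMs are assumed to have been fitted beforehand. Only the Python standard library is used so that the example stays self-contained.

```python
import math


def gmm_posteriors(x, weights, means, sigmas):
    """Soft-assign one descriptor x to each diagonal-covariance Gaussian."""
    log_p = []
    for w, mu, sd in zip(weights, means, sigmas):
        ll = math.log(w)
        for xi, m, s in zip(x, mu, sd):
            ll += -0.5 * math.log(2 * math.pi * s * s) - 0.5 * ((xi - m) / s) ** 2
        log_p.append(ll)
    top = max(log_p)  # subtract the max for numerical stability
    exp_p = [math.exp(v - top) for v in log_p]
    total = sum(exp_p)
    return [v / total for v in exp_p]


def fisher_vector(descriptors, weights, means, sigmas):
    """Standard improved Fisher Vector: gradients w.r.t. GMM means and
    standard deviations, then power and L2 normalization."""
    T, K, D = len(descriptors), len(weights), len(means[0])
    g_mu = [[0.0] * D for _ in range(K)]
    g_sd = [[0.0] * D for _ in range(K)]
    for x in descriptors:
        gamma = gmm_posteriors(x, weights, means, sigmas)
        for k in range(K):
            for d in range(D):
                u = (x[d] - means[k][d]) / sigmas[k][d]
                g_mu[k][d] += gamma[k] * u
                g_sd[k][d] += gamma[k] * (u * u - 1.0)
    fv = []
    for k in range(K):
        fv += [v / (T * math.sqrt(weights[k])) for v in g_mu[k]]
    for k in range(K):
        fv += [v / (T * math.sqrt(2.0 * weights[k])) for v in g_sd[k]]
    # Power normalization (signed square root), then L2 normalization.
    fv = [math.copysign(math.sqrt(abs(v)), v) for v in fv]
    norm = math.sqrt(sum(v * v for v in fv))
    return [v / norm for v in fv] if norm > 0 else fv


# Toy, hypothetical GMM "dictionaries" (assumed already fitted).
appearance_gmm = dict(weights=[0.5, 0.5],
                      means=[[0.0, 0.0], [1.0, 1.0]],
                      sigmas=[[1.0, 1.0], [1.0, 1.0]])
contextual_gmm = dict(weights=[0.3, 0.7],
                      means=[[0.0], [2.0]],
                      sigmas=[[1.0], [0.5]])

# Toy descriptors for the interest points of one action video.
appearance_desc = [[0.1, -0.2], [0.9, 1.3], [0.4, 0.5]]
contextual_desc = [[0.2], [1.8], [2.1]]

appearance_fv = fisher_vector(appearance_desc, **appearance_gmm)
contextual_fv = fisher_vector(contextual_desc, **contextual_gmm)

# Assumed fusion rule: simple concatenation of the two encodings.
video_repr = appearance_fv + contextual_fv
```

Each Fisher Vector has length 2KD for K Gaussians and D-dimensional descriptors, so `video_repr` here has 8 + 4 = 12 entries. In a full pipeline, a vector of this kind would be the input to the classifier; the abstract states that a neural network fills that role.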