Abstract: Machine learning‐based behaviour classification using acceleration data is a powerful tool in bio‐logging research. Deep learning architectures such as convolutional neural networks (CNN), long short‐term memory (LSTM) and self‐attention mechanisms, as well as related training techniques, have been extensively studied in human activity recognition. However, they have rarely been used in wild animal studies. The main challenges of acceleration‐based wild animal behaviour classification include data shortages, class imbalance, noise arising from differences in individual behaviour and in where the loggers were attached, and data complexity arising from complex animal‐specific behaviours; these challenges may have limited the application of deep learning techniques in this area. To overcome them, we explored the effectiveness of three techniques for efficient model training (data augmentation, manifold mixup and pre‐training of deep learning models with unlabelled data), using datasets from two species of wild seabirds and state‐of‐the‐art deep learning model architectures. Data augmentation improved overall model performance when one of several options (none, scaling, jittering, permutation, time‐warping or rotation) was randomly applied to each data sample during mini‐batch training. Manifold mixup also improved model performance, but not as much as random data augmentation. Pre‐training with unlabelled data did not improve model performance. The state‐of‐the‐art deep learning models, including a model consisting of four CNN layers, an LSTM layer and a multi‐head attention layer, as well as a modified version of it with a shortcut connection, outperformed the other models compared. Using only raw acceleration data as inputs, these models outperformed classic machine learning approaches that used 119 handcrafted features.
Our experiments showed that deep learning techniques are promising for acceleration‐based behaviour classification of wild animals and highlighted some challenges (e.g. effective use of unlabelled data). There is scope for greater exploration of deep learning techniques in wild animal studies (e.g. advanced data augmentation, multimodal sensor data use, transfer learning and self‐supervised learning). We hope that this study will stimulate the development of deep learning techniques for wild animal behaviour classification using time‐series sensor data.
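
The random data augmentation described above (one of none, scaling, jittering, permutation, time‐warping or rotation, chosen at random for each sample in a mini‐batch) might be sketched as follows. This is a minimal illustration rather than the study's implementation: the window shape `(time, 3)`, the function names, and all parameter values (`sigma`, `n_segments`, `n_knots`) are assumptions chosen for the example.

```python
import numpy as np

def scaling(x, rng, sigma=0.1):
    # Multiply each axis by a random factor close to 1 (sigma is an assumed value).
    return x * rng.normal(1.0, sigma, size=(1, x.shape[1]))

def jittering(x, rng, sigma=0.05):
    # Add independent Gaussian noise to every sample (sigma is an assumed value).
    return x + rng.normal(0.0, sigma, size=x.shape)

def permutation(x, rng, n_segments=4):
    # Split the window along time and shuffle the segment order.
    segments = np.array_split(x, n_segments, axis=0)
    order = rng.permutation(n_segments)
    return np.concatenate([segments[i] for i in order], axis=0)

def time_warping(x, rng, sigma=0.2, n_knots=4):
    # Distort the time axis with a smooth random speed curve, then resample
    # back onto the original uniform time grid.
    t = x.shape[0]
    knot_pos = np.linspace(0, t - 1, n_knots + 2)
    knot_speed = np.abs(rng.normal(1.0, sigma, size=n_knots + 2)) + 1e-3
    speed = np.interp(np.arange(t), knot_pos, knot_speed)
    warped = np.cumsum(speed)  # strictly increasing because speed > 0
    warped = (warped - warped[0]) / (warped[-1] - warped[0]) * (t - 1)
    return np.stack(
        [np.interp(np.arange(t), warped, x[:, c]) for c in range(x.shape[1])],
        axis=1,
    )

def rotation(x, rng):
    # Apply a random 3D rotation (Rodrigues' formula) to tri-axial data,
    # mimicking differences in logger attachment orientation.
    axis = rng.normal(size=3)
    axis /= np.linalg.norm(axis)
    angle = rng.uniform(-np.pi, np.pi)
    k = np.array([[0.0, -axis[2], axis[1]],
                  [axis[2], 0.0, -axis[0]],
                  [-axis[1], axis[0], 0.0]])
    r = np.eye(3) + np.sin(angle) * k + (1.0 - np.cos(angle)) * (k @ k)
    return x @ r.T

def identity(x, rng):
    # The "none" option: leave the window unchanged.
    return x

AUGMENTATIONS = [identity, scaling, jittering, permutation, time_warping, rotation]

def random_augment(batch, rng):
    # batch has shape (batch_size, time, 3); one technique is drawn per sample.
    out = np.empty_like(batch)
    for i, window in enumerate(batch):
        augment = AUGMENTATIONS[rng.integers(len(AUGMENTATIONS))]
        out[i] = augment(window, rng)
    return out

rng = np.random.default_rng(0)
batch = rng.normal(size=(8, 50, 3))  # synthetic stand-in for acceleration windows
augmented = random_augment(batch, rng)
```

In a training loop, `random_augment` would be called on each mini‐batch before the forward pass, so that the model sees a differently perturbed version of each window in every epoch.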

Exploring deep learning techniques for wild animal behaviour classification using animal‐borne accelerometers
