AnimalFormer: Multimodal Vision Framework for Behavior-based Precision Livestock Farming

Ahmed Qazi, Taha Razzaq, Asim Iqbal
2024-06-14
Abstract: We introduce a multimodal vision framework for precision livestock farming, harnessing the power of GroundingDINO, HQSAM, and ViTPose models. This integrated suite enables comprehensive behavioral analytics from video data without invasive animal tagging. GroundingDINO generates accurate bounding boxes around livestock, while HQSAM segments individual animals within these boxes. ViTPose estimates key body points, facilitating posture and movement analysis. Demonstrated on a sheep dataset with grazing, running, sitting, standing, and walking activities, our framework extracts invaluable insights: activity and grazing patterns, interaction dynamics, and detailed postural evaluations. Applicable across species and video resolutions, this framework revolutionizes non-invasive livestock monitoring for activity detection, counting, health assessments, and posture analyses. It empowers data-driven farm management, optimizing animal welfare and productivity through AI-powered behavioral understanding.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper addresses the limitations of current methods for monitoring farm animal behavior, such as direct observation and physiological measurement, which are invasive, subjective, and impractical at scale. It proposes AnimalFormer, a multimodal vision framework that combines GroundingDINO, HQSAM, and ViTPose for non-invasive analysis of livestock behavior from video. GroundingDINO produces bounding boxes around the animals, HQSAM segments each individual within its box, and ViTPose estimates key body points for posture and movement analysis. Demonstrated on a sheep dataset, the framework identifies activities (grazing, running, sitting, standing, and walking) and analyzes activity and grazing patterns, interaction dynamics, and posture. According to the paper, it applies across species and video resolutions and supports activity detection, counting, health assessment, and posture analysis. Because monitoring requires no tagging or handling, it reduces stress on the animals and observer bias in the data, yielding more reliable welfare data to inform farm management, animal welfare, and productivity.
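As a rough illustration of the detect, segment, and pose sequence described in the abstract, the sketch below chains three stages per frame and then summarizes activity shares across frames. It is not the authors' implementation: the callables `detect`, `segment`, `estimate_pose`, and `classify` are hypothetical placeholders standing in for GroundingDINO, HQSAM, ViTPose, and an activity classifier, and the assumption that activity is classified from keypoints is ours, not something the summary states.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)
Keypoint = Tuple[float, float]           # (x, y) in pixel coordinates


@dataclass
class AnimalObservation:
    """Everything extracted for one animal in one frame."""
    box: Box
    mask: Any                  # segmentation mask (format depends on the model)
    keypoints: List[Keypoint]
    activity: str              # e.g. "grazing", "running", "sitting"


def analyze_frame(
    frame: Any,
    detect: Callable[[Any], Sequence[Box]],             # placeholder detector
    segment: Callable[[Any, Box], Any],                 # placeholder segmenter
    estimate_pose: Callable[[Any, Box], List[Keypoint]],  # placeholder pose model
    classify: Callable[[List[Keypoint]], str],          # assumed classifier
) -> List[AnimalObservation]:
    """Run the three stages on one frame, one observation per detected animal."""
    observations = []
    for box in detect(frame):
        mask = segment(frame, box)            # segment the animal inside its box
        keypoints = estimate_pose(frame, box)  # key body points for that animal
        observations.append(
            AnimalObservation(box, mask, keypoints, classify(keypoints))
        )
    return observations


def activity_budget(
    per_frame: Sequence[Sequence[AnimalObservation]],
) -> Dict[str, float]:
    """Fraction of all observations spent in each activity."""
    counts = Counter(obs.activity for frame in per_frame for obs in frame)
    total = sum(counts.values())
    return {activity: n / total for activity, n in counts.items()} if total else {}
```

In this sketch the number of animals in a frame is simply `len(analyze_frame(...))`, and `activity_budget` gives a coarse view of activity patterns over a clip. Tracking individual animals across frames would need an additional association step that is not shown here.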