Towards Lexical Analysis of Dog Vocalizations via Online Videos

Yufei Wang, Chunhao Zhang, Jieyi Huang, Mengyue Wu, Kenny Zhu

2023-09-22

Abstract: Deciphering the semantics of animal language has been a grand challenge. This study presents a data-driven investigation into the semantics of dog vocalizations by correlating different sound types with consistent semantics. We first present a new dataset of Shiba Inu sounds, along with contextual information such as location and activity, collected from YouTube with a well-constructed pipeline. The framework is also applicable to other animal species. Based on an analysis of the conditional probability between dog vocalizations and their corresponding location and activity, we discover supporting evidence for previous heuristic research on the semantic meaning of various dog sounds. For instance, growls can signify interactions. Furthermore, our study yields new insights: existing word types can be subdivided into finer-grained subtypes, and the minimal semantic unit for Shiba Inu is word-related. For example, whimper can be subdivided into two types, attention-seeking and discomfort.

Sound, Computation and Language, Machine Learning, Audio and Speech Processing

What problem does this paper attempt to address?

This paper attempts to understand the lexical semantics of canine language and to identify its minimal semantic units. Specifically, the researchers analyze the types of Shiba Inu vocalizations produced in different situations to reveal the meanings those vocalizations carry. The goals of the paper include:

1. **Determine whether canines use consistent vocal patterns to express specific meanings**: The researchers want to know whether dogs use specific vocal patterns to convey specific information across different scenarios.
2. **Calculate the correlation between vocal expressions and the factors that may give rise to different meanings**: This involves quantifying and analyzing the relationship between vocalizations and contextual factors (such as location and activity) to reveal the meanings behind the vocalizations.

To answer these questions, the researchers address the following technical challenges:

- **Classification of vocalization types**: Define a "word" as an independent, continuous segment of canine vocalization, usually lasting about 1 second, and segment words by detecting the transitions between silent frames and dog-voice frames in the audio.
- **Extraction of context information**: Define a diverse and comprehensive list of locations and activities, and use corresponding extraction methods to obtain the context (location and activity) in which each vocal segment occurs.

Using these methods, the researchers construct a large-scale, timestamp-aligned dataset of <word, sub-word, location, activity> quadruples for in-depth analysis of the lexical semantics of canine language and its minimal semantic units. The research provides new insights into canine language and an extensible data-processing framework for similar research in the future.
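The word segmentation and the vocalization-context analysis described above can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: it assumes a per-frame dog-voice decision is already available as a list of booleans (how those decisions are produced is not covered here), and it assumes the quantity of interest is P(context | word) estimated by simple counting. The function names `segment_words` and `conditional_probs` and the `frame_sec` parameter are hypothetical.

```python
from collections import Counter, defaultdict


def segment_words(is_dog_frame, frame_sec=0.01):
    """Return (start_sec, end_sec) spans of consecutive dog-voice frames.

    `is_dog_frame` is an assumed input: one boolean per audio frame,
    True where the frame contains dog voice, False where it is silent.
    """
    words = []
    start = None
    for i, voiced in enumerate(is_dog_frame):
        if voiced and start is None:
            start = i  # transition: silent -> dog voice
        elif not voiced and start is not None:
            words.append((start * frame_sec, i * frame_sec))  # dog voice -> silent
            start = None
    if start is not None:  # the audio ended in the middle of a word
        words.append((start * frame_sec, len(is_dog_frame) * frame_sec))
    return words


def conditional_probs(pairs):
    """Estimate P(context | word) from (word, context) pairs by counting."""
    counts = defaultdict(Counter)
    for word, context in pairs:
        counts[word][context] += 1
    return {
        word: {ctx: n / sum(ctx_counts.values()) for ctx, n in ctx_counts.items()}
        for word, ctx_counts in counts.items()
    }


if __name__ == "__main__":
    # Toy frame decisions and toy labels, invented purely for illustration.
    frames = [False, True, True, False, False, True]
    print(segment_words(frames, frame_sec=0.5))

    observations = [
        ("growl", "play"),
        ("growl", "play"),
        ("growl", "eat"),
        ("whimper", "alone"),
    ]
    print(conditional_probs(observations))
```

In such a sketch, the context label could be a location, an activity, or a (location, activity) pair; a word type whose probability mass concentrates on a few contexts would be a candidate for carrying a consistent meaning.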

Does My Dog ''Speak'' Like Me? The Acoustic Correlation between Pet Dogs and Their Human Owners

Phonetic and Lexical Discovery of a Canine Language using HuBERT

Towards Dog Bark Decoding: Leveraging Human Speech Processing for Automated Bark Classification

Silent Signals, Loud Impact: LLMs for Word-Sense Disambiguation of Coded Dog Whistles

DogChat: A Pet-centered Smart Collar Prototype Based on Large Language Models and Wechat

Automated Call Detection for Acoustic Surveys with Structured Calls of Varying Length

Animal speech and singing synthesis model based on So-VITS-SVC

Sound identification of abnormal pig vocalizations: Enhancing livestock welfare monitoring on smart farms

Animal cognition: Dogs build semantic expectations between spoken words and objects

Inferring Emotions from Large-Scale Internet Voice Data

ARBUR, a machine learning-based analysis system for relating behaviors and ultrasonic vocalizations of rats

GPT-4o: Visual perception performance of multimodal large language models in piglet activity understanding

Feature Representations for Automatic Meerkat Vocalization Classification

The Dog Soundscape: Recurrence, Emotional Impact, Acoustics, and Implications for Dog Observations and Dog–Human Interactions

DogFLW: Dog Facial Landmarks in the Wild Dataset

Advanced Framework for Animal Sound Classification With Features Optimization

Lateralized behavior and cardiac activity of dogs in response to human emotional vocalizations

Perception of vocoded speech in domestic dogs

Identification, Analysis and Characterization of Base Units of Bird Vocal Communication: The White Spectacled Bulbul (Pycnonotus xanthopygos) as a Case Study

Behavior-Based Video Summarization System for Dog Health and Welfare Monitoring