On Scalable and Interpretable Autism Detection from Social Interaction Behavior

William Saakyan, Matthias Norden, Lola Herrmann, Simon Kirsch, Muyu Lin, Simon Guendelman, Isabel Dziobek, Hanna Drimalla
DOI: https://doi.org/10.1109/ACIIW59127.2023.10388157
2023-01-01
Abstract: Autism Spectrum Condition (ASC) is characterized by social interaction difficulties that can be challenging to assess objectively in the diagnostic process. In this paper, we evaluate whether videos of a standardized social interaction can be used to differentiate the non-verbal behaviors of individuals with and without ASC. We collected a large video dataset of 164 participants, comprising individuals with ASC (n = 83) and neurotypical individuals (n = 81), who completed the computer-based Simulated Interaction Task (SIT) in different studies, including lab and home settings. To classify individuals with and without ASC, we trained unimodal and multimodal machine learning models on modalities such as facial expressions, gaze behavior, head pose, and voice features. Our results indicate that a multimodal late fusion approach achieved the highest accuracy (74%). In the unimodal setting, classification based on facial expressions (accuracy 73%) and on voice features (accuracy 70%) was most effective. An explainability analysis of the most relevant features for the facial expression model indicated that features from all emotional parts, as well as from both the speaking and listening parts of the interaction, were informative. Based on our results, we developed a scalable online version of the SIT to collect diverse data on a large scale for the development of machine learning models that can differentiate between different clinical conditions. Our study highlights the potential of machine learning on videos of standardized social interactions to support clinical diagnosis and the objective and effective measurement of differences in social interaction behavior.
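As an illustration of the late fusion idea mentioned in the abstract, the sketch below combines the outputs of separate per-modality classifiers into a single decision. The abstract does not state the fusion rule used in the paper, so the weighted averaging of probabilities, the 0.5 threshold, the function names, and the example values are all assumptions chosen for illustration only.

```python
def late_fusion(probabilities_by_modality, weights=None):
    """Fuse per-modality probabilities by (weighted) averaging.

    Assumption: each unimodal model outputs a probability for the ASC class.
    The averaging rule is illustrative, not the rule reported in the paper.
    """
    names = sorted(probabilities_by_modality)
    if weights is None:
        # Equal weights reduce the fusion to a plain mean.
        weights = {name: 1.0 for name in names}
    total_weight = sum(weights[name] for name in names)
    weighted_sum = sum(weights[name] * probabilities_by_modality[name] for name in names)
    return weighted_sum / total_weight


def classify(probabilities_by_modality, threshold=0.5):
    """Return 1 if the fused probability reaches the (hypothetical) threshold, else 0."""
    return int(late_fusion(probabilities_by_modality) >= threshold)


# Hypothetical outputs of four unimodal classifiers for one participant.
example = {"face": 0.8, "gaze": 0.4, "head_pose": 0.5, "voice": 0.7}
fused = late_fusion(example)   # mean of the four probabilities
label = classify(example)      # fused decision for this participant
```

In this kind of scheme each modality is modeled independently and only the predictions are combined, so a modality can be added, dropped, or reweighted without retraining the others.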