For a semiotic AI: Bridging computer vision and visual semiotics for computational observation of large scale facial image archives

Lia Morra, Antonio Santangelo, Pietro Basci, Luca Piano, Fabio Garcea, Fabrizio Lamberti, Massimo Leone
DOI: https://doi.org/10.1016/j.cviu.2024.104187
2024-10-12
Abstract: Social networks are creating a digital world in which the cognitive, emotional, and pragmatic value of the imagery of human faces and bodies is arguably changing. However, researchers in the digital humanities are often ill-equipped to study these phenomena at scale. This work presents FRESCO (Face Representation in E-Societies through Computational Observation), a framework designed to explore the socio-cultural implications of images on social media platforms at scale. FRESCO deconstructs images into numerical and categorical variables using state-of-the-art computer vision techniques, aligning with the principles of visual semiotics. The framework analyzes images across three levels: the plastic level, encompassing fundamental visual features like lines and colors; the figurative level, representing specific entities or concepts; and the enunciation level, which focuses particularly on constructing the point of view of the spectator and observer. These levels are analyzed to discern deeper narrative layers within the imagery. Experimental validation confirms the reliability and utility of FRESCO, and we assess its consistency and precision across two public datasets. Subsequently, we introduce the FRESCO score, a metric derived from the framework's output that serves as a reliable measure of similarity in image content.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper addresses the problem of studying, at scale, the cognitive, emotional, and pragmatic value of images of human faces and bodies in large social media archives, by combining computer vision with visual semiotics. It introduces FRESCO (Face Representation in E-Societies through Computational Observation), a framework for exploring the socio-cultural significance of images on social media platforms. FRESCO uses state-of-the-art computer vision techniques to break images down into numerical and categorical variables, in line with the principles of visual semiotics.

The framework analyzes images at three levels:
- the plastic level, covering fundamental visual features such as lines and colors;
- the figurative level, representing specific entities or concepts;
- the enunciation level, which focuses on constructing the point of view of the spectator and observer.

Analyzing these levels together helps reveal deeper narrative layers in the images.

The main contributions of the paper are:
- Introducing the FRESCO framework, which operationalizes structured visual semiotics for studying the socio-cultural significance of social media images at scale.
- Proposing a practical implementation of the framework and validating its reliability and utility experimentally on human-centered datasets.
- Proposing the FRESCO score, a reliable and transparent similarity metric derived from the output of the FRESCO pipeline.

Through these contributions, FRESCO aims to close the gap in tools and techniques that digital humanities scholars face in large-scale image analysis, so that they can apply big data analysis and data mining methods to discover new and unexpected patterns in large visual collections.
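To make the idea of describing an image by numerical and categorical variables at three levels more concrete, here is a minimal Python sketch. It is purely illustrative and is not the FRESCO implementation or the definition of the FRESCO score, neither of which is given in this summary. The class `ImageDescriptor`, the variable names (`brightness`, `dominant_hue`, `smiling`, `gaze`, `shot_scale`), the assumption that numerical variables are normalized to [0, 1], and the averaging scheme in `toy_similarity` are all hypothetical choices made for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, Union

# A variable is either numerical (assumed normalized to [0, 1]) or categorical.
Value = Union[float, str]


@dataclass
class ImageDescriptor:
    """Hypothetical per-image record: one dict of variables per semiotic level."""
    plastic: Dict[str, Value] = field(default_factory=dict)
    figurative: Dict[str, Value] = field(default_factory=dict)
    enunciation: Dict[str, Value] = field(default_factory=dict)


def variable_similarity(a: Value, b: Value) -> float:
    """Exact match for categorical values, 1 - |a - b| for numerical ones."""
    if isinstance(a, str) or isinstance(b, str):
        return 1.0 if a == b else 0.0
    return 1.0 - min(1.0, abs(a - b))


def level_similarity(x: Dict[str, Value], y: Dict[str, Value]) -> float:
    """Mean similarity over the variables that both images define."""
    shared = x.keys() & y.keys()
    if not shared:
        return 0.0
    return sum(variable_similarity(x[k], y[k]) for k in shared) / len(shared)


def toy_similarity(p: ImageDescriptor, q: ImageDescriptor) -> float:
    """Unweighted mean of the three per-level similarities (an assumption)."""
    levels = [
        (p.plastic, q.plastic),
        (p.figurative, q.figurative),
        (p.enunciation, q.enunciation),
    ]
    return sum(level_similarity(x, y) for x, y in levels) / len(levels)


# Two made-up descriptors that differ in brightness and gaze direction.
a = ImageDescriptor(
    plastic={"brightness": 0.8, "dominant_hue": "warm"},
    figurative={"smiling": "yes"},
    enunciation={"gaze": "toward_camera", "shot_scale": "close_up"},
)
b = ImageDescriptor(
    plastic={"brightness": 0.6, "dominant_hue": "warm"},
    figurative={"smiling": "yes"},
    enunciation={"gaze": "away", "shot_scale": "close_up"},
)

print(round(toy_similarity(a, b), 3))
```

Keeping the per-level similarities separate before averaging is one way to keep such a score transparent: a low overall value can be traced back to the level, and the variable, that caused it.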