Guest editorial: web multimedia semantic inference using multi-cues

Yahong Han, Yi Yang, Xiaofang Zhou
DOI: https://doi.org/10.1007/s11280-015-0360-2
2015-01-01
World Wide Web
Abstract: With the popularity of social media applications and Web 2.0 techniques, user-generated multimedia content such as blogs, text messages, photos, videos, user click logs, and Place of Interest (POI) check-ins has become pervasive, which enables research on exploiting these data as multiple cues for web multimedia semantic inference. Web multimedia corpora are typically thought of as heterogeneous corpora consisting of data from various sources and of different modalities. This heterogeneous multimedia content provides a variety of cues for semantic inference in real-world multimedia applications. Research so far has mostly focused on mono-cue analysis of multimedia content, such as looking only at images, videos, or text, and has rarely leveraged multiple semantic cues such as the text surrounding images/videos on a web page or the click logs of users’ profiles from the same community. As such, new algorithms and models for analyzing correlations among multiple semantic cues have become one of the most active research areas in web multimedia applications. Against this background, many efforts have focused on using multiple semantic cues for web multimedia semantic inference. In particular, the different semantic cues may be temporally synchronized (e.g., video clips and corresponding audio transcripts, animations, multimedia presentations), spatially related (images embedded in text, object relationships in 3D space), semantically correlated (combined analysis of collections of videos, sets of images created by one’s social network), or otherwise click-through connected (images ...
World Wide Web (2016) 19:177–179 DOI 10.1007/s11280-015-0360-2