The 2nd International Workshop on Deep Multi-modal Generation and Retrieval

Wei Ji, Hao Fei, Yinwei Wei, Zhedong Zheng, Juncheng Li, Long Chen, Lizi Liao, Yueting Zhuang, Roger Zimmermann
DOI: https://doi.org/10.1145/3689091.3690093
2024-01-01
Abstract: Information generation (IG) and information retrieval (IR) are two representative approaches to information acquisition: content is either produced by generation or obtained by retrieval. While traditional IG and IR have achieved great success within the scope of language, the under-utilization of data sources in other modalities (e.g., text, images, audio, and video) hinders IG and IR techniques from reaching their full potential and thus limits their real-world applications. Because our world is replete with multimedia information, this workshop encourages the development of deep multimodal learning for IG and IR research. Drawing on a variety of data types and modalities, recent techniques such as DALL-E, Stable Diffusion, GPT4, and Sora have substantially advanced multimodal IG and IR. Despite the great potential shown by multimodal-empowered IG and IR, unsolved challenges and open questions remain in these directions. With this workshop, we aim to encourage further exploration of Deep Multimodal Generation and Retrieval and to provide a platform for researchers to share insights and advances in this rapidly evolving domain.