Out-of-Distribution Detection with Text-to-Image Diffusion Models.

Jinglin Tong, Longquan Dai
DOI: https://doi.org/10.1007/978-981-99-8552-4_22
2024-01-01
Abstract: Out-of-distribution detection, which identifies unexpected data that falls outside the known concepts, is essential for reliable machine learning. We present a novel method that explores the application of a text-to-image diffusion model to out-of-distribution detection. Our method is motivated by the fact that text-to-image diffusion models have shown a remarkable capability to generate high-quality images from diverse text descriptions. A text description is converted into a corresponding text embedding, which is injected into the diffusion model to guide image generation. This demonstrates that the model's internal representation contains semantic information and is strongly shaped by text concepts. This inspires us to use the diffusion model to extract image representations conditioned on suitable text embeddings. In addition, we observe that describing images directly in natural language is often vague and lacking in detail. We therefore propose an implicit captioner that generates text embeddings for the input images. Subsequently, a compression head is introduced to compress the representations, which makes them easy to compare and removes noisy information. We combine the text-to-image diffusion model, the implicit captioner, and the compression head into a single network, which we call ODDM: Out-of-distribution Detection with Text-to-Image Diffusion Models. Several experiments show that our method achieves superior performance.
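The abstract says the compressed representations are meant to be easy to compare, but it does not state how the comparison is scored. The following is a minimal sketch of one common choice, assuming the score is the Euclidean distance from a query representation to the nearest in-distribution class centroid. The function names `class_centroids` and `ood_scores`, the 8-dimensional random vectors standing in for compressed representations, and the scoring rule itself are all assumptions made for illustration, not the authors' ODDM implementation.

```python
import numpy as np


def class_centroids(features, labels):
    """Return one mean vector per class, stacked as (n_classes, dim)."""
    classes = np.unique(labels)
    return np.stack([features[labels == c].mean(axis=0) for c in classes])


def ood_scores(queries, centroids):
    """Distance from each query to its nearest centroid.

    A larger score means the query is more likely out-of-distribution.
    This scoring rule is an assumption, not taken from the paper.
    """
    # Pairwise distances with shape (n_queries, n_classes) via broadcasting.
    dists = np.linalg.norm(queries[:, None, :] - centroids[None, :, :], axis=-1)
    return dists.min(axis=1)


# Synthetic stand-ins for compressed representations (hypothetical data).
rng = np.random.default_rng(0)
train = np.concatenate([
    rng.normal(0.0, 0.1, (50, 8)) + 1.0,   # in-distribution class 0
    rng.normal(0.0, 0.1, (50, 8)) - 1.0,   # in-distribution class 1
])
labels = np.array([0] * 50 + [1] * 50)
centroids = class_centroids(train, labels)

in_query = rng.normal(0.0, 0.1, (5, 8)) + 1.0    # resembles class 0
out_query = rng.normal(0.0, 0.1, (5, 8)) + 5.0   # far from both classes

in_scores = ood_scores(in_query, centroids)
out_scores = ood_scores(out_query, centroids)
```

In practice a threshold on the score, chosen on held-out in-distribution data, would separate the two groups; here the far-away queries simply receive much larger scores than the in-distribution ones.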