Nomic Embed Vision: Expanding the Latent Space

Zach Nussbaum, Brandon Duderstadt, Andriy Mulyar
2024-06-07
Abstract: This technical report describes the training of nomic-embed-vision, a highly performant, open-code, open-weights image embedding model that shares the same latent space as nomic-embed-text. Together, nomic-embed-vision and nomic-embed-text form the first unified latent space to achieve high performance across vision, language, and multimodal tasks.
Computer Vision and Pattern Recognition, Artificial Intelligence