Towards Multi-Modal DBMSs for Seamless Querying of Texts and Tables

Matthias Urban, Carsten Binnig

2023-04-28

Abstract: In this paper, we propose Multi-Modal Databases (MMDBs), which is a new class of database systems that can seamlessly query text and tables using SQL. To enable seamless querying of textual data using SQL in an MMDB, we propose to extend relational databases with so-called multi-modal operators (MMOps) which are based on the advances of recent large language models such as GPT-3. The main idea of MMOps is that they allow text collections to be treated as tables without the need to manually transform the data. As we show in our evaluation, our MMDB prototype can not only outperform state-of-the-art approaches such as text-to-table in terms of accuracy and performance but it also requires significantly less training data to fine-tune the model for an unseen text collection.

Databases, Computation and Language

What problem does this paper attempt to address?

### Problems the Paper Aims to Solve

This paper proposes a new class of database system, the Multi-Modal Database (MMDB), which can seamlessly query both text and tabular data. Specifically:

1. **Addressing the Insufficient Support for Multimodal Data in Traditional Databases**:
   - Current database systems are primarily optimized for tabular data and provide poor support for other data types (such as text and images).
   - Modern data applications often need to handle various data types, and existing relational database systems are not adept at managing these multimodal scenarios.
2. **Achieving Seamless Querying of Multimodal Data**:
   - The paper extends relational database systems with multi-modal operators (MMOps), so that text and other unstructured data can be handled directly within SQL queries.
   - By using large pre-trained language models (such as GPT-3), text data can be processed as tabular data without manual format conversion.
3. **Improving Query Efficiency and Accuracy**:
   - The paper proposes a method based on large pre-trained models to implement multi-modal operators (such as multi-modal joins).
   - According to the paper's evaluation, this method outperforms state-of-the-art approaches such as text-to-table in accuracy and performance, and it requires significantly less training data to fine-tune the model for an unseen text collection.

With these improvements, MMDB can better support the querying needs of multimodal data without sacrificing efficiency.
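To make the idea of treating a text collection as a table more concrete, here is a minimal Python sketch. It is not the paper's implementation: the function `multimodal_scan`, the sample reports, the `patients` table, and the regular-expression extractor are all hypothetical. In particular, the regex stands in for the language model that an actual MMOp would use, and the sketch materializes the extracted rows in SQLite purely for illustration.

```python
import re
import sqlite3

# Hypothetical text collection: each document becomes one row.
REPORTS = [
    "Patient Alice Meyer, age 54, was diagnosed with pneumonia.",
    "Patient Bob Stone, age 37, was diagnosed with asthma.",
]

# Stand-in extractor (assumption): a real MMOp would query a language
# model here instead of matching a fixed pattern.
PATTERN = re.compile(
    r"Patient (?P<name>[\w ]+), age (?P<age>\d+), "
    r"was diagnosed with (?P<diagnosis>\w+)\."
)


def multimodal_scan(texts):
    """Yield one (name, age, diagnosis) tuple per text that matches."""
    for text in texts:
        match = PATTERN.search(text)
        if match:
            yield match["name"], int(match["age"]), match["diagnosis"]


def build_database(texts):
    """Create an in-memory database with one ordinary table and one
    table whose rows come from scanning the text collection."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE patients (name TEXT, insurer TEXT)")
    conn.executemany(
        "INSERT INTO patients VALUES (?, ?)",
        [("Alice Meyer", "Acme Health"), ("Bob Stone", "Unity Care")],
    )
    conn.execute("CREATE TABLE reports (name TEXT, age INTEGER, diagnosis TEXT)")
    conn.executemany("INSERT INTO reports VALUES (?, ?, ?)", multimodal_scan(texts))
    return conn


def query(conn):
    """Join the tabular data with the rows extracted from text."""
    sql = (
        "SELECT p.name, p.insurer, r.diagnosis "
        "FROM patients p JOIN reports r ON p.name = r.name "
        "WHERE r.age > 40"
    )
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    print(query(build_database(REPORTS)))
```

The point of the sketch is only that, once an operator turns each document into a tuple, ordinary SQL joins and filters apply to text and tables alike.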

Multimodal Neural Databases

Dual Reader-Parser on Hybrid Textual and Tabular Evidence for Open Domain Question Answering

Efficient Learned Query Execution over Text and Tables [Technical Report]

Querying Large Language Models with SQL

Schema-Aware Multi-Task Learning for Complex Text-to-SQL

Multi-SQL: An extensible multi-model data query language

MAC-SQL: A Multi-Agent Collaborative Framework for Text-to-SQL

A Multi-Paradigm Querying Approach For A Generic Multimedia Database Management System

An Interactive Multi-modal Query Answering System with Retrieval-Augmented Large Language Models

Interactive-T2S: Multi-Turn Interactions for Text-to-SQL with Large Language Models

ThalamusDB: Approximate Query Processing on Multi-Modal Data

MM-LLMs: Recent Advances in MultiModal Large Language Models

MulmQA: Multimodal Question Answering for Database Alarm

Can LLM Already Serve as A Database Interface? A BIg Bench for Large-Scale Database Grounded Text-to-SQLs

CAESURA: Language Models as Multi-Modal Query Planners

Cooperative SQL Generation for Segmented Databases By Using Multi-functional LLM Agents

MAG-SQL: Multi-Agent Generative Approach with Soft Schema Linking and Iterative Sub-SQL Refinement for Text-to-SQL

Querying multidimensional databases

MoMQ: Mixture-of-Experts Enhances Multi-Dialect Query Generation across Relational and Non-Relational Databases