Sevi: Speech-to-Visualization Through Neural Machine Translation

Jiawei Tang, Yuyu Luo, Mourad Ouzzani, Guoliang Li, Hongyang Chen
DOI: https://doi.org/10.1145/3514221.3520150
2022-01-01
Abstract: Data visualization is a powerful tool for understanding information through visual cues. However, allowing novices to create visualization artifacts for what they want to see is not easy, just as not everyone can write SQL queries. Arguably, the most natural way to specify what to visualize is through natural language or speech, much as we search on Google or ask Apple Siri every day, leaving to the system the task of reasoning about what to visualize and how. In this demo, we present Sevi, an end-to-end data visualization system that acts as a virtual assistant to allow novices to create visualizations through either natural language or speech. Sevi is powered by two main components: Speech2Text, which is based on the Google Cloud Speech-to-Text REST API, and Text2VIS, which uses an end-to-end neural machine translation model called ncNet, trained on a cross-domain benchmark called nvBench. Both ncNet and nvBench have been developed by us. We will walk the audience through two general-domain datasets, one related to COVID-19 and the other on NBA player statistics, to highlight how Sevi enables novices to easily create data visualizations. Because nvBench contains Text2VIS training samples from 105 domains (e.g., sport, college, and hospital), the audience can try speech or text input on any of these domains.
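The two-component design described in the abstract can be pictured as a pipeline of two pluggable stages. The following is a minimal sketch only, not Sevi's actual code: the class `SpeechToVisPipeline`, its method names, the type aliases, and the two stand-in stage functions are all hypothetical, and the format of the returned specification is a placeholder, since the abstract does not describe the system's interfaces or output format.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stage signatures; the real Sevi interfaces are not given in the abstract.
SpeechToText = Callable[[bytes], str]   # audio bytes -> transcript
TextToVis = Callable[[str, str], str]   # (question, dataset name) -> visualization spec


@dataclass
class SpeechToVisPipeline:
    """Chains a speech-to-text stage with a text-to-visualization stage."""
    speech_to_text: SpeechToText
    text_to_vis: TextToVis

    def from_text(self, question: str, dataset: str) -> str:
        # Typed input skips the speech stage entirely.
        return self.text_to_vis(question.strip(), dataset)

    def from_speech(self, audio: bytes, dataset: str) -> str:
        # Spoken input is transcribed first, then handled like typed input.
        transcript = self.speech_to_text(audio)
        return self.from_text(transcript, dataset)


# Stand-in stages so the sketch runs without a network call or a trained model.
def fake_speech_to_text(audio: bytes) -> str:
    return audio.decode("utf-8")  # pretend the audio is already its transcript


def fake_text_to_vis(question: str, dataset: str) -> str:
    return f"<spec for '{question}' on {dataset}>"  # placeholder, not a real chart spec


pipeline = SpeechToVisPipeline(fake_speech_to_text, fake_text_to_vis)
spec = pipeline.from_speech(b"show cases per country ", "covid19")
```

In a real deployment the first stand-in would be replaced by a call to a speech recognition service and the second by a trained translation model; keeping both behind plain callables is one way to let text and speech input share the same downstream path.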