Just Speak It: Minimizing Cognitive Load for Eyes-Free Text Editing with a Smart Voice Assistant

Jiayue Fan, Chenning Xu, Chun Yu, Yuanchun Shi
DOI: https://doi.org/10.1145/3472749.3474795
2021-01-01
Abstract: When entering text precisely by voice, users may encounter colloquial inserts, inappropriate wording, and recognition errors, all of which make voice editing difficult. Users need to locate the errors and then correct them. In eyes-free scenarios, this select-modify mode imposes a cognitive burden and a risk of error. This paper introduces neural networks and pre-trained models to understand users’ revision intention based on semantics, reducing the amount of information that users must state. We present two strategies. One is to remove colloquial inserts automatically. The other is to allow users to edit by just speaking the target words, without having to say the context or the incorrect text. Accordingly, our approach can predict whether to insert or replace, the incorrect text to replace, and the position at which to insert. We implement these strategies in SmartEdit, an eyes-free voice input agent controlled with earphone buttons. The evaluation shows that our techniques reduce cognitive load and decrease the average failure rate by 54.1% compared to descriptive commands or re-speaking.