Having Difficulty Understanding Manuals? Automatically Converting User Manuals into Instructional Videos

Songsong Liu, Shu Wang, Kun Sun
2023-11-21
Abstract: While users tend to perceive instructional videos as an experience rather than a lesson with a set of instructions, instructional videos are more effective and appealing than textual user manuals and eliminate the ambiguity in text-based descriptions. However, most software vendors only offer document manuals that describe how to install and use their software, placing a burden on non-professionals who must comprehend the instructions. In this paper, we present a framework called M2V to generate instructional videos automatically based on the provided instructions and images in user manuals. M2V is a two-step framework. First, an action sequence is extracted from the given user manual via natural language processing and computer vision techniques. Second, M2V operates the software sequentially based on the extracted actions; meanwhile, the operation procedure is recorded into an instructional video. We evaluate the usability of automatically generated instructional videos via user studies and an online survey. The evaluation results show that, with our toolkit, the generated instructional videos can better assist non-professional end users with the software operations. Moreover, more than 85% of survey participants prefer to use the instructional videos rather than the original user manuals.
Human-Computer Interaction
What problem does this paper attempt to address?
This paper addresses the difficulty users face in understanding and following software user manuals. Although manuals usually contain detailed installation and usage instructions, non-professional users may find the text hard to follow and fail to complete the described operations. Instructional videos, by contrast, show each operation step directly, which guides users more effectively and removes the ambiguity of textual descriptions. However, most software vendors provide only document manuals and no video counterparts, which places a burden on non-professional users. To solve this problem, the paper proposes a framework named M2V that automatically converts software user manuals into instructional videos. M2V first extracts the action sequence from the manual using natural language processing (NLP) and computer vision (CV) techniques, then reproduces these actions in a simulated environment and records the whole procedure as an instructional video. In this way, even users without professional knowledge can better understand and perform software operations by watching the video. The paper also evaluates the effectiveness and usability of the automatically generated instructional videos through user studies and an online survey. The results show that more than 85% of the survey participants prefer the instructional videos to the original user manuals.
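The two-step flow described above (extract an action sequence, then replay it while recording) can be illustrated with a minimal Python sketch. This is not the M2V implementation: the `Action` type, the `VERBS` vocabulary, and the `extract_actions` and `replay` functions are hypothetical names introduced here, a simple regular expression stands in for the NLP and CV extraction, and a list of text "frames" stands in for real screen recording.

```python
import re
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical verb vocabulary; the paper's summary does not list one.
VERBS = ("click", "type", "select", "open")


@dataclass(frozen=True)
class Action:
    verb: str    # e.g. "click"
    target: str  # e.g. "Next button"


def extract_actions(manual_text: str) -> List[Action]:
    """Step 1 (toy stand-in): pull (verb, target) pairs from imperative sentences."""
    actions = []
    for sentence in re.split(r"[.\n]+", manual_text):
        # First word is the candidate verb; skip leading "the"/"on" before the target.
        match = re.match(r"\s*(\w+)\s+(?:the\s+|on\s+)*(.+?)\s*$", sentence, re.IGNORECASE)
        if match and match.group(1).lower() in VERBS:
            actions.append(Action(match.group(1).lower(), match.group(2)))
    return actions


def replay(actions: List[Action], execute: Callable[[Action], None]) -> List[str]:
    """Step 2 (toy stand-in): run each action in order and log one 'frame' per step."""
    frames = []
    for index, action in enumerate(actions, start=1):
        execute(action)  # a real system would drive the software's GUI here
        frames.append(f"frame {index}: {action.verb} -> {action.target}")
    return frames


manual = "Open the installer. Click the Next button. Type your license key."
steps = extract_actions(manual)
# The executor is a no-op here; it is the assumed hook for GUI automation.
frames = replay(steps, execute=lambda action: None)
```

Keeping extraction and replay as separate functions mirrors the two-step structure in the abstract: the extracted action list is the only interface between them, so either half could be swapped for a more capable component without touching the other.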