Multi-Think Transformer for Enhancing Emotional Health

Jiarong Wang, Jiaji Wu, Shaohong Chen, Xiangyu Han, Mingzhou Tan, Jianguo Yu
DOI: https://doi.org/10.1145/3652512
IF: 5.3
2024-03-18
ACM Transactions on Internet Technology
Abstract: Smart healthcare systems focus not only on physical health but also on emotional health. Music therapy, a non-pharmacological treatment, has been widely used in clinical practice, but music selection and generation still require manual intervention. AI music generation can help people relieve stress and can provide more personalized and efficient music therapy support. However, existing AI music generation relies heavily on the note generated at the current time step to produce the next note, which leads to disharmonious results for two reasons. First, small errors in the currently generated note are ignored; these errors accumulate and propagate, eventually making the music sound random. To solve this problem, we propose a music selection module that filters errors in the generated notes, together with a multi-think mechanism that filters the result multiple times so that each generated note is as accurate as possible, eliminating the impact of errors on the next generation step. Second, multiple generations of the same music clip are not identical and may not even follow the same music rules. Therefore, in the inference phase, we propose a voting mechanism that selects, as the final result, the notes that follow the music rules most of the generated results follow. Subjective and objective evaluations demonstrate the superiority of the proposed model in generating smoother music that conforms to music rules. The model provides strong support for clinical music therapy and offers new ideas for research and practice in emotional health therapy based on the Internet of Things.
computer science, information systems, software engineering
What problem does this paper attempt to address?
The paper addresses two main issues in music generation algorithms:

1. **Cumulative errors leading to increased randomness in music**: Existing music generation algorithms rely heavily on the currently generated note when generating the next note. This dependency can cause small errors to be overlooked, which then accumulate and propagate, ultimately making the generated music random and disharmonious.
2. **Inconsistent results across multiple generations**: The results of generating the same music segment may differ each time, and may not even conform to the same musical rules, further exacerbating the discontinuity and inconsistency of the generated music.

To solve these problems, the authors propose a Multi-Think Transformer (MTT) model, which includes three key components:

- **Music Selection Module**: Filters out erroneous information in the generated notes and extracts useful features.
- **Multi-Think Mechanism**: Verifies each generated note multiple times to keep it accurate and to avoid error propagation.
- **Voting Mechanism**: During the inference phase, generates the result multiple times and votes to select the notes that best conform to musical rules, reducing the randomness of the generated results and keeping the music harmonious.

Together, these techniques make the generated music smoother, more natural, and compliant with musical rules, thereby providing strong support for clinical music therapy and offering new ideas for research and practice in emotional health treatment based on the Internet of Things.
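
The voting idea at inference time can be illustrated with a small sketch. The summary above does not specify how the paper scores "music rules", so the code below substitutes a plain per-time-step majority vote over several generation runs. The function name `vote_notes`, the use of MIDI pitch numbers, and the sample runs are all hypothetical and only serve to show the principle, not the authors' implementation.

```python
from collections import Counter
from typing import List, Sequence


def vote_notes(candidates: Sequence[Sequence[int]]) -> List[int]:
    """Majority-vote each time step across several generated clips.

    `candidates` holds one note sequence per generation run (assumed here
    to be MIDI pitch numbers); all runs must have the same length.
    On a tie, the note encountered first among the runs wins.
    """
    if not candidates:
        return []
    length = len(candidates[0])
    if any(len(run) != length for run in candidates):
        raise ValueError("all runs must have the same length")

    voted = []
    # zip(*candidates) groups the notes that different runs produced
    # for the same time step.
    for step_notes in zip(*candidates):
        note, _count = Counter(step_notes).most_common(1)[0]
        voted.append(note)
    return voted


# Three hypothetical generation runs of the same four-note clip.
runs = [
    [60, 62, 64, 65],
    [60, 62, 63, 65],
    [60, 61, 64, 67],
]
result = vote_notes(runs)
print(result)
```

Each run disagrees with the others at one or two positions, but the voted sequence keeps the note most runs agree on at every step, which is how voting reduces run-to-run randomness. A rule-based variant would vote on a property of the notes (for example, which key they belong to) rather than on the raw pitches.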