Abstract: This research developed an information system for voicing Ukrainian-language text based on NLP and machine learning methods. The system is implemented as a desktop application that voices Ukrainian-language text. Its creation covered all stages of software development: design, implementation, and testing. To establish the feasibility of creating the system, existing software solutions on the market were analysed and their advantages and disadvantages were listed; these were subsequently taken into account when creating the new system. The system analysis presents a goal tree, a decision tree, and examples of context diagrams with process decomposition. One of the design stages is the economic part, in which the budget to be spent on implementing the system is analysed, all tax and administrative costs are calculated, and development strategies are analysed; the strategy of developing the existing product with accompanying solutions and the product development strategy were selected. After that, the feasibility of creating the designed system, its payback, and its profit were assessed. The object of the research is the process of voicing Ukrainian-language text based on NLP and machine learning methods. The subject of the research is the methods and means of voicing Ukrainian-language text based on NLP and machine learning methods. The purpose of the research is to create an information system for voicing Ukrainian-language text based on NLP and machine learning methods.
The result of the work is a ready-to-implement information system for voicing Ukrainian-language text based on NLP and machine learning methods, together with an analytical review of literature and online sources on the topic, a system analysis of the research object, an analysis and selection of software tools for system implementation, a practical implementation of the system, and an economic justification of the system implementation activities.

Automated Pipeline for Training Dataset Creation from Unlabeled Audios for Automatic Speech Recognition

Enabling ASR for Low-Resource Languages: A Comprehensive Dataset Creation Approach

Transcribe, Align and Segment: Creating speech datasets for low-resource languages

LRSpeech: Extremely Low-Resource Speech Synthesis and Recognition

CrowdSpeech and VoxDIY: Benchmark Datasets for Crowdsourced Audio Transcription

Snow Mountain: Dataset of Audio Recordings of The Bible in Low Resource Languages

Scalable Data Annotation Pipeline for High-Quality Large Speech Datasets Development

An Automated End-to-End Open-Source Software for High-Quality Text-to-Speech Dataset Generation

Creating Spoken Dialog Systems in Ultra-Low Resourced Settings

Almost Unsupervised Text to Speech and Automatic Speech Recognition

Auto-AVSR: Audio-Visual Speech Recognition with Automatic Labels

Error-preserving Automatic Speech Recognition of Young English Learners' Language

A Large-scale Dataset for Audio-Language Representation Learning

Information System for Ukrainian Text Voiceover Based on NLP and Machine Learning Methods

Speech recognition datasets for low-resource Congolese languages

Map and Relabel: Towards Almost-Zero Resource Speech Recognition

A bandit approach to curriculum generation for automatic speech recognition

Kite: Automatic speech recognition for unmanned aerial vehicles

Automating Crowd-supervised Learning for Spoken Language Systems

Making More of Little Data: Improving Low-Resource Automatic Speech Recognition Using Data Augmentation

Algorithms For Automatic Accentuation And Transcription Of Russian Texts In Speech Recognition Systems