A Classification Benchmark for Artificial Intelligence Detection of Laryngeal Cancer from Patient Speech

Mary Paterson, James Moor, Luisa Cutillo
2024-12-20
Abstract: Cases of laryngeal cancer are predicted to rise significantly in the coming years. Current diagnostic pathways cause many patients to be incorrectly referred to urgent suspected cancer pathways, putting undue stress on both patients and the medical system. Artificial intelligence offers a promising solution by enabling non-invasive detection of laryngeal cancer from patient speech, which could help prioritise referrals more effectively and reduce inappropriate referrals of non-cancer patients. To realise this potential, open science is crucial. A major barrier in this field is the lack of open-source datasets and reproducible benchmarks, forcing researchers to start from scratch. Our work addresses this challenge by introducing a benchmark suite comprising 36 models trained and evaluated on open-source datasets. These models are accessible in a public repository, providing a foundation for future research. They evaluate three different algorithms and three audio feature sets, offering a comprehensive benchmarking framework. We propose standardised metrics and evaluation methodologies to ensure consistent and comparable results across future studies. The presented models include both audio-only inputs and multimodal inputs that incorporate demographic and symptom data, enabling their application to datasets with diverse patient information. By providing these benchmarks, future researchers can evaluate their datasets, refine the models, and use them as a foundation for more advanced approaches. This work aims to provide a baseline for establishing reproducible benchmarks, enabling researchers to compare new methods against these standards and ultimately advancing the development of AI tools for detecting laryngeal cancer.
Sound, Machine Learning, Audio and Speech Processing, Quantitative Methods
### What problems does this paper attempt to solve?

This paper aims to solve several key problems in laryngeal cancer diagnosis:

1. **Improve the early detection rate of laryngeal cancer**:
   - The number of laryngeal cancer cases is expected to increase significantly in the coming years. Early detection improves patients' survival rates and widens their treatment options, thereby improving quality of life (Cancer Research UK, 2019, 2021b).
   - Current diagnostic methods (such as nasopharyngoscopy or laryngoscopy) are usually invasive, uncomfortable, and resource-intensive. In addition, current diagnostic pathways refer many patients to the urgent suspected cancer pathway incorrectly, putting unnecessary pressure on patients and the healthcare system.
2. **Develop non-invasive laryngeal cancer screening tools**:
   - Artificial intelligence (AI)-based voice analysis offers a non-invasive screening tool that can reduce the need for invasive and uncomfortable medical tests. Such a tool can help triage patients with voice problems, prioritize those at highest risk of cancer, accelerate their care pathways, and make diagnosis more accessible by reducing the need for expensive medical equipment.
   - AI tools can also relieve patients' anxiety and the burden on the healthcare system.
3. **Establish reproducible benchmarks**:
   - Although previous studies have detected laryngeal cancer from patients' voices, they lack public code, models, and data, which has limited progress. Current methods are ad hoc, and there is no standardized model evaluation method, so results are inconsistent and cannot be compared across studies (Paterson et al., 2024).
   - This paper addresses this challenge by introducing a benchmark suite of 36 models, trained and evaluated on open-source datasets and made available in a public repository, providing a basis for future research.
4. **Standardize evaluation methods**:
   - To ensure that results are consistent and comparable, this paper defines and explains evaluation metrics covering classification performance, inference time, and fairness.
   - These benchmarks will help researchers evaluate new datasets, improve models, and accelerate the development of general-purpose, high-performance AI tools for laryngeal cancer detection.

By addressing these problems, the paper provides a baseline for early detection and non-invasive screening of laryngeal cancer against which future methods can be compared.
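The three evaluation dimensions named above (classification performance, inference time, and fairness) can be illustrated with a minimal Python sketch. It is not the paper's evaluation code: the function names, the toy threshold classifier, the choice of sensitivity, specificity, and balanced accuracy as the performance metrics, and the per-group breakdown used as a fairness check are all assumptions made for illustration.

```python
import time
from collections import defaultdict


def classification_metrics(y_true, y_pred):
    """Sensitivity, specificity, and balanced accuracy for binary labels (1 = cancer)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }


def mean_inference_time(predict, samples, repeats=3):
    """Average wall-clock seconds per single-sample prediction."""
    start = time.perf_counter()
    for _ in range(repeats):
        for x in samples:
            predict(x)
    return (time.perf_counter() - start) / (repeats * len(samples))


def per_group_metrics(y_true, y_pred, groups):
    """Compute the same metrics separately for each demographic group."""
    by_group = defaultdict(lambda: ([], []))
    for t, p, g in zip(y_true, y_pred, groups):
        by_group[g][0].append(t)
        by_group[g][1].append(p)
    return {g: classification_metrics(ts, ps) for g, (ts, ps) in by_group.items()}


def toy_predict(features):
    """Hypothetical stand-in for a trained model: threshold on the first feature."""
    return 1 if features[0] > 0.5 else 0


# Made-up toy data, for demonstration only.
samples = [[0.9], [0.8], [0.2], [0.7], [0.1], [0.3]]
y_true = [1, 1, 0, 0, 0, 1]
groups = ["F", "M", "F", "M", "F", "M"]

y_pred = [toy_predict(x) for x in samples]
overall = classification_metrics(y_true, y_pred)
by_group = per_group_metrics(y_true, y_pred, groups)
seconds_per_sample = mean_inference_time(toy_predict, samples)

print(overall)
print(by_group)
print(f"{seconds_per_sample:.2e} s per sample")
```

Reporting the same metric dictionary for every model and every subgroup is what makes results comparable across studies. A large gap between groups, as between "F" and "M" in this toy data, would flag a potential fairness concern worth investigating.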