Simulated multimodal deep facial diagnosis

Bo Jin, Nuno Gonçalves, Leandro Cruz, Iurii Medvedev, Yuanyu Yu, Jiujiang Wang
DOI: https://doi.org/10.1016/j.eswa.2024.123881
IF: 8.5
2024-05-13
Expert Systems with Applications
Abstract: Facial phenotypes are extensively studied in medical and biological research, serving as critical markers that may indicate underlying genetic traits or medical conditions. With recent advances in big data, algorithms, and hardware, deep facial diagnosis, which employs deep learning techniques to systematically examine facial phenotypes and identify signs of certain diseases or medical conditions, has attracted significant research attention and is gradually emerging as a promising tool in precision medicine. However, limited primarily by the scarcity of data for training facial diagnosis models, the accuracy of facial diagnosis for various conditions remains low to date. Over the past decade, RGB-D cameras, which measure depth information alongside standard RGB capture, have proven superior in processing spatial details, with greater stability and accuracy. Motivated by these observations, we propose a Simulated Multimodal Framework, which effectively improves the computer-aided facial diagnosis performance of state-of-the-art models in experiments under different conditions. The underlying principle is to leverage depth simulated by generative models to improve the performance of RGB image recognition. Furthermore, as a rapid and non-invasive tool for disease screening and detection, our proposal demonstrated an average accuracy improvement of over 20% compared to practicing physicians in the study.
computer science, artificial intelligence; engineering, electrical & electronic; operations research & management science