Discovering Disease Patterns Using the Supervised Topic Model

Shu Li, Jingyuan Wang, Yi Wang
DOI: https://doi.org/10.1109/icsssm.2018.8465101
2018-01-01
Abstract: In this paper, we explore methods for medical data mining. Medical data usually have distinctive characteristics, such as sparseness, highly correlated features, and unbalanced sample categories. After reviewing the models commonly used in current medical data mining, we adopt a topic-based model. We build a supervised topic model (the SLDA model) and use Gibbs sampling to estimate its parameters. From the model's results, we identify some important relationships among features in our medical data. Finally, we combine the SLDA model with a Random Forest classifier, which achieves good predictive performance in disease prediction.
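To make the Gibbs sampling step concrete, the following is a minimal sketch of collapsed Gibbs sampling for plain, unsupervised LDA. It is not the SLDA model from the paper, which also conditions on the disease label; the function name `lda_gibbs`, the hyperparameters `alpha` and `beta`, and the toy records are all assumptions made up for illustration, with each patient record treated as a bag of integer feature ids.

```python
import random


def lda_gibbs(docs, n_topics, vocab_size, alpha=0.1, beta=0.01,
              n_iters=100, seed=0):
    """Collapsed Gibbs sampling for plain (unsupervised) LDA.

    docs: list of records, each a list of feature ids in [0, vocab_size).
    Returns (theta, phi): per-record topic mixtures and per-topic
    feature distributions, both as lists of probability rows.
    """
    rng = random.Random(seed)
    doc_topic = [[0] * n_topics for _ in docs]                # n(d, k)
    topic_word = [[0] * vocab_size for _ in range(n_topics)]  # n(k, w)
    topic_total = [0] * n_topics                              # n(k)

    # Random initial topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(n_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(z_doc)

    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Unnormalised conditional p(z = k | all other assignments).
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                # Record the new assignment.
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    # Smoothed point estimates from the final counts.
    theta = [
        [(doc_topic[d][k] + alpha) / (len(doc) + n_topics * alpha)
         for k in range(n_topics)]
        for d, doc in enumerate(docs)
    ]
    phi = [
        [(topic_word[k][w] + beta) / (topic_total[k] + vocab_size * beta)
         for w in range(vocab_size)]
        for k in range(n_topics)
    ]
    return theta, phi


# Made-up toy records (not data from the paper): feature ids 0-2 tend to
# co-occur, as do feature ids 3-5.
records = [[0, 1, 2, 0], [1, 2, 0], [3, 4, 5, 3], [4, 5, 3], [0, 1, 4]]
theta, phi = lda_gibbs(records, n_topics=2, vocab_size=6)
```

In a pipeline like the one the abstract describes, the per-record topic mixtures in `theta` are the kind of low-dimensional representation that could be passed to a downstream classifier such as a Random Forest; how the paper actually combines the two is not specified in the abstract.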