A Hassle-Free Machine Learning Method for Cohort Selection of Clinical Trials

Liu Man
DOI: https://doi.org/10.48550/arXiv.1808.04694
2018-08-10
Computation and Language
Abstract: Traditional text classification techniques in the clinical domain have relied heavily on manually extracted textual cues. This paper proposes a general supervised machine learning method that is hassle-free and does not use clinical knowledge. The employed methods are simple to implement, fast to run, and yet effective. The paper proposes a novel ensemble system based on named entity recognition (NER) that is capable of learning the keyword features in a document. Instead of considering only the whole sentence or paragraph for analysis, the NER-based keyword features place more stress on the important clinically relevant phrases. In addition, to capture the semantic information in the documents, FastText features derived from document-level FastText classification results are exploited.
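
The abstract gives no implementation details, so the following is only a minimal sketch of the general idea of combining keyword features with a document-level score into one feature vector for a downstream ensemble. Everything in it is an assumption made for illustration: the `CLINICAL_TERMS` lexicon stands in for a trained NER model, `document_score` stands in for the probability returned by a document-level classifier such as FastText, and all function names are hypothetical rather than taken from the paper.

```python
import re

# Hypothetical stand-in for NER output: a tiny lexicon of clinical terms.
# A real system would use a trained NER model instead of a fixed set.
CLINICAL_TERMS = {"aspirin", "creatinine", "diabetes", "myocardial infarction"}


def extract_keywords(text):
    """Return the lexicon terms that occur in the text as whole words."""
    lowered = text.lower()
    return sorted(
        term
        for term in CLINICAL_TERMS
        if re.search(r"\b" + re.escape(term) + r"\b", lowered)
    )


def keyword_features(text, vocabulary):
    """Binary indicator per vocabulary term: 1.0 if present, else 0.0."""
    found = set(extract_keywords(text))
    return [1.0 if term in found else 0.0 for term in vocabulary]


def document_score(text, positive_cues):
    """Stand-in for a document-level classifier probability.

    Here it is simply the fraction of tokens that appear in positive_cues;
    a real system would use the classifier's predicted probability.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in positive_cues)
    return hits / len(tokens)


def build_feature_vector(text, vocabulary, positive_cues):
    """Concatenate keyword indicators with the document-level score."""
    return keyword_features(text, vocabulary) + [document_score(text, positive_cues)]


if __name__ == "__main__":
    vocab = sorted(CLINICAL_TERMS)
    note = "Patient with diabetes takes aspirin daily."
    print(build_feature_vector(note, vocab, {"diabetes"}))
```

The resulting vector could then be passed to any supervised classifier; which classifier the paper's ensemble uses is not stated in the abstract.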