PanKA: Leveraging population pangenome to predict antibiotic resistance

Van Hoan Do, Van Sang Nguyen, Son Hoang Nguyen, Duc Quang Le, Tam Thi Nguyen, Canh Hao Nguyen, Tho Huu Ho, Nam S Vo, Trang Nguyen, Hoang Anh Nguyen, Minh Duc Cao
DOI: https://doi.org/10.1016/j.isci.2024.110623
IF: 5.8
2024-08-02
iScience
Abstract: Machine learning has the potential to be a powerful tool in the fight against antimicrobial resistance (AMR), a critical global health issue. Machine learning can identify resistance mechanisms from DNA sequence data without prior knowledge. The first step in building a machine learning model is feature extraction from sequencing data. Traditional methods like single nucleotide polymorphism (SNP) calling and k-mer counting yield numerous, often redundant features, complicating prediction and analysis. In this paper, we propose PanKA, a method using the pangenome to extract a concise set of relevant features for predicting AMR. PanKA not only enables fast model training and prediction but also improves accuracy. Applied to the Escherichia coli and Klebsiella pneumoniae bacterial species, our model is more accurate than conventional and state-of-the-art methods in predicting AMR.
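To make the contrast between feature types concrete, here is a minimal sketch in Python. It is not PanKA's pipeline, which the abstract does not describe in detail. It only illustrates why k-mer counting produces a large feature space (up to 4^k possible k-mers over A, C, G, T) while a pangenome-style gene presence/absence encoding has one feature per gene family. The function names, sample names, and gene family labels are hypothetical and chosen for illustration.

```python
def kmer_counts(seq, k):
    """Count every overlapping k-mer in a DNA sequence."""
    counts = {}
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        counts[kmer] = counts.get(kmer, 0) + 1
    return counts


def presence_absence(genomes, gene_families):
    """Encode each sample as a 0/1 vector over a fixed list of gene families.

    genomes: dict mapping sample name -> set of gene family labels it carries
    gene_families: ordered list of all gene family labels (the feature columns)
    """
    return {
        sample: [1 if family in genes else 0 for family in gene_families]
        for sample, genes in genomes.items()
    }


# Toy inputs (hypothetical, not taken from the paper).
toy_sequence = "ACGTACGT"
toy_genomes = {
    "sample1": {"geneA", "geneC"},
    "sample2": {"geneB"},
}
toy_families = ["geneA", "geneB", "geneC"]

kmer_features = kmer_counts(toy_sequence, 4)
gene_features = presence_absence(toy_genomes, toy_families)

# The k-mer space grows as 4**k regardless of how many genes are present.
max_kmer_features = 4 ** 4
```

In this toy setting the gene-based encoding has three columns, while the 4-mer encoding can have up to 256, which gives a sense of the redundancy problem the abstract attributes to k-mer counting.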