A systematic analysis of miRNA markers and classification algorithms for forensic body fluid identification

Yang Liu, Hongxia He, Zhi-Xiong Xiao, Anquan Ji, Jian Ye, Qifan Sun, Yang Cao
DOI: https://doi.org/10.1093/bib/bbaa324
IF: 9.5
2020-12-14
Briefings in Bioinformatics
Abstract: Identifying the types of body fluids left at the crime scene can be essential to reconstructing the crime scene and inferring criminal behavior. MicroRNA (miRNA) molecule extracted from the trace of body fluids is one of the most promising biomarkers for the identification due to its high expression, extreme stability and tissue specificity. However, the detection of miRNA markers is not the answer to a yes–no question but the probability of an assumption. Therefore, it is a crucial task to develop complicated methods combining multi-miRNAs as well as computational algorithms to achieve the goal. In this study, we systematically analyzed the expression of 10 most probable body fluid-specific miRNA markers (miR-451a, miR-205-5p, miR-203a-3p, miR-214-3p, miR-144-3p, miR-144-5p, miR-654-5p, miR-888-5p, miR-891a-5p and miR-124-3p) in 605 body fluids-related samples, including peripheral blood, menstrual blood, saliva, semen and vaginal secretion. We introduced the kernel density estimation (KDE) method and six well-established methods to classify the body fluids in order to find the most optimal combinations of miRNA markers as well as the corresponding classifying method. The results show that the combination of miR-451a, miR-891a-5p, miR-144-5p and miR-203a-3p together with KDE can achieve the most accurate and robust performance according to the cross-validation, independent tests and random perturbation tests. This systematic analysis suggests a reference scheme for the identification of body fluids in an accurate and stable manner.
biochemical research methods,mathematical & computational biology
What problem does this paper attempt to address?
The paper addresses the identification of body fluid traces left at crime scenes. It combines microRNA (miRNA) markers with classification algorithms to distinguish five common body fluids: peripheral blood, menstrual blood, saliva, semen and vaginal secretion. The research objectives are:

1. **Screening miRNA markers**: Selecting, from previously reported miRNAs, the markers that are specifically expressed in particular body fluids.
2. **Developing classification algorithms**: Building classification models with several methods (such as Kernel Density Estimation [KDE], k-Nearest Neighbors [KNN], Logistic Regression [LR], etc.) to identify the body fluid type of a sample.
3. **Optimizing combination schemes**: Determining which miRNA combination, paired with which classification algorithm, classifies body fluid samples most accurately and robustly.

The resulting scheme is intended as a reliable reference for body fluid identification in forensic practice.
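
To make the KDE-based classification idea concrete, here is a minimal sketch of a kernel density classifier: one Gaussian kernel density estimate is fitted per body fluid, and a new sample is assigned to the fluid with the highest posterior probability under Bayes' rule. This is an illustration under stated assumptions, not the authors' implementation. The class name `KDEClassifier`, the shared bandwidth of 0.5, the four-feature layout and the synthetic "expression" values are all hypothetical; the paper's actual kernel, bandwidth selection and data are not described in this summary.

```python
import numpy as np


class KDEClassifier:
    """Hypothetical per-class Gaussian KDE classifier (illustrative only)."""

    def __init__(self, bandwidth=0.5):
        # Assumed fixed, shared bandwidth; a real analysis would tune this.
        self.bandwidth = bandwidth

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y)
        self.classes_ = np.unique(y)
        # Keep the training points of each class; they are the kernel centres.
        self.train_ = [X[y == c] for c in self.classes_]
        # Class priors estimated from class frequencies.
        self.log_prior_ = np.log([len(t) / len(X) for t in self.train_])
        return self

    def _log_density(self, X, train):
        # Log of the average isotropic Gaussian kernel over the training points,
        # computed with the log-sum-exp trick for numerical stability.
        d = X.shape[1]
        sq_dist = ((X[:, None, :] - train[None, :, :]) ** 2).sum(axis=2)
        log_k = -0.5 * sq_dist / self.bandwidth**2
        log_norm = -0.5 * d * np.log(2 * np.pi * self.bandwidth**2)
        m = log_k.max(axis=1, keepdims=True)
        return m[:, 0] + np.log(np.exp(log_k - m).mean(axis=1)) + log_norm

    def predict_proba(self, X):
        X = np.asarray(X, dtype=float)
        log_post = np.column_stack(
            [self._log_density(X, t) for t in self.train_]
        ) + self.log_prior_
        # Normalise so each row sums to one (posterior over body fluids).
        log_post -= log_post.max(axis=1, keepdims=True)
        post = np.exp(log_post)
        return post / post.sum(axis=1, keepdims=True)

    def predict(self, X):
        return self.classes_[self.predict_proba(X).argmax(axis=1)]


# Synthetic stand-in data: three fluids, four made-up marker features each.
rng = np.random.default_rng(0)
fluids = ["peripheral blood", "saliva", "semen"]
centers = np.array([
    [4.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 4.0],
    [0.0, 4.0, 0.0, 0.0],
])
X = np.vstack([c + rng.normal(scale=0.5, size=(40, 4)) for c in centers])
y = np.repeat(fluids, 40)

clf = KDEClassifier(bandwidth=0.5).fit(X, y)
proba = clf.predict_proba(centers)  # one posterior row per query sample
pred = clf.predict(centers)         # most probable fluid per query sample
```

Returning posterior probabilities rather than a bare label matches the abstract's framing of marker detection as "the probability of an assumption" rather than a yes/no answer. Comparing marker subsets would amount to refitting such a model on different feature columns and scoring each with cross-validation.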