Analytical Data Review on an Artificial Intelligence Platform for Doping Control in Horse Racing

Chun Sing Lai, April S. Y. Wong, Kin-Sing Wong, Terence S. M. Wan, Emmie N. M. Ho
DOI: https://doi.org/10.26434/chemrxiv-2024-pt7ks
2024-09-24
Abstract: In the screening of prohibited substances (PS) in horse biological samples with Gas Chromatography/Mass Spectrometry (GC/MS) and Liquid Chromatography/Mass Spectrometry (LC/MS) for doping control, an enormous number of chromatograms is generated. Reviewing these chromatograms to identify suspicious findings requires extensive manual effort. Recent advances in Artificial Intelligence (AI) enable its use to classify images into different categories. This can potentially be utilised to perform first-line analysis of chromatograms, which are usually displayed as images, by classifying them as "positive" (POS) or "negative" (NEG) with respect to the presence of PS. This study explores the feasibility of using AI to perform initial chromatogram analysis, aiming to improve the efficiency and accuracy of data vetting. A predictive model was developed using the image recognition tool in "Alteryx Designer", a data analytics software package, to analyse chromatograms generated from LC/MS analysis of horse urine. The model was trained with over 6000 chromatograms that had been manually classified as "POS" or "NEG". To evaluate the model’s accuracy, around 700 manually classified chromatograms were analysed by the model, and the prediction accuracy was over 90%. The model was then applied to two of our in-house screening methods, each covering over 300 drug targets. It was shown that the model can identify suspicious ("SUS")/"POS" and "NEG" chromatograms with high accuracy and with no false-negative classifications. There are two major challenges in applying the developed model to first-line analysis in regular testing. The first is the analysis time: with the existing Alteryx workflow, analysing one batch of samples from one of our in-house screening methods on a standard office PC requires 3-5 hours. The second is the inflexibility of the data extraction workflow: it only works on analytical data generated from specific instruments and software, which poses challenges to its implementation in regular testing, where a large variety of instruments and processing software is involved.
Chemistry
What problem does this paper attempt to address?
The main problem this paper attempts to address is improving the efficiency and accuracy of chromatogram review in the detection of prohibited substances in horse racing. Specifically:

1. **Background Issue**:
   - During the screening process for prohibited substances (PS) in horse racing, the use of gas chromatography/mass spectrometry (GC/MS) and liquid chromatography/mass spectrometry (LC/MS) analysis generates a large number of chromatograms.
   - Reviewing these chromatograms to identify suspicious results requires a significant amount of manual effort and is prone to human error due to fatigue.
2. **Solution**:
   - Utilize artificial intelligence (AI) technology, particularly image classification techniques, to automatically analyze chromatograms and classify them as "positive" (POS, indicating the possible presence of prohibited substances) or "negative" (NEG, indicating no prohibited substances).
   - Developed a predictive model using image recognition tools in Alteryx Designer to classify chromatograms generated by LC/MS.
3. **Objective**:
   - Improve the efficiency and accuracy of chromatogram review, reducing the time and workload of manual review.
   - Filter out obviously "negative" chromatograms through preliminary analysis and quickly flag "positive" or "suspicious" (SUS) chromatograms that require further review.
4. **Challenges**:
   - Flexibility of the data extraction process: The current workflow is only applicable to data generated by specific instruments and software.
   - Long analysis time: Processing a batch of samples using the existing Alteryx workflow takes 3-5 hours, which is too long for routine testing.

Through these measures, the research aims to explore the potential of AI in the detection of prohibited substances in horse racing, laying the foundation for future automated data analysis systems.
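The evaluation described in the abstract (comparing model predictions against manual classifications, then reporting accuracy and checking for false negatives) can be illustrated with a minimal Python sketch. This is not the authors' Alteryx workflow: the function name `evaluate`, the label strings, and the toy data are all assumptions made for illustration only. A "false negative" is taken here to mean a chromatogram that was manually classified as anything other than "NEG" but that the model predicted as "NEG".

```python
from collections import Counter


def evaluate(manual, predicted, negative="NEG"):
    """Compare model predictions with manual classifications.

    Hypothetical helper for illustration; label strings are assumptions.
    """
    if len(manual) != len(predicted):
        raise ValueError("manual and predicted must have the same length")
    if not manual:
        raise ValueError("at least one chromatogram is required")

    # Count each (manual label, predicted label) pair once.
    pairs = Counter(zip(manual, predicted))
    total = len(manual)

    correct = sum(n for (m, p), n in pairs.items() if m == p)
    # Missed findings: manually flagged, but predicted as negative.
    false_negatives = sum(
        n for (m, p), n in pairs.items() if m != negative and p == negative
    )
    # Extra review work: manually negative, but flagged by the model.
    false_positives = sum(
        n for (m, p), n in pairs.items() if m == negative and p != negative
    )
    return {
        "accuracy": correct / total,
        "false_negatives": false_negatives,
        "false_positives": false_positives,
    }


# Toy data only; these are not results from the paper.
manual = ["POS", "NEG", "NEG", "POS", "NEG", "NEG", "NEG", "POS"]
predicted = ["POS", "NEG", "POS", "POS", "NEG", "NEG", "NEG", "POS"]

result = evaluate(manual, predicted)
print(result)
```

In a first-line screening setting, the two error types are not equally costly: a false positive only adds a chromatogram to the manual review queue, whereas a false negative would let a possible finding go unreviewed, which is presumably why the abstract reports the absence of false negatives separately from overall accuracy.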