Prediction of Autism Risk From Family Medical History Data Using Machine Learning: A National Cohort Study From Denmark

Linda Ejlskov, Jesper N Wulff, Amy Kalkbrenner, Christine Ladd-Acosta, M Danielle Fallin, Esben Agerbo, Preben Bo Mortensen, Brian K Lee, Diana Schendel
DOI: https://doi.org/10.1016/j.bpsgos.2021.04.007
2021-05-05
Abstract:
Background: A family history of specific disorders (e.g., autism, depression, epilepsy) has been linked to risk for autism spectrum disorder (ASD). This study examines whether family history data could be used for ASD risk prediction.
Methods: We followed all Danish live births, from 1980 to 2012, of Denmark-born parents for an ASD diagnosis through April 10, 2017 (N = 1,697,231 births; 26,840 ASD cases). Linking each birth to three-generation family members, we identified 438 morbidity indicators, comprising 73 disorders reported prospectively for each family member. We tested various models using a machine learning approach. From the best-performing model, we calculated a family history risk score and estimated odds ratios and 95% confidence intervals for the risk of ASD.
Results: The best-performing model comprised 41 indicators: eight mental conditions (e.g., ASD, attention-deficit/hyperactivity disorder, neurotic/stress disorders) and nine nonmental conditions (e.g., obesity, hypertension, asthma) across six family member types; model performance was similar in training and test subsamples. The highest risk score group had 17.0% ASD prevalence and a 15.3-fold (95% confidence interval, 14.0-17.1) increased ASD risk compared with the lowest score group, which had 0.6% ASD prevalence. In contrast, individuals with a full sibling with ASD had 9.5% ASD prevalence and a 6.1-fold (95% confidence interval, 5.9-6.4) higher risk than individuals without an affected sibling.
Conclusions: Family history of multiple mental and nonmental conditions can identify more individuals at highest risk for ASD than only considering the immediate family history of ASD. A comprehensive family history may be critical for a clinically relevant ASD risk prediction framework in the future.
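For readers unfamiliar with the quantities reported above, the following is a minimal sketch of how an unadjusted odds ratio and a Wald-type 95% confidence interval can be computed from a 2x2 table. It is an illustration only, not the authors' estimation procedure: the abstract does not specify how its odds ratios were modeled, and the function name `odds_ratio_wald_ci` and all counts below are hypothetical and not taken from the study.

```python
import math


def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    a: cases in the comparison group      b: non-cases in the comparison group
    c: cases in the reference group       d: non-cases in the reference group
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for the Wald interval")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, chosen only for illustration (not study data).
estimate, lower, upper = odds_ratio_wald_ci(a=40, b=960, c=10, d=990)
print(f"OR = {estimate:.2f}, 95% CI {lower:.2f}-{upper:.2f}")
```

An interval computed this way ignores covariates and any modeling choices, so it would generally not reproduce the estimates reported in the abstract.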