Control-independent mosaic single nucleotide variant detection with DeepMosaic
Xiaoxu Yang, Xin Xu, Martin W. Breuss, Danny Antaki, Laurel L. Ball, Changuk Chung, Jiawei Shen, Chen Li, Renee D. George, Yifan Wang, Taejeong Bae, Yuhe Cheng, Alexej Abyzov, Liping Wei, Ludmil B. Alexandrov, Jonathan L. Sebat, Joseph G. Gleeson, Dan Averbuj, Subhojit Roy, Eric Courchesne, August Y. Huang, Alissa D’Gama, Caroline Dias, Christopher A. Walsh, Javier Ganz, Michael Lodato, Michael Miller, Pengpeng Li, Rachel Rodin, Robert Hill, Sara Bizzotto, Sattar Khoshkhoo, Zinan Zhou, Alice Lee, Alison Barton, Alon Galor, Chong Chu, Craig Bohrson, Doga Gulhan, Eduardo Maury, Elaine Lim, Euncheon Lim, Giorgio Melloni, Isidro Cortes, Jake Lee, Joe Luquette, Lixing Yang, Maxwell Sherman, Michael Coulter, Minseok Kwon, Peter J. Park, Rebeca Borges-Monroy, Semin Lee, Sonia Kim, Soo Lee, Vinary Viswanadham, Yanmei Dou, Andrew J. Chess, Attila Jones, Chaggai Rosenbluh, Schahram Akbarian, Ben Langmead, Jeremy Thorpe, Sean Cho, Andrew Jaffe, Apua Paquola, Daniel Weinberger, Jennifer Erwin, Jooheon Shin, Michael McConnell, Richard Straub, Rujuta Narurkar, Yeongjun Jang, Cindy Molitor, Mette Peters, Fred H. Gage, Meiyan Wang, Patrick Reed, Sara Linker, Alexander Urban, Bo Zhou, Xiaowei Zhu, Aitor S. Amero, David Juan, Inna Povolotskaya, Irene Lobon, Manuel S. Moruno, Raquel G. Perez, Tomas Marques-Bonet, Eduardo Soriano, Gary Mathern, Diane Flasch, Trenton Frisbie, Huira Kopera, Jeffrey Kidd, John Moldovan, John V. Moran, Kenneth Kwan, Ryan Mills, Sarah Emery, Weichen Zhou, Xuefang Zhao, Aakrosh Ratan, Alexandre Jourdon, Flora M. Vaccarino, Liana Fasching, Nenad Sestan, Sirisha Pochareddy, Soraya Scuderi
DOI: https://doi.org/10.1038/s41587-022-01559-w
IF: 46.9
2023-01-02
Nature Biotechnology
Abstract: Mosaic variants (MVs) reflect mutagenic processes during embryonic development and environmental exposure, accumulate with aging and underlie diseases such as cancer and autism. The detection of noncancer MVs has been computationally challenging due to the sparse representation of nonclonally expanded MVs. Here we present DeepMosaic, combining an image-based visualization module for single nucleotide MVs and a convolutional neural network-based classification module for control-independent MV detection. DeepMosaic was trained on 180,000 simulated or experimentally assessed MVs, and was benchmarked on 619,740 simulated MVs and 530 independent biologically tested MVs from 16 genomes and 181 exomes. DeepMosaic achieved higher accuracy compared with existing methods on biological data, with a sensitivity of 0.78, specificity of 0.83 and positive predictive value of 0.96 on noncancer whole-genome sequencing data, as well as doubling the validation rate over previous best-practice methods on noncancer whole-exome sequencing data (0.43 versus 0.18). DeepMosaic represents an accurate MV classifier for noncancer samples that can be implemented as an alternative or complement to existing methods.
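The abstract reports sensitivity, specificity and positive predictive value. As a reminder of how these quantities relate to a confusion matrix, here is a minimal Python sketch using the standard definitions. The function name `classifier_metrics` and the counts in the usage example are hypothetical and chosen only for illustration; they are not taken from the DeepMosaic benchmark or its code.

```python
import math


def classifier_metrics(tp, fp, tn, fn):
    """Standard confusion-matrix metrics for a binary variant classifier.

    tp: candidate variants called mosaic and confirmed mosaic
    fp: candidates called mosaic but not confirmed
    tn: candidates rejected and confirmed non-mosaic
    fn: candidates rejected but confirmed mosaic
    """
    def ratio(numerator, denominator):
        # Return NaN rather than raising when a class is empty.
        return numerator / denominator if denominator else math.nan

    return {
        "sensitivity": ratio(tp, tp + fn),  # fraction of true MVs recovered
        "specificity": ratio(tn, tn + fp),  # fraction of non-MVs rejected
        "ppv": ratio(tp, tp + fp),          # fraction of calls that are true
    }


# Hypothetical counts, for illustration only (not data from the paper).
example = classifier_metrics(tp=8, fp=1, tn=9, fn=2)
```

Note that positive predictive value depends on the proportion of true variants among the tested candidates, so it can exceed both sensitivity and specificity when most tested candidates are real.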
biotechnology & applied microbiology