Relating enhancer genetic variation across mammals to complex phenotypes using machine learning
Irene M. Kaplow, Alyssa J. Lawler, Daniel E. Schäffer, Chaitanya Srinivasan, Heather H. Sestili, Morgan E. Wirthlin, BaDoi N. Phan, Kavya Prasad, Ashley R. Brown, Xiaomeng Zhang, Kathleen Foley, Diane P. Genereux, Elinor K. Karlsson, Kerstin Lindblad-Toh, Wynn K. Meyer, Andreas R. Pfenning, Gregory Andrews, Joel C. Armstrong, Matteo Bianchi, Bruce W. Birren, Kevin R. Bredemeyer, Ana M. Breit, Matthew J. Christmas, Hiram Clawson, Joana Damas, Federica Di Palma, Mark Diekhans, Michael X. Dong, Eduardo Eizirik, Kaili Fan, Cornelia Fanter, Nicole M. Foley, Karin Forsberg-Nilsson, Carlos J. Garcia, John Gatesy, Steven Gazal, Diane P. Genereux, Linda Goodman, Jenna Grimshaw, Michaela K. Halsey, Andrew J. Harris, Glenn Hickey, Michael Hiller, Allyson G. Hindle, Robert M. Hubley, Graham M. Hughes, Jeremy Johnson, David Juan, Irene M. Kaplow, Elinor K. Karlsson, Kathleen C. Keough, Bogdan Kirilenko, Klaus-Peter Koepfli, Jennifer M. Korstian, Amanda Kowalczyk, Sergey V. Kozyrev, Alyssa J. Lawler, Colleen Lawless, Thomas Lehmann, Danielle L. Levesque, Harris A. Lewin, Xue Li, Abigail Lind, Kerstin Lindblad-Toh, Ava Mackay-Smith, Voichita D. Marinescu, Tomas Marques-Bonet, Victor C. Mason, Jennifer R. S. Meadows, Wynn K. Meyer, Jill E. Moore, Lucas R. Moreira, Diana D. Moreno-Santillan, Kathleen M. Morrill, Gerard Muntané, William J. Murphy, Arcadi Navarro, Martin Nweeia, Sylvia Ortmann, Austin Osmanski, Benedict Paten, Nicole S. Paulat, Andreas R. Pfenning, BaDoi N. Phan, Katherine S. Pollard, Henry E. Pratt, David A. Ray, Steven K. Reilly, Jeb R. Rosen, Irina Ruf, Louise Ryan, Oliver A. Ryder, Pardis C. Sabeti, Daniel E. Schäffer, Aitor Serres, Beth Shapiro, Arian F. A. Smit, Mark Springer, Chaitanya Srinivasan, Cynthia Steiner, Jessica M. Storer, Kevin A. M. Sullivan, Patrick F. Sullivan, Elisabeth Sundström, Megan A. Supple, Ross Swofford, Joy-El Talbot, Emma Teeling, Jason Turner-Maier, Alejandro Valenzuela, Franziska Wagner, Ola Wallerman, Chao Wang, Juehan Wang, Zhiping Weng, Aryn P. Wilder, Morgan E. Wirthlin, James R. Xue, Xiaomeng Zhang
DOI: https://doi.org/10.1126/science.abm7993
IF: 56.9
2023-04-28
Science
Abstract: Protein-coding differences between species often fail to explain phenotypic diversity, suggesting the involvement of genomic elements that regulate gene expression such as enhancers. Identifying associations between enhancers and phenotypes is challenging because enhancer activity can be tissue-dependent and functionally conserved despite low sequence conservation. We developed the Tissue-Aware Conservation Inference Toolkit (TACIT) to associate candidate enhancers with species’ phenotypes using predictions from machine learning models trained on specific tissues. Applying TACIT to associate motor cortex and parvalbumin-positive interneuron enhancers with neurological phenotypes revealed dozens of enhancer–phenotype associations, including brain size–associated enhancers that interact with genes implicated in microcephaly or macrocephaly. TACIT provides a foundation for identifying enhancers associated with the evolution of any convergently evolved phenotype in any large group of species with aligned genomes.
multidisciplinary sciences