Statistical Challenges in Analyzing Methylation and Long-Range Chromosomal Interaction Data

Zhaohui Qin, Ben Li, Karen N. Conneely, Hao Wu, Ming Hu, Deepak Ayyala, Yongseok Park, Victor X. Jin, Fangyuan Zhang, Han Zhang, Li, Shili Lin
DOI: https://doi.org/10.1007/s12561-016-9145-0
2016-01-01
Statistics in Biosciences
Abstract: With the rapid development of high-throughput technologies such as array and next-generation sequencing, genome-wide, nucleotide-resolution epigenomic data are increasingly available. In recent years, there has been particular interest in data on DNA methylation and 3-dimensional (3D) chromosomal organization, which are believed to hold keys to understanding biological mechanisms, such as transcription regulation, that are closely linked to human health and disease. However, small sample sizes, complicated correlation structures, substantial noise, biases, and uncertainties all present difficulties for statistical inference. In this review, we present an overview of the new technologies that are frequently used to study DNA methylation and 3D chromosomal organization. We focus on recent developments in statistical methodologies designed to better interrogate epigenomic data, and we point out the statistical challenges facing the field where appropriate.