Connecting the Dots: Using Machine Learning to Forge Gene Regulatory Networks from Large Biological Datasets. At the Intersection of GRNs: Where System Biology Meets Machine Learning

Isha Monga, Vinay Randhawa, Sandeep Kumar Dhanda
DOI: https://doi.org/10.1007/978-981-16-5993-5_6
2022-01-01
Abstract: The last decade witnessed an exponential increase in large and complex biological datasets. The underlying studies span humans and other species under different conditions, such as disease versus control expression, clinical trial evaluation, microbiome, and metagenomics, and they generate numerous molecular entities such as genes, proteins, transcription factors, and metabolites. These examples give only a gist of the rate at which computational repositories are filling up. These ever-growing datasets raise a question for the scientific community: how can biological datasets be used effectively for predictive modeling to uncover deep-seated, meaningful information and thereby help the health informatics community tackle grave diseases such as autoimmune disorders, cancers, and childhood malignancies by predicting better treatments or enabling earlier diagnosis? Over the last decade, therefore, the prime focus has been on understanding complex biological systems using statistical and data-analytical methods that can be trained on large, heterogeneous, complex datasets and yield biological insight. Machine learning (ML) aims to address this question and provides predictive models that allow the scientific community to understand disease from a new perspective and generate novel hypotheses. The present chapter is a primer on the use of machine learning to connect the dots hidden in high-dimensional datasets and infer biological networks. It focuses in particular on gene regulatory networks (GRNs) and the computational resources currently available for them. It provides a comprehensive picture of GRNs along with current advancements and limitations in the area.