Deciphering regulation in eukaryotic cell: from sequence to function

Valentina Boeva
DOI: https://doi.org/10.48550/arXiv.1409.6661
IF: 4.31
2014-09-23
Genomics
Abstract: A transversal topic of my research has been the development and application of computational methods for DNA sequence analysis. The methods I have been developing aim to improve our understanding of the regulatory processes occurring in normal and cancer cells. This topic connects the projects presented in this thesis. Two chapters of the thesis represent the major areas of my research interests: (1) methods for deciphering transcriptional regulation and their application to specific biological questions, and (2) methods for studying genome structure and their application in cancer studies. The first chapter focuses predominantly on transcriptional regulation. Here I describe my contribution to the development of methodology for the discovery of transcription factor binding sites and the positioning of histone proteins. I also explain how sequence analysis, in combination with gene expression data, allows the identification of the direct target genes of a transcription factor under study, as well as the physical mechanisms of its action. As two examples, I present the results of my studies of transcriptional regulation by (i) the oncogenic protein EWS-FLI1 in Ewing sarcoma and (ii) the oncogenic transcription factor Spi-1/PU.1 in erythroleukemia. In the second chapter, I describe sequence analysis methods aimed at identifying genomic rearrangements in species with an existing reference genome. I explain how the developed methodology can be applied to determine the structure of cancer genomes. I provide an example of how such an analysis of tumor genomes can lead to the discovery of a new phenomenon: chromothripsis, in which hundreds of rearrangements occur in a single cellular catastrophe. The thesis concludes with a list of the major challenges in high-throughput sequencing analysis. I also discuss the current top questions demanding the integration of sequencing data.