Analyzing high-throughput genomics data for cancer studies
Abstract
[ACCESS RESTRICTED TO THE UNIVERSITY OF MISSOURI AT REQUEST OF AUTHOR.] Next-generation sequencing (NGS) data output has increased at a rate that outpaces Moore's law, more than doubling each year since the technology was introduced. Studying such high-throughput data has yielded extensive insight into the genome, transcriptome, and epigenome of many species. In this thesis, I provide the research community with means to better study such data, and I use high-throughput biological data and computational methods to better understand the epigenetic regulation of cells and its disease associations. First, I contributed to building on and improving an existing software tool, PRIMEGENS, which is used to design primers for the polymerase chain reaction (PCR), one of the most important and widely used technologies in the field of genetics. Apart from contributing to the release of its new version, PRIMEGENSv2, I designed and made available its web version, PRIMEGENSw3, an interactive, easy-to-use online tool for high-throughput primer and probe design. Next, I leveraged high-throughput sequencing data profiling genomic methylation, gene expression, and histone modifications. I conducted a computational analysis of genome-wide epigenetic modifications that play a key role in cancer development and cellular proliferation. We found evidence that hypomethylation changes at regions other than the promoter region might also contribute to significant deleterious effects that can result in malignant transformation or tumor progression, and thus have higher biological significance. This study also contributes to our understanding of the relationship between methylation of different genic parts, including exons, introns, and the 3' and 5' UTRs, and expression levels in chronic lymphocytic leukemia (CLL) samples.
Next, a systems biology approach, using independent network construction and preservation on 3'UTR methylation and expression data, also revealed expression regulation by hypomethylation of 3'UTRs. Lastly, I validated the presence of widespread hypomethylation in regions such as the 3'UTR, gene body, and introns, along with the associated expression regulation, in other cancer types. Hence, I present this study as a new paradigm for examining genome-wide DNA hypomethylation, in addition to hypermethylation, which can help unveil the underlying synergistic mechanisms regulating the disease. Overall, this dissertation presents how the scalability and specificity of the PCR-based enrichment method, combined with the throughput and accuracy of NGS technology, enable researchers to perform ultra-deep sequencing of regions of interest to better understand areas such as tumorigenesis, population diversity, microbial resistance, and disease susceptibility, and thereby advance these scientific fields.
Degree
Ph.D.
Thesis Department
Rights
Access is limited to the campuses of the University of Missouri.