Physics-based predictions of RNA loop stability and structures
Abstract
[ACCESS RESTRICTED TO THE UNIVERSITY OF MISSOURI AT AUTHOR'S REQUEST.] RNA (ribonucleic acid) molecules play a variety of crucial roles in cellular functions at the level of transcription, translation and gene regulation. RNA functions are tied to RNA structures. Because experimental methods for determining RNA structures, such as X-ray crystallography and NMR spectroscopy, can be laborious, time-consuming and expensive, it is imperative to develop a reliable theoretical/computational model for predicting RNA structure from sequence. We aim to develop a novel free energy-based model for RNA structures, especially for RNA loops and junctions. One of the major roadblocks for physics-based RNA tertiary structure prediction is the evaluation of the entropy of RNA tertiary folds. In particular, the entropies of structures with multiple loops and helices can be highly convoluted because of the volume exclusion between the loops and helices.
In the first project, we develop a new conformational entropy model for RNA structures consisting of multiple helices connected by cross-linked loops. The basic strategy of our approach is to decompose the whole structure into a number of three-body building blocks, where each building block consists of a loop and the two helices that are directly connected to the two ends of the loop. The simple construct of the three-body system allows for accurate computation of the conformational entropy of each building block. Assembly of the building blocks gives the entropy of the whole structure. This approach enables the treatment of a large class of RNA tertiary folds. Tests against exact computer enumeration indicate that the method yields accurate results for the entropy. The method provides a solid first step toward the systematic development of an entropy and free energy model for complex tertiary folds of RNA and other biopolymers.
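As a purely illustrative sketch of what conformational entropy from exact enumeration means, the toy example below counts self-avoiding chain conformations on a 2D square lattice and shows how excluded volume from a nearby rigid body lowers the count. The lattice, the wall that stands in for a helix, and the names `count_walks` and `entropy_in_kB` are assumptions made for illustration only; they are not the virtual-bond model or the three-body decomposition used in the thesis.

```python
import math

# The four nearest-neighbour moves on a 2D square lattice.
STEPS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def count_walks(n_steps, blocked=frozenset(), start=(0, 0)):
    """Exactly enumerate self-avoiding walks of n_steps bonds.

    Sites in `blocked` are forbidden; they mimic the volume excluded
    by a rigid neighbour such as a helix (a toy assumption).
    """
    def extend(pos, visited, remaining):
        if remaining == 0:
            return 1
        total = 0
        for dx, dy in STEPS:
            nxt = (pos[0] + dx, pos[1] + dy)
            if nxt not in visited and nxt not in blocked:
                visited.add(nxt)
                total += extend(nxt, visited, remaining - 1)
                visited.remove(nxt)
        return total

    return extend(start, {start}, n_steps)


def entropy_in_kB(n_conformations):
    """Conformational entropy S / k_B = ln(number of conformations)."""
    return math.log(n_conformations)


# A 4-bond chain with no neighbours.
free = count_walks(4)

# The same chain next to a rigid wall of occupied sites at x = 1.
wall = frozenset((1, y) for y in range(-4, 5))
crowded = count_walks(4, blocked=wall)

# Entropy lost to volume exclusion, in units of k_B.
entropy_loss = entropy_in_kB(free) - entropy_in_kB(crowded)
```

Exhaustive enumeration of this kind is only feasible for short chains, which is why it is useful as a reference for testing an approximate entropy model rather than as a prediction method in its own right.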
In the second project, we develop a novel approach to the prediction of loop structures from sequence. The current loop free energy parameters (such as the Turner rules) depend only on the loop length and ignore the sequence dependence of the loop. Such an oversimplification can lead to significant inaccuracies in the prediction of loop structure and stability. Here we tackle the problem by extracting sequence-dependent scoring functions from known loop structures. Specifically, based on a survey of all the known RNA structures, we derive a set of virtual bond-based scoring functions for the different types of dinucleotides. To circumvent the problem of reference state selection, we apply an iterative method that extracts the effective potential from the complete conformational ensemble. This new method has two notable advantages: (1) the statistical potential is extracted from the complete conformational ensemble, including the nonnative structures; and (2) the method predicts low-energy loop structures from the sequence without additional information such as a homologous structural template. With this set of knowledge-based energy parameters, we can successfully identify the native structure of a given sequence as the best-scored structure in a set of structural decoys. Our extensive benchmark tests show consistently encouraging success rates in coarse-grained loop structure prediction.
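The idea of iteratively extracting a potential from a complete ensemble, instead of choosing a reference state, can be sketched with a toy example. The update rule below (raise the energy of a feature when the current potential over-predicts it relative to the native structure, lower it otherwise) is one common form of such iterative schemes and is an assumption here, as are the feature counts, the function names and all parameter values; the abstract does not specify the thesis's actual procedure.

```python
import math


def boltzmann_probs(energies, kT=1.0):
    """Boltzmann probabilities over an ensemble, shifted for stability."""
    e_min = min(energies)
    weights = [math.exp(-(e - e_min) / kT) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]


def score(counts, potential):
    """Energy of a conformation: sum of feature count times feature energy."""
    return sum(n * u for n, u in zip(counts, potential))


def fit_potential(ensemble, native_index=0, n_iter=500, rate=0.1, kT=1.0):
    """Iteratively adjust the potential until the ensemble-averaged
    feature counts move toward those observed in the native structure."""
    n_types = len(ensemble[0])
    potential = [0.0] * n_types
    observed = ensemble[native_index]
    for _ in range(n_iter):
        energies = [score(c, potential) for c in ensemble]
        probs = boltzmann_probs(energies, kT)
        predicted = [
            sum(p * c[k] for p, c in zip(probs, ensemble))
            for k in range(n_types)
        ]
        # Penalize over-predicted features, reward under-predicted ones.
        potential = [
            u + rate * (pred - obs)
            for u, pred, obs in zip(potential, predicted, observed)
        ]
    return potential


# Hypothetical ensemble: each row counts three feature types in one
# conformation. Row 0 plays the role of the native structure; the
# remaining rows are nonnative decoys.
ENSEMBLE = [
    [3, 1, 0],
    [1, 2, 1],
    [0, 3, 1],
    [2, 0, 2],
    [1, 1, 2],
    [2, 2, 0],
]

potential = fit_potential(ENSEMBLE)
energies = [score(c, potential) for c in ENSEMBLE]
best = energies.index(min(energies))
native_prob = boltzmann_probs(energies)[0]
```

In this sketch the nonnative conformations enter every iteration through the Boltzmann average, which is what lets the fitted potential separate the native structure from the decoys without an explicit reference state.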
Degree
Ph. D.
Thesis Department
Rights
Access is limited to the campus of the University of Missouri--Columbia.