Functional Sorting of Evolutionary Effects in Protein Domains

Format

Thesis

Abstract

One of the most important questions in evolutionary genetics is whether a gene of interest is under positive selection, negative selection, or neutral evolution. Although numerous methods and statistical tests exist to detect positive selection, almost all of them rely heavily on the availability of sufficient within-species polymorphism data and between-species divergence data. For the comparatively more conserved protein domain data, the lack of sufficient variation renders these general methods ineffective. A second problem lies in the difficulty of distinguishing positive selection from relaxation of functional constraint; both processes generally increase the rate of amino acid change relative to synonymous change within coding regions, and unless the amino acid rate is overwhelmingly high across an entire gene, the signature of positive selection can be obscured. Two different kinds of methods are proposed to tackle these problems. The first method utilizes the pattern of phylogenetic trees. Through a case study of the evolutionary analysis of the human hemopexin protein domain family, it is hypothesized that two family members, the hemopexin (HPX) and matrix metalloproteinase 12 (MMP12) genes, might have been under positive selection. These two genes were then systematically sequenced in six other primate species, and positive selection was detected by the H-test. Tree parameters including average family distance, distances between tree members, tree symmetry, and numbers of changes along the branches were examined. Based on multiple linear regression analysis, three tree parameters were used to construct an equation for identifying family members under selection, which deviate significantly from the prediction line. These prediction lines essentially represent the neutral theory in a phylogenetic form.
A more practical and valuable question is which part or which amino acids of a protein may be under selection, as this might help to pinpoint targets of critical functional significance. One critical difference between a functional gene and its corresponding pseudogene is that the non-functional version tends to become simpler in terms of sequence complexity. This provides an independent variable for assessing functional change. Information theory was applied to measure the change in information content (entropy) within a sequence. Combining this variable with the likelihood of amino acid change divides a two-dimensional plane into four quadrants, each representing a different evolutionary mode. This method can sort the functional meaning of variation down to the level of individual amino acids. The phylogenetic method is a family-wise method with a lower computational requirement, whereas the second method is more refined but more computationally intensive. The two methods can thus complement each other to satisfy diverse analysis requirements.
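
The quadrant idea described in the abstract can be illustrated with a minimal sketch. It is not the procedure used in the thesis: it assumes that the entropy change is measured as the difference in Shannon entropy of residue composition between an ancestral and a derived sequence window, and that the likelihood of amino acid change is supplied externally as a value between 0 and 1. The function names, the example sequences, and both cutoffs are hypothetical, and the mapping from quadrants to evolutionary modes is deliberately left out because the abstract does not state it.

```python
import math
from collections import Counter


def shannon_entropy(sequence):
    """Shannon entropy, in bits per residue, of the residue composition."""
    counts = Counter(sequence)
    total = len(sequence)
    # Sum p * log2(1/p) over the residues observed in the sequence.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def entropy_change(before, after):
    """Entropy of the derived window minus entropy of the ancestral window."""
    return shannon_entropy(after) - shannon_entropy(before)


def quadrant(delta_entropy, change_likelihood, likelihood_cutoff=0.5):
    """Place one site on the two-dimensional plane.

    The zero line for entropy change and the 0.5 likelihood cutoff are
    illustrative assumptions, not thresholds taken from the thesis.
    """
    entropy_side = "entropy_gain" if delta_entropy > 0 else "entropy_loss_or_none"
    likelihood_side = (
        "likely_change" if change_likelihood >= likelihood_cutoff else "unlikely_change"
    )
    return entropy_side, likelihood_side


# Hypothetical windows around one substituted site (made-up sequences).
ancestral = "AAAAAAGA"
derived = "AAAALAGA"
delta = entropy_change(ancestral, derived)
print(quadrant(delta, change_likelihood=0.2))
```

Each site is assigned a pair of labels, one per axis, so the four possible pairs correspond to the four quadrants. Interpreting a given quadrant as a particular evolutionary mode would require the definitions given in the thesis itself.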

Table of Contents

Abstract -- List of Illustrations -- List of Tables -- Abbreviations -- Introduction to Current Statistical Tests for Positive Selection -- Selection in Two Matrix Metalloproteinase Genes in the Primate Lineage -- Detection of Selections Utilizing Molecular Phylogenetics -- Evolutionary Analysis of Protein Domains -- Novel Method for Discerning the Action of Selection During Evolution -- Appendix -- References -- Vita.

Degree

Ph.D.
