dc.contributor.advisor | Xu, Dong, 1965- | eng |
dc.contributor.author | Alazmi, Meshari Saud | eng |
dc.date.issued | 2012 | eng |
dc.date.submitted | 2012 Spring | eng |
dc.description | Title from PDF of title page (University of Missouri--Columbia, viewed on September 10, 2012). | eng |
dc.description | The entire thesis text is included in the research.pdf file; the official abstract appears in the short.pdf file; a non-technical public abstract appears in the public.pdf file. | eng |
dc.description | Thesis advisor: Dr. Dong Xu | eng |
dc.description | Includes bibliographical references. | eng |
dc.description | M. S. University of Missouri--Columbia 2012. | eng |
dc.description | "May 2012" | eng |
dc.description.abstract | Quality assessment for protein structure models is an important issue in protein structure prediction. Consensus methods assess each model based on its structural similarity to all the other models in a model set, while single scoring methods, such as OPUS-Ca and RW, evaluate each model based on its own structural properties. In this work, a novel method is proposed and developed to effectively combine consensus methods and single scoring methods for better quality assessment. First, a new method called the Single Position Specific Probability (SPSP) score is proposed, based on a consensus method using 4-mer sequences. Specifically, every letter in the 4-mer sequence represents a state for a local region consisting of four amino acids. A machine learning method (a neural network) was then used to combine several single scoring methods (RW, dDFIRE, and OPUS-Ca) with the consensus methods SPSP and Consensus Global Distance Test-Total Score (CGDT-TS) to achieve a good combination of all the terms. The method was tested on two benchmark datasets and achieved improvements over the state-of-the-art methods. The first benchmark was Yang Zhang's data set, containing 56 targets. The second benchmark was the Rosetta data set, containing 35 targets. For Zhang's data, the CGDT score is 0.6058, while the combined method achieved 0.6105. For the Rosetta data, the CGDT score is 0.4255, while the combined method achieved 0.4529. | eng |
dc.format.extent | xiii, 98 pages | eng |
dc.identifier.uri | http://hdl.handle.net/10355/15238 | |
dc.language | English | eng |
dc.publisher | University of Missouri--Columbia | eng |
dc.relation.ispartofcommunity | University of Missouri--Columbia. Graduate School. Theses and Dissertations | eng |
dc.rights | OpenAccess. | eng |
dc.rights.license | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 License. | |
dc.subject | 4-mer sequence | eng |
dc.subject | protein structure prediction | eng |
dc.subject | protein structure model | eng |
dc.title | Protein structural models selection using 4-mer sequence and combined single and consensus scores | eng |
dc.type | Thesis | eng |
thesis.degree.discipline | Computer science (MU) | eng |
thesis.degree.grantor | University of Missouri--Columbia | eng |
thesis.degree.level | Masters | eng |
thesis.degree.name | M.S. | eng |