Protein structural models selection using 4-mer sequence and combined single and consensus scores

Alazmi, Meshari Saud

URI

http://hdl.handle.net/10355/15238

dc.contributor.advisor	Xu, Dong, 1965-	eng
dc.contributor.author	Alazmi, Meshari Saud	eng
dc.date.issued	2012	eng
dc.date.submitted	2012 Spring	eng
dc.description	Title from PDF of title page (University of Missouri--Columbia, viewed on September 10, 2012).	eng
dc.description	The entire thesis text is included in the research.pdf file; the official abstract appears in the short.pdf file; a non-technical public abstract appears in the public.pdf file.	eng
dc.description	Thesis advisor: Dr. Dong Xu	eng
dc.description	Includes bibliographical references.	eng
dc.description	M. S. University of Missouri--Columbia 2012.	eng
dc.description	"May 2012"	eng
dc.description.abstract	Quality assessment of protein structure models is an important problem in protein structure prediction. Consensus methods assess each model by its structural similarity to all the other models in a model set, while single scoring methods, such as OPUS-CA and RW, evaluate each model by its own structural properties. In this work, a novel method is proposed and developed to combine consensus methods and single scoring methods for better quality assessment. First, a new consensus-based score, the Single Position Specific Probability (SPSP) score, is proposed using a 4-mer sequence. Specifically, every letter in the 4-mer sequence represents a state for a local region consisting of four amino acids. A machine learning method (a neural network) is then used to combine several single scoring methods (RW, dDFIRE, and OPUS-CA) with two consensus methods (SPSP and the Consensus Global Distance Test-Total Score, CGDT-TS). The method was tested on two benchmark datasets and achieved improvements over state-of-the-art methods. The first benchmark was Yang Zhang's data, containing 56 targets; the second was Rosetta data, containing 35 targets. For Zhang's data, the CGDT score is 0.6058, while the combined method achieved 0.6105. For Rosetta data, the CGDT score is 0.4255, while the combined method achieved 0.4529.	eng
dc.format.extent	xiii, 98 pages	eng
dc.identifier.uri	http://hdl.handle.net/10355/15238
dc.language	English	eng
dc.publisher	University of Missouri--Columbia	eng
dc.relation.ispartofcommunity	University of Missouri--Columbia. Graduate School. Theses and Dissertations	eng
dc.rights	OpenAccess.	eng
dc.rights.license	This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 License.
dc.subject	4-mer sequence	eng
dc.subject	protein structure prediction	eng
dc.subject	protein structure model	eng
dc.title	Protein structural models selection using 4-mer sequence and combined single and consensus scores	eng
dc.type	Thesis	eng
thesis.degree.discipline	Computer science (MU)	eng
thesis.degree.grantor	University of Missouri--Columbia	eng
thesis.degree.level	Masters	eng
thesis.degree.name	M.S.	eng
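
The abstract in this record describes consensus methods as scoring each model by its structural similarity to all the other models in a model set. A minimal sketch of that idea in Python follows, assuming a precomputed symmetric matrix of pairwise similarity values (for example, GDT-TS on a 0 to 1 scale). The function names and the example numbers are hypothetical and are not taken from the thesis; the SPSP score and the neural-network combination are not reproduced here.

```python
def consensus_scores(similarity):
    """Return, for each model, its mean similarity to all other models.

    `similarity` is assumed to be a square, symmetric list of lists where
    similarity[i][j] is the pairwise structural similarity of models i and j.
    """
    n = len(similarity)
    if n < 2:
        raise ValueError("need at least two models to form a consensus")
    scores = []
    for i in range(n):
        # Skip the diagonal: a model's similarity to itself carries no information.
        others = [similarity[i][j] for j in range(n) if j != i]
        scores.append(sum(others) / (n - 1))
    return scores


def select_best(similarity):
    """Return the index of the model with the highest consensus score."""
    scores = consensus_scores(similarity)
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical pairwise similarities for three models (illustrative values only).
pairwise = [
    [1.0, 0.8, 0.6],
    [0.8, 1.0, 0.7],
    [0.6, 0.7, 1.0],
]

scores = consensus_scores(pairwise)
best = select_best(pairwise)
print(scores, best)
```

In this toy set the second model (index 1) is closest on average to the others, so a consensus method would rank it first. A single scoring method would instead score each model in isolation, without reference to the rest of the set.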

Files in this item

Name: public.pdf
Size: 2.020Kb
Format: PDF

Name: research.pdf
Size: 3.321Mb
Format: PDF

Name: short.pdf
Size: 68.87Kb
Format: PDF

This item appears in the following Collection(s)

2012 MU theses - Freely available online
Computer Science electronic theses and dissertations (MU)
The electronic theses and dissertations of the Department of Computer Science.