
dc.contributor.advisor: Cheng, Jianlin (eng)
dc.contributor.author: Hong, Yechan (eng)
dc.date.issued: 2019 (eng)
dc.date.submitted: 2019 Spring (eng)
dc.description.abstract: Motivation: SCOPe 2.07 is a dataset of 276,231 protein domains that have been partitioned into folds according to their shape and function. Since a protein's fold reveals valuable information about its shape and function, it is important to find a mapping between proteins and their folds. There are existing techniques to map a protein's sequence into a fold [2], but none to map a protein's shape into a fold for the entire SCOPe 2.07 dataset. We focus on the topological features of a protein to map it into a fold, and we introduce several new techniques that accomplish this. Results: We develop a 2D convolutional neural network to classify any protein structure into one of 1232 folds. We extract two classes of input features for each protein: distance matrices and persistent homology images. Due to restrictions in our computing resources, we sample every other point in the alpha-carbon chain, and we find that this does not lead to a significant loss in accuracy. Using the distance matrix, we achieve an accuracy of 90% on the entire dataset. With persistence images of 100x100 resolution, we achieve an accuracy of 54% on SCOP 1.55. (eng)
dc.description.bibref: iv, 62 pages : illustrations (eng)
dc.identifier.uri: https://hdl.handle.net/10355/70142
dc.title: PRO3DCNN : convolutional neural network for mapping protein structure into folds (eng)
dc.type: Thesis (eng)
thesis.degree.discipline: Computer science (eng)
thesis.degree.grantor: University of Missouri--Columbia (eng)
thesis.degree.level: Masters (eng)
thesis.degree.name: M.S. (eng)

