Improving protein structure prediction by deep learning and computational optimization

Hou, Jie

Hou, Jie

View/Open

HouJie.pdf (60.50Mb)

Date

2019

Format

Thesis

Metadata

[+] Show full item record

Abstract

Protein structure prediction is one of the most important scientific problems in the field of bioinformatics and computational biology. The availability of protein three-dimensional (3D) structure is crucial for studying biological and cellular functions of proteins. The importance of four major sub-problems in protein structure prediction have been clearly recognized. Those include, first, protein secondary structure prediction, second, protein fold recognition, third, protein quality assessment, and fourth, multi-domain assembly. In recent years, deep learning techniques have proved to be a highly effective machine learning method, which has brought revolutionary advances in computer vision, speech recognition and bioinformatics. In this dissertation, five contributions are described. First, DNSS2, a method for protein secondary structure prediction using one-dimensional deep convolution network. Second, DeepSF, a method of applying deep convolutional network to classify protein sequence into one of thousands known folds. Third, CNNQA & DeepRank, two deep neural network approaches to systematically evaluate the quality of predicted protein structures and select the most accurate model as the final protein structure prediction. Fourth, MULTICOM, a protein structure prediction system empowered by deep learning and protein contact prediction. Finally, SAXSDOM, a data-assisted method for protein domain assembly using small-angle X-ray scattering data. All the methods are available as software tools or web servers which are freely available to the scientific community.

URI

https://hdl.handle.net/10355/76251
https://doi.org/10.32469/10355/76251

Degree

Ph. D.

Thesis Department

Computer science (MU)

Rights

OpenAccess.

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 License.