Fully automated deep supervised and unsupervised learning approaches for 3D protein cryo-EM density map reconstruction
Proteins are among the most important components of the human body. They are used to build and repair tissues and to make enzymes and hormones, and they are essential building blocks of bones, muscles, cartilage, skin, and blood. Therefore, a large quantity of protein is always needed. Proteins are encoded as sequences of nucleotides that can be readily translated into a sequence of amino acids, which is known as the protein primary structure. For a protein to perform its job, it needs to be in its three-dimensional structure, which is also known as the protein tertiary structure. Several experimental methods have been developed to determine this structure. The most important among them are X-ray crystallography, nuclear magnetic resonance (NMR) spectroscopy, and, more recently, electron microscopy (EM). These methods require complicated procedures that are hard to implement, very time consuming, and labor intensive, and they require well-trained specialists. Therefore, an alternative approach that costs less time and money is required. Predicting and understanding molecular structure can lead to major breakthroughs in medicine, enabling the design and production of better drugs with greater efficacy and fewer side effects. In biotechnology, new and more efficient enzymes can be designed, which impacts many areas of daily life such as detergents, textiles, food and beverages, leather, and bioethanol. EM is gaining popularity in structural biology, but hundreds of thousands of single-particle images must be extracted from two-dimensional (2D) cryo-electron microscopy (cryo-EM) micrographs to build a reliable high-resolution three-dimensional (3D) model. Because high-energy electrons can severely damage the biomolecules of interest, a limited electron dose is used during imaging, which results in extremely noisy micrographs.
Hence, single-particle picking still presents significant challenges, which arise from several sources in the original 2D micrographs, such as very low signal-to-noise ratio (SNR), low contrast, heavy background noise, ice contamination, particle overlap, and amorphous carbon. Many computational methods have been proposed for automated and semi-automated single-particle picking over the past decades. Most of these methods are based on techniques such as template-based matching, edge detection, feature extraction, and conventional computer vision. These methods often need a large training dataset, which requires extensive manual labor. Other reference-dependent methods rely on low-resolution templates for particle detection, matching, and picking, and therefore are not fully automated. To address this challenge, we develop several models: AutoCryoPicker, a fully automated particle picking approach based on image preprocessing, unsupervised clustering, and shape detection; SuperCryoEMPicker, a fully automated super particle clustering method for picking particles of complex and irregular shape in cryo-EM images; and DeepCryoPicker, a fully automated deep neural network for single-particle picking in cryo-EM. Our approach addresses fully automated single-particle picking in diverse cryo-EM images. We combine two fully automated particle picking approaches (AutoCryoPicker and SuperCryoEMPicker) to pick single particles without manual intervention. We also develop a fully automated approach for expanding the training dataset and increasing the number of training particle images. The fully automated training particle selection can automatically distinguish between "good" and "bad" training examples and separates the selected particles into positive and negative detection examples. A deep neural network is then designed and trained using the generated training dataset.
Finally, for each testing micrograph, we use the developed preprocessing stage to improve the quality of the low-SNR micrographs. Then, we use the trained deep neural network with a sliding window to test every sub-image, and apply non-maximum suppression (NMS) to the resulting detections. The results indicate that DeepCryoPicker picks particles as accurately as RELION, which is a semi-automated particle picking method, and DeepEM. Another essential process for fully understanding and determining the protein structure is 3D density map reconstruction. The 3D density map of a single protein molecule gives significant insight into the relationship between protein function and structural dynamics. Individual cryo-EM particles provide an opportunity to reconstruct a 3D density map from single protein particles. However, the low-dose imaging used to limit radiation damage produces particle images with very low contrast and high noise. This limits and complicates particle alignment during 3D reconstruction at intermediate resolution (1-3 nm). To overcome this issue, we design DeepCryoMap, a fully automated cryo-EM particle alignment approach for 3D density map reconstruction based on deep supervised and unsupervised learning. In the first two steps, we use our previous model, DeepCryoPicker, to pick the particles from the micrographs fully automatically. The picked particles are then automatically classified and labeled by view (top or side view) using a deep classification network. Then, a perfect 2D binary mask is generated for every particle, and the original particle is aligned based on this mask. Finally, we use a 3D computer vision algorithm to reconstruct a localized 3D density map from every pair of particle images that share the most corresponding features (information).
Then, we average the localized 3D density maps to reconstruct the final 3D cryo-EM protein density map.
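The NMS step used with sliding-window detection can be illustrated with a short example. Sliding a window over a micrograph typically yields several overlapping candidate windows around the same particle, and NMS keeps only the highest-scoring window in each overlapping group. The following is a minimal sketch of the standard greedy NMS procedure, not the DeepCryoPicker implementation: the box format `(x1, y1, x2, y2)`, the function names, and the overlap threshold of 0.3 are all assumptions made for illustration.

```python
import numpy as np


def iou(box, boxes):
    """Intersection-over-union of one box against an array of boxes.

    Boxes are (x1, y1, x2, y2) with x2 > x1 and y2 > y1 (assumed format).
    """
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    # Clip at zero so non-overlapping boxes give an empty intersection.
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)


def non_max_suppression(boxes, scores, iou_threshold=0.3):
    """Greedy NMS: return indices of the boxes to keep, best score first.

    iou_threshold is a hypothetical value chosen for this sketch.
    """
    boxes = np.asarray(boxes, dtype=float)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(scores)[::-1]  # candidate indices, highest score first
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        # Discard remaining candidates that overlap the kept box too much.
        overlaps = iou(boxes[best], boxes[rest])
        order = rest[overlaps <= iou_threshold]
    return keep
```

For example, given the windows `[[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]]` with classifier scores `[0.9, 0.8, 0.7]`, the first two windows overlap heavily, so the sketch keeps indices `[0, 2]`.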
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 License.