A distributed CPU-GPU framework for large-scale pairwise alignment
Abstract
[ACCESS RESTRICTED TO THE UNIVERSITY OF MISSOURI AT AUTHOR'S REQUEST.] Several problems in computational biology require all-against-all pairwise comparison of tens of thousands of individual biological sequences. Each such comparison can be performed with the well-known Needleman-Wunsch alignment algorithm. However, with the rapid growth of biological databases, performing all possible comparisons serially with this algorithm becomes extremely time-consuming. The massive computational power of graphics processing units (GPUs) makes them an appealing choice for accelerating these computations, and CPU-GPU clusters can therefore enable all-against-all comparisons on large datasets. This thesis presents a hybrid MPI-CUDA framework for computing multiple pairwise sequence alignments on CPU-GPU clusters. The design targets both homogeneous clusters and heterogeneous clusters whose nodes differ in hardware and computing capabilities. The framework consists of the following components: a cluster-level dispatcher, a set of node-level dispatchers, and a set of CPU- and GPU-workers. The cluster-level dispatcher progressively distributes work to the compute nodes and aggregates the results. The node-level dispatchers distribute alignment tasks to the available CPUs and GPUs and perform dual-buffering to hide data transfers between CPU and GPU. CPU- and GPU-workers perform pairwise sequence alignments using the Needleman-Wunsch algorithm. The proposed GPU-workers are evaluated on different platforms, and all of them outperform the existing open-source implementation from the Rodinia Benchmark Suite.
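As an illustration of the per-pair computation the abstract refers to, the following is a minimal serial sketch of Needleman-Wunsch scoring applied to every pair in a small set of sequences. It is not the thesis's MPI-CUDA implementation. The function names `nw_score` and `all_pairs`, the linear gap model, and the scoring values (match +1, mismatch -1, gap -1) are assumptions chosen for the example.

```python
from itertools import combinations


def nw_score(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment score of a and b (Needleman-Wunsch, linear gap penalty).

    The scoring parameters are illustrative assumptions, not values from the thesis.
    Only two rows of the dynamic-programming matrix are kept in memory.
    """
    # Row 0: aligning the empty prefix of a against prefixes of b costs j gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Column 0: aligning a[:i] against the empty prefix of b costs i gaps.
        curr = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap inserted in b
            left = curr[j - 1] + gap  # gap inserted in a
            curr[j] = max(diag, up, left)
        prev = curr
    return prev[-1]


def all_pairs(seqs):
    """Score every unordered pair (i, j) with i < j, serially."""
    return {
        (i, j): nw_score(seqs[i], seqs[j])
        for i, j in combinations(range(len(seqs)), 2)
    }
```

For n sequences this loop performs n(n-1)/2 independent alignments, which is why the workload lends itself to being split across nodes and devices as the abstract describes.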
Degree
M.S.
Thesis Department
Rights
Access is limited to the University of Missouri - Columbia.