Bayesian Change Point Analysis of Copy Number Variants Using Human Next Generation Sequencing Data
Date
2014
Abstract
Read count analysis is the principal strategy for detecting copy number variants in human next generation sequencing (NGS) data. Read count data
from NGS have been shown to follow nonhomogeneous Poisson distributions.
Current change point analysis methods for detecting copy number variants
assume normally distributed data and use an ordinary normal approximation in
their algorithms. To improve sensitivity and reduce the false positive rate in detecting
copy number variants, we developed three models: a Bayesian Anscombe normal approximation model for a single genome, a Bayesian Poisson model for a single
genome, and a Bayesian Anscombe normal approximation model for paired genomes.
The Bayesian statistics were optimized through Monte Carlo simulations for detecting
change points and copy numbers at single and multiple change points. Three
R packages based on these models were built to simulate Poisson-distributed data and to estimate and display copy number variants in tables and graphics. The high
sensitivity and specificity of these models were demonstrated, in comparison with
other popular packages, on simulated read count data with known Poisson
distributions and on human NGS read count data.
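As a rough illustration of the idea behind an Anscombe normal approximation, the sketch below applies the standard Anscombe transform, 2*sqrt(x + 3/8), to Poisson-like read counts and then scores every candidate location for a single change point. It is not the thesis's model or its R packages: the uniform prior on the change point location, the unit-variance normal likelihood with segment means replaced by sample means, the function names, and the read counts are all assumptions made for this example.

```python
import math


def anscombe(count):
    """Anscombe transform: a Poisson count becomes roughly normal with variance 1."""
    return 2.0 * math.sqrt(count + 3.0 / 8.0)


def change_point_posterior(counts):
    """Posterior over one change point k, splitting counts[:k] from counts[k:].

    Illustrative assumptions (not taken from the thesis): a uniform prior on k,
    transformed values treated as normal with variance 1, and each segment
    mean replaced by its sample mean (a plug-in shortcut).
    """
    y = [anscombe(c) for c in counts]
    n = len(y)
    log_lik = []
    for k in range(1, n):
        left, right = y[:k], y[k:]
        mean_left = sum(left) / len(left)
        mean_right = sum(right) / len(right)
        sse = sum((v - mean_left) ** 2 for v in left)
        sse += sum((v - mean_right) ** 2 for v in right)
        log_lik.append(-0.5 * sse)
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(log_lik)
    weights = [math.exp(value - peak) for value in log_lik]
    total = sum(weights)
    return {k: w / total for k, w in zip(range(1, n), weights)}


# Hypothetical read counts per window: about 20 before the change, about 40 after.
counts = [19, 22, 18, 21, 20, 23, 17, 20, 41, 38, 43, 39, 40, 37, 42, 40]
posterior = change_point_posterior(counts)
best_k = max(posterior, key=posterior.get)
print("most probable change point index:", best_k)
```

The transform is used here because a Poisson count has variance equal to its mean, so a normal model with one fixed variance fits the raw counts poorly when the copy number changes; after the transform the variance is close to 1 on both sides of the change.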
Table of Contents
Background -- Single genome Bayesian approaches in NGS read count analysis -- Normal approximation Bayesian change point model for paired genomes -- Conclusion and future work
Degree
Ph.D.