Bayesian Change Point Analysis of Copy Number Variants Using Human Next Generation Sequencing Data
Read count analysis is the principal strategy for detecting copy number variants in human next generation sequencing (NGS) data. Read counts from NGS have been shown to follow nonhomogeneous Poisson distributions. Current change point methods for detecting copy number variants assume normally distributed data and use the ordinary normal approximation in their algorithms. To improve sensitivity and reduce the false positive rate, we developed three models: a Bayesian Anscombe normal approximation model for a single genome, a Bayesian Poisson model for a single genome, and a Bayesian Anscombe normal approximation model for paired genomes. The Bayesian statistics were optimized through Monte Carlo simulations for detecting change points and copy numbers at single and multiple change points. Three R packages based on these models were built to simulate Poisson-distributed data and to estimate and display copy number variants in tables and graphics. The models showed high sensitivity and specificity, in comparison with other popular packages, both on simulated read count data with known Poisson distributions and on human NGS read count data.
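As a rough illustration of the two ideas named in the abstract, the sketch below shows the Anscombe variance-stabilizing transform and a textbook single change point posterior for Poisson counts. It is not taken from the thesis or its R packages. The function names, the Gamma(1, 0.05) prior on each segment mean, the uniform prior on the change point location, and the simulated rates of 20 and 40 reads per window are all assumptions made for the example.

```python
import math
import random


def anscombe(count):
    """Anscombe transform: maps a Poisson count to a value with variance near 1."""
    return 2.0 * math.sqrt(count + 3.0 / 8.0)


def sample_poisson(rng, rate):
    """Draw one Poisson variate with Knuth's multiplication method (small rates only)."""
    threshold = math.exp(-rate)
    k, product = 0, rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k


def log_marginal(total, n, shape, rate):
    """Log marginal likelihood of n Poisson counts summing to `total` under a
    Gamma(shape, rate) prior on the segment mean. The factorial term is dropped
    because it is the same for every candidate change point."""
    return (shape * math.log(rate) - math.lgamma(shape)
            + math.lgamma(shape + total)
            - (shape + total) * math.log(rate + n))


def change_point_posterior(counts, shape=1.0, rate=0.05):
    """Posterior over k, the number of windows before a single change point,
    with a uniform prior on k = 1 .. len(counts) - 1 (hypothetical prior choice)."""
    n = len(counts)
    total = sum(counts)
    log_post = []
    left = 0
    for k in range(1, n):
        left += counts[k - 1]  # sum of the first k counts
        log_post.append(log_marginal(left, k, shape, rate)
                        + log_marginal(total - left, n - k, shape, rate))
    peak = max(log_post)  # subtract the maximum before exponentiating
    weights = [math.exp(v - peak) for v in log_post]
    norm = sum(weights)
    return [w / norm for w in weights]  # entry i is P(k = i + 1 | counts)


# Simulated read counts: 50 windows at rate 20, then 50 windows at rate 40.
rng = random.Random(0)
counts = ([sample_poisson(rng, 20.0) for _ in range(50)]
          + [sample_poisson(rng, 40.0) for _ in range(50)])

posterior = change_point_posterior(counts)
k_hat = posterior.index(max(posterior)) + 1
print("Most probable change point: after window", k_hat)
print("First five transformed counts:", [round(anscombe(c), 2) for c in counts[:5]])
```

The Poisson version works on the raw counts directly, while the transformed values could instead be passed to a normal-theory change point model with roughly unit variance. Which of the two behaves better on real read count data is the question the abstract's models address, and this sketch makes no claim about it.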
Table of Contents
Background -- Single genome Bayesian approaches in NGS read count analysis -- Normal approximation Bayesian change point model for paired genomes -- Conclusion and future work