Repetitive sequence analysis for soybean genome sequences
Repetitive DNA sequences present many difficulties in genome sequencing and analysis. At the same time, some repetitive sequences play an important role in gene function and evolution. A new protocol was developed to identify and characterize long and highly conserved repetitive DNA sequences. The protocol combines four existing tools: RepeatMasker, BLAST, RECON and Clustal W. It was applied to two soybean genomic sequence databases: a combined database of methylation-filtered and unfiltered sequences, and a collection of soybean bacterial artificial chromosome (BAC) end sequences. A total of 313 repetitive sequences were identified. Each of these sequences was at least 100 bp in length and had a frequency of seven or more. Further statistical analysis of the composition and distribution of the identified repeat sequences was also performed.
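The selection criteria stated above (a length of at least 100 bp and a frequency of seven or more) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `RepeatFamily` record, the function names, and the sample data are hypothetical, and the sketch assumes that "frequency" means the number of copies observed for a repeat family.

```python
from typing import List, NamedTuple


class RepeatFamily(NamedTuple):
    """Hypothetical record for one candidate repeat family."""
    name: str
    consensus: str  # consensus sequence of the family
    copies: int     # assumed meaning of "frequency": number of copies observed


# Thresholds taken from the abstract.
MIN_LENGTH_BP = 100
MIN_COPIES = 7


def passes_thresholds(family: RepeatFamily,
                      min_length: int = MIN_LENGTH_BP,
                      min_copies: int = MIN_COPIES) -> bool:
    """Return True if the family is long enough and frequent enough."""
    return len(family.consensus) >= min_length and family.copies >= min_copies


def filter_families(families: List[RepeatFamily]) -> List[RepeatFamily]:
    """Keep only the families that meet both thresholds."""
    return [f for f in families if passes_thresholds(f)]


# Placeholder data for illustration only, not real soybean sequences.
candidates = [
    RepeatFamily("fam_a", "ACGT" * 30, 12),  # 120 bp, 12 copies: kept
    RepeatFamily("fam_b", "ACGT" * 30, 3),   # 120 bp, 3 copies: too few copies
    RepeatFamily("fam_c", "ACGT" * 10, 40),  # 40 bp, 40 copies: too short
]

kept = filter_families(candidates)
print([f.name for f in kept])
```

Both thresholds are inclusive, so a family of exactly 100 bp with exactly seven copies would be retained under this reading of the criteria.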