Nested association mapping and accuracy of predicting genomic breeding value for agronomic and seed composition traits in three interspecific soybean populations
Given the narrow genetic base of soybean, discovering useful traits in exotic germplasm could increase the diversity of the current elite gene pool. However, it is essential to characterize beneficial alleles from wild soybean (Glycine soja) to enhance genetic gain. The objective of this study was to investigate grain yield, agronomic traits, and seed composition traits using a soybean nested association mapping (NAM) panel containing crosses between Williams 82 (hub parent) and three Glycine soja parents (PI464890B, PI458536, and PI522226). Field tests were conducted in Albany, Columbia, Novelty, and Rock Port, Missouri, for two years (2016 and 2017) in an augmented incomplete block design with one replication in 2016 and two replications in 2017. Nested association mapping and linkage mapping identified three major QTLs for plant maturity from Glycine soja on Chromosomes 6, 11, and 12, associated with a significant increase in days to maturity. A major QTL for plant height was identified on Chromosome 13, with taller plants among lines that carried the wild soybean allele. A significant QTL for grain yield from Glycine soja was detected on Chromosome 17; it showed a positive effect of 166.1 kg ha⁻¹, and lines carrying it yielded an average of 6% more than the Glycine max parent (Williams 82) across environments. We also identified 61 and 12 QTLs associated with seed composition traits in the NAM analysis and linkage analysis, respectively. Four QTLs showed pleiotropic effects on soybean seed composition traits. Two QTLs, one on Chromosome 5 and another on Chromosome 15, were associated with the fatty acid profile, explaining 3-18% of the phenotypic variance. The confirmed QTLs for protein and oil, cqSeed protein-001 on Chromosome 15 and cqSeed protein-003 on Chromosome 20, were also identified. The QTL on Chromosome 20 was additionally associated with ten amino acids; however, the allele associated with protein concentration was also responsible for a reduction in amino acid concentration.
Another QTL on Chromosome 19 was associated with cysteine, methionine, and leucine and explained 9-30% of the phenotypic variation. Our results reinforce that increasing protein may not increase amino acid concentrations and suggest independent genetic control for protein and sulfur-containing amino acids. In addition to the mapping study, we conducted a genomic prediction study in the NAM panel. Increasing the training population size from 50 to 300 individuals improved prediction accuracy from 0.49 to 0.77 (a 57% increase) across all traits, with little further gain between 300 and 390 individuals (1%). Marker density had little overall impact on prediction accuracy across traits, although accuracy increased significantly (18.5%) as the number of markers rose to 1423. The Across all families training population design gave higher prediction accuracies for all traits than the Leave one family out and Within families designs, with prediction accuracies ranging from moderate (0.55) to high (0.75). The NAM panel containing interspecific crosses successfully predicted polygenic traits. Our results showed encouraging prediction accuracies for grain yield (0.55-0.73), which is notable for crosses derived from wild soybean. In conclusion, training population strategies that maximized population size and the number of families (the Across all families design) produced robust prediction accuracies for yield, maturity, protein, and oil. Genomic prediction might also accelerate genetic gain in pre-breeding efforts using wild soybeans.
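The three training population designs compared above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are simulated rather than taken from the NAM panel, ridge regression stands in for whatever prediction model the study used, and the function names, family sizes, marker counts, and the penalty `lam` are hypothetical. Prediction accuracy is taken as the Pearson correlation between observed and predicted phenotypes.

```python
import numpy as np

def ridge_predict(X_train, y_train, X_test, lam=200.0):
    """Fit ridge regression on marker scores and predict test phenotypes.
    lam is an arbitrary illustrative penalty, not a value from the study."""
    mu_x = X_train.mean(axis=0)
    mu_y = y_train.mean()
    Xc = X_train - mu_x
    A = Xc.T @ Xc + lam * np.eye(Xc.shape[1])
    beta = np.linalg.solve(A, Xc.T @ (y_train - mu_y))
    return mu_y + (X_test - mu_x) @ beta

def accuracy(y_obs, y_pred):
    """Prediction accuracy as the Pearson correlation."""
    return float(np.corrcoef(y_obs, y_pred)[0, 1])

def kfold_accuracy(X, y, k, rng, lam=200.0):
    """Mean accuracy over k random folds of the rows of X."""
    idx = rng.permutation(len(y))
    accs = []
    for test in np.array_split(idx, k):
        train = np.setdiff1d(idx, test)
        pred = ridge_predict(X[train], y[train], X[test], lam)
        accs.append(accuracy(y[test], pred))
    return float(np.mean(accs))

def across_all_families(X, y, family, k, rng):
    """Train and test on lines pooled from every family."""
    return kfold_accuracy(X, y, k, rng)

def leave_one_family_out(X, y, family):
    """Predict each family using only the other families for training."""
    accs = []
    for f in np.unique(family):
        test = family == f
        pred = ridge_predict(X[~test], y[~test], X[test])
        accs.append(accuracy(y[test], pred))
    return float(np.mean(accs))

def within_families(X, y, family, k, rng):
    """Cross-validate separately inside each family, then average."""
    accs = [kfold_accuracy(X[family == f], y[family == f], k, rng)
            for f in np.unique(family)]
    return float(np.mean(accs))

def simulate(n_per_family=100, n_markers=200, n_families=3, seed=0):
    """Toy data: 0/2 marker scores, additive effects, heritability near 0.5."""
    rng = np.random.default_rng(seed)
    family = np.repeat(np.arange(n_families), n_per_family)
    X = 2.0 * rng.integers(0, 2, size=(family.size, n_markers))
    g = X @ rng.normal(size=n_markers)
    y = g + rng.normal(scale=g.std(), size=g.size)
    return X, y, family

if __name__ == "__main__":
    X, y, family = simulate()
    rng = np.random.default_rng(1)
    print("Across all families: ", across_all_families(X, y, family, 5, rng))
    print("Leave one family out:", leave_one_family_out(X, y, family))
    print("Within families:     ", within_families(X, y, family, 5, rng))
```

Because the simulated families share the same marker structure and effects, the numbers this sketch prints say nothing about the ranking of designs reported in the study; it only shows how the three train/test splits differ.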
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 License.