The Effect of Haplotype Size on Genomic Selection Accuracy and Epistasis: An Empirical Study in Rice
Description
This dataset contains phenotypic and genotypic data collected from rice populations to investigate the impact of haplotype size on genomic selection (GS) accuracy and epistasis. The phenotypic data covers field trials from 2020, 2021, and 2022 for the MP2 and MP6-8 populations. Genotypic data is included for the MP2, MP4, and MP6-8 populations. All scripts necessary for the analysis are also provided to support reproducibility.

Research Hypothesis and Key Findings

Genomic selection (GS) has revolutionized breeding by combining genotype and phenotype data to predict genomic estimated breeding values (GEBVs), potentially accelerating breeding cycles. This study hypothesized that recombination affects haplotype size and linkage disequilibrium (LD), and thereby influences GS prediction accuracy. Specifically, the study aimed to:

1. Examine the relationship between recombination and haplotype size.
2. Compare the prediction accuracy of additive (A) and additive + epistasis (A+I) models.
3. Investigate how haplotype resolution in the training set (TS) affects prediction accuracy.

Results showed a direct correlation between LD decay and recombination opportunities within populations: populations that underwent more recombination displayed smaller haplotype blocks. While the A+I model improved heritability, it did not enhance prediction accuracy. Populations with smaller haplotype sizes in the TS exhibited improved prediction accuracy, highlighting the importance of haplotype size in GS.

Data Description and Use

The phenotypic data includes traits measured across three years of field trials, while the genotypic data represents the underlying genetic makeup of the populations. The unique aspect of this dataset is its focus on populations in which recombination rate, and therefore haplotype size, is the primary variable. Researchers can use this dataset to explore the relationship between recombination, haplotype structure, and GS prediction accuracy, providing insights into breeding strategy design.
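
As a starting point for exploring LD decay in the genotypic data, the following is a minimal sketch of how mean pairwise r^2 could be summarized by physical distance. It is not the method used in the provided scripts, which may compute LD differently. It assumes markers on a single chromosome, genotypes coded as 0/1/2 allele counts, and positions in base pairs; the function names, the bin size, and the toy data are hypothetical. The r^2 here is the squared Pearson correlation of unphased genotype codes, a common approximation to haplotype-based LD.

```python
from collections import defaultdict
from statistics import mean


def r_squared(x, y):
    """Squared Pearson correlation between two markers' genotype codes (0/1/2)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:  # monomorphic marker: r^2 is undefined
        return None
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy ** 2 / (sxx * syy)


def ld_decay(positions, genotypes, bin_size):
    """Mean pairwise r^2 per distance bin for markers on one chromosome.

    positions: marker positions in bp.
    genotypes: one list of 0/1/2 codes per marker, same order as positions.
    Returns {bin_start_bp: mean r^2}, ordered by increasing distance.
    """
    bins = defaultdict(list)
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            r2 = r_squared(genotypes[i], genotypes[j])
            if r2 is not None:
                dist = abs(positions[j] - positions[i])
                bins[(dist // bin_size) * bin_size].append(r2)
    return {start: mean(vals) for start, vals in sorted(bins.items())}


# Toy data (hypothetical, not taken from the dataset): 3 markers, 6 lines.
positions = [1000, 2000, 90000]
genotypes = [
    [0, 0, 2, 2, 0, 2],
    [0, 0, 2, 2, 0, 2],  # identical to the first marker: complete LD
    [0, 2, 0, 2, 2, 0],  # weakly associated with the first two markers
]
decay = ld_decay(positions, genotypes, bin_size=50000)
```

Running the same summary separately for each population would allow the decay curves to be compared. The all-pairs loop is quadratic in the number of markers, so a real marker set would normally be restricted to pairs within a maximum distance.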