Genetic Basis Underlying Fertility Restoration of CMS-WA and CMS-HL
In this study, GWAS revealed that fertility restoration of CMS-WA was mainly conditioned by the major gene Rf4, as the remaining loci were not detected in both years and accounted for less variation (Table 1). The locus around 5.6 Mb on chromosome 1 explained 19.01% of the variation in pollen fertility in 2014 and fell within the mapping region of Rf3 (Qi et al. 2008, Suresh et al. 2012, Yao et al. 1997, Zhang et al. 1997). The other loci were novel. For CMS-HL, Rf5 was the only major gene in the association population, and the remaining loci were associated only with BSS in 2014 (Fig. 4, Table 2).
The majority of loci were not detected in both years, which may be attributed to two reasons. First, the number of loci detected for BSS far exceeds that for pollen fertility and NSS (Tables 1 and 2), suggesting that some loci for BSS may be false positives caused by handling artifacts and environmental effects. At the flowering stage, selected panicles were tightly bagged, which affected stem elongation of different lines to varying degrees. The stems of some lines were bent at a large angle or even broken, leading to a low or zero seed-setting rate. The bag also limited the space available to the panicles, especially in lines with large panicles, and raised the local temperature, which would further decrease the seed-setting rate. The loci for BSS therefore await further validation. Second, each of the two association populations was about 100 lines smaller in 2013 than in 2014, which could explain why some loci detected in 2014 were not detected in 2013, such as Rf3, the reported major gene for CMS-WA (Additional file 2: Table S1). The 337 paternal accessions and the two maternal parents varied widely in heading date (http://ricevarmap.ncpgr.cn/v2/), which made it very difficult to produce hybrids covering all paternal accessions in the 2012 planting season, so additional hybrids were made in 2013. Despite these two limitations, the stable detection of Rf4 and Rf5 further demonstrates that they are the major genes for the corresponding CMS types.
Haplotype Analysis of Rf Genes
Haplotype analysis of Rf4 revealed five types, of which H1, H2, H4 and H5 had been reported by Tang et al. (2014). The H2 type is carried by the restorer IR24 (not among our accessions), demonstrating that it is functional, which is consistent with the multiple comparisons of fertility-related traits in our study (Tang et al. 2014). Therefore, the two nonsynonymous SNPs in H2 are likely to have little effect on the function of RF4. No difference was observed between H3 and H4 for any of the three traits, indicating that H3 is also a nonfunctional type, due to the change of 50 amino-acid residues (Fig. 5a-b, Additional file 1: Figure S4a). In addition, among the paternal parents used in this study, aus accessions carry only the nonfunctional H3 and H4 types, 125 of 143 XI I accessions carry the nonfunctional H4 type, and 59 of 64 accessions carrying the functional H2 type belong to the XI II group. The subgroup preferences of the different Rf4 haplotypes imply that Rf4 has been subjected to selection in rice breeding and has contributed greatly to the differentiation of the three subgroups.
Haplotype analysis of Rf5 revealed two main haplotypes and several rare haplotypes that are carried by fewer than five accessions (Additional file 2: Table S1, Fig. 5c). The nonfunctional H2 type carried by YTA does not have the functional SNP reported by Hu et al. (2012) (Additional file 2: Table S4). Therefore, the change of 14 amino-acid residues is likely to abolish the function of RF5 in fertility restoration, as with the H3 type of Rf4. The H1 type is carried by the majority of paternal accessions, implying that it has other important functions that facilitated its spread in cultivated rice, although these remain unclear. Several rare haplotypes, including the H3 type, exist only in aus accessions, and the two main haplotypes are also present in aus accessions, showing that aus accessions are a valuable germplasm resource for investigating the evolution of Rf5 alleles (Additional file 2: Table S1).
Haplotype analysis of Rf6 revealed eight haplotypes, of which the first three carry the functional 327 bp insertion reported by Huang et al. (2015) (Fig. 6a). However, multiple comparisons of the six main haplotypes indicated that the 327 bp insertion is unlikely to be the functional variation. Because BSS and NSS are only indirect measures of fertility restoration, this conclusion awaits validation with pollen fertility, which was unfortunately difficult to evaluate in this study.
Difficulties in Mining Rf Genes Using GWAS
GWAS has proved powerful for the genetic dissection of complex quantitative traits in rice and for identifying candidate genes underlying target traits (Han and Huang 2013). However, its power was limited in mining Rf genes, at least in this study. GWAS of fertility-related traits revealed that fertility restoration of both CMS-WA and CMS-HL was controlled by one major locus and several minor loci (Tables 1 and 2). However, the two major loci were located in a region containing about 10 genes encoding PPR proteins, which show high sequence homology (Tang et al. 2014). Without the previous cloning of Rf4 and Rf5, it would have been very difficult to identify the underlying functional genes. Furthermore, although the Rf3 region was detected for pollen fertility, the region within about ±100 kb of the strongest signal contains 20 annotated ORFs, none of which encodes a PPR protein or a homolog of other proteins known to be involved in fertility restoration, making it difficult to select candidate genes. Therefore, as for all seven cloned Rf genes in rice, viz. Rf1a, Rf1b, Rf2, Rf4, Rf5, Rf6 and Rf17, map-based cloning remains the only way to narrow the target locus down to the smallest region containing the functional gene (Fujii and Toriyama 2009, Hu et al. 2012, Huang et al. 2015, Itabashi et al. 2011, Tang et al. 2014, Wang et al. 2006). In addition, the three fertility-related traits are easily affected by the environment. The sustained high temperature and humidity in Wuhan during the growing season placed great stress on developing and mature pollen, which resulted in the inconsistent detection of many association signals between the two years, especially for minor loci (Tables 1 and 2). Therefore, a combination of GWAS and linkage mapping would be better suited to mining Rf genes, as it would provide both an overview of the genetic basis and high-resolution localization of functional genes (Deng et al. 2017, Wang et al. 2018).
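The candidate-gene screen described above (listing annotated ORFs within about ±100 kb of the strongest signal and checking whether any encodes a PPR protein) can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the `Gene` record, the function names, the gene identifiers, the coordinates and the annotations are all hypothetical, and the keyword match on the annotation text is only an assumed stand-in for a proper homology search.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Gene:
    gene_id: str      # hypothetical identifier, not a real locus name
    chrom: str
    start: int        # 1-based, inclusive
    end: int          # 1-based, inclusive
    annotation: str   # free-text functional description


def genes_in_window(genes: List[Gene], chrom: str, lead_pos: int,
                    flank: int = 100_000) -> List[Gene]:
    """Return genes overlapping [lead_pos - flank, lead_pos + flank] on chrom."""
    lo, hi = lead_pos - flank, lead_pos + flank
    return [g for g in genes
            if g.chrom == chrom and g.start <= hi and g.end >= lo]


def flag_ppr(genes: List[Gene]) -> List[Gene]:
    """Keep genes whose annotation mentions a pentatricopeptide repeat."""
    return [g for g in genes if "pentatricopeptide" in g.annotation.lower()]


# Toy annotation table (invented values, for illustration only).
toy_genes = [
    Gene("toy_gene_1", "chr1", 5_520_000, 5_523_000, "expressed protein"),
    Gene("toy_gene_2", "chr1", 5_690_000, 5_705_000, "protein kinase"),
    Gene("toy_gene_3", "chr1", 5_800_000, 5_802_000,
         "pentatricopeptide repeat protein"),
    Gene("toy_gene_4", "chr10", 5_600_000, 5_601_000,
         "pentatricopeptide repeat protein"),
]

# Lead SNP placed at 5.6 Mb on chromosome 1, as for the Rf3 region signal.
window = genes_in_window(toy_genes, "chr1", 5_600_000)
ppr_hits = flag_ppr(window)
print([g.gene_id for g in window], [g.gene_id for g in ppr_hits])
```

In this toy table the window contains two genes and no PPR gene, mirroring the situation in which a window-based screen alone gives no obvious candidate and finer mapping is needed.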
Application in Development of Three-Line Hybrid Rice
A three-line hybrid combination consists of three lines: a restorer, a CMS line and its maintainer line. CMS lines and their maintainer lines must not carry Rf genes, so that the complete sterility of CMS lines is maintained. In contrast, Rf genes are desirable in restorers, to restore the fertility of CMS lines as fully as possible. Therefore, selection of Rf genes is of great importance in the development of three-line hybrid rice. Some markers have been developed for Rf4 in previous studies (Chen et al. 2017, Suresh et al. 2012, Tang et al. 2014). In this study, the haplotypes of three Rf genes, viz. Rf4, Rf5 and Rf6, were systematically classified using 337 accessions that cover the majority of the variation in XI and aus accessions worldwide, providing valuable sequence variations for the development of co-segregating markers (Additional file 2: Tables S3-S4, Fig. 6a). Taking Rf4 as an example, SNPs at positions +503, +919, +929/930, +1607, +1618 and +1621 of the coding region and the 1515 bp insertion co-segregate with the function of Rf4, and thus could be developed into molecular markers to facilitate selection of Rf4.
In addition to major genes, minor Rf genes are vital in the breeding process, as they greatly affect both the sterility stability of CMS lines and the degree of fertility restoration by restorers. However, the selection efficiency for minor Rf genes falls far short of expectations, because they cannot be mapped precisely. To avoid interference from minor Rf genes, new maintainer lines are always developed from progenies of existing maintainer lines, and the same is true of restorers, which severely limits the genetic diversity of hybrid combinations. In this study, the fertility restoration ability of 337 accessions for CMS-WA and CMS-HL was evaluated individually (Additional file 2: Table S1). Accessions that displayed no fertility restoration in a CMS background could be used directly as maintainer lines or as parents of novel maintainer lines, and those showing high fertility restoration in a CMS background could be used in breeding restorers. The majority of the 337 accessions are inbred lines or landraces from around the world, and several breeders are not familiar with them, suggesting that these accessions have not been exploited in hybrid rice breeding (personal communication, Xie et al. 2015). Therefore, the results of this study could provide valuable germplasm resources to broaden the genetic diversity of three-line hybrid rice.
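As an illustration of how the co-segregating Rf4 variants listed above might be used once they are converted into markers, the following Python sketch calls an accession from its marker states. It is a hedged example only. The marker labels follow the positions given in the text, but the function `call_rf4`, the call categories and the input encoding are assumptions: each marker is recorded simply as `True` (the state seen in functional haplotypes), `False` (the state seen in nonfunctional haplotypes) or missing, because the actual alleles are given in Additional file 2 and are not reproduced here.

```python
from typing import Dict, Optional

# Marker labels taken from the positions named in the text; "ins1515" is a
# hypothetical label for the 1515 bp insertion polymorphism.
MARKERS = ["+503", "+919", "+929/930", "+1607", "+1618", "+1621", "ins1515"]


def call_rf4(genotype: Dict[str, Optional[bool]]) -> str:
    """Call an accession from per-marker states.

    genotype maps a marker label to True (state associated with functional
    haplotypes), False (state associated with nonfunctional haplotypes) or
    None (missing data). Markers absent from the dict count as missing.
    """
    observed = [genotype[m] for m in MARKERS if genotype.get(m) is not None]
    if not observed:
        return "no_call"
    if all(observed):
        return "functional_like"
    if not any(observed):
        return "nonfunctional_like"
    # Markers disagree: possible recombinant, rare haplotype or genotyping error.
    return "inconsistent"


# Toy accessions (invented marker states, for illustration only).
print(call_rf4({m: True for m in MARKERS}))
print(call_rf4({"+503": True, "+919": False}))
```

Because the variants are reported to co-segregate, a single marker would in principle be sufficient for a call; scoring several of them, as assumed here, mainly helps to expose genotyping errors or rare recombinant haplotypes.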
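The screening of accessions into candidate maintainers and candidate restorers described above can be sketched as a simple threshold rule. The Python example below is only illustrative: the function name, the accession labels, the trait values and both cut-offs (`sterile_max` and `restorer_min`) are assumptions and are not thresholds used in this study. An accession is classified from the fertility values of its hybrid with a CMS line (for example, pollen fertility in percent) across the years in which it was scored.

```python
from typing import Dict, List


def classify_accession(values: List[float],
                       sterile_max: float = 0.0,
                       restorer_min: float = 80.0) -> str:
    """Classify an accession from hybrid fertility values (percent).

    sterile_max and restorer_min are assumed cut-offs chosen for
    illustration; a breeding programme would set its own.
    """
    if not values:
        return "unscored"
    if all(v <= sterile_max for v in values):
        return "maintainer_candidate"   # no restoration in any scored year
    if all(v >= restorer_min for v in values):
        return "restorer_candidate"     # high restoration in every scored year
    return "intermediate"


# Toy data: accession label -> fertility of its hybrid in each scored year.
toy_scores: Dict[str, List[float]] = {
    "acc_A": [0.0, 0.0],
    "acc_B": [85.2, 91.0],
    "acc_C": [0.0, 35.5],
    "acc_D": [],
}

calls = {acc: classify_accession(vals) for acc, vals in toy_scores.items()}
print(calls)
```

Requiring agreement across all scored years is a deliberately conservative choice in this sketch, given that the fertility-related traits are easily affected by the environment.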