Overview of Rice Transcripts Detected by dRNA-seq
Alternative splicing is a widespread phenomenon that is essential for post-transcriptional regulation, mediating mRNA stability and protein diversity in eukaryotes. In this study, we used dRNA-seq to study alternative splicing events in rice young panicles (YP), unfertilized florets (UF) and fertilized florets (F). The dRNA-seq generated 2.01 million, 2.07 million and 1.80 million reads with a read N50 (the read length at or above which the reads contain 50% of all sequenced bases) of 1080, 1095 and 1210 in YP, UF and F, respectively. The mean read length ranged from 706 to 818 bp, and the maximum read length ranged from 7063 to 8030 bp. After error correction, 1,761,906, 1,836,813 and 1,606,743 reads were mapped to the rice MSU7.0 genome, corresponding to mapping rates of 87.62%, 88.60% and 89.40%, respectively. We then assessed the quality of the sequencing reads, which showed consistently high quality scores along the length of the reads (Additional file 1: Fig. S1A, B). The number of full-length transcripts ranged from 1,718,488 to 2,189,176, accounting for over 73.9% of the total clean reads (Additional file 2: Table S1). In total, 56,718 genes were annotated in the genome, and 1347 genes were annotated as new genes (Additional file 2: Table S2). The dRNA-seq detected 51,742 transcripts in at least one sample, of which 10,067 were new isoforms mapped to known genes and 1633 were novel isoforms mapped to the new genes (Additional file 2: Table S2). We also calculated the mapping rate of all novel genes and transcripts to rice and other plant species (Additional file 1: Fig. S2). About 94.63% of the new genes mapped to the rice genome, and few mapped to other species (Additional file 1: Fig. S2A). The novel genes that mapped to the rice genome were significantly enriched in various GO terms (Additional file 1: Fig. S2B).
In total, 85.76% of the novel transcripts mapped to the rice genome and were significantly enriched in various GO terms (Additional file 1: Fig. S2C, D), whereas few of the novel transcripts mapped to other species (Additional file 1: Fig. S2C). Thus, the novel genes and transcripts were distributed among various pathways during floret development.
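The read N50 reported above is a length-weighted summary of the read length distribution. The following is a minimal sketch of how such a value can be computed from a list of read lengths; the function name `read_n50` and the toy lengths are illustrative assumptions and are not taken from the study or from any particular sequencing toolkit.

```python
def read_n50(lengths):
    """Return the read N50: the length L such that reads of length >= L
    together contain at least half of all sequenced bases."""
    if not lengths:
        raise ValueError("no read lengths given")
    total_bases = sum(lengths)
    running = 0
    # Walk from the longest read downwards, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total_bases:
            return length


# Toy read lengths in bases (made up for illustration, not study data).
toy_lengths = [200, 400, 600, 800, 1000]
print(read_n50(toy_lengths))  # 800
```

Because the statistic is weighted by bases rather than by read count, it is typically larger than the mean read length, which matches the pattern of the values reported for the three samples.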
Identification of Full-Length Alternative Splicing Events by dRNA-seq
Alternative splicing events that occurred in the three samples were classified into five major types: mutually exclusive exons (MEE), intron retention (IR), exon skipping (ES), alternative 5′ splice site (A5SS) and alternative 3′ splice site (A3SS) (Fig. 1a). We identified all alternative splicing events in each sample using astalavista (Foissac and Sammeth 2007) and compared them with the current gene annotation. In total, we detected 35,317 alternative splicing events, of which 11,599 (~ 32.8%) were derived from annotated gene models and 23,718 (~ 67.2%) were novel alternative splicing events originating from both annotated and unannotated genes. To examine the distribution of alternative splicing types in the different samples, we plotted pie charts showing the percentage of each type. In the YP sample, IR was the most abundant type (41%), A3SS was the second most abundant (22%), and A5SS and ES each comprised 18% (Fig. 1b). In UF, the proportion of IR decreased (30%), whereas the proportions of A5SS (20%), A3SS (26%) and ES (23%) increased compared with YP (Fig. 1c). In F, the proportions of IR (34%) and A5SS (22%) were slightly higher, and those of ES (18%) and A3SS (25%) were lower, than in UF (Fig. 1d). It is noteworthy that MEE events comprised 1% in each of the three samples, indicating that the proportion of MEE events was stable during floret development (Fig. 1b–d). To validate the splicing events, 14 AS isoforms were randomly selected and examined by RT-PCR. Of these 14 AS isoforms, 11 were detectable at the mRNA level, showing either changed transcript levels or new isoforms (Fig. 2, Additional file 1: Fig. S3).
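The event types defined above can be recognized by comparing the exon coordinates of two isoforms of the same gene. The following is a minimal sketch for the IR type only, under the assumption that an intron counts as retained when it lies entirely inside a single exon of the other isoform. It is not the algorithm implemented in astalavista; the function names and coordinates are hypothetical.

```python
def introns(exons):
    """Introns are the gaps between consecutive exons.
    Exons are (start, end) pairs with 1-based inclusive coordinates."""
    ordered = sorted(exons)
    return [(prev_end + 1, next_start - 1)
            for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:])]


def retained_introns(isoform_a, isoform_b):
    """Return introns of isoform_a that lie strictly inside one exon of isoform_b."""
    hits = []
    for intron_start, intron_end in introns(isoform_a):
        for exon_start, exon_end in isoform_b:
            if exon_start < intron_start and intron_end < exon_end:
                hits.append((intron_start, intron_end))
                break
    return hits


# Hypothetical isoforms of one gene (coordinates are made up).
spliced = [(100, 200), (301, 400), (501, 600)]
retaining = [(100, 200), (301, 600)]  # second intron is kept in the transcript
print(retained_introns(spliced, retaining))  # [(401, 500)]
```

A full classifier would handle the other four event types and strand orientation; this sketch only shows the coordinate comparison behind an IR call.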
Analysis of Differentially Expressed AS (DAS) Events
To examine in detail the function of alternatively spliced genes related to floret development, Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis was conducted for each of the three samples. In YP, the alternatively spliced transcripts were significantly enriched in “spliceosome”, “metabolic pathways” and “ribosome”, indicating that genes responsible for alternative splicing were themselves alternatively spliced in YP (Additional file 1: Fig. S4A). In UF, the three most abundant KEGG pathways were “metabolic pathways”, “spliceosome” and “ribosome”, the same pathways as in YP (Additional file 1: Fig. S4B). Notably, the pathway “plant hormone signal transduction” was significantly enriched before flowering, suggesting that hormone-related genes were alternatively spliced before double fertilization (Additional file 1: Fig. S4B). For the splicing events that occurred after double fertilization in F, the alternatively spliced genes were significantly enriched in “metabolic pathways”, “spliceosome” and “biosynthesis of secondary metabolites” (Additional file 1: Fig. S4C). In addition, nitrogen metabolism-related genes were also alternatively spliced at this time point, suggesting that alternative splicing of nitrogen metabolism-related genes played essential roles after double fertilization (Additional file 1: Fig. S4C). A total of 13,691 differentially expressed (DE) genes were identified in the comparison of UF_vs_YP, and 10,425 DE genes were identified in the comparison of F_vs_UF. DE genes might also be alternatively spliced. We therefore compared the DE genes with the differentially alternatively spliced genes to identify differentially expressed AS (DAS) events. The DAS events were identified in the comparisons of UF_vs_YP and F_vs_UF, respectively (Fig. 3a, b).
In total, 1045 AS transcripts were differentially expressed in the comparison of UF_vs_YP, and the corresponding DAS genes were enriched in the GO terms “catalytic activity”, “transporter activity”, “binding” and “nucleic acid binding transcription factor activity” (Fig. 3a, Additional file 1: Fig. S5). In contrast, far fewer AS transcripts were differentially expressed in the comparison of F_vs_UF (Fig. 3b). GO analysis of these DAS genes showed that the terms “catalytic activity”, “transporter activity” and “binding”, but not “nucleic acid binding transcription factor activity”, were significantly enriched (Fig. 3b, Additional file 1: Fig. S5). In addition, LOC_Os02g51300, which encodes an AP2 domain-containing protein, showed differentially expressed AS among the three samples, which was validated by RT-PCR (Fig. 2). LOC_Os02g10920, which encodes a zinc finger family protein, displayed two transcripts in UF and F but only one transcript in YP. Moreover, its transcript levels increased markedly in UF and F. We therefore propose that the alternatively spliced transcripts of LOC_Os02g10920 play different roles in regulating floret development from the young panicle to the post-fertilization stage.
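The comparison used to define DAS genes amounts to a set intersection between the DE genes and the alternatively spliced genes of one comparison. A minimal sketch follows, assuming both inputs are plain collections of gene identifiers; the function name `das_genes` and the identifiers are placeholders, not genes from this study.

```python
def das_genes(de_genes, as_genes):
    """Genes that are both differentially expressed and alternatively spliced."""
    return sorted(set(de_genes) & set(as_genes))


# Placeholder identifiers for one comparison (made up for illustration).
de_uf_vs_yp = ["gene_A", "gene_B", "gene_C", "gene_D"]
as_uf_vs_yp = ["gene_B", "gene_D", "gene_E"]

print(das_genes(de_uf_vs_yp, as_uf_vs_yp))  # ['gene_B', 'gene_D']
```

Running the same function on the lists for each comparison would give one DAS set per comparison, which is the kind of overlap a Venn diagram displays.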
Alternatively Spliced Splicing Factors Are Essential for AS
Splicing factors play key roles in guiding tissue-specific developmental processes (Thatcher et al. 2016). Differences in alternative splicing among tissues are thought to result from differentially expressed splicing factors and are likely to be influenced by tissue-specific methylation patterns (Regulski et al. 2013). Hence, the expression level of splicing factors is essential in modulating alternative splicing events. In this study, a total of 38 splicing factor-related genes were detected in the three samples according to the protein annotation (Fig. 3c), of which 14 were differentially expressed (either up- or downregulated) in the comparison of UF_vs_YP and 6 were differentially expressed in the comparison of F_vs_UF (Fig. 3a, b). Furthermore, 9 splicing factors were alternatively spliced in UF compared with YP, of which 6 were DAS genes (Fig. 3a). In the comparison of F_vs_UF, 7 splicing factors were alternatively spliced, whereas only 2 DAS genes were identified (Fig. 3b). The expression patterns of all splicing factors expressed in the three samples are presented in the heat map (Fig. 3c). Previous evidence showed that splicing factors are frequently alternatively spliced, leading to an increased or decreased number of alternative splicing events in their target genes (Zhang and Mount 2009; Li et al. 2020). Therefore, it is reasonable that differentially expressed, alternatively spliced splicing factors may result in the differentially expressed AS genes during floret development. In addition, one SR repressor protein encoded by LOC_Os12g38430 was alternatively spliced in the three developmental stages, which might be significant for the developmental stage transition.
Effect of Alternatively Spliced Transcripts on Rice miRNA Targets
miRNA-target interactions usually repress the transcript levels of target genes by guiding cleavage of the target mRNAs through base pairing. To assess how miRNAs interact with alternative splicing, we performed miRNA target prediction against all the transcripts identified in this study using a previously described method with modifications (Dai and Zhao 2011). In total, 1648 alternatively spliced genes were predicted targets of known rice miRNAs (Additional file 2: Table S3). Many of these predicted targets were of the IR type and displayed lost or gained target sites. We then checked the miRNA binding sites of the AS genes that had been verified by RT-PCR (Fig. 2). Ten out of the eleven AS genes were targeted by different miRNAs through the gain or loss of targeting sites. For example, the second isoform of LOC_Os04g53740 lost the miR1856 targeting site (Fig. 4a). The transcript LOC_Os08g38910.2 was targeted by miR2924 in the IR type (Fig. 4a). The transcripts LOC_Os04g33560.2 and LOC_Os04g33560.3 were targeted by miR2864.1 and miR535, respectively (Fig. 4a). LOC_Os03g05390 also gained miR2275 and miR2864 targeting sites because of an IR-type transcript (Fig. 4a). Interestingly, miR2275 was proposed to trigger phasiRNA production in premeiotic and meiotic anthers, which might be responsible for male fertility (Sun et al. 2018; Li et al. 2019; Xia et al. 2019), suggesting that it targets LOC_Os03g05390 to mediate anther development. The second isoform of LOC_Os01g55570 gained miR2920 targeting sites (Fig. 4a). ES transcripts could also lose targeting sites because of the alternative splicing events; for example, LOC_Os02g10920 and LOC_Os11g03060 lost the targeting sites of miR2864 and miR159a, respectively (Fig. 4b). The co-occurrence of ES and IR types also caused the gain or loss of targeting sites (Fig. 4c). These results suggest that miRNA-target interactions can be affected by alternative splicing events.
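The gain or loss of a targeting site between isoforms can be illustrated by scanning each isoform for a stretch complementary to the miRNA. The sketch below is a deliberately simplified assumption: it looks for reverse-complement matches with a fixed mismatch allowance and does not reproduce the scoring scheme of the prediction method cited above (Dai and Zhao 2011). All sequences and function names are made up for illustration.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def perfect_site(mirna):
    """RNA sequence on a transcript that would pair perfectly with the miRNA."""
    return mirna.translate(COMPLEMENT)[::-1]


def find_sites(mirna, transcript, max_mismatches=0):
    """0-based start positions where the transcript pairs with the miRNA,
    allowing at most max_mismatches mismatched positions."""
    site = perfect_site(mirna)
    positions = []
    for i in range(len(transcript) - len(site) + 1):
        window = transcript[i:i + len(site)]
        mismatches = sum(a != b for a, b in zip(site, window))
        if mismatches <= max_mismatches:
            positions.append(i)
    return positions


def site_change(mirna, reference_isoform, alternative_isoform):
    """Classify the alternative isoform relative to the reference isoform."""
    in_ref = bool(find_sites(mirna, reference_isoform))
    in_alt = bool(find_sites(mirna, alternative_isoform))
    if in_ref and not in_alt:
        return "lost"
    if in_alt and not in_ref:
        return "gained"
    return "unchanged"


# Toy sequences (hypothetical, far shorter than real miRNAs and transcripts).
mirna = "AUGGCUUACG"
exon1, intron, exon2 = "GGAUCC", "AACGUAAGCCAUAA", "UUGACA"
spliced = exon1 + exon2             # intron removed
retaining = exon1 + intron + exon2  # IR isoform keeps the intron

print(find_sites(mirna, retaining))
print(site_change(mirna, spliced, retaining))
```

In this toy case the complementary stretch lies inside the retained intron, so only the IR isoform carries the site, which mirrors the IR-associated site gains described above.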
Association Between lncRNA and AS Genes
Long noncoding RNAs (lncRNAs) participate in the regulation of transcription, splicing and nuclear structure in plants (Chekanova 2015). In this study, we predicted new lncRNAs based on the novel transcripts identified by dRNA-seq. In total, we identified 270 new lncRNAs based on four prediction methods, comprising anti-sense lncRNA (20.7%), intronic lncRNA (0.7%), lincRNA (52.2%) and sense lncRNA (26.3%) (Fig. 5a, b). We then predicted their target genes according to two modes of lncRNA-target interaction: the relative position of the lncRNA and differentially expressed mRNAs within 100 kbp on the chromosome, and complementary base pairing between the lncRNA and the mRNA. To examine how many alternatively spliced genes were targeted by lncRNAs, we compared the lncRNA target genes with the alternatively spliced genes (Fig. 5c). A total of 64, 51 and 56 lncRNA targets were alternatively spliced in YP, UF and F, respectively. Among the lncRNAs that targeted alternatively spliced genes, ONT.11439.1 repressed its potential targets, which encode histone H3 and an RNA recognition motif-containing protein, in each of the three developmental stages (Fig. 5d). lncONT.200.2 and lncONT.3986.1 were predicted to interact with a gene encoding a 14-3-3 protein and a gene encoding an ankyrin repeat domain-containing protein, respectively, resulting in alternative splicing of those genes across the three stages (Fig. 5d). In addition, a lncRNA could target different genes at different stages of floret development. lncONT.2048.1 targeted genes encoding CDA, MAPK, GSK3 and CLKC kinases in YP, and genes encoding E1-BTB1 (Bric-a-Brac, Tramtrack, and Broad Complex domain with E1 subfamily) in UF, whereas the BRASSINOSTEROID INSENSITIVE 1-associated receptor kinase 1 precursor gene was targeted by lncONT.2048.1 in F.
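The position-based mode of target prediction can be sketched as a distance filter on genomic coordinates. The example below assumes that a gene is a candidate target when it lies on the same chromosome within 100 kbp of the lncRNA; the exact rule used in the study is not specified in this section, and the `Feature` type, function names and coordinates are hypothetical.

```python
from typing import NamedTuple


class Feature(NamedTuple):
    name: str
    chrom: str
    start: int
    end: int


def gap(a, b):
    """Distance in bp between two features on one chromosome (0 if they overlap)."""
    return max(0, max(a.start, b.start) - min(a.end, b.end))


def nearby_targets(lncrna, genes, window=100_000):
    """Names of genes on the lncRNA's chromosome within `window` bp of it."""
    return [g.name for g in genes
            if g.chrom == lncrna.chrom and gap(lncrna, g) <= window]


# Hypothetical coordinates (not taken from the rice annotation).
lnc = Feature("lnc_toy", "Chr1", 500_000, 502_000)
genes = [
    Feature("gene_near", "Chr1", 560_000, 563_000),
    Feature("gene_far", "Chr1", 800_000, 805_000),
    Feature("gene_other_chrom", "Chr2", 500_500, 501_500),
    Feature("gene_overlap", "Chr1", 501_000, 503_000),
]
print(nearby_targets(lnc, genes))  # ['gene_near', 'gene_overlap']
```

In practice the candidate list would be restricted further to differentially expressed mRNAs and combined with the base-pairing criterion described above.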
Analysis of Alternatively Spliced Transcription Factors (TFs)
TFs are essential for plant development, and the AS events associated with TFs are potentially important in regulating gene expression. In this study, we compared the AS genes with TFs to detect alternatively spliced TFs during development (Fig. 6a). A total of 16 TFs were alternatively spliced in all three stages; these belonged to families including ERF, WRKY, C3H, NAC, bZIP and Co-like. Among the 16 alternatively spliced TFs, Co-like, encoded by LOC_Os02g49230, was predicted to interact with Casein kinase 1-like protein HD1 (CKI), which is involved in the development of male floral organs and grains and in flowering time under long-day conditions (Fig. 6b). Moreover, the transcript level of Co-like increased during floret development, whereas the expression level of CKI showed the opposite trend, indicating that Co-like inhibited the expression of CKI (Fig. 6b). To test the relationship between Co-like and CKI, we performed a transient luciferase assay in protoplasts. Both isoforms of Co-like, Colike.1 and Colike.2, inhibited the expression of CKI (Fig. 7a, Additional file 1: Fig. S6A), which was consistent with their opposite expression patterns. According to the Venn diagram, a total of 28 alternatively spliced TFs were specifically identified in YP, of which 22 were either up- or downregulated (Fig. 6a). The isoforms GRF4.1 and GRF4.2 were highly expressed in YP, suggesting a specific role in young panicle development (Fig. 6c). Previous evidence showed that GRF4 controls grain size and yield in rice (Duan et al. 2015). Therefore, the highly expressed isoforms of GRF4 could be important for young panicle development. The isoforms MYBAS2.1 and MYBAS2.3 were also highly expressed in the YP samples (Fig. 6d).
MYBAS2 was proposed to interact with MYBS2 (LOC_Os10g41260), which was downregulated in YP, suggesting a negative correlation between MYBS2 and MYBAS2 (Fig. 6d). The TF MADS2, encoded by LOC_Os01g66030, was predicted to interact with MADS16, which regulated carpel specification in flower development, and with DL, which was required for normal development of lodicules and stamens (whorls 2 and 3) (Prasad and Vijayraghavan 2003) (Fig. 6e). A total of six isoforms of MADS2 were identified, among which ONT.1529.1, ONT.1529.2, ONT.1529.3 and ONT.1529.5 were novel isoforms (Fig. 6e). All the differentially expressed isoforms of MADS2 showed a similar expression pattern: their transcript levels were positively correlated with that of MADS16 and negatively correlated with that of DL (Fig. 6e). We then performed a luciferase assay to test whether MADS2 transactivates or inhibits the expression of DL and MADS16. MADS2.1 and MADS2.2 inhibited DL, which was consistent with the negative correlation between their transcript levels (Fig. 7b, Additional file 1: Fig. S6B). However, MADS16 was also inhibited by the two isoforms of MADS2, which was not consistent with the positive correlation between their expression levels (Fig. 7c, Additional file 1: Fig. S6C). In addition, a G2-like transcription factor was predicted to interact with GAMYB, and the two showed opposite expression patterns (Fig. 6f). Furthermore, the novel isoform ONT.9772.1, derived from G2-like, showed an expression pattern similar to that of the known isoform G2-like.1 (Fig. 6f). In addition, 14 and 19 TFs were uniquely alternatively spliced in UF and F, respectively, which might also be essential for the corresponding developmental stages (Fig. 6a).
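The positive and negative relationships between TF isoforms and their predicted partners described in this section can be summarized with a correlation coefficient across the three stages. The following is a minimal sketch using Pearson's r; the choice of statistic is an assumption, since the section does not state how the correlations were assessed, and the expression values are invented placeholders rather than measurements from the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equally long value lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant values")
    return cov / sqrt(var_x * var_y)


# Placeholder expression values in the order YP, UF, F (not study data).
tf_isoform = [1.0, 2.0, 3.0]
partner_up = [2.0, 4.0, 6.0]    # rises with the TF isoform
partner_down = [6.0, 4.0, 2.0]  # falls as the TF isoform rises

print(pearson(tf_isoform, partner_up))
print(pearson(tf_isoform, partner_down))
```

With only three stages such a coefficient is descriptive at best, which is consistent with the use of luciferase assays to test the predicted relationships directly.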