Epigenetic and Three-dimensional Genomic Analysis of Long Non-coding and Circular RNAs in Xian/Indica Rice

Background: Long non-coding RNAs (lncRNAs) and circular RNAs (circRNAs) can play important roles in many biological processes. However, no study of the inuence of epigenetics factors or the 3D structure of the genome in their regulation is available in plants. Results: In the current analysis, we identied a total of 15,122 lncRNAs and 7,902 circRNAs in three tissues (root, leaf and panicle) of the rice varieties Minghui 63, Zhenshan 97 and their hybrid Shanyou 63. Above 73% of the lncRNAs and parental genes of circRNAs (P-circRNA) are shared among Oryza sativa with high expression specicity. Compared with protein-coding genes, lncRNAs have higher methylation levels. CircRNAs tend to locate in the middle of genes with high CG and CHG methylation. The activated lncRNAs and P-circRNA are mainly transcribed from demethylated regions containing CHH methylation. In addition, ~53% lncRNAs and ~15% P-circRNA are associated with transposable elements, especially miniature inverted-repeat transposable elements and RC/helitron. We didn’t nd correlation between the expression of lncRNAs and histone modications; however, the binding strength and interaction of RNAPII signicantly affects lncRNA expression. In contrast, parental genes of circRNAs tend to combine active histone modications. Finally, lncRNAs and circRNAs acting as competing-endogenous RNAs have the potential to regulate the expression of genes with important roles in the growth and development of rice. Conclusions: Together, our results provide insights into the epigenetic and regulatory mechanisms of lncRNAs and circRNAs in rice and open the door to increase our understanding of ncRNAs in plants.

xian/indica II) and Zhenshan 97 (ZS97, Oryza sativa xian/indica I) are widely cultivated in China and southeast Asia, which are the parents of the elite rice hybrid Shanyou 63 (SY63), and important varieties in functional genomic studies (Zhang et al. 2016a; Wang et al. 2014). Besides, we suspect that because lncRNAs and circRNAs are widely distributed in the genome, the three-dimensional (3D) con guration of the genome, which is complex, dynamic and crucial for gene regulation, may hide important information of lncRNAs and circRNAs. It is reported that 82% of the MH63 genome is in 3D chromatin interaction modules with different transcriptional activities (Zhao et al. 2019), making it very suitable for studying the 3D genomic characteristics of lncRNAs and circRNAs, which have not been analyzed in plants up to now.
To uncover the comprehensive role that ncRNAs played in plants, we performed high-throughput sequencing analysis to identify and characterize lncRNAs and circRNAs from three different tissues in MH63, ZS97 and SY63, respectively. Then we characterized their genomic regions of origin, analyzed the in uence of methylation, TEs and the chromatin 3D structure in their formation and between them, and uncovered their functions and interactions with miRNAs as ceRNAs in the regulation of gene expression. Our data provides an important resource for future ceRNA and 3D genome research and a genome-wide pro ling of lncRNAs and circRNAs in Oryza sativa xian/indica, increasing our understanding of a crop that is essential for the quality, reliability and sustainability of the world's food supply.

Results
LncRNAs and CircRNAs Are Widely Distributed in Rrice About 1.3 billion read pairs were generated from 18 Oryza sativa L. ssp. indica samples, including three tissues (young leaf, panicle and root) from the three rice varieties (MH63, ZS97 and SY63), with two replicates per tissue and variety (Table S1). Based on the pipeline in Figure S1a, we identi ed 11,513, 13,153 and 13,549 lncRNAs in MH63, ZS97 and SY63, respectively (Fig. 1a, Table S2). LncRNAs are divided into ve types, i.e., intergenic, intronic, antisense, bidirectional and sense lncRNAs, of which long intergenic non-coding RNAs (lincRNAs) and long non-coding natural antisense transcripts (lncNATs) accounted for the majority with 5,723, 5,923 and 6,585 lincRNAs, and 1,934, 2,070 and 2,216 lncNATs in MH63, ZS97 and SY63, respectively (Fig. 1a). LncRNAs that intersected with protein-coding (PC) genes in the sense strand (sense lncRNAs) possibly resulted from an incomplete annotation or a coding-free transcript of coding gene due to alternative splicing events. The Pearson correlation of lncRNA expression between two replicates per tissue was > 0.9, indicating the high repeatability of the data (Fig. S1b).
A total of 5,122, 4,273 and 4,707 different high-con dence circRNAs were identi ed in MH63, ZS97 and SY63 after quality control and ltering. All samples showed similar values to their biological replicates (Table S3). Genome-wide annotation of these high-con dence circRNAs included 4,013, 3,571 and 3,847 exonic, 298, 112 and 112 intronic and 811, 590 and 754 intergenic circRNAs in MH63, ZS97 and SY63, respectively (Fig. 1a). Only ~ 25% of the circRNAs were shared by more than one tool in each variety (Table S4), which is similar to previously reported , highlighting the expected difference between the available circRNAs identi cation tools.
We found that the expression levels of circRNAs were not associated with their parental genes, as only 4% circRNAs were signi cantly positively correlated with the expression of their parental genes (P-value < 0.05, PCC > 0) (Fig. 1b). Furthermore, more than 30% of the parental genes produce more than one circRNA in each variety (Table S5), con rming that there is alternative splicing of circRNA in rice. As expected, the middle length of lncRNAs in rice was longer than circRNAs, while was shorter than the average length of mRNAs (916 bp, 1,339 bp and 1,793 bp in circRNA, lncRNA and mRNA, respectively) (Fig. 1c). In addition, the expression of lncRNAs was lower than the PC genes, which was also lower than the parental genes of circRNAs (Fig. 1d). Some circRNAs were very long (more than 10 kb), which might be caused by the presence of repetitive DNA in the plant genome (Mehrotra and Goyal, 2014). In total, ~ 4.8% (196) of circRNAs longer than 10 kb and located in intergenic regions was ltered from subsequent analyses. Sequence analysis revealed that single exonic lncRNAs or circRNAs were the most common ( Fig. S1c), accounting for 41% of lncRNAs, which was signi cantly higher than in mRNAs (23%) and circRNAs (20%). LncRNAs/circRNAs were not equally distributed in each chromosome, and the proportion of lncRNAs/circRNAs per length or number of genes in the chromosomes changed between chromosomes and varieties (Table S6, S7). There were six lncRNA clusters distributed in chromosomes 4, 9, 10, 11 and 12 (Fig. 1e). Among them, one lncRNA cluster located in 20-20.5 Mb of chromosome 12 was conserved among the three rice varieties.
To assess the conservation of lncRNAs/circRNAs in Oryza sativa and other plant genomes, these sequences were aligned to the genomes of some representative plant species using BLASTN. More than 82% of the lncRNAs sequences in MH63 were conserved in the genomes of different rice varieties (Fig. 1f, Table S8), including Oryza sativa subsp. xian (ZS97), Oryza sativa subsp. geng (Oryza sativa japonica), African cultivated rice (Oryza glaberrima) and wild rice (Oryza barthii, Oryza ru pogon). Similarly, the conservation of circRNAs in rice was ~ 73%. In contrast, analysis of the genomes of different monocots including Brachypodium distachyon, Setaria italica, Zea may and Sorghum bicolor showed similar low conservation in both lncRNAs (< 11%) and circRNAs (< 15%), while in dicots such as Arabidopsis thaliana and Populus trichocarpa the conservation was very low (1-2%). This re ects that lncRNAs/circRNAs were not conserved among different plant species, but had certain intra-species conservation (Fig. 1f), indicating that lncRNAs/circRNAs could be considered as "young" genes as they evolved relatively recently.
Further analyses revealed that 47% and 58% of the lncRNAs sequences from MH63 could align to those of ZS97 and SY63, respectively. The conservation analysis of the back-splicing junctions of circRNAs in MH63 showed that ~ 39% circRNAs were conserved in ZS97, and almost 51% could align to SY63 (Table   S8). In addition, 3,445 (30%) lncRNAs and 1,361 (32%) circRNAs of MH63 were conserved in ZS97 and SY63, indicating that a large number of lncRNAs/circRNAs had species-speci c expression.

DNA Methylations of LncRNAs and CircRNAs in Rice Genome
To further explore the regulation of lncRNAs/circRNAs in rice, we investigated the possible in uence of DNA methylation, an important epigenetic modi cation in plants, which has been observed for CG, CHG and CHH contexts with H being any nucleotide but G. We analyzed the methylation densities of lncRNAs loci and the parental genes of circRNAs (P-circRNAs), and compared them with those of PC genes in MH63 genome. We found that lncRNAs had higher methylation level (CG = 38.0%, CHG = 15.4%, CHH = 2.2%) than PC genes (27.3%, 8.4%, 1.4%, wilcox.test, P < 2.2e-16 for CG, CHG and CHH, respectively), while the P-circRNAs had the higher CG and lower CHG than PC genes (36.0%, 4.7%, 1.1%, wilcox.test, P < 2.2e-16 for CG and CHG, Fig. 2a). DNA methylation is more likely to be associated with promoter regulatory regions (Zhong et al. 2013). Therefore, the average methylation signals within 1 kb upstream the transcriptional start site (TSS) was examined. The lncRNA loci displayed a relatively higher CG and lower CHH methylation density than the PC genes, while P-circRNAs were the opposite. Although the CG methylation density was also reduced near the TSS of lncRNAs, it remained ~ 2-fold higher than that of the PC genes, and the CG methylation of P-circRNAs was approximately 2-fold lower than PC genes ( Fig. 2a and Fig. S2a).
Using PC genes as control, we normalized P-circRNAs into 1 kb and compared the position distribution of circRNAs relative to their parental genes and DNA methylation density. We found that circRNAs were mainly concentrated in the middle of gene-body rather than two sides, and circRNAs were highly correlated with CG and CHG enrichment compared with PC genes ( Fig. 2b and Fig. S2b). The lncRNAs were divided into three groups according to their expression (low, FPKM < 0.5; middle, 0.5 < FPKM < 2; high, FPKM > 2), and the average density of DNA methylation over the lncRNA loci were plotted. We observed that the DNA methylation level of lncRNAs with FPKM > 0.5 were similar, while lncRNAs with FPKM < 0.5 had different methylation level near TTS, indicating that lncRNA with FPKM < 0.5 might be incomplete (Fig. 2c), which can provide insights for screening high-con dence lncRNAs. Similarly, we analyzed the DNA methylation levels of P-circRNAs at different quantile expression levels and found that P-circRNAs with high expression levels showed lower methylation near TSS (Fig. 2d).
We further describe the impact of methylation changes on lncRNA loci and P-circRNAs based on the methylation data from (Zhao et al. 2020). In total, 889 and 1042 DMRs were located in the body of lncRNAs and P-circRNAs for different tissues, which involved 660 (8%) and 641 (22%) loci, respectively. However, about half of these DMR-containing genes were differentially expressed. Similar with previous report , we observed that DMRs related with CHG and CHH was predominantly hypomethylated, resulting in up-regulated expression of lncRNAs and P-circRNAs ( Fig. 2e and Fig. S2c).

LncRNAs and CircRNAs Are Associated with Different TEs
DNA methylation in plant predominantly occurs on transposons (Law and Jacobsen 2010). With the aim of testing a possible in uence of transposons in lncRNAs and circRNAs, we identi ed that ~ 53% lncRNAs (4,387) were associated with TEs (Fig. 3a), and ~ 15% circRNAs (598) were originated from TE-related genes in MH63, which accounted for ~ 37% of the total genes (Song et al. 2018). In addition, TE-related lncRNAs/circRNAs were found to be abundant in miniature inverted-repeat transposable elements (MITEs) and RC/Helitron transposons (Fig. 3b). Through comparing the expression of TE-associated lncRNAs/circRNAs with non-TE-associated lncRNAs/circRNAs, we observed that the expression of non-TE-associated lncRNAs and P-circRNAs was higher than those associated with TEs (Wilcoxon's test, P < 2.2e-16, Fig. 3c), while there was no difference in circRNA expression (Wilcoxon's test, P > 0.05). We further analyzed the methylation pro les of TE-associated lncRNAs/circRNAs and found that all the methylation densities were signi cantly higher than those of non-TE lncRNAs/circRNAs ( Fig. 3d-f), except that no difference was observed for CG methylation density between TE and non-TE-associated circRNAs.
In short, the transposon elements, containing high DNA methylation level, affected the expression of lncRNAs and P-circRNAs in xian/indica rice. In addition, the non-TE genes were more preferred to produce circRNAs than TE-related genes.

RNAPII-mediated Domains Affect Expression of LncRNAs in Rice
To characterize the epigenomic feature and RNA polymerase II (RNAPII) occupancy in lncRNA and circRNA, we examined the expression of lncRNA loci and P-circRNAs harboring different histone marks. It was shown that the lncRNA loci marked by RNAPII were signi cantly highly expressed, however, lncRNAs marked by active histone (H3K4me3 and H3K27ac) or repressed histone (H3K27me3) had similar expression compared with those without histone markers ( Fig. 4a and Fig. S3a). This phenomenon was different from protein-coding genes, which were activated by active histones including H3K27ac and H3K4me3 (Zhao et al. 2020). Therefore, P-circRNAs combined with active histone markers like H3K27ac and H3K4me3 explained the high expression of these genes ( circRNAs (322) were transcribed by RNAPII in seedling of MH63 (Fig. 4c), more than half of which were located in the RNAPII-mediated chromatin interaction domains (726 lncRNAs and 228 circRNAs, respectively). In order to determine if a quantitative relationship of intensity and expression was existed between RNAPII and lncRNA, we partitioned the RNAPII intensity in two groups (weak and strong), and plotted the expression of lncRNA loci contained in each group. We observed that lncRNAs harboring strong intensity of RNAPII were expressed higher than those with weak intensity of RNAPII with no linear relationship ( Fig. 4d and Fig. S4d). Notably, the expression of lncRNA loci and P-circRNAs in RNAPIImediated interactions domains were much higher than those without interactions, and it was the lowest for those without RNAPII. For example, ciri_circ147, exp_circ113 and exp_circ114 contained RNAPII peaks with and without interaction but only had RPM value of 2.90e − 08 , while exp_circ117 lacked RNAPII but had a higher RPM value of 2.61e − 07 , further con rming that high expression of parental genes of circRNAs did not correlate with high expression of circRNAs ( Fig. 4c and Fig. 4e).

LncRNAs and CircRNAs in Rice Show High Tissue Speci city
Both lncRNAs and circRNAs showed high tissue-speci city in all varieties, with panicle being the most abundant, then root and the last leaf in all the three rice varieties (Fig. S4a). In MH63, we identi ed 10,076 (88% account for total lncRNAs), 7,517 (65%) and 6,149 (52%) lncRNAs, and 3,367 (66%), 1,933 (38%) and 1,160 (23%) circRNAs expressed in panicle, root and young leaf, respectively. Interestingly, tissuespeci c expression of circRNAs was much higher than lncRNAs. We observed that ~ 43% lncRNAs were commonly expressed in three tissues and the rest were tissue-speci c expressed, while the commonly expressed ratio of circRNAs was only 7.5%. In addition, we performed differential expression analysis for lncRNAs, and obtained 8,419 (73%) differentially expressed lncRNAs (DELs) in MH63. GO analysis of the nearest genes of lncRNAs revealed that DELs were enriched in diverse biological processes, cellular components and molecular functions depending on the tissue of origin, con rming the dynamic expression of lncRNAs in different tissues. Nearest genes of Up-regulated lncRNAs in panicle were enriched in ' ower development and reproduction', and in leaf were enriched in 'transport and the formation of thylakoid and plastid', and in root were enriched in 'transcription factor activity'. These enriched functions and the tendency of lncRNAs to cis-regulate the expression of nearby genes suggests that the DELs play an important role in the growth and development of different plant tissues. GO analysis for parental genes of circRNAs revealed that they can participate in 'post-embryonic development', 'nitrogen compound metabolic process', 'binding of diverse molecules' and 'formation of different cell parts' (Fig. S5a). Considering that circRNAs are highly dynamic and usually only expressed in one tissue, we also performed GO enrichment on parental genes of circRNA in each tissue, revealing that circRNAs participate in ''reproductive and post-embryonic development' in panicle, and contribute to 'the formation of cytosol and cytoplasm' in root (Fig. S4b). However, parental genes of circRNAs from leaf were not signi cantly enriched in any function when compared to those of panicle and root.

LncRNAs and CircRNAs Acting as CeRNAs Can Regulate Important Biological Traits in Rice
To evaluate the ncRNA-associated ceRNA interaction landscape in MH63, a complete circRNA/lncRNA-miRNA-mRNA interaction network was constructed with multi-tissues. In total, 797 lncRNAs and 215 circRNAs were interacted with 137 miRNAs and targeted a total of 1,044 mRNAs, adding up to a total of 4,517 different ceRNA pairs (Fig. 5a). Target genes of these miRNAs were enriched in ower development, reproduction, nucleus, nitrogen compound metabolic process and so on (Fig. S5b). Most of these miRNAs played signi cant roles in rice growth and development, important agronomic traits, and resistance to disease and drought. For example, Osa-miR156, which over-expressed in rice can increase salt stress tolerance and delay owering (Wang et al., 2019). Another miRNA, Osa-miR444, is a key factor in relaying the antiviral signaling from virus infection and response to stress (Eren et al. 2015;Wang et al. 2016). To understand the regulatory mechanisms of ceRNAs, we further explored the sub-networks related to the above two important miRNAs. We found that osa-miR156 can target and regulate OsSPL gene family members (Fig. 5b), including OsSPL14, a gene related to rice tillering, panicle number and thousand-grain weight (Miura et al., 2010). However, the mechanism of how osa-miR156 regulates OsSPL14 has not been reported. Our research suggests that lncRNA TCONS_00049880 located in intergenic region, may Finally, to explore the difference of ceRNA networks among different tissues, we further constructed circRNAs/DELs-miRNA-DEMs networks in MH63 for panicle, leaf and root. In total, 61, 40 and 52 miRNAs were predicted as target of 231, 68 and 101 DELs and 64, 72 and 99 circRNAs, and could interact with 180, 162 and 225 DEMs in panicle, leaf and root, respectively (Fig. S6-S8). Difference in the ceRNA networks between tissues indicated that different lncRNAs and circRNAs participated in diverse functions in the growth and development of each tissue. We performed GO analysis of the target-genes and found that they were enriched in ' ower development and reproduction' in panicle, indicating a speci c function of the DELs and circRNAs with miRNA recognition elements in panicle in regulating gene expression (Fig.  S5b).

Discussion
The plant epigenome research falls behind that of mammalians, not to mention epigenetic of non-coding RNA. In this study, we produced high coverage whole transcriptome sequencing and combined with publicly available dataset, including ChIP-Seq, CHIA-PAT, whole-genome bisul te sequencing, etc, to reveal comprehensive epigenetic characteristics of lncRNA in rice. Based on these datasets with multiple tissues in three varieties, a high-resolution 3D architecture and epigenetic landscape of rice genomes were constructed, nearly 82% of the rice genomes was annotated, including DNA methylation, active or repressive DNA regulatory elements, euchromatin and heterochromatin (Zhao et  We investigated the epigenetics involving expression of lncRNAs is depended on changes in gene dosage (e.g., copy-number alterations) and promoter utilization (e.g., DNA methylation) and their regulation since there are reports linking expression of circRNAs with DNA methylation in humans (Ferreira et al. 2018). Similar with previous research, DNA methylation of promoter was negatively correlated with gene expression levels in lncRNAs and P-circRNAs, which was also the same with protein-coding genes. DNA methylation density was signi cantly decreased near the transcription initiation site, especially in P-circRNA. Interestingly, we found a strong correlation between CG methylation and the distribution of circRNAs in the parental genes. In addition, DNA methylation levels of low-expressed (FPKM < 0.5) lncRNAs were signi cantly different near TTS compared with highly expressed lncRNAs, indicating that this part of lncRNAs may be incomplete, which provides a screening threshold for high-con dence lncRNAs. Recent study on the epigenetic landscape of cancer cells ) and polyploid cotton  showed that the lncRNA loci were hypo-methylated which was similar in this study and accompanied with the up-regulated of lncRNA loci, indicating that DNA methylation-mediated lncRNA regulation is a common mechanism in plants and animals.
TEs are normally transcriptionally silent regions due to DNA methylation. In rice, siRNAs have been . In our results, more than half (53%) of lncRNAs were associated with TE, and only a small amount (15%) of circRNAs were derived from TE-related genes. LncRNAs and circRNAs showed a clear consistent association with TEs with a high proportion of MITE and RC/helitron in rice. The parental genes of circRNAs and lncRNAs that associated with TE correlated with low expression and high methylation, except for CG in circRNAs.
Furthermore, we analyzed the epigenetic pro le of lncRNAs and found that more than half lncRNAs (56% in panicle, 71% in root, 72% in seedling) contains a variety of histone modi cations, including active histone modi cations (H3K4me3 and H3K27ac) and suppressed histone modi cations (H3K27me3 and H3K9me2). Unlike active histone modi cation activating gene expression and suppressive modi cation silencing gene expression, histone modi cation near lncRNA had little in uence on its expression. However, further analysis indicated that the nearest genes of lncRNAs combined with active histones (H3K4me3, H3K27ac) had signi cant higher expression than those without histone modi cation; while the characteristics of the nearest genes of lncRNAs combined with repress histone H3K37me3 had the opposite trend (Fig. S3e), suggesting that lncRNAs involving epigenetic modi cations to speci c genomic loci, which then regulated the expression of neighboring protein-coding genes ). Hundreds of lncRNAs have been con rmed to interact with multiple proteins, for example, a lncRNA-LAIR signi cantly binding with RNA-binding proteins OsMOF and OsWDR5, which have shown to associate with the H3K4me3 and H4K16ac histone modi cation complex, suggesting lncRNAs in uence gene expression by targeting chromatin remodelers to speci c genomic regions as part of a molecular scaffold (Trapnell et al. 2012;Wang et al. 2018). Not surprisingly, the parental genes of circRNA were enriched with a large of active histone modi cations. Similar with previous reports (Kindgren et al. 2018), many (~ 29% in seedling) lncRNAs were transcribed by RNAPII. In addition, our research revealed that the high activity and interaction of RNAPII promoted lncRNA transcription.
Finally, ceRNAs interactions have opened a new way of understanding the cross-regulation of mRNAs among different ncRNAs. A ceRNA interaction involving lncRNAs and circRNAs in A. thaliana has been reported (Meng et al. 2018), nding a correlation between ceRNAs and seedling development. Our ceRNA network revealed that less than 10% of the lncRNAs and circRNAs had miRNA interaction domains, meaning that in rice the "miRNA sponge" function is limited to a few lncRNAs and circRNAs. The lncRNAs/circRNAs acting as ceRNAs in our network have the potential to regulate the expression of genes with important functions like ower development and reproduction through binding to miRNAs. Due to the dynamic character of ncRNAs, we further constructed three tissue speci c ceRNA networks with the circRNAs and up-regulated lncRNAs in panicle, seedling and root and, as we anticipated, the functions of the genes that they could regulate when interacting with miRNAs are different in each tissue, revealing in plants the tissue-speci city of the ceRNA interactions.

Conclusions
Taken together, our study is one of the most complete analyses of ncRNAs, their genomic regions and the different factors that affect the regulation of their expression in rice, revealing the functions of these ncRNAs and the genes that they modulate through competing interaction mechanisms. The correlation between CG methylation and genomic region of the circRNAs, the inverse correlation between methylation in the gene body and expression, the different TEs and interaction domains preferences and the limited function as "miRNA sponge" of circRNAs and lncRNAs bring useful information for understanding ncRNAs in plants.

Plant Materials, RNA Library Construction and Sequencing
Oryza sativa ssp. xian/indica varieties MH63, ZS97 and SY63 were grown in a greenhouse of Huazhong Agricultural University, Wuhan (China), at 25℃ in non-stress conditions. Leaf and root samples at the seedling stage of 4 leaves and ~V stage of panicle were collected for RNA isolation. A total amount of 3 µg was used as input for RNA sample preparation. Ribosomal RNA was removed by Epicentric Ribozero rRNA Removal Kit (Epicentre, USA), and rRNA free residue was cleaned up with ethanol precipitation. rRNA-depleted RNA was used for library construction according to NEBNext Ultra Directional RNA Library Prep Kit for Illumina (NEB, USA). In order to select cDNA fragments of preferentially 250-300 bp, the library fragments where puri ed with AMPure XP system (Beckman Coulter, Beverly, USA). Then 3 µL of USER Enzyme (NEB, USA) was used with size-selected, adaptor-ligated cDNA at 37 o C for 15 minutes followed by 5 minutes at 95 o C before PCR primers and Index (X) Primer. At last, products were puri ed (AMPure XP system) and library quality was assessed on the Agilent Bioanalyzer 2100 system. The clustering of the index-coded samples was performed on a cBot Cluster Generation System according to the manufacturer's protocol. After cluster generation, the library preparations were sequenced on an Illumina Hiseq3000 that generated paired-end reads of 150 bp in length. Finally, cDNA primer and adaptor were ltered out and 10 bp from the start were trimmed out with Trimmomatic v0.32 (Bolger et al. 2014) after and before quality control analyses of the reads with FastQC to obtain clean reads. The data is shown in Table S1.

Bioinformatics Pipeline for Characterization of LncRNAs
Clean reads were aligned to the MH63RS2 and ZS97RS2 reference genomes using Hisat2 v2.0.4 (Kim et al. 2015) and assembled with Cu inks v2.2.1 (Trapnell et al. 2012). First, the individual samples were assembled separately, merged together and the expression level of each transcript was normalized using FPKM value. Next, the transcripts shorter than 200 bp, transcripts with mean FPKM scores < 0.1 and known protein-coding transcripts were ltered. The ltered sequences were used for hmmscan search against PFam-  (Memczak et al. 2013). To obtain the most reliable results with each bioinformatics tool, we followed their recommended protocols for the identi cation and ltering of candidate circRNAs. First, sequencing reads were aligned to MH63RS2 and ZS97RS2 genomes (http://rice.hzau.edu.cn/). Then a series of different quality control steps were implemented for the inclusion of only high-con dence circRNAs for further analyses. For the candidate circRNAs detected with nd_circ, the recommended ltering criteria for the software were applied. We required at least two reads supporting the candidate circRNA junction, unambiguous detection of the breakpoint, unique anchor alignments on both sides of the junction, and removed splice sites that were more than 100 kb. In addition, due to the reported lower accuracy of nd_circ (Hansen et al. 2016), we further ltered the candidates with an in-house perl script, and those classi ed as exonic-, intronic-or intergenic-circRNA were kept (Memczak et al. 2013). Meanwhile, at least two junction reads covering the back splicing sites of the candidate circRNAs detected by CIRI2 and CIRCexplorer2 were required.
First the individual samples were assembled separately, and then merged together for each variety. The genomic region of the circRNAs in the parental genes was considered as the genomic region between both back-splicing junctions that corresponded with the circRNA strand. Gene expression levels were obtained with Cu inks v2.2.1 (Trapnell et al. 2012) and normalized as Fragments Per Kilobase of transcript per Million mapped reads (FPKM). Expression levels for circRNAs were normalized as Reads Per Million mapped reads (RPMs).

Conservation of LncRNAs and CircRNAs
The lncRNAs and circRNAs identi ed in MH63 were used as query sequences, and Blastn v2.2.30 (Altschul et al. 1990) was used to align them to the lncRNA, circRNA back-splicing sequences and genomes of other plant species. The genome sequences of 10 plant species were downloaded from Ensembl Plants (Ensembl Plants) and compared with lncRNAs/circRNAs sequences. Similar to a previously reported strategy (Xu et al. 2018), the thresholds for screening Oryza genus were identity ≥ 80% and e-value < 10 − 5 , and for monocotyledonous and dicotyledonous plants, the cutoff query was identity ≥ 50% and e-value < 10 − 5 . To compare the conservation between lncRNAs sequences, we compared the sequences of lncRNAs to those of ZS97 and SY63 with the same threshold as for Oryza genus. To ensure reliable alignment results comparing back-splicing sequences from MH63 and those of ZS97 and SY63, we required the identity ≥ 95%, e-value < 10 − 5 and a maximum of 1 gap.

Methylation and TE Analysis of LncRNAs and CircRNAs
In order to detect 5-methylcitosine modi cations, bisul te sequencing (BS-Seq) was performed to detect DNA methylation in the three rice varieties. Total genomic DNA was extracted from 12-days seedling leaves following the DNeasy plant mini kit (Qiagen) protocol. After library preparation and sequencing, Trim_galore v0.4.5 (TrimGalore webpage) was used as a quality control and to trim low-quality bases. Finally, Bismarck v0.19.1 (Bismarck webpage) was used for mapping the clean reads to rice genomes, allowing one mismatch in 20-nucleotide seed sequence (-N 1 -L 20). The reads not uniquely mapped were ltered. Bismarck was then used to determine the methylation level at each cytosine using its methylation extractor command. Because there is no methylation in the chloroplasts of plant genome (Du et al. 2017), all methylation levels detected in the chloroplast genome were accounted as false discovery rate with the error rate of ~ 0.3% (Fojtova et al. 2001

The Epigenome Feature Analysis of LncRNAs and CircRNAs
To analyze the 3D epigenome genome structure of lncRNAs/circRNAs, we downloaded distribution of different histones (H3K4me3, H3K27ac, H3K27me3, and H3K9me2) and RNAPII with distinct modi cation potentiality (Zhao et al. 2019;Zhao et al. 2020), and mapped the lncRNAs/circRNAs on histones region. If 1-bp overlap existed between a histone and a gene or locus, the histone binding was thought to in uence the expression of this locus. Considering the strong tissue-speci city of lncRNAs/circRNAs, and because the CHIA-PET data of RNAPII interaction were derived from young leaf, we only analyzed the lncRNAs and circRNAs from young leaf.

Differential Expression and GO Enrichment Aanalysis
DESeq2 (Love et al. 2014) was applied to lter the differentially expressed mRNAs (DEMs) and differentially expressed lncRNAs (DELs). Fold change (FC) and false discovery rate (FDR) were used to lter differentially expressed genes under the following criteria: (a) FC > 2 or < 0.5; (b) FDR < 0.05. To analyze the function of circRNAs we extracted their parental genes, and for DELs their nearest adjacent PC gene. For the target gene sets, AgriGO online tool (AgriGO webpage) was used for enrichment analysis (FDR < 0.05), and we set P-value < 0.05.