International Consortium of Rice Mutagenesis: resources and beyond

Rice is one of the most important crops in the world. The rice community needs to cooperate and share efforts and resources so that we can understand the functions of rice genes, especially those with a role in important agronomical traits, for application in agricultural production. Mutation is a major source of genetic variation that can be used for studying gene function. We present here the status of mutant collections generated in a random manner by physical, chemical and insertional mutagenesis. As of early September 2013, a total of 447,919 flanking sequence tags from rice mutant libraries with T-DNA, Ac/Ds, En/Spm, Tos17, and nDART/aDART insertions have been collected and made publicly available. Of these, 336,262 sequences are precisely positioned on the japonica rice chromosomes, and 67.5% are in gene intervals. We discuss the genome coverage and preference of the insertions, issues limiting the exchange and use of the current collections, as well as new and improved resources. We propose a call to renew all mutant populations as soon as possible. We also suggest that a common web portal should be established for ordering seeds.


Introduction
Rice (Oryza sativa) is one of the most important crops in the world. Rice, wheat, and maize together account for 60% of the world's food production, and rice is the principal food of nearly 50% of the world's population. These cereal crops share a large degree of synteny, so rice is an excellent model cereal crop for genomics research (Gale and Devos 1998). In addition, because of its small genome size, high transformation efficiency and huge genetic resources, rice was the first crop plant chosen for complete genome sequencing.
The world's population will be greater than 9 billion in less than 40 years. How can farmers grow enough food to feed such a large population in a more sustainable and environmentally friendly way? This "9 billion-people" question is one of the world's most pressing issues and needs to be solved soon so that we can supply farmers with the seeds to feed future generations. Recently Qifa Zhang and colleagues proposed "Rice 2020" with a goal to assign a biological function to each identified gene by 2020 (Zhang et al. 2008; Zhang and Wing 2013). Rod Wing then proposed the "9-billion project" to solve the 9-billion people problem at the 10th International Symposium of Rice Functional Genomics meeting held in Chiang Mai, Thailand in 2012. To fulfill this important responsibility, the rice community needs to cooperate and share efforts and resources so that we can understand the functions of rice genes, especially those with a role in important agronomical traits, for application in agricultural production.
With the complete genomic sequencing of rice (IRGSP 2005), the challenge of the post-genomic era is to systematically analyze the functions of all genes in the genome. An important and direct approach to defining the function of a novel gene is to abolish or activate its function by mutagenesis. Insertional mutagenesis, with T-DNA or a transposable element, provides opportunities for assigning a function to a particular DNA sequence and isolating the target gene causing a specific phenotype. The function of a gene can be explored efficiently by use of traditional mutagenesis with physical or chemical mutagens and modern tools such as targeting-induced local lesions in genomes (TILLING) and next-generation sequencing (NGS). Many review papers, including three recent ones (Droc et al. 2013; Wang et al. 2013; Yang et al. 2013), have introduced the rice mutant resources and their application. In this review, we focus on the limitations, characterization and re-analysis of current resources as well as the creation of new improved resources.

Current status of international mutant collections
Mutation is a major source of genetic variation and may be used for gene functional analysis. Changes in the gene status of a rice plant can be driven in a random manner by physical mutagens such as fast neutrons, γ-rays, and ion beams (Bruce et al. 2009; Wu et al. 2005); by chemical mutagens such as ethyl methanesulfonate [EMS], methyl nitrosourea [MNU], and sodium azide [SA] (Till et al. 2007); or by insertion mutagenesis with T-DNA, Ac/Ds, En/Spm, Tos17, or nDART/aDART (Krishnan et al. 2009).
Insertion mutant collections were generated in the late 1990s at the National Institute of Agrobiological Sciences (NIAS, Japan; Miyao et al. 2003) and at the Pohang University of Science and Technology (POSTECH, Korea; Jeong et al. 2002), making use of the endogenous Tos17 retroelement and T-DNA insertion, respectively. These two laboratories have generated large collections of insertion lines (50,000 and 105,000 lines, respectively) that have been extensively shared among international research groups and have thus allowed the function of many genes to be deciphered by forward and reverse genetics strategies. Later, T-DNA insertion mutagenesis initiatives were launched in France (Génoplante consortium involving the public institutions CIRAD, INRA, CNRS and IRD, and the private companies Biogemma and Bayer Crop Science [Sallaud et al. 2004; Johnson et al. 2005]), China (Hua Zhong Agricultural University, Zhejiang University [Wu et al. 2003]; the Beijing Biotechnology Research Institute [Wan et al. 2009]; and the Shanghai Institute of Plant Physiology and Ecology [Fu et al. 2009]), Taiwan (Institute of Plant and Microbial Biology, Academia Sinica, and the Taiwan Agricultural Research Institute [Jiang et al. 2007]), and the United States (Cornell University [He et al. 2007]; UC Davis [Kumar et al. 2005]), with a total of about 150,000 lines. These resources serve several functions, such as knockout, gene trap, enhancer trap, and/or activation tagging. Most of the flanking sequence tags (FSTs) are searchable at the RiceGE (http://signal.salk.edu/cgi-bin/RiceGE) and OrygenesDB (http://orygenesdb.cirad.fr) websites. As of early September 2013, RiceGE contained 370,179 entries.
The maize Ac/Ds transposon system has been used to generate insertional mutant populations in maize itself, Arabidopsis, and rice. An Ac/Ds-based library has several advantages: 1) revertants can be readily obtained and easily identified, and 2) because the Ds elements prefer to transpose to genetically linked sites (i.e., on the same chromosome; Greco et al. 2003; Wan et al. 2009; Kim et al. 2004), an indexed insertional mutant library can be created with a series of starter lines.
Ac/Ds belong to the hAT superfamily, named after the Drosophila melanogaster element hobo, the maize element Ac, and the Antirrhinum majus element Tam3. New rice hAT elements have been found and are suggested as new candidates for generating insertion mutants in rice. An active 0.6-kb endogenous DNA transposon, nonautonomous DNA-based active rice transposon1 (nDart1), was recently identified as the causative insertion in several mutants. For instance, the somatic excision of nDart1 integrated into a gene for a nuclear-coded chloroplast protease led to color sectors in seedlings (Nishimura et al. 2008). Likewise, the integration of nDart1 into a gene for an unknown nuclear protein caused changes in panicle morphology. Thus, this nDart/aDart system may be used to generate a large-scale insertional mutant population. Recently, another potential rice hAT element, dTok, was found through the identification of the gene responsible for a mutant with multiple pistils and stamens (Moon et al. 2006).
Ac, Ds and Spm are maize elements. They are introduced by Agrobacterium-mediated transformation, and the regenerated rice lines may exhibit somaclonal variation. This is a concern mainly in the first generation: once the mutants are crossed with the Ac lines and the insert population is amplified, the variation tends to be diluted, with no further changes arising in the offspring. Another drawback is the GM nature of the lines. Although they belong to the same hAT family as Ac/Ds, the new hAT elements are not prone to these concerns because they are endogenous elements of the rice genome and their mobilization does not necessitate callus formation. Indeed, the transposition of nDart1 can be triggered by ordinary crossing under natural field conditions. As for Ac/Ds, the remobilization of the element would generate a revertant and thus may avoid follow-up complementation experiments.
Another important resource for identifying useful rice genes is the creation of gain-of-function lines through the systematic overexpression of full-length cDNAs. This system, called FOX hunting (full-length cDNA overexpressor gene hunting), has been used by Kondou and colleagues in a joint effort of RIKEN and NIAS in Japan (Kondou et al. 2009). This group systematically produced 23,000 independent Arabidopsis transgenic lines that ectopically expressed rice full-length cDNAs. By analyzing this heterologous system, they obtained more than 1,200 morphological mutant candidate lines. The same group has also systematically overexpressed rice full-length (FL) cDNAs in rice (Nakamura et al. 2007). Several groups, including researchers in Japan and Korea, have used this gain-of-function system to elucidate novel functions of several rice genes that play important roles in biotic and abiotic stress, seed and root morphology, pigment accumulation, etc.
Insertion mutants have long been considered the most user-friendly and preferred resource for functional analysis because they contain molecular tags of known sequence, so that integration site information may be readily retrieved. However, with new technologies such as TILLING and NGS, this advantage has become less conspicuous. Both chemical and ionizing radiation mutagenesis have been routinely used to generate genetic variability in rice varieties since the 1950s; examples are the IRRI effort with IR64 (e.g., Wu et al. 2005) and the Kyushu University effort with Taichung 65, Nipponbare and Kinmaze (e.g., Satoh and Omura 1979). Recently, institutes in Taiwan, Japan, the United States, China, and Brazil have used EMS, MNU, SA, γ-rays, and ion beams to prepare more mutant populations. The genome changes caused by these mutagens include SNPs, small indels, large indels, TE transpositions, and epigenetic changes. Such differences between wild type and mutant have been discovered efficiently in recent years in several plant and animal systems by NGS of bulked DNA of mutant F2 progeny from crosses between mutant lines and parental lines. The first successful case in rice used the MutMap strategy (Abe et al. 2012). The EMS-induced mutant was crossed directly to the original wild-type line and selfed, and the bulked DNA of homozygous mutant F2 plants was subjected to NGS. Because the SNP responsible for the change of phenotype is homozygous, the authors monitored the SNP rates along the chromosomes and located the candidate region, followed by confirmation of the target gene. In some cases, the mutant genome sequence is quite different from the RefSeq and the candidate SNP region may be present only in the resequenced cultivar. Takagi and colleagues then suggested a modified version, MutMap-Gap. For such "gap" regions, the authors performed assembly and alignment after applying the MutMap method, and identified and confirmed the target gene.
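The logic behind this kind of bulked-segregant mapping can be illustrated with a short Python sketch. For each SNP between the bulked mutant F2 DNA and the wild-type parent, the fraction of reads carrying the mutant allele (often called the SNP index) is computed; SNPs unlinked to the phenotype are expected to stay near 0.5 in the bulk, whereas SNPs close to the causal mutation approach 1. The read counts, window sizes, and function names below are hypothetical and serve only as an illustration; this is not the pipeline used by Abe et al. (2012).

```python
from typing import List, Tuple

# One SNP on one chromosome: (position_bp, reads_with_mutant_allele, total_reads).
Snp = Tuple[int, int, int]


def snp_index(mutant_reads: int, total_reads: int) -> float:
    """Fraction of reads in the bulk that carry the mutant allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return mutant_reads / total_reads


def sliding_mean_index(snps: List[Snp], window_bp: int,
                       step_bp: int) -> List[Tuple[int, float]]:
    """Mean SNP index per sliding window, as (window_start, mean_index)."""
    if not snps:
        return []
    ordered = sorted(snps)
    last_pos = ordered[-1][0]
    result = []
    start = 0
    while start <= last_pos:
        values = [snp_index(m, t) for pos, m, t in ordered
                  if start <= pos < start + window_bp]
        if values:  # skip windows that contain no SNP
            result.append((start, sum(values) / len(values)))
        start += step_bp
    return result


def candidate_window(snps: List[Snp], window_bp: int = 2_000_000,
                     step_bp: int = 1_000_000) -> Tuple[int, float]:
    """Window with the highest mean SNP index (the candidate region)."""
    windows = sliding_mean_index(snps, window_bp, step_bp)
    if not windows:
        raise ValueError("no SNPs supplied")
    return max(windows, key=lambda w: w[1])


# Hypothetical read counts for six SNPs on one chromosome (depth 20).
toy_snps = [
    (500_000, 9, 20), (1_500_000, 11, 20), (3_200_000, 15, 20),
    (4_100_000, 20, 20), (4_600_000, 19, 20), (6_300_000, 12, 20),
]

start, mean_index = candidate_window(toy_snps)
print(f"candidate window starts at {start} bp, mean SNP index {mean_index:.3f}")
```

In this toy data set the window starting at 4 Mb has the highest mean index, which is where a causal SNP would be sought. Real analyses additionally need read-depth filters and a statistical threshold, which are omitted here.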
Recently, this group suggested another modified version, MutMap+, in which they used the bulked DNA of mutant and wild-type progeny of the M3 generation derived from selfing of an M2 heterozygous individual. That is, no cross between the mutant and the wild-type parental line is required (Fekih et al. 2013).
Table 1 lists the 20 mutant collections available worldwide, associated information, and whether a phenomics database is available. We list several resources not listed in other recent reviews, including the ones at Cornell University, the Taiwan Agriculture Research Institute, the Chinese Academy of Agricultural Science, and the Brazil Plant Genomics and Breeding Center. Most of the 20 collections are in japonica varieties, including Nipponbare (used in 11 resources), Tainung 67 (2), Dongjin (2), Zhonghua 11 (2), and Kitaano, Zhonghua 15, Hwayoung, Kitaake, Taichung 65, Yukihikari, Kinmaze, and BRS Querencia (1 each). Some of the collections are in indica rice, including IR64 (2), SSBM (1), and Kasalath (1). Of note, indica rice represents about 80% of the world's rice production in terms of yield or cultivated area. However, more effort has been invested in producing mutants of japonica rice.
Long hairpin RNA (hpRNA) technology has been used recently to induce large-scale gene silencing in rice plants. The investigators used an improved rolling-circle amplification-mediated hpRNA method to produce a large number of hpRNA constructs simultaneously from 200- to 400-bp, 400- to 600-bp, and 600- to 1000-bp cDNA libraries. Thousands of transgenic hpRNA lines were produced. More than 50% of these transgenic lines showed visible phenotypes, including poor growth, sterility and alterations in plant morphology, seed size, panicles, and heading time. This proportion is much higher than that reported for T-DNA insertion populations (Lorieux et al. 2012) or Tos17 insertion populations (Hirochika et al. 2004). Such high efficiency may arise because 1) all the hpRNA constructs target an exon sequence, whereas T-DNA or Tos17 integrates randomly in the chromosomes, and 2) an hpRNA may silence a whole gene family instead of only a single gene. However, this method still involves callus growth and transformation, and thus somaclonal variation remains a concern.

Genome coverage of sequenced-indexed inserts
International implementation of high-throughput PCR-based methods has allowed for the amplification and sequencing of genomic regions flanking insertion sites of T-DNA, Tos17, Ds and dSpm inserts. In total, 447,919 flanking sequence tags have been released in public databases over the past decade. Of these, 336,262 sequence-indexed inserts are precisely positioned on the japonica rice chromosomes (cv. Nipponbare, MSU v7.0 release) (Table 2). Many more PCR products were sequenced, notably from T-DNA inserts, but proved to be T-DNA or backbone sequences. Examination of the anchored FSTs has provided the community with a better understanding of the insertion behaviour of the different types of mutagens in the rice genome. The insertion preferences deduced from these analyses may be slightly biased. For T-DNA inserts, the selectable marker gene has to fall into a genomic region favourable for gene expression if the event is to be recovered. For class II transposon insertions, the position of remobilized inserts might depend on the position of a generally limited number of initial launching pads, which is in turn determined by T-DNA behaviour. Whatever the mutagen, insertions are not evenly distributed along each chromosome. Insertion frequencies tend to be higher in the distal, sub-telomeric, euchromatic regions, which are generally gene-rich, and lower in heterochromatic regions (e.g., those close to centromeres). Chromosomes with a larger proportion of heterochromatic DNA tend to show a lower insertion density than larger "euchromatic" chromosomes, which generally contain a high density of predicted genes. At a more local scale, "hot" and "cold" spots for integration are observed, notably for Tos17. T-DNA insertions are found at low frequency in repeated DNA and transposable-element (TE)-related sequences and at high frequency in gene-rich regions.
Gene intervals contain 48% to 63% of the T-DNA inserts, with a strong bias toward the 5' upstream and 3' downstream regions of genes (Sallaud et al. 2004; Jeong et al. 2006; Zhang et al. 2007). Tos17 inserts are in gene intervals, preferably introns and exons, at a higher frequency (75-85%) than T-DNA inserts. Tos17 also exhibits a clear preference for certain genes: the mean number of sequence-indexed allelic insertions in Tos17 target genes is ~3 and can reach more than 200 alleles in the current NIAS and Oryza Tag Line (OTL) collections (Piffanelli et al. 2007). Ds preferentially inserts into genic regions (64-75%), whereas sequence-indexed dSpm inserts have a more balanced distribution among intergenic and genic regions, which resembles that of T-DNA inserts. Altogether, these studies have concluded that the different types of mutagens are complementary for saturating the rice genome with insertions.
The latest joint Rice Annotation Project (RAP) and MSU release v7.0 revealed that 226,861 of the 336,262 positioned inserts (67.5%) are in a gene interval spanning from −1000 bp upstream of the ATG to +300 bp downstream of the STOP codon of the 38,866 predicted rice genes (Table 3). More than three-fourths (77%; 29,672) of the rice genes are interrupted by at least one sequence-indexed insert. Current insertions interrupt an annotated promoter, 5' UTR, exon, intron and 3' UTR sequence with 16%, 2.7%, 19.3%, 20.7% and 8.8% frequency, respectively. So far, less than half (17,696) of the rice genes have 3 or more insertions. Functional analysis of a gene requires several allelic insertions leading to a knockout and exhibiting consistent plant phenotypes. Having such alleles avoids tedious complementation steps or the implementation of alternative techniques such as searching for chemically induced lesions in TILLING mutant populations or RNAi-based gene knockdown experiments.
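The gene-interval definition used above (from 1,000 bp upstream of the ATG to 300 bp downstream of the STOP codon) can be made concrete with a small Python sketch. It is only an illustration under stated assumptions: the `Gene` record, the toy coordinates, and the helper names are hypothetical, coordinates are 1-based and inclusive, and the strand handling reflects our reading of "upstream" and "downstream" rather than the pipeline actually used for Table 3.

```python
from dataclasses import dataclass
from typing import Tuple

UPSTREAM_BP = 1000    # bases upstream of the ATG counted as gene interval
DOWNSTREAM_BP = 300   # bases downstream of the STOP codon counted as gene interval


@dataclass(frozen=True)
class Gene:
    """Hypothetical minimal gene record (1-based, inclusive coordinates)."""
    name: str
    atg: int      # chromosome position of the start codon
    stop: int     # chromosome position of the stop codon
    strand: str   # "+" or "-"


def gene_interval(gene: Gene) -> Tuple[int, int]:
    """Return (low, high) chromosome coordinates of the gene interval."""
    if gene.strand == "+":
        return gene.atg - UPSTREAM_BP, gene.stop + DOWNSTREAM_BP
    if gene.strand == "-":
        # On the minus strand the ATG has the higher coordinate.
        return gene.stop - DOWNSTREAM_BP, gene.atg + UPSTREAM_BP
    raise ValueError("strand must be '+' or '-'")


def insert_in_gene_interval(insert_pos: int, gene: Gene) -> bool:
    """True if the insertion position falls inside the gene interval."""
    low, high = gene_interval(gene)
    return low <= insert_pos <= high


def percent(part: int, whole: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Hypothetical genes used only to exercise the functions.
plus_gene = Gene("toy_plus", atg=5_000, stop=8_000, strand="+")
minus_gene = Gene("toy_minus", atg=8_000, stop=5_000, strand="-")

print(gene_interval(plus_gene))    # (4000, 8300)
print(gene_interval(minus_gene))   # (4700, 9000)
# Counts quoted in the text: positioned inserts located in gene intervals.
print(percent(226_861, 336_262))   # 67.5
```

The last line reproduces the 67.5% figure from the two counts given in the text.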
The use of PCR-based methods for systematic sequencing of chromosomal regions flanking insertion points has facilitated direct in silico access to mutant seed stocks via dedicated and user-friendly genome navigators such as RiceGE and OrygenesDB. The functions of a large range of genes involved in cell and developmental processes and in the plant response to biotic and abiotic environments have been unravelled by in silico reverse genetics with mutant resources. An expanding number of genes have also been identified and isolated after disruption or activation by Tos17, Ds or T-DNA inserts in forward genetics screens. These genes underlie important traits involved in hormone synthesis, cell wall synthesis, leaf anatomy, pollen/anther development, spikelet formation and response to biotic and abiotic stresses. In a recent survey, Jiang and associates (Jiang et al. 2012) estimated that the function of more than 600 rice genes has been experimentally validated.

Issues limiting the exchange and use of current collections of insertion mutants
In rice, 6 different commercial cultivars have been chosen to generate insertion libraries (the temperate japonicas Nipponbare, Dongjin, Hwayoung, Zhonghua 11, Zhonghua 15, and Tainung 67). This choice was based on the availability of a genome sequence or on the popularity, economic significance and tissue-culture amenability of the cultivar. The choice had unanticipated implications for the adaptation of the cultivars to greenhouse growth conditions (notably in labs with no expertise in rice cultivation) and imposed the constraint of handling several genetic backgrounds in functional investigations. For instance, the reference-sequenced Nipponbare cultivar is highly photoperiod-sensitive and does not flower under long-day conditions. It exhibits indeterminate tillering and a bushy phenotype under long-day conditions (>12 h), which favours pest and pathogen attacks and eventually results in a poor seed set. Contrasting phenotypes are observed depending on the growth period within a year. These findings must be taken into account when comparing mutants and transgenic lines, by always including a complete range of controls and/or using phytotron conditions. Temperate japonica seeds lose their germination ability rather quickly as compared with seeds of cultivars of other genetic groups, and a decline in germination must be anticipated in most collections generated more than 10 years ago. This situation may become critical because the funding that allowed for the generation of these biological resources may not be available again for rejuvenating them. Indeed, seeds must be increased under field experimental conditions, which could be complicated by the GM nature of most of the insertion collections.
Another limitation is the difficulty in exchanging seeds of a crop species: international exchanges of rice seed have to be covered by import permits and phytosanitary certificates from national quarantine authorities. The GM nature of the seeds may further complicate the procedure. Some countries ask for the complex detection of pathogenic forms of otherwise rather ubiquitous bacterial genera such as Pseudomonas. Immersion of seeds in hot water (55°C-60°C for 15 min) before shipment can be required for eradicating seed-borne nematodes. However, this treatment shortens the shelf life of the seeds, which then must be sown promptly after treatment. Intellectual property issues impose the filing and signing of Material Transfer Agreements (MTAs) before seed transfer, which might lead to additional delay or complication in seed delivery.
We lack a single stock centre holding duplicate seeds of the international rice mutant collections, comparable to the Nottingham Stock Centre for Arabidopsis. This dedicated centre has considerably facilitated access to seeds and furthered the integration of insertion lines in research programs in the model dicot. At the very least, a common web portal should be established for ordering seeds of rice insertion lines from different collections. Indeed, scientists beginning to investigate rice have difficulty navigating such a complex internet landscape.
Although the current size of the international collection of insertion lines (675,000) appears sufficient, the level of molecular characterization of this global resource, with only 336,000 sequence-indexed inserts anchored on the rice genome, still lags behind that of Arabidopsis (385,000), which has a 3-fold smaller genome. So far, ~30,000 non-TE genes out of 39,000 predicted gene sequences (i.e., 76.9%) harbour at least one sequence-indexed insert. Also, both FST characterization and seed multiplication are complex, error-prone processes involving multiple steps that can each be a source of contamination or mislabelling. Therefore, the quality of the library must be assessed by performing quality checks and by drawing on the experience of users. Experience has shown that the assignment of an insert to the correct seed bag is not always accurate, with reconfirmation rates ranging from 60% to 80%. We stress again the need for ordering and examining several insertions putatively leading to knockout of a given gene. However, only 60% of the rice genes have at least 2 allelic insertions (Table 4). The scientific community should intensify the FST generation effort. Nevertheless, the organization of the T-DNA inserts in some lines and the redundancy of Tos17 inserts may hamper fast progress in this area.
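To see why ordering several lines matters, consider a rough calculation in Python. If each ordered seed bag independently has probability p of truly containing the indexed insert (0.6 to 0.8 according to the reconfirmation rates quoted above), the chance of obtaining at least k confirmed alleles among n ordered lines follows a binomial distribution. The independence assumption and the function name are ours and serve only as an illustration.

```python
from math import comb


def prob_at_least(k: int, n: int, p: float) -> float:
    """Probability that at least k of n ordered lines carry the indexed insert.

    Assumes each line is confirmed independently with probability p
    (a simplifying assumption, not a measured property of any collection).
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    if k < 0 or n < 0:
        raise ValueError("k and n must be non-negative")
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Chance of ending up with two confirmed alleles of a gene.
for n_ordered in (2, 3, 4):
    low = prob_at_least(2, n_ordered, 0.6)
    high = prob_at_least(2, n_ordered, 0.8)
    print(f"{n_ordered} lines ordered: {low:.2f} to {high:.2f}")
```

Under these assumptions and p = 0.6, ordering only two lines yields two confirmed alleles in about 36% of cases, whereas ordering four raises this to about 82%.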
We hereafter summarize the current status of rice mutant resources.

1. Most efforts have focused on the generation of insertion mutants and less on the use of chemical or physical mutagens.
2. Japonica rice varieties are used in most of the mutant resources.
3. Few initiatives involve breeders in teams or provide phenomics data.
4. Each group uses different phenotype descriptions and codes for mutant traits. Crosstalk between holders of mutant collections should be promoted so that the mutant traits from different groups may be unified or compared.
5. Seeds produced in some groups do not benefit from good storage facilities. We need a call to renew all mutant populations as soon as possible.
6. Requesting the mutant lines involves complicated procedures such as MTAs and quarantine.
7. We lack a centralized seed stock center for all collections.
8. Some efforts are focused on finding new ideal tagging elements.
9. Most initiatives active in the 2000s included generating new mutant lines, generating flanking sequence data, and performing phenomics analysis. However, less effort has been invested in generating new lines in recent years. Instead, most groups focus on the characterization of specific mutant lines.

Further characterization of existing collections
Among the extensive collections shown in Table 1, only four institutions worldwide use the activation tagging (AT) method. As an example, the TRIM collection contains 38,840 FSTs. Of the 38,866 non-TE genes, the putative knocked-out gene number is 18,665 and the putative activated gene number is 27,403. So 48.6% of the genes may be knocked out and 70.5% of the genes may be activated. Therefore, the AT method may affect more genes than classical insertion mutagenesis. Most, if not all, of the FST information from all insertion mutant populations is available in the rice browsers RiceGE and OrygenesDB. Interpreting the effect of an integration that knocks out a gene is straightforward. However, interpreting the AT population is difficult because the affected genes vary by integration direction and distance as well as by the vector construct used by each group. For instance, a cassette with 8 copies of the 35S enhancer is located near the left border (LB) of the Tag8 vector used in the TRIM population. Thus, genes located from within 15 kb of the LB to within 5 kb of the right border (RB), a 20-kb region, may be activated. However, in the CAAS mutant population, the two copies of the 35S enhancer were arranged next to the RB of pER38. Thus, genes located within 7 kb of the RB may be activated (Wan et al. 2009).
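The dependence of the potentially activated region on vector design and insert orientation can be sketched in Python. The example encodes the enhancer ranges quoted above (15 kb on the LB side and 5 kb on the RB side for the TRIM vector; 7 kb on the RB side for the CAAS vector) and returns the genomic window in which genes might be activated. The data structures, the orientation convention, the overlap criterion, and the toy gene coordinates are hypothetical; setting the LB-side range of the CAAS vector to zero is a simplifying assumption, not a statement from the source.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ActivationVector:
    """Enhancer reach on each side of the T-DNA, in bp."""
    name: str
    lb_reach_bp: int
    rb_reach_bp: int


TRIM_VECTOR = ActivationVector("Tag8 (TRIM)", lb_reach_bp=15_000, rb_reach_bp=5_000)
# LB-side reach of 0 is an assumption made for this sketch only.
CAAS_VECTOR = ActivationVector("pER38 (CAAS)", lb_reach_bp=0, rb_reach_bp=7_000)

# A gene as (name, start, end) on the same chromosome as the insert.
GeneSpan = Tuple[str, int, int]


def activation_window(insert_pos: int, lb_on_left: bool,
                      vector: ActivationVector) -> Tuple[int, int]:
    """Genomic window that the enhancers may reach.

    lb_on_left is a hypothetical convention: True when the LB faces
    lower chromosome coordinates.
    """
    if lb_on_left:
        return insert_pos - vector.lb_reach_bp, insert_pos + vector.rb_reach_bp
    return insert_pos - vector.rb_reach_bp, insert_pos + vector.lb_reach_bp


def candidate_activated_genes(genes: List[GeneSpan],
                              window: Tuple[int, int]) -> List[str]:
    """Names of genes overlapping the window (overlap is our criterion)."""
    low, high = window
    return [name for name, start, end in genes if start <= high and end >= low]


# Hypothetical genes around an insert at position 100,000.
toy_genes = [
    ("geneA", 80_000, 83_000), ("geneB", 96_000, 99_000),
    ("geneC", 104_000, 106_000), ("geneD", 112_000, 115_000),
]

for lb_left in (True, False):
    window = activation_window(100_000, lb_left, TRIM_VECTOR)
    print(lb_left, window, candidate_activated_genes(toy_genes, window))
```

With the same insertion site, flipping the orientation changes the candidate list, which is why orientation has to be recorded along with the FST position when an AT line is interpreted.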
Another important issue with the insertion mutant resources is the low tagging efficiency; that is, the proportion of mutant phenotypes that are actually linked to the genotype (the integrated or affected gene) is very low. So the observed phenotypes may or may not be caused by the integration of the vector. The estimated tagging efficiency was 5% to 10% in the Tos17 and T-DNA tagged populations. A review paper published decades ago concluded that plant cell culture itself generates genetic variability (i.e., somaclonal variation; Larkin and Scowcroft 1981). Such variation occurred in culture subclones and in regenerated plants (somaclones). The somaclonal variation, resulting from a sum of genetic and epigenetic changes, might occur during the callus induction, growth, Agrobacterium co-culture, and regeneration processes. The Tos17 mutant population had to go through callus induction, growth, and regeneration, whereas the T-DNA and Ac/Ds mutant populations had to go through Agrobacterium co-culture in addition to these processes.
With high-throughput sequencing, several studies investigated the sequence changes in rice Tos17 mutant lines or Arabidopsis regenerated plants by NGS strategies. Because clonally regenerated Arabidopsis plants show poorly understood heritable phenotypic variation, Jiang and coworkers (Jiang et al. 2012) estimated the genome-wide sequence variation in five R1 lines by NGS. Both SNPs and small indels (<2 bp) were discovered and were confirmed by capillary sequencing. These mutations were evenly distributed among the 5 chromosomes. The average theoretical number of mutations per plant was 122. Thus, the mutation rate was (10.5 ± 1.0) × 10⁻⁷, about 60- to 350-fold higher than the spontaneous mutation rate. Miyao and associates (Miyao et al. 2011) sequenced three Tos17 mutant lines, ttm2, ttm5 and ttm11, that they had generated. Some of the sequence changes were then validated by the dye-terminator method. The SNPs in these regenerants were distributed on all chromosomes, with a transition:transversion ratio of 1.1. The estimated mutation rate was 1.74 × 10⁻⁶ base substitutions per site per regeneration. In addition, the authors detected small indels. Using the paired-end read data, they searched against the sequences of 43 transposable elements such as Tos17, Tos19, and mPing but detected no transposition events except for Tos17. So they concluded that, other than the integration of the retrotransposon Tos17, SNPs and indels were the major causes of somaclonal variation. Using a similar strategy, Panaud and associates (Sabot et al. 2011) studied the transposition events of another rice Tos17 mutant line, AB156365. From the paired-end data, in addition to the Tos17 insertions (11 of them), they detected the transposition of another 11 LTR retrotransposons and 12 MITE integrations compared with the Nipponbare RefSeq. The conflicting results may be explained by the different batches of seeds used to produce these lines.
Although the Nipponbare RefSeq is of high quality, rice seeds accumulate sequence variation during propagation each year. Thus, SNPs, indels, and transpositions may pre-exist in the seed batch used for transformation. Figure 1 illustrates the SNPs in TNG67 and four TRIM lines in four 100-kb fragments. These four mutant lines, M48349, M53677, M79651, and M84311, were generated in the transgenic lab at least 4 months apart to ensure that they did not come from the same series of calli. All sequences were aligned to the Nipponbare sequence. Because the TNG67 genome sequence is available, we may differentiate the new SNPs (indicated by blue arrows) from those in the original TNG67 (indicated by light blue circles). The sequencing depth is also important for distinguishing real SNPs (complete red bar) from low-quality reads (red-blue bar). Sometimes we see identical SNPs in two different lines (indicated by a red arrow), indicating that this variation might already exist in the seeds used for transformation.
Recently Jacobsen and coworkers (Stroud et al. 2013) have shown that rice plants regenerated from tissue culture also harbor stable epigenome changes. Cell dedifferentiation, occurring as the first step of cell culture, is accompanied by a reprogramming of gene expression that notably relies on a change of DNA methylation status. This group has shown that some of these methylation changes persist in regenerated plants. Notably, the observed demethylation of regulatory regions of genes may result in a change of expression and thereby of phenotype.
The following are recommendations for solving some of these issues. 1) We should complete the sequencing of all varieties that have been used to generate the mutant populations (i.e., Tainung 67, Dongjin, Zhonghua 11, Kitaano, Zhonghua 15, Hwayoung, and Kitaake). These data should also be made publicly available and should be used, rather than the Nipponbare sequence, as the reference for detecting changes in the genome. 2) We should pay special attention to spontaneous mutations accumulating in seeds. Although most resource groups received the seeds for transformation from rice breeders, sequence differences could still appear among seeds and among batches. To eliminate this problem, several lines may be sequenced to reveal identical SNPs or indels that pre-existed before transformation. Also, it is important to always compare homozygous and azygous siblings in addition to the wild-type control. 3) Because activated genes usually confer a dominant trait, we may use the segregation ratio to search for candidate mutant lines whose phenotype is caused by the integration and not by somaclonal variation. The tagging efficiency should then be higher. However, this method may be applied only to the AT population. 4) We may prepare bulk DNA from several lines and perform NGS so that the T-DNA/Tos17/Ds integration sites can be discovered efficiently. However, real SNPs or indels and low-quality base readings may not be easy to distinguish in such bulked samples.
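Recommendation 3 can be illustrated with a simple goodness-of-fit calculation in Python. Assuming a single hemizygous insert that confers a dominant phenotype, selfed progeny are expected to segregate 3 mutant : 1 wild type, and a chi-square test with one degree of freedom indicates whether observed counts are compatible with that ratio. The progeny counts and function names below are hypothetical; 3.841 is the standard 5% critical value of the chi-square distribution with one degree of freedom.

```python
CHI2_CRITICAL_DF1_5PCT = 3.841  # 5% critical value, 1 degree of freedom


def chi_square_3_to_1(n_mutant: int, n_wild_type: int) -> float:
    """Chi-square statistic for a 3 mutant : 1 wild-type expectation."""
    total = n_mutant + n_wild_type
    if total <= 0:
        raise ValueError("at least one plant must be scored")
    expected_mutant = 0.75 * total
    expected_wild_type = 0.25 * total
    return ((n_mutant - expected_mutant) ** 2 / expected_mutant
            + (n_wild_type - expected_wild_type) ** 2 / expected_wild_type)


def fits_dominant_single_locus(n_mutant: int, n_wild_type: int) -> bool:
    """True if the counts do not reject a 3:1 ratio at the 5% level."""
    return chi_square_3_to_1(n_mutant, n_wild_type) < CHI2_CRITICAL_DF1_5PCT


# Hypothetical progeny counts for two candidate lines.
print(round(chi_square_3_to_1(74, 26), 3), fits_dominant_single_locus(74, 26))
print(round(chi_square_3_to_1(50, 50), 3), fits_dominant_single_locus(50, 50))
```

A line whose progeny fit the 3:1 ratio would still need to be checked for co-segregation of the phenotype with the insert; the ratio alone only narrows the list of candidates.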

Creating new and improved resources
A call has recently been launched for a shared international effort in the fast generation of an insertion line collection of the temperate japonica cultivar Kitaake (G. An, unpublished). Kitaake is a short-cycling (2 months from seed to seed), temperate japonica variety grown in the northern Hokkaido island of the Japanese archipelago. It exhibits a compact habit, with few but productive tillers, and is relatively photoperiod-insensitive (its height varies by day length). Its low tiller number and short cycle allow for high culture density and avoidance of most disease and pest attacks on vegetative and reproductive organs under greenhouse conditions. The Kitaake genome has recently been sequenced (P. Ronald, pers. comm.) and will soon be publicly released; the cultivar is amenable to high-throughput transformation (Kim et al. 2013).
A new, improved gene-trap T-DNA vector should be used, including, for example, two T-DNA LBs to prevent backbone transfer and, possibly, additional sequences that facilitate amplification and sequencing of flanking regions and help to reveal and avoid tandem insertions. Also, quantitative PCR screens that select single-copy T-DNA events, and thereby eliminate multiple, complex and tandem insertion events, are now available and can be set up on a large scale so that only useful plants are transferred to the greenhouse. The success rate in isolating flanking regions is also maximized in single-copy events (unpublished), thereby reducing the overall sequencing cost. The efficiency of gene isolation after reporter expression-mediated trapping should likewise be optimized in such a selected subpopulation. A common stock center could be set up for this new Kitaake insertion resource, which would facilitate seed orders and expedite regulatory and import permit issues.
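The quantitative PCR screen for single-copy events mentioned above could be sketched as follows. This is a hedged illustration only: it assumes the widely used comparative Ct (2^-ddCt) calculation, near-100% amplification efficiency for both amplicons, an endogenous reference gene, and a calibrator line known to carry one T-DNA copy. The function names and the acceptance window of 0.5 to 1.5 copies are hypothetical and would need to be calibrated for each assay.

```python
# Relative T-DNA copy-number estimate from qPCR Ct values (comparative Ct method).
# Assumptions: ~100% amplification efficiency, an endogenous reference amplicon,
# and a calibrator plant known to carry `calibrator_copies` T-DNA copies.


def estimate_copy_number(ct_transgene: float, ct_reference: float,
                         cal_ct_transgene: float, cal_ct_reference: float,
                         calibrator_copies: float = 1.0) -> float:
    """Estimate transgene copies in a sample relative to the calibrator line."""
    delta_ct_sample = ct_transgene - ct_reference
    delta_ct_calibrator = cal_ct_transgene - cal_ct_reference
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return calibrator_copies * 2.0 ** (-delta_delta_ct)


def is_single_copy(copies: float, low: float = 0.5, high: float = 1.5) -> bool:
    """Hypothetical acceptance window for calling a single-copy event."""
    return low <= copies <= high
```

For example, a plant whose transgene amplicon crosses the threshold one cycle earlier than the calibrator (relative to the reference gene) would be estimated at about two copies and excluded before transfer to the greenhouse.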
It is also desirable to avoid unwanted DNA lesions and methylation changes due to somaclonal variation by shortening the duration of tissue culture. Fast transformation protocols that use primary seed embryo-derived calli rather than secondary calli have been set up (Toki et al. 2006). Automation of the tricky and tedious washing and decontamination procedure has increased the throughput of the method (D. Meynard, unpublished observations). Another objective is the complete avoidance of tissue culture through the development of in planta transformation procedures. The in planta transformation protocol used in Arabidopsis, floral dipping, was instrumental in establishing insertion libraries, greatly facilitated the integration of routine transformation procedures into laboratory practice, and eliminated variations due to callus culture (although not those created by aborted T-DNA integrations). In rice, Agrobacterium-mediated transformation following the piercing of young seedlings has allowed the generation of transgenic plants, but still not in a routine manner (Supartana et al. 2005; Lin et al. 2009). Establishing a high-throughput in planta protocol would be of great interest and would certainly enhance the efficiency of forward genetics screens.
In the past 2 years, novel technologies such as zinc finger nucleases, Transcription Activator-Like Effector (TALE) nucleases and Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)/Cas9 have shown promise for generating double-strand breaks and subsequent lesions in the genomes of higher organisms (Gaj et al. 2013). TALE nucleases have recently been used for creating mutations in a rice gene involved in the development of bacterial blight disease, with a high frequency of mutation (20-30%), including bi-allelic mutations, at the T0 generation (Li et al. 2012). Recent papers have also shown that rice is amenable to CRISPR/Cas9 technology (Shan et al. 2013), making rice the first crop to have its genome edited using this method. The ability to generate lesions in the genome at high throughput, guided by sequence-specific single guide (sg) RNAs complementary to the target DNA, and to multiplex such sgRNAs in a single transfection using a universal Cas9 nuclease module, opens a new avenue for generating novel resources that systematically target genes having no insert or no alleles in existing mutant collections.
Although these methods are experimental and their use is still limited to a few laboratories, the systematic creation of lesions and simultaneous insertion of a reporter in any rice gene may become tractable in the near future through an international effort. It might then be possible to establish an insertion resource that systematically targets rice genes by simultaneously creating both a knockout and a reporter line using these technologies, which would greatly facilitate the deciphering of the function of all rice genes by 2020.
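As a rough illustration of how sgRNA target sites might be enumerated for a gene lacking alleles in existing collections, the sketch below scans both strands of a DNA sequence for a 20-nt protospacer followed by an NGG motif. The NGG requirement and the 20-nt length are assumptions based on the commonly used Streptococcus pyogenes Cas9 and are not specified in the text above. The function names are hypothetical, and no off-target or efficiency scoring is attempted.

```python
import re

# Assumption: SpCas9-style targets, i.e. a 20-nt protospacer immediately
# followed on its 3' side by an NGG protospacer-adjacent motif (PAM).

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case A/C/G/T sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_candidate_targets(seq: str, spacer_len: int = 20):
    """List (strand, start, protospacer) tuples for every spacer + NGG site.

    `start` is 0-based on the scanned strand; for "-" hits it is counted
    on the reverse-complemented sequence, not on the input sequence.
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})[ACGT]GG)" % spacer_len)
    hits = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for match in pattern.finditer(strand_seq):
            hits.append((strand, match.start(), match.group(1)))
    return hits
```

In practice, each candidate would still need to be checked for uniqueness against the genome of the variety being transformed, which is one more reason to sequence the parental varieties.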

Conclusion
We summarize the present status of rice mutant collections affected in a random manner by physical/chemical and insertion mutageneses. About 10 years after the first publication on insertional mutant work, the flanking sequence tags of the current worldwide collections have reached three quarters of the number calculated for genome saturation, and they are publicly available. We point out issues limiting the exchange and use of the current resources and provide suggestions for solving these problems.