Table 1 Rules for Classifying Sequenced Genes as Suggested by the CGSNL

From: Gene Nomenclature System for Rice

| Categories | Classification | Standard protocol | Description |
| --- | --- | --- | --- |
| Category I | Identical to rice protein with known function | Identity ≥ 98%, length coverage = 100% to known rice protein [blastx] | Receive the same, original gene name |
| Category II | Similar to a known protein | Identity ≥ 50% to a known protein [blastx] | Receive “original gene name, putative” |
| Category III | InterPro domain-containing protein | Not in category I or II, but contains InterPro domain | Receive “InterPro name domain-containing protein” |
| Category IV | Conserved hypothetical protein | Identity ≥ 50%, length coverage ≥ 50% to hypothetical protein [blastx] | Receive “conserved hypothetical protein” |
| Category V | Hypothetical protein | If not in category I to IV | Receive “hypothetical protein” |
  1. This table describes a system for classifying sequenced genes into categories based on their sequence similarity to previously reported genes, as recommended by the CGSNL. Genes predicted and/or known to be present in the O. sativa ssp. japonica cv. Nipponbare genome on the basis of sequence analysis are classified into five categories (column 1). A gene is assigned a gene name and a gene symbol only if there is substantial experimental evidence confirming that it is identical in sequence to a previously characterized rice gene of known function (category I). If the evidence is considered insufficient to substantiate assigning a gene function (categories II–V), the gene name field is left empty and the description/definition fields (columns 2 and 4) are used to document what is known about the characteristics of the gene.
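The decision rules in Table 1 can be read as a cascade of threshold checks. The following is a minimal sketch of that cascade, not the CGSNL's own implementation. The `Hit` record, its field names, the `kind` labels, and the `classify` function are all hypothetical. The sketch also assumes that hits are supplied best-first, that categories are tested in the order I to V, and that the first InterPro domain name is used when several are present; none of these details is specified in the table.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass
class Hit:
    """One blastx hit. All field names are hypothetical, for illustration only."""
    name: str        # name of the matched protein
    identity: float  # percent identity, 0-100
    coverage: float  # percent length coverage, 0-100
    kind: str        # assumed labels: "rice_known", "known", or "hypothetical"


def classify(hits: Sequence[Hit],
             interpro_domains: Sequence[str] = ()) -> Tuple[str, str]:
    """Return (category, description) using the Table 1 thresholds.

    Assumes `hits` is sorted best-first and that categories are
    tested in the order I, II, III, IV, V.
    """
    # Category I: identity >= 98% and full-length coverage to a known rice protein.
    for h in hits:
        if h.kind == "rice_known" and h.identity >= 98.0 and h.coverage >= 100.0:
            return "I", h.name

    # Category II: identity >= 50% to any known protein (rice or otherwise).
    for h in hits:
        if h.kind in ("rice_known", "known") and h.identity >= 50.0:
            return "II", f"{h.name}, putative"

    # Category III: no qualifying hit above, but an InterPro domain is present.
    if interpro_domains:
        return "III", f"{interpro_domains[0]} domain-containing protein"

    # Category IV: identity >= 50% and coverage >= 50% to a hypothetical protein.
    for h in hits:
        if h.kind == "hypothetical" and h.identity >= 50.0 and h.coverage >= 50.0:
            return "IV", "conserved hypothetical protein"

    # Category V: everything else.
    return "V", "hypothetical protein"


if __name__ == "__main__":
    # Placeholder protein name; not a real annotation.
    print(classify([Hit("example kinase", 72.0, 80.0, "known")]))
    # -> ('II', 'example kinase, putative')
```

Ordering the checks from the strictest category to the most permissive mirrors the "not in category I or II" and "if not in category I to IV" wording of the table, so each gene receives the most specific description its evidence supports.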