Table 1 Rules for Classifying Sequenced Genes as Suggested by the CGSNL

From: Gene Nomenclature System for Rice

| Categories | Classification | Standard protocol | Description |
| --- | --- | --- | --- |
| Category I | Identical to rice protein with known function | Identity ≥ 98%, length coverage = 100% to known rice protein [blastx] | Receive the same, original gene name |
| Category II | Similar to a known protein | Identity ≥ 50% to a known protein [blastx] | Receive “original gene name, putative” |
| Category III | InterPro domain-containing protein | Not in category I or II, but contains InterPro domain | Receive “InterPro name domain-containing protein” |
| Category IV | Conserved hypothetical protein | Identity ≥ 50%, length coverage ≥ 50% to hypothetical protein [blastx] | Receive “conserved hypothetical protein” |
| Category V | Hypothetical protein | Not in categories I to IV | Receive “hypothetical protein” |

  1. This table describes a system for classifying sequenced genes into categories based on their sequence similarity to previously reported genes, as recommended by the CGSNL. Genes predicted and/or known, on the basis of sequence analysis, to be present in O. sativa ssp. japonica cv. Nipponbare are classified into five categories (column 1). A gene is assigned a gene name and a gene symbol only if there is substantial experimental evidence confirming that it is identical in sequence to a previously characterized rice gene of known function (category I). If the evidence is considered insufficient to substantiate assigning a gene function (categories II–V), the gene name field is left empty and the description/definition field (columns 2 and 4) is used to document what is known about the characteristics of the gene.
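
The decision rules in Table 1 can be read as a simple cascade. The following is a minimal Python sketch of that reading, not the CGSNL's own software. The `Hit` record, its field names, and the `classify` function are hypothetical and introduced only for illustration. The sketch assumes that a single best blastx hit has already been summarized per gene, that the categories are tested in table order (I, II, III, IV, V), and that a "hypothetical protein" hit is one without known function.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Hit:
    """Summary of a best blastx hit (hypothetical record layout)."""
    name: str             # name of the matched protein
    identity: float       # percent identity, 0-100
    coverage: float       # percent length coverage, 0-100
    is_rice: bool         # True if the matched protein is from rice
    known_function: bool  # False if the match is a "hypothetical protein"


def classify(hit: Optional[Hit],
             interpro_name: Optional[str] = None) -> Tuple[str, str]:
    """Return (category, description) following the thresholds in Table 1.

    Assumption: categories are checked in table order, so a gene with an
    InterPro domain lands in category III before category IV is considered.
    """
    if hit is not None and hit.known_function:
        # Category I: near-identical, full-length match to a known rice protein.
        if hit.is_rice and hit.identity >= 98 and hit.coverage == 100:
            return "I", hit.name
        # Category II: at least 50% identity to any known protein.
        if hit.identity >= 50:
            return "II", f"{hit.name}, putative"
    # Category III: no qualifying known-protein hit, but an InterPro domain.
    if interpro_name:
        return "III", f"{interpro_name} domain-containing protein"
    # Category IV: similar to a protein that is itself only hypothetical.
    if (hit is not None and not hit.known_function
            and hit.identity >= 50 and hit.coverage >= 50):
        return "IV", "conserved hypothetical protein"
    # Category V: everything else.
    return "V", "hypothetical protein"


if __name__ == "__main__":
    # Illustrative inputs only; the values are made up for the example.
    print(classify(Hit("example protein", 99.0, 100.0, True, True)))
    print(classify(None, interpro_name="Example"))
```

Only a category I result would carry a gene name and symbol. For categories II to V the returned description corresponds to the description/definition field, and the gene name field stays empty.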