Mitochondrial gene nomenclature

At the conference on 'Higher Plant Mitochondrial DNA', held at Airlie House, Virginia, USA, in October 1986, participants discussed the possibility of creating a standardized nomenclature for plant mitochondrial genes and for nuclear genes encoding mitochondrial proteins and RNAs. This proposal details the consensus opinion of that meeting and of a number of interested individuals who responded to the circulars. An overriding aim was to retain, wherever possible, the gene designations in common use, and the general principles of chloroplast gene nomenclature were followed (R.B. Hallick and W. Bottomley, Plant Mol. Biol. Rep. 1:38-43, 1983).

Mitochondrial proteins: The polypeptide products of both nuclear and mitochondrial genes are to be designated by capital letters, e.g.:
ATPB-1: beta-subunit of the mitochondrial ATPase.
COXII: subunit 2 of cytochrome oxidase.

Mitochondrial genes: Genes on the mitochondrial chromosome or associated with mitochondrial plasmids will take a three letter code in lower case which will be either italicised or underlined, for example:
cox trn rrn orf urf
Any suffixed descriptors will also be italicised (underlined), for example:
atpA-1 coxII

Nuclear genes (specifying mitochondrial components): Nuclear genes will be distinguished from mitochondrial genes by being in UPPER CASE. Again a three letter italicised or underlined code will be used, for example:
The accepted convention for nuclear genes is to use numbers to identify genes with a related function or different members of the same gene family. However, we feel that where multipolypeptide complexes exist and where these have several components with unknown numbers of constituent polypeptides, for example the F0 and F1 components of ATP synthase, both letters and numbers should be used to distinguish the components of the complex whether or not the individual polypeptides of the components are mitochondrial encoded. This will allow additional polypeptides to be assigned to a complex regardless of the compartment in which it is encoded.
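The case rule above can be sketched in a few lines of Python. This is only an illustration: the function name `compartment` is hypothetical, and the sketch assumes a bare gene designation with no cytoplasm or plasmid prefix. It cannot tell a nuclear gene from a polypeptide product, since both are upper case and differ only in italics or underlining.

```python
def compartment(designation: str) -> str:
    """Infer the coding compartment from the case of the three-letter code.

    Assumption: the designation is a bare gene name (no cytoplasm or
    plasmid prefix), so its first three characters are the gene code.
    """
    code = designation[:3]
    if len(code) < 3 or not code.isalpha():
        raise ValueError(f"no three-letter code in {designation!r}")
    if code.islower():
        return "mitochondrial"  # e.g. atpA-1, coxII
    if code.isupper():
        return "nuclear"        # e.g. ATPB-2
    raise ValueError(f"mixed-case code in {designation!r}")


print(compartment("atpA-1"))
print(compartment("ATPB-2"))
```

A mixed-case code such as `Cox` is rejected rather than guessed at, since the proposal gives it no meaning.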

Mutations: In some instances mitochondrial genes specifying a gene product in one species or cytoplasm will be non-functional in another species or in another cytoplasm of the same species. It is therefore necessary to distinguish between functional and non-functional genes. The simplest way of achieving this is to provide the non-functional gene with a prefix, preferably a Greek symbol, e.g. phi (lower case).
  Functional gene    Mutant gene
  urf1               φurf1

Gene copy number: In both the nuclear and organelle compartments gene copy number may exceed one. In such instances genes will take a suffixed arabic number, for example:
atpA-1 atpA-2 trnS1 trnS2
Where the physical map of the mitochondrial genome has been determined, repeated sequences, if present, will be numbered sequentially from the origin of the map; genes within a repeat will be given the same reiteration-number as the repeat.

Species and cytoplasm designation: Species can be designated by the standard three letter code, for example:
Oenothera berteriana : Obe
Triticum aestivum : Tae
Zea mays : Zma
Many higher plant species have more than one cytoplasm. If no nomenclature exists to distinguish these then they should be distinguished by an upper case letter. In many instances cytoplasms have been described as either fertile or sterile. In these instances S and F will suffice. In maize the accepted designations of N, T, S and C will be retained. These cytoplasm descriptors will follow the three letter species code, for example:
  Code  Description
  Zma C  The C-cytoplasm of maize

This type of abbreviation is most often used with restriction endonucleases. Without exception, these cytoplasm identifiers must precede the gene, to which they may be hyphenated, for example:
T-urf13 N-atpA-1 N-atpA-2 S-pcf
Such cytoplasm identifiers can be optional to reduce the size of the gene acronyms. Please note that the species acronym (if used) and the cytoplasm identifier are NOT to be italicised or underlined.
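The species code and the cytoplasm prefix can be sketched as follows. The helper names `species_code` and `with_cytoplasm` are hypothetical. The derivation rule (first letter of the genus plus the first two letters of the specific epithet) is an assumption inferred from the three examples given and from restriction endonuclease usage; the proposal itself only refers to "the standard three letter code".

```python
from typing import Optional


def species_code(binomial: str) -> str:
    """Build a three-letter species code from a Latin binomial.

    Assumed rule: genus initial (upper case) + first two letters of the
    specific epithet (lower case), e.g. 'Zea mays' -> 'Zma'.
    """
    genus, epithet = binomial.split()[:2]
    return genus[0].upper() + epithet[:2].lower()


def with_cytoplasm(gene: str, cytoplasm: Optional[str] = None) -> str:
    """Prefix a gene with its (optional) cytoplasm identifier and a hyphen."""
    return f"{cytoplasm}-{gene}" if cytoplasm else gene


print(species_code("Zea mays"))        # Zma
print(with_cytoplasm("urf13", "T"))    # T-urf13
print(with_cytoplasm("atpA-1"))        # atpA-1
```

In print, only the gene part would be italicised or underlined; a plain string cannot carry that distinction, so it is left to the typesetting.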

Genes of plasmids: Plasmids, whether linear or circular, DNA or RNA, are associated with some cytoplasms but not all. There is a growing body of evidence to suggest that specific plasmids are NOT associated with a particular cytoplasm. In order to account for this, we propose that a plasmid designation replaces the cytoplasm identifier. It will be necessary for the research worker to specify the cytoplasm elsewhere. Where many plasmids are associated with one cytoplasm it may be useful to identify them by mp1 : mitoplasmid 1, mp2 etc. Alternatively, trivial and commonly used designations may be retained, for example S1, S2, R2 etc. An mp prefix will assume a circular topology. Linear plasmids, not having commonly known designations, can be identified by their size in nucleotides or nucleotide base pairs and an L descriptor. Single-stranded molecules can be noted using the 'ss' abbreviation. Perhaps fortunately, there are sufficiently few linear nucleic acid entities associated with mitochondria for them to retain or be given trivial designations.
  Plasmid    Gene        Format
  S2         urf1        S2-urf1
  2.3L       trnW-TGG    2.3L-trnW-TGG
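A minimal sketch of how plasmid-borne genes might be formatted is given below. All function names are hypothetical, and the sketch assumes that the size in a designation such as 2.3L is expressed in kilobases; the proposal does not state the unit, nor how the 'ss' abbreviation is attached, so single-stranded molecules are left out.

```python
def circular_plasmid_id(number: int) -> str:
    """'mp' (mitoplasmid) identifier; an mp prefix implies circular topology."""
    return f"mp{number}"


def linear_plasmid_id(size_kb: float) -> str:
    """Size-based identifier for a linear plasmid without a trivial name.

    Assumption: the size is given in kilobases, as in '2.3L'.
    """
    return f"{size_kb:g}L"


def plasmid_gene(plasmid: str, gene: str) -> str:
    """The plasmid designation takes the place of the cytoplasm identifier."""
    return f"{plasmid}-{gene}"


print(plasmid_gene("S2", "urf1"))                       # S2-urf1
print(plasmid_gene(linear_plasmid_id(2.3), "trnW-TGG"))  # 2.3L-trnW-TGG
```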

Recommended nomenclature for known mitochondrial genes
Ribosomal genes
  Gene Gene product
  rrn26 26S rRNA
  rrn18 18S rRNA
  rrn5 5S rRNA

The designation rDNA (e.g. 26S rDNA) includes the ribosomal gene as well as the transcriptional promoters and the transcribed flanking sequences.

Transfer RNA genes-Transfer RNA genes are designated 'trn' with the addition of the single letter amino-acid code to identify the species; isoaccepting species will be designated with the anticodon following the amino acid code. Duplicated genes, for example those associated with repeated sequences, will be identified with a reiteration-number corresponding to the reiteration-number of the repeat. Unfortunately, it is impossible to distinguish between tRNA gene duplications which have occurred due to promiscuous DNA transfer between organelles. A few of the chloroplast tRNA genes in the mitochondrial genome of maize, for example trnW-TGG, are transcribed, but their ability to accept amino acids and transfer these to growing polypeptide chains has never been demonstrated. Therefore, they could be considered pseudogenes and be designated as such.
  Gene                  Gene product    Comment
  trnM or trnM-ATG      tRNAMet         Elongator species
  trnfM or trnfM-ATG    tRNAfMet        Formyl-methionine initiator species

Where there are isoacceptors:
  trnL-TTG tRNALeu-CAA
  trnL-CTG tRNALeu-CAG

Where gene duplication has occurred:
  trnL1 or trnL1-CTA tRNALeu1-UAG
  trnL2 or trnL2-CTA tRNALeu2-UAG
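The relation between the gene suffix and the product suffix in these tables can be sketched in Python. The reading used here is an inference from the rows trnL-CTG / tRNALeu-CAG and trnL1-CTA / tRNALeu1-UAG, not a rule stated in the proposal: the gene carries the codon written as DNA, and the product carries the anticodon, which is the reverse complement written as RNA. The function names are hypothetical.

```python
from typing import Optional

# DNA base -> complementary RNA base
_COMPLEMENT = {"A": "U", "C": "G", "G": "C", "T": "A"}


def anticodon(codon_dna: str) -> str:
    """Reverse-complement a DNA codon into the RNA anticodon (5' to 3')."""
    return "".join(_COMPLEMENT[base] for base in reversed(codon_dna.upper()))


def trn_gene(amino_acid: str, codon: Optional[str] = None,
             copy: Optional[int] = None) -> str:
    """Assemble a tRNA gene designation.

    amino_acid: single-letter code ('fM' for the initiator species)
    codon:      optional DNA codon, needed to separate isoacceptors
    copy:       optional reiteration-number for duplicated genes
    """
    name = f"trn{amino_acid}{copy if copy is not None else ''}"
    return f"{name}-{codon}" if codon else name


print(trn_gene("L", "CTA", copy=1), anticodon("CTA"))  # trnL1-CTA UAG
print(trn_gene("L", "CTG"), anticodon("CTG"))          # trnL-CTG CAG
```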

The mitochondrial genetic code has one possible anomaly: CGG specifies tryptophan instead of arginine. It is therefore recommended that in this instance it be fully abbreviated as follows:

Mitochondrial polypeptide genes: The gene designations in this section will make use of the commonly used or accepted gene designations where possible.

Ribosomal protein genes-The designations 'rps' for small subunit proteins, 'rpl' for large subunit proteins, are recommended. Where homology to an existing E. coli ribosomal protein exists the gene can be designated with the same number. If no homology exists then the identifier should be a letter. For example:
  Gene     Gene product    Comment
  rps13    RPS13           ribosomal protein S13

Polypeptides of the electron transport chain
Complex I: NADH-ubiquinone oxidoreductase-The components of this complex will be designated 'nad'. Individual genes will be given numerical identifiers, which will indicate homology or functional equivalence to the mammalian subunits. Additional genes will accept the next number of the series.
  Gene* Gene product Mammalian gene
  nad1 NAD1 urf1
  nad2 NAD2 urf2
  nad3 NAD3 urf3
  nad4 NAD4 urf4
  nad4L NAD4L urf4L
  nad5 NAD5 urf5
  nad6 NAD6 urf6

*These gene designations presume a mitochondrial location. To date only sequences related to nad1 and nad5 have been identified.
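The correspondence in the table above is regular enough to sketch in code: the mammalian urf identifier is carried over unchanged, and the polypeptide product is the gene name in capitals. The function names `nad_gene` and `product_name` are hypothetical, and the upper-casing rule is meant for polypeptide genes only; RNA products such as tRNAMet follow a different pattern.

```python
def nad_gene(mammalian_urf: str) -> str:
    """Map a mammalian complex I reading frame (urf1 ... urf6, urf4L)
    to the proposed plant mitochondrial gene name."""
    if not mammalian_urf.startswith("urf") or len(mammalian_urf) == 3:
        raise ValueError(f"not a numbered urf designation: {mammalian_urf!r}")
    return "nad" + mammalian_urf[3:]


def product_name(gene: str) -> str:
    """Polypeptide products take the gene designation in capital letters."""
    return gene.upper()


for urf in ("urf1", "urf4L", "urf6"):
    gene = nad_gene(urf)
    print(urf, gene, product_name(gene))
```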

ATP synthase-The two multicomponent subunits of this complex, F0 and F1, will be differentiated by suffixed letters (F1) and suffixed arabic numbers (F0).

Subunits of the F1 complex:
  Gene Gene product
  atpA ATPA
  ATPB-2  ATPB-2

Subunits of the F0 complex:
  Gene Gene product
  atp6 ATP6
  atp9  ATP9

Genes of other complexes, including complex III and complex IV
  Gene Gene product Description
  coxI COXI subunit 1 of cytochrome oxidase
  coxII COXII subunit 2 of cytochrome oxidase
  coxIII COXIII subunit 3 of cytochrome oxidase
  cob COB Apocytochrome B

Open reading frames and unidentified reading frames-An open reading frame, orf, is a gene for which no specific gene product has been identified. An unassigned reading frame, urf, is an open reading frame which is transcribed and translated into a polypeptide whose function has not been assigned. Individual orfs and urfs will be distinguished by the number of amino acids coded for by the open reading frame. A urf can also accept the size of the polypeptide in kilodaltons as its descriptor, for example:
  Gene                   Gene product    Comment
  T-urf115 or T-urf13    T-URF13         Gene coding for the 13 kDa polypeptide which is unique to the T-cytoplasm of maize.
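The two alternative descriptors can be illustrated with a short sketch. The function name `reading_frame_name` and its parameters are hypothetical; the caller decides whether the size passed in is the number of amino acids or, for a urf, the polypeptide size in kilodaltons.

```python
from typing import Optional


def reading_frame_name(kind: str, size: int,
                       cytoplasm: Optional[str] = None) -> str:
    """Name an orf or urf by its size descriptor.

    kind:      'orf' or 'urf'
    size:      number of amino acids, or (urf only) polypeptide size in kDa
    cytoplasm: optional cytoplasm identifier, prefixed with a hyphen
    """
    if kind not in ("orf", "urf"):
        raise ValueError("kind must be 'orf' or 'urf'")
    name = f"{kind}{size}"
    return f"{cytoplasm}-{name}" if cytoplasm else name


# The maize T-cytoplasm example, named by amino acid count and by kDa.
print(reading_frame_name("urf", 115, "T"))  # T-urf115
print(reading_frame_name("urf", 13, "T"))   # T-urf13
```

Because the two descriptors share one format, a name such as urf13 does not by itself say which measure was used; that has to be stated alongside the designation.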

Gene library and central registrar: It is hoped that during 1988 a gene library will be established at the Institute of Plant Science Research in Cambridge. The clones deposited will be freely available on request. As an adjunct to this facility, it is recommended that people wishing to designate a new gene first check that the nomenclature is consistent with this proposal. The contents of the library and the rules of gene nomenclature should be accessible via Bitnet using the following address: Lonsdale @ UK.AC.AFRC.CAMB.

D.M. Lonsdale and C.J. Leaver

[Ed. note: This proposed nomenclature, like that for chloroplasts, should be considered a working framework, just as are those for nuclear genomes; readers and users are urged to deliberate on these nomenclatures and to convey suggestions and criticisms to the authors. The editor would appreciate receiving copies of correspondence, toward internally consistent evolution of all nomenclatures]

Please Note: Notes submitted to the Maize Genetics Cooperation Newsletter may be cited only with consent of the authors.
