Molecular Biology Crash Course: 5 Must-Know Gene Nomenclature Rules For Every Biology Researcher

We humans love to name and categorize things. Naming natural phenomena, mechanisms, organisms, and molecules is one of the key roles of biologists. Carl Linnaeus formalized taxonomy in 1735, which was a crucial innovation in zoology and botany.1 Likewise, modern-day biologists have developed and applied systems for naming and categorizing the molecular underpinnings of life. Eukaryotes tend to have tens of thousands of genes, so applying a robust system of nomenclature has been a major challenge in molecular biology. In this article, I would like to introduce some of the most important rules of gene nomenclature in life science research.

1. Consult the best resource for each species

Genetics has become a leading field of interest in biology. In the early 1970s, the first viral gene sequences were published, which was a landmark for research in microbiology.2

At first, naming genes was a free-for-all activity, but then problems emerged. Life sciences researchers often had issues with duplicate, conflicting3, or unusual names.4 So, research groups formalized nomenclature rules and public databases were established, ensuring that all researchers have access to the most widely accepted nomenclature for each species.

Here are some established guidelines for the gene nomenclature of common model organisms that molecular biology researchers should know:

Mouse: Mouse nomenclature at MGI5

Rat: Rat genome database nomenclature resources6

Saccharomyces cerevisiae: SGD nomenclature conventions7

Zebrafish: ZFIN zebrafish naming conventions8

Arabidopsis thaliana: TAIR Arabidopsis nomenclature9

Of course, these are by no means exhaustive, as there are dozens of other model species. A web search can often easily uncover resources for the species that you are studying as part of your research in the life sciences.

2. Apply the correct typographical rules for each species

If I saw a sentence like this in a paper describing RT-qPCR of mouse samples, I would alter it.

“GAPDH was used as the reference gene.”

As mentioned previously, various research groups from disparate fields within molecular biology have applied differing rules for assigning gene names. This has in turn led to the emergence of species-specific typographical rules in life sciences research.

For humans and many mammalian species, the gene symbol is GAPDH. However, capitalization varies in some other species. For example, in the two most studied murine species in biochemistry and drug discovery, namely mice and rats, this gene is symbolized as Gapdh. Furthermore, it is written as gapdh for other model organisms, such as the African clawed frog and zebrafish. Do check the available resources and ensure that you are following the accepted name for the gene in question when writing your research paper.
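
The capitalization pattern described above can be sketched as a small Python helper. This is only an illustration under stated assumptions: the species keys, the CASE_STYLE table, and the function name format_symbol are hypothetical, and the table encodes nothing more than the Gapdh example from this section. Real symbols have exceptions, so the species database remains the authority.

```python
# Capitalization styles taken from the GAPDH example in this section.
# The species keys and style labels are illustrative assumptions.
CASE_STYLE = {
    "human": "upper",            # GAPDH
    "mouse": "capitalized",      # Gapdh
    "rat": "capitalized",        # Gapdh
    "zebrafish": "lower",        # gapdh
    "african clawed frog": "lower",  # gapdh
}


def format_symbol(symbol: str, species: str) -> str:
    """Return the symbol in the typical letter case for the given species."""
    style = CASE_STYLE[species.lower()]  # raises KeyError for unknown species
    if style == "upper":
        return symbol.upper()
    if style == "lower":
        return symbol.lower()
    # "capitalized": first character upper case, the rest lower case
    return symbol[:1].upper() + symbol[1:].lower()


print(format_symbol("GAPDH", "mouse"))
print(format_symbol("GAPDH", "zebrafish"))
```

A helper like this is useful for spotting likely case errors in a draft, not for generating official symbols.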

3. Gene names can vary among organisms

Gene names are often shared among organisms, apart from minor differences in typography. However, we cannot assume that this is universal. For instance, consider the gene elongation factor 1-alpha 1. In many species, including humans, zebrafish, and rats, this gene is referred to as EEF1A1, with some variations in capitalization. However, just as botany and zoology take different approaches to taxonomy, the most commonly used gene symbols also differ between kingdoms. Although the name “elongation factor 1-alpha 1” is the same in Arabidopsis, its gene symbol is A1.10 Despite the different symbols, it is fundamentally the same gene.

In all, as a life sciences researcher, you must always ensure that the most commonly accepted name is used in your research manuscripts to prevent ambiguity.

4. Genes and their products are written differently

The name of a gene is almost always related to its products. But this doesn’t mean that they are the same. Consider the following example:

“Western blotting was used to confirm the expression of PRKN in different lysates.”

As anybody with basic biochemistry knowledge would know, western blotting is used to assess protein expression. As the products of genes do not follow the same naming, it would not be accurate to specify the gene name when describing a western blot in your research paper. Here, it would be more accurate to write “parkin” or another one of its common protein names.11

The converse would be true for a PCR experiment. As PCR amplifies nucleic acids, we should use the gene symbol in such a case. Note that mRNA symbols are presented similarly to their corresponding gene symbols in life sciences research.

5. Watch out for these errors

Finally, here are some more common errors I have encountered that must be avoided at all costs.

In an immunology paper, I once saw “KLRC 1.” This doesn’t look too bad, but gene symbols do not contain spaces. It is easy to overlook, considering that protein names do not have these kinds of restrictions.
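
A check for this kind of stray space is easy to sketch in Python. The function name find_split_symbols and the list of approved symbols passed to it are illustrative assumptions, and the sketch only handles symbols made of letters followed by digits, such as KLRC1.

```python
import re


def find_split_symbols(text, approved_symbols):
    """Find places where an approved symbol was written with a space
    between its letters and its trailing digits (e.g. "KLRC 1")."""
    hits = []
    for symbol in approved_symbols:
        parts = re.fullmatch(r"([A-Za-z]+)(\d+)", symbol)
        if parts is None:
            continue  # this sketch skips symbols of any other shape
        letters, digits = parts.groups()
        pattern = re.compile(
            rf"\b{re.escape(letters)}\s+{re.escape(digits)}\b"
        )
        for found in pattern.finditer(text):
            hits.append((found.group(0), symbol))
    return hits


sentence = "Expression of KLRC 1 was elevated in NK cells."
print(find_split_symbols(sentence, ["KLRC1"]))
```

Each hit pairs the text as written with the symbol it probably stands for, so the author can confirm the correction by hand.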

Let’s talk about a pharmacology paper that mentioned the relevance of SREBF2 in drug treatment. This gene was presented as SREFB2 in more than one instance in the research paper. When dealing with tight deadlines, innocent typographical errors like these can fly under the radar, so it is best to remain vigilant.
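
Transposed letters like these can be caught by comparing each symbol in a manuscript against a list of approved symbols. Below is a minimal sketch using the standard-library difflib module. The function name flag_possible_typos, the 0.8 similarity cutoff, and the short approved list are assumptions chosen for illustration; in practice, the list would come from the relevant nomenclature database.

```python
import difflib


def flag_possible_typos(tokens, approved_symbols, cutoff=0.8):
    """Map each unapproved token to the approved symbol it most resembles.

    Tokens that are already approved, or that resemble nothing in the
    approved list, are left out of the result.
    """
    approved_set = set(approved_symbols)
    flagged = {}
    for token in tokens:
        if token in approved_set:
            continue
        close = difflib.get_close_matches(
            token, approved_symbols, n=1, cutoff=cutoff
        )
        if close:
            flagged[token] = close[0]
    return flagged


approved = ["SREBF1", "SREBF2", "GAPDH"]
print(flag_possible_typos(["SREBF2", "SREFB2", "parkin"], approved))
```

A flagged token is only a suggestion. Some valid symbols differ by a single character, so every hit still needs a human decision.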

Then, there are times when similar names are used interchangeably. While there can be more than one name for a given gene, we should pick one and use it consistently. For example, in a drug discovery paper on the D(1B) dopamine receptor, it would not be appropriate to alternate among DRD5, DRD1B, and DRD1L2,12 even if all three are acceptable symbols for the same gene.
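
Mixed usage of this kind can also be detected automatically. The sketch below assumes a hand-built table of alias groups. The name ALIAS_GROUPS, its layout, and the function mixed_aliases are hypothetical, and the single entry is taken from the dopamine receptor example above.

```python
import re

# Each key is the symbol chosen for the manuscript; the set holds every
# symbol that refers to the same gene. Built by hand for illustration.
ALIAS_GROUPS = {
    "DRD5": {"DRD5", "DRD1B", "DRD1L2"},
}


def mixed_aliases(text, alias_groups):
    """Report genes for which more than one alias appears in the text."""
    words = set(re.findall(r"[A-Za-z0-9]+", text))
    report = {}
    for preferred, aliases in alias_groups.items():
        used = sorted(aliases & words)
        if len(used) > 1:
            report[preferred] = used
    return report


draft = "DRD5 binds dopamine. DRD1B expression increased after treatment."
print(mixed_aliases(draft, ALIAS_GROUPS))
```

An empty result means that at most one alias per gene was found, which is the consistency this rule asks for.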

Conclusions

To summarize, gene nomenclature can seem tricky given the immense number of genes, especially if you’ve mainly been studying less closely related life sciences fields like ecology or pharmacy. However, it is governed by simple and easily accessible rules. Many journals and publication houses offer their own guidance on gene nomenclature.13 Check and double-check your nomenclature and consider using Editage’s English Editing Service, which allows you to work with experienced editors for the best chance of moving your manuscript closer to publication.

References

  1. Müller-Wille, S. Collection and collation: theory and practice of Linnaean botany. Stud. Hist. Philos. Biol. Biomed. Sci. 38, 541–562 (2007).
  2. Min Jou, W., Haegeman, G., Ysebaert, M. & Fiers, W. Nucleotide sequence of the gene coding for the bacteriophage MS2 coat protein. Nature 237, 82–88 (1972).
  3. Obstacles of nomenclature. Nature 389, 1 (1997).
  4. Seringhaus, M. R., Cayting, P. D. & Gerstein, M. B. Uncovering trends in gene naming. Genome Biol. 9, 401 (2008).
  5. Mouse Nomenclature Home Page. MGI http://www.informatics.jax.org/mgihome/nomen/ (2022).
  6. Rat Nomenclature Guidelines. RGD https://rgd.mcw.edu/nomen/nomen.shtml (2022).
  7. SGD Help: Nomenclature Conventions. SGD https://sites.google.com/view/yeastgenome-help/community-help/nomenclature-conventions (2022).
  8. ZFIN Zebrafish Nomenclature Conventions. ZFIN https://zfin.atlassian.net/wiki/spaces/general/pages/1818394635/ZFIN+Zebrafish+Nomenclature+Conventions (2022).
  9. Arabidopsis Nomenclature. The Arabidopsis Information Resource https://www.arabidopsis.org/portals/nomenclature/guidelines.jsp (2022).
  10. UniProtKB – P0DH99 (EF1A1_ARATH). UniProt https://www.uniprot.org/uniprot/P0DH99 (2022).
  11. UniProtKB – O60260 (PRKN_HUMAN). UniProt https://www.uniprot.org/uniprot/O60260#names_and_taxonomy (2022).
  12. UniProtKB – P21918 (DRD5_HUMAN). UniProt https://www.uniprot.org/uniprot/P21918 (2022).
  13. Gene/Protein Nomenclature Guidelines and Requirements. Mol. Hum. Reprod. https://academic.oup.com/molehr/pages/Gene_And_Protein_Nomenclature (2022).
