Last modified January 14, 2013
NOTE: this website has been frozen since May 1, 2016. It has been
replaced by a new version at http://www.HGVS.org/varnomen.
These pages serve as an archival copy only.
The recommendations for the description of sequence variants are designed
to be stable, meaningful, memorable
and unequivocal. Still, every now and then small
modifications will need to be made to remove small inconsistencies and/or
to clarify confusing conventions. In addition, the recommendations may be
extended to resolve cases that were hitherto not covered. To allow users
to specify up to what point they follow the HGVS recommendations we have
started to work with version numbers.
As of now, any change in the recommendations will get a
new version number based on the date of the change. Both in the version
list, and on the page giving details of the change, it will be
clearly marked using a format like date
2012-08-31. The version of the HGVS
recommendations including that change will be version
At the top of all pages on this site you will also find a Last
modified date. This date indicates when the respective
page was modified last. When this includes changes/extensions of the HGVS
recommendation, the version number of the recommendation will also change.
Note however that it often happens that simply a typing error was
corrected, an example was added, an explanation was further clarified, a
question answered, etc. In such cases the recommendations do not actually
change and the version number will thus also not change.
Version 0 - On the page "History of
the description of sequence variants" we give an overview of all
publications on the description of sequence variants. These papers can be
considered as pre-versions of the first recommendations, a version 0.
Version 1 - we
consider the 2000 publication of den Dunnen JT and Antonarakis SE (Mutation
nomenclature extensions and suggestions to describe complex mutations: a
15 (1): 7-12) as a more formal set of recommendations,
i.e. version 1.
Version 2 - We are currently preparing a new
publication that will summarize the current HGVS recommendations. The most
significant and latest changes for version 2.0 compared to version 1.0 are:
- Reference sequence - the recommendation is
to use a Locus Reference Genomic
sequence (LRG) (Dalgleish
et al. 2010) as the reference sequence for variant
descriptions. LRGs support descriptions using both genomic and coding
DNA reference sequences and have been specifically made for application
in a diagnostic setting (see Reference
In addition, indicators for new types of reference sequences have been
added (e.g. m. and n., see Standards)
as well as indicators to specify different transcripts / protein
isoforms generated from one gene (see
- Definitions - to enhance clarity as well as
to facilitate computational analysis and description of sequence
variants, the basic types of variants had to be defined more strictly.
In addition descriptions have been prioritized, meaning that when a
description is possible according to several classes, e.g. as a
duplication or an insertion, one specific class is preferred (for an
overview see Standards - definitions).
- Pre-existing standards - several scientists
have pointed out that we have thus far neglected the fact that some
standards already existed before those for the description of
sequence changes were made. It is thus essential that we follow these
standards in our recommendations. The most important of these are the
pre-existing standards from the IUPAC (International Union of Pure
and Applied Chemistry) and IUBMB (International Union of
Biochemistry and Molecular Biology) for the description of nucleic
acids and amino acids (see below). These include letter codes to
describe incompletely specified residues at both DNA and protein
level (see Standards).
The most controversial of these changes is that where the
description of the stop codon at protein/amino acid
level changed from 'X' to 'Ter'/'*' since
'X' in the IUPAC-IUB nomenclature means an "unspecified"
or "unknown" amino acid.
- Incorporate ISCN standards - to describe
microscopically visible chromosomal changes, the cytogenetics community
uses the ISCN (International System for Human Cytogenetic
Nomenclature) standards (see ISCN-2005);
the latest update is from 2009 (editors Lisa Shaffer, Marilyn Slovak,
Lynda Campbell). Initially only direct chromosome spreads were used;
later, hybridisation technologies like FISH (Fluorescent In Situ
Hybridisation) and arrays (arrayCGH, SNP-arrays) were introduced to
determine the state of specific sequences tested. On
the HGVS pages we have since 2005 suggested ways to describe changes
detected using such technologies (see
Uncertainties). These recommendations have now matured and
been incorporated. Furthermore, where possible, we have
incorporated established ISCN standards in the HGVS recommendations.
Examples include the use of "/" to describe somatic variants and
"//" for chimerism (see Standards).
- Simplification - in the 2000 recommendations
(v1.0), some symbols were used for more than one purpose, which may lead
to confusion. For example, the "+" character was used both in
nucleotide numbering (indicating an intronic position) and to separate
two alleles, while for the latter the ";" character was also used. The
recommendation is now to use only ";". A complete overview of the
characters and codes used can be found at the Standards page.
- Prediction / experimental proof - it is
often not clear whether a description of a variant at protein level is
based on experimental evidence or merely a prediction based on what was
detected at DNA level. To make this distinction more obvious, the
recommendation is to describe the variant at protein level between
brackets, like p.(Arg12Gly), when it is a prediction based on DNA data
only. When RNA has been analysed, and some experimental evidence exists
to support the prediction, the variant may be described without
brackets, like p.Arg12Gly.
- Repeated sequences - the 2000
recommendations were not very specific regarding the description of
variability in repeated sequences (mono-, di-, tri-nucleotide stretches,
etc.). Recommendations for the description of such variability have now
been set (see Recommendations). The
format designed is also used to describe more complex copy number
variation of larger stretches of DNA, e.g. the presence of two
additional copies of one or more exons of a gene, often with the
breakpoints not fully characterised.
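
Two of the protein-level conventions listed above (brackets for a predicted consequence, and 'Ter'/'*' instead of 'X' for a stop codon) can be illustrated with a minimal Python sketch. The function names protein_description and modernize_stop are hypothetical and are not part of any HGVS tool. The sketch also assumes that an 'X' directly following a residue number in an older description was meant as a stop codon, and it does not attempt to handle other cases such as frame shifts.

```python
import re


def protein_description(change: str, experimentally_confirmed: bool) -> str:
    """Format a protein-level change such as 'Arg12Gly'.

    Hypothetical helper: a consequence that is only predicted from DNA
    data is placed between brackets, one supported by experimental
    evidence (e.g. RNA analysis) is written without them.
    """
    if experimentally_confirmed:
        return f"p.{change}"
    return f"p.({change})"


# Matches an 'X' that directly follows the residue number, e.g. in Arg12X.
# Assumption: in a legacy (v1.0-style) description this 'X' means "stop".
_LEGACY_STOP = re.compile(r"(?<=\d)X\b")


def modernize_stop(description: str, symbol: str = "Ter") -> str:
    """Replace a legacy stop-codon 'X' by 'Ter' or '*'."""
    if symbol not in ("Ter", "*"):
        raise ValueError("symbol must be 'Ter' or '*'")
    # A function is used as replacement so the symbol is inserted literally.
    return _LEGACY_STOP.sub(lambda match: symbol, description)


if __name__ == "__main__":
    print(protein_description("Arg12Gly", experimentally_confirmed=False))  # p.(Arg12Gly)
    print(protein_description("Arg12Gly", experimentally_confirmed=True))   # p.Arg12Gly
    print(modernize_stop("p.Arg12X"))                                       # p.Arg12Ter
    print(modernize_stop("p.(Arg12X)", symbol="*"))                         # p.(Arg12*)
```

Whether an existing 'X' should be read as a stop codon or as an unspecified amino acid depends on which version of the recommendations the original author followed, so a conversion like this should only be applied to descriptions known to follow the older convention.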
See the Version list for
additional changes and the latest version number.
© HGVS 2010 All Rights Reserved
Website Created by Rania Horaitis, Nomenclature by J.T. Den Dunnen