Recommendations for the description of protein sequence variants (v2.0)


Last modified November 20, 2015

NOTE: this website has been frozen since May 1, 2016. It has been replaced by a new version at http://www.HGVS.org/varnomen. These pages serve as an archival copy only.


Protein level

(suggestions extending the published recommendations in italics)


NOTE: definitions of protein changes have been extensively reviewed (2013-Q2). This did not affect HGVS recommendations for variant descriptions but it did change under which category specific types are listed below. For example, where a nonsense variant (p.Trp26Ter or p.W26*) was originally listed under Substitutions it is now listed under Deletions.

The recommendations for the description of protein variants explain how changes in the sequence of a protein should be described. It should be noted that these changes are a consequence of a variant at DNA level that may or may not have influenced the processing of the RNA before it is translated into protein. Experimental evidence of protein level variants, e.g. from mass spectrometry amino acid sequencing, will rarely exist. In some cases indirect evidence might come from protein sizing (Western blot analysis) or localisation (immuno-histochemical staining). In most cases protein descriptions will however be deduced only, predicted from the changes detected on DNA and/or RNA level.

Specific terms are used to describe the consequences of a change at protein level, like missense, nonsense, silent and frame shift. These terms are not used in the descriptions given below. Missense is under substitution, nonsense under deletion, silent under no change and frame shift under deletion/insertion (indel).

General

Sequence changes at protein level are described like those at the DNA level, with the following modifications / additions:

Amino acid coding and numbering


Silent changes

So-called "silent" changes (no change at protein level) are described using the format p.(Leu54=) (see SVD-WG001). The format p.(Leu54Leu) (or p.(L54L)) should not be used. These descriptions can only be given in addition to a description at DNA level (see Discussion).


Substitutions

Substitutions (missense changes) replace one amino acid by one other amino acid and are described using the format p.Trp26Cys. The description does not use the ">"-character used on DNA- and RNA level (indicating "changes to").
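
As an illustration of the format above, the following minimal Python sketch builds a substitution description from one-letter input. The function name describe_substitution and the lookup table AA3 are hypothetical helpers introduced here; they are not part of the recommendations.

```python
# One-letter to three-letter codes for the 20 standard amino acids.
ONE = "ARNDCQEGHILKMFPSTWYV"
THREE = ("Ala Arg Asn Asp Cys Gln Glu Gly His Ile Leu Lys Met "
         "Phe Pro Ser Thr Trp Tyr Val").split()
AA3 = dict(zip(ONE, THREE))


def describe_substitution(ref: str, position: int, alt: str) -> str:
    """Format a missense change, e.g. ("W", 26, "C") -> "p.Trp26Cys"."""
    if ref == alt:
        raise ValueError("identical residues: this is not a substitution")
    if position < 1:
        raise ValueError("amino acid positions are numbered from 1")
    # No ">" character is used at protein level.
    return f"p.{AA3[ref]}{position}{AA3[alt]}"


print(describe_substitution("W", 26, "C"))  # p.Trp26Cys
```

The sketch rejects identical residues because, per the section on silent changes, such a case is not described in the substitution format.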


Deletions

Deletions remove one or more amino acid residues from the protein and are described using "del" after an indication of the first and last amino acid(s) deleted separated by a "_" (underscore). Deletions remove either a small internal segment of the protein (in-frame deletion), part of the N-terminus of the protein (initiation codon change) or the entire C-terminal part of the protein (nonsense change). A nonsense change is a special type of deletion removing the entire C-terminal part of a protein starting at the site of the variant (specified 2013-03-16).

NOTE: for all descriptions, the most C-terminal position possible is arbitrarily assigned to have been changed.
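
The rule in the NOTE above can be sketched in Python as follows. The function name describe_deletion and the table AA3 are hypothetical, and the sketch assumes that a single deleted residue is written without an underscore range (for example p.Gln8del). The sequence MKMGHQQQCC is the example sequence used elsewhere on this page.

```python
# One-letter to three-letter codes for the 20 standard amino acids.
ONE = "ARNDCQEGHILKMFPSTWYV"
THREE = ("Ala Arg Asn Asp Cys Gln Glu Gly His Ile Leu Lys Met "
         "Phe Pro Ser Thr Trp Tyr Val").split()
AA3 = dict(zip(ONE, THREE))


def describe_deletion(seq: str, start: int, end: int) -> str:
    """Describe an in-frame deletion of residues start..end (1-based, inclusive)."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError("deleted range must lie within the sequence")
    # Shift the range towards the C-terminus while the residue following the
    # range equals the first deleted residue: the resulting protein sequence
    # is identical, so the most C-terminal position is the one reported.
    while end < len(seq) and seq[end] == seq[start - 1]:
        start += 1
        end += 1
    first = f"{AA3[seq[start - 1]]}{start}"
    if start == end:
        return f"p.{first}del"  # assumption: single residue, no range
    return f"p.{first}_{AA3[seq[end - 1]]}{end}del"


print(describe_deletion("MKMGHQQQCC", 6, 6))  # p.Gln8del
print(describe_deletion("MKMGHQQQCC", 6, 7))  # p.Gln7_Gln8del
```

Deleting any single Gln from the Gln6 to Gln8 stretch gives the same protein, which is why all three inputs collapse to the description at position 8.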


Duplications

Duplications are described using "dup" after an indication of the first and last amino acid(s) duplicated separated by a "_" (underscore). In-frame duplications containing a translation stop codon in the duplicated sequence are described as an insertion of a nonsense variant, not as a deletion-insertion removing the entire C-terminal amino acid sequence.

NOTE: for all descriptions, the most C-terminal position possible is arbitrarily assigned to have been changed.


Insertions

Insertions add one or more amino acid residues between two existing amino acids, where the inserted sequence is not a copy of the sequence immediately 5'-flanking (see Duplication). Insertions are described using "ins" after an indication of the amino acids flanking the insertion site, separated by a "_" (underscore) and followed by a description of the amino acid(s) inserted. In-frame insertions containing a translation stop codon in the inserted sequence are described as an insertion of a nonsense variant, not as a deletion-insertion removing the entire C-terminal amino acid sequence. Since for large insertions the amino acids can be derived from the DNA and/or RNA descriptions, they need not be described exactly; the total number may be given instead (like "ins17").

NOTE: duplicating insertions should be described as duplications (see Discussion), not as insertions.
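
To illustrate how an inserted stretch can be told apart from a duplication, the Python sketch below first shifts the insertion site as far C-terminally as possible and then checks whether the inserted residues copy the residues directly preceding the site. The names describe_insertion and AA3 are hypothetical; the single-residue form (p.Gln8dup) and the handling of the corner case at the end of the sequence are assumptions of this sketch.

```python
# One-letter to three-letter codes for the 20 standard amino acids.
ONE = "ARNDCQEGHILKMFPSTWYV"
THREE = ("Ala Arg Asn Asp Cys Gln Glu Gly His Ile Leu Lys Met "
         "Phe Pro Ser Thr Trp Tyr Val").split()
AA3 = dict(zip(ONE, THREE))


def describe_insertion(seq: str, after: int, inserted: str) -> str:
    """Describe residues `inserted` placed between positions after and after+1."""
    if not 1 <= after < len(seq) or not inserted:
        raise ValueError("insertion must lie between two existing residues")
    pos = after
    # Shift C-terminally: if the next reference residue equals the first
    # inserted residue, the same protein results from inserting the rotated
    # stretch one position further on.
    while pos < len(seq) and seq[pos] == inserted[0]:
        inserted = inserted[1:] + inserted[0]
        pos += 1
    n = len(inserted)
    if n <= pos and seq[pos - n:pos] == inserted:
        # The inserted stretch copies the residues just before the site.
        start = pos - n + 1
        first = f"{AA3[seq[start - 1]]}{start}"
        if n == 1:
            return f"p.{first}dup"  # assumption: single residue, no range
        return f"p.{first}_{AA3[seq[pos - 1]]}{pos}dup"
    if pos == len(seq):
        raise ValueError("site shifted past the last residue; not handled here")
    residues = "".join(AA3[aa] for aa in inserted)
    return f"p.{AA3[seq[pos - 1]]}{pos}_{AA3[seq[pos]]}{pos + 1}ins{residues}"


print(describe_insertion("MKMGHQQQCC", 5, "Q"))   # p.Gln8dup
print(describe_insertion("MKMGHQQQCC", 2, "AS"))  # p.Lys2_Met3insAlaSer
```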


Variability of short sequence repeats

Variability of short sequence repeats is described as p.Gln6(3_6); the description indicates that a stretch of Glutamines (Gln, Q) is present, starting at amino acid position 6 (e.g. in MKMGHQQQCC), which is found with a variable length from 3 to 6 in the population.

NOTE: the underscore is used to indicate the range (3 to 6 times).
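
A description of this form can be taken apart with a regular expression, as in the hedged Python sketch below. The names REPEAT, parse_repeat and run_length are hypothetical, and the pattern only covers the exact shape of the p.Gln6(3_6) example.

```python
import re

# Matches e.g. "p.Gln6(3_6)": residue, start position, minimum and maximum count.
REPEAT = re.compile(r"p\.([A-Z][a-z]{2})(\d+)\((\d+)_(\d+)\)")


def parse_repeat(description: str):
    """Return (residue, start, low, high) for a repeat description."""
    match = REPEAT.fullmatch(description)
    if match is None:
        raise ValueError(f"not a repeat description: {description!r}")
    residue = match.group(1)
    start, low, high = (int(g) for g in match.groups()[1:])
    return residue, start, low, high


def run_length(seq: str, start: int, residue: str) -> int:
    """Count consecutive copies of a one-letter residue from position start (1-based)."""
    count = 0
    while start - 1 + count < len(seq) and seq[start - 1 + count] == residue:
        count += 1
    return count


print(parse_repeat("p.Gln6(3_6)"))          # ('Gln', 6, 3, 6)
print(run_length("MKMGHQQQCC", 6, "Q"))     # 3
```

In the example sequence the Gln stretch starting at position 6 has length 3, which falls within the described range of 3 to 6.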


Deletion/insertions (indels)

Deletion/insertions (indels) replace one or more amino acid residues with one or more other amino acid residues. Deletion/insertions are described using "delins" as a deletion followed by an insertion after an indication of the amino acid(s) deleted separated by a "_" (underscore, see Discussion). Frame shifts are a special type of amino acid deletion/insertion affecting an amino acid between the first (initiation, ATG) and last codon (termination, stop), replacing the normal C-terminal sequence with one encoded by another reading frame (specified 2013-10-11). A frame shift is described using "fs" after the first amino acid affected by the change. Descriptions use either a short ("fs") or a long ("fsTer#") form. The description of frame shifts does not include the deletion at protein level from the site of the frame shift to the natural end of the protein (stop codon). The inserted amino acid residues are not described; only the total length of the new shifted frame is given (i.e. including the first amino acid changed).
NOTE: typing error in "den Dunnen & Antonarakis (2000)". The suggestion to use ">" to indicate "delins" in frame shift descriptions has been retracted.
NOTE: when one amino acid is replaced by one other amino acid the change is called a substitution.

NOTE: the changes observed should be described at protein level; the description should not try to incorporate any knowledge regarding the change at DNA level (see Recommendation). Thus, p.His150Hisfs*10 is not correct, but p.Gln151Thrfs*9 is.
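
The point of the NOTE above can be shown with a small Python sketch that derives a frame shift description purely from a comparison of the reference and variant protein sequences. The names describe_frameshift and AA3 are hypothetical. The sketch assumes one-letter sequences in which "*" marks the stop, and that the number after "fs*" is the position of the new stop counted from the first changed amino acid (which is counted as 1), in line with the p.Gln151Thrfs*9 example.

```python
# One-letter to three-letter codes for the 20 standard amino acids.
ONE = "ARNDCQEGHILKMFPSTWYV"
THREE = ("Ala Arg Asn Asp Cys Gln Glu Gly His Ile Leu Lys Met "
         "Phe Pro Ser Thr Trp Tyr Val").split()
AA3 = dict(zip(ONE, THREE))


def describe_frameshift(ref: str, var: str, long_form: bool = True) -> str:
    """Describe a frame shift from reference and variant protein sequences."""
    # Locate the first amino acid that differs at protein level.
    for i, (r, v) in enumerate(zip(ref, var)):
        if r != v:
            break
    else:
        raise ValueError("no difference found in the shared part of the sequences")
    if r == "*" or v == "*":
        raise ValueError("change involves a stop codon; not handled in this sketch")
    position = i + 1
    if not long_form:
        return f"p.{AA3[r]}{position}fs"
    stop = var.find("*", i)
    if stop == -1:
        raise ValueError("variant sequence contains no stop")
    # Length of the new frame, counting the first changed residue as 1.
    return f"p.{AA3[r]}{position}{AA3[v]}fs*{stop - i + 1}"


# Hypothetical sequences: His150 is unchanged, Gln151 is the first residue affected.
prefix = "M" + "A" * 148 + "H"
reference = prefix + "QGSLKV*"
variant = prefix + "TRRRRRRR*"
print(describe_frameshift(reference, variant))                   # p.Gln151Thrfs*9
print(describe_frameshift(reference, variant, long_form=False))  # p.Gln151fs
```

Because the comparison starts from the protein sequences, an unchanged His150 can never appear as the first affected residue, whatever the position of the underlying DNA change.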


Extensions

Extensions affect either the first (start, translation initiation, N-terminus, ATG) or last codon (translation termination, stop) and as a consequence extend the protein sequence N- or C-terminally with one or more amino acids. Extensions are described using "ext" after a description of the change at the first amino acid affected and followed by a description of the position of the new translation initiation or termination codon.


More changes in one individual

Two or more changes in one individual are described by combining the changes, per chromosome (maternal and paternal), between square brackets ("[;];[;]") and using a semicolon (";") as separator: "[first change maternal; second change maternal] ; [first change paternal; second change paternal]". When changes are in different genes on different chromosomes a space (" ") is used to separate the different chromosomes ("[;] [;]").
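
The bracket pattern "[;];[;]" can be assembled as in the following short Python sketch. The function name combine_alleles is hypothetical, the input descriptions are illustrative values given without the "p." prefix, and where that prefix is placed relative to the brackets is not specified in this section, so the sketch leaves it out.

```python
def combine_alleles(maternal, paternal):
    """Join per-chromosome change lists into the "[;];[;]" pattern."""
    def allele(changes):
        if not changes:
            raise ValueError("each allele needs at least one change")
        # Changes on the same chromosome are separated by a semicolon.
        return "[" + ";".join(changes) + "]"

    # The two chromosomes are separated by a semicolon between the brackets.
    return allele(maternal) + ";" + allele(paternal)


print(combine_alleles(["Trp26Cys", "Gln8del"], ["Gln151Thrfs*9"]))
# [Trp26Cys;Gln8del];[Gln151Thrfs*9]
```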



Copyright © HGVS 2010 All Rights Reserved
Website Created by Rania Horaitis, Nomenclature by J.T. Den Dunnen - Disclaimer