Description of sequence changes: 
examples protein-level

Last modified November 16, 2015

Since references to WWW-sites are not yet acknowledged as citations, please mention den Dunnen JT and Antonarakis SE (2000). Hum.Mutat. 15:7-12 when referring to these pages.



This page gives examples of the description of sequence variants at protein level; examples describing changes at DNA and RNA level are given on other pages. All examples are described relative to a reference sequence, here the amino acid (protein) sequence.

Reference sequence

| Part of gene | protein Reference Sequence (amino acid numbering) | coding DNA Reference Sequence (nucleotide numbering) | genomic Reference Sequence (nucleotide numbering) |
|---|---|---|---|
| 5' gene flanking region | - | (-300 to -31) | 1 to 270 |
| exon 1, 5' UTR | - | -30 to -1 | 271 to 300 |
| exon 1, coding region | 1 to 4 | 1 to 12 | 301 to 312 |
| intron 1 | - | 12+1 ... 12+50, 13-50 ... 13-1 | 313 to 412 |
| exon 2 | 5 to 29 (30) | 13 to 88 | 413 to 488 |
| intron 2 | - | 88+1 ... 88+100, 89-100 ... 89-1 | 489 to 688 |
| exon 3 | 30 to 41 | 89 to 123 | 689 to 723 |
| intron 3, contains rare alternatively spliced exon from 800 to 859 (coding DNA 123+77 to 123+136) | - | 123+1 ... 123+150, 124-150 ... 124-1 | 724 to 1023 |
| exon 4 | 42 to 100 | 124 to 300 | 1024 to 1200 |
| intron 4 | - | 300+1 ... 300+200, 301-200 ... 301-1 | 1201 to 1600 |
| exon 5, coding region | 101 to 109 | 301 to 330 | 1601 to 1630 |
| exon 5, 3' UTR, containing a (CA)7-stretch from nts 1700 to 1713 (coding DNA *70 to *83); poly-A addition site at 1825 (coding DNA *195) | - | *1 to *220 | 1631 to 1850 |
| 3' gene flanking region | - | (*221 to *370) | 1851 to 2000 |

Reference sequence of the imaginary gene used for the examples given on this page. Nucleotide +1 in the coding DNA reference sequence is the A of the ATG translation initiation codon. Abbreviations used: nt = nucleotide, nts = nucleotides, UTR = untranslated region of the mRNA. For a picture of part of this hypothetical sequence see Figure.
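The relation between the numbering schemes of the imaginary gene can be checked with simple arithmetic. The Python sketch below is only an illustration: the names `CODING_EXONS`, `codon_number` and `genomic_position` are hypothetical, and the coordinates are copied from the coding rows of the reference sequence table.

```python
# Coding parts of the exons of the imaginary gene, as
# (first coding DNA nt, last coding DNA nt, first genomic nt).
# Hypothetical helper table; values taken from the reference sequence table.
CODING_EXONS = [
    (1, 12, 301),      # exon 1, coding region
    (13, 88, 413),     # exon 2
    (89, 123, 689),    # exon 3
    (124, 300, 1024),  # exon 4
    (301, 330, 1601),  # exon 5, coding region
]


def codon_number(c_pos: int) -> int:
    """Return the number of the codon that contains coding DNA position c_pos."""
    if c_pos < 1:
        raise ValueError("only positions in the coding region belong to a codon")
    return (c_pos - 1) // 3 + 1


def genomic_position(c_pos: int) -> int:
    """Map an exonic coding DNA position to the genomic numbering."""
    for first, last, genomic_first in CODING_EXONS:
        if first <= c_pos <= last:
            return genomic_first + (c_pos - first)
    raise ValueError(f"coding DNA position {c_pos} is not in the coding region")


print(codon_number(88), codon_number(89))  # both nucleotides lie in codon 30
print(genomic_position(123))               # last nucleotide of exon 3
```

Coding DNA positions 88 and 89 both fall in codon 30, which is why amino acid 30 is listed for exon 2 (in parentheses) as well as for exon 3: the codon is split by intron 2.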


It should be noted that descriptions at protein level, even more than those at RNA level, are mostly deduced and not based on experimental evidence. Publications describing changes at protein level should make clear whether experimental proof was available. When changes are reported for which experimental proof is not available, one should consider listing them in parentheses, e.g. p.(Trp26Cys).

Sequence changes at protein level are basically described like those at the DNA level, with a few modifications:

Silent changes

Description of so-called "silent" changes in the format p.(Leu54Leu) (or p.(L54L)) should not be used. When desired, such changes can be described using p.(=). Descriptions should always be given at DNA level (see Discussion).
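As a minimal sketch of this rule, the Python function below compares a reference codon with a variant codon and returns p.(=) when the encoded amino acid does not change. The function name `predicted_protein_change` and the dictionary `CODON_TO_AA` are hypothetical; the dictionary is only an excerpt of the standard genetic code, sufficient for the two examples. The result is placed in parentheses because it is deduced from the DNA change, not observed experimentally.

```python
# Excerpt of the standard genetic code (DNA codons), just enough for the
# examples below. A real implementation would use the full table.
CODON_TO_AA = {
    "CTG": "Leu", "CTA": "Leu", "TTG": "Leu",
    "TGG": "Trp", "TGT": "Cys", "TGC": "Cys",
}


def predicted_protein_change(ref_codon: str, var_codon: str, codon_number: int) -> str:
    """Describe the predicted protein-level consequence of a codon change."""
    ref_aa = CODON_TO_AA[ref_codon]
    var_aa = CODON_TO_AA[var_codon]
    if ref_aa == var_aa:
        # Silent change: not p.(Leu54Leu), but p.(=)
        return "p.(=)"
    return f"p.({ref_aa}{codon_number}{var_aa})"


print(predicted_protein_change("CTG", "CTA", 54))  # p.(=)
print(predicted_protein_change("TGG", "TGT", 26))  # p.(Trp26Cys)
```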


Substitutions should be described without using the specific ">"-character which is used at DNA and RNA level (i.e. p.Trp26Cys, not p.Trp26>Cys).

NOTE: polymorphic variants are sometimes described as p.36Leu/Ile (p.36L/I) or p.36Leu/Leu (p.36L/L) but this is not correct (see Protein level recommendations).
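The substitution format can be illustrated with a short Python sketch that builds the description from one-letter amino acid codes. The names `ONE_TO_THREE` and `format_substitution` are hypothetical and the function is not part of any existing tool; it rejects identical amino acids, since silent changes should not be described in this format.

```python
# One-letter to three-letter codes for the 20 standard amino acids.
ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def format_substitution(ref: str, position: int, alt: str) -> str:
    """Return a protein-level substitution such as p.Trp26Cys (no '>' character)."""
    if ref == alt:
        raise ValueError("silent changes should not be described as substitutions")
    if position < 1:
        raise ValueError("amino acid positions start at 1")
    return f"p.{ONE_TO_THREE[ref]}{position}{ONE_TO_THREE[alt]}"


print(format_substitution("W", 26, "C"))  # p.Trp26Cys
```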


Deletions are designated by "del" after a description of the deleted segment, i.e. the first (and last) amino acid(s) deleted.

NOTE: for all descriptions the most C-terminal position possible is arbitrarily assigned to have been changed.
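The effect of the C-terminal rule on deletions can be sketched in Python as follows. The function `describe_deletion` is a hypothetical helper; it assumes the protein is given as a list of three-letter codes and that positions are 1-based. The deleted segment is shifted towards the C-terminus for as long as the shifted deletion yields the same protein.

```python
def describe_deletion(protein, first, last=None):
    """Describe the deletion of residues first..last (1-based, inclusive).

    protein is a list of three-letter amino acid codes.
    """
    if last is None:
        last = first
    if not 1 <= first <= last <= len(protein):
        raise ValueError("deleted segment lies outside the protein")
    # Shift one residue towards the C-terminus while the residue following
    # the segment equals the first residue of the segment: the resulting
    # protein is identical, so the most C-terminal position is reported.
    while last < len(protein) and protein[last] == protein[first - 1]:
        first += 1
        last += 1
    start = f"{protein[first - 1]}{first}"
    if first == last:
        return f"p.{start}del"
    return f"p.{start}_{protein[last - 1]}{last}del"


protein = ["Met", "Lys", "Lys", "Leu", "Ala"]
print(describe_deletion(protein, 2))     # p.Lys3del (shifted from Lys2)
print(describe_deletion(protein, 2, 3))  # p.Lys2_Lys3del
```

Deleting Lys2 or Lys3 from this hypothetical protein gives the same product, so only the description p.Lys3del is used.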


Duplications are designated by "dup" after a description of the duplicated segment, i.e. the first (and last) amino acid(s) duplicated.


Insertions are designated by "ins" after a description of the amino acids flanking the insertion site, followed by a description of the inserted amino acids. When the insertion is large it may be described by its length (e.g. p.Lys2_Leu3ins34). However, it should be possible to derive the inserted sequence from the description at DNA level. Duplicating insertions should be described as duplications (see Discussion).

When an insertion creates a new amino acid at the insertion junction, the change is described as an insertion/deletion (see indels).
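The insertion rules can be sketched in Python as below. The function `describe_insertion` and its parameter `max_listed` (the length above which only the size of the insertion is given) are hypothetical; the threshold of 10 is an arbitrary assumption, not a recommendation. The sketch recognises only an insertion that copies the residues directly adjacent to the insertion site as a duplication, does not shift along longer repeats, and does not handle insertions that change an amino acid at the junction.

```python
def describe_insertion(protein, after, inserted, max_listed=10):
    """Describe an insertion between residues `after` and `after + 1` (1-based).

    protein and inserted are lists of three-letter amino acid codes.
    """
    n = len(inserted)
    if n == 0 or not 1 <= after < len(protein):
        raise ValueError("insertion must lie between two existing residues")
    # Duplicating insertions are described as duplications. The copy that
    # follows the insertion site is checked first (more C-terminal).
    dup_first = None
    if protein[after:after + n] == inserted:
        dup_first = after + 1
    elif after >= n and protein[after - n:after] == inserted:
        dup_first = after - n + 1
    if dup_first is not None:
        dup_last = dup_first + n - 1
        start = f"{protein[dup_first - 1]}{dup_first}"
        if n == 1:
            return f"p.{start}dup"
        return f"p.{start}_{protein[dup_last - 1]}{dup_last}dup"
    flanks = f"{protein[after - 1]}{after}_{protein[after]}{after + 1}"
    # Large insertions may be described by their length only.
    payload = "".join(inserted) if n <= max_listed else str(n)
    return f"p.{flanks}ins{payload}"


protein = ["Met", "Lys", "Leu", "Ala"]
print(describe_insertion(protein, 2, ["Gln", "Ser"]))  # p.Lys2_Leu3insGlnSer
print(describe_insertion(protein, 2, ["Lys"]))         # p.Lys2dup
print(describe_insertion(protein, 2, ["Gly"] * 34))    # p.Lys2_Leu3ins34
```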


Translocations at protein level occur when a translocation at DNA level leads to the production of a fusion protein, joining the N-terminal end of the protein on one chromosome to the C-terminal end of the protein on the other chromosome (and vice versa). No recommendations have been made so far to describe protein translocations.

Complex rearrangements

Complex rearrangements are rearrangements which combine several of the six elementary content changes: substitution, deletion, duplication, insertion, inversion and translocation. Such rearrangements can be very complex and difficult to describe. Specific recommendations to describe such changes have not been made. Complex rearrangements are best described as a combination of the elementary changes.

Deletion/insertions (indels) are described as a deletion followed by an insertion (see Discussion).
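A minimal Python sketch of such a description is given below. The function `describe_delins` is a hypothetical helper, and writing the deletion and the insertion as one joined "delins" token is an assumption of this sketch; the proteins are given as lists of three-letter codes with 1-based positions.

```python
def describe_delins(protein, first, last, inserted):
    """Describe replacement of residues first..last (1-based) by `inserted`.

    protein and inserted are lists of three-letter amino acid codes.
    """
    if not inserted:
        raise ValueError("without inserted residues the change is a deletion")
    if not 1 <= first <= last <= len(protein):
        raise ValueError("deleted segment lies outside the protein")
    deleted = f"{protein[first - 1]}{first}"
    if last > first:
        deleted += f"_{protein[last - 1]}{last}"
    new_residues = "".join(inserted)
    # The deletion is given first, directly followed by the insertion.
    return f"p.{deleted}delins{new_residues}"


protein = ["Met", "Cys", "Lys", "Ala"]
print(describe_delins(protein, 2, 2, ["Trp", "Val"]))  # p.Cys2delinsTrpVal
print(describe_delins(protein, 2, 3, ["Trp"]))         # p.Cys2_Lys3delinsTrp
```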



Copyright HGVS 2007 All Rights Reserved
Website Created by Rania Horaitis, Nomenclature by J.T. Den Dunnen