Checklist for the description of sequence variants


Last modified July 28, 2013

Since references to WWW-sites are not yet acknowledged as citations, please mention den Dunnen JT and Antonarakis SE (2000). Hum.Mutat. 15:7-12 when referring to these pages.


Purpose

A look through published papers quickly shows where authors tend to deviate from the "Current recommendations for the description of sequence variants". The checklist below covers the most problematic issues and should help those preparing a publication to describe sequence variants in line with the current recommendations.


Checklist

  1. Reference Sequence - do you clearly describe the sequence used as a reference sequence?
    A publication should mention, preferably in the Materials & Methods section and/or Table legend, which sequence file was used as reference sequence for numbering of the residues (DNA, RNA and protein) and describing the variants; see Recommendations, Discussion and mtDNA variants.
  2. Intronic variants - do you indicate where the reference intron sequence can be found?
    The recommendation is to describe intronic variants in the format "c.89-2A>G" and not as "c.IVS4-2A>G" (see Discussion). When the format "c.IVS4-2A>G" is used, it is essential to give a clear reference for the intron / exon numbering and a reference for the intron sequence.
  3. Tabular overview - do you provide a clear, unequivocal overview of all changes reported?
    Preferably, a publication contains a tabular overview of all variants reported. This overview contains columns describing the change at DNA-level (absolutely essential) and, optionally, at RNA and protein level. When data at RNA and/or protein level are provided, it should be made clear whether the data were deduced or experimentally verified (e.g. state explicitly when RNA was analysed to confirm the putative splice variant detected).
  4. Insertions
  5. Most 3' position - do you correctly assign the change to the most 3' (or C-terminal for protein variants) position possible?
    For deletions, duplications and insertions the most 3' position possible is arbitrarily assigned to have been changed (see Recommendations); this is especially important in single-residue (nucleotide or amino acid) stretches and tandem repeats. Example: the change from ACTTTGTGCC to ACTTGCC is described as c.5_7delTGT (not as c.4_6delTTG).
  6. Recessive diseases - do you clearly describe which changes are found in which combination?
    A publication describing sequence changes found in patients suffering from a recessive disease should explicitly mention, for each patient, which combination of (pathogenic) changes was identified (see Recommendations). Example: c.[76C>T]+[87G>A] or c.[76C>T]+[?].
    NOTE: this description differs from that describing several changes in one allele, which has the format c.[76A>C; 113G>C].
  7. Range - is the sign used to indicate a range a "_" (underscore) and not a "-" (minus)?
    To prevent confusion, the underscore should be used to indicate a range and not the minus sign. The minus sign should only be used to indicate negative numbers. The correct description to indicate a deletion of the coding DNA nucleotides 12 to 14 is c.12_14del. Not correct is c.12-14del, which describes a deletion of nucleotide -14 in the intron directly preceding cDNA nucleotide 12 (see Discussion).
  8. Deletion - do you indicate the first and last residue involved in a deletion?
    A deletion of more than one residue should mention the first and last residue deleted, separated using a "_" (underscore), e.g. c.21_24del or p.Ala13_Gln16del. Descriptions like c.21del3 should not be used.
  9. Describe at DNA-level - do you describe all changes reported at DNA-level?
    All changes reported must be described at DNA-level.
  10. RNA protein level descriptions
    Recommendations exist to describe alternative transcripts deriving from one allele (see Recommendations). Since these descriptions are rather complex to explain, it is wise to include a link to the HGVS recommendations in the publication.
  11. Protein level descriptions
  12. Polymorphisms
    Do not describe polymorphic variants as c.127A/G or p.43Ile/Val (or p.43I/V). A description of a variant should be neutral and polymorphisms and pathogenic changes should not be described differently (see Discussion). Correct descriptions are c.127A>G and p.Ile43Val.
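
The most 3' rule from item 5 can be illustrated with a minimal Python sketch. It assumes the reference is a plain string whose first character is coding DNA nucleotide c.1, and that the deletion lies entirely within that string. The function names `shift_deletion_3prime` and `describe_deletion` are hypothetical helpers made up for this illustration and are not part of the HGVS recommendations or of any existing tool.

```python
from typing import Tuple


def shift_deletion_3prime(ref: str, start: int, end: int) -> Tuple[int, int]:
    """Move a deletion (1-based, inclusive) to its most 3' equivalent position."""
    if not (1 <= start <= end <= len(ref)):
        raise ValueError("deletion must lie within the reference sequence")
    # Sliding the deleted window one residue to the 3' side yields the same
    # variant sequence exactly when the first deleted residue equals the
    # residue that directly follows the window.
    while end < len(ref) and ref[start - 1] == ref[end]:
        start += 1
        end += 1
    return start, end


def describe_deletion(ref: str, start: int, end: int) -> str:
    """Build a c.-style deletion description after shifting to the 3' end."""
    start, end = shift_deletion_3prime(ref, start, end)
    deleted = ref[start - 1:end]
    # A range uses "_" (underscore); a single residue needs no range.
    position = str(start) if start == end else f"{start}_{end}"
    return f"c.{position}del{deleted}"


# Example from item 5: ACTTTGTGCC to ACTTGCC
print(describe_deletion("ACTTTGTGCC", 4, 6))  # c.5_7delTGT
```

Starting from the 5' description (positions 4 to 6) the sketch arrives at the 3' description given in item 5. It handles deletions only; duplications and insertions would need their own comparison against the flanking sequence.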

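Several checklist items (2, 7, 8 and 12) concern the written form of a description, so a manuscript table could be screened for them before submission. The Python sketch below is a rough heuristic under that assumption, not a validator: `CHECKS` and `lint_description` are hypothetical names, the patterns only cover the example forms quoted in the checklist, and a description that passes may still be wrong. Note that "c.12-14del" is itself a well-formed intronic description, so the sketch can only flag it as a possible range error.

```python
import re

# Each entry pairs a pattern with the checklist advice it relates to.
CHECKS = [
    (re.compile(r"IVS\d+"),
     "item 2: IVS-style intron numbering; the c.89-2A>G style is recommended"),
    (re.compile(r"/"),
     "item 12: slash notation; use a neutral form such as c.127A>G or p.Ile43Val"),
    (re.compile(r"del\d+$"),
     "item 8: deletion given as a length; name first and last residue, e.g. c.21_24del"),
    (re.compile(r"^c\.\d+-\d+(del|dup)"),
     "item 7: '-' may have been meant as a range; a range is written with '_'"),
]


def lint_description(description: str) -> list:
    """Return the checklist warnings triggered by one variant description."""
    return [message for pattern, message in CHECKS if pattern.search(description)]


for text in ["c.12_14del", "c.12-14del", "c.21del3", "p.43Ile/Val", "c.IVS4-2A>G"]:
    print(text, lint_description(text))
```

Each warning is a prompt to re-check the description against the reference sequence rather than a verdict.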

Copyright HGVS 2007 All Rights Reserved
Website Created by Rania Horaitis, Nomenclature by J.T. Den Dunnen