REPORT OF THE FIFTH HUGO MUTATION DATABASE INITIATIVE MEETING

27TH OCTOBER, 1998

DENVER, U.S.A.

8.00-20.00

The meetings of the HUGO Mutation Database Initiative (MDI) have usually been held, for reasons of economy and convenience, in association with the ASHG and HUGO HGM meetings. This also maximizes the chance of attracting relevant and committed registrants. The 100 registrants (and 10-20 others who attended the meeting) in Denver were the largest and certainly the most enthusiastic group we have had. They came from 17 countries, and 22 registrants represented 14 companies.

The format of the meeting was to receive reports and plans of the various working groups, papers on related activities and posters mainly on newly established mutation databases. Considerable time was allowed for discussion of the presented material and future plans.

Dr. Richard Cotton reported that the March of Dimes has supported the MDI through provision of a Research Assistant (Rania Horaitis) via the ALLIANCE Working Group. This assistance has allowed: membership to be increased to 500 in 33 countries; complete listing, with attribution, of 124 databases on the MDI web site (1); systematic examination of possible genes for databases, through which 64 LSDB's have been initiated; organization of meetings; assistance and stimulation of the 10 working groups; maintenance of the newsgroup, website and bimonthly newsletter; and acting as a resource for advice.

The outcome of the NOMENCLATURE Working Group has recently been published (2) as a recommended mutation nomenclature. Dr. Antonarakis addressed the remaining problems of naming complex mutations and indicated he had had several enquiries. Besides spelling out the mutation in full, he undertook to provide some discussion points for the Initiative to consider and comment upon.

The program to detail CONTENT AND SOFTWARE of the locus specific databases by the Working Group has resulted in a near final document, which had just been posted on a Website before the meeting (3). This has arisen from a range of experiences including creating and running the PAH LSDB and from loading the SRS system at EBI. The final document will be available for further comment until November 21, 1998 and then submitted for publication.

The CENTRAL databases reported individually.

Victor McKusick reported on the status of mutations in OMIM (4) as at mid October. The number of genes reported was 845, with 9861 "allelic variants". Not all published variants are listed, and he gave 7 criteria considered when deciding to list a variant. The advantages of the December 1995 move to NCBI care were detailed.

Dr. Michael Krawczak reported on the HGMD database (5) in Cardiff. Around 800 genes are covered, with a total of 15,000 mutations documented. He noted that published mutations accumulate at a rate of around 2,500 per year, drawn from 250 journals. New features are the addition of splice mutations and direct submission via Human Genetics, although use of direct submission has been low. The database has an LSDB (Locus Specific DataBase) page, and he noted that 90% of genes do not have an LSDB at this time. He also noted that some LSDB's are not being maintained.

Dr. Heikki Levaslaiho reported on the EBI, Cambridge (6), which is working with the LSDB's to mount them on the EBI server for SRS analysis. There are currently 27 LSDB's on the server. Particular work reported at the meeting was the linking of the two very large p53 databases, which were created for different reasons and one of which contains around 10,000 entries.

Dr. Steve Sherry reported on two recent initiatives from NCBI (7): the SNP database (DBSNP) and the Human Reference Gene Project (REFGENE). REFGENE aims to provide a stable reference entry for the sequence of every reasonably well-characterized human gene and to link GENBANK/UNIGENE (the large-scale databases) and LSDB's. DBSNP (8) is designed to be a public repository of simple genetic variation, particularly polymorphisms and particularly those arising from the recently funded SNP discovery effort. Entries will include ethnicity, assay conditions, validation, etc. There will be links to other relevant databases.

Dr. Pui Kwok outlined the area of SNP databases and had earlier placed some discussion points on the MDI Website (1). He was consulted by NCBI during the creation of DBSNP. He described the current efforts at SNP accumulation. In the private sector, GENSET and INCYTE are each aiming to describe 60,000 SNP's, taking gene-wide and gene-specific approaches respectively. Celera and a pharmaceutical consortium are each planning a similar number, but propose placing them in the public domain. NIH has recently funded a public effort to define a similar number. The most difficult problem appears to be the capture of existing SNP's, which are not necessarily published and whose collection requires effort beyond the funded sources.

Dr. Ortigao described the recently posted SNP database HGBase (http://www.ibc.wustl.edu/SNP), which contains 2,000-3,000 SNP's. Coming from industry himself, he encouraged requesting support from industry.

Dr. Hernandez dealt with the SPECTRAL databases. These were initially stimulated by Dr. Franklin Hutchinson as induced mutation databases. The p53 database currently contains 10,000 entries from the published literature, and entries accumulate at a rate of 2,000 per year. The main difficulties seen were heterogeneity of the detection methods used, heterogeneity in the reporting of associated data, and publication bias. The difficulty with the name "spectral", and the question of what it means, was highlighted. The meeting concluded that it is best taken to mean somatic mutations due to environmental insults, to distinguish these databases from the other LSDB's.

QUALITY CONTROL AND PEER REVIEW have been a concern from the start of the MDI, and Dr. Cotton addressed recent proposals. The most fundamental point is to define what mutation and polymorphism mean and what information is needed to incriminate a mutation as causing disease. A manuscript addressing these points has recently been published (9). Further, a form to ensure the correct data are collected has been developed and was presented by Dr. Auerbach (10). This form also asks questions so that the curator/reader can assess the status of the data attributing disease causation to a base change. It is suggested that this form, when finalized, be used by databases collecting mutations and polymorphisms, particularly LSDB's. Final comments on the form are to be made before the 31st December, 1998. Dr. Levaslaiho has produced a "mutation checker" (11), which is intended to eliminate typographical and other errors of entry.

Fears of unauthorized use and unattributed distribution have led to the formation of a COPYRIGHT and INTELLECTUAL PROPERTY Working Group. This is of particular concern to the LSDB's, which contain unpublished mutations, and to grant-funded and privately funded databases, which are the result of considerable intellectual input, e.g. HGMD. This is an area of extreme difficulty due to the inconsistency of international laws and recent changes to them. This Working Group has clearly outlined such problems in the past, but recommendations are difficult to formulate. Members representing different countries should be involved in this group to ensure all countries are covered. A discussion paper will be forthcoming shortly from Dr. Mike Brown and colleagues.

Two Working Groups were unrepresented. PATIENT ASPECTS addresses both information for patients and patient based databases. A discussion paper is expected shortly. ETHNIC AND NATIONAL DATABASES addresses complete listings of mutations affecting particular groups and the capture of mutations in genes in a country, respectively. These databases are planned not only to be useful in patient care, but also to provide a redundant route for collecting mutations from clinics, etc., which may not otherwise publish them. A recommendation sheet from Dr. Caglayan is available (12).

14 posters and 6 further papers were presented. Many of these were on LSDB's, and the one on Human Haemoglobin Variants was notable. One of the key issues addressed amongst these was off-the-shelf software for LSDB's, for those who do not wish to develop their own software. Represented were Mutation View (Dr. Minoshima) (13), Universal Mutation Database (Dr. Gallou) (14), and MuStar (Dr. A. Brown) (15). Dr. Cotton pointed out that the first, Mutation View, was the only one that provided software for both LSDB curators and a central database; being unique in this integration, it should be seriously considered.

The meeting concluded with a discussion of six proposals and further work for the Working Groups. The meeting:-

The program and speakers' abstracts are available via http://ariel.ucs.unimelb.edu.au:80/~cotton/denver.htm

Individuals with ideas, skills and time to assist in this initiative should contact Rania Horaitis at:-

horaitis@ariel.its.unimelb.edu.au

Others who wish to show their support and receive mailings are encouraged to join the initiative via the Web site (1).

REFERENCES

1. MDI Website: http://ariel.ucs.unimelb.edu.au:80/~cotton/mut_database.htm

2. Antonarakis, S. et al. Human Mutation 11: 1-3 (1998), or http://journals.wiley.com/humanmutation/nomenclature.htm

3. Scriver, C.R.: http://data.mch.mcgill.ca/guidelines

4. OMIM Website: http://www3.ncbi.nlm.nih.gov/omim/

5. HGMD Website: http://www.uwcm.ac.uk/uwcm/mg/hgmd0.html

6. EBI Website: http://www2.ebi.ac.uk/mutations/

7. NCBI Website: http://www.ncbi.nlm.nih.gov/

8. DBSNP Website: http://www.ncbi.nlm.nih.gov/SNP

9. Cotton R.G.H. and C.R. Scriver. Human Mutation 12: 1-3 (1998).

10. Auerbach, A., Website: http://ariel.ucs.unimelb.edu.au:80/~cotton/entry.htm

11. Checker, Website: http://www2.ebi.ac.uk/cgi-bin/mutations/check.cgi

12. Caglayan, Website: http://ariel.ucs.unimelb.edu.au:80/~cotton/ethnic.htm

13. Mutation View, Website: http://mutview.dmb.med.keio.ac.jp

14. UMD, Website: http://wwww.umd.necker.fr

15. MuStar, Website: http://www.hgu.mrc.ac.uk/Research/Computing/Mustar/

13th November, 1998


Posted 17th November 1998, by Rania Horaitis