Chandra Sarkar1, Flavio R. Ortigao1, Ulf Gyllensten2,3 and Anthony J. Brookes3
ortiagao@interactiva.de
1Interactiva Biotechnologie GmbH, D-89077 Ulm, Germany, and 2Swedish Genome Research Center and 3Department of Genetics and Pathology, Biomedical Center, Uppsala, Sweden.
The steadily rising number and importance of known SNPs and other intra-genic polymorphisms led us to create a database of human intra-genic polymorphisms called HGBASE (http://hgbase.interactiva.de). The first release of HGBASE contains about 2300 intra-genic polymorphisms.
The following requirements were set a priori:
i.) The database should be accessible over the Internet.
ii.) It should be easy to maintain, so that it can be upgraded without problems even by a person with limited computing skills.
iii.) It should have an easy-to-use yet powerful user interface that serves both people unfamiliar with database searches and experts in the field.
We therefore decided to keep the data in a relational database system (Oracle Server 8). This gives us a proven and powerful database engine while letting us access the data easily for maintenance through an MS Access interface. It also leaves the way open for custom interface programs for advanced tasks, such as a Java bulk submission program.
We have implemented three means of accessing the data. The first is a so-called "simple search", in which the user enters a keyword that is queried against all data in HGBASE except the sequence data. The second is the SRS system (Sequence Retrieval System, developed by Thure Etzold's group at EBI), which offers the user a common interface for querying all kinds of life-science databases. SRS is also endorsed by the HUGO Mutation Database Initiative as an interface that should be used for gene variation databases. Another very powerful feature of SRS, and one of the most important reasons for choosing it, is its ability to create links to related data in other databases anywhere on the Internet. This is a very important aspect, because polymorphisms only make sense in the context of related genetic information. The third is a search for a given DNA sequence through an interface to NCBI's BLAST program. Both SRS and BLAST obtain their input data automatically from exports of the Oracle database.
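As a rough illustration of two of the ideas above, the sketch below shows a keyword search that skips the sequence column, and a flat-file (FASTA) export of the kind a BLAST database could be built from. It is not HGBASE's actual implementation: the table layout, column names, function names and demo records are all hypothetical, and Python's built-in sqlite3 module stands in for the Oracle server.

```python
import sqlite3

# Hypothetical schema: the real HGBASE Oracle tables are not described here.
SCHEMA = """
CREATE TABLE polymorphism (
    id          TEXT PRIMARY KEY,
    gene        TEXT,
    description TEXT,
    sequence    TEXT
)
"""

# Columns covered by the keyword search; the sequence column is left out on purpose.
TEXT_COLUMNS = ("id", "gene", "description")


def simple_search(conn, keyword):
    """Return the ids of records whose non-sequence fields contain the keyword."""
    where = " OR ".join(f"{col} LIKE ?" for col in TEXT_COLUMNS)
    sql = f"SELECT id FROM polymorphism WHERE {where} ORDER BY id"
    # The keyword is bound as a parameter; LIKE wildcards inside it are not escaped.
    pattern = f"%{keyword}%"
    rows = conn.execute(sql, [pattern] * len(TEXT_COLUMNS))
    return [row[0] for row in rows]


def export_fasta(conn):
    """Dump every sequence as FASTA text (one header line, one sequence line)."""
    rows = conn.execute("SELECT id, gene, sequence FROM polymorphism ORDER BY id")
    return "".join(f">{rid} {gene}\n{seq}\n" for rid, gene, seq in rows)


def demo_connection():
    """Build an in-memory database holding two made-up records."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.executemany(
        "INSERT INTO polymorphism VALUES (?, ?, ?, ?)",
        [
            ("EX0001", "GENE_A", "made-up coding variant", "ACGTACGTAC"),
            ("EX0002", "GENE_B", "made-up promoter variant", "TTGACCGGTA"),
        ],
    )
    return conn


if __name__ == "__main__":
    conn = demo_connection()
    print(simple_search(conn, "promoter"))
    print(export_fasta(conn), end="")
```

Keeping the searchable columns in a single tuple makes the exclusion of sequence data explicit, and a plain-text export keeps the search tools decoupled from the database engine.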
HGBASE is intended to become a comprehensive and up-to-date resource for intra-genic polymorphisms. To reach this goal, our group continually adds data from the literature, other databases and our own laboratory work. We also ask all researchers in the field to contribute by submitting their intra-genic polymorphism data, which can easily be done through our submission web pages. Data can of course also be submitted by other usual means such as email or fax. All submitted data are made available to any other public database that wishes to download them. HGBASE does not claim any rights to publicly available or submitted data; these remain the property of the original submitter.
HGBASE is accessible at no cost to all members of the scientific community. The HGBASE workgroup is also working on an enhanced database that will contain additional data such as gene function, gene location, gene expression pattern, disease associations and suggested assay formats. To cover the costs of collecting and maintaining these additional data, there will be a charge for commercial use of this "value added" database; academic use will remain free, as will use of the regular HGBASE database by all members of the scientific community.