Xpro: Database of Eukaryotic Protein Encoding Genes

Frequently Asked Questions

  1. What does Xpro database contain ... ?
  2. How is Xpro implemented ... ?
  3. Why do I need Xpro database ... ?
  4. How can I search Xpro database ... ?
  5. If I have a protein/nucleotide sequence, then how can I search it against Xpro... ?
  6. How do I find only the 'intron-containing genes in human' associated with keyword caspase ... ?
  7. How can I look for alternative splicing variants from the alignment between Xpro entries and EST sequences from dbEST database and how are they derived?

1. What does Xpro Database contain ... ?

Xpro is a relational database that contains all the eukaryotic protein-encoding DNA sequences in GenBank. It provides detailed and comprehensive features for both intron-containing and intron-less genes.

In addition to the information found in the GenBank records, which includes properties such as the sequence, position, length and description of introns, exons and protein-coding regions, Xpro provides annotations on splice-site motifs and intron phases. Furthermore, Xpro validates intron positions using alignments between each record's sequence and the EST sequences found in dbEST. Xpro entries are also cross-referenced to the SWISS-PROT/TrEMBL and Pfam databases.
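As an illustration of the splice-site and intron-phase annotations mentioned above, here is a minimal Python sketch. It is not Xpro's own code. It assumes a forward-strand sequence and a list of coding exon coordinates (0-based, end-exclusive) whose first exon starts at the start codon; the function name `intron_annotations` and the toy sequence are hypothetical. The phase is taken as the number of coding bases preceding the intron, modulo 3.

```python
def intron_annotations(sequence, exons):
    """Annotate each intron lying between consecutive coding exons.

    sequence: genomic DNA string (forward strand, assumed).
    exons: ordered list of (start, end) pairs, 0-based, end-exclusive,
           covering coding sequence only (assumed).
    """
    annotations = []
    coding_len = 0  # coding bases seen before the current intron
    for (start, end), (next_start, _) in zip(exons, exons[1:]):
        coding_len += end - start
        intron = sequence[end:next_start]
        annotations.append({
            "intron_start": end,
            "intron_end": next_start,
            "length": len(intron),
            "donor": intron[:2],       # first two intron bases (5' splice site)
            "acceptor": intron[-2:],   # last two intron bases (3' splice site)
            "phase": coding_len % 3,   # 0: between codons, 1 or 2: inside a codon
        })
    return annotations


# Toy example: exon "ATGGC", intron "GTAAAG", exon "CTAA".
toy_sequence = "ATGGC" + "GTAAAG" + "CTAA"
toy_exons = [(0, 5), (11, 15)]
for annotation in intron_annotations(toy_sequence, toy_exons):
    print(annotation)
```

In the toy example the intron follows five coding bases, so it interrupts a codon after its second base (phase 2) and carries the common GT...AG boundary dinucleotides.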

2. How is Xpro implemented ?

Xpro is a MySQL database (version 3.23.29) running on a UNIX server (SGI ORIGIN 3200). The web server is Apache version 1.3, with CGI programs written in Perl.

3. Why do I need Xpro database ... ?

Two factors motivated the development of the Xpro database. First, the data in GenBank, the primary repository of nucleotide sequences, are growing at an unprecedented rate because of the ever-increasing number of genome and EST sequencing projects. Second, the exon/intron details required for molecular evolution studies are poorly annotated in that primary nucleotide database. Xpro is a specialized database that contains details about genomic features specific to eukaryotic genes and provides various web tools for analyzing and visualizing these features.

4. How can I search Xpro database ... ?

The contents of Xpro are searched by entering the search parameters in the text box in the right-side frame of the Xpro home page and clicking on the 'submit' button next to it. The search page can also be reached by clicking the 'Query page' link in the left frame menu bar.

Advanced searches can be done by clicking on the 'Limits & Output Format' link next to the Reset button.

Search parameters
The database can be searched using any of the following details:

  • GenBank locus name (e.g. AL91370, H006101S22)
  • Nucleotide accession or version number (e.g. AF016365.1)
  • Valid GenBank ID (e.g. 2873347)
  • Protein accession number (e.g. CAD24437.1)
  • Keywords in the GenBank definition (e.g. yeast, gdnf, etc.)
  • SWISS-PROT/TrEMBL accession numbers
  • Pfam accession numbers (e.g. PF00822)
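To make the difference between these identifier types concrete, the following Python sketch guesses which kind of search term a string is. It is a rough heuristic for illustration only and does not describe how Xpro itself interprets queries. The regular expressions, the type names and the function `guess_query_type` are assumptions based on the examples listed above and on commonly used accession formats.

```python
import re

# Heuristic patterns (assumptions), tried in order; the first full match wins.
PATTERNS = [
    ("pfam_accession", re.compile(r"PF\d{5}")),
    ("genbank_id", re.compile(r"\d+")),
    ("protein_accession", re.compile(r"[A-Z]{3}\d{5}\.\d+")),
    ("nucleotide_accession_version", re.compile(r"[A-Z]{1,2}\d{5,6}\.\d+")),
    ("swissprot_trembl_accession", re.compile(r"[OPQ]\d[A-Z0-9]{3}\d")),
]


def guess_query_type(query):
    """Return a best-guess label for a search term."""
    term = query.strip()
    upper = term.upper()
    for name, pattern in PATTERNS:
        if pattern.fullmatch(upper):
            return name
    # A single token containing digits is treated as a locus name (assumption);
    # anything else is treated as a free-text keyword.
    if any(ch.isdigit() for ch in term) and " " not in term:
        return "locus_name"
    return "keyword"


for example in ["AL91370", "AF016365.1", "2873347", "CAD24437.1", "yeast", "PF00822"]:
    print(example, "->", guess_query_type(example))
```

Locus names and keywords cannot be told apart reliably from their form alone, so the last two cases are only a guess.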

5. How do I search a sequence against Xpro database ... ?

A nucleotide or protein sequence can also be searched against the database for homologues using BLAST.

6. How do I find only the 'intron-containing genes in human' associated with keyword caspase ... ?

Steps
  1. In the Xpro home page, type 'Caspase' in the text box.
  2. Click on the 'Limits & Output Format' link to the right of the Reset button.
  3. Click on 'intron containing genes' in the Datasets field and on 'Homo sapiens' in the organism field.
  4. Click on the Submit button to the right of the text box to view the results.

7. How can I look for alternative splicing variants from the alignment between Xpro entries and EST sequences from dbEST database?

Alternate Splicing Variants Analyser:
BLAT alignments of the Xpro exon sequences with the EST sequences in dbEST are used to find various categories of alternative splicing in eukaryotic intron-containing genes. The splicing types are obtained by analyzing the alignment gaps between the Xpro query sequence and the EST database sequence.

Analysis of exons
 Four categories of alternative splicing are defined by analyzing the gaps in the EST sequence in the BLAT alignment between the Xpro exon sequences and the EST sequence.

Analysis of introns
 Gaps in the query Xpro exon sequences are accounted for by separately aligning the EST sequence in these gap regions to the intron sequences of that Xpro entry. This alignment is done using the global alignment program STRETCHER from the EMBOSS suite. Based on this analysis, four categories of alternative splicing are defined; they are described in the following section.
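The gap analysis described above can be sketched in Python as follows. This is not the Xpro implementation. It assumes the alignment is available as the block lists found in BLAT's PSL output (block sizes, query starts, target starts), with the Xpro exon sequence as the query, the EST as the target, and a plus-strand alignment. The function name `alignment_gaps` and the example coordinates are hypothetical. The sketch only reports the gaps; it does not assign them to splicing categories.

```python
def alignment_gaps(block_sizes, q_starts, t_starts):
    """List the gaps between consecutive aligned blocks.

    block_sizes, q_starts, t_starts: equal-length lists as in the PSL
    blockSizes, qStarts and tStarts columns (plus strand assumed).
    Query = Xpro exon sequence, target = EST sequence (assumed).
    """
    gaps = []
    for i in range(len(block_sizes) - 1):
        q_end = q_starts[i] + block_sizes[i]
        t_end = t_starts[i] + block_sizes[i]
        # Xpro bases with no EST counterpart: shown as a gap in the EST row.
        xpro_unaligned = q_starts[i + 1] - q_end
        # EST bases with no Xpro exon counterpart: shown as a gap in the Xpro row.
        est_unaligned = t_starts[i + 1] - t_end
        if xpro_unaligned > 0 or est_unaligned > 0:
            gaps.append({
                "after_block": i,
                "xpro_unaligned": xpro_unaligned,
                "est_unaligned": est_unaligned,
            })
    return gaps


# Made-up coordinates: 711 Xpro exon bases are missing from the EST.
print(alignment_gaps([100, 150], [0, 811], [0, 100]))
```

A long stretch of Xpro exon bases with no EST counterpart is the kind of gap that would be examined as a possible skipped exon, while EST bases with no exon counterpart are the regions that would be aligned against the intron sequences.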

Steps

  1. In the Xpro home page, type 'BAA82697.1' in the text box and click on the submit button.
  2. In the results page, click on the 'BAA82697.1' link to view the sequence and annotation page.
  3. Click on 'Click here for more annotation and output formats' in the Xpro links field.
  4. In the next window, click on the 'Click here to view the EST Alignment Viewer/Splicing Analyser' link to open the 'EST Alignment Viewer analyzer' in a new window.
  5. The EST alignment viewer consists of two frames. The bottom frame shows a graphical representation of the EST sequences aligned to the Xpro exon sequences. The two numbers in the 3' region of each alignment show the alignment gaps in the Xpro and EST sequences, respectively. See the help page http://origin.bic.nus.edu.sg/xpro/images/help5.gif .
  6. From the alignment, it can be seen that the EST entry 'AWW128801.1' (second entry) has a gap of length 711. Clicking on that entry shows the text format of the alignment in the top frame, where a clear 'exon skipping' event in the EST sequence can be seen.

Contact Details :

G.Vivek
Dr. Tan Tin Wee
Dr. Shoba Ranganathan


Bioinformatics Centre, Dept. of Biochemistry, NUS, July 2003