Warning: Xpro is a database derived from the eukaryotic protein-encoding genes present in GenBank. The following statistics therefore reflect the data distribution in GenBank and may not represent the data distribution in actual genomes.

Exon Size Distribution (Intron-less Genes)
X axis range: 0-3000 nucleotides
No. of class intervals: 100

Exon Size Distribution (Intron-containing Genes)
X axis range: 0-800 nucleotides
No. of class intervals: 40

Intron Size Distribution
X axis range: 0-800 nucleotides
No. of class intervals: 40

X axis range: 0-200 nucleotides
No. of class intervals: 100
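
If the class intervals are equal-width (an assumption; the page does not say so), the ranges above imply bin widths of 30, 20, and 2 nucleotides. The following minimal Python sketch shows how a size would be assigned to a class interval under that assumption. The function name `class_interval` and the choice to leave out-of-range sizes unbinned are illustrative, not taken from Xpro.

```python
def class_interval(size, x_max, n_intervals):
    """Return the 0-based class interval for a size, or None if out of range.

    Assumes equal-width intervals covering [0, x_max); this is an
    illustrative assumption, not a documented Xpro rule.
    """
    if not 0 <= size < x_max:
        return None
    width = x_max / n_intervals  # e.g. 3000 / 100 = 30 nucleotides per interval
    return int(size // width)


# A 95-nucleotide exon on the 0-3000 axis with 100 intervals
print(class_interval(95, 3000, 100))
```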
Content
                                Whole Xpro data       Non-redundant (SWISS-PROT)
Total no. of CDS                144,160   100%        104,022   72.2%
Total no. of intron positions   582,676               443,283

 

Splice Site Distribution
Whole Xpro                           Non-redundant set (based on SWISS-PROT)
Splice motif   No.       %           Splice motif   No.       %
gt..ag         556,736   95.55%      gt..ag         424,634   95.79%
nn..nn         5,035     0.86%       nn..nn         4,048     0.91%
gc..ag         2,769     0.48%       gc..ag         2,191     0.49%
eeee           1,587     0.27%       gg..ca         358       0.08%
gg..ca         743       0.13%       gt..at         324       0.07%
gt..at         467       0.08%       gt..ac         293       0.07%
gt..ac         364       0.06%       gt..nn         245       0.06%
nn..ag         354       0.06%       nn..ag         238       0.05%
gt..nn         322       0.06%       eeee           230       0.05%
ta..gg         287       0.05%       tt..ag         206       0.05%
Others         14,012    2.40%       Others         10,516    1.80%
'nn' represents an unknown splice site.
'eeee' represents introns shorter than four nucleotides.
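
The motif labels in this table can be read as the first two and last two nucleotides of each intron. The following is a minimal Python sketch of such labelling, not Xpro's actual procedure. The function name `splice_motif` is hypothetical, and treating any dinucleotide that contains a non-ACGT character as 'nn' is an assumption.

```python
def splice_motif(intron):
    """Label an intron by its terminal dinucleotides, e.g. 'gt..ag'."""
    seq = intron.lower()
    if len(seq) < 4:
        # Too short to have separate donor and acceptor dinucleotides
        return "eeee"

    def site(dinucleotide):
        # Assumption: any ambiguous base makes the whole site unknown ('nn')
        return dinucleotide if set(dinucleotide) <= set("acgt") else "nn"

    return f"{site(seq[:2])}..{site(seq[-2:])}"


print(splice_motif("GTAAGTCCAG"))
print(splice_motif("GT"))
```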
Intron Phase Distribution
              Total no. of intron positions   Phase 0   Phase 1   Phase 2   Phase 0 %   Phase 1 %   Phase 2 %
Whole Xpro    582,676                         277,661   167,423   137,592   47.7%       28.7%       23.6%
Human         72,436                          31,775    24,871    15,790    43.9%       34.3%       21.8%
Mouse         23,612                          10,313    8,414     4,885     43.7%       35.6%       20.7%
Rat           4,387                           1,884     1,573     930       42.9%       35.9%       21.2%
C. elegans    126,978                         60,052    33,322    33,604    47.3%       26.2%       26.5%
Arabidopsis   93,117                          52,483    20,799    19,835    56.4%       22.3%       21.3%
Drosophila    67,376                          28,255    21,851    17,270    41.9%       32.4%       25.6%
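
As a worked check of the table, each percentage is the phase count divided by the total number of intron positions in that row. The Python sketch below recomputes the Human row. It also shows the conventional definition of intron phase (the number of coding nucleotides preceding the intron, modulo 3), which this page does not state and is included here as background. The function names are illustrative.

```python
def intron_phase(coding_nucleotides_before_intron):
    """Conventional intron phase: 0 = between codons, 1 or 2 = inside a codon."""
    return coding_nucleotides_before_intron % 3


def phase_percentages(counts):
    """Convert (phase 0, phase 1, phase 2) counts to percentages, one decimal."""
    total = sum(counts)
    return [round(100 * c / total, 1) for c in counts]


human = (31775, 24871, 15790)  # Human row of the table; sums to 72,436
print(phase_percentages(human))  # [43.9, 34.3, 21.8]
```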

 

Intron Penetration
              Total no. of genes   Intron-containing genes   Intron-less genes   % Penetration
Whole Xpro    493,983              351,918                   142,065             29%
Human         30,233               18,646                    11,587              38%
Mouse         14,472               10,004                    4,468               31%
Rat           1,968                1,035                     933                 47%
C. elegans    23,271               953                       22,318              96%
Arabidopsis   23,019               5,137                     17,882              78%
Drosophila    23,085               5,137                     17,948              78%

 

Cellular Location Distribution
Cellular location   Total no. of genes   Intron-containing genes   Intron-less genes   Total genes (%)   Intron-containing genes (%)   Intron-less genes (%)
Nuclear             400,196              258,882                   141,314             74.85%            65.06%                        99.10%
Mitochondria        93,789               93,036                    753                 18.99%            26.44%                        0.53%
Chloroplast         30,468               29,939                    529                 6.17%             8.51%                         0.37%