Table 1. Source databases for BioThesaurus construction. 

| Database | # Records Mapped to UniProtKB | Annotation Fields Extracted | # Names in BioThesaurus |
|---|---|---|---|
| UniProtKB | 2,083,713 | DE, GN | 1,070,596 |
| PIR-PSD | 243,796 | ALTERNATE_NAMES, GENETICS, TITLE | 340,397 |
| Entrez Gene | 990,757 | DESCRIPTION, SYMBOL, SYNONYMS | 1,200,792 |
| RefSeq | 1,314,514 | DEFINITION | 380,770 |
| Genpept | 2,429,669 | FEATURES_CDS, FEATURES_gene, FEATURES_mRNA | 603,950 |
| HUGO | 11,739 | ALIASES, ENZYME, NAME, PREVIOUSGENENAME, PREVIOUSSYMBOL, SYMBOL | 44,166 |
| FlyBase | 16,463 | NAME, SYMBOL, SYN | 52,346 |
| RGD | 6,165 | NAME, SSLP, SYMBOL, ALIASES | 12,221 |
| MGD | 11,585 | NAME, SYMBOL, SYNONYM | 33,696 |
| SGD | 4,486 | GENEPRODUCT, LOCUSNAME, ORFNAME, OTHERNAME | 11,742 |
| WormBase | 20,282 | LOCI, WORMPEP | 24,401 |
| EC | 2,394 | ECNUM:AN, ECNUM:DE | 6,966 |
| OMIM | 8,857 | TI | 35,029 |

Total number of database records: 7,144,420

Total number of distinct names: 2,869,972

 

This table is based on BioThesaurus Release 1.0, August 25, 2005. The updated table is available at:
http://pir.georgetown.edu/pirwww/iprolink/biothesaurus/statistics.html.