Entity Recognition/Ontology Development

Protein name dictionaries
BioThesaurus: A protein entity dictionary constructed using online resources (available for download)
Dictionary (4.7 Mb): A protein name list derived from the iProClass integrated protein knowledgebase
Word token dictionaries
BioMedical terms
Chemical terms
Macromolecules: enzymes, single word names, general names
Common English
Single non-word tokens
Protein name tagging guidelines and tagged corpora
Name-tagged corpus 1 (300 abstracts) based on Tagging guideline v1.0
Name-tagged corpus 2 (300 abstracts) based on Tagging guideline v2.0 (Mani et al., 2005)
Protein ontology
PRO: Ontology for protein classes and protein objects
eFIP data set
Corpus of 96 abstracts randomly selected from all PubMed abstracts that contain sentences with trigger words for both phosphorylation and interaction mentions. PRO curators manually curated this test set, following the Manual Annotation Guidelines for eFIP, to benchmark how well eFIP performs on a broad set of proteins and abstracts mentioning phosphorylation and interaction events.
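
The selection criterion described for the eFIP data set can be illustrated with a minimal sketch. This is not eFIP's actual implementation: the trigger-word stems, the naive sentence splitter, and the function names below are all hypothetical, and the sketch assumes that both kinds of trigger word must occur in the same sentence.

```python
import re

# Hypothetical trigger-word stems, for illustration only.
# The actual eFIP trigger lists are not given on this page.
PHOSPHO_TRIGGERS = ("phosphorylat",)
INTERACTION_TRIGGERS = ("interact", "bind", "associat")


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def has_trigger(sentence, stems):
    """Return True if any stem occurs in the sentence (case-insensitive)."""
    lowered = sentence.lower()
    return any(stem in lowered for stem in stems)


def candidate_sentences(abstract):
    """Sentences mentioning both a phosphorylation and an interaction trigger."""
    return [
        s
        for s in split_sentences(abstract)
        if has_trigger(s, PHOSPHO_TRIGGERS) and has_trigger(s, INTERACTION_TRIGGERS)
    ]


def is_candidate_abstract(abstract):
    """An abstract qualifies if it has at least one such sentence."""
    return bool(candidate_sentences(abstract))
```

Under these assumptions, an abstract would enter the candidate pool when is_candidate_abstract returns True, and the 96 test abstracts would then be drawn at random from that pool.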

©2018 Protein Information Resource