
To request reprints for any publication please contact us.

Publications
(Some files are in PDF format)

BOOK: Bioinformatics for Comparative Proteomics.
Wu CH, Chen C (Eds.).
Methods in Molecular Biology, Volume 694, Series Editor J.M. Walker, Humana Press. 2011.
BOOK: Computational Biology and Genome Informatics.
Wang J, Wu CH, Wang P (Eds.).
World Scientific. 2003.
BOOK: Neural Networks and Genome Informatics.
Wu CH, McLarty JM (Eds.).
Methods in Computational Biology and Biochemistry, Volume 1, Series Editor A. K. Konopka, Elsevier Science. 2000.

Natale DA, Arighi CN, Blake JA, Bona J, Chen C, Chen SC, Christie KR, Cowart J, D'Eustachio P, Diehl AD, Drabkin HJ, Duncan WD, Huang H, Ren J, Ross K, Ruttenberg A, Shamovsky V, Smith B, Wang Q, Zhang J, El-Sayed A, Wu CH. Protein Ontology (PRO): enhancing and scaling up the representation of protein entities.
Nucleic Acids Res. 2017 Jan 4;45(D1):D339-D346. doi: 10.1093/nar/gkw1075. Epub 2016 Nov 28.

Torii M, Arighi CN, Wang Q, Wu CH, Vijay-Shanker K. RLIMS-P 2.0: A generalizable rule-based information extraction system for literature mining of protein phosphorylation information.
IEEE Transactions on Computational Biology and Bioinformatics (TCBB) 12 (1): 17-29. doi: 10.1109/TCBB.2014.2372765. 2015.

Tudor CO, Rose KE, Li G, Vijay-Shanker K, Wu CH, Arighi CN. Construction of phosphorylation interaction networks by text mining of full-length articles using the eFIP system.
Database pii: bav020. doi: 10.1093/database/bav020. 2015. PubMed PMID: 25833953. PubMed Central PMCID: PMC4381107
UniProt: a hub for protein information.
UniProt Consortium.
Nucleic Acids Res. Jan 28;43 (Database issue) (2015)
RLIMS-P: an online text-mining tool for literature-based extraction of protein phosphorylation information.
Torii M, Li G, Li Z, Oughtred R, Diella F, Celen I, Arighi CN, Huang H, Vijay-Shanker K, Wu CH.
Database (Oxford). 2014 Aug 13;2014. pii: bau081. doi: 10.1093/database/bau081. Print 2014. PubMed PMID: 25122463; PubMed Central PMCID: PMC4131691.
iSimp in BioC standard format: enhancing the interoperability of a sentence simplification system.
Peng Y, Tudor CO, Torii M, Wu CH, Vijay-Shanker K.
Database (Oxford). 2014 May 21;2014. pii: bau038. doi: 10.1093/database/bau038. Print 2014. PubMed PMID: 24850848; PubMed Central PMCID: PMC4028706.
UniRef clusters: a comprehensive and scalable alternative for improving sequence similarity searches.
Suzek BE, Wang Y, Huang H, McGarvey PB, Wu CH; the UniProt Consortium.
Bioinformatics. 2014 Nov 13. pii: btu739. [Epub ahead of print]. PubMed PMID: 25398609.
Protein Ontology: a controlled structured network of protein entities.
Natale DA, Arighi CN, Blake JA, Bult CJ, Christie KR, Cowart J, D'Eustachio P, Diehl AD, Drabkin HJ, Helfer O, Huang H, Masci AM, Ren J, Roberts NV, Ross K, Ruttenberg A, Shamovsky V, Smith B, Yerramalla MS, Zhang J, AlJanahi A, Celen I, Gan C, Lv M, Schuster-Lezell E, Wu CH.
Nucleic Acids Res. 2014 Jan;42(Database issue):D415-21. doi: 10.1093/nar/gkt1173. Epub 2013 Nov 21. PubMed PMID: 24270789; PubMed Central PMCID: PMC3964965.
Activities at the Universal Protein Resource (UniProt).
UniProt Consortium.
Nucleic Acids Res. 2014 Jan;42(Database issue):D191-8. doi: 10.1093/nar/gkt1140. Epub 2013 Nov 18. PubMed PMID: 24253303; PubMed Central PMCID: PMC3965022.
Transcription factors and genetic circuits orchestrating the complex, multilayered response of Clostridium acetobutylicum to butanol and butyrate stress.
Wang Q, Venkataramanan KP, Huang H, Papoutsakis ET, Wu CH.
BMC Syst Biol. 2013 Nov 6;7:120. doi: 10.1186/1752-0509-7-120. PubMed PMID: 24196194; PubMed Central PMCID: PMC3828012.
Construction of protein phosphorylation networks by data mining, text mining and ontology integration: analysis of the spindle checkpoint.
Ross KE, Arighi CN, Ren J, Huang H, Wu CH.
Database (Oxford). 2013 Jun 7;2013:bat038. doi: 10.1093/database/bat038. Print 2013. PubMed PMID: 23749465; PubMed Central PMCID: PMC3675891.
A Framework for Biomedical Figure Segmentation Towards Image-based Document Retrieval.
Luis D. Lopez, Jingyi Yu, Cecilia Arighi, Catalina O Tudor, Manabu Torii, Hongzhan Huang, K. Vijay-Shanker, and Cathy Wu.
BMC Systems Biology 2013, 7 (Suppl 4): S8. PubMed PMID: 24565394; PubMed Central PMCID: PMC3856606.
BioC: a minimalist approach to interoperability for biomedical text processing.
Comeau DC, Islamaj Dogan R, Ciccarese P, Cohen KB, Krallinger M, Leitner F, Lu Z, Peng Y, Rinaldi F, Torii M, Valencia A, Verspoor K, Wiegers TC, Wu CH, Wilbur WJ.
Database (Oxford). 2013 Sep 18;2013:bat064. doi: 10.1093/database/bat064. Print 2013. PubMed PMID: 24048470; PubMed Central PMCID: PMC3889917.
A fast Peptide Match service for UniProt Knowledgebase.
Chen C, Li Z, Huang H, Suzek BE, Wu CH; UniProt Consortium.
Bioinformatics. 2013 Nov 1;29(21):2808-9. doi: 10.1093/bioinformatics/btt484. Epub 2013 Aug 19. PubMed PMID: 23958731; PubMed Central PMCID: PMC3799477.
Use of the protein ontology for multi-faceted analysis of biological processes: a case study of the spindle checkpoint.
Ross KE, Arighi CN, Ren J, Natale DA, Huang H, Wu CH.
Front Genet. 2013 Apr 26;4:62. doi: 10.3389/fgene.2013.00062. eCollection 2013. PubMed PMID: 23637705; PubMed Central PMCID: PMC3636526.
Prediction of contact matrix for protein-protein interaction.
Gonzalez AJ, Liao L, Wu CH.
Bioinformatics. 2013 Apr 15;29(8):1018-25. doi: 10.1093/bioinformatics/btt076. Epub 2013 Feb 15. PubMed PMID: 23418186; PubMed Central PMCID: PMC3624801.
An overview of the BioCreative 2012 Workshop Track III: interactive text mining task.
Arighi CN, Carterette B, Cohen KB, Krallinger M, Wilbur WJ, Fey P, Dodson R, Cooper L, Van Slyke CE, Dahdul W, Mabee P, Li D, Harris B, Gillespie M, Jimenez S, Roberts P, Matthews L, Becker K, Drabkin H, Bello S, Licata L, Chatr-aryamontri A, Schaeffer ML, Park J, Haendel M, Van Auken K, Li Y, Chan J, Muller HM, Cui H, Balhoff JP, Chi-Yang Wu J, Lu Z, Wei CH, Tudor CO, Raja K, Subramani S, Natarajan J, Cejuela JM, Dubey P, Wu C.
Database (Oxford). 2013 Jan 17;2013:bas056. doi: 10.1093/database/bas056. Print 2013. PubMed PMID: 23327936; PubMed Central PMCID: PMC3625048.
Ribosomally synthesized and post-translationally modified peptide natural products: overview and recommendations for a universal nomenclature.
P.G. Arnison, M.J. Bibb, G. Bierbaum, A.A. Bowers, T.S. Bugni, G. Bulaj, J.A. Camarero, D.J. Campopiano, G.L. Challis, J. Clardy, P.D. Cotter, D.J. Craik, M. Dawson, E. Dittmann, S. Donadio, P.C. Dorrestein, K.D. Entian, M.A. Fischbach, J.S. Garavelli, U. Göransson, C.W. Gruber, D.H. Haft, T.K. Hemscheidt, C. Hertweck, C. Hill, A.R. Horswill, M. Jaspars, W.L. Kelly, J.P. Klinman, O.P. Kuipers, A.J. Link, W. Liu, M.A. Marahiel, D.A. Mitchell, G.N. Moll, B.S. Moore, R. Müller, S.K. Nair, I.F. Nes, G.E. Norris, B.M. Olivera, H. Onaka, M.L. Patchett, J. Piel, M.J. Reaney, S. Rebuffat, R.P. Ross, H.G. Sahl, E.W. Schmidt, M.E. Selsted, K. Severinov, B. Shen, K. Sivonen, L. Smith, T. Stein, R.D. Süssmuth, J.R. Tagg, G.L. Tang, A.W. Truman, J.C. Vederas, C.T. Walsh, J.D. Walton, S.C. Wenzel, J.M. Willey, W.A. van der Donk.
(2013) Nat. Prod. Rep. 30, 108-160 [PMID:23165928].
BioCreative-2012 virtual issue.
Wu CH, Arighi CN, Cohen KB, Hirschman L, Krallinger M, Lu Z, Mattingly C, Valencia A, Wiegers TC, John Wilbur W.
Database (Oxford). 2012 Dec 5;2012:bas049. doi: 10.1093/database/bas049. Print 2012. PubMed PMID: 23221175; PubMed Central PMCID: PMC3514749.
The eFIP system for text mining of protein interaction networks of phosphorylated proteins.
Tudor CO, Arighi CN, Wang Q, Wu CH, Vijay-Shanker K.
Database (Oxford). 2012 Dec 5;2012:bas044. doi: 10.1093/database/bas044. Print 2012. PubMed PMID: 23221174; PubMed Central PMCID: PMC3514748.
Building a Classifier for Identifying Sentences Pertaining to Disease-Drug Relationships in Tardive Dyskinesia.
X. Bi, H. Huang, S. Matis-Mitchell, P. McGarvey, M. Torii, H. Shatkay and C. Wu.
Proc. of the IEEE Int. Conf. on Bioinformatics and Biomedicine (BIBM). November, 2012
iSimp: A Sentence Simplification System for Biomedical Text.
Yifan Peng, Catalina O Tudor, Manabu Torii, Cathy H Wu, K Vijay-Shanker.
IEEE International Conference on Bioinformatics and Biomedicine (BIBM) 2012, Philadelphia, USA, October 2012, 211-216. PubMed PMID: 24850848; PubMed Central PMCID: PMC4028706.

Robust Segmentation of Biomedical Figures Toward an Image-based Document Retrieval.
Luis D Lopez, Jingyi Yu, Catalina O Tudor, Cecilia N Arighi, Hongzhan Huang, K Vijay-Shanker, Cathy H Wu.
IEEE International Conference on Bioinformatics and Biomedicine (BIBM) 2012, Philadelphia, USA, October 2012.

Text mining for the biocuration workflow.
Hirschman L, Burns GA, Krallinger M, Arighi C, Cohen KB, Valencia A, Wu CH, Chatr-Aryamontri A, Dowell KG, Huala E, Lourenço A, Nash R, Veuthey AL, Wiegers T, Winter AG.
Database (Oxford). 2012 Apr 18;2012:bas020. doi: 10.1093/database/bas020. Print 2012. PubMed PMID: 22513129; PubMed Central PMCID: PMC3328793.
Community annotation and bioinformatics workforce development in concert--Little Skate Genome Annotation Workshops and Jamborees.
Wang Q, Arighi CN, King BL, Polson SW, Vincent J, Chen C, Huang H, Kingham BF, Page ST, Rendino MF, Thomas WK, Udwary DW, Wu CH; North East Bioinformatics Collaborative Curation Team.
Database (Oxford). 2012 Mar 20;2012:bar064. doi: 10.1093/database/bar064. Print 2012. PubMed PMID: 22434832; PubMed Central PMCID: PMC3308154.
Informatics and data quality at collaborative multicenter Breast and Colon Cancer Family Registries.
McGarvey PB, Ladwa S, Oberti M, Dragomir AD, Hedlund EK, Tanenbaum DM, Suzek BE, Madhavan S.
J Am Med Inform Assoc. 2012 Feb 9. [Epub ahead of print] PMID: 22323393.
The Protein Ontology: a structured representation of protein forms and complexes.
Natale DA, Arighi CN, Barker WC, Blake JA, Bult CJ, Caudy M, Drabkin HJ, D'Eustachio P, Evsikov AV, Huang H, Nchoutmboube J, Roberts NV, Smith B, Zhang J, Wu CH.
Nucleic Acids Res. 39(Database issue):D539-45. 2011.
BioCreative III interactive task: an overview.
Arighi CN, Roberts PM, Agarwal S, Bhattacharya S, Cesareni G, Chatr-Aryamontri A, Clematide S, Gaudet P, Giglio MG, Harrow I, Huala E, Krallinger M, Leser U, Li D, Liu F, Lu Z, Maltais LJ, Okazaki N, Perfetto L, Rinaldi F, Sætre R, Salgado D, Srinivasan P, Thomas PE, Toldo L, Hirschman L, Wu CH.
BMC Bioinformatics. 2011 Oct 3;12 Suppl 8:S4. doi: 10.1186/1471-2105-12-S8-S4. PubMed PMID: 22151968; PubMed Central PMCID: PMC3269939.
Overview of the BioCreative III Workshop.
Arighi CN, Lu Z, Krallinger M, Cohen KB, Wilbur WJ, Valencia A, Hirschman L, Wu CH.
BMC Bioinformatics. 2011 Oct 3;12 Suppl 8:S1. doi: 10.1186/1471-2105-12-S8-S1. PubMed PMID: 22151647; PubMed Central PMCID: PMC3269932.
The representation of protein complexes in the Protein Ontology (PRO).
Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, Ruttenberg A, D'Eustachio P, Smith B, Blake JA, Wu C.
BMC Bioinformatics. 2011 Sep 19;12:371. doi: 10.1186/1471-2105-12-371. PubMed PMID: 21929785; PubMed Central PMCID: PMC3189193.
Reorganizing the protein space at the Universal Protein Resource (UniProt).
UniProt Consortium.
Nucleic Acids Res. 40 (Database issue): D71-5 (2012). PMID: 22102590
Representative proteomes: a stable, scalable and unbiased proteome set for sequence analysis and functional annotation.
Chen C, Natale DA, Finn RD, Huang H, Zhang J, Wu CH, Mazumder R.
PLoS One. 2011 Apr 27;6(4):e18910. PMID: 21556138
A comprehensive protein-centric ID mapping service for molecular data integration.
Huang H, McGarvey PB, Suzek BE, Mazumder R, Zhang J, Chen Y, Wu CH.
Bioinformatics. Apr 15;27(8):1190-1. 2011.
Protein-centric data integration for functional analysis of comparative proteomics data.
McGarvey PB, Zhang J, Natale DA, Wu CH, Huang H.
Methods Mol Biol. 694:323-39. 2011.
Structure-guided rule-based annotation of protein functional sites in UniProt Knowledgebase.
Vasudevan S, Vinayaka CR, Natale DA, Huang H, Kahsay RY, Wu CH.
Methods Mol Biol. 694:91-105. 2011.
A tutorial on protein ontology resources for proteomic studies.
Arighi CN.
Methods Mol Biol. 694:77-90. 2011.
eFIP: a tool for mining functional impact of phosphorylation from literature.
Arighi CN, Siu AY, Tudor CO, Nchoutmboube JA, Wu CH, Shanker VK.
Methods Mol Biol. 694:63-75. 2011.
Protein bioinformatics databases and resources.
Chen C, Huang H, Wu CH.
Methods Mol Biol. 694:3-24. 2011.
Omics-Based Molecular Target and Biomarker Identification.
Zhang-Zhi Hu, Hongzhan Huang, Cathy H. Wu, Mira Jung, Anatoly Dritschilo, Anna T. Riegel, Anton Wellstein
Methods Mol Biol. 719, 547-571. 2011.
Ongoing and future developments at the Universal Protein Resource.
UniProt Consortium.
Nucleic Acids Res. 39(Database issue):D214-9. 2011.
eGIFT: mining gene information from the literature.
Tudor CO, Schmidt CJ, Vijay-Shanker K.
BMC Bioinformatics. 2010 Aug 9;11:418. doi: 10.1186/1471-2105-11-418. PubMed PMID: 20696046; PubMed Central PMCID: PMC2929241.
Phylogenomic analysis of marine Roseobacters.
Tang K, Huang H, Jiao N, Wu CH.
PLoS One. 5(7):e11604. 2010.
Document classification for mining host pathogen protein-protein interactions.
Yin L, Xu G, Torii M, Niu Z, Maisog JM, Wu C, Hu Z, Liu H.
Artif Intell Med. 49(3):155-60. 2010.
Molecular mechanisms mediating the effect of mono-(2-ethylhexyl) phthalate on hormone-stimulated steroidogenesis in MA-10 mouse tumor Leydig cells.
Fan J, Traore K, Li W, Amri H, Huang H, Wu C, Chen H, Zirkin B, Papadopoulos V.
Endocrinology. 151(7):3348-62. 2010.
Prediction of Catalytic Residues in Proteins Using a Consensus of Prediction (CoP) Approach.
Petrova NV, Wu CH.
IEEE International Conference on Bioinformatics and Bioengineering, bibe, 226-231. 2010.
A database developed from information extracted from chemotherapy drug package inserts to enhance future prescriptions.
D'Souza MK, Alabed GJ, Wheatley JM, Roberts N, Veturi Y, Bi X, Continisio CH.
CISE2011, IEEE Conference Record #17768; IEEE Catalog Number: CFP1160F-PRT; ISBN: 978-1-4244-8361-7. 2010.
From protein sequences to 3D-structures and beyond: the example of the UniProt knowledgebase.
Hinz U; UniProt Consortium.
Cell Mol Life Sci. 67(7):1049-64. 2010.
Community annotation in biology.
Mazumder R, Natale DA, Julio JA, Yeh LS, Wu CH.
Biol Direct. 5:12. 2010.
Protein Bioinformatics Infrastructure for the Integration and Analysis of Multiple High-Throughput omics Data.
Chen C, McGarvey PB, Huang H, Wu CH.
Adv Bioinformatics. 2010; 2010:423589. 2010.
The Universal Protein Resource (UniProt) in 2010.
UniProt Consortium.
Nucleic Acids Res. 38(Database issue):D142-8. 2010.
Systems integration of biodefense omics data for analysis of pathogen-host interactions and identification of potential targets.
McGarvey PB, Huang H, Mazumder R, Zhang J, Chen Y, Zhang C, Cammer S, Will R, Odle M, Sobral B, Moore M, Wu CH.
PLoS One. 4(9):e7162. 2009.
Sequence signatures in envelope protein may determine whether flaviviruses produce hemorrhagic or encephalitic syndromes.
Barker WC, Mazumder R, Vasudevan S, Sagripanti JL, Wu CH.
Virus Genes. 39(1):1-9. 2009.
Infrastructure for the life sciences: design and implementation of the UniProt website.
Jain E, Bairoch A, Duvaud S, Phan I, Redaschi N, Suzek BE, Martin MJ, McGarvey P, Gasteiger E.
BMC Bioinformatics. 10:136. 2009.
TGF-beta signaling proteins and the Protein Ontology.
Arighi CN, Liu H, Natale DA, Barker WC, Drabkin H, Blake JA, Smith B, Wu CH.
BMC Bioinformatics. 10 Suppl 5:S3. 2009.
BioTagger-GM: a gene/protein name recognition system.
Torii M, Hu Z, Wu CH, Liu H.
J Am Med Inform Assoc. 16(2):247-55. 2009.
An improved ontological representation of dendritic cells as a paradigm for all cell types.
Masci AM, Arighi CN, Diehl AD, Lieberman AE, Mungall C, Scheuermann RH, Smith B, Cowell LG.
BMC Bioinformatics. 10:70. 2009.
InterPro: the integrative protein signature database.
Hunter S, Apweiler R, Attwood TK, Bairoch A, Bateman A, Binns D, Bork P, Das U, Daugherty L, Duquenne L, Finn RD, Gough J, Haft D, Hulo N, Kahn D, Kelly E, Laugraud A, Letunic I, Lonsdale D, Lopez R, Madera M, Maslen J, McAnulla C, McDowall J, Mistry J, Mitchell A, Mulder N, Natale D, Orengo C, Quinn AF, Selengut JD, Sigrist CJ, Thimma M, Thomas PD, Valentin F, Wilson D, Wu CH, Yeats C.
Nucleic Acids Res. 37(Database issue):D211-5. 2009.
The Universal Protein Resource (UniProt) 2009.
UniProt Consortium.
Nucleic Acids Res. 37(Database issue):D169-74. 2009.
Structure-guided comparative analysis of proteins: principles, tools, and applications for predicting function.
Mazumder R, Vasudevan S.
PLoS Comput Biol. 4(9):e1000151. 2008.
Integrated Bioinformatics for Radiation-Induced Pathway Analysis from Proteomics and Microarray Data.
Hu ZZ, Huang H, Cheema A, Jung M, Dritschilo A, Wu CH.
J Proteomics Bioinform. 1(2):47-60. 2008.
Protein Bioinformatics.
McGarvey P, Huang H, Wu CH.
in: Medical Applications of Mass Spectrometry. Part III Biomolecules, Chapter 10:203-222. K Vekey, A Telekes, A Vertes (Eds.) Elsevier Science. 2008.
Protein functional annotation by homology.
Mazumder R, Vasudevan S, Nikolskaya AN.
Methods Mol Biol. 484:465-90. 2008.
An emerging cyberinfrastructure for biodefense pathogen and pathogen-host data.
Zhang C, Crasta O, Cammer S, Will R, Kenyon R, Sullivan D, Yu Q, Sun W, Jha R, Liu D, Xue T, Zhang Y, Moore M, McGarvey P, Huang H, Chen Y, Zhang J, Mazumder R, Wu C, Sobral B.
Nucleic Acids Res. 36(Database issue):D884-91. 2008.
Bioinformatic Databases.
Herbert KG, Spirollari J, Wang JTL, Piel WH, Westbrook J, Barker WC, Hu ZZ, Wu CH.
in: Wiley Encyclopedia of Computer Science and Engineering (Cassie Craig Assistant Editor), John Wiley & Sons, Ltd. 2007.
A comparison study on algorithms of detecting long forms for short forms in biomedical text.
Torii M, Hu ZZ, Song M, Wu CH, Liu H.
BMC Bioinformatics. 8 Suppl 9:S5. 2007.
Framework for a protein ontology.
Natale DA, Arighi CN, Barker WC, Blake J, Chang TC, Hu Z, Liu H, Smith B, Wu CH.
BMC Bioinformatics. 8 Suppl 9:S1. 2007.
Computational analysis and identification of amino acid sites in dengue E proteins relevant to development of diagnostics and vaccines.
Mazumder R, Hu ZZ, Vinayaka CR, Sagripanti JL, Frost SD, Kosakovsky Pond SL, Wu CH.
Virus Genes. 35(2):175-86. 2007.
Integration of bioinformatics resources for functional analysis of gene expression and proteomic data.
Huang H, Hu ZZ, Arighi CN, Wu CH.
Front Biosci. 12:5071-88. 2007.
Identification of Sensory and Signal-Transducing Domains in Two-Component Signaling Systems.
Galperin MY, Nikolskaya AN.
Methods in Enzymology 422:47-74. 2007.
UniRef: comprehensive and non-redundant UniProt reference clusters.
Suzek BE, Huang H, McGarvey P, Mazumder R, Wu CH.
Bioinformatics. 23(10):1282-8. 2007.
Challenges and solutions in proteomics.
Huang H, Shukla HD, Cathy W, Satya S.
Curr Genomics. 8(1):21-8. 2007.
PIRSF family classification system for protein functional and evolutionary analysis.
Nikolskaya AN, Arighi CN, Huang H, Barker WC, Wu CH.
Evol Bioinform Online. 2:197-209. 2007.
Dependence network modeling for biomarker identification.
Qiu P, Wang ZJ, Liu KJ, Hu ZZ, Wu CH.
Bioinformatics. 23(2):198-206. 2007.
Comparative Bioinformatics Analyses and Profiling of Lysosome-Related Organelle Proteomes.
Hu ZZ, Valencia JC, Huang H, Chi A, Shabanowitz J, Hearing VJ, Appella E, Wu C.
Int J Mass Spectrom. 259(1-3):147-160. 2007.
New developments in the InterPro database.
Mulder NJ, Apweiler R, Attwood TK, Bairoch A, Bateman A, Binns D, Bork P, Buillard V, Cerutti L, Copley R, Courcelle E, Das U, Daugherty L, Dibley M, Finn R, Fleischmann W, Gough J, Haft D, Hulo N, Hunter S, Kahn D, Kanapin A, Kejariwal A, Labarga A, Langendijk-Genevaux PS, Lonsdale D, Lopez R, Letunic I, Madera M, Maslen J, McAnulla C, McDowall J, Mistry J, Mitchell A, Nikolskaya AN, Orchard S, Orengo C, Petryszak R, Selengut JD, Sigrist CJ, Thomas PD, Valentin F, Wilson D, Wu CH, Yeats C.
Nucleic Acids Res. 35(Database issue):D224-8. 2007.
The Universal Protein Resource (UniProt).
UniProt Consortium.
Nucleic Acids Res. 35(Database issue):D193-7. 2007.
Proteomic and bioinformatic characterization of the biogenesis and function of melanosomes.
Chi A, Valencia JC, Hu ZZ, Watabe H, Yamaguchi H, Mangini NJ, Huang H, Canfield VA, Cheng KC, Yang F, Abe R, Yamagishi S, Shabanowitz J, Hearing VJ, Wu C, Appella E, Hunt DF.
J Proteome Res. 5(11):3135-44. 2006.
Substring selection for biomedical document classification.
Han B, Obradovic Z, Hu ZZ, Wu CH, Vucetic S.
Bioinformatics. 22(17):2136-42. 2006.
Quantitative assessment of dictionary-based protein named entity tagging.
Liu H, Hu ZZ, Torii M, Wu C, Friedman C.
J Am Med Inform Assoc. 13(5):497-507. 2006.
An online literature mining tool for protein phosphorylation.
Yuan X, Hu ZZ, Wu HT, Torii M, Narayanaswamy M, Ravikumar KE, Vijay-Shanker K, Wu CH.
Bioinformatics. 22(13):1668-9. 2006.
Prediction of catalytic residues using Support Vector Machine with selected protein sequence and structural properties.
Petrova NV, Wu CH.
BMC Bioinformatics. 7:312. 2006.
BioThesaurus: a web-based thesaurus of protein and gene names.
Liu H, Hu ZZ, Zhang J, Wu C.
Bioinformatics. 22(1):103-5. 2006.
The Universal Protein Resource (UniProt): an expanding universe of protein information.
Wu CH, Apweiler R, Bairoch A, Natale DA, Barker WC, Boeckmann B, Ferro S, Gasteiger E, Huang H, Lopez R, Magrane M, Martin MJ, Mazumder R, O'Donovan C, Redaschi N, Suzek B.
Nucleic Acids Res. 34(Database issue):D187-91. 2006.
Computational identification of strain-, species- and genus-specific proteins.
Mazumder R, Natale DA, Murthy S, Thiagarajan R, Wu CH.
BMC Bioinformatics. 6:279. 2005.
Large-scale, classification-driven, rule-based functional annotation of proteins.
Natale DA, Vinayaka CR, Wu CH.
in: Encyclopedia of Genetics, Genomics, Proteomics and Bioinformatics, Part 4 Bioinformatics, Section 3 Protein Function and Annotation, Chapter 36. S Subramaniam (Ed.), John Wiley & Sons, Ltd. 2005.
DynGO: a tool for visualizing and mining of Gene Ontology and its associations.
Liu H, Hu ZZ, Wu CH.
BMC Bioinformatics. 6:201. 2005.
Literature mining and database annotation of protein phosphorylation using a rule-based system.
Hu ZZ, Narayanaswamy M, Ravikumar KE, Vijay-Shanker K, Wu CH.
Bioinformatics. 21(11):2759-65. 2005.
Plant protein annotation in the UniProt Knowledgebase.
Schneider M, Bairoch A, Wu CH, Apweiler R.
Plant Physiol. 138(1):59-66. 2005.
The PIR superfamily (PIRSF) classification system.
Barker WC, Mazumder R, Nikolskaya A, Wu CH.
in: Encyclopedia of Genetics, Genomics, Proteomics and Bioinformatics, Part 3 Proteomics, Section 6 Proteome Families, Chapter 87. MJ Dunn (Ed.), John Wiley & Sons, Ltd. 2005.
Protein name tagging guidelines: lessons learned.
Mani I, Hu Z, Jang SB, Samuel K, Krause M, Phillips J, Wu CH.
Comp Funct Genomics. 6(1-2):72-6. 2005.
InterPro, progress and status in 2005.
Mulder NJ, Apweiler R, Attwood TK, Bairoch A, Bateman A, Binns D, Bradley P, Bork P, Bucher P, Cerutti L, Copley R, Courcelle E, Das U, Durbin R, Fleischmann W, Gough J, Haft D, Harte N, Hulo N, Kahn D, Kanapin A, Krestyaninova M, Lonsdale D, Lopez R, Letunic I, Madera M, Maslen J, McDowall J, Mitchell A, Nikolskaya AN, Orchard S, Pagni M, Ponting CP, Quevillon E, Selengut J, Sigrist CJ, Silventoinen V, Studholme DJ, Vaughan R, Wu CH.
Nucleic Acids Res. 33(Database issue):D201-5. 2005.
The Universal Protein Resource (UniProt).
Bairoch A, Apweiler R, Wu CH, Barker WC, Boeckmann B, Ferro S, Gasteiger E, Huang H, Lopez R, Magrane M, Martin MJ, Natale DA, O'Donovan C, Redaschi N, Yeh LS.
Nucleic Acids Res. 33(Database issue):D154-9. 2005.
Family classification and integrative associative analysis for protein functional annotation.
Huang H, Nikolskaya AN, Vinayaka CR, Chung S, Zhang J, Wu CH.
in: Trends in Bioinformatics Research. Chapter II:33-57. PV Yan (Ed.), Nova Science Publishers, Inc. 2005.
Information flow and data integration of databanks.
Wu CH, Barker WC.
in: Database Annotation in Molecular Biology:Principles and Practice. III Database Design and Integration, Chapter 11:187-201. AM Lesk (Ed.), John Wiley & Sons, Ltd. 2005.
Annotation of protein sequences.
Wu CH, Barker WC.
in: Database Annotation in Molecular Biology:Principles and Practice. II The Basis for Annotation, Chapter 8:131-147. AM Lesk (Ed.), John Wiley & Sons, Ltd. 2005.
iProLINK: an integrated protein resource for literature mining.
Hu ZZ, Mani I, Hermoso V, Liu H, Wu CH.
Comput Biol Chem. 28(5-6):409-16. 2004.
Gene and protein profiling of the response of MA-10 Leydig tumor cells to human chorionic gonadotropin.
Li W, Amri H, Huang H, Wu C, Papadopoulos V.
J Androl. 25(6):900-13. 2004.
A family classification approach to functional annotation of proteins.
Wu CH, Barker WC.
in: The Practical Bioinformatician. Chapter 19:417-434. L Wong (Ed.), World Scientific. 2004.
Update on genome completion and annotations: Protein Information Resource.
Wu C, Nebert DW.
Hum Genomics. 1(3):229-33. 2004.
The iProClass integrated database for protein functional analysis.
Wu CH, Huang H, Nikolskaya A, Hu Z, Barker WC.
Comput Biol Chem. 28(1):87-96. 2004.
Protein sequence databases.
Apweiler R, Bairoch A, Wu CH.
Curr Opin Chem Biol. 8(1):76-80. 2004.
UniProt: the Universal Protein knowledgebase.
Apweiler R, Bairoch A, Wu CH, Barker WC, Boeckmann B, Ferro S, Gasteiger E, Huang H, Lopez R, Magrane M, Martin MJ, Natale DA, O'Donovan C, Redaschi N, Yeh LS.
Nucleic Acids Res. 32(Database issue):D115-9. 2004.
PIRSF: family classification system at the Protein Information Resource.
Wu CH, Nikolskaya A, Huang H, Yeh LS, Natale DA, Vinayaka CR, Hu ZZ, Mazumder R, Kumar S, Kourtesis P, Ledley RS, Suzek BE, Arminski L, Chen Y, Zhang J, Cardenas JL, Chung S, Castro-Alvear J, Dinkov G, Barker WC.
Nucleic Acids Res. 32(Database issue):D112-4. 2004.
Protein family classification and functional annotation.
Wu CH, Huang H, Yeh LS, Barker WC.
Comput Biol Chem. 27(1):37-47. 2003.
iProClass: an integrated database of protein family, function and structure information.
Huang H, Barker WC, Chen Y, Wu CH.
Nucleic Acids Res. 31(1):390-2. 2003.
The Protein Information Resource.
Wu CH, Yeh LS, Huang H, Arminski L, Castro-Alvear J, Chen Y, Hu Z, Kourtesis P, Ledley RS, Suzek BE, Vinayaka CR, Zhang J, Barker WC.
Nucleic Acids Res. 31(1):345-7. 2003.
Accomplishments and challenges in literature data mining for biology.
Hirschman L, Park JC, Tsujii J, Wong L, Wu CH.
Bioinformatics. 18(12):1553-61. 2002.
The Protein Information Resource: an integrated public resource of functional annotation of proteins.
Wu CH, Huang H, Arminski L, Castro-Alvear J, Chen Y, Hu ZZ, Ledley RS, Lewis KC, Mewes HW, Orcutt BC, Suzek BE, Tsugita A, Vinayaka CR, Yeh LS, Zhang J, Barker WC.
Nucleic Acids Res. 30(1):35-7. 2002.
The RESID database of protein structure modifications: 2000 update.
Garavelli JS.
Nucleic Acids Res. 28(1):209-11. 2000.
PIR-ALN: a database of protein sequence alignments.
Srinivasarao GY, Yeh LS, Marzec CR, Orcutt BC, Barker WC.
Bioinformatics. 15(5):382-90. 1999.
Analysis and organization of protein sequence data: a retrospective spanning four decades.
Barker WC, Hunt LT.
J Protein Chem. 16(5):459-62. 1997.
Superfamily classification in PIR-International Protein Sequence Database.
Barker WC, Pfeiffer F, George DG.
Methods Enzymol. 266:59-71. 1996.
Mutation data matrix and its uses.
George DG, Barker WC, Hunt LT.
Methods Enzymol. 183:333-51. 1990.
Identifying domains in protein sequences.
Barker WC, Hunt LT, George DG.
Protein Seq Data Anal. 1(5):363-73. 1988.
A domain structure common to hemopexin, vitronectin, interstitial collagenase, and a collagenase homolog.
Hunt LT, Barker WC, Chen HR.
Protein Seq Data Anal. 1(1):21-6. 1987.
Evolution of prokaryote and eukaryote lines inferred from sequence evidence.
Hunt LT, George DG, Yeh LS, Dayhoff MO.
Orig Life. 14(1-4):657-64. 1984.
Establishing homologies in protein sequences.
Dayhoff MO, Barker WC, Hunt LT.
Methods Enzymol. 91:524-45. 1983.
Inferences from protein and nucleic acid sequences: early molecular evolution, divergence of kingdoms and rates of change.
Dayhoff MO, Barker WC, McLaughlin PJ.
Orig Life. 5(3):311-30. 1974.


Documents/Bulletins
PIRSF
A Proposal for the PIRSF Classification System (2003)
PIR Guidelines for Assigning Names to PIRSFs (2004)
Protein name tagging guidelines
PIR Guidelines for Protein Name Tagging Version 1.0 (2003)
PIR Guidelines for Protein Name Tagging Version 2.0 (2004)
Guide for Feature Annotations
Features Document
Database Definition Document
PIR-International Protein Sequence Database (PSD) Database Definition Document: The Protein Sequence Component (1994)
ATLAS User's Guide
Guide to the Atlas of Protein and Genomic Sequences on CD-ROM (1996)


Last Updated 04/10/2015

©2018 Protein Information Resource