10/16/01 - Dr. Wu appears in The Scientist
Cathy Wu at the Crossroads: She saved the Protein Information Resource database and now aims to restore it to the world's best
|
Primary Expertise |
Dr. Wu has conducted bioinformatics research since 1990 and developed several protein classification
systems and databases. She has managed large software and database projects, led the bioinformatics
effort of the Protein Information Resource (PIR) since 1999, and became the PIR Director in 2001. Her research interests include
protein family classification and functional annotation, biological data integration, and literature
mining.
|
Academic Appointments |
1989-1994 |
Assistant Professor, Department of Computer Science, University of Texas at Tyler |
1990-1999 |
Assistant Professor (90-94); Associate Professor (94-98); Professor (98-99)
of Biomathematics
University of Texas Health Center at Tyler |
1999-2002 |
Director of Bioinformatics, PIR (99-02); Vice President (00-02), National Biomedical
Research Foundation, Washington, D.C. |
2001-present |
Professor, Department of Biochemistry & Molecular Biology; Director, PIR,
Georgetown University Medical Center (GUMC) |
2002-present |
Professor, Department of Oncology; Member, Lombardi Comprehensive Cancer Center, GUMC |
Professional Activities |
 |
Member, Advisory Committee, Protein Structure Initiative, NIGMS, NIH (2002-present). |
 |
Member, Board of Directors, International Society for Computational Biology (2002-2004). |
 |
Over 15 Conference Organizing/Program Committees, including: ISMB, PSB, EITC, CBGI, BIOKDD |
 |
Over 20 Grant Review Panels/Study Sections at NIH, NSF, and DOE |
 |
Over 70 Invited Presentations/Lectures at international conferences, workshops, academia, and industry |
Education |
 |
B.S., Plant Pathology, National Taiwan University, Taiwan, 1978 |
 |
M.S., Plant Pathology, Purdue University, W. Lafayette, IN. 1982 |
 |
Ph.D., Molecular Plant Pathology, Purdue University, W. Lafayette, IN. 1984 |
 |
Post. Doc., Molecular Biology, Michigan State University, E. Lansing, MI, 1986 |
 |
M.S., Computer Science, University of Texas at Tyler, Tyler, TX. 1989 |
Patent |
United States Patent No. 5,845,049, December 1, 1998, C. H. Wu. A neural network system with n-gram term weighting method for molecular sequence classification and motif identification |
Publications |
|
 | |
BOOK: Wu, C. H. and McLarty, J. M. (2000). Neural Networks and Genome Informatics. Methods in Computational Biology and Biochemistry,
Volume 1, Series Editor A. K. Konopka, Elsevier Science. ISBN 0 08 042800 2 |
|
Mazumder R., Hu Z.Z., Vinayaka C.R., Sagripanti J.L., Frost S.D., Kosakovsky Pond S.L. and Wu C.H. (2007). Computational analysis and identification of amino acid sites in dengue E proteins relevant to development of diagnostics and vaccines. Virus Genes [EPub ahead of print]. |
Huang H., Hu Z.Z., Arighi C.N., Wu C.H. (2007). Integration of bioinformatics resources for functional analysis of gene expression and proteomic data. Front Biosci., 12: 5071-5088. |
Suzek B.E., Huang H., McGarvey P., Mazumder R., Wu C.H. (2007). UniRef: comprehensive and non-redundant UniProt reference clusters.
Bioinformatics, 23(10):1282-1288. |
Huang H., Shukla H., Saxena S., Wu C.H. (2007). Challenges and solutions in proteomics.
Current Genomics, 8 (in press). |
Mulder N.J., Apweiler R., Attwood T.K., Bairoch A., et al., Wu C.H., Yates C. (2007). New developments in the InterPro database.
Nucleic Acids Res., 35(Database issue):D224-228. |
UniProt Consortium (2007). The Universal Protein Resource (UniProt).
Nucleic Acids Res., 35(Database issue):D193-7. |
Torii M., Liu H.F., Hu Z.Z. and Wu C.H. (2006). A comparison study of biomedical short form definition detection algorithms. Proceedings of ACM First International Workshop on Text Mining in Bioinformatics, TMBIO 2006. |
Natale D.A., Arighi C.N., Barker W., Blake J., Chang T., Hu Z.Z., Liu
H., Smith B., Wu C.H. (2006). Framework for a Protein Ontology. Proceedings of ACM First International Workshop on Text Mining in Bioinformatics, TMBIO 2006. |
Qiu P., Wang J., Ray Liu K.J., Hu Z.Z., Wu C.H. (2006). Dependence network modeling for biomarker identification.
Bioinformatics, 23:198-206. |
Hu Z.Z., Valencia J.C., Huang H., Chi A., Shabanowitz J., Hearing V.J., Appella
E., Wu C.H. (2006). Comparative
bioinformatics analyses and profiling of lysosome-related organelle
proteomes. Int J Mass Spec, 259:147-160. |
Chi A., Valencia J.C., Hu Z.Z., Watabe H., Yamaguchi H., Mangini N.J., Huang H.,
Canfield V.A., Cheng K.C., Yang F., Abe R., Yamagishi S., Shabanowitz J.,
Hearing V.J., Wu C.H., Appella E., Hunt D.F. (2006). Proteomic and Bioinformatic Characterization of the Biogenesis and
Function of Melanosomes. J Proteome Res, 5:3135-3144. |
Liu H., Hu Z.Z., Torii M., Wu C.H., Friedman C. (2006). Quantitative Assessment of Dictionary-based Protein Named Entity Tagging. J Am Med Inform Assoc, 13:497-507. |
Han B., Obradovic Z., Hu Z.Z., Wu C.H., Vucetic S. (2006). Substring selection for biomedical document classification. Bioinformatics, 22:2136-42.
|
Petrova N.V., Wu C.H. (2006). Prediction of catalytic residues using Support Vector Machine with selected protein sequence and structural properties. BMC Bioinformatics, 7:312.
|
Yuan X., Hu Z.Z., Wu H.T., Torii M., Narayanaswamy M., Ravikumar K.E., Vijay-Shanker
K., Wu C.H. (2006). An online literature mining tool for protein phosphorylation. Bioinformatics, 22(13):1668-1669.
|
Nikolskaya A.N., Arighi C.N., Huang H., Barker W.C., Wu C.H. (2006). PIRSF Family Classification System for Protein Functional and Evolutionary Analysis. Evolutionary Bioinformatics Online, 2:209-221. |
Liu, H.F., Hu, Z.Z., Zhang, J., Wu, C.H. (2006). BioThesaurus: a web-based thesaurus of protein and gene names.
Bioinformatics, 22, 103-105.
|
Wu, C.H., Apweiler, R., Bairoch, A., Natale, D.A., Barker, W.C., Boeckmann, B., Ferro, S.,
Gasteiger, E., Huang, H., Lopez, R., Magrane, M., Martin, M.J., Mazumder, R., O'Donovan, C.,
Redaschi, N., Suzek, B. (2006). The Universal Protein Resource (UniProt): an expanding universe of protein
information.
Nucleic Acids Research, 34, D187-91. |
Liu, H., Hu, Z.Z., Wu, C.H. (2005). DynGO: a tool for visualizing and mining of Gene Ontology and its associations.
BMC Bioinformatics, 6, 201.
|
Mazumder, R., Natale, D., Murthy, S., Thiagarajan, R., Wu, C.H. (2005). Computational identification of strain-, species- and genus-specific proteins.
BMC Bioinformatics, 6, 279. |
Schneider, M., Bairoch, A., Wu, C.H., Apweiler, R. (2005). Plant Protein Annotation in the UniProt Knowledgebase.
Plant Physiology, 138, 59-66.
|
Hu, Z.Z., Narayanaswamy, M., Ravikumar, K.E., Vijay-Shanker, K., Wu, C.H. (2005). Literature mining and database annotation of protein
phosphorylation using a rule-based system.
Bioinformatics, 21(11), 2759-2765.
|
Mani I., Hu Z., Jang S.B., Samuel K., Krause M., Phillips J., Wu C.H. (2005).
Protein name tagging guidelines: lessons learned.
Comparative and Functional Genomics, 6(1-2), 72-76.
|
Natale, D. A., Vinayaka, C. R. and Wu, C. H. (2005). Large-scale, classification-driven, rule-based functional annotation of
proteins. Wiley, New York.
|
Wu, C.H., Huang, H., Nikolskaya, A., Vinayaka, C. R., Chung, S., Zhang, J. (2005). Family Classification and Integrative Associative Analysis for Protein Functional Annotation. In Bioinformatics: New Research.
Nova Publishers, New York.
|
Bairoch, A., Apweiler, R., Wu, C. H., Barker, W. C., Boeckmann, B., Ferro, S., Gasteiger, E., Huang, H., Lopez, R., Magrane, M., Martin, M.J., Natale, D.A., O'Donovan, C., Redaschi, N., Yeh, L.S. (2005).
The Universal Protein Resource (UniProt). Nucleic Acids Research, 33: D154-159.
|
Wu, C. H. and Nebert, D. W. (2004). Update on human genome completion and annotations: Protein Information Resource.
Human Genomics, 1, 229-233. |
Wu, C. H., Huang, H., Nikolskaya, A., Hu, Z. and Barker, W. C. (2004). The iProClass integrated database for protein functional analysis.
Computational Biology and Chemistry, 28, 87-96. |
Wu, C. H., Nikolskaya, A., Huang, H., Yeh, L.-S., Natale, D., Vinayaka, C. R., Hu, Z.,
Mazumder, R., Kumar, S., Kourtesis, P., Ledley, R. S., Suzek, B. E., Arminski, L., Chen, Y.,
Zhang, J., Cardenas, J. L., Chung, S., Castro-Alvear, J., Dinkov, G. and Barker, W. C. (2004).
PIRSF family classification
system at the Protein Information Resource.
Nucleic Acids Research, 32, D112-114. |
Apweiler, R., Bairoch, A. and Wu, C. H. (2004).
Protein sequence
databases. Current Opinion in Chemical Biology, 8, 76-80. |
Apweiler, R., Bairoch, A., Wu, C. H., Barker, W. C., Boeckmann, B., Ferro, S., Gasteiger, E.,
Huang, H., Lopez, R., Magrane, M., Martin, M. J., Natale, D. A., O'Donovan, C., Redaschi, N.,
Yeh, L. S. (2004).
UniProt: Universal Protein
Knowledgebase. Nucleic Acids Research, 32, D115-119. |
Hu, Z., Mani, I., Hermoso, V., Liu, H. and Wu, C. H. (2004).
iProLINK: an integrated
protein resource for literature mining. Computational Biology and Chemistry, 28, 409-416. |
Wu, C. H., Yeh, L.-S., Huang, H., Arminski, L., Castro-Alvear, J., Chen, Y., Hu, Z.,
Kourtesis, P., Ledley, R. S., Suzek, B.E., Vinayaka, C.R., Zhang, J. and Barker, W.C. (2003).
The Protein
Information Resource. Nucleic Acids Research, 31, 345-347. |
Huang, H., Barker, W. C., Chen, Y. and Wu, C. H. (2003).
iProClass:
An Integrated Database of Protein Family, Function, and Structure Information.
Nucleic Acids Research, 31, 390-392. |
Wu, C. H., Huang, H., Yeh, L.-S. and Barker, W. C. (2003).
Protein family
classification and functional annotation.
Computational Biology and Chemistry, 27, 37-47. |
Wu, C. H., Huang, H., Arminski, L., Castro-Alvear, J., Chen, Y., Hu, Z.,
Ledley, R. S., Lewis, K. C., Mewes, H. W., Orcutt, B. C., Suzek, B. E., Tsugita, A.,
Vinayaka, C. R., Yeh, L. S., Zhang, J. and Barker, W. C. (2002).
The Protein Information Resource:
an integrated public resource of functional annotation of proteins.
Nucleic Acids Research, 30, 35-37. |
Wu, C.H., Xiao, C., Hou, Z., Huang, H., and Barker, W. C. (2001).
iProClass: An integrated and
comprehensive protein classification database.
Nucleic Acids Research, 29, 52-54. |
McGarvey, P., Huang, H., Barker, W. C., Orcutt, B. C. and Wu, C. H. (2000).
PIR Web site: New
resource for bioinformatics. Bioinformatics, 16, 290-291. |
Wu, C. H., Huang, H. and McLarty, J. (1999). Gene family identification
network design for protein sequence analysis. International Journal of
Artificial Intelligence Tools, 8, 419-432. |
Wu, C. H., Shivakumar, S. and Huang, H. (1999). ProClass protein family
database. Nucleic Acids Research, 27, 272-274. |
Barker, W. C., Garavelli, J. S., McGarvey, P. B., Marzec, C. R., Orcutt, B. C.,
Srinivasarao, G. Y., Yeh, L. S., Ledley, R. S., Mewes, H. W., Pfeiffer, F.,
Tsugita, A. and Wu, C. H. (1999). The PIR-International
Protein Sequence Database. Nucleic Acids Research, 27, 39-43. |
Wu, C. H., S. Shivakumar, C. V. Shivakumar and S. Chen. (1998). GeneFIND
web server for protein family identification and information retrieval. Bioinformatics,
14, 223-224. |
Wu, C. H. (1997). Artificial neural networks for molecular sequence
analysis. Computers & Chemistry, 21, 237-256. |
Wu, C. H., Chen, H. L. and Chen, S. (1997). Counter-propagation neural
networks for molecular sequence classification: Supervised LVQ and dynamic
node allocation. Applied Intelligence, 7, 27-38. |
Wu, C. H., Zhao, S. and Chen, H. L. (1996). A protein class database
organized with ProSite protein groups and PIR superfamilies. Journal of
Computational Biology, 3, 547-562. |
Wu, C. H., Zhao, S., Chen, H. L., Lo, C. J. and McLarty, J. (1996).
Motif identification neural design for rapid and sensitive protein family
search. CABIOS, 12, 109-118. |
Wu, C. H. (1996). Gene Classification Artificial Neural System.
Methods in Enzymology, 266, 71-88. |
Wu, C. H., Berry, M., Shivakumar, S. and McLarty, J. (1995). Neural
networks for full-scale protein sequence classification: Sequence encoding
with singular value decomposition. Machine Learning, 21, 177-193. |
Wu, C. H. and Shivakumar, S. (1994). Back-propagation and counter-propagation
neural networks for phylogenetic classification of ribosomal RNA sequences.
Nucleic Acids Research, 22, 4291-4299. |
Wu, C. H., Whitson, G., McLarty, J., Ermongkonchai, A. and Chang, T.
(1992). Protein classification artificial neural system. Protein Science,
1, 667-677. |
Wu, C. H., Caspar, T., Browse, J., Lindquist, S. and Somerville, C. (1988).
Characterization of an HSP70 cognate gene family in Arabidopsis. Plant
Physiology, 88, 731-740. |
Wu, C. H., Warren, H. L., Sitaraman, K. and Tsai, C. Y. (1988). Translational
alterations in maize leaves responding to pathogen infection, paraquat treatment
or heat shock. Plant Physiology, 86, 1323-1329. |
Revised 07/13/07
|