A. Kulyyassov

National Center for Biotechnology, 13/5, Korgalzhyn road, Nur-Sultan, 010000, Kazakhstan


Protein sequences are stored in public databases such as the UniProt Knowledgebase (UniProtKB), where curators add bioinformatic annotation, including predicted structure and function of biomolecules as well as experimental results. Protein function can be predicted by sequence similarity searches; an alternative approach is to use protein signatures, which classify proteins into families and domains. The main protein signature databases are accessible through the integrated InterPro database, which provides a classification of UniProtKB sequences. Beyond characterizing proteins through protein families, many researchers are interested in analyzing the complete set of proteins encoded by a genome (i.e., the proteome), and there are databases and resources that provide non-redundant proteome sets and analyses of proteins from organisms with fully sequenced genomes. This article reviews the tools and resources available on the Internet for characterizing individual proteins and for analyzing entire proteomes.


Association-Rule-Based Annotator (ARBA), European Bioinformatics Institute (EBI), European Molecular Biology Laboratory (EMBL), DNA Data Bank of Japan (DDBJ), Gene Ontology Annotation (GOA), Global Proteome Machine (GPM), mass spectrometry (MS), proteomics, liquid chromatography tandem mass spectrometry (LC-MS/MS), multiple reaction monitoring (MRM), National Institutes of Health (NIH), Protein Data Bank (PDB), PRoteomics IDEntifications (PRIDE), Protein Information Resource (PIR), post-translational modification (PTM), Swiss Institute of Bioinformatics (SIB), Universal Protein Resource (UniProt), UniProt Archive (UniParc), UniProt Knowledgebase (UniProtKB), UniProt Reference Clusters (UniRef)

