Frequent-Subsequence-Based Prediction of Outer Membrane Proteins (2003)

by Rong She, Fei Chen, Ke Wang, Martin Ester
In Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

Abstract:

A number of medically important disease-causing bacteria (collectively called Gram-negative bacteria) are noted for the extra "outer" membrane that surrounds their cell. Proteins resident in this membrane (outer membrane proteins, or OMPs) are of primary research interest for antibiotic and vaccine design because they sit on the surface of the bacteria and are therefore the most accessible targets for new drugs. With the development of genome sequencing technology and bioinformatics, biologists can now deduce all the proteins likely to be produced by a given bacterium and have attempted to classify where proteins are located in a bacterial cell. However, such protein localization programs are currently least accurate when predicting OMPs, so a better OMP classifier is needed. Data mining research suggests that frequent patterns help in developing accurate and efficient classification algorithms. In this paper, we present two methods to identify OMPs based on frequent subsequences and test them on all Gram-negative bacterial proteins whose localizations have been determined by biological experiments. One classifier follows an association rule approach, while the other is based on support vector machines (SVMs). We compare the proposed methods with the state-of-the-art methods in the biological domain. The results demonstrate that our methods are better both at accurately identifying OMPs and at providing biological insights that increase our understanding of the structures and functions of these important proteins.
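The idea of using frequent subsequences as classification features can be illustrated with a minimal sketch. This is not the paper's algorithm: the abstract does not say how subsequences are defined or mined, so the sketch assumes contiguous substrings of a fixed length k, a simple minimum-support threshold, and made-up toy sequences. The function names `frequent_substrings` and `to_features` are hypothetical.

```python
from collections import Counter


def frequent_substrings(sequences, k, min_support):
    """Return length-k substrings found in at least a min_support fraction of sequences.

    Assumption: "subsequence" is simplified to a contiguous substring here.
    """
    counts = Counter()
    for seq in sequences:
        # Use a set so each substring counts once per sequence (support, not raw frequency).
        counts.update({seq[i:i + k] for i in range(len(seq) - k + 1)})
    threshold = min_support * len(sequences)
    return {s for s, c in counts.items() if c >= threshold}


def to_features(seq, patterns):
    """Binary feature vector: 1 if a pattern occurs in seq, else 0 (patterns in sorted order)."""
    return [1 if p in seq else 0 for p in sorted(patterns)]


# Made-up toy strings standing in for OMP sequences; not real protein data.
omp_like = ["AVGYTFKL", "GYTFAAKL", "MMGYTFQQ"]
patterns = frequent_substrings(omp_like, k=3, min_support=1.0)
features = to_features("KKGYTFKK", patterns)
```

Binary vectors of this kind could then be passed to an SVM, or the patterns themselves could serve as the antecedents of association rules, which are the two classifier families the abstract names.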
