Multiclass multiple kernel learning
- In ICML. ACM
Cited by 26 (3 self)
In many applications it is desirable to learn from several kernels. “Multiple kernel learning” (MKL) allows the practitioner to optimize over linear combinations of kernels. By enforcing sparse coefficients, it also generalizes feature selection to kernel selection. We propose MKL for joint feature maps. This provides a convenient and principled way for MKL with multiclass problems. In addition, we can exploit the joint feature map to learn kernels on output spaces. We show the equivalence of several different primal formulations including different regularizers. We present several optimization methods, and compare a convex quadratically constrained quadratic program (QCQP) and two semi-infinite linear programs (SILPs) on toy data, showing that the SILPs are faster than the QCQP. We then demonstrate the utility of our method by applying the SILP to three real world datasets.
Significantly improved prediction of subcellular localization by integrating text and protein sequence data
- In Proc. of PSB ’06
, 2006
Cited by 8 (1 self)
Computational prediction of protein subcellular localization is a challenging problem. Several approaches have been presented during the past few years; some attempt to cover a wide variety of localizations, while others focus on a small number of localizations and on specific organisms. We present a comprehensive system, integrating protein sequence-derived data and text-based information. It is tested on three large data sets, previously used by leading prediction methods. The results demonstrate that our system performs significantly better than previously reported results, for a wide range of eukaryotic subcellular localizations.
Eukaryotic protein subcellular localization based on local pairwise profile alignment SVM
- In 2006 IEEE International Workshop on Machine Learning for Signal Processing (MLSP’06)
, 2006
Cited by 6 (5 self)
The subcellular locations of proteins are important functional annotations. An effective and reliable subcellular localization method is necessary for proteomics research. This paper introduces a new method—PairProSVM—to automatically predict the subcellular locations of proteins. The profiles of all protein sequences in the training set are constructed by PSI-BLAST and the pairwise profile-alignment scores are used to form feature vectors for training a support vector machine (SVM) classifier. It was found that PairProSVM outperforms the methods that are based on sequence alignment and amino-acid compositions even if most of the homologous sequences have been removed. PairProSVM was evaluated on Huang and Li’s and Gardy et al.’s protein datasets. The overall accuracies on these datasets reach 75.3% and 91.9%, respectively, which are higher than or comparable to those obtained by sequence alignment and composition-based methods. Index Terms — Protein subcellular localization; sequence alignment; profile alignment; kernel methods; support vector machines.
SherLoc: high-accuracy prediction of protein subcellular localization by integrating text and protein sequence data
, 2007
An automated combination of kernels for predicting protein subcellular localization.
, 2007
Cited by 3 (2 self)
Protein subcellular localization is a crucial ingredient to many important inferences about cellular processes, including prediction of protein function and protein interactions. While many predictive computational tools have been proposed, they tend to have complicated architectures and require many design decisions from the developer. Here we utilize the multiclass support vector machine (m-SVM) method to directly solve protein subcellular localization without resorting to the common approach of splitting the problem into several binary classification problems. We further propose a general class of protein sequence kernels which considers all motifs, including motifs with gaps. Instead of heuristically selecting one or a few kernels from this family, we utilize a recent extension of SVMs that optimizes over multiple kernels simultaneously. This way, we automatically search over families of possible amino acid motifs. We compare our automated approach to three other predictors on four different datasets, and show that we perform better than the current state of the art. Further, our method provides some insights as to which sequence motifs are most useful for determining subcellular localization, which are in agreement with biological reasoning. Data files, kernel matrices and open source software are available at
BioMed Central
, 2006
Cited by 2 (2 self)
A novel approach to phylogenetic tree construction using stochastic optimization and clustering
An automated combination of sequence motif kernels for predicting protein subcellular localization
, 2006
Cited by 1 (0 self)
Protein subcellular localization is a crucial ingredient to many important inferences about cellular processes, including prediction of protein function and protein interactions. While many predictive computational tools have been proposed, they tend to have complicated architectures and require many design decisions from the developer. We propose an elegant and fully automated approach to building a prediction system for protein subcellular localization. We propose a new class of protein sequence kernels which considers all motifs including motifs with gaps. This class of kernels allows the inclusion of pairwise amino acid distances into their computation. We further propose a multiclass support vector machine method which directly solves protein subcellular localization without resorting to the common approach of splitting the problem into several binary classification problems. To automatically search over families of possible amino acid motifs, we generalize our method to optimize over multiple kernels at the same time. We compare our automated approach to four other predictors on three different datasets.
Spectral and Semidefinite Relaxations of the CLUHSIC Algorithm
Cited by 1 (1 self)
CLUHSIC is a recent clustering framework that unifies the geometric, spectral and statistical views of clustering. In this paper, we show that the recently proposed discriminative view of clustering, which includes the DIFFRAC and DisKmeans algorithms, can also be unified under the CLUHSIC framework. Moreover, CLUHSIC involves integer programming and one has to resort to heuristics such as iterative local optimization. In this paper, we propose two relaxations that are much more disciplined. The first one uses spectral techniques while the second one is based on semidefinite programming (SDP). Experimental results on a number of structured clustering tasks show that the proposed method significantly outperforms existing optimization methods for CLUHSIC. Moreover, it can also be used in semi-supervised classification. Experiments on real-world protein subcellular localization data sets clearly demonstrate the ability of CLUHSIC in incorporating structural and evolutionary information.
Supervised Ensembles of Prediction Methods for Subcellular Localization
In the past decade, many automated prediction methods for the subcellular localization of proteins have been proposed, utilizing a wide range of principles and learning approaches. Based on an experimental evaluation of different methods and on their theoretical properties, we propose to combine a well balanced set of existing approaches to new, ensemble-based prediction methods. The experimental evaluation shows our ensembles to improve substantially over the underlying base methods.

