Results 1 - 10 of 20
Learning structured prediction models: a large margin approach
, 2004
Cited by 177 (9 self)
We consider large margin estimation in a broad range of prediction models where inference involves solving combinatorial optimization problems, for example, weighted graph-cuts or matchings. Our goal is to learn parameters such that inference using the model reproduces correct answers on the training data. Our method relies on the expressive power of convex optimization problems to compactly capture inference or solution optimality in structured prediction models. Directly embedding this structure within the learning formulation produces concise convex problems for efficient estimation of very complex and diverse models. We describe experimental results on a matching task, disulfide connectivity prediction, showing significant improvements over state-of-the-art methods.
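The abstract above treats inference as a combinatorial optimization problem such as matching. As a rough illustration of the inference step only (not the learning formulation), the sketch below picks the highest-scoring perfect matching among a handful of cysteines by brute-force enumeration. The function names and the score matrix are hypothetical, and the paper itself does not prescribe this enumeration.

```python
def perfect_matchings(items):
    """Yield every way to split `items` (even length) into unordered pairs."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for matching in perfect_matchings(remaining):
            yield [(first, partner)] + matching


def best_matching(scores):
    """Return the perfect matching with the largest total pair score.

    `scores[a][b]` is an assumed, already-learned score for bonding
    cysteine a with cysteine b (symmetric, illustrative values only).
    """
    nodes = list(range(len(scores)))
    return max(
        perfect_matchings(nodes),
        key=lambda m: sum(scores[a][b] for a, b in m),
    )


# Hypothetical pairwise scores for four cysteines.
scores = [
    [0, 3, 1, 2],
    [3, 0, 2, 1],
    [1, 2, 0, 4],
    [2, 1, 4, 0],
]
print(best_matching(scores))
```

Enumeration grows very quickly with the number of cysteines, so it is only meant to make the objective concrete; a polynomial-time matching algorithm would be the practical choice.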
Scratch: a protein structure and structural feature prediction server
 Nucleic Acids Res
, 2005
Structured prediction via the extragradient method
 In Advances in
, 2006
Cited by 29 (2 self)
We present a simple and scalable algorithm for large-margin estimation of structured models, including an important class of Markov networks and combinatorial models. We formulate the estimation problem as a convex-concave saddle-point problem and apply the extragradient method, yielding an algorithm with linear convergence using simple gradient and projection calculations. The projection step can be solved using combinatorial algorithms for min-cost quadratic flow. This makes the approach an efficient alternative to formulations based on reductions to a quadratic program (QP). We present experiments on two very different structured prediction tasks: 3D image segmentation and word alignment, illustrating the favorable scaling properties of our algorithm.
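To make the extragradient idea concrete, here is a minimal sketch on a toy convex-concave problem, f(x, y) = x * y over the box [-1, 1]^2, whose saddle point is (0, 0). The toy objective, the step size, and the function names are assumptions chosen for illustration; the paper applies the method to structured models, where the projection is a min-cost flow computation rather than a simple clip.

```python
def clip(v, lo=-1.0, hi=1.0):
    """Euclidean projection of a scalar onto the interval [lo, hi]."""
    return max(lo, min(hi, v))


def extragradient(x, y, step=0.5, iters=200):
    """Approximate the saddle point of f(x, y) = x * y on [-1, 1]^2.

    x is the minimizing variable and y the maximizing one, so
    df/dx = y drives a descent step and df/dy = x an ascent step.
    """
    for _ in range(iters):
        # Prediction: a projected gradient step from the current point.
        x_half = clip(x - step * y)
        y_half = clip(y + step * x)
        # Correction: step again from the current point, but using the
        # gradients evaluated at the predicted point.
        x_new = clip(x - step * y_half)
        y_new = clip(y + step * x_half)
        x, y = x_new, y_new
    return x, y


x_star, y_star = extragradient(1.0, 1.0)
print(x_star, y_star)
```

On this bilinear toy problem, a plain simultaneous gradient step tends to circle the saddle point, while the extra look-ahead evaluation pulls the iterates inward.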
Improving disulfide connectivity prediction with sequential distance between oxidized cysteines
, 2005
DISULFIND: a disulfide bonding state and cysteine connectivity prediction server
 Nucleic Acids Res
, 2006
Cited by 5 (1 self)
DISULFIND is a server for predicting the disulfide bonding state of cysteines and their disulfide connectivity starting from sequence alone. Optionally, disulfide connectivity can be predicted from sequence and a bonding state assignment given as input. The output is a simple visualization of the assigned bonding state (with confidence degrees) and the most likely connectivity patterns. The server is available at
Machine Learning Methods for Protein Structure Prediction
Cited by 1 (0 self)
Abstract: Machine learning methods are widely used in bioinformatics and computational and systems biology. Here, we review the development of machine learning methods for protein structure prediction, one of the most fundamental problems in structural biology and bioinformatics. Protein structure prediction is such a complex problem that it is often decomposed and attacked at four different levels: 1D prediction of structural features along the primary sequence of amino acids; 2D prediction of spatial relationships between amino acids; 3D prediction of the tertiary structure of a protein; and 4D prediction of the quaternary structure of a multiprotein complex. A diverse set of both supervised and unsupervised machine learning methods has been applied over the years to tackle these problems and has significantly contributed to advancing the state of the art of protein structure prediction. In this paper, we review the development and application of hidden Markov models, neural networks, support vector machines, Bayesian methods, and clustering methods in 1D, 2D, 3D, and 4D protein structure predictions. Index Terms: Bioinformatics, machine learning, protein folding, protein structure prediction.
PREDICTION OF OXIDATION STATES OF CYSTEINES AND DISULFIDE BRIDGES IN PROTEINS by
Knowledge of cysteine oxidation state and disulfide bond connectivity is of great importance to protein chemistry and 3D structures. This research aims to find the most relevant features for predicting cysteine oxidation states and the disulfide bond connectivity of proteins. Models predicting the oxidation states of cysteines are developed with machine learning techniques such as Support Vector Machines (SVMs) and Associative Neural Networks (ASNNs). A record-high prediction accuracy for oxidation state, 95%, is achieved by incorporating the oxidation states of N-terminus cysteines, the flanking sequences of cysteines, and global information on the protein chain (number of cysteines, chain length, amino acid composition, etc.) into the SVM encoding. This is 5% higher than current methods, which indicates that the oxidation states of amino-terminal cysteines help predict the oxidation states of other cysteines in the same protein chain. Satisfactory prediction results are also obtained with the newer and more inclusive SPX dataset, especially for chains with a higher number of cysteines. Compared to literature methods, our approach is a one-step prediction system, which is easier to implement and use. A side by side comparison of SVM and ASNN is
CONNECTIVITY
Most questions in proteomics require complex answers. Yet graph theory, supervised learning, and statistical models have decomposed complex questions into simple questions with simple answers. Experts in protein studies often address tasks that demand answers as complex as the questions. Such complex answers may consist of multiple factors that must be weighed against each other to arrive at a globally satisfactory and consistent solution. To predict calcium binding in proteins, we construct a global oxygen contact graph of a protein, apply a graph algorithm to find oxygen clusters of fixed size four, and then employ a geometric algorithm to judge whether each oxygen cluster is a calcium-binding site. We can also predict the locations of those sites. Furthermore, we construct a global oxygen contact graph that includes the oxygen-bonded carbon atoms of a protein, apply a graph algorithm to find the locally largest oxygen clusters, and design another geometric filter to exclude the non-calcium-binding oxygen clusters. In addition, we apply observed chemical properties as a chemical filter to recognize some non-calcium-binding oxygen clusters. In order to explore the characteristics of calcium-binding sites in proteins, we conduct a statistic
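The first pipeline stage described in this abstract (build an oxygen contact graph, then look for clusters of four oxygens) might be sketched as follows. This is only an illustration under stated assumptions: a "cluster" is interpreted here as a set of four mutually contacting atoms (a 4-clique), the distance cutoff is a made-up value, and the function names are hypothetical. The abstract does not specify any of these details, and the geometric and chemical filters are not modeled.

```python
from itertools import combinations
from math import dist


def contact_graph(coords, cutoff):
    """Connect two oxygen atoms when they lie within `cutoff` of each other.

    `coords` is a list of (x, y, z) tuples; `cutoff` is an assumed
    distance threshold, not a value taken from the source.
    """
    adjacency = {i: set() for i in range(len(coords))}
    for i, j in combinations(range(len(coords)), 2):
        if dist(coords[i], coords[j]) <= cutoff:
            adjacency[i].add(j)
            adjacency[j].add(i)
    return adjacency


def clusters_of_four(adjacency):
    """Return every group of four atoms that are all in mutual contact."""
    return [
        quad
        for quad in combinations(sorted(adjacency), 4)
        if all(b in adjacency[a] for a, b in combinations(quad, 2))
    ]


# Hypothetical coordinates: four nearby oxygens and one distant atom.
oxygens = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1), (10, 10, 10)]
graph = contact_graph(oxygens, cutoff=2.0)
print(clusters_of_four(graph))
```

Checking every group of four atoms is fine for a toy input but scales poorly; a real implementation would restrict the search to each atom's neighbors.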
Abstract
We present a simple and scalable algorithm for large-margin estimation of structured models, including an important class of Markov networks and combinatorial models. The estimation problem can be formulated as a quadratic program (QP) that exploits the problem structure to achieve a polynomial number of variables and constraints. However, off-the-shelf QP solvers scale poorly with problem and training sample size. We recast the formulation as a convex-concave saddle-point problem that allows us to use simple projection methods. We show that the projection step can be solved using combinatorial algorithms for min-cost convex flow. We provide linear convergence guarantees for our method and present experiments on two very different structured prediction tasks: 3D image segmentation and word alignment, illustrating the favorable scaling properties of our algorithm.