Results 1-10 of 145
Projected Subgradient Methods for Learning Sparse Gaussians
Cited by 62 (0 self)
Gaussian Markov random fields (GMRFs) are useful in a broad range of applications. In this paper we tackle the problem of learning a sparse GMRF in a high-dimensional space. Our approach uses the ℓ1-norm as a regularization on the inverse covariance matrix. We utilize a novel projected gradient method, which is faster than previous methods in practice and equal to the best performing of these in asymptotic complexity. We also extend the ℓ1-regularized objective to the problem of sparsifying entire blocks within the inverse covariance matrix. Our methods generalize fairly easily to this case, while other methods do not. We demonstrate that our extensions give better generalization performance on two real domains: biological network analysis and a 2D-shape modeling image task.
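For orientation, an ℓ1-regularized inverse covariance estimate is commonly written as the optimization problem below. This abstract does not state the paper's exact objective, so the symbols here (K for the inverse covariance matrix, S for the empirical covariance, λ for the regularization weight) are assumptions for illustration.

```latex
\max_{K \succ 0} \; \log\det K \;-\; \operatorname{tr}(S K) \;-\; \lambda \lVert K \rVert_1,
\qquad \lVert K \rVert_1 = \sum_{i,j} \lvert K_{ij} \rvert
```

Under this form, a larger λ drives more entries of K to zero, which corresponds to fewer edges in the GMRF.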
Structure learning in random fields for heart motion abnormality detection
In CVPR, 2008
Cited by 56 (8 self)
Coronary Heart Disease can be diagnosed by assessing the regional motion of the heart walls in ultrasound images of the left ventricle. Even for experts, ultrasound images are difficult to interpret, leading to high intra-observer variability. Previous work indicates that in order to approach this problem, the interactions between the different heart regions and their overall influence on the clinical condition of the heart need to be considered. To do this, we propose a method for jointly learning the structure and parameters of conditional random fields, formulating these tasks as a convex optimization problem. We consider block-L1 regularization for each set of features associated with an edge, and formalize an efficient projection method to find the globally optimal penalized maximum likelihood solution. We perform extensive numerical experiments comparing the presented method with related methods that approach the structure learning problem differently. We verify the robustness of our method on echocardiograms collected in routine clinical practice at one hospital.
Discriminative Structure and Parameter Learning for Markov Logic Networks
Cited by 56 (5 self)
Markov logic networks (MLNs) are an expressive representation for statistical relational learning that generalizes both first-order logic and graphical models. Existing methods for learning the logical structure of an MLN are not discriminative; however, many relational learning problems involve specific target predicates that must be inferred from given background information. We found that existing MLN methods perform very poorly on several such ILP benchmark problems, and we present improved discriminative methods for learning MLN clauses and weights that outperform existing MLN and traditional ILP methods.
Optimizing costly functions with simple constraints: A limited-memory projected quasi-Newton algorithm
Proc. of Conf. on Artificial Intelligence and Statistics, 2009
Cited by 52 (9 self)
An optimization algorithm for minimizing a smooth function over a convex set is described. Each iteration of the method computes a descent direction by minimizing, over the original constraints, a diagonal plus low-rank quadratic approximation to the function. The quadratic approximation is constructed using a limited-memory quasi-Newton update. The method is suitable for large-scale problems where evaluation of the function is substantially more expensive than projection onto the constraint set. Numerical experiments on one-norm regularized test problems indicate that the proposed method is competitive with state-of-the-art methods such as bound-constrained L-BFGS and orthant-wise descent. We further show that the method generalizes to a wide class of problems, and substantially improves on state-of-the-art methods for problems such as learning the structure of Gaussian graphical models and Markov random fields.
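To make the setting concrete, here is a minimal sketch of plain projected gradient descent for minimizing a smooth function over a simple convex set (a box). It is not the limited-memory projected quasi-Newton method the abstract describes, which replaces the gradient step with a constrained quadratic model; the function names, step size, and test problem are assumptions chosen for illustration.

```python
def project_box(x, lower, upper):
    """Euclidean projection of x onto the box [lower, upper]^n."""
    return [min(max(xi, lower), upper) for xi in x]


def projected_gradient(grad, x0, lower, upper, step=0.1, iters=200):
    """Repeat: take a gradient step, then project back onto the box."""
    x = project_box(x0, lower, upper)
    for _ in range(iters):
        g = grad(x)
        x = project_box([xi - step * gi for xi, gi in zip(x, g)], lower, upper)
    return x


# Hypothetical test problem: minimize (x0 - 2)^2 + (x1 + 1)^2 over [0, 1]^2.
def grad_quadratic(x):
    return [2.0 * (x[0] - 2.0), 2.0 * (x[1] + 1.0)]


solution = projected_gradient(grad_quadratic, [0.5, 0.5], 0.0, 1.0)
```

The unconstrained minimizer (2, -1) lies outside the box, so the iterates settle on the nearest feasible corner. Projection onto a box is cheap compared with a costly function evaluation, which is the regime the abstract targets.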
Bundle methods for machine learning
JMLR
Cited by 49 (10 self)
We present a globally convergent method for regularized risk minimization problems. Our method applies to Support Vector estimation, regression, Gaussian Processes, and any other regularized risk minimization setting which leads to a convex optimization problem. SVMPerf can be shown to be a special case of our approach. In addition to the unified framework we present tight convergence bounds, which show that our algorithm converges in O(1/ε) steps to precision ε for general convex problems and in O(log(1/ε)) steps for continuously differentiable problems. We demonstrate in experiments the performance of our approach.
Learning graphical model structure using L1-regularization paths
In Proceedings of the 21st Conference on Artificial Intelligence (AAAI), 2007
Cited by 42 (2 self)
Sparsity-promoting L1-regularization has recently been successfully used to learn the structure of undirected graphical models. In this paper, we apply this technique to learn the structure of directed graphical models. Specifically, we make three contributions. First, we show how the decomposability of the MDL score, plus the ability to quickly compute entire regularization paths, allows us to efficiently pick the optimal regularization parameter on a per-node basis. Second, we show how to use L1 variable selection to select the Markov blanket, before a DAG search stage. Finally, we show how L1 variable selection can be used inside of an order search algorithm. The effectiveness of these L1-based approaches is compared to current state-of-the-art methods on 10 datasets.
A feature-based approach to modeling protein-DNA interactions
In Proc. RECOMB’07, 2007
Cited by 38 (1 self)
Transcription factor (TF) binding to its DNA target site is a fundamental regulatory interaction. The most common model used to represent TF binding specificities is a position specific scoring matrix (PSSM), which assumes independence between binding positions. In many cases this simplifying assumption does not hold. Here, we present feature motif models (FMMs), a novel probabilistic method for modeling TF-DNA interactions, based on Markov networks. Our approach uses sequence features to represent TF binding specificities, where each feature may span multiple positions. We develop the mathematical formulation of our models, and devise an algorithm for learning their structural features from binding site data. We evaluate our approach on synthetic data, and then apply it to binding site and ChIP-chip data from yeast. We reveal sequence features that are present in the binding specificities of yeast TFs, and show that FMMs explain the binding data significantly better than PSSMs. Key words: transcription factor binding sites, DNA sequence motifs, probabilistic graphical models, Markov networks, motif finder.
Recovering occlusion boundaries from an image
In ICCV, 2007
Cited by 30 (3 self)
Occlusion reasoning is a fundamental problem in computer vision. In this paper, we propose an algorithm to recover the occlusion boundaries and depth ordering of free-standing structures in the scene. Rather than viewing the problem as one of pure image processing, our approach employs cues from an estimated surface layout and applies Gestalt grouping principles using a conditional random field (CRF) model. We propose a hierarchical segmentation process, based on agglomerative merging, that re-estimates boundary strength as the segmentation progresses. Our experiments on the Geometric Context dataset validate our choices for features, our iterative refinement of classifiers, and our CRF model. In experiments on the Berkeley Segmentation Dataset, PASCAL VOC 2008, and LabelMe, we also show that the trained algorithm generalizes to other datasets and can be used as an object boundary predictor with figure/ground labels.
Convex structure learning in log-linear models: Beyond pairwise potentials
In Proceedings of International Workshop on Artificial Intelligence and Statistics, 2010
Cited by 28 (2 self)
Previous work has examined structure learning in log-linear models with ℓ1-regularization, largely focusing on the case of pairwise potentials. In this work we consider the case of models with potentials of arbitrary order, but that satisfy a hierarchical constraint. We enforce the hierarchical constraint using group ℓ1-regularization with overlapping groups. An active set method that enforces hierarchical inclusion allows us to tractably consider the exponential number of higher-order potentials. We use a spectral projected gradient method as a subroutine for solving the overlapping group ℓ1-regularization problem, and make use of a sparse version of Dykstra's algorithm to compute the projection. Our experiments indicate that this model gives equal or better test set likelihood compared to previous models.
A Large-Deviation Analysis for the Maximum Likelihood Learning of Tree Structures
2009
Cited by 27 (17 self)
The problem of maximum-likelihood learning of the Markov tree structure of an unknown distribution from samples is considered when the distribution is Markov on a tree. Large-deviation analysis of the error in estimation of the set of edges of the tree is considered. Necessary and sufficient conditions are provided to ensure that this error probability decays exponentially. These conditions are based on the mutual information between each pair of variables being distinct from that of other pairs. The rate of error decay, which is the error exponent, is derived using the large-deviation principle. For a discrete distribution, the error exponent is approximated using Euclidean information theory, and is given by a ratio, interpreted as the signal-to-noise ratio (SNR) for learning. Extensions to the Gaussian case are also considered.
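The maximum-likelihood tree estimate for discrete data is the classical Chow-Liu construction: compute the empirical mutual information for every pair of variables and take a maximum-weight spanning tree. The sketch below illustrates that construction only; the function names and the toy data are assumptions, and it does not implement the error-exponent analysis from the paper.

```python
import math
from collections import Counter
from itertools import combinations


def mutual_information(samples, i, j):
    """Empirical mutual information (in nats) between columns i and j."""
    n = len(samples)
    joint = Counter((row[i], row[j]) for row in samples)
    pi = Counter(row[i] for row in samples)
    pj = Counter(row[j] for row in samples)
    return sum(
        (c / n) * math.log(c * n / (pi[a] * pj[b]))
        for (a, b), c in joint.items()
    )


def chow_liu_edges(samples):
    """Edges of a maximum-weight spanning tree under pairwise mutual information."""
    d = len(samples[0])
    weights = sorted(
        ((mutual_information(samples, i, j), i, j)
         for i, j in combinations(range(d), 2)),
        reverse=True,
    )
    parent = list(range(d))  # union-find forest for Kruskal's algorithm

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]  # path compression
            u = parent[u]
        return u

    edges = []
    for _, i, j in weights:
        ri, rj = find(i), find(j)
        if ri != rj:  # adding (i, j) does not create a cycle
            parent[ri] = rj
            edges.append((i, j))
    return edges


# Toy data (hypothetical): columns 0 and 1 are identical, column 2 is a noisy copy.
data = [
    (0, 0, 0), (0, 0, 0), (0, 0, 0), (0, 0, 1),
    (1, 1, 1), (1, 1, 1), (1, 1, 1), (1, 1, 0),
]
tree = chow_liu_edges(data)
```

In the toy data, the pairs (0, 2) and (1, 2) have equal mutual information, so which of them joins the tree is decided by tie-breaking. This is the degenerate situation that the distinctness condition in the abstract rules out.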