Results 1–10 of 12
Learning Latent Tree Graphical Models
 J. of Machine Learning Research
, 2011
Abstract

Cited by 19 (6 self)
We study the problem of learning a latent tree graphical model where samples are available only from a subset of variables. We propose two consistent and computationally efficient algorithms for learning minimal latent trees, that is, trees without any redundant hidden nodes. Unlike many existing methods, the observed nodes (or variables) are not constrained to be leaf nodes. Our algorithms can be applied to both discrete and Gaussian random variables and our learned models are such that all the observed and latent variables have the same domain (state space). Our first algorithm, recursive grouping, builds the latent tree recursively by identifying sibling groups using so-called information distances. One of the main contributions of this work is our second algorithm, which we refer to as CLGrouping. CLGrouping starts with a preprocessing procedure in which a tree over the observed variables is constructed. This global step groups the observed nodes that are likely to be close to each other in the true latent tree, thereby guiding subsequent recursive grouping (or equivalent procedures such as neighbor-joining) on much smaller subsets of variables. This results in more accurate and efficient learning of latent trees. We also present regularized versions of our algorithms that learn latent tree approximations of arbitrary distributions. We compare
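The preprocessing step that CLGrouping starts from, a maximum-weight spanning tree over the observed variables with empirical mutual-information edge weights (the Chow-Liu tree), can be sketched as follows. This is an illustrative reconstruction from the abstract, not the authors' code; the function names are chosen here for the sketch.

```python
from collections import Counter
from itertools import combinations
from math import log

def mutual_information(samples, i, j):
    """Empirical mutual information (in nats) between columns i and j
    of a list of discrete sample tuples."""
    n = len(samples)
    pij = Counter((row[i], row[j]) for row in samples)
    pi = Counter(row[i] for row in samples)
    pj = Counter(row[j] for row in samples)
    mi = 0.0
    for (a, b), c in pij.items():
        # p(a,b) * log( p(a,b) / (p(a) p(b)) ) with counts substituted
        mi += (c / n) * log(c * n / (pi[a] * pj[b]))
    return mi

def chow_liu_tree(samples, d):
    """Maximum-weight spanning tree over d observed variables with
    mutual-information edge weights, built with Kruskal's algorithm."""
    edges = sorted(((mutual_information(samples, i, j), i, j)
                    for i, j in combinations(range(d), 2)), reverse=True)
    parent = list(range(d))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x
    tree = []
    for w, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            tree.append((i, j, w))
    return tree
```

On data where two variables are perfectly correlated and a third is independent, the strongest edge joins the correlated pair first.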
A Large-Deviation Analysis for the Maximum Likelihood Learning of Tree Structures
, 2009
Abstract

Cited by 13 (11 self)
The problem of maximum-likelihood learning of the Markov tree structure of an unknown distribution from samples is considered when the distribution is Markov on a tree. Large-deviation analysis of the error in estimation of the set of edges of the tree is considered. Necessary and sufficient conditions are provided to ensure that this error probability decays exponentially. These conditions are based on the mutual information between each pair of variables being distinct from that of other pairs. The rate of error decay, which is the error exponent, is derived using the large-deviation principle. For a discrete distribution, the error exponent is approximated using Euclidean information theory, and is given by a ratio, interpreted as the signal-to-noise ratio (SNR) for learning. Extensions to the Gaussian case are also considered.
Learning graphical models for hypothesis testing
 in Proc. 14th IEEE Statist. Signal Process. Workshop
, 2007
Abstract

Cited by 9 (3 self)
Sparse graphical models have proven to be a flexible class of multivariate probability models for approximating high-dimensional distributions. In this paper, we propose techniques to exploit this modeling ability for binary classification by discriminatively learning such models from labeled training data, i.e., using both positive and negative samples to optimize for the structures of the two models. We motivate why it is difficult to adapt existing generative methods, and propose an alternative method consisting of two parts. First, we develop a novel method to learn tree-structured graphical models which optimizes an approximation of the log-likelihood ratio. We also formulate a joint objective to learn a nested sequence of optimal forest-structured models. Second, we construct a classifier by using ideas from boosting to learn a set of discriminative trees. The final classifier can be interpreted as a likelihood ratio test between two models with a larger set of pairwise features. We use cross-validation to determine the optimal number of edges in the final model. The algorithm presented in this paper also provides a method to identify a subset of the edges that are most salient for discrimination. Experiments show that the proposed procedure outperforms generative methods such as Tree Augmented Naïve Bayes and Chow-Liu as well as their boosted counterparts.
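The decision rule at the end of the pipeline, a likelihood ratio test between the two learned models, reduces to a sign test on the log-ratio. A minimal sketch, in which `p` and `q` stand in for the two fitted densities (how they are learned discriminatively is the paper's contribution and is not shown):

```python
from math import log

def lr_classify(p, q, x, threshold=0.0):
    """Likelihood ratio test between two learned densities p and q:
    return +1 when log p(x) - log q(x) exceeds the threshold, else -1.
    p and q are placeholder callables standing in for the learned
    tree/forest models."""
    return 1 if log(p(x)) - log(q(x)) > threshold else -1
```

With toy densities that put 0.9 mass on opposite outcomes, the rule recovers the generating class of each point.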
Learning High-Dimensional Markov Forest Distributions: Analysis of Error Rates
, 2011
Abstract

Cited by 8 (4 self)
The problem of learning forest-structured discrete graphical models from i.i.d. samples is considered. An algorithm based on pruning of the Chow-Liu tree through adaptive thresholding is proposed. It is shown that this algorithm is both structurally consistent and risk consistent and the error probability of structure learning decays faster than any polynomial in the number of samples under fixed model size. For the high-dimensional scenario where the size of the model d and the number of edges k scale with the number of samples n, sufficient conditions on (n, d, k) are given for the algorithm to satisfy structural and risk consistencies. In addition, the extremal structures for learning are identified; we prove that the independent (resp. tree) model is the hardest (resp. easiest) to learn using the proposed algorithm in terms of error rates for structure learning.
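The pruning stage described above, cutting Chow-Liu tree edges whose mutual-information weight falls below an adaptive threshold, might be sketched as below. The decay schedule eps_n = c * n**(-beta) is an illustrative placeholder; the paper derives the actual conditions the threshold must satisfy.

```python
def adaptive_threshold(n, beta=0.4, c=1.0):
    """Illustrative threshold eps_n = c * n**(-beta). The exponent and
    constant here are placeholders, not the paper's prescription."""
    return c * n ** (-beta)

def prune_to_forest(tree_edges, n):
    """Cut Chow-Liu tree edges (i, j, weight) whose mutual-information
    weight falls below eps_n, leaving a forest."""
    eps = adaptive_threshold(n)
    return [e for e in tree_edges if e[2] >= eps]
```

A strong edge survives the threshold while a near-zero-weight edge is cut, disconnecting the tree into a forest.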
Forest density estimation
 Journal of Machine Learning Research
, 2011
Abstract

Cited by 7 (1 self)
We study graph estimation and density estimation in high dimensions, using a family of density estimators based on forest-structured undirected graphical models. For density estimation, we do not assume the true distribution corresponds to a forest; rather, we form kernel density estimates of the bivariate and univariate marginals, and apply Kruskal’s algorithm to estimate the optimal forest on held-out data. We prove an oracle inequality on the excess risk of the resulting estimator relative to the risk of the best forest. For graph estimation, we consider the problem of estimating forests with restricted tree sizes. We prove that finding a maximum weight spanning forest with restricted tree size is NP-hard, and develop an approximation algorithm for this problem. Viewing the tree size as a complexity parameter, we then select a forest using data splitting, and prove bounds on excess risk and structure selection consistency of the procedure. Experiments with simulated data and microarray data indicate that the methods are a practical alternative to Gaussian graphical models.
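Since exact maximum-weight spanning forests with restricted tree size are NP-hard, one natural heuristic is a Kruskal variant that refuses any merge exceeding the size budget. This greedy sketch is written here for illustration; it is not the approximation algorithm developed in the paper.

```python
def restricted_kruskal(edges, d, max_size):
    """Greedy Kruskal variant over weighted edges (weight, i, j) on d
    nodes that skips an edge when merging would create a component
    larger than max_size. A heuristic sketch only."""
    parent = list(range(d))
    size = [1] * d
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x
    forest = []
    for w, i, j in sorted(edges, reverse=True):
        ri, rj = find(i), find(j)
        if ri != rj and size[ri] + size[rj] <= max_size:
            parent[ri] = rj
            size[rj] += size[ri]
            forest.append((i, j, w))
    return forest
```

With a size budget of 2, the two strongest compatible edges are kept and any edge that would grow a component past the budget is skipped.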
High-dimensional Gaussian graphical model selection: Walk-summability and local separation criterion
 JMLR
Abstract

Cited by 5 (2 self)
We consider the problem of high-dimensional Gaussian graphical model selection. We identify a set of graphs for which an efficient estimation algorithm exists, and this algorithm is based on thresholding of empirical conditional covariances. Under a set of transparent conditions, we establish structural consistency (or sparsistency) for the proposed algorithm, when the number of samples n = Ω(J_min^(-2) log p), where p is the number of variables and J_min is the minimum (absolute) edge potential of the graphical model. The sufficient conditions for sparsistency are based on the notion of walk-summability of the model and the presence of sparse local vertex separators in the underlying graph. We also derive novel non-asymptotic necessary conditions on the number of samples required for sparsistency.
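For Gaussian variables, the conditional covariance of X_i and X_j given a single variable X_k has the closed form Sigma_ij - Sigma_ik * Sigma_kj / Sigma_kk, so the conditional-covariance thresholding idea can be sketched with separator size one. The paper allows larger sparse local separators; this one-variable version and its function names are illustrative only.

```python
def cond_cov(S, i, j, k):
    """Gaussian conditional covariance of X_i, X_j given X_k, computed
    from the covariance matrix S (a list of lists)."""
    return S[i][j] - S[i][k] * S[k][j] / S[k][k]

def select_edges(S, xi):
    """Declare an edge (i, j) when the minimum absolute conditional
    covariance over single-variable separators exceeds the threshold xi.
    Sketch with separator size 1; the paper's algorithm searches over
    sparse local separator sets."""
    d = len(S)
    edges = []
    for i in range(d):
        for j in range(i + 1, d):
            stat = min(abs(cond_cov(S, i, j, k))
                       for k in range(d) if k not in (i, j))
            if stat > xi:
                edges.append((i, j))
    return edges
```

On the covariance matrix of a Markov chain 0-1-2, conditioning on the middle variable drives the (0, 2) statistic to zero, so only the two chain edges are declared.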
Error Exponents for Composite Hypothesis Testing of Markov Forest Distributions
 in Proc. of Intl. Symp. on Info. Theory
, 2010
Abstract

Cited by 2 (2 self)
The problem of composite binary hypothesis testing of Markov forest (or tree) distributions is considered. The worst-case type-II error exponent is derived under the Neyman-Pearson formulation. Under a simple null hypothesis, the error exponent is derived in closed-form and is characterized in terms of the so-called bottleneck edge of the forest distribution. The least favorable distribution for detection is shown to be Markov on the second-best max-weight spanning tree with mutual information edge weights. A necessary and sufficient condition for a positive error exponent is derived.
Consistent and Efficient Reconstruction of Latent Tree Models
"... Abstract—We study the problem of learning a latent tree graphical model where samples are available only from a subset of variables. We propose two consistent and computationally efficient algorithms for learning minimal latent trees, that is, trees without any redundant hidden nodes. Our first algo ..."
Abstract
We study the problem of learning a latent tree graphical model where samples are available only from a subset of variables. We propose two consistent and computationally efficient algorithms for learning minimal latent trees, that is, trees without any redundant hidden nodes. Our first algorithm, recursive grouping, builds the latent tree recursively by identifying sibling groups. Our second and main algorithm, CLGrouping, starts with a preprocessing procedure in which a tree over the observed variables is constructed. This global step guides subsequent recursive grouping (or other latent-tree learning procedures) on much smaller subsets of variables. This results in more accurate and efficient learning of latent trees. We compare the proposed algorithms to other methods by performing extensive numerical experiments on various latent tree graphical models such as hidden Markov models and star graphs.
HIGH-DIMENSIONAL STRUCTURE ESTIMATION IN ISING MODELS: LOCAL SEPARATION CRITERION
"... We consider the problem of highdimensional Ising (graphical) model selection. We propose a simple algorithm for structure estimation based on the thresholding of the empirical conditional variation distances. We introduce a novel criterion for tractable graph families, where this method is efficien ..."
Abstract
We consider the problem of high-dimensional Ising (graphical) model selection. We propose a simple algorithm for structure estimation based on the thresholding of the empirical conditional variation distances. We introduce a novel criterion for tractable graph families, where this method is efficient, based on the presence of sparse local separators between node pairs in the underlying graph. For such graphs, the proposed algorithm has a sample complexity of n = Ω(J_min^(-2) log p), where p is the number of variables, and J_min is the minimum (absolute) edge potential in the model. We also establish non-asymptotic necessary and sufficient conditions for structure estimation.
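The statistic being thresholded, an empirical conditional variation distance, can be sketched in its simplest form: the total variation distance between the conditional distributions of X_i given the two values of a ±1-valued X_j, with an empty separator set. The paper's algorithm minimizes this over sparse local separator sets; the zero-separator version and names below are illustrative only.

```python
from collections import Counter

def variation_distance(samples, i, j):
    """Empirical total variation distance between P(X_i | X_j = +1)
    and P(X_i | X_j = -1) for +/-1-valued sample tuples."""
    def cond_dist(val):
        rows = [r[i] for r in samples if r[j] == val]
        c = Counter(rows)
        n = len(rows)
        return {x: c[x] / n for x in c}
    p, q = cond_dist(+1), cond_dist(-1)
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in keys)

def ising_neighbors(samples, d, xi):
    """Declare (i, j) an edge when the conditional variation statistic
    exceeds xi. Sketch with an empty separator set; the paper minimizes
    the statistic over sparse local separators."""
    return [(i, j) for i in range(d) for j in range(i + 1, d)
            if variation_distance(samples, i, j) > xi]
```

On samples where two spins always agree and a third is independent, only the coupled pair crosses the threshold.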