A Large-Deviation Analysis for the Maximum Likelihood Learning of Tree Structures, 2009
Cited by 8 (7 self)
The problem of maximum-likelihood learning of the Markov tree structure of an unknown distribution from samples is considered when the distribution is Markov on a tree. Large-deviation analysis of the error in estimation of the set of edges of the tree is considered. Necessary and sufficient conditions are provided to ensure that this error probability decays exponentially. These conditions are based on the mutual information between each pair of variables being distinct from that of other pairs. The rate of error decay, which is the error exponent, is derived using the large-deviation principle. For a discrete distribution, the error exponent is approximated using Euclidean information theory, and is given by a ratio, interpreted as the signal-to-noise ratio (SNR) for learning. Extensions to the Gaussian case are also considered.
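For discrete data, the maximum-likelihood tree is the classical Chow-Liu tree: a max-weight spanning tree whose edge weights are empirical mutual informations. The following is a minimal sketch of that procedure, not the paper's own code. The function names and the sample format (a list of equal-length tuples) are assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def empirical_mutual_information(samples, i, j):
    """Plug-in estimate of I(X_i; X_j) in nats from a list of tuples."""
    n = len(samples)
    joint = Counter((s[i], s[j]) for s in samples)
    marg_i = Counter(s[i] for s in samples)
    marg_j = Counter(s[j] for s in samples)
    mi = 0.0
    for (a, b), count in joint.items():
        # p(a,b) * log( p(a,b) / (p(a) p(b)) ), written with raw counts
        mi += (count / n) * math.log(count * n / (marg_i[a] * marg_j[b]))
    return mi


def chow_liu_edges(samples):
    """Max-weight spanning tree over pairwise mutual informations (Kruskal)."""
    d = len(samples[0])
    weights = {
        (i, j): empirical_mutual_information(samples, i, j)
        for i, j in combinations(range(d), 2)
    }
    parent = list(range(d))  # union-find forest used to detect cycles

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    edges = []
    for (i, j), _ in sorted(weights.items(), key=lambda kv: -kv[1]):
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j
            edges.append((i, j))
    return sorted(edges), weights
```

An estimation error occurs when sampling noise reorders the empirical mutual informations enough to change the spanning tree, which is consistent with the abstract's conditions being stated in terms of mutual informations of different pairs being distinct.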
Learning graphical models for hypothesis testing, in Proc. 14th IEEE Statist. Signal Process. Workshop, 2007
Cited by 6 (1 self)
Sparse graphical models have proven to be a flexible class of multivariate probability models for approximating high-dimensional distributions. In this paper, we propose techniques to exploit this modeling ability for binary classification by discriminatively learning such models from labeled training data, i.e., using both positive and negative samples to optimize for the structures of the two models. We motivate why it is difficult to adapt existing generative methods, and propose an alternative method consisting of two parts. First, we develop a novel method to learn tree-structured graphical models which optimizes an approximation of the log-likelihood ratio. We also formulate a joint objective to learn a nested sequence of optimal forest-structured models. Second, we construct a classifier by using ideas from boosting to learn a set of discriminative trees. The final classifier can be interpreted as a likelihood ratio test between two models with a larger set of pairwise features. We use cross-validation to determine the optimal number of edges in the final model. The algorithm presented in this paper also provides a method to identify a subset of the edges that are most salient for discrimination. Experiments show that the proposed procedure outperforms generative methods such as Tree Augmented Naïve Bayes and Chow-Liu as well as their boosted counterparts.
Error Exponents for Composite Hypothesis Testing of Markov Forest Distributions, in Proc. of Intl. Symp. on Info. Th., 2010
Cited by 2 (2 self)
The problem of composite binary hypothesis testing of Markov forest (or tree) distributions is considered. The worst-case type-II error exponent is derived under the Neyman-Pearson formulation. Under a simple null hypothesis, the error exponent is derived in closed form and is characterized in terms of the so-called bottleneck edge of the forest distribution. The least favorable distribution for detection is shown to be Markov on the second-best max-weight spanning tree with mutual information edge weights. A necessary and sufficient condition for a positive error exponent is derived.
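One way to compute a second-best max-weight spanning tree is sketched below, assuming a complete graph on at least three nodes whose best tree is unique. Every spanning tree other than the best one omits at least one best-tree edge, so banning each best-tree edge in turn and keeping the heaviest resulting tree gives the runner-up. The function names and the weight format (a dict keyed by node pairs `(i, j)` with `i < j`) are hypothetical. The abstract does not define the bottleneck edge, so the edge reported as `removed` is only the edge swapped out of the best tree, not a claim about the paper's definition.

```python
def max_weight_spanning_tree(n, weights, banned=None):
    """Kruskal on n nodes; returns (total, edges) or None if disconnected."""
    parent = list(range(n))  # union-find forest used to detect cycles

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    total, edges = 0.0, []
    for edge, w in sorted(weights.items(), key=lambda kv: -kv[1]):
        if edge == banned:
            continue
        root_i, root_j = find(edge[0]), find(edge[1])
        if root_i != root_j:
            parent[root_i] = root_j
            edges.append(edge)
            total += w
    if len(edges) != n - 1:
        return None
    return total, edges


def second_best_tree(n, weights):
    """Return (best_edges, second_best_edges, removed_edge)."""
    _, best_edges = max_weight_spanning_tree(n, weights)
    candidates = []
    for e in best_edges:
        # Best spanning tree that is forced to avoid edge e.
        result = max_weight_spanning_tree(n, weights, banned=e)
        if result is not None:
            candidates.append((result[0], result[1], e))
    _, edges, removed = max(candidates, key=lambda c: c[0])
    return sorted(best_edges), sorted(edges), removed
```

In the setting of the abstract, the weights would be mutual informations between pairs of variables.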
Laboratory for Information and Decision Systems,
We study the problem of learning a latent tree graphical model where samples are available only from a subset of variables. We propose two consistent and computationally efficient algorithms for learning minimal latent trees, that is, trees without any redundant hidden nodes. Unlike many existing methods, the observed nodes (or variables) are not constrained to be leaf nodes. Our algorithms can be applied to both discrete and Gaussian random variables and our learned models are such that all the observed and latent variables have the same domain (state space). Our first algorithm, recursive grouping, builds the latent tree recursively by identifying sibling groups using so-called information distances. One of the main contributions of this work is our second algorithm, which we refer to as CLGrouping. CLGrouping starts with a pre-processing procedure in which a tree over the observed variables is constructed. This global step groups the observed nodes that are likely to be close to each other in the true latent tree, thereby guiding subsequent recursive grouping (or equivalent procedures such as neighbor-joining) on much smaller subsets of variables. This results in more accurate and efficient learning of latent trees. We
Consistent and Efficient Reconstruction of Latent Tree Models
We study the problem of learning a latent tree graphical model where samples are available only from a subset of variables. We propose two consistent and computationally efficient algorithms for learning minimal latent trees, that is, trees without any redundant hidden nodes. Our first algorithm, recursive grouping, builds the latent tree recursively by identifying sibling groups. Our second and main algorithm, CLGrouping, starts with a pre-processing procedure in which a tree over the observed variables is constructed. This global step guides subsequent recursive grouping (or other latent-tree learning procedures) on much smaller subsets of variables. This results in more accurate and efficient learning of latent trees. We compare the proposed algorithms to other methods by performing extensive numerical experiments on various latent tree graphical models such as hidden Markov models and star graphs.
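The longer version of this abstract says that sibling groups are identified using so-called information distances, but it does not define them. The sketch below assumes the Gaussian form d_ij = -log|rho_ij|, where rho_ij is the correlation coefficient, and checks on a three-variable Markov chain that these distances add along tree paths. The function name and the numeric correlations are illustrative assumptions.

```python
import math


def information_distance(rho):
    """Assumed Gaussian information distance: d = -log|rho|."""
    return -math.log(abs(rho))


# Markov chain X1 - X2 - X3 of jointly Gaussian variables:
# the end-to-end correlation is the product of the edge correlations.
rho_12, rho_23 = 0.8, 0.5
rho_13 = rho_12 * rho_23

d_12 = information_distance(rho_12)
d_23 = information_distance(rho_23)
d_13 = information_distance(rho_13)

# Additivity along the path 1 - 2 - 3.
additive = math.isclose(d_13, d_12 + d_23)
```

Additivity of this kind is what makes it possible to infer tree relationships, such as which nodes share a parent, from pairwise distances among observed variables alone.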

