Results 1–10 of 13
Learning Latent Tree Graphical Models
J. of Machine Learning Research, 2011
Abstract
Cited by 46 (10 self)
We study the problem of learning a latent tree graphical model where samples are available only from a subset of variables. We propose two consistent and computationally efficient algorithms for learning minimal latent trees, that is, trees without any redundant hidden nodes. Unlike many existing methods, the observed nodes (or variables) are not constrained to be leaf nodes. Our algorithms can be applied to both discrete and Gaussian random variables and our learned models are such that all the observed and latent variables have the same domain (state space). Our first algorithm, recursive grouping, builds the latent tree recursively by identifying sibling groups using so-called information distances. One of the main contributions of this work is our second algorithm, which we refer to as CLGrouping. CLGrouping starts with a preprocessing procedure in which a tree over the observed variables is constructed. This global step groups the observed nodes that are likely to be close to each other in the true latent tree, thereby guiding subsequent recursive grouping (or equivalent procedures such as neighbor-joining) on much smaller subsets of variables. This results in more accurate and efficient learning of latent trees. We also present regularized versions of our algorithms that learn latent tree approximations of arbitrary distributions.
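The recursive-grouping step rests on a testable property of tree-additive information distances: two observed nodes are siblings exactly when the difference of their distances to every other node is constant. A minimal sketch of that test, assuming the Gaussian case where the information distance is d_ij = -log |rho_ij| (the function names and tolerance here are illustrative, not the authors' code):

```python
import numpy as np

def information_distances(samples):
    # Gaussian case: d_ij = -log |rho_ij|, with rho the sample correlation.
    corr = np.corrcoef(samples, rowvar=False)
    with np.errstate(divide="ignore"):
        d = -np.log(np.abs(corr))
    np.fill_diagonal(d, 0.0)
    return d

def sibling_score(d, i, j, tol=1e-6):
    # On a tree, i and j share a parent iff d[i,k] - d[j,k] is the same
    # for every other node k; with finite samples the check uses a tolerance.
    others = [k for k in range(d.shape[0]) if k not in (i, j)]
    diffs = np.array([d[i, k] - d[j, k] for k in others])
    return bool(diffs.max() - diffs.min() <= tol)
```

In practice the exact-constancy check is replaced by a statistical threshold calibrated to the sample size, which is where the consistency analysis enters.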
A Large-Deviation Analysis for the Maximum Likelihood Learning of Tree Structures, 2009
Abstract
Cited by 27 (17 self)
The problem of maximum-likelihood learning of the Markov tree structure of an unknown distribution from samples is considered when the distribution is Markov on a tree. Large-deviation analysis of the error in estimation of the set of edges of the tree is considered. Necessary and sufficient conditions are provided to ensure that this error probability decays exponentially. These conditions are based on the mutual information between each pair of variables being distinct from that of other pairs. The rate of error decay, which is the error exponent, is derived using the large-deviation principle. For a discrete distribution, the error exponent is approximated using Euclidean information theory, and is given by a ratio, interpreted as the signal-to-noise ratio (SNR) for learning. Extensions to the Gaussian case are also considered.
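The maximum-likelihood tree estimator analyzed here is the Chow–Liu procedure: compute empirical pairwise mutual informations, then take the maximum-weight spanning tree. A minimal sketch for discrete samples (function names are illustrative; the spanning tree is grown with Prim's algorithm):

```python
import numpy as np
from itertools import combinations

def empirical_mi(x, y):
    # Plug-in estimate of the mutual information between two discrete samples.
    mi = 0.0
    for a in np.unique(x):
        for b in np.unique(y):
            pxy = np.mean((x == a) & (y == b))
            if pxy > 0:
                mi += pxy * np.log(pxy / (np.mean(x == a) * np.mean(y == b)))
    return mi

def chow_liu_edges(samples):
    # Maximum-weight spanning tree over pairwise empirical MI
    # = maximum-likelihood tree structure.
    p = samples.shape[1]
    w = np.zeros((p, p))
    for i, j in combinations(range(p), 2):
        w[i, j] = w[j, i] = empirical_mi(samples[:, i], samples[:, j])
    in_tree, edges = {0}, []
    while len(in_tree) < p:  # Prim's algorithm on the MI weights
        i, j = max(((a, b) for a in in_tree for b in range(p) if b not in in_tree),
                   key=lambda e: w[e])
        edges.append((min(i, j), max(i, j)))
        in_tree.add(j)
    return sorted(edges)
```

Because the edge weights are plug-in MI estimates, the recovered tree is only correct with high probability; the error exponent in the abstract quantifies how fast the failure probability decays with the sample size.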
HIGH-DIMENSIONAL STRUCTURE ESTIMATION IN ISING MODELS: LOCAL SEPARATION CRITERION, 2012
Abstract
Cited by 9 (0 self)
We consider the problem of high-dimensional Ising (graphical) model selection. We propose a simple algorithm for structure estimation based on the thresholding of the empirical conditional variation distances. We introduce a novel criterion for tractable graph families, where this method is efficient, based on the presence of sparse local separators between node pairs in the underlying graph. For such graphs, the proposed algorithm has a sample complexity of n = Ω(J_min^{-2} log p), where p is the number of variables, and J_min is the minimum (absolute) edge potential in the model. We also establish non-asymptotic necessary and sufficient conditions for structure estimation.
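The thresholding estimator can be pictured as follows: for each node pair, search over small candidate separator sets, and declare an edge only if no separator makes the pair look conditionally independent. A rough sketch for ±1-valued samples; the names, the choice of aggregating the conditional variation distance as a maximum over separator configurations, and the parameters eta (separator-size bound) and tau (threshold) are illustrative, not necessarily the paper's exact estimator:

```python
import numpy as np
from itertools import combinations

def cond_variation(samples, i, j, S):
    # Empirical total-variation distance between P(x_i | x_j=+1, x_S=s) and
    # P(x_i | x_j=-1, x_S=s), maximized over observed configurations s of x_S.
    worst = 0.0
    configs = set(map(tuple, samples[:, S])) if S else [()]
    for s in configs:
        mask = np.all(samples[:, S] == s, axis=1) if S else np.ones(len(samples), bool)
        plus = samples[mask & (samples[:, j] == 1), i]
        minus = samples[mask & (samples[:, j] == -1), i]
        if len(plus) and len(minus):
            worst = max(worst,
                        0.5 * abs(np.mean(plus == 1) - np.mean(minus == 1))
                        + 0.5 * abs(np.mean(plus == -1) - np.mean(minus == -1)))
    return worst

def estimate_edges(samples, eta, tau):
    # (i, j) is kept as an edge only if every separator of size <= eta
    # leaves the conditional variation distance above the threshold tau.
    p = samples.shape[1]
    edges = []
    for i, j in combinations(range(p), 2):
        rest = [k for k in range(p) if k not in (i, j)]
        seps = [list(c) for r in range(eta + 1) for c in combinations(rest, r)]
        if min(cond_variation(samples, i, j, S) for S in seps) > tau:
            edges.append((i, j))
    return edges
```

The sparse-local-separator criterion is what keeps eta small, so the search over separators stays polynomial even in the high-dimensional regime.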
A survey on latent tree models and applications, 2013
Abstract
Cited by 2 (0 self)
In data analysis, latent variables play a central role because they help provide powerful insights into a wide variety of phenomena, ranging from the biological to the human sciences. The latent tree model, a particular type of probabilistic graphical model, deserves attention. Its simple structure, a tree, allows simple and efficient inference, while its latent variables capture complex relationships. In the past decade, the latent tree model has been the subject of significant theoretical and methodological developments. In this review, we propose a comprehensive study of this model. First we summarize the key ideas underlying the model. Second we explain how it can be efficiently learned from data. Third we illustrate its use within three types of applications: latent structure discovery, multidimensional clustering, and probabilistic inference. Finally, we conclude and give promising directions for future research in this field.
Necessary and sufficient conditions for high-dimensional salient subset recovery
in Int. Symp. Inf. Th., 2010
Abstract
Cited by 1 (0 self)
We consider recovering the salient feature subset for distinguishing between two probability models from i.i.d. samples. Identifying the salient set improves discrimination performance and reduces complexity. The focus in this work is on the high-dimensional regime where the number of variables d, the number of salient variables k and the number of samples n all grow. The definition of saliency is motivated by error exponents in a binary hypothesis test and is stated in terms of relative entropies. It is shown that if n grows faster than max{c k log((d - k)/k), exp(c′k)} for constants c, c′, then the error probability in selecting the salient set can be made arbitrarily small. Thus, n can be much smaller than d. The exponential rate of decay and converse theorems are also provided. An efficient and consistent algorithm is proposed when the distributions are graphical models which are Markov on trees.
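The growth condition above is easy to evaluate numerically; the sketch below plugs in c = c′ = 1 (the paper leaves these constants unspecified, so the values here are purely illustrative):

```python
import math

def salient_sample_threshold(d, k, c=1.0, c_prime=1.0):
    # n must grow faster than max{ c*k*log((d-k)/k), exp(c'*k) } for the
    # salient-set error probability to vanish; c, c' are unknown constants
    # set to 1 here purely for illustration.
    return max(c * k * math.log((d - k) / k), math.exp(c_prime * k))
```

For d = 10,000 and k = 5 this evaluates to about 148 samples, illustrating the abstract's point that n can be far smaller than d when the salient set is small.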
Learning Latent Tree Graphical Models, 2014
Legislative Prediction via Random Walks over a Heterogeneous Graph
Abstract
In this article, we propose a random-walk-based model to predict legislators' votes on a set of bills. In particular, we first convert roll call data, i.e. the recorded votes and the corresponding deliberative bodies, to a heterogeneous graph, where both the legislators and bills are treated as vertices. Three types of weighted edges are then computed accordingly, representing legislators' social and political relations, bills' semantic similarity, and legislator-bill vote relations. Through performing two-stage random walks over this heterogeneous graph, we can estimate legislative votes on past and future bills. We apply this proposed method to real legislative roll call data of the United States Congress and compare to state-of-the-art approaches. The experimental results demonstrate the superior performance and unique prediction power of the proposed model.
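The heterogeneous-graph construction can be sketched as one block adjacency matrix over legislators and bills, with a multi-step walk propagating vote mass between the two vertex types. The sketch below uses a plain two-step walk and assumes the three edge-weight matrices are given; the names and the exact staging are illustrative, not the paper's precise model:

```python
import numpy as np

def row_normalize(W):
    # Turn a nonnegative weight matrix into a random-walk transition matrix.
    s = W.sum(axis=1, keepdims=True)
    s[s == 0] = 1.0  # isolated vertices keep an all-zero row
    return W / s

def two_stage_scores(L, B, V):
    # L: legislator-legislator affinity, B: bill-bill similarity,
    # V: observed legislator-bill vote weights (all nonnegative).
    # Assemble one heterogeneous adjacency matrix, take a two-step
    # random walk, and read off the legislator -> bill block as scores.
    nl, nb = V.shape
    W = np.block([[L, V], [V.T, B]])
    P = row_normalize(W)
    P2 = P @ P
    return P2[:nl, nl:]  # probability mass each legislator sends to each bill
```

Even in this toy form, a legislator with no recorded votes inherits predictions through affine neighbors, which is the intuition behind walking over the heterogeneous graph rather than over votes alone.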