Learning Latent Tree Graphical Models
J. of Machine Learning Research, 2011

Cited by 19 (6 self)
We study the problem of learning a latent tree graphical model where samples are available only from a subset of variables. We propose two consistent and computationally efficient algorithms for learning minimal latent trees, that is, trees without any redundant hidden nodes. Unlike many existing methods, the observed nodes (or variables) are not constrained to be leaf nodes. Our algorithms can be applied to both discrete and Gaussian random variables and our learned models are such that all the observed and latent variables have the same domain (state space). Our first algorithm, recursive grouping, builds the latent tree recursively by identifying sibling groups using so-called information distances. One of the main contributions of this work is our second algorithm, which we refer to as CLGrouping. CLGrouping starts with a preprocessing procedure in which a tree over the observed variables is constructed. This global step groups the observed nodes that are likely to be close to each other in the true latent tree, thereby guiding subsequent recursive grouping (or equivalent procedures such as neighbor-joining) on much smaller subsets of variables. This results in more accurate and efficient learning of latent trees. We also present regularized versions of our algorithms that learn latent tree approximations of arbitrary distributions. We compare
Likelihood based hierarchical clustering
IEEE Trans. on Signal Processing, 2004

Cited by 14 (5 self)
This paper develops a new method for hierarchical clustering. Unlike other existing clustering schemes, our method is based on a generative, tree-structured model that represents relationships between the objects to be clustered, rather than directly modeling properties of objects themselves. In certain problems, this generative model naturally captures the physical mechanisms responsible for relationships among objects, for example, in certain evolutionary tree problems in genetics and communication network topology identification. The paper examines the networking problem in some detail, to illustrate the new clustering method. More broadly, the generative model may not reflect actual physical mechanisms, but it nonetheless provides a means for dealing with errors in the similarity matrix, simultaneously promoting two desirable features in clustering: intraclass similarity and interclass dissimilarity.
Network tomography: A review and recent developments
In Fan and Koul, editors, Frontiers in Statistics, 2006

Cited by 13 (5 self)
The modeling and analysis of computer communications networks give rise to a variety of interesting statistical problems. This paper focuses on network tomography, a term used to characterize two classes of large-scale inverse problems. The first deals with passive tomography where aggregate data are collected at the individual router/node level and the goal is to recover path-level information. The main problem of interest here is the estimation of the origin-destination traffic matrix. The second, referred to as active tomography, deals with reconstructing link-level information from end-to-end path-level measurements obtained by actively probing the network. The primary application in this case is estimation of quality-of-service parameters such as loss rates and delay distributions. The paper provides a review of the statistical issues and developments in network tomography with an emphasis on active tomography. An application to Internet telephony is used to illustrate the results.
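As a rough illustration of the active-tomography setting in this abstract, the sketch below recovers link pass rates on the smallest nontrivial tree: one shared link from the root to a branch node, which then feeds two receivers. It assumes multicast probes and independent Bernoulli losses on each link, neither of which the abstract specifies, and the function names are hypothetical.

```python
from typing import Iterable, Tuple


def link_pass_rates(p2: float, p3: float, p23: float) -> Tuple[float, float, float]:
    """Invert end-to-end receipt probabilities into per-link pass rates.

    Assumed model: pass rate a1 on the shared link, a2 and a3 on the links
    to receivers 2 and 3, with independent losses, so that
        p2 = a1 * a2,  p3 = a1 * a3,  p23 = a1 * a2 * a3.
    """
    if min(p2, p3, p23) <= 0:
        raise ValueError("all receipt probabilities must be positive")
    a1 = p2 * p3 / p23  # shared link
    a2 = p23 / p3       # link to receiver 2
    a3 = p23 / p2       # link to receiver 3
    return a1, a2, a3


def estimate_from_probes(outcomes: Iterable[Tuple[int, int]]) -> Tuple[float, float, float]:
    """Plug empirical receipt frequencies into the inversion above.

    Each outcome is (x2, x3), where 1 means the receiver got the probe.
    """
    outcomes = list(outcomes)
    n = len(outcomes)
    p2 = sum(x2 for x2, _ in outcomes) / n
    p3 = sum(x3 for _, x3 in outcomes) / n
    p23 = sum(x2 * x3 for x2, x3 in outcomes) / n
    return link_pass_rates(p2, p3, p23)


# Usage: 1000 probes whose counts match pass rates of roughly (0.9, 0.8, 0.7).
probes = [(1, 1)] * 504 + [(1, 0)] * 216 + [(0, 1)] * 126 + [(0, 0)] * 154
rates = estimate_from_probes(probes)
```

The point of the example is only that end-to-end observations at the receivers can identify link-level parameters once probes share part of their path; larger trees need more general estimators.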
Network topology discovery using finite mixture models
IEEE Int’l Conf. on Acoustics, Speech, and Signal Processing (ICASSP) 2003, 2004

Cited by 8 (2 self)
In this article we propose a network topology estimation strategy using unicast end-to-end packet pair delay measurements that is based on mixture models for the delay covariances. An unsupervised learning algorithm is applied to estimate the number of mixture components and delay covariances. The leaf pairs are clustered by a MAP criterion and passed to a hierarchical topology construction algorithm to rebuild the tree. Results from an ns simulation show that our algorithm can identify a network tree with 8 leaf nodes.
A Markov Random Field Approach to Multicast-Based Network Inference Problems
2006 IEEE International Symposium on Information Theory, 2006

Cited by 7 (4 self)
In this paper, we provide a new unified approach to analyze and solve multicast-based network inference problems. We show that the outcome variables induced by the transmission of a multicast packet form a Markov random field on the multicast tree. We present an algorithm that recovers the multicast tree topology based on the values of an additive tree metric on pairs of the terminal nodes. We prove the correctness of the algorithm. We also give several examples of an additive tree metric for which the values on pairs of the terminal nodes can be estimated from traffic measurements taken at the receivers. In addition, we propose an algorithm to recover the link performance parameters from the joint distribution of the outcome variables at the terminal nodes.
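To make the notion of an additive tree metric concrete, here is a minimal sketch that computes path-length distances between the terminal nodes of a small weighted tree and checks the standard four-point condition that characterizes such metrics. The tree, its edge weights, and the function names are made up for illustration; this is not the paper's recovery algorithm.

```python
from itertools import combinations


def path_distances(edges, terminals):
    """Sum edge weights along the unique tree path between each terminal pair.

    edges: dict mapping (u, v) to a positive weight, describing an undirected tree.
    """
    adj = {}
    for (u, v), w in edges.items():
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))
    dist = {}
    for s in terminals:
        seen = {s: 0.0}
        stack = [s]
        while stack:  # depth-first walk from s over the whole tree
            u = stack.pop()
            for v, w in adj[u]:
                if v not in seen:
                    seen[v] = seen[u] + w
                    stack.append(v)
        for t in terminals:
            dist[(s, t)] = seen[t]
    return dist


def satisfies_four_point(dist, terminals, tol=1e-9):
    """Check that, for every quadruple, the two largest pair sums are equal."""
    for i, j, k, l in combinations(terminals, 4):
        sums = sorted([
            dist[(i, j)] + dist[(k, l)],
            dist[(i, k)] + dist[(j, l)],
            dist[(i, l)] + dist[(j, k)],
        ])
        if abs(sums[2] - sums[1]) > tol:
            return False
    return True


# Hypothetical multicast tree: root 0, internal nodes 1 to 3, receivers 4 to 7.
edges = {(0, 1): 1.0, (1, 2): 0.5, (1, 3): 0.7,
         (2, 4): 0.2, (2, 5): 0.3, (3, 6): 0.4, (3, 7): 0.1}
receivers = [4, 5, 6, 7]
d = path_distances(edges, receivers)
is_additive = satisfies_four_point(d, receivers)
```

Because the pair sums reveal which receivers share a longer common path, distances of this kind carry enough information to reconstruct the topology, which is the property the abstract relies on.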
Network delay inference from additive metrics
Preprint. Available at arXiv: math.PR/0604367, 2006

Cited by 7 (1 self)
We use computational phylogenetic techniques to solve a central problem in inferential network monitoring. More precisely, we design a novel algorithm for multicast-based delay inference, that is, the problem of reconstructing delay characteristics of a network from end-to-end delay measurements on network paths. Our inference algorithm is based on additive metric techniques used in phylogenetics. It runs in polynomial time and requires a sample of size only poly(log n). We also show how to recover the topology of the routing tree.
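One standard way an additive metric can arise from delay measurements is sketched below: if link delays are independent, the variance of the difference between two receivers' end-to-end delays equals the sum of the delay variances on the links they do not share, so it adds up along the path between them. This is an illustrative assumption, not necessarily the metric used in the preprint; the simulated uniform link delays and the function names are hypothetical.

```python
import random
from statistics import pvariance


def simulate_delays(n_probes, seed=0):
    """Simulate multicast probes to two receivers below one shared link.

    Link delays are independent uniforms (an assumption for illustration):
    shared link on [0, 2], left link on [0, 1], right link on [0, 3].
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n_probes):
        shared = rng.uniform(0.0, 2.0)
        left = rng.uniform(0.0, 1.0)   # variance 1/12
        right = rng.uniform(0.0, 3.0)  # variance 9/12
        samples.append((shared + left, shared + right))
    return samples


def variance_metric(samples):
    """Estimate d(i, j) = Var(D_i - D_j) from paired end-to-end delays.

    The shared-link delay cancels in the difference, so only the two
    non-shared links contribute.
    """
    return pvariance([d_i - d_j for d_i, d_j in samples])


estimate = variance_metric(simulate_delays(20000))
expected = 1 / 12 + 9 / 12  # sum of the two non-shared link variances
```

With enough probes the estimate settles near the expected value, and the same quantity computed for every receiver pair gives a distance table to which phylogenetic reconstruction methods such as neighbor-joining can be applied.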
DiffProbe: Detecting ISP service discrimination
In Infocom, 2010

Cited by 7 (1 self)
We propose an active probing method, called Differential Probing or DiffProbe, to detect whether an access ISP is deploying forwarding mechanisms such as priority scheduling, variations of WFQ, or WRED to discriminate against some of its customer flows. DiffProbe aims to detect if the ISP is doing one or both of delay discrimination and loss discrimination. The basic idea in DiffProbe is to compare the delays and packet losses experienced by two flows: an Application flow A and a Probing flow P. The paper describes the statistical methods that DiffProbe uses, a novel method for distinguishing between Strict Priority and WFQ-variant packet scheduling, simulation and emulation experiments, and a few real-world tests at major access ISPs.
Embracing statistical challenges in the information technology age
Technometrics

Cited by 6 (0 self)
This article examines the role of statistics in the age of information technology (IT). It begins by examining the current state of IT and of the cyberinfrastructure initiative aimed at integrating the technologies into science, engineering, and education to convert massive amounts of data into useful information. Selected applications from science and text processing are introduced to provide concrete examples of massive data sets and the statistical challenges that they pose. The thriving field of machine learning is reviewed as an example of current achievements driven by computations and IT. Ongoing challenges that we face in the IT revolution are also highlighted. The paper concludes that for the healthy future of our field, computer technologies have to be integrated into statistics, and statistical thinking in turn must be integrated into computer technologies.
Statistical Inverse Problems in Active Network Tomography

Cited by 6 (2 self)
Active network tomography includes several interesting statistical inverse problems that arise in the context of computer and communication networks. The primary goal in these problems is to recover link-level information about quality-of-service parameters from aggregate end-to-end data measured on paths across the network. The estimation and monitoring of these parameters are of considerable interest to network engineers and Internet service providers. This paper provides a review of the inverse problems and recent research on inference for loss rates and delay distributions. Some new results on parametric inference for delay distributions are developed. The results are illustrated using a network application related to Internet telephony.

1. The Inverse Problems

Consider a tree T = {V, E} with a set of nodes V and a set of links or edges E. Figure 1 shows two examples: a simple two-layer symmetric binary tree on the left and a more general four-layer tree on the right. Each member of E is a directed link numbered after the node at its terminus. V includes a root node 0, a set of receiver or destination nodes R and a set of internal nodes I. All transmissions on the tree are initiated at the root node. The internal nodes have a single incoming link and at least two outgoing links (children). The receiver nodes have a single incoming link but no