Results 1–10 of 13
Learning in graphical models
Statistical Science, 2004
Cited by 655 (10 self)
Abstract:
Statistical applications in fields such as bioinformatics, information retrieval, speech processing, image processing and communications often involve large-scale models in which thousands or millions of random variables are linked in complex ways. Graphical models provide a general methodology for approaching these problems, and indeed many of the models developed by researchers in these applied fields are instances of the general graphical model formalism. We review some of the basic ideas underlying graphical models, including the algorithmic ideas that allow graphical models to be deployed in large-scale data analysis problems. We also present examples of graphical models in bioinformatics, error-control coding and language processing.
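As a toy illustration of the algorithmic ideas the abstract alludes to (not code from the review itself), exact marginal inference on a chain-structured model reduces to passing messages along the chain; the potentials below are hypothetical:

```python
import numpy as np

# Forward (sum-product) message passing on a chain x1 -> x2 -> x3 of binary
# variables. Exact inference is linear in the chain length, which is what
# makes such graphical-model algorithms usable at scale.
prior = np.array([0.6, 0.4])        # p(x1)
trans = np.array([[0.7, 0.3],       # p(x_{t+1} = j | x_t = i)
                  [0.2, 0.8]])

def chain_marginal(prior, trans, t):
    """Marginal p(x_t) for 0-indexed position t."""
    msg = prior.copy()
    for _ in range(t):
        msg = msg @ trans           # absorb one transition factor
    return msg

print(chain_marginal(prior, trans, 2))   # marginal of x3
```

The same recursion underlies forward-backward in hidden Markov models; on a tree-structured graph it generalises to the full sum-product algorithm.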
Robust Bayesian mixture modelling
Neurocomputing, 2005
Cited by 24 (1 self)
Abstract:
Bayesian approaches to density estimation and clustering using mixture distributions allow the automatic determination of the number of components in the mixture. Previous treatments have focussed on mixtures having Gaussian components, but these are well known to be sensitive to outliers. This can lead to excessive sensitivity to small numbers of data points and consequent overestimates of the number of components. In this paper we develop a Bayesian approach to mixture modelling based on Student-t distributions, which are heavier tailed than Gaussians and hence more robust. By expressing the Student-t distribution as a marginalisation over additional latent variables we are able to derive a tractable variational inference algorithm for this model, which includes Gaussian mixtures as a special case. Results on a variety of real data sets demonstrate the improved robustness of our approach.
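The latent-variable construction described above can be checked numerically; this sketch (with made-up settings, not the paper's inference algorithm) samples a Student-t variate by marginalising a Gaussian over a Gamma-distributed precision:

```python
import numpy as np

# x | u ~ N(0, 1/u) with u ~ Gamma(nu/2, rate=nu/2) marginalises to x ~ t_nu.
rng = np.random.default_rng(0)
nu, n = 6.0, 200_000

u = rng.gamma(shape=nu / 2, scale=2 / nu, size=n)  # latent precision scales
x = rng.normal(loc=0.0, scale=1.0 / np.sqrt(u))    # conditional Gaussians

# For nu = 6 the t distribution has variance nu / (nu - 2) = 1.5.
print(x.var())
```

It is exactly this augmentation with the latent scale u that makes the variational updates tractable: conditioned on u, the model is again a Gaussian mixture.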
A STATE-SPACE MIXED MEMBERSHIP BLOCKMODEL FOR DYNAMIC NETWORK TOMOGRAPHY
Submitted to the Annals of Applied Statistics
Cited by 12 (1 self)
Abstract:
In a dynamic social or biological environment, the interactions between actors can undergo large and systematic changes. In this paper, we propose a model-based approach to analyze what we will refer to as the dynamic tomography of such time-evolving networks. Our approach offers an intuitive but powerful tool to infer the semantic underpinnings of each actor, such as its social roles or biological functions, underlying the observed network topologies. Our model builds on earlier work on the mixed membership stochastic blockmodel for static networks and on state-space models for tracking object trajectories. It overcomes a major limitation of many current network inference techniques, which assume that each actor plays a single, invariant role that accounts for all its interactions with other actors; instead, our method models the role of each actor as a time-evolving mixed membership vector, allowing actors to behave differently over time and to carry out different roles/functions when interacting with different peers, which is closer to reality. We present an efficient algorithm for approximate inference and learning under our model, and we apply it to analyze a social network among monks (Sampson's network), a dynamic email communication network among Enron employees, and a rewiring gene interaction network of the fruit fly collected over its full life cycle. In all cases, our model reveals interesting patterns in the dynamic roles of the actors.
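A minimal sketch of the idea of a time-evolving membership vector; the random-walk dynamics and softmax link below are illustrative assumptions, not the paper's exact specification:

```python
import numpy as np

# One actor's latent state follows a Gaussian random walk; mapping it through
# a softmax yields a mixed membership vector over K roles at each time step.
rng = np.random.default_rng(1)
K, T, sigma = 3, 5, 0.5

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

z = np.zeros(K)
memberships = []
for t in range(T):
    z = z + sigma * rng.normal(size=K)   # state-space transition
    memberships.append(softmax(z))       # role distribution at time t

for pi in memberships:
    print(np.round(pi, 3))
```

Because the latent trajectory is smooth, the actor's role distribution drifts gradually rather than being fixed for all time, which is the limitation of static blockmodels that the abstract highlights.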
Mixed membership analysis of high-throughput interaction studies: Relational data
2007
Cited by 2 (1 self)
Abstract:
In this paper, we consider the statistical analysis of a protein interaction network. We propose a Bayesian model that uses a hierarchy of probabilistic assumptions about the way proteins interact with one another in order to: (i) identify the number of non-observable functional modules; (ii) estimate the degree of membership of proteins in modules; and (iii) estimate typical interaction patterns among the functional modules themselves. Our model describes a large amount of (relational) data using a relatively small set of parameters that we can reliably estimate with an efficient inference algorithm. We apply our methodology to data on protein-to-protein interactions in Saccharomyces cerevisiae to reveal proteins' diverse functional roles. The case study provides the basis for an overview of which scientific questions can be addressed using our methods, and for a discussion of technical issues.
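The hierarchy of assumptions can be pictured with a small generative sketch in the spirit of a mixed-membership blockmodel; all parameter values here are hypothetical:

```python
import numpy as np

# Each protein draws a membership vector over K latent functional modules;
# each directed pair picks module indicators and samples an interaction
# from the module-module rate matrix B.
rng = np.random.default_rng(2)
N, K = 30, 4
alpha = 0.5 * np.ones(K)          # Dirichlet hyperparameter
B = 0.02 + 0.9 * np.eye(K)        # high within-module, low between-module rates

pi = rng.dirichlet(alpha, size=N) # per-protein mixed memberships
Y = np.zeros((N, N), dtype=int)
for i in range(N):
    for j in range(N):
        if i == j:
            continue
        zi = rng.choice(K, p=pi[i])          # module protein i plays toward j
        zj = rng.choice(K, p=pi[j])          # module protein j plays toward i
        Y[i, j] = int(rng.random() < B[zi, zj])

print(Y.sum(), "interactions among", N, "proteins")
```

Inference inverts this process: given an observed interaction matrix Y, estimate K, the membership vectors pi, and the rate matrix B.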
Admixtures of latent blocks with application to protein interaction networks. Manuscript under review
2006
Cited by 2 (1 self)
Abstract:
In this paper, we consider the statistical analysis of a protein interaction network. We propose a Bayesian model that uses a hierarchy of probabilistic assumptions about the way proteins interact with one another in order to: (i) identify the number of non-observable functional modules; (ii) estimate the degree of membership of proteins in modules; and (iii) estimate typical interaction patterns among the functional modules themselves. Our model describes a large amount of (relational) data using a relatively small set of parameters that we can reliably estimate with an efficient inference algorithm. We apply our methodology to data on protein-to-protein interactions in Saccharomyces cerevisiae to reveal proteins' diverse functional roles. The case study provides the basis for an overview of which scientific questions can be addressed using our methods, and for a discussion of technical issues.
Bayesian Inconsistency under Misspecification
2006
Cited by 1 (1 self)
Abstract:
This is a synopsis of the work underlying the author's contributed plenary presentation at the Valencia 8 meeting on Bayesian Statistics, held in Benidorm, June 2006. We show that Bayesian inference can be inconsistent under misspecification. Specifically, we exhibit a distribution P∗, a model M with P∗ ∉ M, and a prior Π on M such that the prior puts significant mass on P̃, the best approximation to P∗ within the set M. Yet, if data are i.i.d. ∼ P∗, then for all large samples, the Bayesian posterior puts its mass on a subset of M that contains only bad approximations to P∗. This result holds both if approximation quality is defined in terms of Kullback-Leibler divergence and if it is defined in terms of classification risk. We present several variations of this result, including one in which, with P∗-probability 1, for all large enough samples, predictions of the next outcome based on the Bayesian predictive distribution become worse than predictions based on purely random guessing.
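The claim can be stated schematically; the notation below is reconstructed from the abstract, not quoted from the paper:

```latex
% Schematic statement, reconstructed from the abstract (not the paper's
% exact theorem); D(\cdot\,\|\,\cdot) is Kullback-Leibler divergence.
\tilde{P} = \arg\min_{Q \in \mathcal{M}} D(P^{*}\,\|\,Q),
\qquad \Pi\bigl(\{\tilde{P}\}\bigr) > 0,
\qquad\text{yet, } P^{*}\text{-a.s. for all large } n:
\]
\[
\Pi\Bigl(\bigl\{Q \in \mathcal{M} : D(P^{*}\,\|\,Q) \gg D(P^{*}\,\|\,\tilde{P})\bigr\}
\Bigm| X_{1},\dots,X_{n}\Bigr) \approx 1 .
```

That is, despite the prior favouring the best-in-model approximation P̃, the posterior asymptotically concentrates on strictly worse elements of M.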
Summary
Cited by 1 (0 self)
Abstract:
For many applications of machine learning the goal is to predict the value of a vector c given the value of a vector x of input features. In a classification problem c represents a discrete class label, whereas in a regression problem it corresponds to one or more continuous variables. From a probabilistic perspective, the goal is to find the conditional distribution p(c|x). The most common approach to this problem is to represent the conditional distribution using a parametric model, and then to determine the parameters using a training set consisting of pairs {x_n, c_n} of input vectors along with their corresponding target output vectors. The resulting conditional distribution can be used to make predictions of c for new values of x. This is known as a discriminative approach, since the conditional distribution discriminates directly between the different values of c. An alternative approach is to find the joint distribution p(x, c), expressed for instance as a parametric model, and then subsequently use this joint distribution to evaluate the conditional p(c|x) in order to make predictions of c; this is known as a generative approach.
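A numeric illustration of the generative route to p(c|x); the densities and parameters below are made up. The joint model p(x, c) = p(x|c) p(c) yields the conditional via Bayes' rule:

```python
import numpy as np

# Two classes with Gaussian class-conditional densities p(x|c) and a class
# prior p(c); Bayes' rule turns the joint model into the conditional p(c|x).
def gaussian_pdf(x, mu, var):
    return np.exp(-0.5 * (x - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)

prior = np.array([0.5, 0.5])   # p(c)
mu = np.array([-1.0, 1.0])     # class-conditional means
var = 1.0

def posterior(x):
    """p(c|x) = p(x|c) p(c) / sum_c' p(x|c') p(c')."""
    joint = gaussian_pdf(x, mu, var) * prior
    return joint / joint.sum()

print(posterior(0.0))   # midpoint between the means: [0.5, 0.5]
print(posterior(2.0))   # far on class 1's side: mass concentrates on class 1
```

A discriminative model would instead parameterise p(c|x) directly (e.g. logistic regression) and never commit to a model of the inputs x.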
Variational Bayes for generic topic models
Cited by 1 (1 self)
Abstract:
The article contributes a derivation of variational Bayes for a large class of topic models by generalising from the well-known model of latent Dirichlet allocation. For an abstraction of these models as systems of interconnected mixtures, variational update equations are obtained, leading to inference algorithms for models that have so far used Gibbs sampling exclusively.
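The variational update equations referred to above follow the standard mean-field pattern (a textbook result, not taken from this article): each factor q_j of the variational posterior is updated by

```latex
\log q_{j}^{*}(\mathbf{z}_{j}) \;=\;
\mathbb{E}_{q(\mathbf{z}_{\setminus j})}\!\left[\log p(\mathbf{x}, \mathbf{z})\right]
\;+\; \mathrm{const},
```

iterated over the interconnected mixture components until the variational bound converges.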
Robust Bayesian Mixture Modelling
Abstract:
Bayesian approaches to density estimation and clustering using mixture distributions allow the automatic determination of the number of components in the mixture. Previous treatments have focussed on mixtures having Gaussian components, but these are well known to be sensitive to outliers, which can lead to excessive sensitivity to small numbers of data points and consequent overestimates of the number of components. In this paper we develop a Bayesian approach to mixture modelling based on Student-t distributions, which are heavier tailed than Gaussians and hence more robust. By expressing the Student-t distribution as a marginalisation over additional latent variables we are able to derive a tractable variational inference algorithm for this model, which includes Gaussian mixtures as a special case. Results on a variety of real data sets demonstrate the improved robustness of our approach. This paper is published in Neurocomputing, volume 64, pages 235–252. Due to different formatting, there is no exact page correspondence between this version and the published version, but the content should be identical. The published version is available at www.sciencedirect.com. © 2004 Elsevier B.V.