Results 1–10 of 10
Nonparametric Multigroup Membership Model for Dynamic Networks
Abstract

Cited by 4 (0 self)
Relational data, such as graphs, networks, and matrices, is often dynamic: the relational structure evolves over time. A fundamental problem in the analysis of time-varying network data is to extract a summary of the common structure and the dynamics of the underlying relations between the entities. Here we build on the intuition that changes in the network structure are driven by dynamics at the level of groups of nodes. We propose a nonparametric multigroup membership model for dynamic networks. Our model contains three main components: we model the birth and death of individual groups with respect to the dynamics of the network structure via a distance-dependent Indian Buffet Process; we capture the evolution of individual node group memberships via a factorial hidden Markov model; and we explain the dynamics of the network structure by explicitly modeling the connectivity structure of groups. We demonstrate our model's capability of identifying the dynamics of latent groups in a number of different types of network data. Experimental results show that our model provides improved predictive performance over existing dynamic network models on future network forecasting and missing link prediction.
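The third component above, an explicit group-connectivity matrix, determines edge probabilities from binary group memberships. A minimal sketch of that link function (the variable names, random draws, and logistic form are illustrative assumptions; the actual model additionally has the ddIBP group birth/death and FHMM membership dynamics described above):

```python
import numpy as np

rng = np.random.default_rng(0)

N, K = 6, 3                             # nodes, latent groups (K fixed here; the model infers it)
Z = rng.integers(0, 2, size=(N, K))     # binary node-group memberships (hypothetical draw)
W = rng.normal(0.0, 1.0, size=(K, K))   # group-group connectivity weights
bias = -1.0                             # baseline link propensity

def link_prob(i, j):
    """Probability of an edge between nodes i and j, driven by the
    connectivity of the groups both nodes belong to."""
    logit = bias + Z[i] @ W @ Z[j]
    return 1.0 / (1.0 + np.exp(-logit))

P = np.array([[link_prob(i, j) for j in range(N)] for i in range(N)])
```

Edges between nodes that share strongly connected groups get high probability; the dynamic model then lets `Z` (and the set of active groups) change over time.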
The Supervised IBP: Neighbourhood Preserving Infinite Latent Feature Models
Abstract

Cited by 3 (0 self)
We propose a probabilistic model to infer supervised latent variables in the Hamming space from observed data. Our model allows simultaneous inference of the number of binary latent variables and their values. The latent variables preserve the neighbourhood structure of the data, in the sense that objects in the same semantic concept have similar latent values and objects in different concepts have dissimilar latent values. We formulate the supervised infinite latent variable problem based on an intuitive principle of pulling objects together if they are of the same type, and pushing them apart if they are not. We then combine this principle with a flexible Indian Buffet Process prior on the latent variables. We show that the inferred supervised latent variables can be directly used to perform a nearest neighbour search for the purpose of retrieval. We introduce a new application of dynamically extending hash codes, and show how to effectively couple the structure of the hash codes with the continuously growing structure of the neighbourhood-preserving infinite latent feature space.
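Once binary codes have been inferred, the retrieval step the abstract describes reduces to nearest-neighbour search under Hamming distance. A toy sketch (the codes here are made up for illustration; the model itself would infer them):

```python
import numpy as np

# Hypothetical binary latent codes (one row per object), standing in
# for codes inferred by the supervised IBP model.
codes = np.array([[1, 0, 1, 1],
                  [1, 0, 1, 0],
                  [0, 1, 0, 0],
                  [0, 1, 0, 1]], dtype=np.uint8)

def hamming_nn(query, codes):
    """Index of the stored code closest to `query` in Hamming distance."""
    dists = np.count_nonzero(codes != query, axis=1)
    return int(np.argmin(dists))

q = np.array([1, 0, 1, 1], dtype=np.uint8)
best = hamming_nn(q, codes)   # exact match at index 0
```

Because the codes are binary, the distance computation is a bitwise mismatch count, which is what makes hash-code retrieval fast in practice.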
A survey of non-exchangeable priors for Bayesian nonparametric models
, 2014
Abstract

Cited by 3 (0 self)
Dependent nonparametric processes extend distributions over measures, such as the Dirichlet process and the beta process, to give distributions over collections of measures, typically indexed by values in some covariate space. Such models are appropriate priors when exchangeability assumptions do not hold, and instead we want our model to vary fluidly with some set of covariates. Since the concept of dependent nonparametric processes was formalized by MacEachern [1], there have been a number of models proposed and used in the statistics and machine learning literatures. Many of these models exhibit underlying similarities, an understanding of which, we hope, will help in selecting an appropriate prior, developing new models, and leveraging inference techniques.
Stochastic Blockmodel with Cluster Overlap, Relevance Selection, and Similarity-Based Smoothing
Abstract

Cited by 1 (1 self)
Stochastic blockmodels provide a rich, probabilistic framework for modeling relational data by expressing the objects being modeled in terms of a latent vector representation. This representation can be a latent indicator vector denoting the cluster membership (hard clustering), a vector of cluster membership probabilities (soft clustering), or, more generally, a real-valued vector (latent space representation). Recently, a new class of overlapping stochastic blockmodels has been proposed in which objects are allowed hard memberships in multiple clusters (in the form of a latent binary vector). This aspect captures the properties of many real-world networks in domains such as biology and social networks, where objects can simultaneously have memberships in multiple clusters owing to the multiple roles they may play. In this paper, we improve upon this model in three key ways:
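The three latent representations the abstract contrasts, plus the overlapping binary vector it focuses on, can be made concrete with a small sketch (all values are hypothetical):

```python
import numpy as np

K = 4  # number of clusters / latent dimensions

hard = np.array([0, 0, 1, 0])             # indicator vector: member of cluster 2 only
soft = np.array([0.1, 0.2, 0.6, 0.1])     # membership probabilities (sum to 1)
latent = np.array([-0.3, 1.2, 0.8, 0.0])  # unconstrained latent-space position
overlap = np.array([0, 1, 1, 0])          # overlapping model: hard member of clusters 1 AND 2

assert hard.sum() == 1                 # hard clustering: exactly one cluster
assert np.isclose(soft.sum(), 1.0)     # soft clustering: a distribution over clusters
assert overlap.sum() >= 1              # overlapping: several hard memberships allowed
```

The overlapping binary vector is what lets one object play several roles at once, which the hard and soft representations cannot express directly.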
Metadata Dependent Mondrian Processes
Abstract
Stochastic partition processes in a product space play an important role in modeling relational data. Recent studies on the Mondrian process have introduced more flexibility into the block structure in relational models. A side-effect of such high flexibility is that, in data-sparsity scenarios, the model is prone to overfit. In reality, relational entities are always associated with meta information, such as user profiles in a social network. In this paper, we propose a metadata-dependent Mondrian process (MDMP) to incorporate meta information into the stochastic partition process in the product space and the entity allocation process on the resulting block structure. MDMP can not only encourage homogeneous relational interactions within blocks but also discourage meta-label diversity within blocks. Regularized by meta information, MDMP becomes more robust in data-sparsity scenarios and converges more easily in posterior inference. We apply MDMP to link prediction and rating prediction and demonstrate that MDMP is more effective than the baseline models in prediction accuracy, with a more parsimonious model structure.
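For background, the metadata-free Mondrian process underlying MDMP recursively cuts a product space with axis-aligned cuts under a cost budget. A compact sampler sketch under the standard construction (this is the plain Mondrian process, not the metadata-dependent extension the paper proposes):

```python
import random

def mondrian(budget, box, rng=random.Random(0)):
    """Sample a Mondrian-process partition of `box` = [(x0, x1), (y0, y1)].
    Returns the leaf boxes, i.e. the blocks of the partition."""
    lengths = [hi - lo for lo, hi in box]
    cost = rng.expovariate(sum(lengths))       # waiting time to the next cut
    if cost > budget:
        return [box]                           # budget exhausted: box is a block
    # Choose the cut dimension with probability proportional to side length,
    # then cut at a uniform position along that side.
    d = 0 if rng.random() < lengths[0] / sum(lengths) else 1
    lo, hi = box[d]
    cut = rng.uniform(lo, hi)
    left, right = list(box), list(box)
    left[d], right[d] = (lo, cut), (cut, hi)
    return (mondrian(budget - cost, left, rng)
            + mondrian(budget - cost, right, rng))

blocks = mondrian(budget=2.0, box=[(0.0, 1.0), (0.0, 1.0)])
```

Because smaller boxes have smaller total side length, their next-cut times are longer on average, so the recursion terminates with a finite, self-consistent partition; MDMP biases where these cuts fall using the entities' metadata.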
Supervised By
Abstract
Computational models of preferences have been applied in various domains including economics, consumer research, and marketing. They are also commonly used for designing recommender agents that suggest new content to users based on their inferred preferences. A major challenge for such systems is to cater to the changing needs of users over time. Although user preferences are known to be dynamic in nature, there are few methods for predicting these dynamics in a reliable way. In this thesis, the problem of defining predictive models of dynamic user preferences is addressed. A solution to this problem is provided by formulating a framework that incorporates history- and time-dependent changes in user preferences for items. Two types of changes in user preferences are identified. Firstly, a user's interests are modeled as either favoring familiarity or seeking to explore new content. Secondly, a user's preferences for familiar items are defined to change as a function of exposure, incorporating the psychological effects of boredom from repetition. Such a framework for estimating the dynamic preferences of users provides unprecedented insights into users' changing needs. These insights are proposed to be incorporated in solving two important problems for content services: user retention and temporally-aware recommendations.
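The second mechanism, preference changing as a function of exposure, could be sketched as a simple exponential boredom discount (the functional form, names, and rate here are illustrative assumptions, not the thesis's actual model):

```python
import math

def dynamic_preference(base, exposures, boredom_rate=0.5):
    """Hypothetical exposure-discounted preference: repeated consumption
    of the same item lowers its utility (a boredom effect)."""
    return base * math.exp(-boredom_rate * exposures)

fresh = dynamic_preference(1.0, exposures=0)   # never consumed: full utility
worn = dynamic_preference(1.0, exposures=4)    # heavily consumed: discounted
assert worn < fresh
```

A recommender using such a discount would rotate familiar items out as their exposure counts climb, then reintroduce them if the model also lets the discount recover over time.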
Editor: Somebody
Abstract
We define the beta diffusion tree, a random tree structure with a set of leaves that defines a collection of overlapping subsets of objects, known as a feature allocation. A generative process for the tree structure is defined in terms of particles (representing the objects) diffusing in some continuous space, analogously to the Dirichlet diffusion tree (Neal, 2003b), which defines a tree structure over partitions (i.e., non-overlapping subsets) of the objects. Unlike in the Dirichlet diffusion tree, multiple copies of a particle may exist and diffuse along multiple branches in the beta diffusion tree, and an object may therefore belong to multiple subsets of particles. We demonstrate how to build a hierarchically clustered factor analysis model with the beta diffusion tree and how to perform inference over the random tree structures with a Markov chain Monte Carlo algorithm. We conclude with several numerical experiments on missing-data problems with data sets of gene expression microarrays, international development statistics, and intranational socioeconomic measurements.
Big Learning with Bayesian Methods
, 2007
Abstract
Explosive growth in data and the availability of cheap computing resources have sparked increasing interest in Big learning, an emerging subfield that studies scalable machine learning algorithms, systems, and applications with Big Data. Bayesian methods represent one important class of statistical methods for machine learning, with substantial recent developments on adaptive, flexible, and scalable Bayesian learning. This article provides a survey of the recent advances in Big learning with Bayesian methods, termed Big Bayesian Learning, including nonparametric Bayesian methods for adaptively inferring model complexity, regularized Bayesian inference for improving flexibility via posterior regularization, and scalable algorithms and systems based on stochastic subsampling and distributed computing for dealing with large-scale applications.
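The stochastic-subsampling idea the survey covers rests on scaling a minibatch sum so it is an unbiased estimate of a full-data quantity, such as a log-likelihood gradient. A minimal sketch with synthetic Gaussian data (the setup and names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
data = rng.normal(3.0, 1.0, size=10_000)   # synthetic observations

def full_grad(theta):
    """Gradient of a unit-variance Gaussian log-likelihood in its mean:
    sum over ALL observations."""
    return np.sum(data - theta)

def minibatch_grad(theta, batch=100):
    """Stochastic subsampling: scale a minibatch sum by N / |batch|,
    giving an unbiased (but noisy) estimate of the full-data gradient."""
    idx = rng.choice(len(data), size=batch, replace=False)
    return len(data) / batch * np.sum(data[idx] - theta)

g_full = full_grad(0.0)
g_mini = minibatch_grad(0.0)   # noisy estimate of g_full
```

Stochastic gradient MCMC and stochastic variational inference both plug such estimates in where the exact full-data gradient is too expensive, which is what makes them scale to Big Data.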
Beta Diffusion Trees
Abstract
We define the beta diffusion tree, a random tree structure with a set of leaves that defines a collection of overlapping subsets of objects, known as a feature allocation. The generative process for the tree is defined in terms of particles (representing the objects) diffusing in some continuous space, analogously to the Dirichlet and Pitman–Yor diffusion trees (Neal, 2003b; Knowles & Ghahramani, 2011), both of which define tree structures over clusters of the particles. With the beta diffusion tree, however, multiple copies of a particle may exist and diffuse to multiple locations in the continuous space, resulting in (a random number of) possibly overlapping clusters of the objects. We demonstrate how to build a hierarchically clustered factor analysis model with the beta diffusion tree and how to perform inference over the random tree structures with a Markov chain Monte Carlo algorithm. We conclude with several numerical experiments on missing-data problems with data sets of gene expression arrays, international development statistics, and intranational socioeconomic measurements.
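The distinction the abstract draws, partitions versus feature allocations, is easy to see as binary object-by-subset matrices (the entries are hypothetical):

```python
import numpy as np

partition = np.array([[1, 0, 0],    # Dirichlet-diffusion-tree style: each object
                      [0, 1, 0],    # sits in exactly one block, so every row
                      [0, 1, 0],    # has a single 1
                      [0, 0, 1]])

allocation = np.array([[1, 0, 1],   # beta-diffusion-tree style: copies of an
                       [0, 1, 0],   # object may reach several leaves, so rows
                       [1, 1, 0],   # can have more than one 1 (overlap)
                       [0, 0, 1]])

assert np.all(partition.sum(axis=1) == 1)   # partition: disjoint blocks
assert np.any(allocation.sum(axis=1) > 1)   # allocation: overlaps allowed
```

It is exactly this row-wise relaxation, from one-hot to multi-hot, that the particle-copying mechanism of the beta diffusion tree generates.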
The Supervised IBP: Z → X Linear Gaussian Model
Abstract
• WHAT: a probabilistic model to infer binary latent variables that preserve the neighbourhood structure of the data
• WHY: to perform a nearest neighbour search for the purpose of retrieval
• WHEN: for the dynamic, streaming nature of Internet data
• HOW: an Indian Buffet Process prior coupled with a preference relation
• WHERE: dynamic extension of hash codes