Results 1 - 9 of 9
The hidden life of latent variables: Bayesian learning with mixed graph models
, 2008
Cited by 7 (3 self)
Directed acyclic graphs (DAGs) have been widely used as a representation of conditional independence in machine learning and statistics. Moreover, hidden or latent variables are often an important component of graphical models. However, DAG models suffer from an important limitation: the family of DAGs is not closed under marginalization of hidden variables. This means that in general we cannot use a DAG to represent the independencies over a subset of variables in a larger DAG. Directed mixed graphs (DMGs) are a representation that includes DAGs as a special case, and overcomes this limitation. This paper introduces algorithms for performing Bayesian inference in Gaussian and probit DMG models. An important requirement for inference is the characterization of the distribution over parameters of the models. We introduce a new distribution for covariance matrices of Gaussian DMGs. We discuss and illustrate how several Bayesian machine learning tasks can benefit from the principle presented here: the power to model dependencies that are generated from hidden variables, but without necessarily modelling such variables explicitly.
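The marginalization point in the abstract above can be illustrated with a small linear Gaussian example. This is only a sketch under stated assumptions, not the paper's inference algorithm: the structure Y1 -> Y2 <- H -> Y3 <- Y4, the coefficients, and the helper names `parents`, `loadings`, and `cov` are all hypothetical. With unit-variance independent noise, each variable is a linear combination of noise terms, so covariances among the observed variables can be read off after simply ignoring the hidden H.

```python
# Hypothetical linear Gaussian DAG with one hidden variable H:
#   Y1 -> Y2 <- H -> Y3 <- Y4
# parents[v] maps each parent of v to its (assumed) regression coefficient.
parents = {
    "H": {},
    "Y1": {},
    "Y4": {},
    "Y2": {"Y1": 0.8, "H": 1.0},
    "Y3": {"H": 1.0, "Y4": -0.5},
}
order = ["H", "Y1", "Y4", "Y2", "Y3"]  # a topological order of the DAG

# loadings[v][u] is the coefficient of the noise term e_u in variable v.
loadings = {}
for v in order:
    row = {v: 1.0}  # every variable carries its own noise term
    for p, weight in parents[v].items():
        for u, coef in loadings[p].items():
            row[u] = row.get(u, 0.0) + weight * coef
    loadings[v] = row


def cov(a, b):
    """Covariance of a and b, assuming independent unit-variance noise."""
    return sum(c * loadings[b].get(u, 0.0) for u, c in loadings[a].items())


observed = ["Y1", "Y2", "Y3", "Y4"]  # H is marginalized by leaving it out
for a in observed:
    print(a, [round(cov(a, b), 2) for b in observed])
```

In this sketch the covariance between Y2 and Y3 is nonzero only because of H, while the pairs (Y1, Y3), (Y1, Y4), and (Y2, Y4) stay uncorrelated. A directed mixed graph would record the Y2, Y3 dependence with a bidirected edge Y2 <-> Y3 rather than an explicit hidden node.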
Bayesian semiparametric structural equation models with latent variables
 Psychometrika
, 2010
Cited by 2 (2 self)
Structural equation models (SEMs) with latent variables are widely useful for sparse covariance structure modeling and for inferring relationships among latent variables. Bayesian SEMs are appealing in allowing for the incorporation of prior information and in providing exact posterior distributions of unknowns, including the latent variables. In this article, we propose a broad class of semiparametric Bayesian SEMs, which allow mixed categorical and continuous manifest variables while also allowing the latent variables to have unknown distributions. In order to include typical identifiability restrictions on the latent variable distributions, we rely on centered Dirichlet process (CDP) and CDP mixture (CDPM) models. The CDP will induce a latent class model with an unknown number of classes, while the CDPM will induce a latent trait model with unknown densities for the latent traits. A simple and efficient Markov chain Monte Carlo algorithm is developed for posterior computation, and the methods are illustrated using simulated examples, and several applications.
Gaussian process structural equation models with latent variables
 Proceedings of the 26th Conference on Uncertainty in Artificial Intelligence, UAI
, 2010
Cited by 1 (0 self)
In a variety of disciplines such as social sciences, psychology, medicine and economics, the recorded data are considered to be noisy measurements of latent variables connected by some causal structure. This corresponds to a family of graphical models known as the structural equation model with latent variables. While linear non-Gaussian variants have been well-studied, inference in nonparametric structural equation models is still underdeveloped. We introduce a sparse Gaussian process parameterization that defines a nonlinear structure connecting latent variables, unlike common formulations of Gaussian process latent variable models. The sparse parameterization is given a full Bayesian treatment without compromising Markov chain Monte Carlo efficiency. We compare the stability of the sampling procedure and the predictive ability of the model against the current practice.
Identifying Graph Clusters using Variational Inference and links to Covariance Parameterisation
Cited by 1 (0 self)
Finding clusters of well-connected nodes in a graph is useful in many domains, including Social Network, Web and molecular interaction analyses. From a computational viewpoint, finding these clusters or graph communities is a difficult problem. We consider the framework of Clique Matrices to decompose a graph into a set of possibly overlapping clusters, defined as well-connected subsets of vertices. The decomposition is based on a statistical description which encourages clusters to be well connected and few in number. The formal intractability of inferring the clusters is addressed using a variational approximation which has links to mean-field theories in statistical mechanics. Clique matrices also play a natural role in parameterising positive definite matrices under zero constraints on elements of the matrix. We show that clique matrices can parameterise all positive definite matrices restricted according to a decomposable graph and form a structured Factor Analysis approximation in the non-decomposable case.
Thinning Measurement Models and Questionnaire Design
Inferring key unobservable features of individuals is an important task in the applied sciences. In particular, an important source of data in fields such as marketing, social sciences and medicine is questionnaires: answers in such questionnaires are noisy measures of target unobserved features. While comprehensive surveys help to better estimate the latent variables of interest, aiming at a high number of questions comes at a price: refusal to participate in surveys can go up, as well as the rate of missing data; quality of answers can decline; costs associated with applying such questionnaires can also increase. In this paper, we cast the problem of refining existing models for questionnaire data as follows: solve a constrained optimization problem of preserving the maximum amount of information found in a latent variable model using only a subset of existing questions. The goal is to find an optimal subset of a given size. For that, we first define an information theoretical measure for quantifying the quality of a reduced questionnaire. Three different approximate inference methods are introduced to solve this problem. Comparisons against a simple but powerful heuristic are presented.
1 Philosophy Research Statement
, 2006
My work lies on the intersection of computer science and statistics. The questions I want to answer are of the following nature: how can machines learn from experience? This raises questions about statistical modeling, since the nature of a phenomenon is only observable through a limited set of measurements: the data. Rather than explicitly programming a computer to perform a particular task, machine learning uses data and statistical models to achieve intelligent behavior. The outcome can be observed in tasks as diverse as: predicting user preferences (movie ratings are fashionable these days); filtering spam; adapting models of computer vision and speech recognition to new environments; improving retrieval of important documents; improving machine translation; and many others. We can also turn the question around and ask instead how machines can be used in new methods of data analysis, and improve scientific progress. Standard statistical practice focuses on studies with a small number of variables and data points, but the increase in the amount of data that has been collected is evident. The need for analysing high dimensional measurements, and combining different sources of data, is pressing. Now the issue turns to finding proper computational approaches for building models from data, and providing novel techniques for exploration and analysis within more thorough studies. In particular, my research addresses fundamental questions on learning with graphical models.
Contributions to Bayesian Structural Equation Modeling
Structural equation models (SEMs) are multivariate latent variable models used to model causality structures in data. A Bayesian estimation and validation of SEMs is proposed and identifiability of parameters is studied. The latter study shows that latent variables should be standardized in the analysis to ensure identifiability. This heuristic is in fact introduced to deal with complex identifiability constraints. To illustrate the point, identifiability constraints are calculated in a marketing application, in which posterior draws of the constraints are derived from the posterior conditional distributions of parameters.
Manuscript Region of Origin: INDONESIA
As one of the basic human needs, water services should be sustainable. Research on the sustainability of water services has been conducted in several developing countries, but no comparable research exists for Indonesia. This paper analyses the factors that contribute to the sustainability of rural water supply systems in East Java, Indonesia. Data are collected by observing rural water supply facilities, interviewing water committees and water users, and gathering documentation. The data are used to build a model, derived from a theoretical or conceptual model, using structural equation modeling (SEM). The model shows the factors that contribute to the sustainability of rural water supply systems. Sustainability is influenced significantly by nine variables: selection of technology, water sources, investment cost, capability of operator, availability of spare parts, operation cost, technical operation, community participation, and institutional management.