Results 11-20 of 42
Identifying confounders using additive noise models
, 2009
Cited by 2 (2 self)
We propose a method for inferring the existence of a latent common cause (“confounder”) of two observed random variables. The method assumes that the two effects of the confounder are (possibly nonlinear) functions of the confounder plus independent, additive noise. We discuss under which conditions the model is identifiable (up to an arbitrary reparameterization of the confounder) from the joint distribution of the effects. We state and prove a theoretical result that provides evidence for the conjecture that the model is generically identifiable under suitable technical conditions. In addition, we propose a practical method to estimate the confounder from a finite i.i.d. sample of the effects and illustrate that the method works well on both simulated and real-world data.
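As a reading aid, the model class described in this abstract can be written out as below. The symbols T (the confounder), u and v (the two functions), and N_X and N_Y (the noise terms) are notation assumed here for illustration and are not taken from the listing. Joint independence of T and the noise terms is likewise an assumed reading of "independent, additive noise".

```latex
X = u(T) + N_X, \qquad Y = v(T) + N_Y,
\qquad T,\ N_X,\ N_Y \ \text{jointly independent (assumed)}
```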
New d-separation identification results for learning continuous latent variable models
Proceedings of the 22nd International Conference on Machine Learning
, 2005
Cited by 2 (2 self)
Learning the structure of graphical models is an important task, but one of considerable difficulty when latent variables are involved. Because conditional independences using hidden variables cannot be directly observed, one has to rely on alternative methods to identify the d-separations that define the graphical structure. This paper describes new distribution-free techniques for identifying d-separations in continuous latent variable models when nonlinear dependencies are allowed among hidden variables.
A nonparametric variable clustering model
Cited by 1 (0 self)
Factor analysis models effectively summarise the covariance structure of high dimensional data, but the solutions are typically hard to interpret. This motivates attempting to find a disjoint partition, i.e. a simple clustering, of observed variables into highly correlated subsets. We introduce a Bayesian nonparametric approach to this problem, and demonstrate advantages over heuristic methods proposed to date. Our Dirichlet process variable clustering (DPVC) model can discover block-diagonal covariance structures in data. We evaluate our method on both synthetic and gene expression analysis problems.
Node discovery in a networked organization
, 803
Cited by 1 (0 self)
In this paper, I present a method to solve a node discovery problem in a networked organization. Covert nodes are nodes that are not directly observable. They affect social interactions, but do not appear in the surveillance logs that record the participants of those interactions. Discovering the covert nodes is defined as identifying the suspicious logs where the covert nodes would appear if they became overt. A mathematical model is developed for the maximum likelihood estimation of the network behind the social interactions and for the identification of the suspicious logs. Precision, recall, and F-measure characteristics are demonstrated with a dataset generated from a real organization and with computationally synthesized datasets. The performance is close to the theoretical limit for any covert nodes in networks of any topology and size if the ratio of the number of observations to the number of possible communication patterns is large.
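For readers unfamiliar with the evaluation measures named in this abstract, the following is a minimal sketch of how precision, recall, and the balanced F-measure are conventionally computed for a set of flagged items. The function name, the use of integer log identifiers, and the sample sets are hypothetical and are not taken from the paper.

```python
from typing import Set, Tuple


def precision_recall_f(flagged: Set[int], relevant: Set[int]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) for a set of flagged items.

    `flagged` holds the log IDs a method marks as suspicious and `relevant`
    holds the IDs that truly are; both are hypothetical inputs.
    """
    hits = len(flagged & relevant)  # correctly flagged items
    precision = hits / len(flagged) if flagged else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # Balanced F-measure: harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Illustrative data only: four logs flagged, three truly suspicious.
p, r, f = precision_recall_f({1, 2, 3, 4}, {2, 4, 6})
```

Here two of the four flagged logs are truly suspicious, and two of the three suspicious logs are found.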
Learning Linear Bayesian Networks with Latent Variables
Cited by 1 (0 self)
This work considers the problem of learning linear Bayesian networks when some of the variables are unobserved. Identifiability and efficient recovery from low-order observable moments are established under a novel graphical constraint. The constraint concerns the expansion properties of the underlying directed acyclic graph (DAG) between observed and unobserved variables in the network, and it is satisfied by many natural families of DAGs that include multilevel DAGs, DAGs with effective depth one, as well as certain families of polytrees.
Temporal Graphical Models for Cross-Species Gene Regulatory Network Discovery
Cited by 1 (0 self)
Many genes and biological processes function in similar ways across different species. Cross-species gene expression analysis, as a powerful tool to characterize the dynamical properties of the cell, has found a number of applications, such as identifying a conserved core set of cell cycle genes. However, to the best of our knowledge, limited effort has been devoted to developing appropriate techniques to capture the causal relations between genes from time-series microarray data across species. In this paper, we present hidden Markov random field regression with L1 penalty to jointly uncover the regulatory networks for multiple species. The algorithm provides a framework for sharing information across species via hidden component graphs and can conveniently incorporate domain knowledge about the evolutionary relationships between species. We demonstrate the effectiveness of our method on two synthetic datasets and one innate immune response microarray dataset.
Gaussian process structural equation models with latent variables
Proceedings of the 26th Conference on Uncertainty in Artificial Intelligence, UAI
, 2010
Cited by 1 (0 self)
In a variety of disciplines such as social sciences, psychology, medicine and economics, the recorded data are considered to be noisy measurements of latent variables connected by some causal structure. This corresponds to a family of graphical models known as the structural equation model with latent variables. While linear non-Gaussian variants have been well-studied, inference in nonparametric structural equation models is still underdeveloped. We introduce a sparse Gaussian process parameterization that defines a nonlinear structure connecting latent variables, unlike common formulations of Gaussian process latent variable models. The sparse parameterization is given a full Bayesian treatment without compromising Markov chain Monte Carlo efficiency. We compare the stability of the sampling procedure and the predictive ability of the model against the current practice.
Learning Maximum Lag for Grouped Graphical Granger Models
Cited by 1 (0 self)
Temporal causal modeling has been a highly active research area in the last few decades. Temporal or time series data arises in a wide array of application domains ranging from medicine to finance. Deciphering the causal relationships between the various time series can be critical in understanding, and consequently enhancing, the efficacy of the underlying processes in these domains. Grouped graphical modeling methods such as Granger methods provide an efficient alternative for finding such dependencies. A key parameter that affects the performance of these methods is the maximum lag. The maximum lag specifies the extent to which one has to look into the past to predict the future. A value smaller than required will miss important dependencies, while an excessively large value will increase the computational complexity and add noisy dependencies. In this paper, we propose a novel approach for estimating this key parameter efficiently. One of the primary advantages of this approach is that it can, in a principled manner, incorporate prior knowledge of dependencies that are known to exist between certain pairs of time series out of the entire set and use this information to estimate the lag for the entire set. This ability to extrapolate the lag from a known subset to the entire set, in order to get better estimates of the overall lag efficiently, makes such an approach attractive in practice.
Keywords: lag; Granger; modeling
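To make the role of the maximum lag concrete, the sketch below pairs each value of a series with the window of past values a lagged model would be allowed to use. It only illustrates what the parameter controls and is not the estimation approach proposed in the paper. The function name and the sample series are hypothetical.

```python
from typing import List, Sequence, Tuple


def lagged_rows(series: Sequence[float], max_lag: int) -> List[Tuple[List[float], float]]:
    """Pair each value with the `max_lag` values that immediately precede it."""
    if max_lag < 1:
        raise ValueError("max_lag must be at least 1")
    rows = []
    for t in range(max_lag, len(series)):
        past = list(series[t - max_lag:t])  # window a lagged model may look at
        rows.append((past, series[t]))      # value to be predicted from that window
    return rows


# Illustrative series only.
rows = lagged_rows([1.0, 2.0, 3.0, 4.0, 5.0], max_lag=2)
```

A larger `max_lag` widens every window and leaves fewer usable rows, which mirrors the trade-off the abstract describes between missed dependencies and added cost and noise.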
Towards association rules with hidden variables
10th European Conference on Principles and Practice of Knowledge Discovery in Databases, PKDD 2006
Cited by 1 (1 self)
The mining of association rules can provide relevant and novel information to the data analyst. However, current techniques do not take into account that the observed associations may arise from variables that are unrecorded in the database. For instance, the pattern of answers in a large marketing survey might be better explained by a few latent traits of the population than by direct association among measured items. Techniques for mining association rules with hidden variables are still largely unexplored. This paper provides a sound methodology for finding association rules of the type H ⇒ A1, ..., Ak, where H is a hidden variable inferred to exist by making suitable assumptions and A1, ..., Ak are discrete binary or ordinal variables in the database.