A New Discriminative Kernel from Probabilistic Models
, 2002
Cited by 61 (5 self)
Recently, Jaakkola and Haussler proposed a method for constructing kernel functions from probabilistic models. Their so-called "Fisher kernel" has been combined with discriminative classifiers such as SVMs and applied successfully in, e.g., DNA and protein analysis. Whereas the Fisher kernel (FK) is calculated from the marginal log-likelihood, we propose the TOP kernel derived from Tangent vectors Of Posterior log-odds. Furthermore, we develop a theoretical framework on feature extractors from probabilistic models and use it for analyzing the TOP kernel. In experiments our new discriminative TOP kernel compares favorably to the Fisher kernel.
Stratified exponential families: Graphical models and model selection
 ANNALS OF STATISTICS
, 2001
Population Markov Chain Monte Carlo
 Machine Learning
, 2003
Cited by 12 (2 self)
Stochastic search algorithms inspired by physical and biological systems are applied to the problem of learning directed graphical probability models in the presence of missing observations and hidden variables. For this class of problems, deterministic search algorithms tend to halt at local optima, requiring random restarts to obtain solutions of acceptable quality. We compare three stochastic search algorithms: a Metropolis-Hastings Sampler (MHS), an Evolutionary Algorithm (EA), and a new hybrid algorithm called Population Markov Chain Monte Carlo, or popMCMC. PopMCMC uses statistical information from a population of MHSs to inform the proposal distributions for individual samplers in the population. Experimental results show that popMCMC and EAs learn more efficiently than the MHS with no information exchange. Populations of MCMC samplers exhibit more diversity than populations evolving according to EAs not satisfying physics-inspired local reversibility conditions. KEY WORDS: Markov Chain Monte Carlo, Metropolis-Hastings Algorithm, Graphical Probabilistic Models, Bayesian Networks, Bayesian Learning, Evolutionary Algorithms
Instrumentality Tests Revisited
 In Proceedings of the Seventeenth Conference on Uncertainty in Artificial Intelligence
, 2001
Cited by 10 (0 self)
An instrument is a random variable that is uncorrelated with certain (unobserved) error terms and thus allows the identification of structural parameters in linear models. In nonlinear models, instrumental variables are useful for deriving bounds on causal effects. A few years ago, Pearl introduced a necessary test for instruments, which permits researchers to identify variables that could not serve as instruments. In this paper, we extend Pearl's result in several directions. In particular, we answer in the affirmative an open conjecture about the nontestability of instruments in models with unrestricted variables, and we devise new tests for models with discrete and continuous variables.
Inequality constraints in causal models with hidden variables
 In Proceedings of the Seventeenth Annual Conference on Uncertainty in Artificial Intelligence (UAI06)
, 2006
Cited by 7 (4 self)
We present a class of inequality constraints on the set of distributions induced by local interventions on variables governed by a causal Bayesian network in which some of the variables remain unmeasured. We derive bounds on causal effects that are not directly measured in randomized experiments. We also derive instrumental-inequality-type constraints on nonexperimental distributions. The results have applications in testing causal models with observational or experimental data.
A Semi-Algebraic Description of Discrete Naive Bayes Models with Two Hidden Classes
 In Proc. Ninth International Symposium on Artificial Intelligence and Mathematics, Fort
, 2006
Cited by 4 (1 self)
Discrete Bayesian network models with hidden variables define an important class of statistical models. These models are usually defined parametrically, but can also be described semi-algebraically as the solutions in the probability simplex of a finite set of polynomial equations and inequations. In this paper we present a semi-algebraic description of discrete Naive Bayes models with two hidden classes and a finite number of observable variables. The identifiability of the parameters is also studied. Our derivations are based on an alternative parametrization of the Naive Bayes models with an arbitrary number of hidden classes.
Combining subjective probabilities and data in training Markov logic networks
 of Lecture Notes in Computer Science
, 2012
Cited by 2 (2 self)
Markov logic is a rich language that allows one to specify a knowledge base as a set of weighted first-order logic formulas, and to define a probability distribution over truth assignments to ground atoms using this knowledge base. Usually, the weight of a formula cannot be related to the probability of the formula without taking into account the weights of the other formulas. In general, this is not an issue, since the weights are learned from training data. However, in many domains (e.g. healthcare, dependable systems), little or no training data may be available, but one has access to a domain expert whose knowledge is available in the form of subjective probabilities. Within the framework of Bayesian statistics, we present a formalism for using a domain expert’s knowledge for weight learning. Our approach defines priors that are different from and more general than previously used Gaussian priors over weights. We show how one can learn weights in an MLN by combining subjective probabilities and training data, without requiring that the domain expert provide consistent knowledge. We also provide a formalism for capturing conditional subjective probabilities, which are often easier to obtain and more reliable than nonconditional probabilities. We demonstrate the effectiveness of our approach by extensive experiments in a domain that models failure dependencies in a cyber-physical system. Moreover, we demonstrate the advantages of using our proposed prior over that of using nonzero-mean Gaussian priors in a commonly cited social network MLN testbed.
Polynomial constraints in causal Bayesian networks
 In Proceedings of the Seventeenth Annual Conference on Uncertainty in Artificial Intelligence (UAI07)
Cited by 2 (1 self)
We use the implicitization procedure to generate polynomial equality constraints on the set of distributions induced by local interventions on variables governed by a causal Bayesian network with hidden variables. We show how we may reduce the complexity of the implicitization problem and make the problem tractable in certain causal Bayesian networks. We also show some preliminary results on the algebraic structure of polynomial constraints. The results have applications in distinguishing between causal models and in testing causal models with combined observational and experimental data.
Tractable Structure Search in the Presence of Latent Variables
Cited by 1 (1 self)
The problem of learning the structure of a DAG model in the presence of latent variables presents many formidable challenges. In particular, there are an infinite number of latent variable models to consider, and these models possess features which make them hard to work with. We describe a class of graphical models which can represent the conditional independence structure induced by a latent variable model over the observed margin. We give a parametrization of the set of Gaussian distributions with conditional independence structure given by a MAG model. The models are illustrated via a simple example. Different estimation techniques are discussed in the context of Zellner's Seemingly Unrelated Regression (SUR) models. Keywords: Multivariate Graphical Models; Causal Modelling; Latent Variables; Ancestral Graphs; MAG Models.