Results 1 - 9 of 9
Measure Transformer Semantics for Bayesian Machine Learning
Cited by 15 (3 self)
Abstract. The Bayesian approach to machine learning amounts to inferring posterior distributions of random variables from a probabilistic model of how the variables are related (that is, a prior distribution) and a set of observations of variables. There is a trend in machine learning towards expressing Bayesian models as probabilistic programs. As a foundation for this kind of programming, we propose a core functional calculus with primitives for sampling prior distributions and observing variables. We define combinators for measure transformers, based on theorems in measure theory, and use these to give a rigorous semantics to our core calculus. The original features of our semantics include its support for discrete, continuous, and hybrid measures, and, in particular, for observations of zero-probability events. We compile our core language to a small imperative language that has a straightforward semantics via factor graphs, data structures that enable many efficient inference algorithms. We use an existing inference engine for efficient approximate inference of posterior marginal distributions, treating thousands of observations per second for large instances of realistic models.
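As a rough illustration of the idea of composing measure transformers for sampling and observation, here is a minimal Python sketch restricted to finite discrete measures. It is not the calculus or the combinator set from the paper: the names `extend`, `observe`, `pushforward`, `compose`, and `normalise` are assumptions chosen for this example, and the sketch does not cover continuous or hybrid measures or observations of zero-probability events.

```python
from fractions import Fraction

# A finite measure is a dict from outcomes to non-negative mass.
# A measure transformer is a function from measures to measures.
# All names below are hypothetical and chosen for illustration only.

def extend(kernel):
    """Transformer that pairs each outcome x with a draw from kernel(x)."""
    def transform(mu):
        out = {}
        for x, mass in mu.items():
            for y, p in kernel(x).items():
                out[(x, y)] = out.get((x, y), Fraction(0)) + mass * p
        return out
    return transform

def observe(pred):
    """Transformer that keeps only outcomes satisfying pred (no renormalisation)."""
    def transform(mu):
        return {x: mass for x, mass in mu.items() if pred(x)}
    return transform

def pushforward(f):
    """Transformer that maps outcomes through f, summing masses that collide."""
    def transform(mu):
        out = {}
        for x, mass in mu.items():
            y = f(x)
            out[y] = out.get(y, Fraction(0)) + mass
        return out
    return transform

def compose(*transformers):
    """Apply the given transformers left to right."""
    def transform(mu):
        for t in transformers:
            mu = t(mu)
        return mu
    return transform

def normalise(mu):
    """Rescale a non-zero finite measure to total mass 1."""
    total = sum(mu.values())
    return {x: mass / total for x, mass in mu.items()}

def fair_coin(_):
    return {True: Fraction(1, 2), False: Fraction(1, 2)}

# Flip two fair coins, observe that at least one is heads,
# then ask whether both are heads.
program = compose(
    extend(fair_coin),                         # state: ((), a)
    extend(fair_coin),                         # state: (((), a), b)
    observe(lambda s: s[0][1] or s[1]),        # keep states with a or b
    pushforward(lambda s: s[0][1] and s[1]),   # result: a and b
)
posterior = normalise(program({(): Fraction(1)}))
```

Because `observe` only discards mass, normalisation is deferred to the end; observing an event of mass zero would make `normalise` divide by zero in this sketch, which is exactly the case the discrete simplification cannot handle.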
Delimited Control in OCaml, Abstractly and Concretely: System Description
Cited by 8 (1 self)
Abstract. We describe the first implementation of multi-prompt delimited control operators in OCaml that is direct in that it captures only the needed part of the control stack. The implementation is a library that requires no changes to the OCaml compiler or runtime, so it is perfectly compatible with existing OCaml source code and bytecode. The library has been in fruitful practical use for four years. We present the library as an implementation of an abstract machine derived by elaborating the definitional machine. The abstract view lets us distill a minimalistic API, scAPI, sufficient for implementing multi-prompt delimited control. We argue that a language system that supports exception and stack-overflow handling supports scAPI. Our library illustrates how to use scAPI to implement multi-prompt delimited control in a typed language. The approach is general and can be used to add multi-prompt delimited control to other existing language systems.
A compilation target for probabilistic programming languages
In ICML, 2014
Cited by 3 (2 self)
Forward inference techniques such as sequential Monte Carlo and particle Markov chain Monte Carlo for probabilistic programming can be implemented in any programming language by creative use of standardized operating system functionality including processes, forking, mutexes, and shared memory. Exploiting this, we have defined, developed, and tested a probabilistic programming language intermediate representation language we call probabilistic C, which itself can be compiled to machine code by standard compilers and linked to operating system libraries, yielding an efficient, scalable, portable probabilistic programming compilation target. This opens up a new hardware and systems research path for optimizing probabilistic programming systems.
From Bayesian notation to pure Racket, via measure-theoretic probability in λZFC
In: Implementation and Application of Functional Languages, 2010
Cited by 1 (1 self)
Abstract. Bayesian practitioners build models of the world without regarding how difficult it will be to answer questions about them. When answering questions, they put off approximating as long as possible, and usually must write programs to compute converging approximations. Writing the programs is distracting, tedious and error-prone, and we wish to relieve them of it by providing languages and compilers. Their style constrains our work: the tools we provide cannot approximate early. Our approach to meeting this constraint is to 1) determine their notation’s meaning in a suitable theoretical framework; 2) generalize our interpretation in an uncomputable, exact semantics; 3) approximate the exact semantics and prove convergence; and 4) implement the approximating semantics in Racket (formerly PLT Scheme). In this way, we define languages with at least as much exactness as Bayesian practitioners have in mind, and also put off approximating as long as possible. In this paper, we demonstrate the approach using our preliminary work on discrete (countably infinite) Bayesian models.
A Model-Learner Pattern for Bayesian Reasoning
Andrew D. Gordon (Microsoft Research and University of Edinburgh), Mihhail Aizatulin (Open University)
A Bayesian model is based on a pair of probability distributions, known as the prior and sampling distributions. A wide range of fundamental machine learning tasks, including regression, classification, clustering, and many others, can all be seen as Bayesian models. We propose a new probabilistic programming abstraction, a typed Bayesian model, based on a pair of probabilistic expressions for the prior and sampling distributions. A sampler for a model is an algorithm to compute synthetic data from its sampling distribution, while a learner for a model is an algorithm for probabilistic inference on the model. Models, samplers, and learners form a generic programming pattern for model-based inference. They support the uniform expression of common tasks including model testing, and generic compositions such as mixture models, evidence-based model averaging, and mixtures of experts. A formal semantics supports reasoning about model equivalence and implementation correctness. By developing a series of examples and three learner implementations based on exact inference, factor graphs, and Markov chain Monte Carlo, we demonstrate the broad applicability of this new programming pattern.
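The model, sampler, and learner roles described in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class and function names (`Model`, `sample_data`, `ExactLearner`, `train`, `predict`) are hypothetical and are not the API from the paper, the model is untyped, and the learner does exact inference over a finite parameter space only.

```python
import random
from dataclasses import dataclass
from fractions import Fraction
from typing import Callable, Dict, Hashable

Dist = Dict[Hashable, Fraction]  # finite distribution: outcome -> probability

@dataclass
class Model:
    """A Bayesian model as a pair: a prior over w and a sampling distribution given w."""
    prior: Dist
    sampling: Callable[[Hashable], Dist]

def draw(dist: Dist, rng: random.Random):
    """Draw one outcome from a finite distribution."""
    outcomes = list(dist)
    weights = [float(dist[o]) for o in outcomes]
    return rng.choices(outcomes, weights=weights)[0]

def sample_data(model: Model, n: int, rng: random.Random):
    """Sampler: draw w from the prior, then n synthetic observations given w."""
    w = draw(model.prior, rng)
    return w, [draw(model.sampling(w), rng) for _ in range(n)]

class ExactLearner:
    """Learner: exact Bayesian updating over a finite parameter space."""
    def __init__(self, model: Model):
        self.model = model
        self.posterior = dict(model.prior)

    def train(self, data):
        for y in data:
            unnorm = {w: p * self.model.sampling(w).get(y, Fraction(0))
                      for w, p in self.posterior.items()}
            total = sum(unnorm.values())
            self.posterior = {w: p / total for w, p in unnorm.items()}

    def predict(self) -> Dist:
        """Posterior predictive distribution of the next observation."""
        out = {}
        for w, p in self.posterior.items():
            for y, q in self.model.sampling(w).items():
                out[y] = out.get(y, Fraction(0)) + p * q
        return out

# Hypothetical example: a coin whose heads probability is either 1/4 or 3/4.
coin_model = Model(
    prior={Fraction(1, 4): Fraction(1, 2), Fraction(3, 4): Fraction(1, 2)},
    sampling=lambda w: {"H": w, "T": 1 - w},
)
learner = ExactLearner(coin_model)
learner.train(["H", "H"])
prediction = learner.predict()
true_w, synthetic = sample_data(coin_model, 5, random.Random(0))
```

Keeping the learner behind a small `train`/`predict` interface is what would let an exact learner be swapped for one based on factor graphs or Markov chain Monte Carlo without touching the model definition.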
Seoul National University
Probabilistic programs use familiar notation of programming languages to specify probabilistic models. Suppose we are interested in estimating the distribution of the return expression r of a probabilistic program P. We are interested in slicing the probabilistic program P and obtaining a simpler program SLI(P) which retains only those parts of P that are relevant to estimating r, and elides those parts of P that are not relevant to estimating r. We desire that the SLI transformation be both correct and efficient. By correct, we mean that P and SLI(P) have identical estimates on r. By efficient, we mean that estimation over SLI(P) be as fast as possible. We show that the usual notion of program slicing, which traverses control and data dependencies backward from the return expression r, is unsatisfactory for probabilistic programs, since it produces incorrect slices on some programs and suboptimal ones on others. Our key insight is that in addition to the usual notions of control dependence and data dependence that are used to slice non-probabilistic programs, a new kind of dependence called observe dependence arises naturally due to observe statements in probabilistic programs. We propose a new definition of SLI(P) which is both correct and efficient for probabilistic programs, by including observe dependence in addition to control and data dependences for computing slices. We prove correctness mathematically, and we demonstrate efficiency empirically. We show that by applying the SLI transformation as a pre-pass, we can improve the efficiency of probabilistic inference, not only in our own inference tool R2, but also in other systems for performing inference such as Church and Infer.NET.
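A small worked example may make the role of observe statements concrete. The Python sketch below does not implement the SLI transformation; the three programs are written by hand, and the helper `posterior_of_return` and the program names are assumptions made for this illustration. It computes exact posteriors by enumeration for a program with three fair coins `x`, `y`, `z`, the statement `observe(x or y)`, and return expression `x`.

```python
from fractions import Fraction
from itertools import product

HALF = Fraction(1, 2)

def posterior_of_return(coins, program):
    """Exact P(return is True | observe holds), enumerating all fair-coin outcomes."""
    kept = hits = Fraction(0)
    weight = HALF ** len(coins)
    for values in product([True, False], repeat=len(coins)):
        env = dict(zip(coins, values))
        observed, result = program(env)
        if observed:
            kept += weight
            if result:
                hits += weight
    return hits / kept

def full_program(env):
    # x, y, z ~ Bernoulli(1/2); observe(x or y); return x
    return (env["x"] or env["y"]), env["x"]

def naive_slice(env):
    # Hand-written slice that follows only data and control dependence
    # backward from `return x`: y, z, and the observe statement are dropped.
    return True, env["x"]

def observe_aware_slice(env):
    # Hand-written slice that also keeps the observe statement mentioning x,
    # and therefore y; only the irrelevant coin z is dropped.
    return (env["x"] or env["y"]), env["x"]

p_full = posterior_of_return(["x", "y", "z"], full_program)
p_naive = posterior_of_return(["x"], naive_slice)
p_aware = posterior_of_return(["x", "y"], observe_aware_slice)
```

Here the full program gives 2/3 for the return expression, the slice that ignores the observe statement gives 1/2, and the slice that keeps the observe statement and `y` recovers 2/3 while still eliminating `z`.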
Semantics Sensitive Sampling for Probabilistic Programs
We present a new semantics sensitive sampling algorithm for probabilistic programs, which are “usual” programs endowed with statements to sample from distributions, and condition executions based on observations. Since probabilistic programs are executable, sampling can be performed by repeatedly executing them. However, in the case of programs with a large number of random variables and observations, naive execution does not produce high quality samples, and it takes an intractable number of samples in order to perform reasonable inference. Our MCMC algorithm called S3 tackles these problems using ideas from program analysis. First, S3 propagates observations back through the program in order to obtain a semantically equivalent program with conditional sample statements – this has the effect of preventing rejections due to executions that fail to satisfy observations. Next, S3 decomposes the probabilistic program into a set of straight-line programs, one for every valid program path, and performs Metropolis-Hastings sampling over each straight-line program independently. Sampling over straight-line programs has the advantage that random choices from previous executions can be reused merely using the program counter (or line number) associated with each random choice. Finally, it combines the results from sampling each straight-line program (using appropriate weighting) to produce a result for the whole program. We formalize the semantics of probabilistic programs and rigorously prove the correctness of S3. We also empirically demonstrate the effectiveness of S3, and compare it with an importance sampling based tool over various benchmarks.
Bayesian Machine Learning
2011
The Bayesian approach to machine learning amounts to inferring posterior distributions of random variables from a probabilistic model of how the variables are related (that is, a prior distribution) and a set of observations of variables. There is a trend in machine learning towards expressing Bayesian models as probabilistic programs. As a foundation for this kind of programming, we propose a core functional calculus with primitives for sampling prior distributions and observing variables. We define combinators for measure transformers, based on theorems in measure theory, and use these to give a rigorous semantics to our core calculus. The original features of our semantics include its support for discrete, continuous, and hybrid measures, and, in particular, for observations of zero-probability events. We compile our core language to a small imperative language that in addition to the measure transformer semantics also has a straightforward semantics via factor graphs, data structures that enable many efficient inference algorithms. We use an existing inference engine for efficient approximate inference of posterior marginal distributions, treating thousands of observations per second for large instances of realistic models.
MEASURE TRANSFORMER SEMANTICS FOR BAYESIAN MACHINE LEARNING
2012
"... Vol. 9(3:11)2013, pp. 1–39 www.lmcsonline.org ..."