Results 1 - 9 of 9
Measure Transformer Semantics for Bayesian Machine Learning
"... Abstract. The Bayesian approach to machine learning amounts to inferring posterior distributions of random variables from a probabilistic model of how the variables are related (that is, a prior distribution) and a set of observations of variables. There is a trend in machine learning towards expres ..."
Abstract
-
Cited by 15 (3 self)
The Bayesian approach to machine learning amounts to inferring posterior distributions of random variables from a probabilistic model of how the variables are related (that is, a prior distribution) and a set of observations of variables. There is a trend in machine learning towards expressing Bayesian models as probabilistic programs. As a foundation for this kind of programming, we propose a core functional calculus with primitives for sampling prior distributions and observing variables. We define combinators for measure transformers, based on theorems in measure theory, and use these to give a rigorous semantics to our core calculus. The original features of our semantics include its support for discrete, continuous, and hybrid measures, and, in particular, for observations of zero-probability events. We compile our core language to a small imperative language that has a straightforward semantics via factor graphs, data structures that enable many efficient inference algorithms. We use an existing inference engine for efficient approximate inference of posterior marginal distributions, treating thousands of observations per second for large instances of realistic models.
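To make the sample-and-observe style concrete, here is a minimal OCaml sketch for finite discrete measures only: a measure is a list of weighted values, sequencing and observation are combinators on measures, and normalization recovers the posterior. This is an illustration invented for this listing, not the paper's calculus, its measure transformer combinators, or its factor-graph compilation; in particular it cannot express continuous or hybrid measures or observations of zero-probability events.

(* Finite discrete measures as weighted value lists, with combinators for
   sequencing (bind), observation (conditioning), and normalization. *)
type 'a measure = ('a * float) list        (* value, unnormalized weight *)

let return x : 'a measure = [ (x, 1.0) ]

(* Sequencing: push a measure through a dependent probabilistic computation. *)
let bind (m : 'a measure) (f : 'a -> 'b measure) : 'b measure =
  List.concat_map
    (fun (x, w) -> List.map (fun (y, w') -> (y, w *. w')) (f x)) m

(* Observation: restrict the measure to outcomes satisfying a predicate. *)
let observe (p : 'a -> bool) (m : 'a measure) : 'a measure =
  List.filter (fun (x, _) -> p x) m

(* Renormalization recovers a probability distribution from what remains. *)
let normalize (m : 'a measure) : 'a measure =
  let z = List.fold_left (fun acc (_, w) -> acc +. w) 0.0 m in
  List.map (fun (x, w) -> (x, w /. z)) m

let coin p : bool measure = [ (true, p); (false, 1.0 -. p) ]

(* Two fair coins; observe that at least one is heads; posterior over both. *)
let posterior =
  bind (coin 0.5) (fun a ->
  bind (coin 0.5) (fun b ->
  return (a, b)))
  |> observe (fun (a, b) -> a || b)
  |> normalize

let () =
  List.iter (fun ((a, b), w) -> Printf.printf "(%b, %b): %.3f\n" a b w) posterior

Running it prints weight 1/3 for each of the three outcomes consistent with the observation, matching a hand calculation of the posterior.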
Delimited Control in OCaml, Abstractly and Concretely: System Description
"... Abstract. We describe the first implementation of multi-prompt delimited control operators in OCaml that is direct in that it captures only the needed part of the control stack. The implementation is a library that requires no changes to the OCaml compiler or run-time, so it is perfectly compatible ..."
Abstract
-
Cited by 8 (1 self)
We describe the first implementation of multi-prompt delimited control operators in OCaml that is direct in that it captures only the needed part of the control stack. The implementation is a library that requires no changes to the OCaml compiler or run-time, so it is perfectly compatible with existing OCaml source code and byte-code. The library has been in fruitful practical use for four years. We present the library as an implementation of an abstract machine derived by elaborating the definitional machine. The abstract view lets us distill a minimalistic API, scAPI, sufficient for implementing multi-prompt delimited control. We argue that a language system that supports exception and stack-overflow handling supports scAPI. Our library illustrates how to use scAPI to implement multi-prompt delimited control in a typed language. The approach is general and can be used to add multi-prompt delimited control to other existing language systems.
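As a usage illustration (not drawn from the paper), the following OCaml sketch exercises multi-prompt delimited control in the style of the delimcc library; it assumes the library's Delimcc module with new_prompt, push_prompt, and shift, whose signatures are quoted from memory and should be checked against the installed interface. It answers a delimited subcomputation once per alternative, turning a choice point into list nondeterminism.

(* Requires linking against the delimcc library. *)
open Delimcc

(* Choose one element of xs inside a delimited context answering 'a list:
   the captured continuation is run once per alternative and the resulting
   answer lists are concatenated. *)
let choose (p : 'a list prompt) (xs : 'b list) : 'b =
  shift p (fun k -> List.concat_map k xs)

(* Collect every result of a computation that uses choose. *)
let all_results (f : 'a list prompt -> 'a) : 'a list =
  let p = new_prompt () in
  push_prompt p (fun () -> [ f p ])

let () =
  let pairs =
    all_results (fun p ->
        let x = choose p [ 1; 2; 3 ] in
        let y = choose p [ 10; 20 ] in
        (x, y))
  in
  List.iter (fun (x, y) -> Printf.printf "(%d, %d)\n" x y) pairs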
A compilation target for probabilistic programming languages
In ICML, 2014
"... Forward inference techniques such as sequential Monte Carlo and particle Markov chain Monte Carlo for probabilistic programming can be im-plemented in any programming language by cre-ative use of standardized operating system func-tionality including processes, forking, mutexes, and shared memory. E ..."
Abstract
-
Cited by 3 (2 self)
Forward inference techniques such as sequential Monte Carlo and particle Markov chain Monte Carlo for probabilistic programming can be implemented in any programming language by creative use of standardized operating system functionality including processes, forking, mutexes, and shared memory. Exploiting this we have defined, developed, and tested a probabilistic programming intermediate representation language we call probabilistic C, which itself can be compiled to machine code by standard compilers and linked to operating system libraries, yielding an efficient, scalable, portable probabilistic programming compilation target. This opens up a new hardware and systems research path for optimizing probabilistic programming systems.
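The sketch below (OCaml rather than C, and not the Probabilistic C system itself) illustrates the operating-system trick the abstract relies on: fork() hands each child a copy-on-write snapshot of the running program, so a particle can be cloned at an observation point simply by forking the process. A real system would also return weights to the parent through pipes or shared memory, as the abstract mentions.

(* Requires the unix library (e.g. ocamlfind ocamlopt -package unix -linkpkg). *)

let simulate_from_snapshot particle_id =
  (* Each child continues the simulation with its own random draws. *)
  Random.self_init ();
  let x = Random.float 1.0 in
  Printf.printf "particle %d continued with draw %.3f (pid %d)\n"
    particle_id x (Unix.getpid ())

let () =
  (* Imagine we are midway through one execution of a probabilistic program
     and want to clone it into n particles at an observation point. *)
  let n = 4 in
  for i = 0 to n - 1 do
    match Unix.fork () with
    | 0 ->
        (* Child: inherits a copy-on-write snapshot of the parent's state. *)
        simulate_from_snapshot i;
        exit 0
    | _pid -> ()                        (* Parent: keep spawning clones. *)
  done;
  (* Parent waits for every clone to finish. *)
  for _ = 1 to n do
    ignore (Unix.wait ())
  done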
From Bayesian notation to pure Racket, via measure-theoretic probability in λZFC
In Implementation and Application of Functional Languages, 2010
"... Abstract. Bayesian practitioners build models of the world without regarding how difficult it will be to answer questions about them. When answering questions, they put off approximating as long as possible, and usually must write programs to compute converging approximations. Writing the programs i ..."
Abstract
-
Cited by 1 (1 self)
Bayesian practitioners build models of the world without regard to how difficult it will be to answer questions about them. When answering questions, they put off approximating as long as possible, and usually must write programs to compute converging approximations. Writing the programs is distracting, tedious and error-prone, and we wish to relieve them of it by providing languages and compilers. Their style constrains our work: the tools we provide cannot approximate early. Our approach to meeting this constraint is to 1) determine their notation’s meaning in a suitable theoretical framework; 2) generalize our interpretation in an uncomputable, exact semantics; 3) approximate the exact semantics and prove convergence; and 4) implement the approximating semantics in Racket (formerly PLT Scheme). In this way, we define languages with at least as much exactness as Bayesian practitioners have in mind, and also put off approximating as long as possible. In this paper, we demonstrate the approach using our preliminary work on discrete (countably infinite) Bayesian models.
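As a toy illustration of putting off approximation for a discrete (countably infinite) model, not the paper's λZFC semantics or its Racket implementation, the OCaml sketch below enumerates a geometric prior to a finite depth and reports a truncated posterior; increasing the depth gives a converging approximation. The likelihood function is invented purely for the example.

(* Prior: n ~ Geometric(1/2) over {0, 1, 2, ...}. *)
let prior n = 0.5 ** float_of_int (n + 1)

(* Hypothetical likelihood of an observation given n, for illustration only. *)
let likelihood n = 1.0 -. 0.5 ** float_of_int (n + 1)

(* Truncated posterior probability that n is even, using the first depth
   support points; increasing depth yields a converging approximation. *)
let posterior_even depth =
  let terms = List.init depth (fun n -> (n, prior n *. likelihood n)) in
  let z = List.fold_left (fun acc (_, w) -> acc +. w) 0.0 terms in
  let even =
    List.fold_left
      (fun acc (n, w) -> if n mod 2 = 0 then acc +. w else acc) 0.0 terms
  in
  even /. z

let () =
  List.iter
    (fun d -> Printf.printf "depth %4d: P(even | obs) ~ %.6f\n" d (posterior_even d))
    [ 4; 8; 16; 32; 64 ]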
A Model-Learner Pattern for Bayesian Reasoning
Andrew D. Gordon (Microsoft Research and University of Edinburgh), Mihhail Aizatulin (Open University)
"... A Bayesian model is based on a pair of probability distributions, known as the prior and sampling distributions. A wide range of fundamental machine learning tasks, including regression, classi-fication, clustering, and many others, can all be seen as Bayesian models. We propose a new probabilistic ..."
Abstract
- Add to MetaCart
(Show Context)
A Bayesian model is based on a pair of probability distributions, known as the prior and sampling distributions. A wide range of fundamental machine learning tasks, including regression, classification, clustering, and many others, can all be seen as Bayesian models. We propose a new probabilistic programming abstraction, a typed Bayesian model, based on a pair of probabilistic expressions for the prior and sampling distributions. A sampler for a model is an algorithm to compute synthetic data from its sampling distribution, while a learner for a model is an algorithm for probabilistic inference on the model. Models, samplers, and learners form a generic programming pattern for model-based inference. They support the uniform expression of common tasks including model testing, and generic compositions such as mixture models, evidence-based model averaging, and mixtures of experts. A formal semantics supports reasoning about model equivalence and implementation correctness. By developing a series of examples and three learner implementations based on exact inference, factor graphs, and Markov chain Monte Carlo, we demonstrate the broad applicability of this new programming pattern.
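A hedged OCaml sketch of the shape of this pattern, with types invented for this listing (the paper phrases models as pairs of probabilistic expressions in its own calculus rather than as the sampler functions used here): a model pairs a prior with a sampling distribution, a sampler draws synthetic data from the model, and a learner maps a model plus observed data to a posterior.

(* Distributions are represented abstractly as samplers from a random source. *)
type 'a dist = Random.State.t -> 'a

(* A typed Bayesian model: a prior over parameters 'w and, given parameters
   and an input 'x, a sampling distribution over outputs 'y. *)
type ('w, 'x, 'y) model = {
  prior    : 'w dist;
  sampling : 'w -> 'x -> 'y dist;
}

(* A sampler computes synthetic data from the model's sampling distribution. *)
let sample (m : ('w, 'x, 'y) model) (xs : 'x list) (rng : Random.State.t) :
    'w * ('x * 'y) list =
  let w = m.prior rng in
  (w, List.map (fun x -> (x, m.sampling w x rng)) xs)

(* A learner performs probabilistic inference: model plus data to posterior.
   Only the type is declared here; the paper describes learners based on
   exact inference, factor graphs, and Markov chain Monte Carlo. *)
type ('w, 'x, 'y) learner = ('w, 'x, 'y) model -> ('x * 'y) list -> 'w dist

(* Example: linear regression, with a crude sum-of-uniforms stand-in for a
   Gaussian so the sketch stays dependency-free. *)
let gaussian mu sigma : float dist =
  fun rng ->
    let s = ref 0.0 in
    for _ = 1 to 12 do s := !s +. Random.State.float rng 1.0 done;
    mu +. sigma *. (!s -. 6.0)

let linear_regression : (float * float, float, float) model = {
  prior = (fun rng -> (gaussian 0.0 1.0 rng, gaussian 0.0 1.0 rng));
  sampling = (fun (a, b) x -> gaussian ((a *. x) +. b) 0.1);
}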
Slicing Probabilistic Programs
"... Probabilistic programs use familiar notation of programming lan-guages to specify probabilistic models. Suppose we are interested in estimating the distribution of the return expression r of a prob-abilistic program P. We are interested in slicing the probabilistic program P and obtaining a simpler ..."
Abstract
- Add to MetaCart
(Show Context)
Probabilistic programs use familiar notation of programming languages to specify probabilistic models. Suppose we are interested in estimating the distribution of the return expression r of a probabilistic program P. We are interested in slicing the probabilistic program P and obtaining a simpler program SLI(P) which retains only those parts of P that are relevant to estimating r, and elides those parts of P that are not relevant to estimating r. We desire that the SLI transformation be both correct and efficient. By correct, we mean that P and SLI(P) have identical estimates on r. By efficient, we mean that estimation over SLI(P) be as fast as possible. We show that the usual notion of program slicing, which traverses control and data dependencies backward from the return expression r, is unsatisfactory for probabilistic programs, since it produces incorrect slices on some programs and sub-optimal ones on others. Our key insight is that in addition to the usual notions of control dependence and data dependence that are used to slice non-probabilistic programs, a new kind of dependence called observe dependence arises naturally due to observe statements in probabilistic programs. We propose a new definition of SLI(P) which is both correct and efficient for probabilistic programs, by including observe dependence in addition to control and data dependences for computing slices. We prove correctness mathematically, and we demonstrate efficiency empirically. We show that by applying the SLI transformation as a pre-pass, we can improve the efficiency of probabilistic inference, not only in our own inference tool R2, but also in other systems for performing inference such as Church and Infer.NET.
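A small self-contained check of observe dependence (illustrative only, not the paper's slicing algorithm or its R2 implementation): in the program x ~ Bernoulli(0.5); y ~ Bernoulli(0.5); observe(x || y); return x, the return expression has no data or control dependence on y, yet a slice that drops the observe statement (and y with it) changes the estimate of x. The OCaml snippet below verifies this by exhaustive enumeration.

(* Enumerate the two-coin program and estimate P(x = true) with and without
   the observe statement. *)
let outcomes = [ (true, true); (true, false); (false, true); (false, false) ]

let prob_x_true ~with_observe =
  let kept =
    if with_observe then List.filter (fun (x, y) -> x || y) outcomes
    else outcomes
  in
  let total = float_of_int (List.length kept) in
  let x_true = float_of_int (List.length (List.filter fst kept)) in
  x_true /. total

let () =
  (* 2/3 with the observe, 1/2 without it: a slice computed only from data
     and control dependence would be incorrect. *)
  Printf.printf "with observe   : P(x = true) = %.3f\n" (prob_x_true ~with_observe:true);
  Printf.printf "without observe: P(x = true) = %.3f\n" (prob_x_true ~with_observe:false)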
Semantics Sensitive Sampling for Probabilistic Programs
"... We present a new semantics sensitive sampling algorithm for probabilistic pro-grams, which are “usual ” programs endowed with statements to sample from distributions, and condition executions based on observations. Since probabilis-tic programs are executable, sampling can be performed by repeatedly ..."
Abstract
- Add to MetaCart
We present a new semantics sensitive sampling algorithm for probabilistic programs, which are “usual” programs endowed with statements to sample from distributions, and condition executions based on observations. Since probabilistic programs are executable, sampling can be performed by repeatedly executing them. However, in the case of programs with a large number of random variables and observations, naive execution does not produce high quality samples, and it takes an intractable number of samples in order to perform reasonable inference. Our MCMC algorithm called S3 tackles these problems using ideas from program analysis. First, S3 propagates observations back through the program in order to obtain a semantically equivalent program with conditional sample statements – this has the effect of preventing rejections due to executions that fail to satisfy observations. Next, S3 decomposes the probabilistic program into a set of straight-line programs, one for every valid program path, and performs Metropolis-Hastings sampling over each straight-line program independently. Sampling over straight-line programs has the advantage that random choices from previous executions can be re-used merely using the program counter (or line number) associated with each random choice. Finally, it combines the results from sampling each straight-line program (using appropriate weighting) to produce a result for the whole program. We formalize the semantics of probabilistic programs and rigorously prove the correctness of S3. We also empirically demonstrate the effectiveness of S3, and compare it with an importance sampling based tool over various benchmarks.
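The sketch below illustrates one ingredient named in the abstract: reusing random choices from a previous execution of a straight-line program by keying them on the program counter (line number) of each sample statement. It is a generic single-site proposal written in that style for this listing; it is not the S3 algorithm, and it omits observation propagation, path decomposition, weighting, and the Metropolis-Hastings accept/reject step.

module IntMap = Map.Make (Int)

type trace = float IntMap.t            (* program counter -> random draw *)

(* Run a straight-line program, reusing draws from old_trace whenever a
   program counter is present and drawing fresh values otherwise. *)
let run_with_trace (prog : (int -> float) -> 'a) (old_trace : trace) :
    'a * trace =
  let trace = ref IntMap.empty in
  let sample_at pc =
    let v =
      match IntMap.find_opt pc old_trace with
      | Some v -> v
      | None -> Random.float 1.0
    in
    trace := IntMap.add pc v !trace;
    v
  in
  let result = prog sample_at in
  (result, !trace)

(* Propose by resampling the draw at a single program counter. *)
let propose (t : trace) : trace =
  let keys = List.map fst (IntMap.bindings t) in
  let pc = List.nth keys (Random.int (List.length keys)) in
  IntMap.add pc (Random.float 1.0) t

(* Example straight-line program with sample statements at "lines" 1 and 2. *)
let program sample_at =
  let a = sample_at 1 in
  let b = sample_at 2 in
  a +. b

let () =
  Random.self_init ();
  let _, t0 = run_with_trace program IntMap.empty in
  let v1, _ = run_with_trace program (propose t0) in
  Printf.printf "value after one single-site proposal: %.3f\n" v1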
Measure Transformer Semantics for Bayesian Machine Learning
2011
"... The Bayesian approach to machine learning amounts to inferring posterior distributions of random variables from a probabilistic model of how the variables are related (that is, a prior distribution) and a set of observations of variables. There is a trend in machine learning towards expressing Bay ..."
Abstract
- Add to MetaCart
(Show Context)
The Bayesian approach to machine learning amounts to inferring posterior distributions of random variables from a probabilistic model of how the variables are related (that is, a prior distribution) and a set of observations of variables. There is a trend in machine learning towards expressing Bayesian models as probabilistic programs. As a foundation for this kind of programming, we propose a core functional calculus with primitives for sampling prior distributions and observing variables. We define combinators for measure transformers, based on theorems in measure theory, and use these to give a rigorous semantics to our core calculus. The original features of our semantics include its support for discrete, continuous, and hybrid measures, and, in particular, for observations of zero-probability events. We compile our core language to a small imperative language that, in addition to the measure transformer semantics, also has a straightforward semantics via factor graphs, data structures that enable many efficient inference algorithms. We use an existing inference engine for efficient approximate inference of posterior marginal distributions, treating thousands of observations per second for large instances of realistic models.
Measure Transformer Semantics for Bayesian Machine Learning
2012
"... Vol. 9(3:11)2013, pp. 1–39 www.lmcs-online.org ..."