## ON THE COMPUTABILITY OF CONDITIONAL PROBABILITY

Citations: 3 (3 self)

### BibTeX

@MISC{Ackerman_onthe,

author = {Nathanael L. Ackerman and Cameron E. Freer and Daniel M. Roy},

title = {ON THE COMPUTABILITY OF CONDITIONAL PROBABILITY},

year = {}

}

### Abstract

We study the problem of computing conditional probabilities, a fundamental operation in statistics and machine learning. In the elementary discrete setting, a ratio of probabilities defines conditional probability. In the abstract setting, conditional probability is defined axiomatically and the search for more constructive definitions is the subject of a rich literature in probability theory and statistics. In the discrete or dominated setting, under suitable computability hypotheses, conditional probabilities are computable. However, we show that in general one cannot compute conditional probabilities. We do this by constructing a pair of computable random variables in the unit interval whose conditional distribution encodes the halting problem at almost every point. We show that this result is tight, in the sense that given an oracle for the halting problem, one can compute this conditional distribution. On the other hand, we show that conditioning in abstract settings is computable in the presence of certain additional structure, such as independent absolutely continuous noise.
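The discrete case mentioned in the abstract can be illustrated with a minimal sketch, assuming a finite joint distribution with rational weights. Conditioning then reduces to the elementary ratio P(A ∩ B) / P(B), which exact arithmetic evaluates directly. The function name `condition` and the toy coin model are hypothetical and chosen only for illustration; they do not come from the paper.

```python
from fractions import Fraction


def condition(joint, event_y):
    """Return {x: P(X = x | Y in event_y)} for a finite joint pmf.

    `joint` maps (x, y) pairs to exact rational probabilities.
    This is the elementary ratio rule, valid only when the
    conditioning event has positive probability.
    """
    # Probability of the conditioning event {Y in event_y}.
    p_event = sum(p for (x, y), p in joint.items() if y in event_y)
    if p_event == 0:
        raise ValueError("conditioning event has probability zero")
    posterior = {}
    for (x, y), p in joint.items():
        if y in event_y:
            # Accumulate P(X = x, Y = y) / P(Y in event_y).
            posterior[x] = posterior.get(x, Fraction(0)) + p / p_event
    return posterior


# Hypothetical toy model: X is a fair coin and Y = X + Z for an
# independent fair coin Z, so each (x, y) outcome has weight 1/4.
joint = {(x, x + z): Fraction(1, 4) for x in (0, 1) for z in (0, 1)}
posterior = condition(joint, {1})
```

The ratio rule breaks down exactly where the paper's subject begins: for a continuous observation the event {Y = y} has probability zero, so the sketch raises an error rather than returning a value.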

### Citations

923 |
Language identification in the limit
- Gold
- 1967
Citation Context ...ting considered here could bear on the practice of statistical AI and machine learning. Osherson, Stob, and Weinstein [OSW88] study learning theory in the setting of identifiability in the limit (see =-=[Gol67]-=- and [Put65] for more details on this setting) and prove that a certain type of “computable Bayesian” learner fails to identify the index of a (computably enumerable) set that is “computably identifia... |

850 | Theory of Recursive Functions and Effective Computability - Rogers, Jr. - 1967 |

601 | Markov logic networks
- Richardson, Domingos
- 2006
Citation Context ...om the model’s joint distribution. (See, e.g., IBAL [Pfe01], λ◦ [PPT08], Church [GMR+08], and HANSEI [KS09]. For related and earlier efforts, see, e.g., PHA [Poo91], Infer.NET [MWGK10], Markov Logic =-=[RD06]-=-. Probabilistic programming languages have been the focus of a long tradition of research within programming languages, model checking and formal methods.) In many of these languages, one can easily r... |

574 |
Communication in the presence of noise
- Shannon
- 1949
Citation Context ...mputable random variable corrupted by such noise still admits a computable conditional distribution. This result is analogous to a classical theorem of information theory. Hartley [Har28] and Shannon =-=[Sha49]-=- show that the capacity of a continuous real-valued channel without noise is infinite, yet the addition of, e.g., Gaussian noise with ɛ > 0 variance causes the channel capacity to drop to a finite amo... |

419 |
A formal theory of inductive inference
- Solomonoff
- 1964
Citation Context ....3. Related work. The computability of conditional distributions has been explored for priors that are universal for partial computable functions, as defined using Kolmogorov complexity by Solomonoff =-=[Sol64]-=- and Zvonkin and Levin [ZL70]. Hutter [Hut07] has recently extended classic results about such universal priors in Solomonoff’s theory of prediction. The computability of conditional distributions als... |

357 | Foundations of Modern Probability - Kallenberg - 2002 |

339 | A theory of program size formally identical to information theory - Chaitin - 1975 |

222 |
Data types as lattices
- Scott
- 1976
Citation Context ...table probability comes from domain theory and an analysis of how computations on continuous structures are formed from partial information on an appropriate partial order. Early work is due to Scott =-=[Sco75]-=-, Plotkin [Plo76], and many others, and more recent work on representing probability measures is due to Edalat [Eda96] and others. Recently, these two threads have converged on essentially equivalent ... |

219 | A powerdomain construction
- Plotkin
- 1976
Citation Context ... comes from domain theory and an analysis of how computations on continuous structures are formed from partial information on an appropriate partial order. Early work is due to Scott [Sco75], Plotkin =-=[Plo76]-=-, and many others, and more recent work on representing probability measures is due to Edalat [Eda96] and others. Recently, these two threads have converged on essentially equivalent definitions of co... |

202 |
Average case complete problems
- Levin
- 1986
Citation Context ...lf be noncomputable). In the finite discrete setting, the computational complexity of conditional distributions has also been explored, through extensions of Levin’s theory of average-case complexity =-=[Lev86]-=-. Conditional probabilities for distributions on finite sets of discrete strings are computable, but may not be efficiently so. Suppose f is a one-way function. Then it is difficult to sample from the... |

157 |
Grundbegriffe der Wahrscheinlichkeitsrechnung
- Kolmogorov
- 1933
Citation Context ...mselves. In fact, we show that this is not possible. 1.1. Background. If X is a continuous random variable, then P{X = x} = 0 and the elementary rule from the discrete case does not apply. Kolmogorov =-=[Kol33]-=- gives an axiomatic characterization of conditional probabilities in the abstract setting, but this definition gives no recipe for their calculation. The probability and statistics literature contains... |

137 |
Transmission of information
- Hartley
- 1927
Citation Context ...Corollary 4.19, a computable random variable corrupted by such noise still admits a computable conditional distribution. This result is analogous to a classical theorem of information theory. Hartley =-=[Har28]-=- and Shannon [Sha49] show that the capacity of a continuous real-valued channel without noise is infinite, yet the addition of, e.g., Gaussian noise with ɛ > 0 variance causes the channel capacity to ... |

108 | On the theory of average case complexity
- Ben-David, Chor, et al.
- 1992
Citation Context ...ple from the conditional distribution of the uniform distribution of strings of some length with respect to a given output of f. This intuition is made precise by Ben-David, Chor, Goldreich, and Luby =-=[BCGL92]-=- in their theory of polynomial-time samplable distributions, which has since been extended by Yamakami [Yam99] and others. In Section 4 we explore several general circumstances in which conditioning i... |

72 |
The complexity of nonuniform random number generation, in Algorithms and Complexity: New Directions and Recent Results
- Knuth, Yao
- 1976
Citation Context ... These random variables are manifestly computable in an intuitive sense (and can even be shown to be optimal in their use of input bits, via classic analysis of rational-weight coins by Knuth and Yao =-=[KY76]-=-). Hence it is natural to admit as computable random variables those measurable functions that are computable only on a P-measure one subset of {0, 1}^∞, as we have done. The notion of a computable m... |

62 | IBAL: A probabilistic rational programming language
- Pfeffer
- 2001
Citation Context ...practice. Not only do researchers in probabilistic AI and machine learning conceive of models by defining generative processes, but recently probabilistic functional programming languages (e.g., IBAL =-=[Pfe01]-=-, λ◦ [PPT08], Church [GMR+08], and HANSEI [KS09]) have been proposed and used for universal statistical modeling in these areas and elsewhere. (Within domain theory, idealized functional languages tha... |

62 | Church: A language for generative models - Goodman, Mansinghka, et al. - 2008 |

37 | Uniform test of algorithmic randomness over a general space
- Gács
Citation Context ...nable us to formulate our results. The foundations of the theory include notions of computability for probability measures developed by Edalat [Eda96], Weihrauch [Wei99], Schroeder [Sch07b], and Gács =-=[Gác05]-=-. Computable probability theory itself builds off notions and results in computable analysis. For a general introduction to this approach to real computation, see Weihrauch [Wei00], Braverman [Bra05] ... |

32 | An extension result for continuous valuations
- Alvarez-Manilla, Edalat, et al.
- 1997
Citation Context ...asures is due to Edalat [Eda96] and others. Recently, these two threads have converged on essentially equivalent definitions of computable probability measures in a wide range of settings (see, e.g., =-=[AES00]-=- and [SS06]). We will formulate our results in terms of computable probability measures on computable metric spaces, though similar formulations using computable topological spaces are also possible. ... |

32 | Computing over the reals: foundations for scientific computing
- Braverman, Cook
Citation Context ...ose distributions from which a computer can generate exact samples to arbitrary accuracy. (For a general introduction to this approach to real computation, see Braverman [Bra05] or Braverman and Cook =-=[BC06]-=-.) In Section 3 we provide background on the computability of probability distributions and precise definitions. Computable distributions form a robust class that delineates those distributions from w... |

30 |
A computable ordinary differential equation which possesses no computable solution
- Pour-El, Richards
Citation Context ...ble distributions whose conditional distributions cannot be computed by any possible algorithm. This constitutes a new noncomputability result in analysis, akin to the theorem of Pour-El and Richards =-=[PER79]-=- that there is a computable ordinary differential equation with no computable solution, or more recent results of Braverman and Yampolsky [BY07] on the noncomputability of certain Julia sets. Our resu... |

17 |
The complexity of finite objects and the basing of the concepts of information and randomness on the theory of algorithms
- Zvonkin, Levin
- 1970
Citation Context ...ility of conditional distributions has been explored for priors that are universal for partial computable functions, as defined using Kolmogorov complexity by Solomonoff [Sol64] and Zvonkin and Levin =-=[ZL70]-=-. Hutter [Hut07] has recently extended classic results about such universal priors in Solomonoff’s theory of prediction. The computability of conditional distributions also plays a role in Takahashi’s... |

15 |
A recursive function defined on a compact interval and having a continuous derivative that is not recursive
- Myhill
- 1971
Citation Context ...cover the original condition as σ → 0, by our main noncomputability result (Theorem 6.6) one cannot, in general, compute how small σ must be in order to bound the error introduced by noise. By Myhill =-=[Myh71]-=-, there is a computable function [0, 1] → R whose derivative is continuous, but not computable. However, Pour-El and Richards [PER89, Ch. 1, Thm. 2] show that a twice continuously differentiable compu... |

15 | Representing Bayesian networks within probabilistic Horn abduction
- Poole
- 1991
Citation Context ...e process that produces an exact sample from the model’s joint distribution. (See, e.g., IBAL [Pfe01], λ◦ [PPT08], Church [GMR+08], and HANSEI [KS09]. For related and earlier efforts, see, e.g., PHA =-=[Poo91]-=-, Infer.NET [MWGK10], Markov Logic [RD06]. Probabilistic programming languages have been the focus of a long tradition of research within programming languages, model checking and formal methods.) In ... |

14 |
Effective symbolic dynamics, random points, statistical behavior, complexity and entropy
- Galatolo, Hoyrup, et al.
Citation Context ...etric spaces, as developed in computable analysis, provide a convenient framework for formulating results in computable probability theory. For consistency, we largely use definitions from [HR09] and =-=[GHR10]-=-. Additional details about computable metric spaces can also be found in [Wei00, Ch. 8.1] and [Gác05, §B.3], and their relationship to computable topological spaces is explored in [GSW07]. Computable ... |

14 | Theory of Statistics, Springer Series in Statistics - Schervish - 1995 |

13 | Recursive approximability of real numbers - Zheng |

12 |
Mechanical learners pay a price for Bayesianism
- Osherson, Stob, et al.
- 1988
Citation Context ...complexity results to the more general setting considered here could bear on the practice of statistical AI and machine learning. Osherson, Stob, and Weinstein =-=[OSW88]-=- study learning theory in the setting of identifiability in the limit (see [Gol67] and [Put65] for more details on this setting) and prove that a certain type of “computable Bayesian” learner fails to... |

11 | Computability of probability measures and Martin-Löf randomness over metric spaces
- Hoyrup, Rojas
- 2009
Citation Context ...omputable metric spaces, as developed in computable analysis, provide a convenient framework for formulating results in computable probability theory. For consistency, we largely use definitions from =-=[HR09]-=- and [GHR10]. Additional details about computable metric spaces can also be found in [Wei00, Ch. 8.1] and [Gác05, §B.3], and their relationship to computable topological spaces is explored in [GSW07].... |

10 | Computability and randomness. Oxford logic guides - Nies - 2009 |

10 | A probabilistic language based on sampling functions
- Park, Pfenning, et al.
- 2005
Citation Context ...ot only do researchers in probabilistic AI and machine learning conceive of models by defining generative processes, but recently probabilistic functional programming languages (e.g., IBAL [Pfe01], λ◦ =-=[PPT08]-=-, Church [GMR+08], and HANSEI [KS09]) have been proposed and used for universal statistical modeling in these areas and elsewhere. (Within domain theory, idealized functional languages that can manip... |

10 | Admissible representations for probability measures
- Schröder
- 2007
Citation Context ...erform something like automatic numerical analysis. Computable probability measures in this context have been analyzed by Weihrauch [Wei99], Schröder =-=[Sch07]-=-, and others. Another approach to computable probability comes from domain theory and an analysis of how computations on continuous structures are formed from partial information on an appropriate par... |

10 |
The Scott topology induces the weak topology
- Edalat
- 1996
Citation Context ...background on computable probability theory, which will enable us to formulate our results. The foundations of the theory include notions of computability for probability measures developed by Edalat =-=[Eda96]-=-, Weihrauch [Wei99], Schroeder [Sch07b], and Gács [Gác05]. Computable probability theory itself builds off notions and results in computable analysis. For a general introduction to this approach to re... |

10 |
Trial and error predicates and the solution to a problem of
- Putnam
- 1965
Citation Context ...red here could bear on the practice of statistical AI and machine learning. Osherson, Stob, and Weinstein [OSW88] study learning theory in the setting of identifiability in the limit (see [Gol67] and =-=[Put65]-=- for more details on this setting) and prove that a certain type of “computable Bayesian” learner fails to identify the index of a (computably enumerable) set that is “computably identifiable” in the ... |

8 |
Computability by probabilistic machines, in Automata Studies, Annals of Mathematics Studies 34
- de Leeuw, Moore, et al.
- 1956
Citation Context ...characterized computable measures on a wide variety of topological spaces. However, the computability theory of probability distributions extends back to work of de Leeuw, Moore, Shannon, and Shapiro =-=[dMSS56]-=-, and the computability of real functions was initiated by Grzegorczyk [Grz57], Mazur [Maz63], and others. The notion of a computable real number extends back to Turing’s foundational paper [Tur36]. M... |

8 | Computable exchangeable sequences have computable de Finetti measures
- Freer, Roy
- 2009
Citation Context ...for generating an exchangeable sequence into a rule for computing the posterior distribution of the directing random measure. The result is a corollary of a computable version of de Finetti’s theorem =-=[FR09]-=-, and covers a wide range of common scenarios in nonparametric Bayesian statistics (often where no density exists). 2. Conditional distributions The notion of a conditional distribution is meant to ca... |

8 | Noncomputable conditional distributions
- Ackerman, Freer, et al.
- 2011
Citation Context ...ts). Acknowledgments A preliminary version of this article appeared as “Noncomputable conditional distributions” in Proceedings of the 26th Annual IEEE Symposium on Logic in Computer Science, 107–116 =-=[AFR11]-=-. CEF was partially supported by NSF grants DMS-0901020 and DMS-0800198. His work on this publication was made possible through the support of a grant from the John Templeton Foundation. The opinions ... |

7 | Representing probability measures using probabilistic processes
- Schröder, Simpson
Citation Context ...o [dMSS56], and the computability of real functions was initiated by Grzegorczyk [Grz57], Mazur [Maz63], and others. The notion of a computable real number extends back to Turing’s foundational paper =-=[Tur36]-=-. More recently, Pour-El and Richards [PER89], Weihrauch [Wei89], and others have brought in methods from constructive analysis. There has been much recent work following their approach to computable ... |

7 | Polynomial time samplable distributions
- Yamakami
- 1999
Citation Context ...iven output of f. This intuition is made precise by Ben-David, Chor, Goldreich, and Luby [BCGL92] in their theory of polynomial-time samplable distributions, which has since been extended by Yamakami =-=[Yam99]-=- and others. In Section 4 we explore several general circumstances in which conditioning is computable. Freer and Roy [FR10] show how to compute conditional distributions in a situation with a rather ... |

7 | Applications of effective probability theory to Martin-Löf randomness - Hoyrup, Rojas |

7 |
Domain representability of metric spaces. Annals of Pure and Applied Logic 83
- Blanck
- 1997
Citation Context ...ble if and only if it is both lower and upper semicomputable. 2.2. Computable Metric Spaces. Computable metric spaces, as developed in computable analysis [Hem02], [Wei93] and effective domain theory =-=[JB97]-=-, [EH98], provide a convenient framework for formulating results in computable probability theory. For consistency, we largely use definitions from [HR09a] and [GHR10]. Additional details about comput... |

6 |
Notions of probabilistic computability on represented spaces
- Bosserhoff
Citation Context ...s, uniformly in i1, . . . , ik. (These representations for computable probability measures largely coincide with those for measures on computable topological spaces in Schröder [Sch07] and Bosserhoff =-=[Bos08]-=-, and agree with Weihrauch [Wei99] for measures on the unit interval.) Thus the above analysis of what we learn by simulating a computable random variable X in S shows that PX is a computable point in... |

6 | Induction and recursion on the partial real line with applications to
- Escardó, Streicher
- 1999
Citation Context ...osed and used for universal statistical modeling in these areas and elsewhere. (Within domain theory, idealized functional languages that can manipulate exact real numbers, such as Escardó’s RealPCF+ =-=[ES99]-=- and Plotkin’s PCF++ [Plo77], have also been extended by probabilistic choice operators, e.g., by Escardó [Esc09] and Saheb-Djahromi [SD78].) Such languages can naturally represent the higher-order... |

6 |
On universal prediction and Bayesian confirmation. Theoretical Computer Science (in press)
- Hutter
- 2007
Citation Context ...ional distributions has been explored for priors that are universal for partial computable functions, as defined using Kolmogorov complexity by Solomonoff [Sol64] and Zvonkin and Levin [ZL70]. Hutter =-=[Hut07]-=- has recently extended classic results about such universal priors in Solomonoff’s theory of prediction. The computability of conditional distributions also plays a role in Takahashi’s work on the alg... |

6 |
Conditional distributions as derivatives
- Pfanzagl
- 1979
Citation Context ...es, like the dominated case above. There have been several attempts at more general techniques, but even the most constructive definitions (e.g., those due to Tjur [Tju74], [Tju75], [Tju80], Pfanzagl =-=[Pfa79]-=-, and Rao [Rao88], [Rao05]) are often not sensitive to issues of computability. In particular, almost every characterization requires the computation of a limit for which we have no computable bound o... |

6 | Effective metric spaces and representations of the reals
- Hemmerling
Citation Context ...e. uniformly in n). The function f is computable if and only if it is both lower and upper semicomputable. 2.2. Computable Metric Spaces. Computable metric spaces, as developed in computable analysis =-=[Hem02]-=-, [Wei93] and effective domain theory [JB97], [EH98], provide a convenient framework for formulating results in computable probability theory. For consistency, we largely use definitions from [HR09a] ... |

6 |
Conditional Probability Distributions. Lecture Notes, no
- Tjur
- 1974
Citation Context ...ontains many ad-hoc techniques for calculating conditional probabilities in special circumstances, and this state of affairs motivated much work on constructive definitions (such as those due to Tjur =-=[Tju74]-=-, [Tju75], [Tju80], Pfanzagl [Pfa79], and Rao [Rao88], [Rao05]), but this work has often not been sensitive to issues of computability. We recall the basics of the measure-theoretic approach to condit... |

5 |
A computational model for metric spaces, Theoret
- Edalat, Heckmann
- 1998
Citation Context ... in [Wei00, Ch. 8.1] and [Gác05, §B.3], and their relationship to computable topological spaces is explored in [GSW07]. Computable measures on metric spaces have also been studied using domain theory =-=[EH98]-=-. Definition 3.1 (Computable metric space [GHR10, Def. 2.3.1]). A computable metric space is a triple (S, δ, D) for which δ is a metric on the set S satisfying (1) (S, δ) is a complete separable metri... |

4 | Constructing non-computable Julia sets
- Braverman, Yampolsky
- 2007
Citation Context ... analysis, akin to the theorem of Pour-El and Richards [PER79] that there is a computable ordinary differential equation with no computable solution, or more recent results of Braverman and Yampolsky =-=[BY07]-=- on the noncomputability of certain Julia sets. Our results are formulated in the Turing-machine-based bit-model for real computation, using a natural notion of computable probability measure which co... |

4 |
Semi-decidability of may, must and probabilistic testing in a higher-type setting, Electron
- Escardó
Citation Context ...functional languages that can manipulate exact real numbers, such as Escardó’s RealPCF+ [ES99] and Plotkin’s PCF++ [Plo77], have also been extended by probabilistic choice operators, e.g., by Escardó =-=[Esc09]-=- and Saheb-Djahromi [SD78].) Such languages can naturally represent the higher-order objects used in machine learning (e.g., a distribution on distrib... |

4 |
Paradoxes in conditional probability
- Rao
- 1988
Citation Context ...nated case above. There have been several attempts at more general techniques, but even the most constructive definitions (e.g., those due to Tjur [Tju74], [Tju75], [Tju80], Pfanzagl [Pfa79], and Rao =-=[Rao88]-=-, [Rao05]) are often not sensitive to issues of computability. In particular, almost every characterization requires the computation of a limit for which we have no computable bound on the rate of con... |