Results 1 -
8 of
8
Probabilistic Logic Programming under Maximum Entropy
- In Proc. ECSQARU-99, LNCS 1638
, 1999
"... . In this paper, we focus on the combination of probabilistic logic programming with the principle of maximum entropy. We start by defining probabilistic queries to probabilistic logic programs and their answer substitutions under maximum entropy. We then present an efficient linear programming char ..."
Abstract
-
Cited by 17 (5 self)
- Add to MetaCart
. In this paper, we focus on the combination of probabilistic logic programming with the principle of maximum entropy. We start by defining probabilistic queries to probabilistic logic programs and their answer substitutions under maximum entropy. We then present an efficient linear programming characterization for the problem of deciding whether a probabilistic logic program is satisfiable. Finally, and as a central contribution of this paper, we introduce an efficient technique for approximative probabilistic logic programming under maximum entropy. This technique reduces the original entropy maximization task to solving a modified and relatively small optimization problem. 1 Introduction Probabilistic propositional logics and their various dialects are thoroughly studied in the literature (see especially [19] and [5]; see also [15] and [16]). Their extensions to probabilistic first-order logics can be classified into first-order logics in which probabilities are defined over the do...
Foundations for Bayesian networks
, 2001
"... Bayesian networks are normally given one of two types of foundations: they are either treated purely formally as an abstract way of representing probability functions, or they are interpreted, with some causal interpretation given to the graph in a network and some standard interpretation of probabi ..."
Abstract
-
Cited by 9 (6 self)
- Add to MetaCart
Bayesian networks are normally given one of two types of foundations: they are either treated purely formally as an abstract way of representing probability functions, or they are interpreted, with some causal interpretation given to the graph in a network and some standard interpretation of probability given to the probabilities specified in the network. In this chapter I argue that current foundations are problematic, and put forward new foundations which involve aspects of both the interpreted and the formal approaches. One standard approach is to interpret a Bayesian network objectively: the graph in a Bayesian network represents causality in the world and the specified probabilities are objective, empirical probabilities. Such an interpretation founders when the Bayesian network independence assumption (often called the causal Markov condition) fails to hold. In §2 I catalogue the occasions when the independence assumption fails, and show that such failures are pervasive. Next, in §3, I show that even where the independence assumption does hold objectively, an agent’s causal knowledge is unlikely to satisfy the assumption with respect to her subjective probabilities, and that slight differences between an agent’s subjective Bayesian network and an objective Bayesian network can lead to large differences between probability distributions determined by these networks. To overcome these difficulties I put forward logical Bayesian foundations in §5. I show that if the graph and probability specification in a Bayesian network are thought of as an agent’s background knowledge, then the agent is most rational if she adopts the probability distribution determined by the
Credal Networks under Maximum Entropy
- In Proceedings UAI-2000
, 2000
"... We apply the principle of maximum entropy to select a unique joint probability distribution from the set of all joint probability distributions specified by a credal network. In detail, we start by showing that the unique joint distribution of a Bayesian tree coincides with the maximum entropy ..."
Abstract
-
Cited by 5 (4 self)
- Add to MetaCart
We apply the principle of maximum entropy to select a unique joint probability distribution from the set of all joint probability distributions specified by a credal network. In detail, we start by showing that the unique joint distribution of a Bayesian tree coincides with the maximum entropy model of its conditional distributions. This result, however, does not hold anymore for general Bayesian networks. We thus present a new kind of maximum entropy models, which are computed sequentially. We then show that for all general Bayesian networks, the sequential maximum entropy model coincides with the unique joint distribution. Moreover, we apply the new principle of sequential maximum entropy to interval Bayesian networks and more generally to credal networks. We especially show that this application is equivalent to a number of small local entropy maximizations. 1 INTRODUCTION In classical Bayesian networks [31], a single joint probability distribution is specified b...
Default Reasoning Using Maximum Entropy and Variable Strength Defaults
, 1999
"... The thesis presents a computational model for reasoning with partial information which uses default rules or information about what normally happens. The idea is to provide a means of filling the gaps in an incomplete world view with the most plausible assumptions while allowing for the retraction o ..."
Abstract
-
Cited by 4 (1 self)
- Add to MetaCart
The thesis presents a computational model for reasoning with partial information which uses default rules or information about what normally happens. The idea is to provide a means of filling the gaps in an incomplete world view with the most plausible assumptions while allowing for the retraction of conclusions should they subsequently turn out to be incorrect. The model can be used both to reason from a given knowledge base of default rules, and to aid in the construction of such knowledge bases by allowing their designer to compare the consequences of his design with his own default assumptions. The conclusions supported by the proposed model are justified by the use of a probabilistic semantics for default rules in conjunction with the application of a rational means of inference from incomplete knowledge---the principle of maximum entropy (ME). The thesis develops both the theory and algorithms for the ME approach and argues that it should be considered as a general theory of default reasoning. The argument supporting the thesis has two main threads. Firstly, the ME approach is tested on the benchmark examples required of nonmonotonic behaviour, and it is found to handle them appropriately. Moreover, these patterns of commonsense reasoning emerge as consequences of the chosen semantics rather than being design features. It is argued that this makes the ME approach more objective, and its conclusions more justifiable, than other default systems. Secondly, the ME approach is compared with two existing systems: the lexicographic approach (LEX) and system Z + . It is shown that the former can be equated with ME under suitable conditions making it strictly less expressive, while the latter is too crude to perform the subtle resolution of default conflict which the ME...
Measure selection: Notions of rationality and representation independence
- Proceedings of the 14th conference on Uncertainty in Artificial Intelligence
, 1998
"... We take another look at the general problem of selecting a preferred probability measure among those that comply with some given constraints. The dominant role that entropy maximization has obtained in this context is questioned by arguing that the minimum information principle on which it is based ..."
Abstract
-
Cited by 3 (1 self)
- Add to MetaCart
We take another look at the general problem of selecting a preferred probability measure among those that comply with some given constraints. The dominant role that entropy maximization has obtained in this context is questioned by arguing that the minimum information principle on which it is based could be supplanted by an at least as plausible “likelihood of evidence ” principle. We then review a method for turning given selection functions into representation independent variants, and discuss the tradeoffs involved in this transformation. setI(J) 1
Rationality as conformity
- Synthese (Knowledge, Rationality and Action), 144(2):249 – 285
, 2005
"... Abstract 5 ..."
Deceptive Updating and Minimal Information Methods
"... The technique of minimizing information (infomin) has been commonly employed as a general method for both choosing and updating a subjective probability function. We argue that, in a wide class of cases, the use of infomin methods fails to cohere with our standard conception of rational degrees of b ..."
Abstract
- Add to MetaCart
The technique of minimizing information (infomin) has been commonly employed as a general method for both choosing and updating a subjective probability function. We argue that, in a wide class of cases, the use of infomin methods fails to cohere with our standard conception of rational degrees of belief. We introduce the notion of a deceptive updating method, and argue that non-deceptiveness is a necessary condition for rational coherence. Infomin has been criticized on the grounds that there are no higher order probabilities that ‘support ’ it, but the appeal to higher order probabilities is a substantial assumption that some might reject. The elementary arguments from deceptiveness do not rely on this assumption. While deceptiveness implies lack of higher order support, the converse does not, in general, hold, which indicates that deceptiveness is a more objectionable property. We offer a new proof of the claim that infomin updating of any strictly-positive prior with respect to conditional-probability constraints is deceptive. In the case of expected-value constraints, infomin updating of the uniform prior is deceptive for some random variables, but not for others. We establish both a necessary condition and a sufficient condition (which extends the scope of the phenomenon beyond cases previously considered) for deceptiveness in this setting. Along the way, we clarify the relation which obtains between the strong notion of higher order support, in which the higher order probability is defined over the full space of first order probabilities, and the apparently weaker notion, in which it is defined over some smaller parameter space. We show that under certain natural assumptions, the two are equivalent. Finally, we offer an interpretation of Jaynes, according to which his own appeal to infomin methods avoids the incoherencies discussed in this paper. 1.

