Results 1-10 of 74
Learning Stochastic Logic Programs
2000
Cited by 1057 (71 self)
Stochastic Logic Programs (SLPs) have been shown to be a generalisation of Hidden Markov Models (HMMs), stochastic context-free grammars, and directed Bayes' nets. A stochastic logic program consists of a set of labelled clauses p:C where p is in the interval [0,1] and C is a first-order range-restricted definite clause. This paper summarises the syntax, distributional semantics and proof techniques for SLPs and then discusses how a standard Inductive Logic Programming (ILP) system, Progol, has been modified to support learning of SLPs. The resulting system 1) finds an SLP with uniform probability labels on each definition and near-maximal Bayes posterior probability and then 2) alters the probability labels to further increase the posterior probability. Stage 1) is implemented within CProgol4.5, which differs from previous versions of Progol by allowing user-defined evaluation functions written in Prolog. It is shown that maximising the Bayesian posterior function involves finding SLPs with short derivations of the examples. Search pruning with the Bayesian evaluation function is carried out in the same way as in previous versions of CProgol. The system is demonstrated with worked examples involving the learning of probability distributions over sequences as well as the learning of simple forms of uncertain knowledge.
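As a rough illustration of labelled clauses p:C, the sketch below encodes a toy one-predicate program as a stochastic grammar and samples from it. The dictionary layout, the predicate `s`, and the labels 0.4 and 0.6 are hypothetical choices for illustration and are not taken from the paper or from Progol.

```python
import random

# Hypothetical toy program: each predicate maps to a list of (label, body) pairs,
# and the labels within one definition sum to 1. Body items that are keys of SLP
# are expanded further; all other items are emitted as symbols.
SLP = {
    "s": [(0.4, ["a", "s"]), (0.6, ["b"])],
}

def sample(goal, rng):
    """Return (emitted symbols, probability of the derivation that produced them)."""
    if goal not in SLP:
        return [goal], 1.0
    labels = [p for p, _ in SLP[goal]]
    bodies = [body for _, body in SLP[goal]]
    idx = rng.choices(range(len(bodies)), weights=labels)[0]
    out, prob = [], labels[idx]
    for item in bodies[idx]:
        symbols, p = sample(item, rng)
        out.extend(symbols)
        prob *= p
    return out, prob

def derivation_probability(symbols):
    """Probability that s derives a^n b under the toy program (0 for other strings)."""
    n = len(symbols) - 1
    if symbols != ["a"] * n + ["b"]:
        return 0.0
    return 0.4 ** n * 0.6
```

In this toy setting every extra clause application multiplies the derivation probability by a label below 1, which gives some intuition for why shorter derivations of the examples score higher.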
Prior Probabilities
IEEE Transactions on Systems Science and Cybernetics, 1968
Cited by 166 (3 self)
e case of location and scale parameters, rate constants, and in Bernoulli trials with unknown probability of success. In realistic problems, both the transformation group analysis and the principle of maximum entropy are needed to determine the prior. The distributions thus found are uniquely determined by the prior information, independently of the choice of parameters. In a certain class of problems, therefore, the prior distributions may now be claimed to be fully as "objective" as the sampling distributions.
I. Background of the problem
Since the time of Laplace, applications of probability theory have been hampered by difficulties in the treatment of prior information. In realistic problems of decision or inference, we often have prior information which is highly relevant to the question being asked; to fail to take it into account is to commit the most obvious inconsistency of reasoning and may lead to absurd or dangerously misleading results. As an extreme examp
Betting on Theories
1993
Cited by 70 (4 self)
Predictions about the future and unrestricted universal generalizations are never logically implied by our observational evidence, which is limited to particular facts in the present and past. Nevertheless, propositions of these and other kinds are often said to be confirmed by observational evidence. A natural place to begin the study of confirmation theory is to consider what it means to say that some evidence E confirms a hypothesis H.
Incremental and absolute confirmation
Let us say that E raises the probability of H if the probability of H given E is higher than the probability of H not given E. According to many confirmation theorists, “E confirms H” means that E raises the probability of H. This conception of confirmation will be called incremental confirmation. Let us say that H is probable given E if the probability of H given E is above some threshold. (This threshold remains to be specified but is assumed to be at least one half.) According to some confirmation theorists, “E confirms H” means that H is probable given E. This conception of confirmation will be called absolute confirmation. Confirmation theorists have sometimes failed to distinguish these two concepts. For example, Carl Hempel in his classic “Studies in the Logic of Confirmation” endorsed the following principles:
(1) A generalization of the form “All F are G” is confirmed by the evidence that there is an individual that is both F and G.
(2) A generalization of that form is also confirmed by the evidence that there is an individual that is neither F nor G.
(3) The hypotheses confirmed by a piece of evidence are consistent with one another.
(4) If E confirms H then E confirms every logical consequence of H.
Principles (1) and (2) are not true of absolute confirmation. Observation of a single thing that is F and G cannot in general make it probable that all F are G; likewise for an individual that is neither
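The two notions of confirmation defined above can come apart, as a small numerical sketch shows. The joint distribution and the threshold of 0.5 are made-up values for illustration, and "the probability of H not given E" is read here as the unconditional probability P(H), which is an assumption about the intended meaning.

```python
def conditional(joint, h, e):
    """P(H=h | E=e) computed from a joint table keyed by (h, e)."""
    p_e = sum(p for (_, ee), p in joint.items() if ee == e)
    return joint[(h, e)] / p_e

# Hypothetical joint distribution over (H, E); the numbers are illustrative only.
joint = {
    (True, True): 0.10, (True, False): 0.05,
    (False, True): 0.30, (False, False): 0.55,
}

prior_h = sum(p for (h, _), p in joint.items() if h)  # P(H), about 0.15
post_h = conditional(joint, True, True)               # P(H | E), about 0.25

THRESHOLD = 0.5  # assumed value; the text only requires at least one half

incremental = post_h > prior_h   # E raises the probability of H
absolute = post_h > THRESHOLD    # H is probable given E
```

With these numbers E confirms H incrementally (the probability rises from about 0.15 to about 0.25) but not absolutely, since 0.25 is below the threshold.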
Random Worlds and Maximum Entropy
In Proc. 7th IEEE Symp. on Logic in Computer Science, 1994
Cited by 49 (12 self)
Given a knowledge base KB containing first-order and statistical facts, we consider a principled method, called the random-worlds method, for computing a degree of belief that some formula φ holds given KB. If we are reasoning about a world or system consisting of N individuals, then we can consider all possible worlds, or first-order models, with domain {1, ..., N} that satisfy KB, and compute the fraction of them in which φ is true. We define the degree of belief to be the asymptotic value of this fraction as N grows large. We show that when the vocabulary underlying φ and KB uses constants and unary predicates only, we can naturally associate an entropy with each world. As N grows larger, there are many more worlds with higher entropy. Therefore, we can use a maximum-entropy computation to compute the degree of belief. This result is in a similar spirit to previous work in physics and artificial intelligence, but is far more general. Of equal interest to the result itself are...
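For a fixed, small N, the fraction described above can be computed by brute-force enumeration. The sketch below is only an illustration under stated assumptions: the unary predicates Bird and Fly, the constant c (fixed to individual 0, which by symmetry does not change the fraction), and the knowledge base "Bird(c), and at least 2/3 of birds fly" are all hypothetical. It does not attempt the asymptotic limit or the maximum-entropy computation.

```python
from fractions import Fraction
from itertools import product

def degree_of_belief(n):
    """Fraction of worlds over a domain of n individuals that satisfy the
    hypothetical KB and in which Fly(c) holds."""
    satisfying = favourable = 0
    # A world assigns a pair (is_bird, flies) to every individual;
    # the constant c is assumed to denote individual 0.
    for world in product(product([False, True], repeat=2), repeat=n):
        birds = sum(1 for bird, _ in world if bird)
        flying_birds = sum(1 for bird, flies in world if bird and flies)
        # KB: Bird(c) holds and at least 2/3 of all birds fly.
        if world[0][0] and 3 * flying_birds >= 2 * birds:
            satisfying += 1
            favourable += world[0][1]
    return Fraction(favourable, satisfying)
```

Calling `degree_of_belief(n)` for increasing `n` gives the finite-N fractions whose limiting value the method takes as the degree of belief; enumeration grows as 4^n, so this is feasible only for very small domains.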
Statistical Foundations for Default Reasoning
1993
Cited by 45 (8 self)
We describe a new approach to default reasoning, based on a principle of indifference among possible worlds. We interpret default rules as extreme statistical statements, thus obtaining a knowledge base KB comprised of statistical and first-order statements. We then assign equal probability to all worlds consistent with KB in order to assign a degree of belief to a statement φ. The degree of belief can be used to decide whether to defeasibly conclude φ. Various natural patterns of reasoning, such as a preference for more specific defaults, indifference to irrelevant information, and the ability to combine independent pieces of evidence, turn out to follow naturally from this technique. Furthermore, our approach is not restricted to default reasoning; it supports a spectrum of reasoning, from quantitative to qualitative. It is also related to other systems for default reasoning. In particular, we show that the work of [Goldszmidt et al., 1990], which applies maximum entropy ideas t...
A Natural Law of Succession
1995
Cited by 35 (3 self)
Consider the following problem. You are given an alphabet of k distinct symbols and are told that the i-th symbol occurred exactly n_i times in the past. On the basis of this information alone, you must now estimate the conditional probability that the next symbol will be i. In this report, we present a new solution to this fundamental problem in statistics and demonstrate that our solution outperforms standard approaches, both in theory and in practice.
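The abstract does not state the new solution, so the sketch below shows only one of the standard approaches it is compared against: Laplace's add-one rule, which estimates the probability of symbol i as (n_i + 1) / (n + k), where n is the total count. The function name and the example counts are illustrative assumptions.

```python
from fractions import Fraction

def laplace_estimate(counts, i):
    """Add-one (Laplace) estimate that the next symbol is i, given per-symbol counts.
    This is a standard baseline, not the solution proposed in the report."""
    n, k = sum(counts), len(counts)
    return Fraction(counts[i] + 1, n + k)

# Hypothetical counts for an alphabet of k = 3 symbols.
counts = [3, 0, 1]
estimates = [laplace_estimate(counts, i) for i in range(len(counts))]
```

Note that the unseen second symbol still receives non-zero probability, which is the characteristic behaviour of this baseline.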
System Identification, Approximation and Complexity
International Journal of General Systems, 1977
Cited by 34 (23 self)
This paper is concerned with establishing broadly-based system-theoretic foundations and practical techniques for the problem of system identification that are rigorous, intuitively clear and conceptually powerful. A general formulation is first given in which two order relations are postulated on a class of models: a constant one of complexity; and a variable one of approximation induced by an observed behaviour. An admissible model is such that any less complex model is a worse approximation. The general problem of identification is that of finding the admissible subspace of models induced by a given behaviour. It is proved under very general assumptions that, if deterministic models are required, then nearly all behaviours require models of nearly maximum complexity. A general theory of approximation between models and behaviour is then developed based on subjective probability concepts and semantic information theory. The role of structural constraints such as causality, locality, finite memory, etc., are then discussed as rules of the game. These concepts and results are applied to the specific problem of stochastic automaton, or grammar, inference. Computational results are given to demonstrate that the theory is complete and fully operational. Finally the formulation of identification proposed in this paper is analysed in terms of Klir’s epistemological hierarchy and both are discussed in terms of the rich philosophical literature on the acquisition of knowledge.
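The admissibility condition above (any less complex model is a worse approximation) can be sketched for a finite set of candidate models. The sketch assumes, purely for illustration, that complexity and approximation are each summarised by a single number, with lower error meaning better approximation; the paper itself postulates order relations, which need not be numeric or total.

```python
def admissible(models):
    """Return the names of admissible models.

    models maps a name to (complexity, error). A model is admissible when every
    strictly less complex model has a strictly larger error.
    """
    result = []
    for name, (complexity, error) in models.items():
        simpler_errors = [e for c, e in models.values() if c < complexity]
        if all(e > error for e in simpler_errors):
            result.append(name)
    return sorted(result)

# Hypothetical candidates: (complexity, approximation error).
candidates = {
    "m1": (1, 0.5),
    "m2": (2, 0.3),
    "m3": (3, 0.3),  # no better than the simpler m2, so not admissible
    "m4": (4, 0.1),
}
admissible_names = admissible(candidates)
```

Under these assumptions the admissible set is the trade-off frontier between complexity and error.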
From inheritance relation to non-axiomatic logic
International Journal of Approximate Reasoning, 1994
Cited by 33 (25 self)
Non-Axiomatic Reasoning System is an adaptive system that works with insufficient knowledge and resources. At the beginning of the paper, three binary term logics are defined. The first is based only on an inheritance relation. The second and the third suggest a novel way to process extension and intension, and they also have interesting relations with Aristotle's syllogistic logic. Based on the three simple systems, a Non-Axiomatic Logic is defined. It has a term-oriented language and an experience-grounded semantics. It can uniformly represent and process randomness, fuzziness, and ignorance. It can also uniformly carry out deduction, abduction, induction, and revision.