Results 1–10 of 12
Two views of belief: Belief as generalized probability and belief as evidence, 1992.
Abstract

Cited by 72 (12 self)
: Belief functions are mathematical objects defined to satisfy three axioms that look somewhat similar to the Kolmogorov axioms defining probability functions. We argue that there are (at least) two useful and quite different ways of understanding belief functions. The first is as a generalized probability function (which technically corresponds to the inner measure induced by a probability function). The second is as a way of representing evidence. Evidence, in turn, can be understood as a mapping from probability functions to probability functions. It makes sense to think of updating a belief if we think of it as a generalized probability. On the other hand, it makes sense to combine two beliefs (using, say, Dempster's rule of combination) only if we think of the belief functions as representing evidence. Many previous papers have pointed out problems with the belief function approach; the claim of this paper is that these problems can be explained as a consequence of confounding the...
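The abstract mentions Dempster's rule of combination. For concreteness, here is a minimal sketch of that rule for mass functions on a finite frame (an illustration of the standard rule, not code from the paper; the mass values are invented):

```python
def dempster_combine(m1, m2):
    """Combine two mass functions (dicts mapping frozenset -> mass)
    with Dempster's rule: intersect focal elements pairwise, then
    renormalize by the non-conflicting mass."""
    combined = {}
    conflict = 0.0
    for a, ma in m1.items():
        for b, mb in m2.items():
            inter = a & b
            if inter:
                combined[inter] = combined.get(inter, 0.0) + ma * mb
            else:
                conflict += ma * mb
    if conflict >= 1.0:
        raise ValueError("total conflict: combination undefined")
    return {s: v / (1.0 - conflict) for s, v in combined.items()}

# Two pieces of evidence about a frame {x, y, z} (invented numbers).
m1 = {frozenset({"x"}): 0.6, frozenset({"x", "y", "z"}): 0.4}
m2 = {frozenset({"x", "y"}): 0.7, frozenset({"x", "y", "z"}): 0.3}
m12 = dempster_combine(m1, m2)
```

Note how combination only makes sense on the evidence reading: each mass function summarizes a body of evidence, and the rule pools two such bodies.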
Rational explanation of the selection task. Psychological Review, 1996.
Abstract

Cited by 46 (4 self)
M. Oaksford and N. Chater (O&C; 1994) presented the first quantitative model of P. C. Wason's (1966, 1968) selection task in which performance is rational. J. St. B. T. Evans and D. E. Over (1996) reply that O&C's account is normatively incorrect and cannot model K. N. Kirby's (1994b) or P. Pollard and J. St. B. T. Evans's (1983) data. It is argued that an equivalent measure satisfies their normative concerns and that a modification of O&C's model accounts for their empirical concerns. D. Laming (1996) argues that O&C made unjustifiable psychological assumptions and that a "correct" Bayesian analysis agrees with logic. It is argued that O&C's model makes normative and psychological sense and that Laming's analysis is not Bayesian. A. Almor and S. A. Sloman (1996) argue that O&C cannot explain their data. It is argued that Almor and Sloman's data do not bear on O&C's model because they alter the nature of the task. It is concluded that O&C's model remains the most compelling and comprehensive account of the selection task. Research on Wason's (1966, 1968) selection task questions human rationality because performance is not "logically correct." Recently, Oaksford and Chater (O&C; 1994) provided a rational analysis (Anderson, 1990, 1991) of the selection task that appeared to vindicate human rationality. O&C argued that the selection task is an inductive, rather than a deductive, reasoning task: Participants must assess the truth or falsity of a general rule from specific instances. In particular, participants face a problem of optimal data selection (Lindley, 1956): They must decide which of four cards (p, not-p, q, or not-q) is likely to provide the most useful data to test a conditional rule, if p then q. The "logical" solution is to select the p and the not-q cards. O&C argued that this solution presupposes falsificationism (Popper, 1959), which holds that only data that can disconfirm, not confirm, hypotheses are of interest.
In contrast, O&C's rational analysis uses a Bayesian approach to inductive ...
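O&C's optimal-data-selection idea can be sketched numerically. The toy model below pits a dependence hypothesis MD (every p-case is a q-case) against an independence hypothesis MI, and scores each card by the expected reduction in uncertainty about which hypothesis holds. The rarity values a = P(p) = 0.1 and b = P(q) = 0.2 are illustrative choices, not the paper's fitted parameters:

```python
import math

def entropy(p):
    """Binary entropy (bits) of believing the dependence model with prob p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def expected_gain(card, a=0.1, b=0.2, prior_md=0.5):
    """Expected information gained about MD vs MI from turning `card`.

    MD (dependence): every p-case is a q-case; MI (independence):
    p and q are unrelated. Under both, P(p) = a and P(q) = b.
    Each entry below is (P(hidden face | visible face) under MD,
    the same probability under MI), for the two possible hidden faces.
    """
    if card == "p":              # hidden face: q, then not-q
        outcomes = [(1.0, b), (0.0, 1 - b)]
    elif card == "not-p":
        outcomes = [((b - a) / (1 - a), b),
                    ((1 - b) / (1 - a), 1 - b)]
    elif card == "q":            # hidden face: p, then not-p
        outcomes = [(a / b, a), (1 - a / b, 1 - a)]
    elif card == "not-q":
        outcomes = [(0.0, a), (1.0, 1 - a)]
    else:
        raise ValueError(card)
    gain = entropy(prior_md)
    for p_md, p_mi in outcomes:
        p_out = prior_md * p_md + (1 - prior_md) * p_mi
        if p_out > 0.0:
            posterior_md = prior_md * p_md / p_out
            gain -= p_out * entropy(posterior_md)
    return gain

gains = {c: expected_gain(c) for c in ("p", "not-p", "q", "not-q")}
```

With these rarity values the expected-gain ranking comes out p > q > not-q > not-p, matching the frequency order in which participants typically select the cards.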
A logic for reasoning about evidence. In Proc. 19th Conference on Uncertainty in Artificial Intelligence (UAI’03), 2003.
Abstract

Cited by 12 (1 self)
We introduce a logic for reasoning about evidence that essentially views evidence as a function from prior beliefs (before making an observation) to posterior beliefs (after making the observation). We provide a sound and complete axiomatization for the logic, and consider the complexity of the decision problem. Although the reasoning in the logic is mainly propositional, we allow variables representing numbers and quantification over them. This expressive power seems necessary to capture important properties of evidence.
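The paper's core idea, evidence as a map from prior beliefs to posterior beliefs, can be illustrated with ordinary Bayes updating over a finite hypothesis set (a toy sketch only; the logic itself is far richer, and the numbers are invented):

```python
def apply_evidence(prior, likelihood):
    """Treat a piece of evidence as a function from priors to
    posteriors: `prior` and `likelihood` are dicts over the same
    hypotheses; the result is the Bayes-updated distribution."""
    unnorm = {h: prior[h] * likelihood[h] for h in prior}
    z = sum(unnorm.values())
    return {h: v / z for h, v in unnorm.items()}

# An observation twice as likely under h1 as under h2 (invented values).
prior = {"h1": 0.5, "h2": 0.5}
evidence = {"h1": 0.8, "h2": 0.4}
posterior = apply_evidence(prior, evidence)
```

The same `evidence` dict can be applied to any prior, which is exactly the "function from prior beliefs to posterior beliefs" reading.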
Evidential Diversity and Premise Probability in Young Children's Inductive Judgment, 2000.
Abstract

Cited by 12 (1 self)
A familiar adage in the philosophy of science is that general hypotheses are better supported by varied evidence than by uniform evidence. Several studies suggest that young children do not respect this principle, and thus suffer from a defect in their inductive methodology. We argue that the diversity principle does not have the normative status that psychologists attribute to it, and should be replaced by a simple rule of probability. We then report experiments designed to detect conformity to the latter rule in children's inductive judgment.
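A premise-probability rule of the kind at issue can be illustrated in confirmation-theoretic terms: when a hypothesis H entails the evidence E, Bayes' theorem reduces to P(H|E) = P(H)/P(E), so less probable (more surprising) premises lend more support. A minimal numerical sketch with invented probabilities — this illustrates the general Bayesian point, not necessarily the paper's exact experimental rule:

```python
def posterior_when_entailed(prior_h, prob_e):
    """If H entails E (so P(E|H) = 1), Bayes' theorem gives
    P(H|E) = P(H) / P(E): the less probable E was a priori,
    the stronger the confirmation of H."""
    return prior_h / prob_e

strong = posterior_when_entailed(0.1, 0.2)  # surprising premise
weak = posterior_when_entailed(0.1, 0.8)    # unsurprising premise
```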
The Maximum Entropy Approach and Probabilistic IR Models. ACM Transactions on Information Systems, 1998.
Abstract

Cited by 12 (0 self)
The Principle of Maximum Entropy is discussed, and two classic probabilistic models of information retrieval, the Binary Independence Model of Robertson and Sparck Jones and the Combination Match Model of Croft and Harper, are derived using the maximum entropy approach. The assumptions on which the classical models are based are not made. In their place, the probability distribution of maximum entropy consistent with a set of constraints is determined. It is argued that this subjectivist approach is more philosophically coherent than the frequentist conceptualization of probability that is often assumed as the basis of probabilistic modeling, and that this philosophical stance has important practical consequences with respect to the realization of information retrieval research.
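The construction the paper relies on — choosing the distribution of maximum entropy consistent with a set of constraints — can be sketched on Jaynes's classic dice example rather than on the retrieval models themselves (an illustration of the principle, not the paper's derivation). The constrained solution has exponential-family form p_k ∝ exp(λk), and λ can be found by simple bisection on the mean constraint:

```python
import math

def maxent_mean(values, target_mean, lo=-50.0, hi=50.0, iters=200):
    """Maximum-entropy distribution over `values` subject to a fixed
    mean: bisect on the Lagrange multiplier lam of the solution
    p_k proportional to exp(lam * v_k)."""
    def mean(lam):
        w = [math.exp(lam * v) for v in values]
        z = sum(w)
        return sum(v * wi for v, wi in zip(values, w)) / z
    for _ in range(iters):
        mid = (lo + hi) / 2
        if mean(mid) < target_mean:
            lo = mid          # need a larger multiplier
        else:
            hi = mid
    lam = (lo + hi) / 2
    w = [math.exp(lam * v) for v in values]
    z = sum(w)
    return [wi / z for wi in w]

# Jaynes's dice example: faces 1..6 constrained to a mean of 4.5.
p = maxent_mean(list(range(1, 7)), 4.5)
```

With no constraint beyond normalization the answer would be uniform; the mean constraint tilts the distribution exponentially toward the higher faces.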
Empirical Studies of Query/Document Characteristics as Evidence in Favor of Relevance. Proceedings of ACM SIGIR, 1998.
Abstract

Cited by 2 (0 self)
Query/document characteristics known to be useful for information retrieval are analyzed for a specific collection/query-set pair. These features are analyzed in terms of the weight of evidence in favor of relevance provided by the values assumed by the feature variables. Weight of evidence, a measure of how much more likely a hypothesis is believed to hold after evidence is considered than before it is observed, is formally defined, and a technique for the analysis of weight of evidence as a function of features of interest is presented. The method is exemplified by showing how it has been applied to analyze evidence in the form of the coordination level and the inverse document frequencies and term frequencies for all of the query terms. The result of data analysis is a model of weight of evidence that can be used as the foundation of a retrieval ranking formula. Results of a preliminary evaluation of the derived formula are presented and discussed.
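Weight of evidence as described here is standardly defined as the log likelihood ratio — the amount added to the prior log-odds on the hypothesis to obtain the posterior log-odds. A minimal sketch with invented feature probabilities (not the paper's data):

```python
import math

def weight_of_evidence(p_e_given_h, p_e_given_not_h):
    """W(H : E) = log [ P(E|H) / P(E|not-H) ]: what observing E adds
    to the log-odds in favor of hypothesis H (here, relevance)."""
    return math.log(p_e_given_h / p_e_given_not_h)

# Illustrative: a query term present in 40% of relevant documents
# but only 5% of non-relevant ones.
w = weight_of_evidence(0.40, 0.05)

# Consistency with Bayes' theorem: posterior log-odds = prior log-odds + WOE.
prior_odds = 0.1 / 0.9                       # P(relevant) = 0.1, say
posterior_odds = prior_odds * (0.40 / 0.05)  # odds times likelihood ratio
assert abs(math.log(posterior_odds) - (math.log(prior_odds) + w)) < 1e-12
```

Summing such weights over independent features is one route to the kind of ranking formula the paper builds.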
Evidential Diversity and Premise Probability in Young Children's Inductive Judgment, 1999.
Abstract
A familiar adage in the philosophy of science is that general hypotheses are better supported by varied evidence than by uniform evidence. Several studies suggest that young children do not respect this principle, and thus suffer from a defect in their inductive methodology. We argue that the diversity principle does not have the normative status that psychologists attribute to it, and should be replaced by a simple rule of probability. We then report an experiment designed to detect conformity to the latter rule in children's inductive judgment.

1 Introduction. A central issue in cognitive development is whether children's scientific reasoning is methodologically sound (simply short on facts), or else neglectful of fundamental principles of inductive reasoning (Carey, 1985; Markman, 1989; Keil, 1989; Kuhn, 1996; Gopnik and Meltzoff, 1996; Koslowski, 1996). To address the issue, normative standards of inductive reasoning must be formulated, and children's thinking e...
Index terms: null hypotheses; Bayesian hypothesis testing; Pavlovian conditioning; statistical learning; attention; motor planning; effect size; probabilistic inference.
Abstract
Null hypotheses are simple, precise and theoretically important. Conventional statistical analysis cannot support them; Bayesian analysis can. The challenge in a Bayesian analysis is to formulate a suitably vague alternative, because the vaguer the alternative is (the more it spreads out the unit mass of prior probability), the more the null is favored. A general solution is a sensitivity analysis: Compute the odds for or against the null as a function of the limit(s) on the vagueness of the alternative. If the odds on the null approach 1 from above as the hypothesized maximum size of the possible effect approaches 0, then the data favor the null over any vaguer alternative to it. The simple computations and the intuitive graphic representation of the analysis are illustrated by the analysis of diverse examples from the current literature. They pose three common experimental questions: 1) Are two means the same? 2) Is performance at chance? 3) Are factors additive?
The Importance of Proving the Null (Theoretical Notes).
Abstract
Null hypotheses are simple, precise, and theoretically important. Conventional statistical analysis cannot support them; Bayesian analysis can. The challenge in a Bayesian analysis is to formulate a suitably vague alternative, because the vaguer the alternative is (the more it spreads out the unit mass of prior probability), the more the null is favored. A general solution is a sensitivity analysis: Compute the odds for or against the null as a function of the limit(s) on the vagueness of the alternative. If the odds on the null approach 1 from above as the hypothesized maximum size of the possible effect approaches 0, then the data favor the null over any vaguer alternative to it. The simple computations and the intuitive graphic representation of the analysis are illustrated by the analysis of diverse examples from the current literature. They pose 3 common experimental questions: (a) Are 2 means the same? (b) Is performance at chance? (c) Are factors additive?
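The sensitivity analysis described in the abstract can be sketched numerically: fix the data, vary the limit L on the vagueness of the alternative (here mu ~ Uniform(-L, L) against a null of mu = 0, with known sigma), and watch the odds on the null grow as the alternative gets vaguer. The sample values below are invented for illustration:

```python
import math

def bf01(xbar, n, sigma, L, grid=2001):
    """Odds for the null (mu = 0) against the alternative
    mu ~ Uniform(-L, L), given a sample mean xbar of n observations
    with known sigma: BF01 = p(data | H0) / p(data | H1)."""
    se = sigma / math.sqrt(n)
    def lik(mu):
        return math.exp(-0.5 * ((xbar - mu) / se) ** 2)
    null = lik(0.0)
    # Marginal likelihood under H1: average of lik over [-L, L].
    step = 2 * L / (grid - 1)
    marg = sum(lik(-L + i * step) for i in range(grid)) * step / (2 * L)
    return null / marg

# The vaguer the alternative (larger L), the more the null is favored.
odds = [bf01(xbar=0.05, n=25, sigma=1.0, L=L) for L in (0.1, 0.5, 2.0)]
```

With a small observed effect, the odds on the null exceed 1 even for a tight alternative and rise steadily as L grows, which is the pattern the abstract describes.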
Bayes' Rule of Information.
Abstract
This chapter discusses a duality between the addition of random variables and the addition of information via Bayes' theorem: When adding independent random variables, variances (when they exist) add. With Bayes' theorem, defining "score" and "observed information" via derivatives of the log densities, the posterior score is the prior score plus the score from the data, and observed information similarly adds. These facts make it easier to understand and use Bayes' theorem. They also provide tools for easily deriving approximate posteriors in particular families, especially normal. Other tools can then be used to evaluate the adequacy of naive use of these approximations. Even when, for example, a normal posterior is not sufficiently accurate for direct use, it can still be used as part of an improved solution obtained via adaptive Gauss-Hermite quadrature or importance sampling in Monte Carlo integration and Markov chain Monte Carlo, for example. One important realm for application of these techniques is various kinds of (extended) Kalman / Bayesian filtering following a two-step Bayesian sequential updating cycle of (1) updating the posterior from the previous observation to model a possible change of state before the current observation, and (2) using Bayes' theorem to combine the current prior and observation to produce an updated posterior. These tools provide easy derivations of the posterior and of approximations, especially normal approximations. Another application involves mixed-effects models outside the normal linear framework. This chapter includes derivations of Bayesian exponentially weighted moving averages (EWMAs) for exponential family / exponential dispersion models including gamma-Poisson, beta-binomial and Dirichlet-multinomial. Pathologies that occur with violations of standard assumptions are illustrated with an exponential-uniform model.
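The "observed information adds" claim has a familiar conjugate special case: for a normal prior and normal data with known variance, precisions (information) add and the posterior mean is the precision-weighted average. A minimal sketch of that special case (the numbers are invented):

```python
def normal_update(m0, tau0, xbar, n, sigma):
    """Normal prior N(m0, 1/tau0) combined with n iid N(mu, sigma^2)
    observations of mean xbar: the posterior precision is the prior
    precision plus the data precision ("information adds"), and the
    posterior mean is the precision-weighted average of m0 and xbar."""
    tau_data = n / sigma ** 2
    tau_n = tau0 + tau_data
    m_n = (tau0 * m0 + tau_data * xbar) / tau_n
    return m_n, tau_n

# Equal prior and data precision: posterior mean lands halfway.
m_n, tau_n = normal_update(m0=0.0, tau0=1.0, xbar=2.0, n=4, sigma=2.0)
```

The same additivity is what the chapter generalizes, via log-density derivatives, beyond the conjugate normal case.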