Plausibility Measures and Default Reasoning
- Journal of the ACM
, 1996
Cited by 68 (10 self)
this paper: default reasoning. In recent years, a number of different semantics for defaults have been proposed, such as preferential structures, ε-semantics, possibilistic structures, and κ-rankings, that have been shown to be characterized by the same set of axioms, known as the KLM properties. While this was viewed as a surprise, we show here that it is almost inevitable. In the framework of plausibility measures, we can give a necessary condition for the KLM axioms to be sound, and an additional condition necessary and sufficient to ensure that the KLM axioms are complete. This additional condition is so weak that it is almost always met whenever the axioms are sound. In particular, it is easily seen to hold for all the proposals made in the literature. Categories and Subject Descriptors: F.4.1 [Mathematical Logic and Formal Languages]:
Building Probabilistic Models for Natural Language
, 1996
Cited by 60 (1 self)
Building models of language is a central task in natural language processing. Traditionally, language has been modeled with manually-constructed grammars that describe which strings are grammatical and which are not; however, with the recent availability of massive amounts of on-line text, statistically-trained models are an attractive alternative. These models are generally probabilistic, yielding a score reflecting sentence frequency instead of a binary grammaticality judgement. Probabilistic models of language are a fundamental tool in speech recognition for resolving acoustically ambiguous utterances. For example, we prefer the transcription forbear to four bear as the former string is far more frequent in English text. Probabilistic models also have application in optical character recognition, handwriting recognition, spelling correction, part-of-speech tagging, and machine translation. In this thesis, we investigate three problems involving the probabilistic modeling of languag...
Structure Learning in Conditional Probability Models via an Entropic Prior and Parameter Extinction
, 1998
Cited by 59 (0 self)
We introduce an entropic prior for multinomial parameter estimation problems and solve for its maximum...
Bayesian Approaches to Gaussian Mixture Modelling
- IEEE Transactions on Pattern Analysis and Machine Intelligence
, 1998
Cited by 59 (1 self)
A Bayesian-based methodology is presented which automatically penalises over-complex models being fitted to unknown data. We show that, with a Gaussian mixture model, the approach is able to select an 'optimal' number of components in the model and so partition data sets. The performance of the Bayesian method is compared to other methods of optimal model selection and found to give good results. The methods are tested on synthetic and real data sets.
Introduction
Scientific disciplines generate data. In the attempt to understand the patterns present in such data sets, methods which perform some form of unsupervised partitioning or modelling are particularly useful. Such an approach is only of use, however, if it offers a less complex representation of the data than the data set itself. This introduces an apparent conflict, however, as any model improves its fit to the data monotonically with increases in its complexity (the number of model parameters) -- a model as complex as the data...
Learning Bayesian Belief Networks Based on the Minimum Description Length Principle: Basic Properties
, 1996
Cited by 44 (0 self)
This paper was partially presented at the 9th conference on Uncertainty in Artificial Intelligence, July 1993.
Bayesian Treatment of the Independent Student-t Linear Model
- Journal of Applied Econometrics
, 1993
Cited by 44 (2 self)
This article takes up methods for Bayesian inference in a linear model in which the disturbances are independent and have identical Student-t distributions. It exploits the equivalence of the Student-t distribution and an appropriate scale mixture of normals, and uses a Gibbs sampler to perform the computations. The new method is applied to some well-known macroeconomic time series. It is found that posterior odds ratios favor the independent Student-t linear model over the normal linear model, and that the posterior odds ratio in favor of difference stationarity over trend stationarity is often substantially less in the favored Student-t models.
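The scale-mixture equivalence mentioned in this abstract can be illustrated with a minimal sketch. It assumes one common parameterisation (a latent weight drawn from a Gamma distribution with shape ν/2 and rate ν/2, then a normal draw with variance σ²/weight), which may differ from the notation used in the article itself. The function name `student_t_draw` and the values of `nu` and `sigma` are hypothetical choices for illustration only.

```python
import math
import random


def student_t_draw(nu, sigma, rng):
    """Draw one Student-t disturbance as a scale mixture of normals.

    Assumed parameterisation (not taken from the article):
    lam ~ Gamma(shape=nu/2, rate=nu/2), eps | lam ~ Normal(0, sigma^2 / lam).
    """
    # random.gammavariate takes (shape, scale), so rate nu/2 means scale 2/nu.
    lam = rng.gammavariate(nu / 2.0, 2.0 / nu)
    # Conditional on the latent weight, the disturbance is simply normal.
    return rng.gauss(0.0, sigma / math.sqrt(lam))


rng = random.Random(0)  # fixed seed so the check is repeatable
draws = [student_t_draw(nu=10.0, sigma=1.0, rng=rng) for _ in range(20000)]

mean = sum(draws) / len(draws)
var = sum((x - mean) ** 2 for x in draws) / (len(draws) - 1)

# For nu > 2 the Student-t variance is sigma^2 * nu / (nu - 2), i.e. 1.25 here.
print("sample mean:", mean)
print("sample variance:", var, "theoretical:", 10.0 / (10.0 - 2.0))
```

Conditional on the latent weights, each disturbance is normal, which is what makes a Gibbs sampler that alternates between the weights and the regression parameters a natural fit for this model.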
From Laplace to Supernova SN 1987A: Bayesian Inference in Astrophysics
, 1990
Cited by 42 (2 self)
The Bayesian approach to probability theory is presented as an alternative to the currently used long-run relative frequency approach, which does not offer clear, compelling criteria for the design of statistical methods. Bayesian probability theory offers unique and demonstrably optimal solutions to well-posed statistical problems, and is historically the original approach to statistics. The reasons for earlier rejection of Bayesian methods are discussed, and it is noted that the work of Cox, Jaynes, and others answers earlier objections, giving Bayesian inference a firm logical and mathematical foundation as the correct mathematical language for quantifying uncertainty. The Bayesian approaches to parameter estimation and model comparison are outlined and illustrated by application to a simple problem based on the Gaussian distribution. As further illustrations of the Bayesian paradigm, Bayesian solutions to two interesting astrophysical problems are outlined: the measurement of wea...
Model Selection and Accounting for Model Uncertainty in Linear Regression Models
, 1993
Cited by 40 (6 self)
We consider the problems of variable selection and accounting for model uncertainty in linear regression models. Conditioning on a single selected model ignores model uncertainty, and thus leads to the underestimation of uncertainty when making inferences about quantities of interest. The complete Bayesian solution to this problem involves averaging over all possible models when making inferences about quantities of interest. This approach is often not practical. In this paper we offer two alternative approaches. First we describe a Bayesian model selection algorithm called "Occam's Window" which involves averaging over a reduced set of models. Second, we describe a Markov chain Monte Carlo approach which directly approximates the exact solution. Both these model averaging procedures provide better predictive performance than any single model which might reasonably have been selected. In the extreme case where there are many candidate predictors but there is no relationship between any of them and the response, standard variable selection procedures often choose some subset of variables that yields a high R² and a highly significant overall F value. We refer to this unfortunate phenomenon as "Freedman's Paradox" (Freedman, 1983). In this situation, Occam's Window usually indicates the null model as the only one to be considered, or else a small number of models including the null model, thus largely resolving the paradox.
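As a rough illustration of "averaging over a reduced set of models", the sketch below keeps only models whose posterior probability is within a chosen factor of the best model, renormalises, and averages their predictions. The abstract does not state the selection rule, so the threshold rule, the cutoff `c=20`, the function names, and all the numbers are assumptions made up for this example. The published algorithm may apply further rules that are not shown here.

```python
def occams_window(posterior, c=20.0):
    """Return renormalised weights for models within a factor c of the best.

    `posterior` maps a model label to its posterior model probability.
    The cutoff rule and the default c are illustrative assumptions.
    """
    best = max(posterior.values())
    kept = {m: p for m, p in posterior.items() if p >= best / c}
    total = sum(kept.values())
    return {m: p / total for m, p in kept.items()}


def averaged_prediction(weights, predictions):
    """Weighted average of per-model predictions over the retained models."""
    return sum(w * predictions[m] for m, w in weights.items())


# Hypothetical posterior model probabilities and point predictions.
posterior = {"null": 0.50, "x1": 0.30, "x1+x2": 0.19, "x1+x2+x3": 0.01}
predictions = {"null": 2.0, "x1": 2.4, "x1+x2": 2.5, "x1+x2+x3": 3.0}

weights = occams_window(posterior)
print(sorted(weights))  # the lowest-probability model falls outside the window
print(averaged_prediction(weights, predictions))
```

With these made-up numbers the cutoff is 0.50 / 20 = 0.025, so the model with posterior probability 0.01 is dropped and the remaining three are reweighted to sum to one.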
Objective Bayesian Analysis of Spatially Correlated Data
- Journal of the American Statistical Association
, 2000
Cited by 38 (6 self)
Spatially varying phenomena are often modeled using Gaussian random fields, specified by their mean function and covariance function. The spatial correlation structure of these models is commonly specified to be of a certain form (e.g., spherical, power exponential, rational quadratic, or Matérn) with a small number of unknown parameters. We consider objective Bayesian analysis of such spatial models, when the mean function of the Gaussian random field is specified as in a linear model. It is thus necessary to determine an objective (or default) prior distribution for the unknown mean and covariance parameters of the random field. We first

