Results 1-10 of 24
Predictive regressions
Journal of Financial Economics, 1999
Cited by 318 (13 self)
When a rate of return is regressed on a lagged stochastic regressor, such as a dividend yield, the regression disturbance is correlated with the regressor's innovation. The OLS estimator's finite-sample properties, derived here, can depart substantially from the standard regression setting. Bayesian posterior distributions for the regression parameters are obtained under specifications that differ with respect to (i) prior beliefs about the autocorrelation of the regressor and (ii) whether the initial observation of the regressor is specified as fixed or stochastic. The posteriors differ across such specifications, and asset allocations in the presence of estimation risk exhibit sensitivity to those
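The finite-sample effect described in this abstract can be illustrated with a small Monte Carlo sketch. This is not the paper's derivation: the function name `simulate_ols_slopes`, the AR(1) form of the regressor, and all parameter values (sample length, autocorrelation, shock correlation) are assumptions chosen only for illustration.

```python
import math
import random

def simulate_ols_slopes(T=60, rho=0.95, corr=-0.9, beta=0.0, n_sims=2000, seed=0):
    """Simulate r[t+1] = beta * x[t] + u[t+1], x[t+1] = rho * x[t] + v[t+1],
    with corr(u, v) = corr, and return the OLS slope from each simulated sample.
    All defaults are illustrative assumptions, not values from the paper."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(n_sims):
        # Draw the initial regressor value from its stationary distribution.
        x = [rng.gauss(0.0, 1.0) / math.sqrt(1.0 - rho ** 2)]
        r = []
        for _ in range(T):
            v = rng.gauss(0.0, 1.0)  # regressor innovation
            u = corr * v + math.sqrt(1.0 - corr ** 2) * rng.gauss(0.0, 1.0)
            r.append(beta * x[-1] + u)   # return depends on the lagged regressor
            x.append(rho * x[-1] + v)    # regressor follows an AR(1) process
        lagged = x[:-1]
        mean_x = sum(lagged) / T
        mean_r = sum(r) / T
        sxx = sum((a - mean_x) ** 2 for a in lagged)
        sxr = sum((a - mean_x) * (b - mean_r) for a, b in zip(lagged, r))
        slopes.append(sxr / sxx)
    return slopes

slopes = simulate_ols_slopes()
print("true beta = 0.0, mean OLS slope =", sum(slopes) / len(slopes))
```

With a persistent regressor and negatively correlated shocks, the average estimated slope in this sketch sits above the true value of zero, which is the kind of departure from the standard regression setting the abstract refers to.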
Markov Chain Monte Carlo Model Determination for Hierarchical and Graphical Log-linear Models
Biometrika, 1996
Cited by 55 (8 self)
this paper, we will only consider undirected graphical models. For details of Bayesian model selection for directed graphical models see Madigan et al (1995). An (undirected) graphical model is determined by a set of conditional independence constraints of the form 'γ1 is independent of γ2 conditional on all other γi ∈ C'. Graphical models are so called because they can each be represented as a graph with vertex set C and an edge between each pair γ1 and γ2 unless γ1 and γ2 are conditionally independent as described above. Darroch, Lauritzen and Speed (1980) show that each graphical log-linear model is hierarchical, with generators given by the cliques (complete subgraphs) of the graph. The total number of possible graphical models is clearly given by 2 (
A Natural Law of Succession
1995
Cited by 36 (3 self)
Consider the following problem. You are given an alphabet of k distinct symbols and are told that the i th symbol occurred exactly ni times in the past. On the basis of this information alone, you must now estimate the conditional probability that the next symbol will be i. In this report, we present a new solution to this fundamental problem in statistics and demonstrate that our solution outperforms standard approaches, both in theory and in practice.
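For orientation, the problem statement above can be made concrete with one of the classical baselines. The sketch below implements the additive (add-λ) estimator, of which Laplace's rule of succession is the λ = 1 case; it is offered as an example of a standard approach, not as the new solution the report proposes, which the abstract does not spell out. The function name `add_lambda_estimate` is hypothetical.

```python
from fractions import Fraction

def add_lambda_estimate(counts, lam=1):
    """Additive estimator for the next-symbol probabilities.

    counts[i] is n_i, the number of past occurrences of symbol i.
    lam=1 gives Laplace's rule of succession: (n_i + 1) / (n + k).
    This is a classical baseline, assumed here for illustration only."""
    k = len(counts)
    total = sum(counts)
    lam = Fraction(lam)
    return [(Fraction(n) + lam) / (total + lam * k) for n in counts]

# Alphabet of k = 3 symbols observed 3, 1 and 0 times.
print(add_lambda_estimate([3, 1, 0]))
```

Exact fractions are used so that the estimates visibly sum to one and an unseen symbol still receives non-zero probability.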
Entropy Inference and the James-Stein Estimator, with Application to Nonlinear Gene Association Networks
Cited by 16 (1 self)
We present a procedure for effective estimation of entropy and mutual information from small-sample data, and apply it to the problem of inferring high-dimensional gene association networks. Specifically, we develop a James-Stein-type shrinkage estimator, resulting in a procedure that is highly efficient statistically as well as computationally. Despite its simplicity, we show that it outperforms eight other entropy estimation procedures across a diverse range of sampling scenarios and data-generating models, even in cases of severe undersampling. We illustrate the approach by analyzing E. coli gene expression data and computing an entropy-based gene-association network from gene expression data. A computer program is available that implements the proposed shrinkage estimator. Keywords: entropy, shrinkage estimation, James-Stein estimator, “small n, large p” setting, mutual information, gene association network
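A shrinkage entropy estimator of the kind described above can be sketched as follows. The abstract does not give the formulas, so the details here are assumptions: the cell frequencies are shrunk toward a uniform target with a data-driven intensity λ = (1 − Σ θ̂_k²) / ((n − 1) Σ (t_k − θ̂_k)²), clipped to [0, 1], and the entropy is then computed by plugging in the shrunken frequencies. The function names are hypothetical.

```python
import math

def shrinkage_frequencies(counts):
    """Shrink maximum-likelihood cell frequencies toward the uniform target.

    Returns (shrunken_frequencies, shrinkage_intensity). The intensity formula
    is an assumption about the James-Stein-type estimator, not taken from the
    abstract itself."""
    n = sum(counts)
    p = len(counts)
    target = 1.0 / p
    ml = [c / n for c in counts]
    denom = (n - 1) * sum((target - f) ** 2 for f in ml)
    if denom == 0:
        lam = 1.0  # ML estimate already equals the target, or n == 1
    else:
        lam = (1.0 - sum(f * f for f in ml)) / denom
        lam = min(1.0, max(0.0, lam))  # clip to [0, 1]
    return [lam * target + (1.0 - lam) * f for f in ml], lam

def plugin_entropy(freqs):
    """Shannon entropy in nats, skipping empty cells."""
    return -sum(f * math.log(f) for f in freqs if f > 0)

counts = [4, 2, 0, 0]  # severely undersampled: two of four cells unseen
shrunk, lam = shrinkage_frequencies(counts)
print("lambda =", lam)
print("ML entropy       =", plugin_entropy([c / sum(counts) for c in counts]))
print("shrinkage entropy =", plugin_entropy(shrunk))
```

In this example the unseen cells receive positive mass after shrinkage, so the entropy estimate is larger than the maximum-likelihood plug-in value, which tends to underestimate entropy when many cells are empty.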
An introduction to Bayesian reference analysis: Inference on the ratio of multinomial parameters
The Statistician 47, 1998
Conservative inference rule for uncertain reasoning under incompleteness
Journal of Artificial Intelligence Research
Cited by 10 (7 self)
In this paper we formulate the problem of inference under incomplete information in very general terms. This includes modelling the process responsible for the incompleteness, which we call the incompleteness process. We allow the process’ behaviour to be partly unknown. Then we use Walley’s theory of coherent lower previsions, a generalisation of the Bayesian theory to imprecision, to derive the rule to update beliefs under incompleteness that logically follows from our assumptions, and that we call conservative inference rule. This rule has some remarkable properties: it is an abstract rule to update beliefs that can be applied in any situation or domain; it gives us the opportunity to be neither too optimistic nor too pessimistic about the incompleteness process, which is a necessary condition to draw reliable while strong enough conclusions; and it is a coherent rule, in the sense that it cannot lead to inconsistencies. We give examples to show how the new rule can be applied in expert systems, in parametric statistical inference, and in pattern classification, and discuss more generally the view of incompleteness processes defended here as well as some of its consequences.
The Reverend Thomas Bayes FRS: a biography to celebrate the tercentenary of his birth
Statistical Science
Cited by 7 (1 self)
Abstract. Thomas Bayes, from whom Bayes theorem takes its name, was probably born in 1701, so the year 2001 marked the 300th anniversary of his birth. This biography was written to celebrate this anniversary. The current sketch of his life includes his family background and education, as well as his scientific and theological work. In contrast to some, but not all, biographies of Bayes, the current biography is an attempt to cover areas beyond Bayes’ scientific work. When commenting on the writing of scientific biography, Pearson [(1978). The History of Statistics in the 17th and 18th Centuries.... Charles Griffin and Company, London] stated, “it is impossible to understand a man’s work unless you understand something of his character and unless you understand something of his environment. And his environment means the state of affairs social and political of his own age.” The intention here is to follow this general approach to biography. There is very little primary source material on Bayes and his work. For example, only three of his letters and a notebook containing some sketches of his own work, almost all unpublished, as well as notes on the work of others are known to have survived. Neither the letters nor the notebook is dated, and only one of the letters can be dated accurately from internal evidence. This biography contains new information about Bayes. In particular, among the papers of the 2nd Earl Stanhope, letters and papers of Bayes have been uncovered that previously were not known to exist. The letters indirectly confirm the centrality of Stanhope in Bayes’ election to the Royal Society. They also provide evidence that Bayes was part of a network of mathematicians initially centered on Stanhope. In addition, the letters shed light on Bayes’ work in infinite series.
C.P.: Inference from multinomial data based on a MLE-dominance criterion
In: Proc. on European Conf. on Symbolic and Quantitative Approaches to Reasoning and Uncertainty (ECSQARU), 2009
Cited by 2 (1 self)
Abstract. We consider the problem of inference from multinomial data with chances θ, subject to the a priori information that the true parameter vector θ belongs to a known convex polytope Θ. The proposed estimator has the parametrized structure of the conditional-mean estimator with a prior Dirichlet distribution, whose parameters (s,t) are suitably designed via a dominance criterion so as to guarantee, for any θ ∈ Θ, an improvement of the Mean Squared Error over the Maximum Likelihood Estimator (MLE). The solution of this MLE-dominance problem allows us to give a different interpretation of: (1) the several Bayesian estimators proposed in the literature for the problem of inference from multinomial data; (2) the Imprecise Dirichlet Model (IDM) developed by Walley [13].
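The "conditional-mean estimator with a prior Dirichlet distribution" mentioned above has a simple closed form, sketched below under the usual parametrization in which s is the prior strength and t the prior mean vector, so that the estimate for category i is (n_i + s·t_i) / (N + s). The sketch shows only this parametrized structure; it does not implement the paper's dominance criterion for choosing (s, t), and the function name `dirichlet_mean_estimate` is hypothetical.

```python
from fractions import Fraction

def dirichlet_mean_estimate(counts, s, t):
    """Posterior-mean estimate of multinomial chances under a Dirichlet prior
    with strength s and mean vector t (t must sum to 1).

    With s = 0 the estimate reduces to the maximum likelihood estimator n_i / N.
    How (s, t) should be designed is the subject of the paper and is NOT
    reproduced here."""
    total = sum(counts)
    s = Fraction(s)
    return [(Fraction(n) + s * Fraction(ti)) / (total + s) for n, ti in zip(counts, t)]

counts = [3, 1, 0]
uniform = [Fraction(1, 3)] * 3
print(dirichlet_mean_estimate(counts, s=2, t=uniform))  # shrunk toward the prior mean
print(dirichlet_mean_estimate(counts, s=0, t=uniform))  # equals the MLE
```

Larger values of s pull the estimate further from the observed frequencies and toward t, which is the trade-off a dominance criterion would have to balance.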
A conjugate prior for discrete hierarchical log-linear models
2009
Cited by 2 (1 self)
In Bayesian analysis of multi-way contingency tables, the selection of a prior distribution for either the log-linear parameters or the cell probabilities parameters is a major challenge. In this paper, we define a flexible family of conjugate priors for the wide class of discrete hierarchical log-linear models, which includes the class of graphical models. These priors are defined as the Diaconis–Ylvisaker conjugate priors on the log-linear parameters subject to “baseline constraints” under multinomial sampling. We also derive the induced prior on the cell probabilities and show that the induced prior is a generalization of the hyper Dirichlet prior. We show that this prior has several desirable properties and illustrate its usefulness by identifying the most probable decomposable, graphical and hierarchical log-linear models for a six-way contingency table.
Nonuniform Markov Models
1996
Cited by 2 (0 self)
A statistical language model assigns probability to strings of arbitrary length. Unfortunately, it is not possible to gather reliable statistics on strings of arbitrary length from a finite corpus. Therefore, a statistical language model must decide that each symbol in a string depends on at most a small, finite number of other symbols in the string. In this report we propose a new way to model conditional independence in Markov models. The central feature of our nonuniform Markov model is that it makes predictions of varying lengths using contexts of varying lengths. Experiments on the Wall Street Journal reveal that the nonuniform model performs slightly better than the classic interpolated Markov model. This result is somewhat remarkable because both models contain identical numbers of parameters whose values are estimated in a similar manner. The only difference between the two models is how they combine the statistics of longer and shorter strings.
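The idea of combining the statistics of longer and shorter strings can be illustrated with the baseline the abstract compares against, an interpolated Markov model. The sketch below mixes a first-order (bigram) estimate with a zeroth-order (unigram) estimate using a fixed weight. It is not the nonuniform model proposed in the report; the function names and the fixed weight `lam` are assumptions, and in practice interpolation weights are usually estimated from held-out data.

```python
from collections import Counter

def train(symbols):
    """Collect unigram, bigram and context counts from a sequence of symbols."""
    unigrams = Counter(symbols)
    bigrams = Counter(zip(symbols, symbols[1:]))
    contexts = Counter(symbols[:-1])  # how often each symbol appears as a context
    return unigrams, bigrams, contexts

def interpolated_prob(sym, prev, model, lam=0.7):
    """P(sym | prev) as a fixed-weight mixture of bigram and unigram estimates.

    lam is a hypothetical fixed weight on the longer context."""
    unigrams, bigrams, contexts = model
    p_uni = unigrams[sym] / sum(unigrams.values())
    if contexts[prev] == 0:
        return p_uni  # unseen context: back off entirely to the shorter one
    p_bi = bigrams[(prev, sym)] / contexts[prev]
    return lam * p_bi + (1.0 - lam) * p_uni

model = train("abab")
print(interpolated_prob("b", "a", model))
print(interpolated_prob("a", "a", model))
```

Because both component estimates are proper distributions over the alphabet, the mixture also sums to one for any seen context.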