Results 1-10 of 14
Predictive regressions
 Journal of Financial Economics
, 1999
Cited by 257 (16 self)
When a rate of return is regressed on a lagged stochastic regressor, such as a dividend yield, the regression disturbance is correlated with the regressor's innovation. The OLS estimator's finite-sample properties, derived here, can depart substantially from the standard regression setting. Bayesian posterior distributions for the regression parameters are obtained under specifications that differ with respect to (i) prior beliefs about the autocorrelation of the regressor and (ii) whether the initial observation of the regressor is specified as fixed or stochastic. The posteriors differ across such specifications, and asset allocations in the presence of estimation risk exhibit sensitivity to those
Markov Chain Monte Carlo Model Determination for Hierarchical and Graphical Log-linear Models
 Biometrika
, 1996
Cited by 55 (8 self)
In this paper, we will only consider undirected graphical models. For details of Bayesian model selection for directed graphical models see Madigan et al. (1995). An (undirected) graphical model is determined by a set of conditional independence constraints of the form '$\gamma_1$ is independent of $\gamma_2$ conditional on all other $\gamma_i \in C$'. Graphical models are so called because they can each be represented as a graph with vertex set $C$ and an edge between each pair $\gamma_1$ and $\gamma_2$ unless $\gamma_1$ and $\gamma_2$ are conditionally independent as described above. Darroch, Lauritzen and Speed (1980) show that each graphical log-linear model is hierarchical, with generators given by the cliques (complete subgraphs) of the graph. The total number of possible graphical models is clearly given by 2 (
A Natural Law of Succession
, 1995
Cited by 35 (3 self)
Consider the following problem. You are given an alphabet of $k$ distinct symbols and are told that the $i$-th symbol occurred exactly $n_i$ times in the past. On the basis of this information alone, you must now estimate the conditional probability that the next symbol will be $i$. In this report, we present a new solution to this fundamental problem in statistics and demonstrate that our solution outperforms standard approaches, both in theory and in practice.
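The abstract above does not state the new solution, so the following is only a minimal sketch of the problem setup using two textbook baselines: the maximum-likelihood estimate n_i / n and Laplace's add-one rule (n_i + 1) / (n + k). Treating these as the "standard approaches" the abstract mentions is an assumption, and the function names are hypothetical.

```python
from fractions import Fraction


def maximum_likelihood_estimate(counts):
    """Relative frequencies n_i / n; assigns probability 0 to unseen symbols."""
    n = sum(counts)
    return [Fraction(c, n) for c in counts]


def laplace_estimate(counts):
    """Laplace's add-one rule: (n_i + 1) / (n + k) for an alphabet of k symbols."""
    k = len(counts)
    n = sum(counts)
    return [Fraction(c + 1, n + k) for c in counts]


if __name__ == "__main__":
    counts = [3, 1, 0]  # illustrative counts n_1, n_2, n_3 for k = 3 symbols
    print("ML:     ", maximum_likelihood_estimate(counts))
    print("Laplace:", laplace_estimate(counts))
```

The two baselines differ most visibly on the symbol that was never observed: the maximum-likelihood estimate gives it probability zero, while the add-one rule reserves some probability mass for it.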
Entropy Inference and the James-Stein Estimator, with Application to Nonlinear Gene Association Networks
Cited by 12 (1 self)
We present a procedure for effective estimation of entropy and mutual information from small-sample data, and apply it to the problem of inferring high-dimensional gene association networks. Specifically, we develop a James-Stein-type shrinkage estimator, resulting in a procedure that is highly efficient statistically as well as computationally. Despite its simplicity, we show that it outperforms eight other entropy estimation procedures across a diverse range of sampling scenarios and data-generating models, even in cases of severe undersampling. We illustrate the approach by analyzing E. coli gene expression data and computing an entropy-based gene-association network from gene expression data. A computer program is available that implements the proposed shrinkage estimator. Keywords: entropy, shrinkage estimation, James-Stein estimator, “small n, large p” setting, mutual information, gene association network
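The abstract gives no formulas, so the sketch below only illustrates the general idea of a shrinkage-based plug-in entropy estimate. It assumes a uniform shrinkage target and a closed-form shrinkage intensity, lambda = (1 - sum(f_k^2)) / ((n - 1) * sum((t_k - f_k)^2)) clipped to [0, 1]; both choices, and the function name, are assumptions rather than details confirmed by the text above.

```python
import math


def shrinkage_entropy(counts):
    """Plug-in entropy (in nats) of cell frequencies shrunk toward the uniform distribution.

    Returns (entropy, shrinkage_intensity). Assumes sum(counts) > 0.
    """
    n = sum(counts)
    p = len(counts)
    freqs = [c / n for c in counts]   # maximum-likelihood cell frequencies
    target = 1.0 / p                  # assumed uniform shrinkage target

    denom = (n - 1) * sum((target - f) ** 2 for f in freqs)
    if denom <= 0:
        # n == 1, or the frequencies already equal the target: shrink fully.
        lam = 1.0
    else:
        lam = (1.0 - sum(f * f for f in freqs)) / denom
        lam = min(1.0, max(0.0, lam))  # clip the intensity to [0, 1]

    shrunk = [lam * target + (1.0 - lam) * f for f in freqs]
    entropy = -sum(q * math.log(q) for q in shrunk if q > 0)
    return entropy, lam


if __name__ == "__main__":
    h, lam = shrinkage_entropy([4, 2, 1, 0, 0])
    print(f"entropy = {h:.4f} nats, shrinkage intensity = {lam:.4f}")
```

Because the shrunk frequencies are a mixture of the observed frequencies and the uniform distribution, empty cells receive nonzero mass, which is what makes the plug-in entropy less prone to underestimation when many cells are unobserved.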
Conservative inference rule for uncertain reasoning under incompleteness
 Journal of Artificial Intelligence Research
Cited by 10 (6 self)
In this paper we formulate the problem of inference under incomplete information in very general terms. This includes modelling the process responsible for the incompleteness, which we call the incompleteness process. We allow the process’ behaviour to be partly unknown. Then we use Walley’s theory of coherent lower previsions, a generalisation of the Bayesian theory to imprecision, to derive the rule to update beliefs under incompleteness that logically follows from our assumptions, and that we call conservative inference rule. This rule has some remarkable properties: it is an abstract rule to update beliefs that can be applied in any situation or domain; it gives us the opportunity to be neither too optimistic nor too pessimistic about the incompleteness process, which is a necessary condition to draw reliable while strong enough conclusions; and it is a coherent rule, in the sense that it cannot lead to inconsistencies. We give examples to show how the new rule can be applied in expert systems, in parametric statistical inference, and in pattern classification, and discuss more generally the view of incompleteness processes defended here as well as some of its consequences.
The Reverend Thomas Bayes FRS: a biography to celebrate the tercentenary of his birth
 Statistical Science
Cited by 7 (1 self)
Thomas Bayes, from whom Bayes theorem takes its name, was probably born in 1701, so the year 2001 marked the 300th anniversary of his birth. This biography was written to celebrate this anniversary. The current sketch of his life includes his family background and education, as well as his scientific and theological work. In contrast to some, but not all, biographies of Bayes, the current biography is an attempt to cover areas beyond Bayes’ scientific work. When commenting on the writing of scientific biography, Pearson [(1978). The History of Statistics in the 17th and 18th Centuries.... Charles Griffin and Company, London] stated, “it is impossible to understand a man’s work unless you understand something of his character and unless you understand something of his environment. And his environment means the state of affairs social and political of his own age.” The intention here is to follow this general approach to biography. There is very little primary source material on Bayes and his work. For example, only three of his letters and a notebook containing some sketches of his own work, almost all unpublished, as well as notes on the work of others are known to have survived. Neither the letters nor the notebook is dated, and only one of the letters can be dated accurately from internal evidence. This biography contains new information about Bayes. In particular, among the papers of the 2nd Earl Stanhope, letters and papers of Bayes have been uncovered that previously were not known to exist. The letters indirectly confirm the centrality of Stanhope in Bayes’ election to the Royal Society. They also provide evidence that Bayes was part of a network of mathematicians initially centered on Stanhope. In addition, the letters shed light on Bayes’ work in infinite series.
Nonuniform Markov Models
, 1996
Cited by 2 (0 self)
A statistical language model assigns probability to strings of arbitrary length. Unfortunately, it is not possible to gather reliable statistics on strings of arbitrary length from a finite corpus. Therefore, a statistical language model must decide that each symbol in a string depends on at most a small, finite number of other symbols in the string. In this report we propose a new way to model conditional independence in Markov models. The central feature of our nonuniform Markov model is that it makes predictions of varying lengths using contexts of varying lengths. Experiments on the Wall Street Journal reveal that the nonuniform model performs slightly better than the classic interpolated Markov model.
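For orientation, the "classic interpolated Markov model" used as the baseline above can be sketched as a first-order model that mixes bigram and unigram relative frequencies. This is not the nonuniform model, which the abstract does not specify. The fixed mixing weight `lam`, the toy token sequence, and the function names are assumptions for illustration; in practice the weight would be estimated from held-out data.

```python
from collections import Counter


def train(tokens):
    """Collect unigram, bigram, and context counts from a token sequence."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    contexts = Counter(tokens[:-1])  # how often each token appears as a left context
    return unigrams, bigrams, contexts


def interpolated_prob(word, prev, model, lam=0.7):
    """P(word | prev) = lam * P_bigram(word | prev) + (1 - lam) * P_unigram(word)."""
    unigrams, bigrams, contexts = model
    p_uni = unigrams[word] / sum(unigrams.values())
    if contexts[prev] == 0:
        # Unseen context: fall back to the unigram estimate alone.
        return p_uni
    p_bi = bigrams[(prev, word)] / contexts[prev]
    return lam * p_bi + (1.0 - lam) * p_uni


if __name__ == "__main__":
    model = train("a b a b a c".split())
    for w in ("a", "b", "c"):
        print(w, round(interpolated_prob(w, "a", model), 4))
```

Interpolating with the unigram distribution keeps the probability of an unseen bigram such as ("b", "c") above zero, which a pure bigram model would not.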
A conjugate prior for discrete hierarchical log-linear models
, 2009
Cited by 2 (1 self)
In Bayesian analysis of multi-way contingency tables, the selection of a prior distribution for either the log-linear parameters or the cell probabilities parameters is a major challenge. In this paper, we define a flexible family of conjugate priors for the wide class of discrete hierarchical log-linear models, which includes the class of graphical models. These priors are defined as the Diaconis–Ylvisaker conjugate priors on the log-linear parameters subject to “baseline constraints” under multinomial sampling. We also derive the induced prior on the cell probabilities and show that the induced prior is a generalization of the hyper Dirichlet prior. We show that this prior has several desirable properties and illustrate its usefulness by identifying the most probable decomposable, graphical and hierarchical log-linear models for a six-way contingency table.
Nonuniform Markov Models
, 1996
A statistical language model assigns probability to strings of arbitrary length. Unfortunately, it is not possible to gather reliable statistics on strings of arbitrary length from a finite corpus. Therefore, a statistical language model must decide that each symbol in a string depends on at most a small, finite number of other symbols in the string. In this report we propose a new way to model conditional independence in Markov models. The central feature of our nonuniform Markov model is that it makes predictions of varying lengths using contexts of varying lengths. Experiments on the Wall Street Journal reveal that the nonuniform model performs slightly better than the classic interpolated Markov model. Keywords: nonuniform Markov model, interpolated Markov model, conditional independence, statistical language model, discrete time series. 1 Thanks to Andrew Appel, Joe Kupin and Harry Printz for their critique. Our implementation of the nonuniform model used the library of practical...
Model Checking for Incomplete High Dimensional Categorical Data
, 1999
Abstract of the dissertation: Model Checking for Incomplete High Dimensional Categorical Data, by Ming-Yi Hu, Doctor of Philosophy in Statistics, University of California, Los Angeles, 1999. Professor Thomas R. Belin, Co-chair; Professor Robert I. Jennrich, Co-chair. Categorical data are often arranged in a contingency table and summarized by a log-linear model. A standard approach for comparing two competing models is to calculate twice the discrepancy between maximized log-likelihoods, which follows a $\chi^2$ distribution asymptotically. But when data are sparse, the $\chi^2$ approximation may be questionable. As an alternative to a large-sample approximation to the reference distribution, we implement the framework introduced by Rubin (1984) for finding the posterior predictive check (PPC) distribution. The PPC distribution represents the conditional probability of a future value of a test statistic based on the information given by observed data along with model specifications, which can se...