Bootstrapping phylogenetic trees: theory and methods
 Statist. Sci
, 2003
Cited by 26 (3 self)
Abstract. This is a survey of the use of the bootstrap in the area of systematic and evolutionary biology. I present the current usage by biologists of the bootstrap as a tool both for making inferences and for evaluating robustness, and propose a framework for thinking about these problems in terms of mathematical statistics.
Key words and phrases: Bootstrap, phylogenetic trees, confidence regions, nonpositive curvature.
1. AN INTRODUCTION TO SYSTEMATICS
The objects of study in systematics are binary rooted semilabeled trees that link species or families by their coancestral relationships. For example, Figure 1 shows a tree with seven strains of HIV.
GENERAL MAXIMUM LIKELIHOOD EMPIRICAL BAYES ESTIMATION OF NORMAL MEANS
, 908
Cited by 26 (1 self)
We propose a general maximum likelihood empirical Bayes (GMLEB) method for the estimation of a mean vector based on observations with i.i.d. normal errors. We prove that under mild moment conditions on the unknown means, the average mean squared error (MSE) of the GMLEB is within an infinitesimal fraction of the minimum average MSE among all separable estimators which use a single deterministic estimating function on individual observations, provided that the risk is of greater order than (log n)^5/n. We also prove that the GMLEB is uniformly approximately minimax in regular and weak ℓ_p balls when the order of the length-normalized norm of the unknown means is between (log n)^{κ_1}/n
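As a rough illustration of the idea in this abstract, the sketch below fits a mixing distribution by maximum likelihood and then applies the resulting posterior mean to each observation. It is not the authors' implementation: the unit noise variance, the fixed grid of support points, the EM iteration count, and the names `gmleb_sketch` and `simulate_sparse_means` are all assumptions made here for illustration.

```python
import math
import random

def normal_pdf(x, mu):
    """Density of N(mu, 1) evaluated at x (unit variance is an assumption)."""
    return math.exp(-0.5 * (x - mu) ** 2) / math.sqrt(2.0 * math.pi)

def gmleb_sketch(xs, grid_size=50, em_iters=100):
    """Approximate a maximum likelihood prior on a grid, then return posterior means."""
    lo, hi = min(xs), max(xs)
    if hi == lo:
        return list(xs)
    # Hypothetical choice: equally spaced support points spanning the data range.
    grid = [lo + (hi - lo) * j / (grid_size - 1) for j in range(grid_size)]
    weights = [1.0 / grid_size] * grid_size
    # lik[i][j] is the N(grid[j], 1) density at xs[i].
    lik = [[normal_pdf(x, u) for u in grid] for x in xs]
    for _ in range(em_iters):
        # EM update of the mixing weights with the support points held fixed.
        new_w = [0.0] * grid_size
        for row in lik:
            denom = sum(w * l for w, l in zip(weights, row))
            for j in range(grid_size):
                new_w[j] += weights[j] * row[j] / denom
        weights = [w / len(xs) for w in new_w]
    estimates = []
    for row in lik:
        # Posterior mean of the unknown mean under the fitted prior.
        denom = sum(w * l for w, l in zip(weights, row))
        numer = sum(w * l * u for w, l, u in zip(weights, row, grid))
        estimates.append(numer / denom)
    return estimates

def simulate_sparse_means(n_zero, n_signal, signal, seed):
    """Hypothetical test data: mostly zero means plus a few large ones, unit noise."""
    rng = random.Random(seed)
    theta = [0.0] * n_zero + [signal] * n_signal
    xs = [t + rng.gauss(0.0, 1.0) for t in theta]
    return theta, xs
```

Calling `gmleb_sketch(xs)` on data from `simulate_sparse_means(160, 40, 5.0, seed=0)` returns one shrunken estimate per observation. Each estimate is a weighted average of grid points, so it always lies within the range of the data.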
General empirical Bayes wavelet methods and exactly adaptive minimax estimation

, 2005
Cited by 25 (3 self)
In many statistical problems, stochastic signals can be represented as a sequence of noisy wavelet coefficients. In this paper, we develop general empirical Bayes methods for the estimation of true signal. Our estimators approximate certain oracle separable rules and achieve adaptation to ideal risks and exact minimax risks in broad collections of classes of signals. In particular, our estimators are uniformly adaptive to the minimum risk of separable estimators and the exact minimax risks simultaneously in Besov balls of all smoothness and shape indices, and they are uniformly superefficient in convergence rates in all compact sets in Besov spaces with a finite secondary shape parameter. Furthermore, in classes nested between Besov balls of the same smoothness index, our estimators dominate threshold and James–Stein estimators within an infinitesimal fraction of the minimax risks. More general block empirical Bayes estimators are developed. Both white noise with drift and nonparametric regression are considered.
Statistics for phylogenetic trees
 THEORETICAL POPULATION BIOLOGY 63 (2003) 17–32
, 2003
Cited by 19 (4 self)
This paper poses the problem of estimating and validating phylogenetic trees in statistical terms. The problem is hard enough to warrant several tacks: we reason by analogy to rounding real numbers, and dealing with ranking data. These are both cases where, as in phylogeny, the parameters of interest are not real numbers. Then we pose the problem in geometrical terms, using distances and measures on a natural space of trees. We do not solve the problems of inference on tree space, but suggest some coherent ways of tackling them.
Asymptotic efficiency of simple decisions for the compound decision problem
Cited by 8 (6 self)
Abstract: We consider the compound decision problem of estimating a vector of n parameters, known up to a permutation, corresponding to n independent observations, and discuss the difference between two symmetric classes of estimators. The first and larger class is restricted to the set of all permutation invariant estimators. The second class is restricted further to simple symmetric procedures, that is, estimators such that each parameter is estimated by a function of the corresponding observation alone. We show that under mild conditions, the minimal total squared error risks over these two classes are asymptotically equivalent up to essentially O(1) difference.
Empirical Bayes and compound estimation of normal means
 Statistica Sinica
, 1997
Cited by 8 (4 self)
Abstract: This article concerns the canonical empirical Bayes problem of estimating normal means under squared-error loss. General empirical estimators are derived which are asymptotically minimax and optimal. Uniform convergence and the speed of convergence are considered. The general empirical Bayes estimators are compared with the shrinkage estimators of Stein (1956) and James and Stein (1961). Estimation of the mixture density and its derivatives is also discussed.
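For readers unfamiliar with the shrinkage estimators this abstract uses as a baseline, here is a minimal sketch of the positive-part James–Stein estimator that shrinks toward zero. It assumes unit-variance normal observations; the function name `james_stein_positive_part` is chosen here for illustration and does not come from the article.

```python
def james_stein_positive_part(xs):
    """Shrink a vector of unit-variance normal observations toward zero."""
    n = len(xs)
    norm_sq = sum(x * x for x in xs)
    # The classical estimator needs at least three coordinates to improve on xs.
    if n < 3 or norm_sq == 0.0:
        return list(xs)
    # Positive-part rule: never let the shrinkage factor go negative.
    factor = max(0.0, 1.0 - (n - 2) / norm_sq)
    return [factor * x for x in xs]
```

Every coordinate is multiplied by the same data-dependent factor, whereas a general empirical Bayes rule applies a possibly nonlinear function to each observation.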
A new approach to fitting linear models in high dimensional spaces
, 2000
Cited by 6 (0 self)
This thesis presents a new approach to fitting linear models, called “pace regression”, which also overcomes the dimensionality determination problem. Its optimality in minimizing the expected prediction loss is theoretically established, when the number of free parameters is infinitely large. In this sense, pace regression outperforms existing procedures for fitting linear models. Dimensionality determination, a special case of fitting linear models, turns out to be a natural byproduct. A range of simulation studies are conducted; the results support the theoretical analysis. Through the thesis, a deeper understanding is gained of the problem of fitting linear models. Many key issues are discussed. Existing procedures, namely OLS, AIC, BIC, RIC, CIC, CV(d), BS(m), RIDGE, NNGAROTTE and LASSO, are reviewed and compared, both theoretically and empirically, with the new methods. Estimating a mixing distribution is an indispensable part of pace regression. A measure-based minimum distance approach, including probability measures and nonnegative measures, is proposed, and strongly consistent estimators are produced. Of all minimum distance methods for estimating a mixing distribution, only the
Empirical Bayes rules for selecting the best binomial population
, 1986
Assessment of Spatial Variation of Risks in Small Populations
Cited by 1 (0 self)
Often environmental hazards are assessed by examining the spatial variation of disease-specific mortality or morbidity rates. These rates, when estimated for small local populations, can have a high degree of random variation or uncertainty associated with them. If those rate estimates are used to prioritize environmental cleanup actions or to allocate resources, then those decisions may be influenced by this high degree of uncertainty. Unfortunately, the effect of this uncertainty is not to add "random noise" into the decision-making process, but to systematically bias action toward the smallest populations where uncertainty is greatest and where extreme high and low rate deviations are most likely to be manifest by chance. We present a statistical procedure for adjusting rate estimates for differences in variability due to differentials in local area population sizes. Such adjustments produce rate estimates for areas that have better properties than the unadjusted rates for use in making statistically based decisions about the entire set of areas. Examples are provided for county variation in bladder, stomach, and lung cancer mortality rates for U.S. white males for the period 1970 to 1979.
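The abstract does not say which adjustment procedure the authors use. As one possible illustration of the general idea, the sketch below applies a method-of-moments empirical Bayes shrinkage that pulls each area's raw rate toward the overall rate, with more shrinkage for smaller populations. The Poisson-type variance approximation, the function name `shrink_rates`, and the example numbers are assumptions introduced here, not details from the paper.

```python
def shrink_rates(events, populations):
    """Shrink raw area rates toward the overall rate; small areas shrink more."""
    total_pop = sum(populations)
    overall = sum(events) / total_pop
    raw = [e / n for e, n in zip(events, populations)]
    mean_pop = total_pop / len(populations)
    # Population-weighted variance of the raw rates around the overall rate.
    weighted_var = sum(
        n * (r - overall) ** 2 for r, n in zip(raw, populations)
    ) / total_pop
    # Remove the part attributable to sampling noise (assumed Poisson-like).
    between_var = max(weighted_var - overall / mean_pop, 0.0)
    adjusted = []
    for r, n in zip(raw, populations):
        denom = between_var + overall / n
        # Weight on the raw rate; it approaches 1 as the population grows.
        weight = between_var / denom if denom > 0 else 0.0
        adjusted.append(overall + weight * (r - overall))
    return adjusted
```

With hypothetical inputs `shrink_rates([5, 100], [100, 10000])`, the area of 100 people has its raw rate of 0.05 pulled most of the way toward the overall rate, while the area of 10,000 people stays close to its raw rate of 0.01.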