Results 1-10 of 72
SiZer for exploration of structures in curves
Journal of the American Statistical Association, 1997
Cited by 82 (16 self)
In the use of smoothing methods in data analysis, an important question is often: which observed features are "really there?", as opposed to being spurious sampling artifacts. An approach is described, based on scale space ideas that were originally developed in computer vision literature. Assessment of Significant ZERo crossings of derivatives results in the SiZer map, a graphical device for display of significance of features, with respect to both location and scale. Here "scale" means "level of resolution", i.e.
Deconvoluting kernel density estimators
Statistics, 1990
Cited by 63 (7 self)
This paper considers estimation of a continuous bounded probability density when observations from the density are contaminated by additive measurement errors having a known distribution. Properties of the estimator obtained by deconvolving a kernel estimator of the observed data are investigated. When the kernel used is sufficiently smooth, the deconvolved estimator is shown to be pointwise consistent and bounds on its integrated mean squared error are derived. Very weak assumptions are made on the measurement-error density, thereby permitting a comparison of the effects of different types of measurement error on the deconvolved estimator.
The Mode Tree: A Tool for Visualization of Nonparametric Density Features
Journal of Computational and Graphical Statistics, 1993
Cited by 35 (4 self)
Recognition and extraction of features in a nonparametric density estimate is highly dependent on correct calibration. The data-driven choice of bandwidth h in kernel density estimation is a difficult one, compounded by the fact that the globally optimal h is not generally optimal for all values of x. In recognition of this fact, a new type of graphical tool, the mode tree, is proposed. The basic mode tree plot relates the locations of modes in density estimates with the bandwidths of those estimates. Additional information can be included on the plot indicating such factors as the size of modes, how modes split, and the locations of antimodes and bumps. The use of a mode tree in adaptive multimodality investigations is proposed, and an example is given to show the value in using a Normal kernel, as opposed to the biweight or other kernels, in such investigations. Examples of such investigations are provided for Ahrens' chondrite data and van Winkle's Hidalgo stamp data. Finally, the b...
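The core idea of relating mode locations to bandwidths can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy data, the evaluation grid and the two bandwidths are all assumptions chosen for illustration, and modes are located only to grid resolution.

```python
import math

def gaussian_kde(data, h, grid):
    """Evaluate a Normal-kernel density estimate with bandwidth h on a grid."""
    norm = 1.0 / (len(data) * h * math.sqrt(2.0 * math.pi))
    return [norm * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in data)
            for x in grid]

def mode_locations(data, h, grid):
    """Return the grid points that are local maxima of the estimate.

    The comparison is strict on the left and non-strict on the right,
    so a two-point plateau at a peak is counted once.
    """
    dens = gaussian_kde(data, h, grid)
    return [grid[i] for i in range(1, len(grid) - 1)
            if dens[i - 1] < dens[i] >= dens[i + 1]]

def mode_tree(data, bandwidths, grid):
    """Pair each bandwidth with the mode locations of its estimate."""
    return [(h, mode_locations(data, h, grid)) for h in bandwidths]

# Hypothetical toy data: two tight groups of observations.
data = [-2.1, -2.0, -1.9, -1.8, 1.8, 1.9, 2.0, 2.1]
grid = [-4.0 + 0.01 * i for i in range(801)]
tree = mode_tree(data, [0.3, 3.0], grid)

for h, modes in tree:
    print(f"h = {h}: {len(modes)} mode(s)")
```

Plotting the mode locations against h for a fine sequence of bandwidths would give the basic tree; for this toy sample the small bandwidth resolves both groups, while the large one merges them into a single mode.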
Testing monotonicity of regression
Journal of Computational and Graphical Statistics, 1998
Cited by 25 (0 self)
This article provides a test of monotonicity of a regression function. The test is based on the size of a “critical” bandwidth, the amount of smoothing necessary to force a nonparametric regression estimate to be monotone. It is analogous to Silverman’s test of multimodality in density estimation. Bootstrapping is used to provide a null distribution for the test statistic. The methodology is particularly simple in regression models in which the variance is a specified function of the mean, but we also discuss in detail the homoscedastic case with unknown variance. Simulation evidence indicates the usefulness of the method. Two examples are given.
The Problem of Regions
1998
Cited by 22 (2 self)
In the problem of regions we wish to know which one of a discrete set of possibilities applies to a continuous parameter vector. This problem arises in the following way: we compute a descriptive statistic from a set of data and notice an interesting feature. We wish to assign a confidence level to that feature. For example, we compute a density estimate and notice that the estimate is bimodal. What confidence do we assign to bimodality? A natural way to measure this confidence is via the bootstrap: we compute our descriptive statistic on a large number of bootstrap samples and record the proportion of times that the feature appears. This proportion seems like a plausible measure of confidence for the feature. We study the construction of such confidence values and examine to what extent they approximate frequentist p-values. We derive more accurate confidence values using both frequentist and objective Bayesian approaches. The methods are illustrated with a number of examples includ...
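The naive bootstrap proportion described in this abstract can be sketched as follows. This illustrates only the simple first-pass measure, not the more accurate confidence values the paper derives; the function names, toy data, fixed bandwidth, grid and number of resamples are all assumptions.

```python
import math
import random

def count_modes(sample, h, grid):
    """Count local maxima of an (unnormalised) Normal-kernel estimate on a grid."""
    dens = [sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in sample)
            for x in grid]
    # Strict on the left, non-strict on the right: a flat peak is counted once.
    return sum(1 for i in range(1, len(grid) - 1)
               if dens[i - 1] < dens[i] >= dens[i + 1])

def bootstrap_bimodal_proportion(data, h, grid, n_boot=100, seed=0):
    """Proportion of bootstrap resamples whose density estimate has two modes."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_boot):
        # Resample the data with replacement, keeping the sample size.
        resample = [rng.choice(data) for _ in data]
        if count_modes(resample, h, grid) == 2:
            hits += 1
    return hits / n_boot

# Hypothetical toy data: two well-separated groups of 20 observations each.
data = ([-3.25 + 0.025 * i for i in range(20)]
        + [2.775 + 0.025 * i for i in range(20)])
grid = [-6.0 + 0.05 * i for i in range(241)]
proportion = bootstrap_bimodal_proportion(data, h=0.5, grid=grid)
print(f"bootstrap proportion of bimodal estimates: {proportion:.2f}")
```

The bandwidth is held fixed across resamples to keep the sketch short; in practice the result depends on how the bandwidth is chosen, which is one reason the raw proportion is only a rough measure of confidence.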
Bayesian Model Selection in Finite Mixtures by Marginal Density Decompositions
Journal of the American Statistical Association, 2001
On the number of modes of a Gaussian mixture

2003
Cited by 18 (5 self)
We consider a problem intimately related to the creation of maxima under Gaussian blurring: the number of modes of a Gaussian mixture in D dimensions. To our knowledge, a general answer to this question is not known. We conjecture that if the components of the mixture have the same covariance matrix (or the same covariance matrix up to a scaling factor), then the number of modes cannot exceed the number of components. We demonstrate
Density estimation
Statistical Science, 2004
Cited by 12 (1 self)
Abstract. This paper provides a practical description of density estimation based on kernel methods. An important aim is to encourage practicing statisticians to apply these methods to data. As such, reference is made to implementations of these methods in R, S-PLUS and SAS. Key words and phrases: Kernel density estimation, bandwidth selection, local likelihood density estimates, data sharpening.
Estimating the Number of Clusters
2000
Cited by 10 (0 self)
Hartigan (1975) defines the number q of clusters in a d-variate statistical population as the number of connected components of the set {f>c}, where f denotes the underlying density function on R^d and c is a given constant. Some usual cluster algorithms treat q as an input which must be given in advance. The authors propose a method for estimating this parameter which is based on the computation of the number of connected components of an estimate of {f>c}. This set estimator is constructed as a union of balls with centres at an appropriate subsample which is selected via a nonparametric density estimator of f. The asymptotic behaviour of the proposed method is analyzed. A simulation study and an example with real data are also included.
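A rough sketch of the union-of-balls idea follows, under stated assumptions. The abstract does not specify the density estimator, the subsample rule or the ball radius, so the Normal-kernel estimate, the threshold rule "keep points whose estimated density exceeds c", and the parameters h and r are hypothetical choices, as are all names and the toy data.

```python
import math

def kde_at(point, data, h):
    """Normal-kernel density estimate at one point in R^d."""
    d = len(point)
    norm = len(data) * (h * math.sqrt(2.0 * math.pi)) ** d
    return sum(math.exp(-0.5 * (math.dist(point, x) / h) ** 2)
               for x in data) / norm

def estimate_num_clusters(data, h, c, r):
    """Count connected components of a union of balls of radius r.

    Balls are centred at the sample points whose estimated density
    exceeds c; two balls are connected when their centres lie within 2r.
    """
    centres = [x for x in data if kde_at(x, data, h) > c]
    parent = list(range(len(centres)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(centres)):
        for j in range(i + 1, len(centres)):
            if math.dist(centres[i], centres[j]) <= 2.0 * r:
                parent[find(i)] = find(j)
    return len({find(i) for i in range(len(centres))})

# Hypothetical toy data: two 3x3 patches of points plus one isolated point.
data = [(cx + 0.2 * i, cy + 0.2 * j)
        for cx, cy in [(0.0, 0.0), (5.0, 5.0)]
        for i in range(-1, 2) for j in range(-1, 2)]
data.append((10.0, -10.0))

q_hat = estimate_num_clusters(data, h=0.5, c=0.1, r=0.2)
print(f"estimated number of clusters: {q_hat}")
```

With the threshold c = 0.1 the isolated point falls below the level set and is discarded; lowering c far enough would admit it as an extra component, which shows how strongly the estimate of q depends on the chosen level.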