On Locally Adaptive Density Estimation
, 1996
Cited by 40 (5 self)
In this paper, theoretical and practical aspects of the sample-point adaptive positive kernel density estimator are examined. A closed-form expression for the mean integrated squared error is obtained through the device of preprocessing the data by binning. With this expression, the exact behavior of the optimally adaptive smoothing parameter function is studied for the first time. The approach differs from most earlier techniques in that the bias of the adaptive estimator remains O(h^2) and is not "improved" to the rate O(h^4). A practical algorithm is constructed using a modification of least-squares cross-validation. Simulated and real examples are presented, including comparisons with a fixed bandwidth estimator and a fully automatic version of Abramson's adaptive estimator. The results are very promising. KEY WORDS: Kernel Function, Variable Bandwidth, Binning, Cross-Validation. 1 Stephan R. Sain is Research Associate, Department of Statistical Science, Southern Methodist U...
Fast Algorithms for Mutual Information Based Independent Component Analysis
, 2002
Cited by 17 (5 self)
This paper provides fast algorithms to perform independent component analysis based on the mutual information criterion. The main ingredients are the binning technique and the use of cardinal splines, which allow fast computation of the density estimator over a regular grid. Using a discretized form of the entropy, the criterion can be evaluated quickly together with its gradient, which can be expressed in terms of the score functions. Both offline and online separation algorithms have been developed. Our density, entropy and score estimators are also of independent interest.
On the asymptotics of penalized splines
, 2007
Cited by 13 (2 self)
The asymptotic behaviour of penalized spline estimators is studied in the univariate case. We use B-splines and a penalty is placed on mth-order differences of the coefficients. The number of knots is assumed to converge to infinity as the sample size increases. We show that penalized splines behave similarly to Nadaraya-Watson kernel estimators with 'equivalent' kernels depending upon m. The equivalent kernels we obtain for penalized splines are the same as those found by Silverman for smoothing splines. The asymptotic distribution of the penalized spline estimator is Gaussian and we give simple expressions for the asymptotic mean and variance. Provided that it is fast enough, the rate at which the number of knots converges to infinity does not affect the asymptotic distribution. The optimal rate of convergence of the penalty parameter is given. Penalized splines are not design-adaptive.
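For readers unfamiliar with the Nadaraya-Watson kernel estimator mentioned in this abstract, the following is a minimal sketch of its standard textbook form, a kernel-weighted average of the responses. It does not implement penalized splines or the equivalent kernels derived in the paper; the Gaussian kernel and the names `gaussian_kernel` and `nadaraya_watson` are assumptions chosen for illustration.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used here as an illustrative kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def nadaraya_watson(x, xs, ys, h):
    """Kernel-weighted average of ys at the point x with bandwidth h."""
    weights = [gaussian_kernel((x - xi) / h) for xi in xs]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all kernel weights underflowed; increase h")
    return sum(w * y for w, y in zip(weights, ys)) / total


if __name__ == "__main__":
    design = [0.0, 0.25, 0.5, 0.75, 1.0]
    response = [0.1, 0.4, 0.45, 0.8, 1.1]
    print(nadaraya_watson(0.5, design, response, h=0.2))
```

Because the weights are normalized to sum to one, the estimate at any point is a convex combination of the observed responses.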
Accuracy of Binned Kernel Functional Approximations
, 1995
Cited by 2 (1 self)
... this paper is to study the accuracy of binning approximations used in bandwidth selection algorithms. Since virtually all common rules depend on the computation of a particular type of kernel functional estimator, the problem reduces to the study of the accuracy of binned kernel functional approximations. For simplicity and brevity our study is confined to the density estimation context. However, the conclusions apply to other settings where bandwidth selection is used, such as kernel regression. Binning techniques for fast kernel estimation were first proposed by Silverman (1982), Scott (1985) and Härdle & Scott (1992). Wand (1994) describes the extension of binning ideas to multivariate functional estimation. Studies in the approximation accuracy of binned kernel estimators include Jones and Lotwick (1983) and Hall and Wand (1994). The class of kernel functional estimators studied here was introduced by Hall and Marron (1987) and Jones and Sheather (1991). For access to the large literature on automatic bandwidth selection methods and their relative merits see, for example, Cao, Cuevas & González-Manteiga (1994) and Jones, Sheather & Marron (1995). Section 2 contains the theoretical results required for our investigation. In Section 3 we apply the results to a set of specific problems to develop an understanding of the effect of binning on kernel functional estimation and, therefore, the effect on bandwidth selection algorithms. Conclusions of this study are given in Section 4.
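The binning idea discussed in this abstract can be sketched as follows: each observation is replaced by fractional counts on a regular grid, and the kernel sum then runs over grid points instead of raw data. This is a minimal illustration under stated assumptions (linear binning, a Gaussian kernel, a direct sum rather than an FFT), not the approximation scheme analysed in the paper; all function names are hypothetical.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used here as an illustrative kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def exact_kde(x, data, h):
    """Ordinary fixed-bandwidth kernel density estimate at x."""
    return sum(gaussian_kernel((x - xi) / h) for xi in data) / (len(data) * h)


def linear_bin(data, lo, hi, m):
    """Split each observation's unit mass between its two nearest grid points.

    Observations outside [lo, hi] are ignored (an assumption of this sketch).
    """
    delta = (hi - lo) / (m - 1)
    counts = [0.0] * m
    for xi in data:
        pos = (xi - lo) / delta
        j = int(math.floor(pos))
        frac = pos - j
        if 0 <= j < m - 1:
            counts[j] += 1.0 - frac
            counts[j + 1] += frac
        elif j == m - 1 and frac == 0.0:
            counts[j] += 1.0
    grid = [lo + k * delta for k in range(m)]
    return grid, counts


def binned_kde(x, grid, counts, n, h):
    """Kernel density estimate at x computed from grid counts only."""
    total = sum(c * gaussian_kernel((x - g) / h)
                for g, c in zip(grid, counts) if c)
    return total / (n * h)


if __name__ == "__main__":
    sample = [-1.3, -0.7, -0.2, 0.0, 0.4, 0.9, 1.5, 2.1]
    grid, counts = linear_bin(sample, lo=-4.0, hi=5.0, m=401)
    print(exact_kde(0.3, sample, 0.5))
    print(binned_kde(0.3, grid, counts, len(sample), 0.5))
```

The gap between the two printed values is the binning error at one point; it shrinks as the grid is refined relative to the bandwidth.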
A continuous Gaussian approximation to a nonparametric regression in two dimensions.
Cited by 2 (1 self)
Estimating the mean in a nonparametric regression on a two-dimensional regular grid of design points is asymptotically equivalent to estimating the drift of a continuous Gaussian process on the unit square. In particular, we provide a construction of a Brownian sheet process with a drift that is almost the mean function in the nonparametric regression. This can be used to apply estimation or testing procedures from the continuous process to the regression experiment as in Le Cam's theory of equivalent experiments. Our result is motivated by first looking at the amount of information lost in binning the data in a density estimation problem. Keywords: Nonparametric regression, Asymptotic equivalence of experiments,
Cluster Analysis of Massive Datasets in Astronomy
, 2006
Cited by 2 (0 self)
Clusters of galaxies are a useful proxy to trace the mass distribution of the universe. By measuring the mass of clusters of galaxies at different scales, one can follow the evolution of the mass distribution (Martínez and Saar, 2002). It can be shown that finding galaxy clusters is equivalent to finding density contour clusters (Hartigan, 1975): connected components of the level set S_c ≡ {f > c}, where f is a probability density function. Cuevas et al. (2000, 2001) proposed a nonparametric method for finding density contour clusters using the minimal spanning tree. While their algorithm is conceptually simple, it requires intensive computations for large datasets. We propose a more efficient clustering method based on their algorithm with the Fast Fourier Transform (FFT). The method is applied to a study of galaxy clustering on large astronomical sky survey data.
Logspline Density Estimation for Binned Data
In this paper we consider logspline density estimation for binned data. Rates of convergence are established when the log-density function is assumed to be in a Besov space. An algorithm involving a procedure similar to maximum likelihood, stepwise knot addition, and stepwise knot deletion is proposed for the estimation of the density function based upon binned data. Numerical examples are used to show the finite-sample performance of inference based on the logspline density estimation. Keywords: Besov space, binning, knot selection, MILE, optimal rate of convergence. 1. Introduction This paper proposes a method of density estimation for binned data. Let X_1, ..., X_n be a random sample from a distribution with density f. In some experiments, the data are reported in the form of a histogram. The observed random variables are Y_q = #{X_i : X_i ∈ I_q}, where the I_q are bins. We want to estimate the unknown density function f based on the Y_q. Flexible exponential families have bee...
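The binned observations Y_q = #{X_i : X_i ∈ I_q} defined in this abstract can be computed as in the short sketch below. It only illustrates the data reduction step and a plain histogram density built from the counts; the logspline fitting procedure itself is not reproduced. Half-open bins [a_q, a_{q+1}) and the names `bin_counts` and `histogram_density` are assumptions of this example.

```python
from bisect import bisect_right


def bin_counts(sample, edges):
    """Return Y_q, the number of observations in I_q = [edges[q], edges[q+1])."""
    counts = [0] * (len(edges) - 1)
    for x in sample:
        q = bisect_right(edges, x) - 1
        if 0 <= q < len(counts):  # observations outside all bins are dropped
            counts[q] += 1
    return counts


def histogram_density(counts, edges):
    """Piecewise-constant density estimate built from the bin counts alone."""
    n = sum(counts)
    return [c / (n * (edges[q + 1] - edges[q])) for q, c in enumerate(counts)]


if __name__ == "__main__":
    sample = [0.05, 0.2, 0.25, 0.5, 0.55, 0.6, 0.9]
    edges = [0.0, 0.25, 0.5, 0.75, 1.0]
    counts = bin_counts(sample, edges)
    print(counts)
    print(histogram_density(counts, edges))
```

Once the data are reduced to counts, any estimator of f can only use the Y_q and the bin edges, which is the setting the abstract describes.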
On Gauss Quadratures and Partial Cross Validation
In this paper we consider new estimators of expected values Eω(X) of functions of a random variable X. The new estimators are based on Gauss quadrature, a numerical method frequently used to approximate integrals over finite intervals. We apply the new estimators in Partial Cross Validation, a numerical method for finding optimal smoothing parameters in nonparametric curve estimation. We show that Partial Cross Validation can considerably reduce the computational cost of the Generalized Cross Validation method typically used to determine the optimal smoothing parameter.
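As background for the Gauss quadrature mentioned in this abstract, here is a minimal sketch of the classical three-point Gauss-Legendre rule used to approximate an expectation Eω(X) when X has a known density on a finite interval. This is an assumption made for illustration only: the paper's estimators and Partial Cross Validation are not reproduced, and the function names are hypothetical.

```python
import math

# Three-point Gauss-Legendre nodes and weights on [-1, 1];
# the rule is exact for polynomials up to degree 5.
NODES = (-math.sqrt(3.0 / 5.0), 0.0, math.sqrt(3.0 / 5.0))
WEIGHTS = (5.0 / 9.0, 8.0 / 9.0, 5.0 / 9.0)


def gauss_legendre(func, a, b):
    """Approximate the integral of func over [a, b] with three evaluations."""
    half = 0.5 * (b - a)
    mid = 0.5 * (a + b)
    return half * sum(w * func(mid + half * t) for t, w in zip(NODES, WEIGHTS))


def expected_value(omega, density, a, b):
    """Approximate E[omega(X)] for X with the given density on [a, b]."""
    return gauss_legendre(lambda x: omega(x) * density(x), a, b)


if __name__ == "__main__":
    # X uniform on [0, 2], omega(x) = x^2
    print(expected_value(lambda x: x * x, lambda x: 0.5, 0.0, 2.0))
```

The appeal of such rules is that a handful of function evaluations replaces a full sum or integral, which is the kind of saving the abstract attributes to Partial Cross Validation.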
On Boundary Effects of Smooth Curve Estimators
, 1994
Many nonparametric smooth curve estimators have a problem with boundary effects. Roughly speaking, the discontinuity of the curves under investigation at their endpoints causes difficulties for estimators of this kind. These estimators are visually disturbing at boundary regions and can become misleading in modeling the data because they are seriously biased there. In applications, boundary regions can be a substantial portion of the entire support. This has been recognized as an important problem and there are many adjustments suggested in the literature. We investigate properties of Shuster's boundary fold method and Rice's modification. Noticing the automatic boundary adaptive property of the local linear smoother recently highlighted by Fan, we further find that it is 100% efficient, i.e. best out of all possible estimators, for estimation at endpoints in a typical minimax sense. This result is important since it shows in one step that the local linear approach is as good as or better than all of the many other approaches proposed in the literature. The problem of