De-Noising By Soft-Thresholding
, 1992
Cited by 545 (11 self)
Donoho and Johnstone (1992a) proposed a method for reconstructing an unknown function $f$ on $[0,1]$ from noisy data $d_i = f(t_i) + \sigma z_i$, $i = 0, \dots, n-1$, $t_i = i/n$, $z_i \overset{\text{iid}}{\sim} N(0,1)$. The reconstruction $\hat{f}_n$ is defined in the wavelet domain by translating all the empirical wavelet coefficients of $d$ towards 0 by an amount $\sqrt{2\log(n)} \cdot \sigma/\sqrt{n}$. We prove two results about that estimator. [Smooth]: With high probability $\hat{f}_n$ is at least as smooth as $f$, in any of a wide variety of smoothness measures. [Adapt]: The estimator comes nearly as close in mean square to $f$ as any measurable estimator can come, uniformly over balls in each of two broad scales of smoothness classes. These two properties are unprecedented in several ways. Our proof of these results develops new facts about abstract statistical inference and its connection with an optimal recovery model.
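The shrinkage step described in this abstract can be illustrated with a minimal sketch. It assumes the coefficients have already been computed by some wavelet transform (not shown) and takes the shrinkage amount to be $\sqrt{2\log n}\,\sigma/\sqrt{n}$, with `sigma` the noise level. The function names `soft_threshold` and `universal_threshold` are illustrative choices, not names taken from the paper.

```python
import numpy as np

def soft_threshold(coeffs, threshold):
    """Translate each coefficient towards 0 by `threshold`;
    coefficients smaller than `threshold` in magnitude become 0."""
    coeffs = np.asarray(coeffs, dtype=float)
    return np.sign(coeffs) * np.maximum(np.abs(coeffs) - threshold, 0.0)

def universal_threshold(n, sigma=1.0):
    """Shrinkage amount sqrt(2 log n) * sigma / sqrt(n).
    Assumes coefficients are normalized so that their noise level
    is sigma / sqrt(n); `sigma` is an assumed, known noise level."""
    return np.sqrt(2.0 * np.log(n)) * sigma / np.sqrt(n)

# Hypothetical empirical wavelet coefficients for n = 1024 samples.
example_coeffs = np.array([-3.0, -0.05, 0.0, 0.08, 1.5])
shrunk = soft_threshold(example_coeffs, universal_threshold(1024))
```

Small coefficients, which are likely to be pure noise, are set exactly to zero, while large ones are kept but pulled towards zero by the same fixed amount.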
On Qualitative Smoothness of Kernel Density Estimates
, 1994
Cited by 4 (0 self)
In this paper we give asymptotic expansions for the expected number of local extremes and inflection points of kernel density estimates. We argue that these numbers can be used as an indicator for the "qualitative" smoothness of the density estimate.
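As a rough illustration of the quantities this abstract refers to, the sketch below counts local extremes and inflection points of a Gaussian kernel density estimate by counting sign changes of its first and second derivatives on a grid. The Gaussian kernel, the grid resolution, and all function names are assumptions made for the example; the paper's asymptotic expansions are not reproduced here.

```python
import numpy as np

def kde_derivatives(sample, grid, bandwidth):
    """First and second derivatives of a Gaussian kernel density
    estimate, evaluated analytically at each point of `grid`."""
    u = (grid[:, None] - sample[None, :]) / bandwidth
    phi = np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)
    n = sample.size
    first = (-u * phi).sum(axis=1) / (n * bandwidth**2)
    second = ((u**2 - 1.0) * phi).sum(axis=1) / (n * bandwidth**3)
    return first, second

def count_sign_changes(values):
    """Number of sign changes along `values`, ignoring exact zeros."""
    signs = np.sign(values)
    signs = signs[signs != 0]
    return int(np.count_nonzero(signs[1:] != signs[:-1]))

def count_extremes_and_inflections(sample, bandwidth, grid_size=2000):
    """Return (local extremes, inflection points) of the estimate,
    counted on a grid extending 4 bandwidths beyond the sample."""
    sample = np.asarray(sample, dtype=float)
    pad = 4.0 * bandwidth
    grid = np.linspace(sample.min() - pad, sample.max() + pad, grid_size)
    first, second = kde_derivatives(sample, grid, bandwidth)
    return count_sign_changes(first), count_sign_changes(second)
```

Calling `count_extremes_and_inflections(sample, h)` for several bandwidths `h` shows how both counts fall as the estimate is smoothed more heavily, which is the sense in which they can indicate qualitative smoothness.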
Testing For Monotonicity Of A Regression Mean Without Selecting A Bandwidth
, 1998
Cited by 3 (3 self)
A new approach to testing for monotonicity of a regression mean, not requiring computation of a curve estimator or a bandwidth, is suggested. It is based on the notion of 'running gradients' over short intervals, although from some viewpoints it may be regarded as an analogue for monotonicity testing of the dip/excess mass approach for testing modality hypotheses about densities. Like the latter methods, the new technique does not suffer difficulties caused by almost flat parts of the target function. In fact, it is calibrated so as to work well for flat response curves, and as a result it has relatively good power properties in boundary cases where the curve exhibits shoulders. In this respect, as well as in its construction, the 'running gradients' approach differs from alternative techniques based on the notion of a critical bandwidth. KEYWORDS. Bootstrap, calibration, curve estimation, Monte Carlo, response curve, running gradient. SHORT TITLE. Testing for monotonicity. 1 The man...
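The abstract does not spell out the test statistic, so the following is only a loose sketch of the general idea of gradients computed over short intervals: it fits an ordinary least-squares slope to each window of consecutive design points. The window length, the use of least squares, and the function name `running_gradients` are assumptions for illustration, and the calibration step mentioned in the abstract is not shown.

```python
import numpy as np

def running_gradients(x, y, window):
    """Least-squares slope of y on x over every run of `window`
    consecutive points. Assumes x is sorted and that each window
    contains at least two distinct x values."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slopes = []
    for start in range(len(x) - window + 1):
        xs = x[start:start + window]
        ys = y[start:start + window]
        xc = xs - xs.mean()  # centred design points
        slopes.append(np.dot(xc, ys - ys.mean()) / np.dot(xc, xc))
    return np.array(slopes)
```

For an increasing regression mean, strongly negative values of `running_gradients(x, y, window).min()` would count as evidence against monotonicity; how negative is too negative is exactly what a calibration procedure would have to settle.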
Piecewise Convex Function Estimation: Pilot Estimators
- Proc. of ICIAM95 World Scientific Pub
, 1995
Cited by 2 (2 self)
Given noisy data, function estimation is considered when the unknown function is known a priori to consist of a small number of regions where the function is either convex or concave. When the number of regions is unknown, the model selection problem is to determine the number of convexity change points. For kernel estimates in Gaussian noise, the number of false change points is evaluated as a function of the smoothing parameter. To ensure that the number of false convexity change points tends to zero, the smoothing level must be larger than is generically optimal for minimizing the mean integrated square error (MISE). A two-stage estimator is proposed and shown to achieve the optimal rate of convergence of the MISE. In the first stage, the number and location of the change points are estimated using strong smoothing. In the second stage, a constrained smoothing spline fit is performed with the smoothing level chosen to minimize the MISE. The imposed constraint is that a single change p...
Testing of Monotonicity in Regression Models
- Mimeograph Series, Operations Research, Statistics
, 1990
Cited by 1 (0 self)
In data analysis concerning the investigation of the relationship between a dependent variable Y and an independent variable X, one may wish to determine whether this relationship is monotone or not. This determination may be of interest in itself, or it may form part of a (nonparametric) regression analysis which relies on monotonicity of the true regression function. In this paper we generalize the test of positive correlation by proposing a test statistic for monotonicity based on fitting a parametric model, say a higher order polynomial, to the data with and without the monotonicity constraint. The statistic has an asymptotic chi-bar-squared distribution under the null hypothesis that the true regression function is on the boundary of the space of monotone functions. Based on the theoretical results, an algorithm is developed for testing the significance of the statistic, and it is shown to perform well in several null and non-null settings. Extensions to fitting regression splines ...
On Kernel Density Estimation without Bumps in the Tails
An empirical approach to bandwidth choice is proposed for nonparametric estimation of the tails of a probability density. It is an "equal-information" method, in that it uses approximately equal amounts of sample information to estimate the density at all points. In one respect it is related to nearest neighbour methods, although it produces substantially smoother estimates, without the spikes associated with nearest neighbour analysis. In the tails it enjoys low relative error, as well as low absolute error, and as a result the tail estimates do not exhibit the familiar "bumpy" appearance which impedes both modal analysis and qualitative interpretation of curve estimates. Indeed, one of the applications of our technique is to bump-hunting methods, where it allows standard approaches to be improved.
Significance Tests for Unsupervised Pattern Discovery in Large Continuous Multivariate Data Sets Richard J. Bolton
In this paper we consider the question of uncertainty of discovered patterns in data mining. In particular, we develop statistical tests for flagged patterns found in continuous data, where such patterns are perhaps more familiar to statisticians as local modes in the data. We indicate the significance of these patterns in terms of the probability that they have occurred by chance. We examine the performance of these tests on patterns discovered in several large data sets, including a data set describing the locations of earthquakes in California and another describing flow cytometry measurements on phytoplankton. Keywords: Data mining, pattern discovery, mode analysis, local structure, uncertainty, significance tests
BIS Working Papers No 184
"... A nonparametric analysis of the shape dynamics of the US personal income ..."

