Ideal spatial adaptation by wavelet shrinkage
- Biometrika, 1994
Cited by 578 (2 self)
With ideal spatial adaptation, an oracle furnishes information about how best to adapt a spatially variable estimator, whether piecewise constant, piecewise polynomial, variable knot spline, or variable bandwidth kernel, to the unknown function. Estimation with the aid of an oracle offers dramatic advantages over traditional linear estimation by nonadaptive kernels; however, it is a priori unclear whether such performance can be obtained by a procedure relying on the data alone. We describe a new principle for spatially-adaptive estimation: selective wavelet reconstruction. We show that variable-knot spline fits and piecewise-polynomial fits, when equipped with an oracle to select the knots, are not dramatically more powerful than selective wavelet reconstruction with an oracle. We develop a practical spatially adaptive method, RiskShrink, which works by shrinkage of empirical wavelet coefficients. RiskShrink mimics the performance of an oracle for selective wavelet reconstruction as well as it is possible to do so. A new inequality in multivariate normal decision theory which we call the oracle inequality shows that attained performance differs from ideal performance by at most a factor $2 \log n$, where $n$ is the sample size. Moreover no estimator can give a better guarantee than this. Within the class of spatially adaptive procedures, RiskShrink is essentially optimal. Relying only on the data, it comes within a factor $\log^2 n$ of the performance of piecewise polynomial and variable-knot spline methods equipped with an oracle. In contrast, it is unknown how or if piecewise polynomial methods could be made to function this well when denied access to an oracle and forced to rely on data alone.
Wavelet shrinkage: asymptopia
- Journal of the Royal Statistical Society, Ser. B, 1995
Cited by 196 (32 self)
Considerable effort has been directed recently to develop asymptotically minimax methods in problems of recovering infinite-dimensional objects (curves, densities, spectral densities, images) from noisy data. A rich and complex body of work has evolved, with nearly- or exactly-minimax estimators being obtained for a variety of interesting problems. Unfortunately, the results have often not been translated into practice, for a variety of reasons: sometimes, similarity to known methods, sometimes, computational intractability, and sometimes, lack of spatial adaptivity. We discuss a method for curve estimation based on $n$ noisy data; one translates the empirical wavelet coefficients towards the origin by an amount $\sqrt{2\log(n)}/\sqrt{n}$. The method is different from methods in common use today, is computationally practical, and is spatially adaptive; thus it avoids a number of previous objections to minimax estimators. At the same time, the method is nearly minimax for a wide variety of loss functions (e.g. pointwise error, global error measured in $L^p$ norms, pointwise and global error in estimation of derivatives) and for a wide range of smoothness classes, including standard Hölder classes, Sobolev classes, and Bounded Variation. This is a much broader near-optimality than anything previously proposed in the minimax literature. Finally, the theory underlying the method is interesting, as it exploits a correspondence between statistical questions and questions of optimal recovery and information-based complexity.
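The shrinkage step described in the abstract above (translating each empirical wavelet coefficient towards the origin by a fixed amount) can be sketched as soft thresholding. This is a minimal illustration, not the authors' implementation: the function names, the sample coefficients, and the unit noise level are assumptions, and the wavelet transform itself is omitted.

```python
import math


def universal_threshold(n):
    """Amount sqrt(2 log n) / sqrt(n) quoted in the abstract (unit noise level assumed)."""
    return math.sqrt(2.0 * math.log(n)) / math.sqrt(n)


def soft_threshold(coeffs, t):
    """Move each coefficient towards zero by t; anything within [-t, t] becomes zero."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]


# Hypothetical empirical wavelet coefficients, for illustration only.
n = 1024
coeffs = [0.90, -0.05, 0.02, -0.40, 0.10]
t = universal_threshold(n)
shrunk = soft_threshold(coeffs, t)
```

Small coefficients, which are likely to be mostly noise, are set to zero, while large ones are kept but reduced in magnitude by `t`.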
An Active Learning Framework for Content-Based Information Retrieval
- IEEE TRANSACTIONS ON MULTIMEDIA, 2002
Cited by 40 (1 self)
In this paper, we propose a general active learning framework for content-based information retrieval (CBIR). We use this framework to guide hidden annotations in order to improve the retrieval performance. For each object in the database, we maintain a list of probabilities, each indicating the probability of this object having one of the attributes. During training, the learning algorithm samples objects in the database and presents them to the annotator to assign attributes to. For each sampled object, each probability is set to be one or zero depending on whether or not the corresponding attribute is assigned by the annotator. For objects that have not been annotated, the learning algorithm estimates their probabilities with biased kernel regression. Knowledge gain is then defined to determine, among the objects that have not been annotated, which one the system is the most uncertain of. The system then presents that object to the annotator as the next sample to be assigned attributes. During retrieval, the list of probabilities works as a feature vector for us to calculate the semantic distance between two objects, or between the user query and an object in the database. The overall distance between two objects is determined by a weighted sum of the semantic distance and the low-level feature distance. The algorithm is tested on both synthetic databases and real databases of three-dimensional (3-D) models. In both cases, the retrieval performance of the system improves rapidly with the number of annotated samples. Furthermore, we show that active learning outperforms learning based on random sampling.
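Two steps from the abstract above, picking the most uncertain unannotated object and combining semantic and low-level distances by a weighted sum, can be sketched as follows. The abstract does not give the paper's definitions of knowledge gain or semantic distance, so this sketch substitutes assumptions: total binary entropy stands in for uncertainty, mean absolute difference of the probability lists stands in for semantic distance, and the weight `w_semantic` and all object names are hypothetical.

```python
import math


def binary_entropy(p):
    """Entropy in bits of a yes/no attribute with probability p (assumed uncertainty measure)."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def most_uncertain(prob_lists):
    """Return the key of the unannotated object with the largest total attribute entropy."""
    return max(prob_lists, key=lambda k: sum(binary_entropy(p) for p in prob_lists[k]))


def overall_distance(probs_a, probs_b, feat_a, feat_b, w_semantic=0.5):
    """Weighted sum of an assumed semantic distance and a Euclidean low-level distance."""
    semantic = sum(abs(a - b) for a, b in zip(probs_a, probs_b)) / len(probs_a)
    low_level = math.dist(feat_a, feat_b)
    return w_semantic * semantic + (1.0 - w_semantic) * low_level


# Hypothetical attribute-probability lists for three unannotated objects.
unannotated = {"obj_a": [0.9, 0.1], "obj_b": [0.5, 0.6], "obj_c": [1.0, 0.0]}
next_sample = most_uncertain(unannotated)
d = overall_distance([1.0, 0.0], [0.0, 0.0], [0.0, 0.0], [3.0, 4.0])
```

Here `obj_b` is selected because its probabilities sit closest to 0.5, which under the entropy assumption marks it as the object the system knows least about.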
On Locally Adaptive Density Estimation
- 1996
Cited by 30 (4 self)
In this paper, theoretical and practical aspects of the sample-point adaptive positive kernel density estimator are examined. A closed-form expression for the mean integrated squared error is obtained through the device of preprocessing the data by binning. With this expression, the exact behavior of the optimally adaptive smoothing parameter function is studied for the first time. The approach differs from most earlier techniques in that bias of the adaptive estimator remains $O(h^2)$ and is not "improved" to the rate $O(h^4)$. A practical algorithm is constructed using a modification of least-squares cross-validation. Simulated and real examples are presented, including comparisons with a fixed bandwidth estimator and a fully automatic version of Abramson's adaptive estimator. The results are very promising. KEY WORDS: Kernel Function, Variable Bandwidth, Binning, Cross-Validation. Stephan R. Sain is Research Associate, Department of Statistical Science, Southern Methodist U...
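For orientation, a sample-point adaptive kernel density estimate gives each data point its own bandwidth, so the estimate at x is the average of K((x - x_i) / h_i) / h_i over the sample. The sketch below shows only that general form with a Gaussian kernel. The data and bandwidths are hypothetical, and the paper's binning step and cross-validation bandwidth selection are not reproduced.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, a positive kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def sample_point_kde(x, data, bandwidths):
    """Density estimate at x where data point x_i carries its own bandwidth h_i."""
    n = len(data)
    return sum(
        gaussian_kernel((x - xi) / hi) / hi for xi, hi in zip(data, bandwidths)
    ) / n


# Hypothetical sample and per-point bandwidths (wider where data are sparse).
data = [-1.0, 0.0, 0.2, 3.0]
bandwidths = [0.5, 0.3, 0.3, 1.2]
density_at_zero = sample_point_kde(0.0, data, bandwidths)
```

Because every term is a properly scaled density, the estimate is nonnegative and integrates to one for any positive bandwidths.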
Nonparametric Density Estimation With Adaptive Varying Window Size
- In Conference on Image and Signal Processing for Remote Sensing VI, European Symposium on Remote Sensing, 2000
Cited by 10 (1 self)
We propose a new method of kernel density estimation with a varying adaptive window width. This method is different from traditional ones in two aspects. First, we use symmetric as well as nonsymmetric left and right kernels with discontinuities and show that the fusion of these estimates results in accuracy improvement. Second, we develop estimates with adaptive varying window widths based on the so-called intersection of confidence intervals (ICI) rule. Several examples of the proposed method are given for different types of densities and the quality of the adaptive density estimate is assessed by means of numerical simulations.
Variable kernel estimates: On the impossibility of tuning the parameters
- in: High-Dimensional Probability II, (edited by
, 2000
Cited by 8 (1 self)
For the standard kernel density estimate, it is known that one can tune the bandwidth such that the expected L1 error is within a constant factor of the optimal L1 error (obtained when one is allowed to choose the bandwidth with knowledge of the density). In this paper, we pose the same problem for variable bandwidth kernel estimates where the bandwidths are allowed to depend upon the location. We show in particular that for positive kernels on the real line, for any data-based bandwidth, there exists a density for which the ratio of expected L1 error over optimal L1 error tends to infinity. Thus, the problem of tuning the variable bandwidth in an optimal manner is “too hard”. Moreover, from the class of counterexamples exhibited in the paper, it appears that placing conditions on the densities (monotonicity, convexity, smoothness) does not help.
Predictor-Corrector Ensemble Filters for the Assimilation of Sparse Data into High-Dimensional Nonlinear Systems
- 2006
Continuous Contour Monte Carlo for Marginal Density Estimation with an Application to Spatial Statistical Model
- 2006
Cited by 7 (3 self)
The problem of marginal density estimation for a multivariate density function f(x) can be generally stated as a problem of density function estimation for a random vector λ(x) of dimension lower than that of x. In this paper, we propose a technique, the so-called continuous Contour Monte Carlo (CCMC) algorithm, for solving this problem. CCMC can be viewed as a continuous version of the contour Monte Carlo (CMC) algorithm recently proposed in the literature. CCMC abandons the use of sample space partitioning and incorporates the techniques of kernel density estimation into its simulations. CCMC is more general than other marginal density estimation algorithms. First, it works for any density function, even one having a rugged or unbalanced energy landscape. Second, it works for any transformation λ(x) regardless of the availability of the analytical form of the inverse transformation. In this paper, CCMC is applied to estimate the unknown normalizing constant function for a spatial autologistic model, and the estimate is then used in a Bayesian analysis for the spatial autologistic model in place of the true normalizing constant function. Numerical results on the US cancer mortality data indicate that the Bayesian method can produce much more accurate estimates than the MPLE and MCMLE methods for the parameters of the spatial autologistic model.
Filtered Kernel Density Estimation
- 1995
Cited by 7 (1 self)
this paper for the construction of the filtering functions for two reasons: we have some experience in finite mixture estimation, and it is a convenient arena in which to do some comparisons with the standard kernel estimator. There are a number of methods for choosing the number of components to be used, either subjectively or using automated methods such as in Priebe 1994, Solka et al 1995 or Rogers et al 1995. In this paper, we will either assume knowledge of the true mixture (for comparison with the standard kernel estimator) or use subjective methods.
Kernel density classification and boosting: an L2 analysis
- Statistics and Computing, 2005
Cited by 6 (1 self)
Kernel density estimation is a commonly used approach to classification. However, most of the theoretical results for kernel methods apply to estimation per se and not necessarily to classification. In this paper we show that when estimating the difference between two densities, the optimal smoothing parameters are increasing functions of the sample size of the complementary group, and we provide a small simulation study which examines the relative performance of kernel density methods when the final goal is classification. A relative newcomer to the classification portfolio is “boosting”, and this paper proposes an algorithm for boosting kernel density classifiers. We note that boosting is closely linked to a previously proposed method of bias reduction in kernel density estimation and indicate how it will enjoy similar properties for classification. We show that boosting kernel classifiers reduces the bias whilst only slightly increasing the variance, with an overall reduction in error. Numerical examples and simulations are used to illustrate the findings, and we also suggest further areas of research.

