Results 1-10 of 243
Mean shift: A robust approach toward feature space analysis
In PAMI, 2002
Cited by 1619 (34 self)
A general nonparametric technique is proposed for the analysis of a complex multimodal feature space and to delineate arbitrarily shaped clusters in it. The basic computational module of the technique is an old pattern recognition procedure, the mean shift. We prove for discrete data the convergence of a recursive mean shift procedure to the nearest stationary point of the underlying density function and thus its utility in detecting the modes of the density. The equivalence of the mean shift procedure to the Nadaraya–Watson estimator from kernel regression and the robust M-estimators of location is also established. Algorithms for two low-level vision tasks, discontinuity-preserving smoothing and image segmentation, are described as applications. In these algorithms the only user-set parameter is the resolution of the analysis, and either gray-level or color images are accepted as input. Extensive experimental results illustrate their excellent performance.
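The recursive procedure summarized in this abstract can be illustrated with a minimal sketch. It assumes a Gaussian kernel with a single fixed bandwidth; the function name `mean_shift_mode`, the synthetic two-cluster data, and the stopping rule are hypothetical choices made for illustration, not the paper's implementation.

```python
import numpy as np

def mean_shift_mode(x0, data, bandwidth, tol=1e-6, max_iter=500):
    """Iterate the mean shift from x0 until the update is smaller than tol.

    Assumes a Gaussian kernel with one global bandwidth (an assumption of
    this sketch, not something fixed by the abstract).
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        # Squared distances from the current estimate to every sample.
        d2 = np.sum((data - x) ** 2, axis=1)
        # Gaussian kernel weights.
        w = np.exp(-0.5 * d2 / bandwidth ** 2)
        # The new estimate is the kernel-weighted mean of the samples.
        x_new = w @ data / w.sum()
        if np.linalg.norm(x_new - x) < tol:
            return x_new
        x = x_new
    return x

# Hypothetical data: two well-separated Gaussian clusters.
rng = np.random.default_rng(0)
data = np.vstack([
    rng.normal(loc=[0.0, 0.0], scale=0.3, size=(200, 2)),
    rng.normal(loc=[4.0, 4.0], scale=0.3, size=(200, 2)),
])

# Each start point should climb to the mode of its own cluster.
mode_a = mean_shift_mode([0.5, 0.5], data, bandwidth=0.5)
mode_b = mean_shift_mode([3.5, 3.5], data, bandwidth=0.5)
```

Running the iteration from every data point and grouping the points by the mode they reach is one way such a procedure can delineate clusters.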
Manipulation of the running variable in the regression discontinuity design: A density test
Journal of Econometrics 142, 2008
Cited by 138 (4 self)
Standard sufficient conditions for identification in the regression discontinuity design are continuity of the conditional expectation of counterfactual outcomes in the running variable. These continuity assumptions may not be plausible if agents are able to manipulate the running variable. This paper develops a test of manipulation related to continuity of the running variable density function. The methodology is applied to popular elections to the House of Representatives, where sorting is neither expected nor found, and to roll-call voting in the House, where sorting is both expected and found.
I thank two anonymous referees for comments, the editors for multiple suggestions that substantially improved the paper, Jack Porter, John DiNardo, and Serena Ng for discussion, Jonah Gelbach for computing improvements, and Ming-Yen Cheng ...
One reason for the increasing popularity in economics of regression discontinuity applications is the perception that the identifying assumptions are quite weak. However, while some applications of the design can be highly persuasive, many are subject to the criticism that public knowledge of the treatment assignment rule may invalidate the continuity assumptions at the heart of identification.
Data-driven bandwidth selection in local polynomial fitting: variable bandwidth and spatial
B, 1995
Did Securitization Lead to Lax Screening? Evidence from Subprime Loans
SSRN working paper, 2008
Cited by 130 (5 self)
and seminar participants at Duke (Fuqua School of Business) and London Business School for useful discussions. The opinions expressed in the paper are those of the authors and do not reflect the views of Sorin Capital Management. All remaining errors are our responsibility.
The Variable Bandwidth Mean Shift and Data-Driven Scale Selection
In Proc. 8th Intl. Conf. on Computer Vision, 2001
Cited by 111 (9 self)
We present two solutions for the scale selection problem in computer vision. The first one is completely nonparametric and is based on the adaptive estimation of the normalized density gradient. Employing the sample point estimator, we define the Variable Bandwidth Mean Shift, prove its convergence, and show its superiority over the fixed bandwidth procedure. The second technique has a semiparametric nature and imposes a local structure on the data to extract reliable scale information. The local scale of the underlying density is taken as the bandwidth which maximizes the magnitude of the normalized mean shift vector. Both estimators provide practical tools for autonomous image and quasi real-time video analysis, and several examples are shown to illustrate their effectiveness.
1 Motivation for Variable Bandwidth
The efficacy of Mean Shift analysis has been demonstrated in computer vision problems such as tracking and segmentation in [5, 6]. However, one of the limitations of the mean shift procedure as defined in these papers is that it involves the specification of a scale parameter. While the results obtained appear satisfactory, when the local characteristics of the feature space differ significantly across the data, it is difficult to find an optimal global bandwidth for the mean shift procedure. In this paper we address the issue of locally adapting the bandwidth. We also study an alternative approach for data-driven scale selection which imposes a local structure on the data. The proposed solutions are tested in the framework of quasi real-time video analysis. We review first the intrinsic limitations of the fixed bandwidth density estimation methods. Then, two of the most popular variable bandwidth estimators, the balloon and the sample point, are introduced and...
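As a rough illustration of the sample point idea described above, the sketch below gives every sample its own bandwidth h_i and weights it by the kernel value divided by h_i^(d+2). The Gaussian kernel, the k-nearest-neighbor rule for choosing h_i, the synthetic data, and all function names are assumptions made for this example; the excerpt does not specify how the authors select the bandwidths.

```python
import numpy as np

def knn_bandwidths(data, k=10):
    """Hypothetical bandwidth rule: distance to the k-th nearest neighbor."""
    diff = data[:, None, :] - data[None, :, :]
    dist = np.sqrt(np.sum(diff ** 2, axis=2))
    # Column 0 of each sorted row is the point itself (distance 0).
    return np.sort(dist, axis=1)[:, k]

def variable_bandwidth_mean_shift(x0, data, h, tol=1e-6, max_iter=500):
    """Mean shift with one bandwidth h[i] per sample (sample point form)."""
    x = np.asarray(x0, dtype=float)
    d = data.shape[1]
    for _ in range(max_iter):
        # Squared distances, each scaled by that sample's own bandwidth.
        u2 = np.sum((data - x) ** 2, axis=1) / h ** 2
        # Gaussian weights with the extra 1 / h_i^(d+2) factor.
        w = np.exp(-0.5 * u2) / h ** (d + 2)
        x_new = w @ data / w.sum()
        if np.linalg.norm(x_new - x) < tol:
            return x_new
        x = x_new
    return x

# Hypothetical data: one tight and one diffuse cluster, so that no single
# global bandwidth suits both.
rng = np.random.default_rng(1)
data = np.vstack([
    rng.normal(loc=[0.0, 0.0], scale=0.1, size=(150, 2)),
    rng.normal(loc=[5.0, 5.0], scale=1.0, size=(150, 2)),
])
h = knn_bandwidths(data, k=10)
mode_tight = variable_bandwidth_mean_shift([0.1, 0.1], data, h)
```

With this rule the samples in the tight cluster receive small bandwidths and those in the diffuse cluster receive large ones, which is the kind of local adaptation the abstract motivates.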
An algorithm for data-driven bandwidth selection
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2003
Cited by 86 (7 self)
The analysis of a feature space that exhibits multiscale patterns often requires kernel estimation techniques with locally adaptive bandwidths, such as the variable-bandwidth mean shift. Proper selection of the kernel bandwidth is, however, a critical step for superior space analysis and partitioning. This paper presents a mean shift-based approach for local bandwidth selection in the multimodal, multivariate case. Our method is based on a fundamental property of normal distributions regarding the bias of the normalized density gradient. We demonstrate that, within the large sample approximation, the local covariance is estimated by the matrix that maximizes the magnitude of the normalized mean shift vector. Using this property, we develop a reliable algorithm which takes into account the stability of local bandwidth estimates across scales. The validity of our theoretical results is proven in various space partitioning experiments involving the variable-bandwidth mean shift.
Index Terms: Variable-bandwidth mean shift, bandwidth selection, multiscale analysis, Jensen-Shannon divergence, feature space.
Bayesian Analysis of Mixture Models with an Unknown Number of Components - an alternative to reversible jump methods
1998
Cited by 72 (0 self)
Richardson and Green (1997) present a method of performing a Bayesian analysis of data from a finite mixture distribution with an unknown number of components. Their method is a Markov Chain Monte Carlo (MCMC) approach, which makes use of the "reversible jump" methodology described by Green (1995). We describe an alternative MCMC method which views the parameters of the model as a (marked) point process, extending methods suggested by Ripley (1977) to create a Markov birth-death process with an appropriate stationary distribution. Our method is easy to implement, even in the case of data in more than one dimension, and we illustrate it on both univariate and bivariate data.
Keywords: Bayesian analysis, Birth-death process, Markov process, MCMC, Mixture model, Model Choice, Reversible Jump, Spatial point process
1 Introduction
Finite mixture models are typically used to model data where each observation is assumed to have arisen from one of k groups, each group being suitably modelle...
Local polynomial kernel regression for generalized linear models and quasi-likelihood functions
Journal of the American Statistical Association, 90, 1995
Cited by 65 (7 self)
were introduced as a means of extending the techniques of ordinary parametric regression to several commonly used regression models arising from non-normal likelihoods. Typically these models have a variance that depends on the mean function. However, in many cases the likelihood is unknown, but the relationship between mean and variance can be specified. This has led to the consideration of quasi-likelihood methods, where the conditional log-likelihood is replaced by a quasi-likelihood function. In this article we investigate the extension of the nonparametric regression technique of local polynomial fitting with a kernel weight to these more general contexts. In the ordinary regression case local polynomial fitting has been seen to possess several appealing features in terms of intuitive and mathematical simplicity. One noteworthy feature is the better performance near the boundaries compared to the traditional kernel regression estimators. These properties are shown to carry over to the generalized linear model and quasi-likelihood model. The end result is a class of kernel-type estimators for smoothing in quasi-likelihood models. These estimators can be viewed as a straightforward generalization of the usual parametric estimators. In addition, their simple asymptotic distributions allow for simple interpretation
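The boundary behavior mentioned in this abstract can be seen in the ordinary regression case with a small sketch. It covers only least-squares local linear fitting, not the quasi-likelihood extension the article studies; the Gaussian kernel weight, the function names, and the noise-free linear test data are assumptions chosen for illustration.

```python
import numpy as np

def kernel_weights(x0, x, h):
    """Gaussian kernel weights centered at x0 (the kernel is an assumption)."""
    u = (x - x0) / h
    return np.exp(-0.5 * u ** 2)

def nadaraya_watson(x0, x, y, h):
    """Traditional kernel regression: a locally weighted average of y."""
    w = kernel_weights(x0, x, h)
    return w @ y / w.sum()

def local_linear(x0, x, y, h):
    """Local linear fit: weighted least squares on (1, x - x0) around x0."""
    w = kernel_weights(x0, x, h)
    X = np.column_stack([np.ones_like(x), x - x0])
    WX = X * w[:, None]
    # Solve the weighted normal equations (X' W X) beta = X' W y.
    beta = np.linalg.solve(X.T @ WX, WX.T @ y)
    # The intercept is the fitted value at x0.
    return beta[0]

# Noise-free linear data on [0, 1]; x0 = 0 is the left boundary.
x = np.linspace(0.0, 1.0, 50)
y = 2.0 + 3.0 * x
nw_at_boundary = nadaraya_watson(0.0, x, y, h=0.2)
ll_at_boundary = local_linear(0.0, x, y, h=0.2)
```

At the left boundary all neighbors lie to the right, so the weighted average is pulled above the true value 2, whereas the local linear fit reproduces a linear trend exactly up to rounding.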
Bandwidth Selection in Kernel Density Estimation: A Review
CORE and Institut de Statistique
Cited by 53 (1 self)
Although nonparametric kernel density estimation is nowadays a standard technique in exploratory data analysis, there is still a big dispute on how to assess the quality of the estimate and which choice of bandwidth is optimal. The main argument is on whether one should use the Integrated Squared Error or the Mean Integrated Squared Error to define the optimal bandwidth. In recent years a lot of research has been done to develop bandwidth selection methods which try to estimate the optimal bandwidth obtained by either of these error criteria. This paper summarizes the most important arguments for each criterion and gives an overview of the existing bandwidth selection methods. We also summarize the small sample behaviour of these methods as assessed in several Monte Carlo studies. These Monte Carlo studies are all restricted to very small sample sizes due to the fact that the numerical effort of estimating the optimal bandwidth by any of these bandwidth selection methods is proporti...
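For reference, the two criteria named in this abstract are usually written as follows in the univariate case. The notation (sample X_1, ..., X_n, kernel K, bandwidth h, true density f) is assumed here for illustration and does not appear in the excerpt.

```latex
\hat f_h(x) = \frac{1}{nh}\sum_{i=1}^{n} K\!\left(\frac{x - X_i}{h}\right), \qquad
\mathrm{ISE}(h) = \int \bigl(\hat f_h(x) - f(x)\bigr)^2 \, dx, \qquad
\mathrm{MISE}(h) = \mathbb{E}\bigl[\mathrm{ISE}(h)\bigr].
```

Under these definitions the ISE-optimal bandwidth depends on the particular sample at hand, while the MISE-optimal bandwidth averages over all samples of size n, which is the distinction behind the dispute the abstract describes.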