The Colonial Origins of Comparative Development: An Empirical Analysis
 AMERICAN ECONOMIC REVIEW
, 2002
Cited by 1585 (38 self)
We exploit differences in early colonial experience to estimate the effect of institutions on economic performance. Our argument is that Europeans adopted very different colonization policies in different colonies, with different associated institutions. The choice of colonization strategy was, at least in part, determined by the feasibility of whether Europeans could settle in the colony. In places where Europeans faced high mortality rates, they could not settle and they were more likely to set up worse (extractive) institutions. These early institutions persisted to the present. We document these hypotheses in the data. Exploiting differences in mortality rates faced by soldiers, bishops and sailors in the colonies during the 18th and 19th centuries as an instrument for current institutions, we estimate large effects of institutions on income per capita. Our estimates imply that a change from the worst (Zaire) to the best (US or New Zealand) institutions in our sample would be associated with a fivefold increase in income per capita.
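The instrumental-variable logic in this abstract can be illustrated with a small simulation. The sketch below is not the paper's data or specification: the data are synthetic, the coefficients (a true effect of 2.0, a confounder weight of -3.0) are arbitrary assumptions, and the names `simulate`, `ols_slope` and `iv_slope` are hypothetical. Here `z` stands in for the mortality instrument, `x` for institutions, `y` for log income, and `c` for an unobserved confounder that biases ordinary least squares.

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    return sum((u - ma) * (v - mb) for u, v in zip(a, b)) / (n - 1)


def ols_slope(x, y):
    """Slope from regressing y on x; biased when x is correlated with the error."""
    return cov(x, y) / cov(x, x)


def iv_slope(z, x, y):
    """Single-instrument IV (Wald) estimator: cov(z, y) / cov(z, x)."""
    return cov(z, y) / cov(z, x)


def simulate(n=5000, seed=0):
    """Synthetic data only. z lowers x; c affects both x and y; true effect of x on y is 2.0."""
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        zi = rng.gauss(0, 1)   # instrument (stand-in for mortality)
        ci = rng.gauss(0, 1)   # unobserved confounder
        xi = -zi + ci + rng.gauss(0, 1)          # higher z means lower x
        yi = 2.0 * xi - 3.0 * ci + rng.gauss(0, 1)
        z.append(zi)
        x.append(xi)
        y.append(yi)
    return z, x, y


z, x, y = simulate()
print("OLS slope:", ols_slope(x, y))    # pulled away from 2.0 by the confounder
print("IV slope: ", iv_slope(z, x, y))  # close to the assumed true effect of 2.0
```

The instrument works in this toy setting only because `z` is constructed to affect `y` solely through `x`; whether that exclusion restriction holds for real data is exactly what an empirical paper has to argue.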
A Fast Algorithm for the Minimum Covariance Determinant Estimator
 Technometrics
, 1998
Cited by 334 (14 self)
The minimum covariance determinant (MCD) method of Rousseeuw (1984) is a highly robust estimator of multivariate location and scatter. Its objective is to find h observations (out of n) whose covariance matrix has the lowest determinant. Until now applications of the MCD were hampered by the computation time of existing algorithms, which were limited to a few hundred objects in a few dimensions. We discuss two important applications of larger size: one about a production process at Philips with n = 677 objects and p = 9 variables, and a data set from astronomy with n = 137,256 objects and p = 27 variables. To deal with such problems we have developed a new algorithm for the MCD, called FASTMCD. The basic ideas are an inequality involving order statistics and determinants, and techniques which we call 'selective iteration' and 'nested extensions'. For small data sets FASTMCD typically finds the exact MCD, whereas for larger data sets it gives more accurate results than existing algori...
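The MCD objective stated above (find h of n points whose covariance has the smallest determinant) can be sketched for two-dimensional data using a concentration step: refit the mean and covariance on the current subset, then keep the h points with the smallest Mahalanobis distances, which cannot increase the determinant. This is a simplified illustration, not the FASTMCD implementation. It omits 'selective iteration' and 'nested extensions', starting from random three-point subsets is an assumption of this sketch, and all function names are hypothetical.

```python
import random


def mean_cov(points):
    """Mean and 2x2 sample covariance, returned as (sxx, sxy, syy), of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def det2(cov):
    """Determinant of the symmetric 2x2 matrix [[sxx, sxy], [sxy, syy]]."""
    sxx, sxy, syy = cov
    return sxx * syy - sxy * sxy


def mahalanobis_sq(p, mean, cov):
    """Squared Mahalanobis distance via the closed-form 2x2 inverse (needs det > 0)."""
    sxx, sxy, syy = cov
    dx, dy = p[0] - mean[0], p[1] - mean[1]
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det2(cov)


def mcd_c_step(data, subset, h):
    """Fit on `subset`, then return the indices of the h points closest to that fit."""
    mean, cov = mean_cov([data[i] for i in subset])
    order = sorted(range(len(data)),
                   key=lambda i: mahalanobis_sq(data[i], mean, cov))
    return sorted(order[:h])


def mcd_sketch(data, h, n_starts=20, max_steps=100, seed=0):
    """Run concentration steps from several random starts; keep the smallest determinant."""
    rng = random.Random(seed)
    best = None
    for _start in range(n_starts):
        # Assumption of this sketch: start from a random 3-point subset.
        subset = mcd_c_step(data, rng.sample(range(len(data)), 3), h)
        for _step in range(max_steps):
            new_subset = mcd_c_step(data, subset, h)
            if new_subset == subset:   # converged: the subset reproduces itself
                break
            subset = new_subset
        mean, cov = mean_cov([data[i] for i in subset])
        if best is None or det2(cov) < best[0]:
            best = (det2(cov), mean, cov, subset)
    return best
```

Calling `mcd_sketch(data, h)` on a list of `(x, y)` tuples returns the determinant, mean, covariance and index subset of the best start. The closed-form inverse keeps the sketch dependency-free but limits it to p = 2 and to subsets that are not collinear.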
Robust multiresolution estimation of parametric motion models
 Journal of Visual Communication and Image Representation
, 1995
Cited by 327 (55 self)
This paper describes a method to estimate parametric motion models. Motivations for the use of such models are on one hand their efficiency, which has been demonstrated in numerous contexts such as estimation, segmentation, tracking and interpretation of motion, and on the other hand, their low computational cost compared to optical flow estimation. However, it is important to have the best accuracy for the estimated parameters, and to take into account the problem of multiple motion. We have therefore developed two robust estimators in a multiresolution framework. Numerical results support this approach, as validated by the use of these algorithms on complex sequences.
Robust parameter estimation in computer vision
 SIAM Review
, 1999
Cited by 162 (10 self)
Abstract. Estimation techniques in computer vision applications must estimate accurate model parameters despite small-scale noise in the data, occasional large-scale measurement errors (outliers), and measurements from multiple populations in the same data set. Increasingly, robust estimation techniques, some borrowed from the statistics literature and others described in the computer vision literature, have been used in solving these parameter estimation problems. Ideally, these techniques should effectively ignore the outliers and measurements from other populations, treating them as outliers, when estimating the parameters of a single population. Two frequently used techniques are least-median of
Robust clustering methods: a unified view
 IEEE Transactions on Fuzzy Systems
, 1997
Cited by 111 (8 self)
Abstract: Clustering methods need to be robust if they are to be useful in practice. In this paper, we analyze several popular robust clustering methods and show that they have much in common. We also establish a connection between fuzzy set theory and robust statistics and point out the similarities between robust clustering methods and statistical methods such as the weighted least-squares (LS) technique, the M estimator, the minimum volume ellipsoid (MVE) algorithm, cooperative robust estimation (CRE), minimization of probability of randomness (MINPRAN), and the epsilon contamination model. By gleaning the common principles upon which the methods proposed in the literature are based, we arrive at a unified view of robust clustering methods. We define several general concepts that are useful in robust clustering, state the robust clustering problem in terms of the defined concepts, and propose generic algorithms and guidelines for clustering noisy data. We also discuss why the generalized Hough transform is a suboptimal solution to the robust clustering problem. Index Terms: Clustering validity, fuzzy clustering, robust methods.
Computing LTS Regression for Large Data Sets
 Institute of Mathematical Statistics Bulletin
, 1999
Cited by 90 (2 self)
Least trimmed squares (LTS) regression is based on the subset of h cases (out of n) whose least squares fit possesses the smallest sum of squared residuals. The coverage h may be set between n/2 and n. The LTS method was proposed by Rousseeuw (1984, p. 876) as a highly robust regression estimator, with breakdown value (n - h)/n. It turned out that the computation time of existing LTS algorithms grew too fast with the size of the data set, precluding their use for data mining. Therefore we develop a new algorithm called FASTLTS. The basic ideas are an inequality involving order statistics and sums of squared residuals, and techniques which we call 'selective iteration' and 'nested extensions'. We also use an intercept adjustment technique to improve the precision. For small data sets FASTLTS typically finds the exact LTS, whereas for larger data sets it gives more accurate results than existing algorithms for LTS and is faster by orders of magnitude. Moreover, FASTLTS runs faster than all programs for least median of squares (LMS). The new algorithm makes the LTS method available as a tool for robust regression in large data sets, e.g. in a data mining context.
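The LTS criterion described above (fit least squares on the h cases that give the smallest sum of squared residuals) can be sketched for a single predictor with an intercept. The sketch alternates between ranking all cases by squared residual under the current line and refitting on the h best cases, which cannot increase the trimmed sum of squares. It is a simplified illustration, not the FASTLTS implementation. It omits 'selective iteration', 'nested extensions' and the intercept adjustment, starting each run from a line through two random points is an assumption of this sketch, and the function names are hypothetical.

```python
import random


def ols_fit(xs, ys):
    """Ordinary least squares for y = intercept + slope * x; returns (intercept, slope)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def lts_c_step(xs, ys, fit, h):
    """Keep the h cases with the smallest squared residuals under `fit`, then refit.

    Returns the new fit and its sum of squared residuals over the kept cases.
    """
    intercept, slope = fit
    order = sorted(range(len(xs)),
                   key=lambda i: (ys[i] - intercept - slope * xs[i]) ** 2)
    keep = order[:h]
    new_fit = ols_fit([xs[i] for i in keep], [ys[i] for i in keep])
    a, b = new_fit
    objective = sum((ys[i] - a - b * xs[i]) ** 2 for i in keep)
    return new_fit, objective


def lts_sketch(xs, ys, h, n_starts=20, max_steps=100, seed=0):
    """Run the steps above from several random starts; keep the smallest objective."""
    rng = random.Random(seed)
    best = None
    for _start in range(n_starts):
        # Assumption of this sketch: start from the line through two random cases.
        i, j = rng.sample(range(len(xs)), 2)
        fit = ols_fit([xs[i], xs[j]], [ys[i], ys[j]])
        objective = float("inf")
        for _step in range(max_steps):
            new_fit, new_objective = lts_c_step(xs, ys, fit, h)
            if new_objective >= objective:   # no further improvement
                break
            fit, objective = new_fit, new_objective
        if best is None or objective < best[1]:
            best = (fit, objective)
    return best
```

`lts_sketch(xs, ys, h)` returns `((intercept, slope), objective)`. Choosing h closer to n/2 tolerates more outliers, while h closer to n uses more of the data, matching the coverage range given in the abstract.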
The dual-bootstrap iterative closest point algorithm with application to retinal image registration
 IEEE Trans. Med. Img
, 2003
Cited by 85 (19 self)
Abstract: Motivated by the problem of retinal image registration, this paper introduces and analyzes a new registration algorithm called Dual-Bootstrap Iterative Closest Point (Dual-Bootstrap ICP). The approach is to start from one or more initial, low-order estimates that are only accurate in small image regions, called bootstrap regions. In each bootstrap region, the algorithm iteratively: 1) refines the transformation estimate using constraints only from within the bootstrap region; 2) expands the bootstrap region; and 3) tests to see if a higher order transformation model can be used, stopping when the region expands to cover the overlap between images. Steps 1) and 3), the bootstrap steps, are governed by the covariance matrix of the estimated transformation. Estimation refinement [Step 2)] uses a novel robust version of the ICP algorithm. In registering retinal image pairs, Dual-Bootstrap ICP is initialized by automatically matching individual vascular landmarks, and it aligns images based on detected blood vessel centerlines. The resulting quadratic transformations are accurate to less than a pixel. On tests involving approximately 6000 image pairs, it successfully registered 99.5% of the pairs containing at least one common landmark, and 100% of the pairs containing at least one common landmark and at least 35% image overlap. Index Terms: Iterative closest point, medical imaging, registration, retinal imaging, robust estimation.
ROBPCA: a New Approach to Robust Principal Component Analysis
, 2003
Cited by 77 (14 self)
In this paper we introduce a new method for robust principal component analysis. Classical PCA is based on the empirical covariance matrix of the data and hence it is highly sensitive to outlying observations. In the past, two robust approaches have been developed. The first is based on the eigenvectors of a robust scatter matrix such as the MCD or an S-estimator, and is limited to relatively low-dimensional data. The second approach is based on projection pursuit and can handle high-dimensional data. Here, we propose the ROBPCA approach which combines projection pursuit ideas with robust scatter matrix estimation. It yields more accurate estimates at noncontaminated data sets and more robust estimates at contaminated data. ROBPCA can be computed fast, and is able to detect exact fit situations. As a byproduct, ROBPCA produces a diagnostic plot which displays and classifies the outliers. The algorithm is applied to several data sets from chemometrics and engineering.
Scanning Physical Interaction Behavior of 3D Objects
 In Computer Graphics (ACM SIGGRAPH 01 Conference Proceedings)
, 2001
Cited by 68 (19 self)
[Figure caption: (a) Real toy tiger. By design, it is soft to touch and exhibits significant deformation behavior. (b) Deformable model of tiger scanned by our system, with haptic interaction.] We describe a system for constructing computer models of several aspects of physical interaction behavior, by scanning the response of real objects. The behaviors we can successfully scan and model include deformation response, contact textures for interaction with force-feedback, and contact sounds. The system we describe uses a highly automated robotic facility that can scan behavior models of whole objects. We provide a comprehensive view of the modeling process, including selection of model structure, measurement, estimation, and rendering at interactive rates. The results are demonstrated with two examples: a soft stuffed toy which has significant deformation behavior, and a hard clay pot which has significant contact textures and sounds. The results described here make it possible to quickly construct physical interaction models of objects for applications in games, animation, and e-commerce.