Bagging Predictors
 Machine Learning
, 1996
"... Bagging predictors is a method for generating multiple versions of a predictor and using these to get an aggregated predictor. The aggregation averages over the versions when predicting a numerical outcome and does a plurality vote when predicting a class. The multiple versions are formed by making ..."
Abstract

Cited by 2479 (1 self)
Bagging predictors is a method for generating multiple versions of a predictor and using these to get an aggregated predictor. The aggregation averages over the versions when predicting a numerical outcome and does a plurality vote when predicting a class. The multiple versions are formed by making bootstrap replicates of the learning set and using these as new learning sets. Tests on real and simulated data sets using classification and regression trees and subset selection in linear regression show that bagging can give substantial gains in accuracy. The vital element is the instability of the prediction method. If perturbing the learning set can cause significant changes in the predictor constructed, then bagging can improve accuracy. 1. Introduction A learning set L consists of data {(y_n, x_n), n = 1, ..., N} where the y's are either class labels or a numerical response. We have a procedure for using this learning set to form a predictor φ(x, L): if the input is x we ...
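The bootstrap-and-vote scheme the abstract describes can be sketched in a few lines of Python. This is an illustrative toy, not Breiman's code: the `train` base learner (a 1-nearest-neighbour rule) and all names here are invented for the example.

```python
import random

def bagging_predict(learning_set, train, x, n_boot=25, seed=0):
    """Bagged classification: plurality vote over predictors trained on
    bootstrap replicates of the learning set (names are illustrative)."""
    rng = random.Random(seed)
    n = len(learning_set)
    votes = {}
    for _ in range(n_boot):
        # Bootstrap replicate: sample n cases with replacement.
        replicate = [learning_set[rng.randrange(n)] for _ in range(n)]
        y_hat = train(replicate)(x)
        votes[y_hat] = votes.get(y_hat, 0) + 1
    return max(votes, key=votes.get)  # plurality vote

# A deliberately "unstable" base learner: 1-nearest-neighbour on (y, x) pairs.
def train(ls):
    return lambda x: min(ls, key=lambda p: abs(p[1] - x))[0]

data = [(0, 0.1), (0, 0.3), (1, 0.7), (1, 0.9)]
label = bagging_predict(data, train, 0.8)
```

For a numerical response the same loop would average the bootstrap predictions instead of voting, matching the abstract's description.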
Partial Constraint Satisfaction
, 1992
"... . A constraint satisfaction problem involves finding values for variables subject to constraints on which combinations of values are allowed. In some cases it may be impossible or impractical to solve these problems completely. We may seek to partially solve the problem, in particular by satisfying ..."
Abstract

Cited by 427 (23 self)
A constraint satisfaction problem involves finding values for variables subject to constraints on which combinations of values are allowed. In some cases it may be impossible or impractical to solve these problems completely. We may seek to partially solve the problem, in particular by satisfying a maximal number of constraints. Standard backtracking and local consistency techniques for solving constraint satisfaction problems can be adapted to cope with, and take advantage of, the differences between partial and complete constraint satisfaction. Extensive experimentation on maximal satisfaction problems illuminates the relative and absolute effectiveness of these methods. A general model of partial constraint satisfaction is proposed. 1 Introduction Constraint satisfaction involves finding values for problem variables subject to constraints on acceptable combinations of values. Constraint satisfaction has wide application in artificial intelligence, in areas ranging from temporal r...
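A minimal illustration of "satisfying a maximal number of constraints": score every assignment exhaustively and keep the best. The paper's methods are adapted backtracking and local-consistency techniques, not brute force; the variables, domains, and predicates below are invented for the sketch.

```python
from itertools import product

def max_csp(domains, constraints):
    """Exhaustive partial constraint satisfaction: return the assignment
    satisfying the most constraints, plus its score.
    domains: one list of allowed values per variable.
    constraints: list of (variable_indices, predicate) pairs."""
    best, best_score = None, -1
    for assignment in product(*domains):
        score = sum(pred(*(assignment[i] for i in idx))
                    for idx, pred in constraints)
        if score > best_score:
            best, best_score = assignment, score
    return best, best_score

# Three binary variables that must be pairwise different: impossible to
# satisfy completely, so the best partial solution meets 2 of 3 constraints.
cons = [((0, 1), lambda a, b: a != b),
        ((1, 2), lambda a, b: a != b),
        ((0, 2), lambda a, b: a != b)]
sol, score = max_csp([[0, 1]] * 3, cons)
```

A branch-and-bound version would prune any partial assignment whose violations already exceed the best score found so far, which is the adaptation of backtracking the abstract alludes to.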
Exploratory projection pursuit
 Journal of the American Statistical Association
, 1987
"... Exploratory projection pursuit is concerned with finding relatively highly revealing lower dimensional projections of high dimensional data. The intent is to discover views of the multivariate data set that exhibit nonlinear effectsclustering, concentrations near nonlinear manifolds that are not c ..."
Abstract

Cited by 242 (0 self)
Exploratory projection pursuit is concerned with finding relatively highly revealing lower dimensional projections of high dimensional data. The intent is to discover views of the multivariate data set that exhibit nonlinear effects (clustering, concentrations near nonlinear manifolds) that are not captured by the linear correlation structure. This paper presents a new algorithm for this purpose that has both statistical and computational advantages over previous methods. A connection to density estimation is established. Examples are presented and issues related to practical application are discussed.
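The core idea of scoring projections for "interestingness" can be sketched as follows. The index used here is absolute excess kurtosis, a crude stand-in for the density-based index the paper develops, and the two-cluster data set is synthetic.

```python
import math, random

def projection_index(z):
    """Crude non-Gaussianity index: absolute excess kurtosis of the
    standardized projected data (a stand-in for the paper's index)."""
    n = len(z)
    m = sum(z) / n
    v = sum((x - m) ** 2 for x in z) / n
    s = [(x - m) / math.sqrt(v) for x in z]
    return abs(sum(x ** 4 for x in s) / n - 3.0)

def best_direction(data, steps=180):
    """Scan 1-D projections of 2-D points for the most interesting one."""
    best_theta, best_score = 0.0, -1.0
    for k in range(steps):
        t = math.pi * k / steps
        proj = [x * math.cos(t) + y * math.sin(t) for x, y in data]
        score = projection_index(proj)
        if score > best_score:
            best_theta, best_score = t, score
    return best_theta

rng = random.Random(1)
# Two clusters separated along the x-axis; y is pure Gaussian noise, so the
# interesting (bimodal) structure is invisible to projections onto y.
data = [(rng.gauss(-3 if i % 2 else 3, 0.5), rng.gauss(0, 1)) for i in range(400)]
theta = best_direction(data)
```

Real projection pursuit optimizes over directions with gradient methods rather than an angle grid, and Friedman's index is computed from an estimated density of the projected data, but the structure of the search is the same.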
Institutions Rule: The Primacy of Institutions over Geography and Integration in Economic Development
 Free University of Berlin
, 2004
"... We estimate the respective contributions of institutions, geography, and trade in determining income levels around the world, using recently developed instrumental variables for institutions and trade. Our results indicate that the quality of institutions “trumps ” everything else. Once institutions ..."
Abstract

Cited by 227 (17 self)
We estimate the respective contributions of institutions, geography, and trade in determining income levels around the world, using recently developed instrumental variables for institutions and trade. Our results indicate that the quality of institutions “trumps” everything else. Once institutions are controlled for, conventional measures of geography have at best weak direct effects on incomes, although they have a strong indirect effect by influencing the quality of institutions. Similarly, once institutions are controlled for, trade is almost always insignificant, and often enters the income equation with the “wrong” (i.e., negative) sign. We relate our results to recent literature and, where differences exist, trace their origins to choices on samples, specification, and instrumentation. The views expressed in this paper are the authors’ own and not of the institutions with which they are affiliated. We thank three referees, Chad Jones, James Robinson, Will Masters, and participants at the Harvard-MIT development seminar, the joint IMF-World Bank seminar, and the Harvard econometrics workshop for their comments, Daron Acemoglu for helpful conversations, and Simon Johnson for providing us with his data. Dani Rodrik gratefully acknowledges support from the Carnegie Corporation of New York. “Commerce and manufactures can seldom flourish long in any state which does not enjoy a regular administration of justice, in which the people do not feel themselves secure in the possession of their property, in which the faith of contracts is not supported by law, and in which the authority of the state is not supposed to be regularly employed in enforcing the payment of debts from all those who are able to pay. Commerce and manufactures, in short, can seldom flourish in any state in which there is not a certain degree of confidence in the justice of government.” Adam Smith, Wealth of Nations.
Parameter Estimation Techniques: A Tutorial with Application to Conic Fitting
 Image and Vision Computing
, 1997
"... : Almost all problems in computer vision are related in one form or another to the problem of estimating parameters from noisy data. In this tutorial, we present what is probably the most commonly used techniques for parameter estimation. These include linear leastsquares (pseudoinverse and eigen ..."
Abstract

Cited by 196 (6 self)
Almost all problems in computer vision are related in one form or another to the problem of estimating parameters from noisy data. In this tutorial, we present what are probably the most commonly used techniques for parameter estimation. These include linear least-squares (pseudo-inverse and eigen analysis); orthogonal least-squares; gradient-weighted least-squares; bias-corrected renormalization; Kalman filtering; and robust techniques (clustering, regression diagnostics, M-estimators, least median of squares). Particular attention has been devoted to discussions about the choice of appropriate minimization criteria and the robustness of the different techniques. Their application to conic fitting is described. Keywords: Parameter estimation, Least-squares, Bias correction, Kalman filtering, Robust regression
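The pseudo-inverse (normal-equations) least-squares route listed first in the abstract can be sketched concretely. For brevity this fits a line y = a*x + b rather than a conic; it is a hypothetical illustration, not the tutorial's code.

```python
def lstsq_line(points):
    """Linear least-squares fit of y = a*x + b by solving the 2x2 normal
    equations in closed form (the pseudo-inverse route, for a line)."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    det = n * sxx - sx * sx          # determinant of the normal matrix
    a = (n * sxy - sx * sy) / det
    b = (sxx * sy - sx * sxy) / det
    return a, b

# Points lying exactly on y = 2x + 1, so the fit recovers a = 2, b = 1.
a, b = lstsq_line([(0, 1), (1, 3), (2, 5), (3, 7)])
```

Conic fitting replaces the 2x2 system with a larger one in the conic's coefficients, and the tutorial's later techniques (orthogonal least-squares, M-estimators) change the criterion being minimized rather than this basic machinery.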
Network Externalities in Microcomputer Software: An Econometric Analysis of the Spreadsheet Market
 Management Science
, 1996
"... Because of network externalities, the success of a software product may depend in part on the size of its installed base and its conformance to industry standards. This research builds a hedonic model to determine the effects of network externalities, standards, intrinsic features and a time trend o ..."
Abstract

Cited by 138 (4 self)
Because of network externalities, the success of a software product may depend in part on the size of its installed base and its conformance to industry standards. This research builds a hedonic model to determine the effects of network externalities, standards, intrinsic features and a time trend on microcomputer spreadsheet software prices. When data for a sample of products during the 1987–1992 time period were analyzed using this model, four main results emerged: 1) Network externalities, as measured by the size of a product's installed base, significantly increased the price of spreadsheet products: a one percent increase in a product's installed base was associated with a 0.75% increase in its price. 2) Products which adhered to the dominant standard, the Lotus menu tree interface, commanded prices which were higher by an average of 46%. 3) Although nominal prices increased slightly during this time period, quality-adjusted prices declined by an average of 16% per year. 4) The hed...
Mapreduce for machine learning on multicore
 In Proceedings of NIPS
, 2007
"... We are at the beginning of the multicore era. Computers will have increasingly many cores (processors), but there is still no good programming framework for these architectures, and thus no simple and unified way for machine learning to take advantage of the potential speed up. In this paper, we dev ..."
Abstract

Cited by 138 (7 self)
We are at the beginning of the multicore era. Computers will have increasingly many cores (processors), but there is still no good programming framework for these architectures, and thus no simple and unified way for machine learning to take advantage of the potential speed-up. In this paper, we develop a broadly applicable parallel programming method, one that is easily applied to many different learning algorithms. Our work is in distinct contrast to the tradition in machine learning of designing (often ingenious) ways to speed up a single algorithm at a time. Specifically, we show that algorithms that fit the Statistical Query model [15] can be written in a certain “summation form,” which allows them to be easily parallelized on multicore computers. We adapt Google’s MapReduce [7] paradigm to demonstrate this parallel speed-up technique on a variety of learning algorithms including locally weighted linear regression (LWLR), k-means, logistic regression ...
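The "summation form" idea can be illustrated with 1-D linear regression: each data chunk (standing in for a core) maps to additive sufficient statistics, and a reduce step sums them, reproducing the single-pass answer exactly. This is a sketch over plain Python lists, not Google's MapReduce framework, and the chunking is assumed for the example.

```python
from functools import reduce

def map_stats(chunk):
    """Map step: per-chunk sufficient statistics for 1-D linear regression
    in summation form: (count, sum x, sum y, sum x^2, sum x*y)."""
    return (len(chunk),
            sum(x for x, _ in chunk),
            sum(y for _, y in chunk),
            sum(x * x for x, _ in chunk),
            sum(x * y for x, y in chunk))

def reduce_stats(s, t):
    """Reduce step: the statistics are sums, hence additive across chunks."""
    return tuple(u + v for u, v in zip(s, t))

def fit(chunks):
    """Combine per-chunk statistics and solve for slope a and intercept b."""
    n, sx, sy, sxx, sxy = reduce(reduce_stats, map(map_stats, chunks))
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - a * sx) / n
    return a, b

# Data on y = 2x + 1 split across two "cores"; the result matches one pass.
data = [(x, 2 * x + 1) for x in range(10)]
a, b = fit([data[:5], data[5:]])
```

Because only the fixed-size statistics cross chunk boundaries, the map steps can run on separate cores with no shared state, which is the source of the parallel speed-up the paper measures.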
Nonparametric regression using Bayesian variable selection
 Journal of Econometrics
, 1996
"... This paper estimates an additive model semiparametrically, while automatically selecting the significant independent variables and the app~opriatc power transformation of the dependent variable. The nonlinear variables arc modeled as regression splincs, with significant knots selected fiom a large ..."
Abstract

Cited by 136 (10 self)
This paper estimates an additive model semiparametrically, while automatically selecting the significant independent variables and the appropriate power transformation of the dependent variable. The nonlinear variables are modeled as regression splines, with significant knots selected from a large number of candidate knots. The estimation is made robust by modeling the errors as a mixture of normals. A Bayesian approach is used to select the significant knots and the power transformation, and to identify outliers, using the Gibbs sampler to carry out the computation. Empirical evidence is given that the sampler works well on both simulated and real examples and that in the univariate case it compares favorably with a kernel-weighted local linear smoother. The variable selection algorithm in the paper is substantially faster than previous Bayesian variable selection algorithms. Keywords: Additive model, Power transformation, Robust estimation
Incremental Online Learning in High Dimensions
 Neural Computation
, 2005
"... Locally weighted projection regression (LWPR) is a new algorithm for incremental nonlinear function approximation in high dimensional spaces with redundant and irrelevant input dimensions. At its core, it employs nonparametric regression with locally linear models. In order to stay computationally e ..."
Abstract

Cited by 104 (15 self)
Locally weighted projection regression (LWPR) is a new algorithm for incremental nonlinear function approximation in high dimensional spaces with redundant and irrelevant input dimensions. At its core, it employs nonparametric regression with locally linear models. In order to stay computationally efficient and numerically robust, each local model performs the regression analysis with a small number of univariate regressions in selected directions in input space, in the spirit of partial least squares regression. We discuss when and how local learning techniques can successfully work in high dimensional spaces and review the various techniques for local dimensionality reduction before finally deriving the LWPR algorithm. The properties of LWPR are that it i) learns rapidly with second order learning methods based on incremental training, ii) uses statistically sound stochastic leave-one-out cross validation for learning without the need to memorize training data, iii) adjusts its weighting kernels based only on local information in order to minimize the danger of negative interference of incremental learning, iv) has a computational complexity that is linear in the number of inputs, and v) can deal with a large number of (possibly redundant) inputs, as shown in various empirical evaluations with up to 90-dimensional data sets. For a probabilistic interpretation, predictive variance and confidence intervals are derived. To our knowledge, LWPR is the first truly incremental spatially localized learning method that can successfully and efficiently operate in very high dimensional spaces.
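The locally linear building block of LWPR can be sketched as plain locally weighted (kernel) linear regression at a single query point: each training point is weighted by a Gaussian kernel around the query, and a weighted least-squares line is solved in closed form. This shows only the core idea, not the incremental, projection-based LWPR algorithm; the bandwidth and data are illustrative.

```python
import math

def lwlr_predict(data, x_query, bandwidth=1.0):
    """Locally weighted linear regression at one query point: Gaussian
    kernel weights, then a closed-form weighted least-squares line."""
    w = [math.exp(-((x - x_query) ** 2) / (2 * bandwidth ** 2))
         for x, _ in data]
    sw = sum(w)
    sx = sum(wi * x for wi, (x, _) in zip(w, data))
    sy = sum(wi * y for wi, (_, y) in zip(w, data))
    sxx = sum(wi * x * x for wi, (x, _) in zip(w, data))
    sxy = sum(wi * x * y for wi, (x, y) in zip(w, data))
    det = sw * sxx - sx * sx
    a = (sw * sxy - sx * sy) / det
    b = (sxx * sy - sx * sxy) / det
    return a * x_query + b  # evaluate the local line at the query

# Noise-free samples of sin(x) on [0, 6.2]; the local line tracks the curve.
data = [(x / 10, math.sin(x / 10)) for x in range(0, 63)]
y_hat = lwlr_predict(data, 1.5, bandwidth=0.3)
```

LWPR replaces this single global query with many local models whose receptive fields adapt online, and replaces the full regression with a few univariate regressions along learned projection directions, which is what keeps its cost linear in the input dimension.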
Who trusts others?
, 2002
"... Both individual experiences and community characteristics influence how much people trust each other. Using individual level data drawn from US localities we find that the strongest factors associated with low trust are: (i) a recent history of traumatic experiences; (ii) belonging to a group that h ..."
Abstract

Cited by 102 (4 self)
Both individual experiences and community characteristics influence how much people trust each other. Using individual-level data drawn from US localities we find that the strongest factors associated with low trust are: (i) a recent history of traumatic experiences; (ii) belonging to a group that historically felt discriminated against, such as minorities (blacks in particular) and, to a lesser extent, women; (iii) being economically unsuccessful in terms of income and education; (iv) living in a racially mixed community and/or in one with a high degree of income disparity. Religious beliefs and ethnic origins do not significantly affect trust. The role of racial cleavages leading to low trust is confirmed when we explicitly account for individual preferences on interracial relationships: within the same community, individuals who express stronger feelings against racial integration trust relatively less the more racially heterogeneous the community is.