Results 1  10
of
687
Locally weighted learning
 ARTIFICIAL INTELLIGENCE REVIEW
, 1997
"... This paper surveys locally weighted learning, a form of lazy learning and memorybased learning, and focuses on locally weighted linear regression. The survey discusses distance functions, smoothing parameters, weighting functions, local model structures, regularization of the estimates and bias, ass ..."
Abstract

Cited by 448 (52 self)
 Add to MetaCart
This paper surveys locally weighted learning, a form of lazy learning and memorybased learning, and focuses on locally weighted linear regression. The survey discusses distance functions, smoothing parameters, weighting functions, local model structures, regularization of the estimates and bias, assessing predictions, handling noisy data and outliers, improving the quality of predictions by tuning t parameters, interference between old and new data, implementing locally weighted learning e ciently, and applications of locally weighted learning. A companion paper surveys how locally weighted learning can be used in robot learning and control.
LOF: Identifying DensityBased Local Outliers
 PROCEEDINGS OF THE 2000 ACM SIGMOD INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA
, 2000
"... For many KDD applications, such as detecting criminal activities in Ecommerce, finding the rare instances or the outliers, can be more interesting than finding the common patterns. Existing work in outlier detection regards being an outlier as a binary property. In this paper, we contend that for m ..."
Abstract

Cited by 295 (8 self)
 Add to MetaCart
For many KDD applications, such as detecting criminal activities in Ecommerce, finding the rare instances or the outliers, can be more interesting than finding the common patterns. Existing work in outlier detection regards being an outlier as a binary property. In this paper, we contend that for many scenarios, it is more meaningful to assign to each object a degree of being an outlier. This degree is called the local outlier factor (LOF) of an object. It is local in that the degree depends on how isolated the object is with respect to the surrounding neighborhood. We give a detailed formal analysis showing that LOF enjoys many desirable properties. Using realworld datasets, we demonstrate that LOF can be used to find outliers which appear to be meaningful, but can otherwise not be identified with existing approaches. Finally, a careful performance evaluation of our algorithm confirms we show that our approach of finding local outliers can be practical.
Survey of clustering data mining techniques
, 2002
"... Accrue Software, Inc. Clustering is a division of data into groups of similar objects. Representing the data by fewer clusters necessarily loses certain fine details, but achieves simplification. It models data by its clusters. Data modeling puts clustering in a historical perspective rooted in math ..."
Abstract

Cited by 247 (0 self)
 Add to MetaCart
Accrue Software, Inc. Clustering is a division of data into groups of similar objects. Representing the data by fewer clusters necessarily loses certain fine details, but achieves simplification. It models data by its clusters. Data modeling puts clustering in a historical perspective rooted in mathematics, statistics, and numerical analysis. From a machine learning perspective clusters correspond to hidden patterns, the search for clusters is unsupervised learning, and the resulting system represents a data concept. From a practical perspective clustering plays an outstanding role in data mining applications such as scientific data exploration, information retrieval and text mining, spatial database applications, Web analysis, CRM, marketing, medical diagnostics, computational biology, and many others. Clustering is the subject of active research in several fields such as statistics, pattern recognition, and machine learning. This survey focuses on clustering in data mining. Data mining adds to clustering the complications of very large datasets with very many attributes of different types. This imposes unique
Summaries of Affymetrix GeneChip probe level data
 Nucleic Acids Res
, 2003
"... High density oligonucleotide array technology is widely used in many areas of biomedical research for quantitative and highly parallel measurements of gene expression. Affymetrix GeneChip arrays are the most popular. In this technology each gene is typically represented by a set of 11±20 pairs of pr ..."
Abstract

Cited by 218 (15 self)
 Add to MetaCart
High density oligonucleotide array technology is widely used in many areas of biomedical research for quantitative and highly parallel measurements of gene expression. Affymetrix GeneChip arrays are the most popular. In this technology each gene is typically represented by a set of 11±20 pairs of probes. In order to obtain expression measures it is necessary to summarize the probe level data. Using two extensive spikein studies and a dilution study, we developed a set of tools for assessing the effectiveness of expression measures. We found that the performance of the current version of the default expression measure provided by Affymetrix Microarray Suite can be signi®cantly improved by the use of probe level summaries derived from empirically motivated statistical models. In particular, improvements in the ability to detect differentially expressed genes are demonstrated.
Tracking multiple independent targets: Evidence for a parallel tracking mechanism
 Spatial Vision
, 1988
"... AbstractThere is considerable evidence that visual attention is concentrated at a single locus in the visual field, and that this locus can be moved independent of eye movements. Two studies are reported which suggest that, while certain aspects of attention require that locations\be scanned serial ..."
Abstract

Cited by 211 (23 self)
 Add to MetaCart
AbstractThere is considerable evidence that visual attention is concentrated at a single locus in the visual field, and that this locus can be moved independent of eye movements. Two studies are reported which suggest that, while certain aspects of attention require that locations\be scanned serially, at least one operation may be carried out in parallel across several independent loci in the visual field. That is the operation of indexing features and tracking their identity. The studies show that: (a) subjects are able to track a subset of up to 5 objects in a field of 10 'identical randomlymoving objects in order to distinguish a change in a target from a change in a distractor; and (b) when the speed and distance parameters of the display are designed so that, on the basis of some very conservative assumptions about the speed of attention movement and encoding times, the predicted performance of a serial scanning and updating algorithm would not exceed about 40 % accuracy, subjects still manage to do the task with 87 % accuracy. These findings are discussed in relation to an earlier, and independently motivated model of featurebindingcalled the FINST modelwhich posits a primitive identity maintenance mechanism that indexes and tracks a limited number ofvisual objects in parallel. These indexes are hypothesized to serve the function of binding visual features prior to subsequent pattern recognition.
SUSAN  A New Approach to Low Level Image Processing
 International Journal of Computer Vision
, 1995
"... This paper describes a new approach to low level image processing; in particular, edge and corner detection and structure preserving noise reduction. ..."
Abstract

Cited by 205 (3 self)
 Add to MetaCart
This paper describes a new approach to low level image processing; in particular, edge and corner detection and structure preserving noise reduction.
Using confidence intervals in withinsubject designs
 Psychonomic Bulletin & Review
, 1994
"... Wolford, and two anonymous reviewers for very useful comments on earlier drafts of the manuscript. Correspondence may be addressed to ..."
Abstract

Cited by 178 (21 self)
 Add to MetaCart
Wolford, and two anonymous reviewers for very useful comments on earlier drafts of the manuscript. Correspondence may be addressed to
Boosting with the L_2Loss: Regression and Classification
, 2001
"... This paper investigates a variant of boosting, L 2 Boost, which is constructed from a functional gradient descent algorithm with the L 2 loss function. Based on an explicit stagewise re tting expression of L 2 Boost, the case of (symmetric) linear weak learners is studied in detail in both regressi ..."
Abstract

Cited by 121 (16 self)
 Add to MetaCart
This paper investigates a variant of boosting, L 2 Boost, which is constructed from a functional gradient descent algorithm with the L 2 loss function. Based on an explicit stagewise re tting expression of L 2 Boost, the case of (symmetric) linear weak learners is studied in detail in both regression and twoclass classification. In particular, with the boosting iteration m working as the smoothing or regularization parameter, a new exponential biasvariance trade off is found with the variance (complexity) term bounded as m tends to infinity. When the weak learner is a smoothing spline, an optimal rate of convergence result holds for both regression and twoclass classification. And this boosted smoothing spline adapts to higher order, unknown smoothness. Moreover, a simple expansion of the 01 loss function is derived to reveal the importance of the decision boundary, bias reduction, and impossibility of an additive biasvariance decomposition in classification. Finally, simulation and real data set results are obtained to demonstrate the attractiveness of L 2 Boost, particularly with a novel componentwise cubic smoothing spline as an effective and practical weak learner.
The earth is round (p < .05
 American Psychologist
, 1994
"... After 4 decades of severe criticism, the ritual of null hypothesis significance testing—mechanical dichotomous decisions around a sacred.05 criterion—still persists. This article reviews the problems with this practice, including its nearuniversal misinterpretation ofp as the probability that Ho is ..."
Abstract

Cited by 113 (0 self)
 Add to MetaCart
After 4 decades of severe criticism, the ritual of null hypothesis significance testing—mechanical dichotomous decisions around a sacred.05 criterion—still persists. This article reviews the problems with this practice, including its nearuniversal misinterpretation ofp as the probability that Ho is false, the misinterpretation that its complement is the probability of successful replication, and the mistaken assumption that if one rejects Ho one thereby affirms the theory that led to the test. Exploratory data analysis and the use of graphic methods, a steady improvement in and a movement toward standardization in measurement, an emphasis on estimating effect sizes using confidence intervals, and the informed use of available statistical methods is suggested. For generalization, psychologists must finally rely, as has been done in all the older sciences,