Results 1–10 of 1,710
Locally weighted learning
 ARTIFICIAL INTELLIGENCE REVIEW, 1997
Abstract

Cited by 594 (53 self)
This paper surveys locally weighted learning, a form of lazy learning and memory-based learning, and focuses on locally weighted linear regression. The survey discusses distance functions, smoothing parameters, weighting functions, local model structures, regularization of the estimates and bias, assessing predictions, handling noisy data and outliers, improving the quality of predictions by tuning fit parameters, interference between old and new data, implementing locally weighted learning efficiently, and applications of locally weighted learning. A companion paper surveys how locally weighted learning can be used in robot learning and control.
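The core technique the abstract names can be illustrated compactly. Below is a minimal one-dimensional sketch of locally weighted linear regression with a Gaussian weighting function and a fixed smoothing parameter h; the function name and kernel choice are illustrative, not taken from the survey.

```python
import math

def lwlr_predict(xs, ys, x0, h=1.0):
    """Locally weighted linear regression at query point x0.

    A Gaussian kernel weights training points by distance to x0,
    then a weighted least-squares line is fit and evaluated at x0.
    """
    w = [math.exp(-(x - x0) ** 2 / (2 * h * h)) for x in xs]
    sw = sum(w)
    swx = sum(wi * x for wi, x in zip(w, xs))
    swy = sum(wi * y for wi, y in zip(w, ys))
    swxx = sum(wi * x * x for wi, x in zip(w, xs))
    swxy = sum(wi * x * y for wi, x, y in zip(w, xs, ys))
    denom = sw * swxx - swx * swx
    b = (sw * swxy - swx * swy) / denom   # local slope
    a = (swy - b * swx) / sw              # local intercept
    return a + b * x0
```

On exactly linear data the local fit recovers the line, so `lwlr_predict([0, 1, 2, 3, 4], [1, 3, 5, 7, 9], 2.5)` returns 6.0; the smoothing parameter h only matters once the data bend away from a single global line.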
LOF: Identifying Density-Based Local Outliers
 PROCEEDINGS OF THE 2000 ACM SIGMOD INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA, 2000
Abstract

Cited by 499 (14 self)
For many KDD applications, such as detecting criminal activities in E-commerce, finding the rare instances or the outliers can be more interesting than finding the common patterns. Existing work in outlier detection regards being an outlier as a binary property. In this paper, we contend that for many scenarios, it is more meaningful to assign to each object a degree of being an outlier. This degree is called the local outlier factor (LOF) of an object. It is local in that the degree depends on how isolated the object is with respect to the surrounding neighborhood. We give a detailed formal analysis showing that LOF enjoys many desirable properties. Using real-world datasets, we demonstrate that LOF can be used to find outliers which appear to be meaningful, but can otherwise not be identified with existing approaches. Finally, a careful performance evaluation of our algorithm confirms that our approach of finding local outliers can be practical.
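As a concrete illustration of this degree-of-outlierness view, here is a small pure-Python sketch of the LOF computation (k-distance neighborhoods, reachability distances, local reachability density). It follows the paper's definitions, but the function names and the toy-scale brute-force neighbor search are my own; a real implementation would use an index structure.

```python
import math

def knn(points, i, k):
    """k-distance of point i and its k-distance neighborhood (ties included)."""
    d = sorted((math.dist(points[i], points[j]), j)
               for j in range(len(points)) if j != i)
    kdist = d[k - 1][0]
    neighbors = [j for dist, j in d if dist <= kdist]
    return kdist, neighbors

def lof(points, k):
    """Local outlier factor for every point; values near 1 mean inlier."""
    kd, nb = {}, {}
    for i in range(len(points)):
        kd[i], nb[i] = knn(points, i, k)

    def reach_dist(i, j):
        # reachability distance of i w.r.t. j: at least j's k-distance
        return max(kd[j], math.dist(points[i], points[j]))

    lrd = {i: len(nb[i]) / sum(reach_dist(i, j) for j in nb[i])
           for i in range(len(points))}
    return [sum(lrd[j] for j in nb[i]) / (len(nb[i]) * lrd[i])
            for i in range(len(points))]
```

For a tight cluster plus one distant point, the cluster members score close to 1 while the distant point's LOF is several times larger, which is exactly the graded notion of outlierness the abstract describes.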
Summaries of Affymetrix GeneChip probe level data
 Nucleic Acids Res, 2003
Abstract

Cited by 464 (21 self)
High density oligonucleotide array technology is widely used in many areas of biomedical research for quantitative and highly parallel measurements of gene expression. Affymetrix GeneChip arrays are the most popular. In this technology each gene is typically represented by a set of 11–20 pairs of probes. In order to obtain expression measures it is necessary to summarize the probe level data. Using two extensive spike-in studies and a dilution study, we developed a set of tools for assessing the effectiveness of expression measures. We found that the performance of the current version of the default expression measure provided by Affymetrix Microarray Suite can be significantly improved by the use of probe level summaries derived from empirically motivated statistical models. In particular, improvements in the ability to detect differentially expressed genes are demonstrated.
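Probe-level summaries of the kind the abstract describes typically fit an additive probe-effect + array-effect model robustly; one common robust fit for such a model is Tukey's median polish. The sketch below is a generic median polish on a probes-by-arrays matrix of log intensities, offered as an illustration rather than the paper's exact procedure; an array's summary is then the overall effect plus its column effect.

```python
from statistics import median

def median_polish(matrix, n_iter=10):
    """Fit data[i][j] ~ overall + row_eff[i] + col_eff[j] by
    alternately sweeping out row and column medians (Tukey)."""
    data = [row[:] for row in matrix]
    n, m = len(data), len(data[0])
    overall, row_eff, col_eff = 0.0, [0.0] * n, [0.0] * m
    for _ in range(n_iter):
        for i in range(n):                       # sweep row medians
            d = median(data[i])
            row_eff[i] += d
            data[i] = [x - d for x in data[i]]
        d = median(row_eff)
        overall += d
        row_eff = [r - d for r in row_eff]
        for j in range(m):                       # sweep column medians
            d = median(data[i][j] for i in range(n))
            col_eff[j] += d
            for i in range(n):
                data[i][j] -= d
        d = median(col_eff)
        overall += d
        col_eff = [c - d for c in col_eff]
    return overall, row_eff, col_eff
```

On noise-free additive data the per-array summaries (overall + column effect) recover the between-array differences exactly; the median, unlike the mean, also keeps a single aberrant probe from distorting those summaries.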
Using confidence intervals in within-subject designs
 PSYCHONOMIC BULLETIN & REVIEW, 1994
Survey of clustering data mining techniques
2002
Abstract

Cited by 400 (0 self)
Clustering is a division of data into groups of similar objects. Representing the data by fewer clusters necessarily loses certain fine details, but achieves simplification: it models data by its clusters. Data modeling puts clustering in a historical perspective rooted in mathematics, statistics, and numerical analysis. From a machine learning perspective, clusters correspond to hidden patterns, the search for clusters is unsupervised learning, and the resulting system represents a data concept. From a practical perspective, clustering plays an outstanding role in data mining applications such as scientific data exploration, information retrieval and text mining, spatial database applications, Web analysis, CRM, marketing, medical diagnostics, computational biology, and many others. Clustering is the subject of active research in several fields such as statistics, pattern recognition, and machine learning. This survey focuses on clustering in data mining. Data mining adds to clustering the complications of very large datasets with very many attributes of different types. This imposes unique …
Tracking multiple independent targets: Evidence for a parallel tracking mechanism
 Spatial Vision, 1988
Abstract

Cited by 361 (23 self)
There is considerable evidence that visual attention is concentrated at a single locus in the visual field, and that this locus can be moved independent of eye movements. Two studies are reported which suggest that, while certain aspects of attention require that locations be scanned serially, at least one operation may be carried out in parallel across several independent loci in the visual field: the operation of indexing features and tracking their identity. The studies show that: (a) subjects are able to track a subset of up to 5 objects in a field of 10 identical randomly-moving objects in order to distinguish a change in a target from a change in a distractor; and (b) when the speed and distance parameters of the display are designed so that, on the basis of some very conservative assumptions about the speed of attention movement and encoding times, the predicted performance of a serial scanning and updating algorithm would not exceed about 40% accuracy, subjects still manage to do the task with 87% accuracy. These findings are discussed in relation to an earlier, and independently motivated, model of feature binding, called the FINST model, which posits a primitive identity maintenance mechanism that indexes and tracks a limited number of visual objects in parallel. These indexes are hypothesized to serve the function of binding visual features prior to subsequent pattern recognition.
The earth is round (p < .05)
 American Psychologist, 1994
Abstract

Cited by 346 (0 self)
After 4 decades of severe criticism, the ritual of null hypothesis significance testing—mechanical dichotomous decisions around a sacred .05 criterion—still persists. This article reviews the problems with this practice, including its near-universal misinterpretation of p as the probability that H0 is false, the misinterpretation that its complement is the probability of successful replication, and the mistaken assumption that if one rejects H0 one thereby affirms the theory that led to the test. Exploratory data analysis and the use of graphic methods, a steady improvement in and a movement toward standardization in measurement, an emphasis on estimating effect sizes using confidence intervals, and the informed use of available statistical methods are suggested. For generalization, psychologists must finally rely, as has been done in all the older sciences, …
Data Clustering: 50 Years Beyond K-Means
2008
Abstract

Cited by 274 (6 self)
Organizing data into sensible groupings is one of the most fundamental modes of understanding and learning. As an example, a common scheme of scientific classification puts organisms into taxonomic ranks: domain, kingdom, phylum, class, etc. Cluster analysis is the formal study of algorithms and methods for grouping, or clustering, objects according to measured or perceived intrinsic characteristics or similarity. Cluster analysis does not use category labels that tag objects with prior identifiers, i.e., class labels. The absence of category information distinguishes data clustering (unsupervised learning) from classification or discriminant analysis (supervised learning). The aim of clustering is exploratory in nature: to find structure in data. Clustering has a long and rich history in a variety of scientific fields. One of the most popular and simple clustering algorithms, K-means, was first published in 1955. In spite of the fact that K-means was proposed over 50 years ago and thousands of clustering algorithms have been published since then, K-means is still widely used. This speaks to the difficulty of designing a general purpose clustering algorithm and the ill-posed problem of clustering. We provide a brief overview of clustering, summarize well known clustering methods, discuss the major challenges and key issues in designing clustering algorithms, and point out some of the emerging and useful research directions, including semi-supervised clustering, ensemble clustering, simultaneous feature selection during data clustering, and large scale data clustering.
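Since the abstract centers on K-means, here is a minimal sketch of Lloyd's algorithm, the standard alternating assignment/update procedure usually meant by "K-means". Initial centers are passed in explicitly to keep the example deterministic; real implementations add a seeding strategy and a convergence test.

```python
import math

def kmeans(points, centers, n_iter=20):
    """Lloyd's algorithm: assign points to nearest center, then
    move each center to its cluster's mean; repeat."""
    centers = [list(c) for c in centers]
    clusters = [[] for _ in centers]
    for _ in range(n_iter):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda i: math.dist(p, centers[i]))
            clusters[nearest].append(p)
        for i, members in enumerate(clusters):
            if members:  # leave an empty cluster's center in place
                centers[i] = [sum(coord) / len(members)
                              for coord in zip(*members)]
    return centers, clusters
```

With two well-separated groups and one initial center near each, the centers converge to the two group means in a single pass, after which further iterations leave the assignment unchanged; the sensitivity to those initial centers is one reason the abstract calls clustering ill-posed.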
SUSAN - A New Approach to Low Level Image Processing
 International Journal of Computer Vision, 1995
Abstract

Cited by 266 (3 self)
This paper describes a new approach to low level image processing; in particular, edge and corner detection and structure preserving noise reduction.