When Is "Nearest Neighbor" Meaningful?
- In Int. Conf. on Database Theory, 1999
Cited by 222 (1 self)
We explore the effect of dimensionality on the "nearest neighbor" problem. We show that under a broad set of conditions (much broader than independent and identically distributed dimensions), as dimensionality increases, the distance to the nearest data point approaches the distance to the farthest data point. To provide a practical perspective, we present empirical results on both real and synthetic data sets that demonstrate that this effect can occur for as few as 10-15 dimensions. These results should not be interpreted to mean that high-dimensional indexing is never meaningful; we illustrate this point by identifying some high-dimensional workloads for which this effect does not occur. However, our results do emphasize that the methodology used almost universally in the database literature to evaluate high-dimensional indexing techniques is flawed, and should be modified. In particular, most such techniques proposed in the literature are not evaluated versus simple...
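The effect described in this abstract can be observed with a small experiment. The following is a minimal sketch, not the authors' setup: it assumes points drawn uniformly and independently from the unit hypercube (only one of the conditions the abstract covers), and the function name `contrast_ratio` and its parameters are hypothetical.

```python
import math
import random


def contrast_ratio(dim, n_points=500, seed=0):
    """Return D_max / D_min for one random query against n_points random points.

    Assumption: query and data are uniform and independent in [0, 1]^dim.
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        dists.append(math.dist(query, point))  # Euclidean distance
    return max(dists) / min(dists)


if __name__ == "__main__":
    # The ratio should shrink toward 1 as the dimensionality grows.
    for dim in (2, 10, 100, 1000):
        print(dim, round(contrast_ratio(dim), 2))
```

A ratio close to 1 means the farthest point is barely farther than the nearest one, which is the sense in which the nearest neighbor stops being meaningful.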
Applications of Machine Learning and Rule Induction
- Communications of the ACM, 1995
Cited by 89 (8 self)
An important area of application for machine learning is in automating the acquisition of knowledge bases required for expert systems. In this paper, we review the major paradigms for machine learning, including neural networks, instance-based methods, genetic learning, rule induction, and analytic approaches. We consider rule induction in greater detail and review some of its recent applications, in each case stating the problem, how rule induction was used, and the status of the resulting expert system. In closing, we identify the main stages in fielding an applied learning system and draw some lessons from successful applications. Introduction Machine learning is the study of computational methods for improving performance by mechanizing the acquisition of knowledge from experience. Expert performance requires much domain-specific knowledge, and knowledge engineering has produced hundreds of AI expert systems that are now used regularly in industry. Machine learning aims to provide ...
Machine Learning for Adaptive User Interfaces
- Proceedings of the 21st German Annual Conference on Artificial Intelligence, 1997
Cited by 26 (7 self)
In this paper we examine the growing interest in personalized user interfaces and explore the potential of machine learning in meeting that need. We briefly review progress in developing fielded applications of machine learning, then consider some characteristics of adaptive user interfaces that distinguish them from more traditional applications. After this, we consider some examples of adaptive interfaces that use inductive methods to personalize their behavior, and we report some ongoing research that extends these ideas in the automobile environment. 1 The Need for Personalized User Interfaces Early computer software aimed to solve business and scientific problems in a predetermined way that allowed only very constrained user input, through arguments given to the program at run time. This contrasts sharply with modern-day software, which is much more interactive and supports frequent user input throughout its operation. This shift toward interactive software is reflected in the g...
When Is "Nearest Neighbor" Meaningful?
- In Int. Conf. on Database Theory, 1999
Cited by 1 (0 self)
In this paper, we study the nearest neighbor problem and make the following contributions: We show that under certain conditions (in terms of data and query distributions, or workload), as dimensionality increases, the distance to the nearest neighbor approaches the distance to the farthest neighbor. In other words, virtually every data point is as good as any other, and slight perturbations to the query point would result in another data point being chosen as the nearest neighbor. Our result characterizes the problem itself, rather than specific algorithms that address the problem. This observation places some fundamental limits upon current approaches to multimedia similarity search based upon high-dimensional feature vector representations. In addition, our observations apply equally to the k-nearest neighbor variant of the problem.
Building a Piece-Wise Ensemble of Decision Tree Classifiers
- 2000
Inductive learning is a form of supervised learning in which a system tries to build a model of a concept (e.g. what makes a container a cup) from descriptions of things that are/are not examples of that concept. Numerous methods have been defined to induce concepts, such as decision tree learning, artificial neural networks, etc. One especially powerful method is to use an ensemble of classifiers (i.e., a collection of classifiers each trained on a subset of the original training set). Bagging and Boosting are two of the most popular methods for building ensembles. Boosting is often chosen as an ensemble building technique because it can reduce both the bias and variance of the error. But building a Boosting ensemble is a time-consuming process, as the process cannot be executed in parallel and it tends to overfit the training examples as the number of classifiers in the ensemble increases. In this work, I propose a new ensemble building algorithm that works similar to Boosti...
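To make the idea of an ensemble trained on subsets of the training set concrete, here is a minimal sketch of plain Bagging with majority voting. It is not the algorithm proposed in the abstract above, which is truncated; the one-feature decision stump, the toy data, and all function names are assumptions chosen for illustration.

```python
import random
from collections import Counter


def train_stump(data):
    """Fit a depth-1 decision tree on (x, label) pairs with labels 0 or 1."""
    best = None
    for threshold, _ in data:
        # Try both orientations: (label for x <= threshold, label for x > threshold).
        for low, high in ((0, 1), (1, 0)):
            errors = sum((low if x <= threshold else high) != y for x, y in data)
            if best is None or errors < best[0]:
                best = (errors, threshold, low, high)
    _, threshold, low, high = best
    return lambda x: low if x <= threshold else high


def bagging(data, n_models=25, seed=0):
    """Train n_models stumps, each on a bootstrap sample of the data."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        sample = [rng.choice(data) for _ in data]  # sample with replacement
        models.append(train_stump(sample))
    return models


def predict(models, x):
    """Return the majority vote of the ensemble for input x."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]


# Toy data (hypothetical): label is 1 when x >= 10.
toy_data = [(x, int(x >= 10)) for x in range(20)]
ensemble = bagging(toy_data)
```

Each stump here is trained independently of the others, so the loop in `bagging` could run in parallel. Boosting trains its classifiers sequentially, which is the cost the abstract points out.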
Analysis and Prediction of the Typhoon from an Informatics Perspective
- 1984
Typhoon analysis and prediction have been intensively studied by a number of meteorologists because of the typhoon's huge impact on society. We study the same issue from a different viewpoint: an informatics perspective. Our goal is to discover relevant knowledge for typhoon analysis and prediction by means of various computational tools that have been developed in the informatics community. Our research takes advantage of the large collection of typhoon data, especially satellite images of typhoons, and applies multimedia data mining methods in the hope of discovering hidden regularities and anomalies in the data collection, using data mining algorithms such as principal component analysis, K-means clustering, and self-organizing maps. In this paper, we summarize our approaches, achievements, and open problems, with a brief introduction to our hand-crafted system, IMET (Image Mining Environment for Typhoon analysis and prediction).

