Results 1 - 10 of 137
Locally weighted learning
Artificial Intelligence Review, 1997
Cited by 370 (43 self)
This paper surveys locally weighted learning, a form of lazy learning and memory-based learning, and focuses on locally weighted linear regression. The survey discusses distance functions, smoothing parameters, weighting functions, local model structures, regularization of the estimates and bias, assessing predictions, handling noisy data and outliers, improving the quality of predictions by tuning fit parameters, interference between old and new data, implementing locally weighted learning efficiently, and applications of locally weighted learning. A companion paper surveys how locally weighted learning can be used in robot learning and control.
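As a rough illustration of the locally weighted linear regression named in this abstract, the sketch below fits a straight line around a single query point, weighting each training point by its distance to the query. The Gaussian weighting function, the one-dimensional input, and the names `lwr_predict` and `bandwidth` are assumptions made for this example and are not taken from the survey.

```python
import math


def lwr_predict(xs, ys, query, bandwidth=1.0):
    """Predict y at `query` from a weighted straight-line fit (1-D inputs)."""
    # Gaussian weighting function of the distance to the query point;
    # `bandwidth` plays the role of a smoothing parameter (assumed form).
    ws = [math.exp(-0.5 * ((x - query) / bandwidth) ** 2) for x in xs]
    total = sum(ws)
    # Weighted means of the inputs and outputs.
    mean_x = sum(w * x for w, x in zip(ws, xs)) / total
    mean_y = sum(w * y for w, y in zip(ws, ys)) / total
    # Weighted covariance over weighted variance gives the local slope.
    sxy = sum(w * (x - mean_x) * (y - mean_y) for w, x, y in zip(ws, xs, ys))
    sxx = sum(w * (x - mean_x) ** 2 for w, x in zip(ws, xs))
    slope = sxy / sxx if sxx > 0 else 0.0
    # Evaluate the local line at the query point.
    return mean_y + slope * (query - mean_x)
```

Because the model is refit for every query, no global model is stored, which is what makes the method "lazy": the training data are kept and all work happens at prediction time.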
The use of the area under the ROC curve in the evaluation of machine learning algorithms
Pattern Recognition, 1997
Cited by 325 (0 self)
In this paper we investigate the use of the area under the receiver operating characteristic (ROC) curve (AUC) as a performance measure for machine learning algorithms. As a case study we evaluate six machine learning algorithms (C4.5, Multiscale Classifier, Perceptron, Multi-layer Perceptron, k-Nearest Neighbours, and a Quadratic Discriminant Function) on six "real world" medical diagnostics data sets. We compare and discuss the use of AUC to the more conventional overall accuracy and find that AUC exhibits a number of desirable properties when compared to overall accuracy: increased sensitivity in Analysis of Variance (ANOVA) tests; a standard error that decreased as both AUC and the number of test samples increased; independence of the decision threshold; and invariance to a priori class probabilities. The paper concludes with the recommendation that AUC be used in preference to overall accuracy for "single number" evaluation of machine
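To make the AUC measure concrete, here is a minimal sketch that computes it as the fraction of (positive, negative) pairs that the classifier ranks correctly, with ties counted as half. This pairwise formulation is a standard equivalent of the area under the ROC curve; it is not claimed to be the estimation procedure used in the paper, and the function name `auc` and the 0/1 label encoding are assumptions of this example.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise ranking of scores.

    `labels` holds 1 for positive and 0 for negative examples (assumed encoding).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0  # positive ranked above negative
            elif p == n:
                wins += 0.5  # ties count as half
    return wins / (len(pos) * len(neg))
```

For scores `[0.9, 0.8, 0.3, 0.1]` with labels `[1, 0, 1, 0]`, three of the four positive/negative pairs are ordered correctly, so the function returns 0.75. No decision threshold appears anywhere in the computation, which illustrates the threshold independence mentioned in the abstract.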
Texture classification by wavelet packet signatures
IEEE Transactions on PAMI, 1993
Cited by 128 (3 self)
This paper introduces a new approach to characterize textures at multiple scales. The performance of wavelet packet spaces is measured in terms of sensitivity and selectivity for the classification of twenty-five natural textures. Both energy and entropy metrics were computed for each wavelet packet and incorporated into distinct scale space representations, where each wavelet packet (channel) reflected a specific scale and orientation sensitivity. Wavelet packet representations for twenty-five natural textures were classified without error by a simple two-layer network classifier. An analyzing function of large regularity (D20) was shown to be slightly more efficient in representation and discrimination than a similar function with fewer vanishing moments (D6). In addition, energy representations computed from the standard wavelet decomposition alone (17 features) provided classification without error for the twenty-five textures included in our study. The reliability exhibited by texture signatures based on wavelet packet analysis suggests that the multiresolution properties of such transforms are beneficial for accomplishing segmentation, classification and subtle discrimination of texture. Index Terms: Feature extraction, texture analysis, texture classification, wavelet transform, wavelet packet, neural networks.
Speaker recognition: A tutorial
Cited by 121 (1 self)
A tutorial on the design and development of automatic speaker-recognition systems is presented. Automatic speaker recognition is the use of a machine to recognize a person from a spoken phrase. These systems can operate in two modes: to identify a particular person or to verify a person’s claimed identity. Speech processing and the basic components of automatic speaker-recognition systems are shown and design tradeoffs are discussed. Then, a new automatic speaker-recognition system is given. This recognizer performs with 98.9% correct identification. Last, the performances of various systems are compared.
Query By Image Example: The Candid Approach
1995
Cited by 81 (1 self)
CANDID (Comparison Algorithm for Navigating Digital Image Databases) was developed to enable content-based retrieval of digital imagery from large databases using a query-by-example methodology. A user provides an example image to the system, and images in the database that are similar to that example are retrieved. The development of CANDID was inspired by the N-gram approach to document fingerprinting, where a "global signature" is computed for every document in a database and these signatures are compared to one another to determine the similarity between any two documents. CANDID computes a global signature for every image in a database, where the signature is derived from various image features such as localized texture, shape, or color information. A distance between probability density functions of feature vectors is then used to compare signatures. In this paper, we present CANDID and highlight two results from our current research: subtracting a "background" signature from ever...
A System for Sound Analysis/Transformation/Synthesis Based on a Deterministic Plus Stochastic Decomposition
1989
Cited by 75 (5 self)
This dissertation introduces a new analysis/synthesis method. It is designed to obtain musically useful intermediate representations for sound transformations. The method’s underlying model assumes that a sound is composed of a deterministic component plus a stochastic one. The deterministic component is represented by a series of sinusoids that are described by amplitude and frequency functions. The stochastic component is represented by a series of magnitude-spectrum envelopes that function as a time-varying filter excited by white noise. Together these representations make it possible for a synthesized sound to attain all the perceptual characteristics of the original sound. At the same time the representation is easily modified to create a wide variety of new sounds. This analysis/synthesis technique is based on the short-time Fourier transform (STFT). From the set of spectra returned by the STFT, the relevant peaks of each spectrum are detected and used as breakpoints in a set of frequency trajectories. The deterministic signal is obtained by synthesizing a sinusoid from each trajectory. Then, in order to obtain the stochastic component, a set of spectra of the deterministic component is computed, and these spectra are subtracted from the spectra of the original sound. The resulting spectral residuals are approximated by a series of envelopes, from which the stochastic signal is generated by performing an inverse-STFT. The result is a method that is appropriate for the manipulation of sounds. The intermediate representation is very flexible and musically useful in that it offers unlimited possibilities for transformation.
Heterogeneous Learning in the Doppelgänger User Modeling System
Interaction, 1995
Cited by 64 (0 self)
Doppelgänger is a generalized user modeling system that gathers data about users, performs inferences upon the data, and makes the resulting information available to applications. Doppelgänger's learning is called heterogeneous for two reasons: first, multiple learning techniques are used to interpret the data, and second, the learning techniques must often grapple with disparate data types. These computations take place at geographically distributed sites, and make use of portable user models carried by individuals. This paper concentrates on Doppelgänger's learning techniques and their implementation in an application-independent, sensor-independent environment. Key words: User model, machine learning, server-client architecture, multivariate statistical analysis, Markov models, Beta distribution, linear prediction. 1 Introduction When users interact with a computer, they provide a great deal of information about themselves. Even when they are not physically at a computer console,...
An Empirical Comparison of Four Initialization Methods for the K-Means Algorithm
1999
Cited by 62 (0 self)
In this paper, we aim to compare empirically four initialization methods for the K-Means algorithm: random, Forgy, MacQueen and Kaufman. Although this algorithm is known for its robustness, it is widely reported in the literature that its performance depends upon two key points: initial clustering and instance order. We conduct a series of experiments to draw up (in terms of mean, maximum, minimum and standard deviation) the probability distribution of the square-error values of the final clusters returned by the K-Means algorithm independently of any initial clustering and of any instance order when each of the four initialization methods is used. The results of our experiments illustrate that the random and the Kaufman initialization methods outperform the rest of the compared methods as they make the K-Means more effective and more independent of initial clustering and of instance order. In addition, we compare the convergence speed of the K-Means algorithm when using each o...
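Two of the initialization schemes named in this abstract can be sketched as follows. The sketch assumes the common reading in which "Forgy" picks K instances as seeds and "random" assigns every instance to a random cluster and uses the cluster means; the abstract itself does not define the methods, so these definitions, along with the function names and the tuple representation of points, are assumptions of this example. The MacQueen and Kaufman methods are not shown.

```python
import random


def forgy_init(points, k, rng):
    """Choose k distinct instances as the initial centroids (assumed 'Forgy')."""
    return rng.sample(points, k)


def random_partition_init(points, k, rng):
    """Assign instances to random clusters and return the cluster means
    (assumed 'random'). Points are equal-length tuples of floats."""
    dim = len(points[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    order = list(range(len(points)))
    rng.shuffle(order)
    for position, index in enumerate(order):
        # The first k shuffled instances seed one cluster each, so that
        # no cluster is empty; the rest are assigned uniformly at random.
        cluster = position if position < k else rng.randrange(k)
        counts[cluster] += 1
        for d in range(dim):
            sums[cluster][d] += points[index][d]
    return [tuple(s / counts[c] for s in sums[c]) for c in range(k)]
```

Either function returns K starting centroids that a K-Means loop would then refine. Passing an explicit `random.Random(seed)` as `rng` keeps repeated runs reproducible, which matters when comparing distributions of final square-error values.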
Authentication via Keystroke Dynamics
In 4th ACM Conference on Computer and Communications Security, 1997
Cited by 47 (2 self)
In an effort to confront the challenges brought forward by the networking revolution of the past few years, we present improved techniques for authorized access to computer system resources and data. More than ever before, the Internet is changing computing as we know it. The possibilities of this global network seem limitless; unfortunately, with this global access come increased chances of malicious attack and intrusion. Alternatives to traditional access control measures are in high demand. In what follows we present one such alternative: computer access via keystroke dynamics. A database of 42 profiles was constructed based on keystroke patterns gathered from various users performing structured and unstructured tasks. We study the performance of a system for recognition of these users, and present a toolkit for analyzing system performance under varying criteria. Keywords: Biometrics, keystroke dynamics, pattern recognition, computer security. 1 Introduction Today's society de...
A Multilevel Approach to Intelligent Information Filtering: Model, System, and Evaluation
ACM Transactions on Information Systems, 1997
Cited by 45 (5 self)
In this article, a filtering model is proposed that decomposes the overall task into subsystem functionalities and highlights the need for multiple adaptation techniques to cope with uncertainties. A filtering system, SIFTER, has been implemented based on the model, using established techniques in information retrieval and artificial intelligence. These techniques include document representation by a vector-space model, document classification by unsupervised learning, and user modeling by reinforcement learning. The system can filter information based on content and a user's specific interests. The user's interests are automatically learned with only limited user intervention in the form of optional relevance feedback for documents. We also describe experimental studies conducted with SIFTER to filter computer and information science documents collected from the Internet and commercial database services. The experimental results demonstrate that the system performs very well in filtering documents in a realistic problem setting.

