Results 1 - 10 of 218
Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection
, 1997
Cited by 1509 (17 self)
We develop a face recognition algorithm which is insensitive to gross variation in lighting direction and facial expression. Taking a pattern classification approach, we consider each pixel in an image as a coordinate in a high-dimensional space. We take advantage of the observation that the images of a particular face, under varying illumination but fixed pose, lie in a 3D linear subspace of the high dimensional image space, if the face is a Lambertian surface without shadowing. However, since faces are not truly Lambertian surfaces and do indeed produce self-shadowing, images will deviate from this linear subspace. Rather than explicitly modeling this deviation, we linearly project the image into a subspace in a manner which discounts those regions of the face with large deviation. Our projection method is based on Fisher's Linear Discriminant and produces well separated classes in a low-dimensional subspace even under severe variation in lighting and facial expressions. The Eigenface
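As a rough illustration of the Fisher's Linear Discriminant projection mentioned in this abstract, the sketch below computes the two-class discriminant direction w = S_w^{-1}(m1 - m0) for 2D points. It is a toy version under stated assumptions: the paper works with many classes in a high-dimensional image space, and the function names (`mean`, `scatter`, `fisher_direction`, `project`) are hypothetical, chosen only for this example.

```python
def mean(points):
    """Mean of a list of 2D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def scatter(points, m):
    """Entries (sxx, sxy, syy) of the 2x2 scatter matrix about mean m."""
    sxx = sum((p[0] - m[0]) ** 2 for p in points)
    sxy = sum((p[0] - m[0]) * (p[1] - m[1]) for p in points)
    syy = sum((p[1] - m[1]) ** 2 for p in points)
    return sxx, sxy, syy

def fisher_direction(class0, class1):
    """Two-class Fisher direction w = S_w^{-1} (m1 - m0) for 2D data."""
    m0, m1 = mean(class0), mean(class1)
    a0, b0, c0 = scatter(class0, m0)
    a1, b1, c1 = scatter(class1, m1)
    # Within-class scatter S_w = [[a, b], [b, c]]
    a, b, c = a0 + a1, b0 + b1, c0 + c1
    det = a * c - b * b
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    dx, dy = m1[0] - m0[0], m1[1] - m0[1]
    # Apply the closed-form inverse of a symmetric 2x2 matrix
    return ((c * dx - b * dy) / det, (a * dy - b * dx) / det)

def project(w, p):
    """Scalar projection of point p onto direction w."""
    return w[0] * p[0] + w[1] * p[1]
```

Projecting each point with `project(fisher_direction(class0, class1), p)` gives a one-dimensional coordinate in which the two classes are separated as far as possible relative to their within-class spread.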
A training algorithm for optimal margin classifiers
 PROCEEDINGS OF THE 5TH ANNUAL ACM WORKSHOP ON COMPUTATIONAL LEARNING THEORY
, 1992
Cited by 1304 (43 self)
A training algorithm that maximizes the margin between the training patterns and the decision boundary is presented. The technique is applicable to a wide variety of classification functions, including Perceptrons, polynomials, and Radial Basis Functions. The effective number of parameters is adjusted automatically to match the complexity of the problem. The solution is expressed as a linear combination of supporting patterns. These are the subset of training patterns that are closest to the decision boundary. Bounds on the generalization performance based on the leave-one-out method and the VC-dimension are given. Experimental results on optical character recognition problems demonstrate the good generalization obtained when compared with other learning algorithms.
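To make the notions of "margin" and "supporting patterns" concrete, here is a minimal sketch for a fixed linear decision function w·x + b. It only measures the margin of a given boundary; it does not implement the training algorithm from the paper. The function names and the tolerance `tol` are assumptions made for illustration.

```python
import math

def margins(w, b, X, y):
    """Signed geometric distance of each pattern to the boundary w.x + b = 0.

    Labels in y are +1 or -1; a positive value means the pattern lies on
    the correct side of the boundary.
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    return [
        yi * (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm
        for x, yi in zip(X, y)
    ]

def supporting_patterns(w, b, X, y, tol=1e-9):
    """Return (margin, indices of the patterns closest to the boundary)."""
    m = margins(w, b, X, y)
    smallest = min(m)
    return smallest, [i for i, mi in enumerate(m) if mi - smallest <= tol]
```

A margin-maximizing trainer would search over `w` and `b` for the boundary whose smallest value in `margins` is as large as possible; the indices returned by `supporting_patterns` are the patterns that pin that boundary in place.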
Combining labeled and unlabeled data with co-training
, 1998
Cited by 1249 (28 self)
We consider the problem of using a large unlabeled sample to boost performance of a learning algorithm when only a small set of labeled examples is available. In particular, we consider a setting in which the description of each example can be partitioned into two distinct views, motivated by the task of learning to classify web pages. For example, the description of a web page can be partitioned into the words occurring on that page, and the words occurring in hyperlinks that point to that page. We assume that either view of the example would be sufficient for learning if we had enough labeled data, but our goal is to use both views together to allow inexpensive unlabeled data to augment a much smaller set of labeled examples. Specifically, the presence of two distinct views of each example suggests strategies in which two learning algorithms are trained separately on each view, and then each algorithm's predictions on new unlabeled examples are used to enlarge the training set of the other. Our goal in this paper is to provide a PAC-style analysis for this setting, and, more broadly, a PAC-style framework for the general problem of learning from both labeled and unlabeled data. We also provide empirical results on real webpage data indicating that this use of unlabeled examples can lead to significant improvement of hypotheses in practice. As part of our analysis, we provide new re
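The strategy described in this abstract, two learners trained on separate views that label unlabeled examples for each other, can be sketched as the loop below. This is a toy illustration, not the paper's experimental setup: each view is reduced to a single number, the learner is a nearest-centroid rule, both learners add to one shared labeled pool, and every name and parameter (`train_centroids`, `predict`, `co_train`, `rounds`, `per_round`) is an assumption introduced for the example.

```python
def train_centroids(xs, ys):
    """Fit a 1-D nearest-centroid model: (mean of class 0, mean of class 1)."""
    zeros = [x for x, y in zip(xs, ys) if y == 0]
    ones = [x for x, y in zip(xs, ys) if y == 1]
    return sum(zeros) / len(zeros), sum(ones) / len(ones)

def predict(model, x):
    """Return (label, confidence); confidence is the gap between distances."""
    c0, c1 = model
    d0, d1 = abs(x - c0), abs(x - c1)
    return (0 if d0 <= d1 else 1), abs(d0 - d1)

def fit_views(labeled):
    """Train one model per view on the current labeled set."""
    ys = [y for _, y in labeled]
    return [train_centroids([x[v] for x, _ in labeled], ys) for v in (0, 1)]

def co_train(labeled, unlabeled, rounds=5, per_round=1):
    """labeled: list of ((view0, view1), label); unlabeled: list of (view0, view1).

    The initial labeled set must contain both classes.
    """
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(rounds):
        if not pool:
            break
        models = fit_views(labeled)
        for v in (0, 1):
            # The view-v learner labels the pool examples it is most sure of;
            # those examples then join the training data used by both learners.
            ranked = sorted(pool, key=lambda x: predict(models[v], x[v])[1],
                            reverse=True)
            for x in ranked[:per_round]:
                labeled.append((x, predict(models[v], x[v])[0]))
                pool.remove(x)
    return fit_views(labeled), labeled
```

Sharing one labeled pool is a simplification; a variant closer to the wording of the abstract would keep a separate training set per learner and add each learner's confident predictions only to the other's set.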
Affective Computing
, 1995
Cited by 1218 (37 self)
Recent neurological studies indicate that the role of emotion in human cognition is essential; emotions are not a luxury. Instead, emotions play a critical role in rational decision-making, in perception, in human interaction, and in human intelligence. These facts, combined with abilities computers are acquiring in expressing and recognizing affect, open new areas for research. This paper defines key issues in "affective computing," computing that relates to, arises from, or deliberately influences emotions. New models are suggested for computer recognition of human emotion, and both theoretical and practical applications are described for learning, human-computer interaction, perceptual information retrieval, creative arts and entertainment, human health, and machine intelligence. Significant potential advances in emotion and cognition theory hinge on the development of affective computing, especially in the form of wearable computers. This paper establishes challenges and future directions for this emerging field.
Hierarchical mixtures of experts and the EM algorithm
 Neural Computation
, 1994
Cited by 724 (19 self)
We present a tree-structured architecture for supervised learning. The statistical model underlying the architecture is a hierarchical mixture model in which both the mixture coefficients and the mixture components are generalized linear models (GLIM’s). Learning is treated as a maximum likelihood problem; in particular, we present an Expectation-Maximization (EM) algorithm for adjusting the parameters of the architecture. We also develop an online learning algorithm in which the parameters are updated incrementally. Comparative simulation results are presented in the robot dynamics domain.
Automatic Subspace Clustering of High Dimensional Data
 Data Mining and Knowledge Discovery
, 2005
Cited by 564 (12 self)
Data mining applications place special requirements on clustering algorithms including: the ability to find clusters embedded in subspaces of high dimensional data, scalability, end-user comprehensibility of the results, non-presumption of any canonical data distribution, and insensitivity to the order of input records. We present CLIQUE, a clustering algorithm that satisfies each of these requirements. CLIQUE identifies dense clusters in subspaces of maximum dimensionality. It generates cluster descriptions in the form of DNF expressions that are minimized for ease of comprehension. It produces identical results irrespective of the order in which input records are presented and does not presume any specific mathematical form for data distribution. Through experiments, we show that CLIQUE efficiently finds accurate clusters in large high dimensional datasets.
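The abstract does not spell out how density is measured. A common description of CLIQUE is that each dimension is cut into equal-width intervals and a grid cell counts as dense when it holds more than a given fraction of the points. The sketch below shows only that density test for one chosen subspace, under those assumptions; the names `dense_units`, `xi` (intervals per dimension) and `tau` (density threshold) are illustrative, and the bottom-up subspace search and DNF cluster descriptions are not covered.

```python
from collections import Counter

def dense_units(points, dims, xi, tau, lo=0.0, hi=1.0):
    """Grid cells in the subspace `dims` holding more than `tau` of the points.

    Every coordinate is assumed to lie in [lo, hi]; each selected dimension
    is split into `xi` equal-width intervals.
    """
    width = (hi - lo) / xi
    counts = Counter()
    for p in points:
        # Clamp the top edge (value == hi) into the last interval
        cell = tuple(min(int((p[d] - lo) / width), xi - 1) for d in dims)
        counts[cell] += 1
    return {cell for cell, n in counts.items() if n / len(points) > tau}
```

Because the result depends only on cell counts, it is the same whatever order the input records arrive in, which matches the order-insensitivity requirement listed in the abstract.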
Knowledge-based Analysis of Microarray Gene Expression Data By Using Support Vector Machines
, 2000
Cited by 394 (6 self)
We introduce a method of functionally classifying genes by using gene expression data from DNA microarray hybridization experiments. The method is based on the theory of support vector machines (SVMs). SVMs are considered a supervised computer learning method because they exploit prior knowledge of gene function to identify unknown genes of similar function from expression data. SVMs avoid several problems associated with unsupervised clustering methods, such as hierarchical clustering and self-organizing maps. SVMs have many mathematical features that make them attractive for gene expression analysis, including their flexibility in choosing a similarity function, sparseness of solution when dealing with large data sets, the ability t...
Toward optimal feature selection
 In 13th International Conference on Machine Learning
, 1995
Cited by 366 (10 self)
In this paper, we examine a method for feature subset selection based on Information Theory. Initially, a framework for defining the theoretically optimal, but computationally intractable, method for feature subset selection is presented. We show that our goal should be to eliminate a feature if it gives us little or no additional information beyond that subsumed by the remaining features. In particular, this will be the case for both irrelevant and redundant features. We then give an efficient algorithm for feature selection which computes an approximation to the optimal feature selection criterion. The conditions under which the approximate algorithm is successful are examined. Empirical results are given on a number of data sets, showing that the algorithm effectively handles datasets with a very large number of features.
Image retrieval: Current techniques, promising directions and open issues
 Journal of Visual Communication and Image Representation
, 1999
Cited by 356 (11 self)
This paper provides a comprehensive survey of the technical achievements in the research area of image retrieval, especially content-based image retrieval, an area that has been so active and prosperous in the past few years. The survey includes 100+ papers covering the research aspects of image feature representation and extraction, multidimensional indexing, and system design, three of the fundamental bases of content-based image retrieval. Furthermore, based on the state-of-the-art technology available now and the demand from real-world applications, open research issues are identified and future promising research directions are suggested. © 1999 Academic Press
ROCK: A Robust Clustering Algorithm for Categorical Attributes
 In Proc. of the 15th Int. Conf. on Data Engineering
, 2000
Cited by 338 (2 self)
Clustering, in data mining, is useful to discover distribution patterns in the underlying data. Clustering algorithms usually employ a distance-metric-based (e.g., Euclidean) similarity measure in order to partition the database such that data points in the same partition are more similar than points in different partitions. In this paper, we study clustering algorithms for data with Boolean and categorical attributes. We show that traditional clustering algorithms that use distances between points for clustering are not appropriate for Boolean and categorical attributes. Instead, we propose a novel concept of links to measure the similarity/proximity between a pair of data points. We develop a robust hierarchical clustering algorithm ROCK that employs links and not distances when merging clusters.
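The abstract does not define "links". As ROCK is usually described, two points are neighbors when their similarity reaches a threshold, and the link count of a pair is the number of neighbors they share. The sketch below follows that description with Jaccard similarity on sets of categorical values. The threshold `theta`, the helper names, and the choice not to count a point as its own neighbor are assumptions made for this example, and the cluster-merging step is omitted.

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard similarity between two collections of categorical values."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def neighbors(points, theta):
    """Map each point index to the indices whose similarity is >= theta."""
    nbrs = {i: set() for i in range(len(points))}
    for i, j in combinations(range(len(points)), 2):
        if jaccard(points[i], points[j]) >= theta:
            nbrs[i].add(j)
            nbrs[j].add(i)
    return nbrs

def links(points, theta):
    """Number of common neighbors for every pair of points."""
    nbrs = neighbors(points, theta)
    return {
        (i, j): len(nbrs[i] & nbrs[j])
        for i, j in combinations(range(len(points)), 2)
    }
```

Two records that look only moderately similar on their own can still have a high link count when many other records are close to both, which is the kind of evidence a pairwise distance cannot express.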