Results 1-10 of 28
Estimating the Support of a High-Dimensional Distribution
, 1999
Abstract

Cited by 501 (32 self)
Suppose you are given some dataset drawn from an underlying probability distribution P and you want to estimate a "simple" subset S of input space such that the probability that a test point drawn from P lies outside of S is bounded by some a priori specified value ν between 0 and 1. We propose a method to approach this problem by trying to estimate a function f which is positive on S and negative on the complement. The functional form of f is given by a kernel expansion in terms of a potentially small subset of the training data; it is regularized by controlling the length of the weight vector in an associated feature space. The expansion coefficients are found by solving a quadratic programming problem, which we do by carrying out sequential optimization over pairs of input patterns. We also provide a preliminary theoretical analysis of the statistical performance of our algorithm. The algorithm is a natural extension of the support vector algorithm to the case of unlabelled d...
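The pairwise optimization mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the usual one-class dual (minimize ½ Σ α_i α_j k(x_i, x_j) subject to 0 ≤ α_i ≤ 1/(νn) and Σ α_i = 1), a Gaussian kernel, and naive sweeps over all pairs instead of a working-set heuristic. The names `rbf`, `fit_one_class`, `decision`, and the parameter `gamma` are hypothetical.

```python
import math
import random


def rbf(x, y, gamma=0.5):
    """Gaussian kernel; the kernel choice and gamma are assumptions."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def fit_one_class(points, nu=0.1, gamma=0.5, sweeps=50):
    """Minimise 1/2 * sum_ij a_i a_j k(x_i, x_j) subject to
    0 <= a_i <= 1/(nu * n) and sum_i a_i = 1, two coefficients at a time."""
    n = len(points)
    upper = 1.0 / (nu * n)
    K = [[rbf(p, q, gamma) for q in points] for p in points]
    alpha = [1.0 / n] * n  # feasible start because nu <= 1

    for _ in range(sweeps):
        for i in range(n):
            for j in range(i + 1, n):
                eta = K[i][i] + K[j][j] - 2.0 * K[i][j]
                if eta <= 1e-12:
                    continue  # (near-)duplicate points: no useful update
                s = alpha[i] + alpha[j]  # kept fixed so that sum(alpha) stays 1
                # Contribution of all other coefficients to the two outputs.
                c_i = sum(alpha[k] * K[i][k] for k in range(n) if k not in (i, j))
                c_j = sum(alpha[k] * K[j][k] for k in range(n) if k not in (i, j))
                # Unconstrained minimiser of the two-variable subproblem ...
                t = (s * (K[j][j] - K[i][j]) - c_i + c_j) / eta
                # ... clipped so that both coefficients stay inside the box.
                t = min(max(t, max(0.0, s - upper)), min(upper, s))
                alpha[i], alpha[j] = t, s - t

    # Offset rho: average kernel expansion over coefficients strictly inside the box.
    scores = [sum(alpha[k] * K[i][k] for k in range(n)) for i in range(n)]
    free = [scores[i] for i in range(n) if 1e-8 < alpha[i] < upper - 1e-8]
    rho = sum(free) / len(free) if free else min(scores)  # fallback is a heuristic
    return alpha, rho, upper


def decision(x, points, alpha, rho, gamma=0.5):
    """Positive inside the estimated region S, negative outside."""
    return sum(a * rbf(p, x, gamma) for a, p in zip(alpha, points)) - rho


rng = random.Random(0)
train = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(25)]
alpha, rho, upper = fit_one_class(train, nu=0.1)
print(decision((10.0, 10.0), train, alpha, rho) < 0)  # far-away point vs. region S
```

Because each update changes two coefficients while holding their sum fixed, the equality constraint is preserved throughout, which is the reason for optimizing over pairs rather than single coefficients.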
Charting a Manifold
 Advances in Neural Information Processing Systems 15
, 2003
Abstract

Cited by 161 (7 self)
In this paper we use $m_i(\mu_j) \propto \mathcal{N}(\mu_j; \mu_i, \sigma)$, with the scale parameter $\sigma$ specifying the expected size of a neighborhood on the manifold in sample space. A reasonable choice is $\sigma = r/2$, so that $2\,\mathrm{erf}(2) > 99.5\%$ of the density of $m_i(\mu_j)$ is contained in the area around $y_i$ where the manifold is expected to be locally linear. With uniform $p_i$ and $\mu_i$, $m_i(\mu_j)$ and $\sigma$ fixed, the MAP estimates of the GMM covariances are $\Sigma_i = \sum_j m_i(\mu_j)\left[(y_j - \mu_i)(y_j - \mu_i)^\top + (\mu_j - \mu_i)(\mu_j - \mu_i)^\top + \Sigma_j\right] \big/ \sum_j m_i(\mu_j)$. (3) Note that each covariance $\Sigma_i$ is dependent on all other $\Sigma_j$. The MAP estimators for all covariances can be arranged into a set of fully constrained linear equations and solved exactly for their mutually optimal values. This key step brings nonlocal information about the manifold's shape into the local description of each neighborhood, ensuring that adjoining neighborhoods have similar covariances and small angles between their respective subspaces. Even if a local subset of data points is dense in a direction perpendicular to the manifold, the prior encourages the local chart to orient parallel to the manifold as part of a globally optimal solution, protecting against a pathology noted in [8]. Equation (3) is easily adapted to give a reduced number of charts and/or charts centered on local centroids. 4 Connecting the charts. We now build a connection for a set of charts specified as an arbitrary nondegenerate GMM. A GMM gives a soft partitioning of the dataset into neighborhoods of mean $\mu_k$ and covariance $\Sigma_k$. The optimal variance-preserving low-dimensional coordinate system for each neighborhood derives from its weighted principal component analysis, which is exactly specified by the eigenvectors of its covariance matrix: Eigendecompose $V_k \Lambda_k V_k^\top \leftarrow \Sigma_k$ with...
Support Vector Method for Novelty Detection
, 2000
Abstract

Cited by 100 (4 self)
Suppose you are given some dataset drawn from an underlying probability distribution P and you want to estimate a “simple” subset S of input space such that the probability that a test point drawn from P lies outside of S equals some a priori specified value ν between 0 and 1. We propose a method to approach this problem by trying to estimate a function f which is positive on S and negative on the complement. The functional form of f is given by a kernel expansion in terms of a potentially small subset of the training data; it is regularized by controlling the length of the weight vector in an associated feature space. We provide a theoretical analysis of the statistical performance of our algorithm. The algorithm is a natural extension of the support vector algorithm to the case of unlabelled data.
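For orientation, the regularized problem this abstract describes is commonly written as the one-class (single-class) support vector program below. This is a sketch of the standard formulation rather than a quotation from the paper; the symbols $\Phi$ (feature map), $\xi_i$ (slack variables), $\rho$ (offset), and $\ell$ (number of training points) do not appear in the abstract and are introduced here as assumptions.

```latex
\min_{w,\,\xi,\,\rho}\;\; \tfrac{1}{2}\lVert w \rVert^2
  + \frac{1}{\nu \ell} \sum_{i=1}^{\ell} \xi_i - \rho
\quad \text{subject to} \quad
(w \cdot \Phi(x_i)) \ge \rho - \xi_i, \qquad \xi_i \ge 0,
\qquad\qquad
f(x) = \operatorname{sgn}\bigl( (w \cdot \Phi(x)) - \rho \bigr).
```

The term $\tfrac{1}{2}\lVert w \rVert^2$ is the "length of the weight vector" being controlled, and $\nu$ trades that regularization off against the points allowed to fall outside S.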
Support Vector Machines: Hype or Hallelujah?
 SIGKDD Explorations
, 2003
Abstract

Cited by 80 (0 self)
Support Vector Machines (SVMs) and related kernel methods have become increasingly popular tools for data mining tasks such as classification, regression, and novelty detection. The goal of this tutorial is to provide an intuitive explanation of SVMs from a geometric perspective. The classification problem is used to investigate the basic concepts behind SVMs and to examine their strengths and weaknesses from a data mining perspective. While this overview is not comprehensive, it does provide resources for those interested in further exploring SVMs.
Generalization Performance of Regularization Networks and Support . . .
 IEEE TRANSACTIONS ON INFORMATION THEORY
, 2001
Abstract

Cited by 73 (20 self)
We derive new bounds for the generalization error of kernel machines, such as support vector machines and related regularization networks, by obtaining new bounds on their covering numbers. The proofs make use of a viewpoint that is apparently novel in the field of statistical learning theory. The hypothesis class is described in terms of a linear operator mapping from a possibly infinite-dimensional unit ball in feature space into a finite-dimensional space. The covering numbers of the class are then determined via the entropy numbers of the operator. These numbers, which characterize the degree of compactness of the operator, can be bounded in terms of the eigenvalues of an integral operator induced by the kernel function used by the machine. As a consequence, we are able to theoretically explain the effect of the choice of kernel function on the generalization performance of support vector machines.
Learning appearance manifolds from video
 In Computer Vision and Pattern Recognition (CVPR)
, 2005
Abstract

Cited by 30 (2 self)
The appearance of dynamic scenes is often largely governed by a latent low-dimensional dynamic process. We show how to learn a mapping from video frames to this low-dimensional representation by exploiting the temporal coherence between frames and supervision from a user. This function maps the frames of the video to a low-dimensional sequence that evolves according to Markovian dynamics. This ensures that the recovered low-dimensional sequence represents a physically meaningful process. We relate our algorithm to manifold learning, semi-supervised learning, and system identification, and demonstrate it on the tasks of tracking 3D rigid objects, deformable bodies, and articulated bodies. We also show how to use the inverse of this mapping to manipulate video.
SV Estimation of a Distribution's Support
, 1999
Abstract

Cited by 28 (2 self)
Suppose you are given some dataset drawn from an underlying probability distribution P and you want to estimate a subset S of input space such that the probability that a test point drawn from P lies outside of S is bounded by some a priori specified 0 < ν ≤ 1. We propose an algorithm which approaches this problem by trying to estimate a function f which is positive on S and negative on the complement. The functional form of f is given by a kernel expansion in terms of a potentially small subset of the training data; it is regularized by controlling the length of the weight vector in an associated feature space. The algorithm is a natural extension of the support vector algorithm to the case of unlabelled data.
A Taxonomy for Spatiotemporal Connectionist Networks Revisited: The Unsupervised Case
 Neural Computation
, 2003
Abstract

Cited by 21 (1 self)
Spatiotemporal connectionist networks (STCN's) comprise an important class of neural models that can deal with patterns distributed both in time and space. In this paper, we widen the application domain of the taxonomy for supervised STCN's recently proposed by Kremer (2001) to the unsupervised case. This is possible through a reinterpretation of the state vector as a vector of latent (hidden) variables, as proposed by Meinicke (2000). The goal of this generalized taxonomy is then to provide a nonlinear generative framework for describing unsupervised spatiotemporal networks, making it easier to compare and contrast their representational and operational characteristics. Computational properties, representational issues and learning are also discussed and a number of references to the relevant source publications are provided. It is argued that the proposed approach is simple and more powerful than the previous attempts, from a descriptive and predictive viewpoint. We also discuss the relation of this taxonomy with automata theory and state space modeling, and suggest directions for further work.
Principal Curves With Bounded Turn
, 2002
Abstract

Cited by 12 (0 self)
Principal curves, like principal components, are a tool used in multivariate analysis for ends like feature extraction. Defined in their original form, principal curves need not exist for general distributions. The existence of principal curves with bounded length for any distribution that satisfies some minimal regularity conditions has been shown. We define principal curves with bounded turn, show that they exist, and present a learning algorithm for them. Principal components are a special case of such curves when the turn is zero.
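As a rough illustration of the quantity being bounded, the sketch below computes the turn of a polygonal line, taking "turn" to mean the sum of the absolute angles between consecutive segments. That reading is an assumption made here, not a definition quoted from the paper, and the function name `total_turn` is hypothetical. Under this reading a straight line has zero turn, which matches the remark that principal components are the zero-turn special case.

```python
import math


def total_turn(vertices):
    """Sum of absolute turning angles (radians) along a polygonal line."""
    turn = 0.0
    for a, b, c in zip(vertices, vertices[1:], vertices[2:]):
        u = [q - p for p, q in zip(a, b)]  # incoming segment direction
        v = [q - p for p, q in zip(b, c)]  # outgoing segment direction
        norm_u = math.sqrt(sum(x * x for x in u))
        norm_v = math.sqrt(sum(x * x for x in v))
        if norm_u == 0.0 or norm_v == 0.0:
            continue  # simplification: skip corners with a zero-length segment
        cos_angle = sum(x * y for x, y in zip(u, v)) / (norm_u * norm_v)
        # Clamp to guard against rounding slightly outside [-1, 1].
        turn += math.acos(max(-1.0, min(1.0, cos_angle)))
    return turn


straight = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
corner = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
print(total_turn(straight), total_turn(corner))
```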