Results 1–10 of 89
Reidentification by relative distance comparison
In PAMI, 2013
Cited by 52 (9 self)
Abstract—Matching people across nonoverlapping camera views at different locations and different times, known as person reidentification, is both a hard and important problem for associating behavior of people observed in a large distributed space over a prolonged period of time. Person reidentification is fundamentally challenging because of the large visual appearance changes caused by variations in view angle, lighting, background clutter, and occlusion. To address these challenges, most previous approaches aim to model and extract distinctive and reliable visual features. However, seeking an optimal and robust similarity measure that quantifies a wide range of features against realistic viewing conditions from a distance is still an open and unsolved problem for person reidentification. In this paper, we formulate person reidentification as a relative distance comparison (RDC) learning problem in order to learn the optimal similarity measure between a pair of person images. This approach avoids treating all features indiscriminately and does not assume the existence of some universally distinctive and reliable features. To that end, a novel relative distance comparison model is introduced. The model is formulated to maximize the likelihood of a pair of true matches having a relatively smaller distance than that of a wrong match pair in a soft discriminant manner. Moreover, in order to maintain the tractability of the model in large scale learning, we further develop an ensemble RDC model. Extensive experiments on three publicly available benchmarking datasets are carried out to demonstrate the clear superiority of the proposed RDC models over related popular person reidentification techniques. The results also show that the new RDC models are more robust against visual appearance changes and less susceptible to model overfitting compared to other related existing models. 
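The core relative-distance idea can be sketched with a diagonal metric and a logistic comparison loss: push the distance of a true match pair below that of a wrong match pair in a soft, probabilistic way. This is a simplified illustration, not the paper's full RDC algorithm (which operates on ensembles of feature differences); the toy data, learning rate, and iteration count are all assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy pairs: dimension 0 separates identities, dimension 1 is noise.
# Each row is a feature-difference vector for a matched (pos) or
# mismatched (neg) pair of person images sharing the same probe.
pos_diff = rng.normal(0.0, 0.2, size=(200, 2))
neg_diff = rng.normal(0.0, 1.0, size=(200, 2))
neg_diff[:, 1] = rng.normal(0.0, 0.2, size=200)   # noise dim: pos and neg alike

w = np.ones(2)   # diagonal metric: d_w(x, y) = sum_k w_k (x_k - y_k)^2
for _ in range(300):
    d_pos = (pos_diff ** 2) @ w                   # distances of true pairs
    d_neg = (neg_diff ** 2) @ w                   # distances of wrong pairs
    # Soft relative comparison: minimize mean log(1 + exp(d_pos - d_neg)).
    s = 1.0 / (1.0 + np.exp(d_neg - d_pos))       # sigmoid(d_pos - d_neg)
    grad = ((pos_diff ** 2) - (neg_diff ** 2)).T @ s / len(s)
    w = np.maximum(w - 0.5 * grad, 0.0)           # keep weights non-negative

print(w)  # expect a larger weight on the discriminative dimension 0
```

The non-negativity projection keeps the learned quantity a valid (pseudo-)metric; the real model also handles the quadratic number of wrong-match pairs, which this toy loop ignores.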
Index Terms—Person reidentification, feature quantification, feature selection, relative distance comparison.
Discriminant locally linear embedding with high-order tensor data
IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 2008
Cited by 44 (12 self)
Abstract—Graph embedding, along with its linearization and kernelization, provides a general framework that unifies most traditional dimensionality reduction algorithms. From this framework, we propose a new manifold learning technique called discriminant locally linear embedding (DLLE), in which the local geometric properties within each class are preserved according to the locally linear embedding (LLE) criterion, and the separability between different classes is enforced by maximizing the margins between point pairs in different classes. To deal with the out-of-sample problem in visual recognition with vector input, the linear version of DLLE, i.e., linearization of DLLE (DLLE/L), is derived directly through the graph-embedding framework. Moreover, we propose its multilinear version, i.e., tensorization of DLLE, for the out-of-sample problem with high-order tensor input. Based on DLLE, a procedure for gait recognition is described. We conduct comprehensive experiments on both gait and face recognition and observe that: 1) DLLE, along with its linearization and tensorization, outperforms the related versions of linear discriminant analysis, and DLLE/L demonstrates greater effectiveness than the linearization of LLE; 2) algorithms based on tensor representations are generally superior to linear algorithms when dealing with intrinsically high-order data; and 3) for human gait recognition, DLLE/L generally obtains higher accuracy than state-of-the-art gait recognition algorithms on the standard University of South Florida gait database. Index Terms—Dimensionality reduction, face recognition, human gait recognition, manifold learning, tensor representation.
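The LLE reconstruction step that DLLE preserves within each class can be sketched as follows; neighbor selection, the between-class margin term, and the tensor extension are omitted, and the regularization constant is illustrative:

```python
import numpy as np

def lle_weights(X, i, idx):
    """Reconstruction weights of sample i from its neighbors idx (LLE step)."""
    Z = X[idx] - X[i]                             # shift neighbors to the query
    G = Z @ Z.T                                   # local Gram matrix
    G += 1e-3 * np.trace(G) * np.eye(len(idx))    # regularize near-singular G
    w = np.linalg.solve(G, np.ones(len(idx)))
    return w / w.sum()                            # affine weights, sum to one

rng = np.random.default_rng(7)
X = rng.random((10, 3))
w = lle_weights(X, 0, [1, 2, 3, 4])
print(w.sum())  # ~1 by construction
```

DLLE would compute these weights only among same-class neighbors and then solve an eigenproblem that also separates differently labeled pairs.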
Sparsity preserving projections with applications to face recognition
Pattern Recognition, 2010
Cited by 29 (3 self)
Abstract: Dimensionality reduction (DR) methods have commonly been used as a principled way to understand high-dimensional data such as face images. In this paper, we propose a new unsupervised DR method called Sparsity Preserving Projections (SPP). Unlike many existing techniques such as Locality Preserving Projections (LPP) and Neighborhood Preserving Embedding (NPE), where local neighborhood information is preserved during the DR procedure, SPP aims to preserve the sparse reconstructive relationships among the data, which is achieved by minimizing an L1-regularization-related objective function. The obtained projections are invariant to rotations, rescalings, and translations of the data and, more importantly, contain natural discriminating information even if no class labels are provided. Moreover, SPP chooses its neighborhood automatically and hence can be used more conveniently in practice than LPP and NPE. The feasibility and effectiveness of the proposed method are verified on three popular face databases (Yale, AR, and Extended Yale B) with promising results. Key words: Dimensionality reduction; sparse representation; compressive sensing; face recognition.
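The first stage of SPP, computing a sparse reconstruction of each sample from the remaining samples, can be sketched with a plain ISTA solver for the L1-regularized least-squares problem. The paper uses a modified L1 formulation, so treat this as an illustrative stand-in; the penalty weight and iteration count are assumptions:

```python
import numpy as np

def sparse_weights(A, x, lam=0.05, iters=500):
    """Solve min_s 0.5*||x - A s||^2 + lam*||s||_1 with ISTA.

    A: (d, m) dictionary (the other samples as columns), x: (d,) target.
    """
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    s = np.zeros(A.shape[1])
    for _ in range(iters):
        g = A.T @ (A @ s - x)              # gradient of the smooth part
        z = s - g / L
        s = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft-threshold
    return s

rng = np.random.default_rng(1)
A = rng.normal(size=(20, 30))
s_true = np.zeros(30)
s_true[[3, 7]] = [1.0, -2.0]               # planted 2-sparse reconstruction
x = A @ s_true
s = sparse_weights(A, x)
print(np.flatnonzero(np.abs(s) > 0.1))     # entries 3 and 7 dominate
```

SPP then seeks projections W that keep each projected sample close to the same sparse combination of its projected peers, which reduces to an eigenproblem.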
Subspace Learning from Image Gradient Orientations
2012
Cited by 16 (9 self)
We introduce the notion of subspace learning from image gradient orientations for appearance-based object recognition. As image data are typically noisy and the noise is substantially non-Gaussian, traditional subspace learning from pixel intensities often fails to reliably estimate the low-dimensional subspace of a given data population. We show that replacing pixel intensities with gradient orientations and the ℓ2 norm with a cosine-based distance measure offers, to some extent, a remedy to this problem. Within this framework, which we coin IGO (Image Gradient Orientations) subspace learning, we first formulate and study the properties of Principal Component Analysis of image gradient orientations (IGO-PCA). We then show its connection to previously proposed robust PCA techniques both theoretically and experimentally. Finally, we derive a number of other popular subspace learning techniques, namely Linear Discriminant Analysis (LDA), Locally Linear Embedding (LLE), and Laplacian Eigenmaps (LE). Experimental results show that our algorithms significantly outperform popular methods such as Gabor features and Local Binary Patterns and achieve state-of-the-art performance for difficult problems such as illumination- and occlusion-robust face recognition. In addition, the proposed IGO methods require the eigendecomposition of simple covariance matrices and are as computationally efficient as their corresponding ℓ2-norm intensity-based counterparts. Matlab code for the methods presented in this paper can be found at
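The IGO-PCA construction, mapping each pixel's gradient orientation φ to the unit complex number e^{iφ} and applying PCA to the resulting complex vectors, can be sketched as follows (the image sizes and the Gram-matrix shortcut are illustrative):

```python
import numpy as np

def igo_features(img):
    gy, gx = np.gradient(img.astype(float))
    phi = np.arctan2(gy, gx)               # gradient orientation per pixel
    return np.exp(1j * phi).ravel()        # one unit complex number per pixel

def igo_pca(images, k):
    Z = np.stack([igo_features(im) for im in images])   # n x d, complex
    Zc = Z - Z.mean(axis=0)
    # Gram-matrix trick (n << d), as in classical eigenfaces.
    G = Zc @ Zc.conj().T                   # n x n Hermitian
    w, V = np.linalg.eigh(G)
    V = V[:, ::-1][:, :k]                  # top-k eigenvectors
    U = Zc.conj().T @ V                    # d x k principal components
    return U / np.linalg.norm(U, axis=0)

rng = np.random.default_rng(2)
imgs = rng.random((8, 16, 16))             # stand-in for face images
U = igo_pca(imgs, 3)
print(U.shape)                             # (256, 3)
```

The cosine-based distance the paper advocates corresponds to the real part of the Hermitian inner product between such complex feature vectors.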
Discriminant graph structures for facial expression recognition
IEEE Transactions on Multimedia, 2008
Cited by 13 (3 self)
Abstract—In this paper, a series of advances in elastic graph matching for facial expression recognition are proposed. More specifically, a new technique for the selection of the most discriminant facial landmarks for every facial expression (discriminant expression-specific graphs) is applied. Furthermore, a novel kernel-based technique for discriminant feature extraction from graphs is presented. This feature extraction technique remedies some of the limitations of typical kernel Fisher discriminant analysis (KFDA), which provides a subspace of very limited dimensionality (i.e., one or two dimensions) in two-class problems. The proposed methods have been applied to the Cohn–Kanade database, on which very good performance has been achieved in a fully automatic manner. Index Terms—Elastic graph matching, expandable graphs, Fisher's linear discriminant analysis, kernel techniques.
Parzen Discriminant Analysis
Cited by 13 (1 self)
In this paper, we propose a nonparametric discriminant analysis method (making no assumption on the class distributions), called Parzen Discriminant Analysis (PDA). Through an investigation of nonparametric density estimation, we find that minimizing/maximizing the distances between each data sample and its nearby similar/dissimilar samples is equivalent to minimizing an upper bound of the Bayesian error rate. Based on this theoretical analysis, we define our criterion as maximizing the average local dissimilarity scatter with respect to a fixed average local similarity scatter. All local scatters are calculated in fixed-size local regions, resembling the idea of Parzen estimation. Experiments on the UCI machine learning repository show that our method outperforms other related neighbor-based nonparametric methods.
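The fixed-radius local scatters at the heart of this criterion can be sketched directly; the radius, regularizer, and toy data below are assumptions, and a real implementation would solve the resulting generalized eigenproblem more carefully:

```python
import numpy as np

def local_scatters(X, y, radius):
    """Similarity/dissimilarity scatter matrices from pairs inside a fixed radius."""
    d = X.shape[1]
    S_sim = np.zeros((d, d))
    S_dis = np.zeros((d, d))
    for i in range(len(X)):
        for j in range(i + 1, len(X)):
            diff = X[i] - X[j]
            if diff @ diff <= radius ** 2:        # fixed-size local region
                M = np.outer(diff, diff)
                if y[i] == y[j]:
                    S_sim += M                    # similar (same-class) pair
                else:
                    S_dis += M                    # dissimilar pair
    return S_sim, S_dis

rng = np.random.default_rng(3)
X = np.vstack([rng.normal([0, 0], [0.3, 1.0], size=(60, 2)),
               rng.normal([2, 0], [0.3, 1.0], size=(60, 2))])
y = np.array([0] * 60 + [1] * 60)

S_sim, S_dis = local_scatters(X, y, radius=2.5)
# Maximize dissimilar scatter against similar scatter: generalized eigenproblem.
evals, evecs = np.linalg.eig(np.linalg.solve(S_sim + 1e-6 * np.eye(2), S_dis))
w = np.real(evecs[:, np.argmax(np.real(evals))])
print(w)  # the discriminative direction should lean toward dimension 0
```

Only pairs inside the fixed radius contribute, which is what ties the criterion to Parzen-window density estimation rather than to k-nearest-neighbor counts.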
OCFS: Optimal Orthogonal Centroid Feature Selection for Text Categorization
In Proc. of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2005
Cited by 13 (0 self)
ABSTRACT: Text categorization is an important research area in many Information Retrieval (IR) applications. To save storage space and computation time in text categorization, efficient and effective algorithms for reducing the data before analysis are highly desired. Traditional techniques for this purpose can generally be classified into feature extraction and feature selection. Because of its efficiency, the latter is more suitable for text data such as web documents. However, many popular feature selection techniques, such as Information Gain (IG) and the χ²-test (CHI), are greedy in nature and thus may not be optimal according to some criterion. Moreover, the performance of these greedy methods may deteriorate when the retained data dimension is extremely low. In this paper, we propose an efficient optimal feature selection algorithm, called Orthogonal Centroid Feature Selection (OCFS), by optimizing the objective function of the Orthogonal Centroid (OC) subspace learning algorithm in a discrete solution space. Experiments on 20 Newsgroups (20NG), Reuters Corpus Volume 1 (RCV1), and Open Directory Project (ODP) data show that OCFS is consistently better than IG and CHI with less computation time, especially when the reduced dimension is extremely small.
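Optimizing a centroid-based subspace objective over a discrete solution space amounts to ranking features by how far class centroids sit from the global centroid. The scoring below is an assumed simplification of the paper's OC objective, shown on synthetic data with one planted discriminative feature:

```python
import numpy as np

def ocfs_scores(X, y):
    """Per-feature score: weighted squared gap between class and global centroids."""
    m = X.mean(axis=0)
    score = np.zeros(X.shape[1])
    for c in np.unique(y):
        Xc = X[y == c]
        score += (len(Xc) / len(X)) * (Xc.mean(axis=0) - m) ** 2
    return score

rng = np.random.default_rng(4)
n = 200
y = rng.integers(0, 2, size=n)
X = rng.normal(size=(n, 50))
X[:, 10] += 3.0 * y                        # plant one class-dependent feature
selected = np.argsort(ocfs_scores(X, y))[::-1][:5]
print(selected[0])                         # the planted feature, index 10
```

Because the score is computed per feature in closed form, selection is a single sort, which is where the claimed speed advantage over iterative greedy criteria comes from.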
Margin maximizing discriminant analysis
In Proceedings of the 15th European Conference on Machine Learning, 2004
Cited by 10 (2 self)
Abstract. We propose a new feature extraction method called Margin Maximizing Discriminant Analysis (MMDA), which seeks to extract features suitable for classification tasks. MMDA is based on the principle that an ideal feature should convey the maximum information about the class labels, and it should depend only on the geometry of the optimal decision boundary and not on those parts of the distribution of the input data that do not participate in shaping this boundary. Further, distinct feature components should convey unrelated information about the data. Two feature extraction methods are proposed for calculating the parameters of such a projection, and they are shown to yield equivalent results. The kernel mapping idea is used to derive nonlinear versions. Experiments with several real-world, publicly available data sets demonstrate that the new method yields competitive results.
IMMC: Incremental Maximum Margin Criterion
ACM SIGKDD Conference, 2004
Cited by 8 (3 self)
Subspace learning approaches have attracted much attention in academia recently. However, classical batch algorithms no longer meet the needs of applications with streaming or large-scale data. To address this, the Incremental Principal Component Analysis (IPCA) algorithm has been well established, but it is an unsupervised subspace learning approach and is not optimal for general classification tasks, such as face recognition and web document categorization. In this paper, we propose an incremental supervised subspace learning algorithm, called Incremental Maximum Margin Criterion (IMMC), to infer an adaptive subspace by optimizing the Maximum Margin Criterion. We also prove the convergence of the proposed algorithm. Experimental results on both synthetic and real-world datasets show that IMMC converges to a subspace similar to that of the batch approach.
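The streaming flavor of such an approach can be sketched with running mean estimates and an Oja-style update of the leading direction of the Maximum Margin Criterion matrix S_b − S_w. This is a loose illustration under assumed step sizes and toy data, not the paper's convergence-proved update:

```python
import numpy as np

rng = np.random.default_rng(5)
d = 5
w = rng.normal(size=d)
w /= np.linalg.norm(w)

gmean = np.zeros(d)                    # running global mean
means = np.zeros((2, d))               # running per-class means
counts = np.zeros(2)
n = 0

for t in range(3000):                  # one pass over a data stream
    c = int(rng.integers(0, 2))
    x = rng.normal(size=d)
    x[0] += 4.0 * c                    # the two classes differ along dim 0
    n += 1
    gmean += (x - gmean) / n
    counts[c] += 1
    means[c] += (x - means[c]) / counts[c]
    # One-sample estimate of (S_b - S_w) w, followed by an Oja-style step.
    b = means[c] - gmean               # between-class component
    s = x - means[c]                   # within-class component
    w += (b * (b @ w) - s * (s @ w)) / (t + 100)
    w /= np.linalg.norm(w)

print(abs(w[0]))  # expected to approach 1: the MMC direction is dim 0
```

Each sample updates the means and the direction estimate in O(d) time, which is the property that makes this style of algorithm usable on streams where batch eigendecomposition is not.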
From Manifold to Manifold: Geometry-Aware Dimensionality Reduction for SPD Matrices
Cited by 8 (1 self)
of any given curve under the geodesic distance δ_g and the Stein metric δ_S up to a scale of 2√2. The proof of this theorem follows several steps. We start with the definition of curve length and intrinsic metric. Without any assumption on differentiability, let (M, d) be a metric space. A curve in M is a continuous function γ: [0, 1] → M that joins the starting point γ(0) = x to the end point γ(1) = y. Definition 1. The length of a curve γ is the supremum of l(γ; {t_i}) over all possible partitions {t_i}, where 0 = t_0 < t_1 < · · · < t_{n−1} < t_n = 1 and l(γ; {t_i}) = Σ_i d(γ(t_i), γ(t_{i−1})). Definition 2. The intrinsic metric δ̂(x, y) on M is defined as the infimum of the lengths of all paths from x to y. Theorem 1 ([2]). If the intrinsic metrics induced by two metrics d_1 and d_2 are identical up to a scale ξ, then the length of any given curve is the same under both metrics up to ξ. Theorem 2 ([2]). If d_1(x, y) and d_2(x, y) are two metrics defined on a space M such that

lim_{d_1(x,y)→0} d_2(x, y) / d_1(x, y) = 1    (1)

uniformly (with respect to x and y), then their intrinsic metrics are identical. Therefore, here, we need to study the behavior of lim_{δ_S(X,Y)→0} δ_g²(X, Y) / δ_S²(X, Y).
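This limit can be checked numerically: for nearby SPD matrices, the ratio δ_g/δ_S approaches 2√2 (equivalently, δ_g²/δ_S² → 8). A minimal sketch, assuming the affine-invariant geodesic distance δ_g(X, Y) = ‖log(X^{-1/2} Y X^{-1/2})‖_F and the Stein metric δ_S²(X, Y) = log det((X+Y)/2) − ½ log det(XY); the matrices and perturbation size are illustrative:

```python
import numpy as np

def logm_spd(A):
    """Matrix logarithm of a symmetric positive-definite matrix."""
    w, V = np.linalg.eigh(A)
    return V @ np.diag(np.log(w)) @ V.T

def delta_g(X, Y):
    """Affine-invariant geodesic distance ||log(X^{-1/2} Y X^{-1/2})||_F."""
    w, V = np.linalg.eigh(X)
    Xmh = V @ np.diag(w ** -0.5) @ V.T           # X^{-1/2}
    M = Xmh @ Y @ Xmh
    M = (M + M.T) / 2                            # clean up round-off asymmetry
    return np.linalg.norm(logm_spd(M), 'fro')

def delta_s(X, Y):
    """Stein metric: sqrt(log det((X+Y)/2) - 0.5 log det(XY))."""
    s2 = (np.linalg.slogdet((X + Y) / 2)[1]
          - 0.5 * (np.linalg.slogdet(X)[1] + np.linalg.slogdet(Y)[1]))
    return np.sqrt(s2)

rng = np.random.default_rng(6)
B = rng.normal(size=(4, 4))
X = B @ B.T + 4 * np.eye(4)                      # a random SPD matrix
H = rng.normal(size=(4, 4))
H = (H + H.T) / 2                                # symmetric perturbation
Y = X + 1e-4 * H                                 # a nearby SPD matrix
print(delta_g(X, Y) / delta_s(X, Y))             # close to 2*sqrt(2) ~ 2.8284
```

A one-dimensional sanity check gives the same constant: for positive scalars x and y = x(1+ε), δ_g ≈ ε while δ_S² ≈ ε²/8, so δ_g/δ_S → 2√2.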