Results 1 - 8 of 8
A scalable tree-based approach for joint object and pose recognition
- In Twenty-Fifth Conference on Artificial Intelligence (AAAI), 2011
Cited by 29 (6 self)
Recognizing possibly thousands of objects is a crucial capability for an autonomous agent to understand and interact with everyday environments. Practical object recognition comes in multiple forms: Is this a coffee mug? (category recognition). Is this Alice’s coffee mug? (instance recognition). Is the mug with the handle facing left or right? (pose recognition). We present a scalable framework, Object-Pose Tree, which efficiently organizes data into a semantically structured tree. The tree structure enables both scalable training and testing, allowing us to solve recognition over thousands of object poses in near real-time. Moreover, by simultaneously optimizing all three tasks, our approach outperforms standard nearest neighbor and 1-vs-all classifications, with large improvements on pose recognition. We evaluate the proposed technique on a dataset of 300 household objects collected using a Kinect-style 3D camera. Experiments demonstrate that our system
Robust principal component analysis based on maximum correntropy criterion
- IEEE Trans. Image Process., 2011
Cited by 10 (2 self)
Abstract: Principal component analysis (PCA) minimizes the mean square error (MSE) and is sensitive to outliers. In this paper, we present a new rotational-invariant PCA based on maximum correntropy criterion (MCC). A half-quadratic optimization algorithm is adopted to compute the correntropy objective. At each iteration, the complex optimization problem is reduced to a quadratic problem that can be efficiently solved by a standard optimization method. The proposed method exhibits the following benefits: 1) it is robust to outliers through the mechanism of MCC, which can be more theoretically solid than a heuristic rule based on MSE; 2) it requires no assumption about the zero-mean of data for processing and can estimate the data mean during optimization; and 3) its optimal solution consists of principal eigenvectors of a robust covariance matrix corresponding to the largest eigenvalues. In addition, kernel techniques are further introduced in the proposed method to deal with nonlinearly distributed data. Numerical results demonstrate that the proposed method can outperform robust rotational-invariant PCAs based on norm when outliers occur. Index Terms: Correntropy, half-quadratic optimization, principal component analysis (PCA), robust.
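The iteration described in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a Gaussian kernel of width `sigma`, uses each sample's squared reconstruction residual as the error fed to the kernel, and alternates between updating per-sample weights and solving a weighted PCA (weighted mean, weighted covariance, top eigenvectors). The function name `mcc_pca` and all of its parameters are hypothetical.

```python
import numpy as np

def mcc_pca(X, k, sigma=1.0, n_iter=20):
    """Iteratively reweighted PCA sketch in the spirit of half-quadratic MCC.

    X is (n_samples, n_features). Returns the estimated mean, a
    (n_features, k) orthonormal basis, and the final per-sample weights.
    """
    n = X.shape[0]
    w = np.ones(n)  # start from ordinary (unweighted) PCA
    for _ in range(n_iter):
        # Quadratic sub-problem: weighted mean and weighted covariance.
        mu = (w[:, None] * X).sum(axis=0) / w.sum()
        Xc = X - mu
        C = (Xc * w[:, None]).T @ Xc / w.sum()
        # eigh returns eigenvalues in ascending order; keep the k largest.
        _, vecs = np.linalg.eigh(C)
        U = vecs[:, -k:][:, ::-1]
        # Auxiliary-variable update: Gaussian kernel of the residual norm.
        resid = Xc - (Xc @ U) @ U.T
        err = (resid ** 2).sum(axis=1)
        w = np.exp(-err / (2.0 * sigma ** 2))
        w = np.maximum(w, 1e-12)  # assumption: floor to avoid a zero weight sum
    return mu, U, w
```

Samples far from the current subspace receive weights near zero, so they contribute little to the next mean and covariance estimate. The result depends on the choice of `sigma`, which this sketch leaves fixed.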
Nonnegative sparse coding for discriminative semi-supervised learning
- In: CVPR, 2011
Cited by 10 (1 self)
An informative and discriminative graph plays an important role in graph-based semi-supervised learning methods. This paper introduces a nonnegative sparse algorithm and its approximated algorithm based on the l0-l1 equivalence theory to compute the nonnegative sparse weights of a graph. We term the resulting graph the sparse probability graph (SPG). The nonnegative sparse weights in the graph naturally serve as clustering indicators, which benefits semi-supervised learning. More importantly, our approximation algorithm speeds up the computation of the nonnegative sparse coding, which remains a bottleneck in previous attempts at sparse nonnegative graph learning. It is also much more efficient than l1-norm sparsity techniques for learning large-scale sparse graphs. Finally, for discriminative semi-supervised learning, an adaptive label propagation algorithm is also proposed to iteratively predict the labels of data on the SPG. Promising experimental results show that the nonnegative sparse coding is efficient and effective for discriminative semi-supervised learning.
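The idea of nonnegative graph weights can be illustrated with a small sketch. It is not the paper's algorithm or its approximation: it assumes each sample is reconstructed from all other samples by plain nonnegative least squares, solved here with projected gradient descent, and that each row of weights is then normalized to sum to one. The function names `nnls_pg` and `sparse_probability_graph` are hypothetical.

```python
import numpy as np

def nnls_pg(A, b, n_iter=500):
    """Minimize ||A w - b||^2 subject to w >= 0 by projected gradient."""
    lipschitz = np.linalg.norm(A, 2) ** 2  # largest eigenvalue of A^T A
    w = np.zeros(A.shape[1])
    if lipschitz == 0.0:
        return w
    for _ in range(n_iter):
        grad = A.T @ (A @ w - b)
        w = np.maximum(0.0, w - grad / lipschitz)  # step, then clip at zero
    return w

def sparse_probability_graph(X, n_iter=500):
    """Row i holds nonnegative weights reconstructing X[i] from the other rows.

    X is (n_samples, n_features). Rows with a nonzero solution are
    normalized to sum to one; the diagonal stays zero.
    """
    n = X.shape[0]
    W = np.zeros((n, n))
    for i in range(n):
        others = np.delete(np.arange(n), i)
        w = nnls_pg(X[others].T, X[i], n_iter)
        total = w.sum()
        if total > 0:
            W[i, others] = w / total
    return W
```

The nonnegativity constraint tends to zero out many coefficients, which is why such weights can act as soft cluster indicators. For large graphs a dedicated solver would be preferable to this fixed-iteration loop.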
ROBUST VIEW TRANSFORMATION MODEL FOR GAIT RECOGNITION
Cited by 6 (0 self)
Recent gait recognition systems often suffer from challenges including viewing angle variation and large intra-class variations. To address these challenges, this paper presents a robust View Transformation Model for gait recognition. Based on the gait energy image, the proposed method establishes a robust view transformation model via robust principal component analysis. Partial least squares is used as the feature selection method. Compared with existing methods, the proposed method finds a shared, linearly correlated low-rank subspace, which makes the view transformation model robust to viewing angle variation and to changes in clothing and carrying conditions. Experimental results on the CASIA gait dataset show that the proposed method outperforms the other existing methods.
Detection of Occluded Face Image using Mean Based Weight Matrix and Support Vector Machine
- J. Computer Sci., 2012
Cited by 2 (2 self)
Abstract: Problem statement: Face occlusion is a very challenging problem in face recognition. The performance of a face recognition system decreases drastically in the presence of partial occlusion on the face. Balancing the extraction of discriminative features for accurate detection against the computational overhead of extracting them, which affects classification speed, remains a persistent problem. The objective of this study is to segment an occluded human face image into its non-occluded and occluded parts. In general, face detection relies on extracting special facial features. In the proposed study, a simplified algorithm to extract the features is developed. Approach: An algorithm that automatically detects the presence of occlusions on the face would be a useful tool to increase the performance of the system. The face image was preprocessed to enhance the input face images in order to reduce the loss of classification performance due to changes in facial appearance. The experiment also balances both illumination and facial expression changes. Results: In this study, a Mean Based Weight Matrix (MBWM) algorithm is proposed that improves performance by 4.25% over the LBP method. Conclusion: The proposed model has been tested on occluded face images with a dataset obtained from the MIT face database.
Two-Stage Nonnegative Sparse Representation for Large-Scale Face Recognition
Cited by 2 (1 self)
Abstract: This paper proposes a novel nonnegative sparse representation approach, called two-stage sparse representation (TSR), for robust face recognition on a large-scale database. Based on the divide and conquer strategy, TSR decomposes the procedure of robust face recognition into an outlier detection stage and a recognition stage. In the first stage, we propose a general multisubspace framework to learn a robust metric in which noise and outliers in image pixels are detected. Potential loss functions, including L1, L2,1, and correntropy, are studied. In the second stage, based on the learned metric and collaborative representation, we propose an efficient nonnegative sparse representation algorithm to find an approximation solution of sparse representation. According to the L1 ball theory in sparse representation, the approximated solution is unique and can be optimized efficiently. Then a filtering strategy is developed to avoid the computation of the sparse representation on the whole large-scale dataset. Moreover, theoretical analysis also gives the necessary condition for the nonnegative least squares technique to find a sparse solution. Extensive experiments on several public databases have demonstrated that the proposed TSR approach, in general, achieves better classification accuracy than the state-of-the-art sparse representation methods. More importantly, a significant reduction of computational costs is reached in comparison with the sparse representation classifier; this makes TSR more suitable for robust face recognition on a large-scale dataset. Index Terms: Correntropy, L1 regularization, large-scale, nonnegative sparse representation, robust face recognition.
Recovery of Corrupted Low-Rank Matrices via Half-Quadratic based Nonconvex Minimization
Cited by 1 (1 self)
Recovering arbitrarily corrupted low-rank matrices arises in computer vision applications, including bioinformatic data analysis and visual tracking. The methods used involve minimizing a combination of nuclear norm and l1 norm. We show that by replacing the l1 norm on error items with nonconvex M-estimators, exact recovery of densely corrupted low-rank matrices is possible. The robustness of the proposed method is guaranteed by the M-estimator theory. The multiplicative form of half-quadratic optimization is used to simplify the nonconvex optimization problem so that it can be efficiently solved by an iterative regularization scheme. Simulation results corroborate our claims and demonstrate the efficiency of our proposed method under tough conditions.
Object Recognition and Semantic Scene Labeling for RGB-D Data
- 2013
The availability of RGB-D (Kinect-like) cameras has led to an explosive growth of research on robot perception. RGB-D cameras provide high-resolution (640 × 480) synchronized videos of both color (RGB) and depth (D) at 30 frames per second. This dissertation demonstrates the thesis that combining RGB and depth at high frame rates is helpful for various recognition tasks including object recognition, object detection, and semantic scene labeling. We present the RGB-D Object Dataset, a large dataset of 250,000 RGB-D images of 300 objects in 51 categories, and 22 RGB-D videos of objects in indoor home and office environments. We introduce algorithms for object recognition in RGB-D images that perform category, instance, and pose recognition in a scalable manner. We also present HMP3D, an unsupervised feature learning approach for 3D point cloud data, and demonstrate that HMP3D can be used to learn hierarchies of features from different attributes including color, gradient, shape, and surface normal orientation. Finally, we present a scene labeling approach for scenes constructed from RGB-D videos. The approach uses features learned from both individual RGB-D images and 3D point clouds constructed from entire video