An introduction to variable and feature selection
Journal of Machine Learning Research, 2003
Cited by 431 (8 self)
Variable and feature selection have become the focus of much research in areas of application for which datasets with tens or hundreds of thousands of variables are available.
Robust face recognition via sparse representation (preprint)
IEEE Trans. Pattern Analysis and Machine Intelligence
Cited by 145 (18 self)
We consider the problem of automatically recognizing human faces from frontal views with varying expression and illumination, as well as occlusion and disguise. We cast the recognition problem as one of classifying among multiple linear regression models, and argue that new theory from sparse signal representation offers the key to addressing this problem. Based on a sparse representation computed by ℓ1-minimization, we propose a general classification algorithm for (image-based) object recognition. This new framework provides new insights into two crucial issues in face recognition: feature extraction and robustness to occlusion. For feature extraction, we show that if sparsity in the recognition problem is properly harnessed, the choice of features is no longer critical. What is critical, however, is whether the number of features is sufficiently large and whether the sparse representation is correctly computed. Unconventional features such as downsampled images and random projections perform just as well as conventional features such as Eigenfaces and Laplacianfaces, as long as the dimension of the feature space surpasses a certain threshold, predicted by the theory of sparse representation. This framework can handle errors due to occlusion and corruption uniformly, by exploiting the fact that these errors are often sparse with respect to the standard (pixel) basis. The theory of sparse representation helps predict how much occlusion the recognition algorithm can handle and how to choose the training images to maximize robustness to occlusion. We conduct extensive experiments on publicly available databases to verify the efficacy of the proposed algorithm, and corroborate the above claims.
Use of the Zero-Norm With Linear Models and Kernel Methods
2002
Cited by 85 (4 self)
We explore the use of the so-called zero-norm of the parameters of linear models in learning.
Low-Dimensional Linear Programming with Violations
In Proc. 43rd Annu. IEEE Sympos. Found. Comput. Sci., 2002
Cited by 43 (3 self)
Two decades ago, Megiddo and Dyer showed that linear programming in 2 and 3 dimensions (and subsequently, any constant number of dimensions) can be solved in linear time. In this paper, we consider linear programming with at most k violations: finding a point inside all but at most k of n given halfspaces. We give a simple algorithm in 2-d that runs in O((n + k ) log n) expected time; this is faster than earlier algorithms by Everett, Robert, and van Kreveld (1993) and Matousek (1994) and is probably near-optimal for all k n=2. A (theoretical) extension of our algorithm in 3-d runs in near O(n + k ) expected time. Interestingly, the idea is based on concave-chain decompositions (or covers) of the ( k)-level, previously used in proving combinatorial k-level bounds.
Hardness of learning halfspaces with noise
In Proceedings of the 47th Annual IEEE Symposium on Foundations of Computer Science, 2006
Cited by 23 (3 self)
Learning an unknown halfspace (also called a perceptron) from labeled examples is one of the classic problems in machine learning. In the noise-free case, when a halfspace consistent with all the training examples exists, the problem can be solved in polynomial time using linear programming. However, under the promise that a halfspace consistent with a fraction (1 − ε) of the examples exists (for some small constant ε > 0), it was not known how to efficiently find a halfspace that is correct on even 51% of the examples. Nor was a hardness result known that ruled out getting agreement on more than 99.9% of the examples. In this work, we close this gap in our understanding, and prove that even a tiny amount of worst-case noise makes the problem of learning halfspaces intractable in a strong sense. Specifically, for arbitrary ε, δ > 0, we prove that given a set of example-label pairs from the hypercube, a fraction (1 − ε) of which can be explained by a halfspace, it is NP-hard to find a halfspace that correctly labels a fraction (1/2 + δ) of the examples. The hardness result is tight since it is trivial to get agreement on 1/2 the examples. In learning theory parlance, we prove that weak proper agnostic learning of halfspaces is hard. This settles a question that was raised by Blum et al. in their work on learning halfspaces in the presence of random classification noise [10], and in some more recent works as well. Along the way, we also obtain a strong hardness result for another basic computational problem: solving a linear system over the rationals.
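The noise-free case mentioned in this abstract can be illustrated with a small sketch. Note that it swaps in the classic perceptron update rule rather than the linear-programming approach the abstract refers to; both find a consistent halfspace when one exists. The function names, the toy data set (the 3-bit hypercube labeled by majority), and the epoch cap are assumptions made for illustration only.

```python
def train_perceptron(examples, max_epochs=1000):
    """Find a halfspace consistent with linearly separable examples.

    examples: list of (x, y) pairs, x a tuple of numbers, y in {-1, +1}.
    Returns (weights, bias), or None if no consistent halfspace was
    found within max_epochs passes (an illustrative cap, not a bound
    taken from the paper).
    """
    dim = len(examples[0][0])
    weights = [0.0] * dim
    bias = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in examples:
            activation = sum(w * xi for w, xi in zip(weights, x)) + bias
            if y * activation <= 0:  # misclassified or on the boundary
                weights = [w + y * xi for w, xi in zip(weights, x)]
                bias += y
                mistakes += 1
        if mistakes == 0:
            return weights, bias
    return None


def agreement(weights, bias, examples):
    """Fraction of examples that the halfspace labels correctly."""
    correct = sum(
        1
        for x, y in examples
        if y * (sum(w * xi for w, xi in zip(weights, x)) + bias) > 0
    )
    return correct / len(examples)


# Toy data: points of the 3-bit hypercube, labeled +1 when at least two
# bits are set. This labeling is realized by a halfspace, so there is no noise.
EXAMPLES = [
    ((p, q, r), 1 if p + q + r >= 2 else -1)
    for p in (0, 1)
    for q in (0, 1)
    for r in (0, 1)
]

if __name__ == "__main__":
    model = train_perceptron(EXAMPLES)
    if model is not None:
        print("agreement:", agreement(model[0], model[1], EXAMPLES))
```

Once even a small fraction of the labels is corrupted adversarially, a loop like this need not terminate with a consistent halfspace, which is the regime the hardness result addresses.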
Robust principal component analysis: Exact recovery of corrupted low-rank matrices via convex optimization
Advances in Neural Information Processing Systems 22, 2009
Cited by 21 (3 self)
The supplementary material to the NIPS version of this paper [4] contains a critical error, which was discovered several days before the conference. Unfortunately, it was too late to withdraw the paper from the proceedings. Fortunately, since that time, a correct analysis of the proposed convex programming relaxation has been developed by Emmanuel Candes of Stanford University. That analysis is reported in a joint paper, Robust Principal Component Analysis? by Emmanuel Candes, Xiaodong Li, Yi Ma and John Wright,
Self Adapting Software for Numerical Linear Algebra and LAPACK for Clusters
Parallel Computing, 2003
Cited by 20 (15 self)
This article describes the context, design, and recent development of the LAPACK for Clusters (LFC) project. It has been developed in the framework of Self-Adapting Numerical Software (SANS) since we believe such an approach can deliver the convenience and ease of use of existing sequential environments bundled with the power and versatility of highly-tuned parallel codes that execute on clusters. Accomplishing this task is far from trivial as we argue in the paper by presenting pertinent case studies and possible usage scenarios.
Parsimonious Least Norm Approximation
1997
Cited by 18 (7 self)
A theoretically justifiable fast finite successive linear approximation algorithm is proposed for obtaining a parsimonious solution to a corrupted linear system Ax = b + p, where the corruption p is due to noise or error in measurement. The proposed linear-programming-based algorithm finds a solution x by parametrically minimizing the number of nonzero elements in x and the error ‖Ax − b − p‖₁. Numerical tests on a signal-processing-based example indicate that the proposed method is comparable to a method that parametrically minimizes the 1-norm of the solution x and the error ‖Ax − b − p‖₁, and that both methods are superior, by orders of magnitude, to solutions obtained by least squares as well as by combinatorially choosing an optimal solution with a specific number of nonzero elements. Keywords: minimal cardinality, least norm approximation. 1 Introduction. A wide range of important applications can be reduced to the problem of estimating a vector x by minim...
False Data Injection Attacks against State Estimation in Electric Power Grids
2009
Cited by 13 (0 self)
A power grid is a complex system connecting electric power generators to consumers through power transmission and distribution networks across a large geographical area. System monitoring is necessary to ensure the reliable operation of power grids, and state estimation is used in system monitoring to best estimate the power grid state through analysis of meter measurements and power system models. Various techniques have been developed to detect and identify bad measurements, including the interacting bad measurements introduced by arbitrary, non-random causes. At first glance, it seems that these techniques can also defeat malicious measurements injected by attackers. In this paper, we present a new class of attacks, called false data injection attacks, against state estimation in electric power grids. We show that an attacker can exploit the configuration of a power system to launch such attacks to successfully introduce arbitrary errors into certain state variables while bypassing existing techniques for bad measurement detection. Moreover, we look at two realistic attack scenarios, in which the attacker is either constrained to some specific meters (due to the physical protection of the meters), or limited in the resources required to compromise meters. We show that the attacker can systematically and efficiently construct attack vectors in both scenarios, which can not only change the results of state estimation, but also modify the results in arbitrary ways. We demonstrate the success of these attacks through simulation using IEEE test systems. Our results indicate that security protection of the electric power grid must be revisited when there are potentially malicious attacks.
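The idea of an injection that bypasses bad measurement detection can be sketched on a toy example. The sketch assumes a linear measurement model z = Hx + e, a least-squares state estimator, and a detector based on the norm of the measurement residual; the abstract does not spell out any of these, so treat them as assumptions. Under them, an attack vector of the form a = Hc shifts the estimate by c while leaving the residual unchanged. The matrix H, the noise values, and all function names are hypothetical.

```python
import math


def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def estimate_state(h, z):
    """Least-squares estimate of a two-variable state via the normal equations."""
    col0, col1 = list(zip(*h))
    # Entries of the 2x2 normal matrix H^T H and of the right-hand side H^T z.
    a11 = sum(v * v for v in col0)
    a12 = sum(u * v for u, v in zip(col0, col1))
    a22 = sum(v * v for v in col1)
    b1 = sum(u * v for u, v in zip(col0, z))
    b2 = sum(u * v for u, v in zip(col1, z))
    det = a11 * a22 - a12 * a12
    # Cramer's rule for the 2x2 system.
    return [(a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det]


def residual_norm(h, z):
    """Euclidean norm of z - H x_hat, the quantity a residual test inspects."""
    predicted = mat_vec(h, estimate_state(h, z))
    return math.sqrt(sum((zi - pi) ** 2 for zi, pi in zip(z, predicted)))


# Hypothetical system: four meters observing two state variables.
H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]]
TRUE_STATE = [1.0, 2.0]
NOISE = [0.01, -0.02, 0.015, 0.005]
Z_CLEAN = [m + e for m, e in zip(mat_vec(H, TRUE_STATE), NOISE)]

# Structured attack a = H c: the attacker picks the error c to inject.
C = [0.5, -0.3]
Z_STEALTHY = [zi + ai for zi, ai in zip(Z_CLEAN, mat_vec(H, C))]

# Unstructured attack for comparison: tamper with the first meter only.
Z_NAIVE = [Z_CLEAN[0] + 0.5] + Z_CLEAN[1:]

if __name__ == "__main__":
    print("clean residual:   ", residual_norm(H, Z_CLEAN))
    print("stealthy residual:", residual_norm(H, Z_STEALTHY))
    print("naive residual:   ", residual_norm(H, Z_NAIVE))
```

In this toy setup the structured attack leaves the residual exactly where it was, while tampering with a single meter inflates it, which is the kind of change a residual-based detector is designed to notice.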

