Results 1  10
of
181
Guaranteed minimumrank solutions of linear matrix equations via nuclear norm minimization
, 2007
"... The affine rank minimization problem consists of finding a matrix of minimum rank that satisfies a given system of linear equality constraints. Such problems have appeared in the literature of a diverse set of fields including system identification and control, Euclidean embedding, and collaborative ..."
Abstract

Cited by 232 (15 self)
 Add to MetaCart
(Show Context)
The affine rank minimization problem consists of finding a matrix of minimum rank that satisfies a given system of linear equality constraints. Such problems have appeared in the literature of a diverse set of fields including system identification and control, Euclidean embedding, and collaborative filtering. Although specific instances can often be solved with specialized algorithms, the general affine rank minimization problem is NPhard, because it contains vector cardinality minimization as a special case. In this paper, we show that if a certain restricted isometry property holds for the linear transformation defining the constraints, the minimum rank solution can be recovered by solving a convex optimization problem, namely the minimization of the nuclear norm over the given affine space. We present several random ensembles of equations where the restricted isometry property holds with overwhelming probability, provided the codimension of the subspace is sufficiently large. The techniques used in our analysis have strong parallels in the compressed sensing framework. We discuss how affine rank minimization generalizes this preexisting concept and outline a dictionary relating concepts from cardinality minimization to those of rank minimization. We also discuss several algorithmic approaches to solving the norm minimization relaxations, and illustrate our results with numerical examples.
A learning theory approach to noninteractive database privacy
 In Proceedings of the 40th annual ACM symposium on Theory of computing
, 2008
"... In this paper we demonstrate that, ignoring computational constraints, it is possible to release synthetic databases that are useful for accurately answering large classes of queries while preserving differential privacy. Specifically, we give a mechanism that privately releases synthetic data usefu ..."
Abstract

Cited by 128 (14 self)
 Add to MetaCart
In this paper we demonstrate that, ignoring computational constraints, it is possible to release synthetic databases that are useful for accurately answering large classes of queries while preserving differential privacy. Specifically, we give a mechanism that privately releases synthetic data useful for answering a class of queries over a discrete domain with error that grows as a function of the size of the smallest net approximately representing the answers to that class of queries. We show that this in particular implies a mechanism for counting queries that gives error guarantees that grow only with the VCdimension of the class of queries, which itself grows at most logarithmically with the size of the query class. We also show that it is not possible to release even simple classes of queries (such as intervals and their generalizations) over continuous domains with worstcase utility guarantees while preserving differential privacy. In response to this, we consider a relaxation of the utility guarantee and give a privacy preserving polynomial time algorithm that for any halfspace query will provide an answer that is accurate for some small perturbation of the query. This algorithm does not release synthetic data, but instead another data structure capable of representing an answer for each query. We also give an efficient algorithm for releasing synthetic data for the class of interval queries and axisaligned rectangles of constant dimension over discrete domains. 1.
Approximate Nearest Neighbors and the Fast JohnsonLindenstrauss Transform
 STOC'06
, 2006
"... We introduce a new lowdistortion embedding of ℓ d 2 into O(log n) ℓp (p = 1, 2), called the FastJohnsonLindenstraussTransform. The FJLT is faster than standard random projections and just as easy to implement. It is based upon the preconditioning of a sparse projection matrix with a randomized F ..."
Abstract

Cited by 103 (8 self)
 Add to MetaCart
We introduce a new lowdistortion embedding of ℓ d 2 into O(log n) ℓp (p = 1, 2), called the FastJohnsonLindenstraussTransform. The FJLT is faster than standard random projections and just as easy to implement. It is based upon the preconditioning of a sparse projection matrix with a randomized Fourier transform. Sparse random projections are unsuitable for lowdistortion embeddings. We overcome this handicap by exploiting the “Heisenberg principle” of the Fourier transform, ie, its localglobal duality. The FJLT can be used to speed up search algorithms based on lowdistortion embeddings in ℓ1 and ℓ2. We consider the case of approximate nearest neighbors in ℓ d 2. We provide a faster algorithm using classical projections, which we then further speed up by plugging in the FJLT. We also give a faster algorithm for searching over the hypercube.
Improved approximation algorithms for large matrices via random projections
 in Proceedings of the 47th Annual Symposium on Foundations of Computer Science (FOCS
, 2006
"... ..."
(Show Context)
Euclidean distortion and the Sparsest Cut
 In Proceedings of the 37th Annual ACM Symposium on Theory of Computing
, 2005
"... BiLipschitz embeddings of finite metric spaces, a topic originally studied in geometric analysis and Banach space theory, became an integral part of theoretical computer science following work of Linial, London, and Rabinovich [29]. They presented an algorithmic version of a result of Bourgain [8] ..."
Abstract

Cited by 92 (21 self)
 Add to MetaCart
(Show Context)
BiLipschitz embeddings of finite metric spaces, a topic originally studied in geometric analysis and Banach space theory, became an integral part of theoretical computer science following work of Linial, London, and Rabinovich [29]. They presented an algorithmic version of a result of Bourgain [8] which shows that every
Feature hashing for large scale multitask learning
 In International Conference on Artificial Intelligence
"... Empirical evidence suggests that hashing is an effective strategy for dimensionality reduction and practical nonparametric estimation. In this paper we provide exponential tail bounds for feature hashing and show that the interaction between random subspaces is negligible with high probability. We d ..."
Abstract

Cited by 78 (17 self)
 Add to MetaCart
(Show Context)
Empirical evidence suggests that hashing is an effective strategy for dimensionality reduction and practical nonparametric estimation. In this paper we provide exponential tail bounds for feature hashing and show that the interaction between random subspaces is negligible with high probability. We demonstrate the feasibility of this approach with experimental results for a new use case — multitask learning with hundreds of thousands of tasks. 1.
FINDING STRUCTURE WITH RANDOMNESS: PROBABILISTIC ALGORITHMS FOR CONSTRUCTING APPROXIMATE MATRIX DECOMPOSITIONS
"... Lowrank matrix approximations, such as the truncated singular value decomposition and the rankrevealing QR decomposition, play a central role in data analysis and scientific computing. This work surveys and extends recent research which demonstrates that randomization offers a powerful tool for ..."
Abstract

Cited by 53 (1 self)
 Add to MetaCart
(Show Context)
Lowrank matrix approximations, such as the truncated singular value decomposition and the rankrevealing QR decomposition, play a central role in data analysis and scientific computing. This work surveys and extends recent research which demonstrates that randomization offers a powerful tool for performing lowrank matrix approximation. These techniques exploit modern computational architectures more fully than classical methods and open the possibility of dealing with truly massive data sets. This paper presents a modular framework for constructing randomized algorithms that compute partial matrix decompositions. These methods use random sampling to identify a subspace that captures most of the action of a matrix. The input matrix is then compressed—either explicitly or implicitly—to this subspace, and the reduced matrix is manipulated deterministically to obtain the desired lowrank factorization. In many cases, this approach beats its classical competitors in terms of accuracy, speed, and robustness. These claims are supported by extensive numerical experiments and a detailed error analysis. The specific benefits of randomized techniques depend on the computational environment. Consider the model problem of finding the k dominant components of the singular value decomposition
Distributed sparse random projections for refinable approximation
 In IEEE/ACM Int. Symposium on Information Processing in Sensor Networks (IPSN
, 2007
"... berkeley.edu Consider a largescale wireless sensor network measuring compressible data, where n distributed data values can be wellapproximated using only k ≪ n coefficients of some known transform. We address the problem of recovering an approximation of the n data values by querying any L sensor ..."
Abstract

Cited by 36 (5 self)
 Add to MetaCart
(Show Context)
berkeley.edu Consider a largescale wireless sensor network measuring compressible data, where n distributed data values can be wellapproximated using only k ≪ n coefficients of some known transform. We address the problem of recovering an approximation of the n data values by querying any L sensors, so that the reconstruction error is comparable to the optimal kterm approximation. To solve this problem, we present a novel distributed algorithm based on sparse random projections, which requires no global coordination or knowledge. The key idea is that the sparsity of the random projections greatly reduces the communication cost of preprocessing the data. Our algorithm allows the collector to choose the number of sensors to query according to the desired approximation error. The reconstruction quality depends only on the number of sensors queried, enabling robust refinable approximation.