Results 1–10 of 64
Convex multi-task feature learning
 MACHINE LEARNING
, 2007
Abstract

Cited by 145 (15 self)
We present a method for learning sparse representations shared across multiple tasks. This method is a generalization of the well-known single-task 1-norm regularization. It is based on a novel non-convex regularizer which controls the number of learned features common across the tasks. We prove that the method is equivalent to solving a convex optimization problem for which there is an iterative algorithm which converges to an optimal solution. The algorithm has a simple interpretation: it alternately performs a supervised and an unsupervised step, where in the former step it learns task-specific functions and in the latter step it learns common-across-tasks sparse representations for these functions. We also provide an extension of the algorithm which learns sparse nonlinear representations using kernels. We report experiments on simulated and real data sets which demonstrate that the proposed method can both improve the performance relative to learning each task independently and lead to a few learned features common across related tasks. Our algorithm can also be used, as a special case, to simply select – not learn – a few common variables across the tasks.
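The alternating supervised/unsupervised scheme described in this abstract can be given a minimal sketch. Everything below is an illustration, not the authors' implementation: the function name, the choice of per-task ridge losses, and the trace-normalized matrix-square-root update of the shared matrix D are assumptions.

```python
import numpy as np

def multitask_feature_learning(Xs, ys, gamma=1.0, n_iter=50, eps=1e-6):
    """Sketch of alternating minimization for multi-task feature learning.

    Supervised step: per-task ridge solve given the shared matrix D.
    Unsupervised step: D <- (W W^T)^{1/2} / trace((W W^T)^{1/2}), which
    concentrates weight on feature directions shared across tasks.
    """
    d = Xs[0].shape[1]
    D = np.eye(d) / d  # start from an uninformative shared structure
    for _ in range(n_iter):
        D_inv = np.linalg.inv(D + eps * np.eye(d))
        # supervised step: one regularized solve per task
        W = np.column_stack([
            np.linalg.solve(X.T @ X + gamma * D_inv, X.T @ y)
            for X, y in zip(Xs, ys)
        ])
        # unsupervised step: matrix square root of W W^T via eigendecomposition
        vals, vecs = np.linalg.eigh(W @ W.T)
        S = vecs @ np.diag(np.sqrt(np.clip(vals, 0, None))) @ vecs.T
        D = S / max(np.trace(S), eps)
    return W, D
```

On synthetic tasks whose true weight vectors share a single relevant feature, the diagonal of D ends up dominated by that feature, which is the "few learned features common across related tasks" behavior the abstract describes.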
Quantum Communication Complexity of Symmetric Predicates
 Izvestiya of the Russian Academy of Science, Mathematics
, 2002
Abstract

Cited by 95 (1 self)
We completely (that is, up to a logarithmic factor) characterize the bounded-error quantum communication complexity of every predicate f(x, y) (x, y ⊆ [n]) depending only on |x ∩ y|. Namely, for a predicate D on {0, 1, ..., n} let ℓ0(D) = max{ℓ | 1 ≤ ℓ ≤ n/2 ∧ D(ℓ) ≢ D(ℓ − 1)} and ℓ1(D) = max{n − ℓ | n/2 ≤ ℓ < n ∧ D(ℓ) ≢ D(ℓ + 1)}. Then the bounded-error quantum communication complexity of f_D(x, y) = D(|x ∩ y|) is equal (again, up to a logarithmic factor) to √(n ℓ0(D)) + ℓ1(D). In particular, the complexity of the set disjointness predicate is Θ(√n). This result holds both in the model with prior entanglement and without it.
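The jump quantities ℓ0(D) and ℓ1(D) in this abstract are easy to compute directly. The sketch below follows one plausible reading of the (extraction-damaged) set notation; the helper names and the `n // 2` convention for n/2 are assumptions made for illustration.

```python
def ell0(D, n):
    """max { l : 1 <= l <= n/2 and D(l) != D(l-1) }, taken as 0 if no jump."""
    return max((l for l in range(1, n // 2 + 1) if D(l) != D(l - 1)), default=0)

def ell1(D, n):
    """max { n - l : n/2 <= l < n and D(l) != D(l+1) }, taken as 0 if no jump."""
    return max((n - l for l in range(n // 2, n) if D(l) != D(l + 1)), default=0)
```

For set disjointness, D(ℓ) is true only at ℓ = 0, so the only jump is at ℓ = 1: ℓ0 = 1 and ℓ1 = 0, matching the Θ(√n) complexity mentioned above. A majority-style predicate D(ℓ) = [ℓ ≥ n/2] instead gives ℓ0 = n/2.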
Compressive Sensing and Structured Random Matrices
 RADON SERIES COMP. APPL. MATH XX, 1–95 © DE GRUYTER 20YY
Abstract

Cited by 64 (13 self)
These notes give a mathematical introduction to compressive sensing focusing on recovery using ℓ1-minimization and structured random matrices. An emphasis is put on techniques for proving probabilistic estimates for condition numbers of structured random matrices. Estimates of this type are key to providing conditions that ensure exact or approximate recovery of sparse vectors using ℓ1-minimization.
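The ℓ1-minimization recovery these notes study (basis pursuit: minimize ||x||_1 subject to Ax = y) can be cast as a linear program via the standard split x = u − v with u, v ≥ 0. This is a generic sketch, not taken from the notes; the function name is assumed, and it relies on SciPy's HiGHS LP solver.

```python
import numpy as np
from scipy.optimize import linprog

def l1_recover(A, y):
    """Basis pursuit: min ||x||_1 subject to A x = y, as a linear program.

    Split x = u - v with u, v >= 0, so ||x||_1 = 1^T (u + v) at the optimum.
    """
    m, n = A.shape
    c = np.ones(2 * n)                 # objective: sum of u and v entries
    A_eq = np.hstack([A, -A])          # A u - A v = y
    res = linprog(c, A_eq=A_eq, b_eq=y,
                  bounds=[(0, None)] * (2 * n), method="highs")
    u, v = res.x[:n], res.x[n:]
    return u - v
```

With a Gaussian measurement matrix and a vector sparse enough for the conditions the notes discuss, recovery is exact (up to solver tolerance).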
A spectral regularization framework for multi-task structure learning
 In NIPS
, 2008
Abstract

Cited by 50 (7 self)
Learning the common structure shared by a set of supervised tasks is an important practical and theoretical problem. Knowledge of this structure may lead to better generalization performance on the tasks and may also facilitate learning new tasks. We propose a framework for solving this problem, which is based on regularization with spectral functions of matrices. This class of regularization problems exhibits appealing computational properties and can be optimized efficiently by an alternating minimization algorithm. In addition, we provide a necessary and sufficient condition for convexity of the regularizer. We analyze concrete examples of the framework, which are equivalent to regularization with Lp matrix norms. Experiments on two real data sets indicate that the algorithm scales well with the number of tasks and improves on the state of the art in statistical performance.
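A concrete instance of "regularization with spectral functions of matrices" is the trace norm (the L1 norm of the singular values), whose proximal map soft-thresholds the spectrum. The sketch below is a generic illustration of that operation, not the paper's algorithm; the function name is assumed.

```python
import numpy as np

def prox_trace_norm(W, tau):
    """Proximal operator of tau * ||W||_* : soft-threshold the singular values.

    This is the basic building block of proximal methods for trace-norm
    regularized problems; thresholding shrinks the rank of W.
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt
```

With tau = 0 the matrix is unchanged; any tau exceeding the smallest singular value strictly reduces the rank, which is the low-rank shared-structure effect spectral regularizers are used for.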
Quantum algorithmic entropy
 in Proc. 16th IEEE Conf. Computational Complexity
, 2001
Abstract

Cited by 18 (0 self)
We extend algorithmic information theory to quantum mechanics, taking a universal semicomputable density matrix (“universal probability”) as a starting point, and define complexity (an operator) as its negative logarithm. A number of properties of Kolmogorov complexity extend naturally to the new domain. Approximately, a quantum state is simple if it is within a small distance from a low-dimensional subspace of low Kolmogorov complexity. The von Neumann entropy of a computable density matrix is within an additive constant from the average complexity. Some of the theory of randomness translates to the new domain. We explore the relations of the new quantity to the quantum Kolmogorov complexity defined by Vitányi (we show that the latter is sometimes as large as 2n − 2 log n) and the qubit complexity defined by Berthiaume, van Dam and Laplante. The “cloning” properties of our complexity measure are similar to those of qubit complexity.
A new approach to the modelling of local defects in crystals: the reduced Hartree-Fock case
 Commun. Math. Phys
Abstract

Cited by 17 (10 self)
This article is concerned with the derivation and the mathematical study of a new mean-field model for the description of interacting electrons in crystals with local defects. We work with a reduced Hartree-Fock model, obtained from the usual Hartree-Fock model by neglecting the exchange term. First, we recall the definition of the self-consistent Fermi sea of the perfect crystal, which is obtained as a minimizer of some periodic problem, as was shown by Catto, Le Bris and Lions. We also prove some of its properties which were not mentioned before. Then, we define and study in detail a nonlinear model for the electrons of the crystal in the presence of a defect. We use formal analogies between the Fermi sea of a perturbed crystal and the Dirac sea in Quantum Electrodynamics in the presence of an external electrostatic field. The latter was recently studied by Hainzl, Lewin, Séré and Solovej, based on ideas from Chaix and Iracane. This enables us to define the ground state of the self-consistent Fermi sea in the presence of a defect.
The Shannon-McMillan theorem for ergodic quantum lattice systems
, 2008
Abstract

Cited by 12 (0 self)
We formulate and prove a quantum Shannon-McMillan theorem. The theorem demonstrates the significance of the von Neumann entropy for translation invariant ergodic quantum spin systems on Z^ν-lattices: the entropy gives the logarithm of the essential number of eigenvectors of the system on large boxes. The one-dimensional case covers quantum information sources and is basic for coding theorems.
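The von Neumann entropy central to this theorem, S(ρ) = −Tr(ρ log ρ), reduces to the Shannon entropy of the eigenvalues of the density matrix ρ. A minimal sketch (the base-2 logarithm, giving entropy in qubits so that 2^S counts the "essential" eigenvectors, is a convention chosen here for illustration):

```python
import numpy as np

def von_neumann_entropy(rho):
    """S(rho) = -Tr(rho log2 rho), computed from the eigenvalues of rho."""
    vals = np.linalg.eigvalsh(rho)
    vals = vals[vals > 1e-12]  # drop numerical zeros; 0 log 0 = 0 by convention
    return float(-np.sum(vals * np.log2(vals)))
```

A pure state has entropy 0, while the maximally mixed qubit state I/2 has entropy 1 (one qubit, i.e. 2^1 = 2 essential eigenvectors).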
On spectral learning
 JOURNAL OF MACHINE LEARNING RESEARCH
, 2010
Abstract

Cited by 12 (1 self)
In this paper, we study the problem of learning a matrix W from a set of linear measurements. Our formulation consists in solving an optimization problem which involves regularization with a spectral penalty term. That is, the penalty term is a function of the spectrum of the covariance of W. Instances of this problem in machine learning include multi-task learning, collaborative filtering and multi-view learning, among others. Our goal is to elucidate the form of the optimal solution of spectral learning. The theory of spectral learning relies on the von Neumann characterization of orthogonally invariant norms and their association with symmetric gauge functions. Using this tool we formulate a representer theorem for spectral regularization and specialize it to several useful examples, such as Schatten p-norms, the trace norm and the spectral norm, which should prove useful in applications.
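The Schatten p-norms mentioned above are orthogonally invariant norms obtained by applying the vector ℓp norm to the singular values, so p = 1 gives the trace norm, p = 2 the Frobenius norm, and p = ∞ the spectral norm. A minimal sketch (the function name is assumed):

```python
import numpy as np

def schatten_norm(W, p):
    """Schatten p-norm of W: the l_p norm of its singular values."""
    s = np.linalg.svd(W, compute_uv=False)
    if np.isinf(p):
        return float(s.max())          # spectral norm
    return float(np.sum(s ** p) ** (1.0 / p))
```

For a diagonal matrix the singular values are just the absolute diagonal entries, which makes the special cases easy to check by hand.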
When is there a representer theorem? Vector vs matrix regularizers
 J. of Machine Learning Res
Abstract

Cited by 11 (1 self)
We consider a general class of regularization methods which learn a vector of parameters on the basis of linear measurements. It is well known that if the regularizer is a nondecreasing function of the inner product then the learned vector is a linear combination of the input data. This result, known as the representer theorem, is at the basis of kernel-based methods in machine learning. In this paper, we prove the necessity of the above condition, thereby completing the characterization of kernel methods based on regularization. We further extend our analysis to regularization methods which learn a matrix, a problem which is motivated by the application to multi-task learning. In this context, we study a more general representer theorem, which holds for a larger class of regularizers. We provide a necessary and sufficient condition for this class of matrix regularizers and highlight them with some concrete examples of practical importance. Our analysis uses basic principles from matrix theory.
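The representer theorem discussed in this abstract can be seen concretely in ridge regression, a standard example (not one singled out by the paper): the primal solution w = (XᵀX + λI)⁻¹Xᵀy equals Xᵀα for dual coefficients α = (XXᵀ + λI)⁻¹y, i.e. the learned vector is a linear combination of the input data.

```python
import numpy as np

def ridge_primal(X, y, lam):
    """Solve min ||X w - y||^2 + lam ||w||^2 directly in parameter space."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def ridge_dual(X, y, lam):
    """Same solution written as w = X^T alpha: a combination of the inputs."""
    n = X.shape[0]
    alpha = np.linalg.solve(X @ X.T + lam * np.eye(n), y)
    return X.T @ alpha
```

The two solves agree exactly (up to numerical precision) even when the number of features exceeds the number of examples, which is where the representer form pays off computationally.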