Results 1 - 10 of 13
Structured sparsity-inducing norms through submodular functions
- In Advances in Neural Information Processing Systems
, 2010
Cited by 14 (3 self)
Sparse methods for supervised learning aim at finding good linear predictors from as few variables as possible, i.e., with small cardinality of their supports. This combinatorial selection problem is often turned into a convex optimization problem by replacing the cardinality function by its convex envelope (tightest convex lower bound), in this case the ℓ1-norm. In this paper, we investigate more general set-functions than the cardinality, that may incorporate prior knowledge or structural constraints which are common in many applications: namely, we show that for nondecreasing submodular set-functions, the corresponding convex envelope can be obtained from its Lovász extension, a common tool in submodular analysis. This defines a family of polyhedral norms, for which we provide generic algorithmic tools (subgradients and proximal operators) and theoretical results (conditions for support recovery or high-dimensional inference). By selecting specific submodular functions, we can give a new interpretation to known norms, such as those based on rank-statistics or grouped norms with potentially overlapping groups; we also define new norms, in particular ones that can be used as non-factorial priors for supervised learning.
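The construction in this abstract can be illustrated with a minimal sketch (not the authors' code). It assumes a set-function F with F(∅) = 0 that is nondecreasing and submodular, and evaluates its Lovász extension at the vector of absolute values |w| using the standard sorting formula. The names `lovasz_extension` and `structured_norm` are hypothetical, chosen for illustration only.

```python
import numpy as np

def lovasz_extension(F, w):
    """Evaluate the Lovász extension of a set-function F at a nonnegative vector w.

    Assumption: F maps a frozenset of indices to a real number and F(empty set) = 0.
    """
    order = np.argsort(-w)          # indices sorted by decreasing value of w
    value, prev, prefix = 0.0, 0.0, []
    for j in order:
        prefix.append(int(j))
        cur = F(frozenset(prefix))  # F on the set of the largest components so far
        value += w[j] * (cur - prev)
        prev = cur
    return float(value)

def structured_norm(F, w):
    """Norm obtained by applying the Lovász extension of F to |w|."""
    return lovasz_extension(F, np.abs(np.asarray(w, dtype=float)))

# Two familiar special cases (illustrative choices of F):
card = lambda A: len(A)             # cardinality
nonempty = lambda A: min(len(A), 1) # 1 if the set is nonempty, else 0

w = [3.0, -1.0, 2.0]
l1 = structured_norm(card, w)       # coincides with the l1-norm of w
linf = structured_norm(nonempty, w) # coincides with the l-infinity norm of w
```

With the cardinality function the sketch recovers the ℓ1-norm, matching the convex-envelope statement in the abstract; other choices of F would give other polyhedral norms.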
Factorized Latent Spaces with Structured Sparsity
Cited by 6 (1 self)
Recent approaches to multi-view learning have shown that factorizing the information into parts that are shared across all views and parts that are private to each view could effectively account for the dependencies and independencies between the different input modalities. Unfortunately, these approaches involve minimizing non-convex objective functions. In this paper, we propose an approach to learning such factorized representations inspired by sparse coding techniques. In particular, we show that structured sparsity allows us to address the multi-view learning problem by alternately solving two convex optimization problems. Furthermore, the resulting factorized latent spaces generalize over existing approaches in that they allow having latent dimensions shared between any subset of the views instead of between all the views only. We show that our approach outperforms state-of-the-art methods on the task of human pose estimation.
Convex and network flow optimization for structured sparsity
- JMLR
Cited by 4 (2 self)
We consider a class of learning problems regularized by a structured sparsity-inducing norm defined as the sum of ℓ2- or ℓ∞-norms over groups of variables. Whereas much effort has been put in developing fast optimization techniques when the groups are disjoint or embedded in a hierarchy, we address here the case of general overlapping groups. To this end, we present two different strategies: on the one hand, we show that the proximal operator associated with a sum of ℓ∞-norms can be computed exactly in polynomial time by solving a quadratic min-cost flow problem, allowing the use of accelerated proximal gradient methods. On the other hand, we use proximal splitting techniques, and address an equivalent formulation with non-overlapping groups, but in higher dimension and with additional constraints. We propose efficient and scalable algorithms exploiting these two strategies, which are significantly faster than alternative approaches. We illustrate these methods with several problems such as CUR matrix factorization, multi-task learning of tree-structured dictionaries, background subtraction in video sequences, image denoising with wavelets, and topographic dictionary learning of natural image patches.
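To make the regularizer in this abstract concrete, here is a minimal sketch that evaluates a sum of ℓ2- or ℓ∞-norms over (possibly overlapping) groups, together with the well-known closed-form proximal operator for the easy case of disjoint ℓ2 groups (block soft-thresholding). It is not the network-flow or splitting algorithm of the paper, which targets overlapping groups; the function names, unit group weights, and example groups are assumptions made for illustration.

```python
import numpy as np

def group_norm(w, groups, p=2):
    """Sum over groups of the lp-norm of w restricted to each group (p = 2 or np.inf).

    Groups may overlap; unit weights per group are assumed.
    """
    w = np.asarray(w, dtype=float)
    return float(sum(np.linalg.norm(w[list(g)], ord=p) for g in groups))

def prox_disjoint_l2(w, groups, lam):
    """Proximal operator of lam * (sum of l2-norms) for DISJOINT groups only.

    Each block is shrunk toward zero by block soft-thresholding.
    Variables not covered by any group are left unchanged.
    """
    w = np.asarray(w, dtype=float)
    out = w.copy()
    for g in groups:
        idx = list(g)
        nrm = np.linalg.norm(w[idx])
        scale = max(0.0, 1.0 - lam / nrm) if nrm > 0 else 0.0
        out[idx] = scale * w[idx]
    return out

w = np.array([3.0, 4.0, 0.0, 1.0])
overlapping = [[0, 1], [1, 2, 3]]   # variable 1 belongs to both groups
disjoint = [[0, 1], [2, 3]]

penalty_l2 = group_norm(w, overlapping, p=2)
penalty_linf = group_norm(w, overlapping, p=np.inf)
shrunk = prox_disjoint_l2(w, disjoint, lam=1.0)
```

In the disjoint case a whole block is set to zero once its ℓ2-norm falls below `lam`, which is what produces group-level sparsity; with overlapping groups no such blockwise formula applies, which is the difficulty the abstract addresses.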
Sparse Coding for Learning Interpretable Spatio-Temporal Primitives
Cited by 1 (0 self)
Sparse coding has recently become a popular approach in computer vision to learn dictionaries of natural images. In this paper we extend the sparse coding framework to learn interpretable spatio-temporal primitives. We formulate the problem as a tensor factorization problem with tensor group norm constraints over the primitives, diagonal constraints on the activations that provide interpretability, and smoothness constraints that are inherent to human motion. We demonstrate the effectiveness of our approach in learning interpretable representations of human motion from motion capture data, and show that our approach outperforms recently developed matching pursuit and sparse coding algorithms.
2. Function space / norm Regularizations
, 2009
• Minimize with respect to function $f: X \to Y$: $\underbrace{\sum_{i=1}^{n} \ell(y_i, f(x_i))}_{\text{error on data}} + \lambda$
• Two theoretical/algorithmic issues:
• Supervised learning and regularization – Kernel methods vs. sparse methods
Order-preserving factor discovery from misaligned data.
We present a factor analysis method that accounts for possible temporal misalignment of the factor loadings across the population of samples. Our main hypothesis is that the data contains a subset of variables with similar but delayed profiles obeying a consistent precedence ordering relationship. Our model is motivated by the difficulty of gene expression analysis across subjects who have common patterns of immune response but show different onset times after a uniform inoculation time of a viral pathogen. The proposed method is based on a linear model with additional degrees of freedom that account for each subject’s inherent delays. We present an algorithm to fit this model in a totally unsupervised manner and demonstrate its effectiveness on extracting gene expression factors affecting host response using a flu-virus human challenge study dataset.
1 Order-preserving factor analysis (OPFA)
We present a novel factor analysis method that can be applied to discovery of common factors shared among trajectories in multivariate time series data. These factors satisfy a precedence-ordering property: certain factors are recruited only after some other factors are activated. Precedence orderings arise in applications where variables are activated in a specific order, which is unknown. The proposed method is based on a linear model that accounts for each factor’s inherent delays and relative order. We present an algorithm to fit the model in an unsupervised manner using techniques from convex and non-convex optimization that enforce sparsity of the factor scores and consistent precedence order of the factor loadings. We illustrate the OPFA method for the problem of extracting precedence-ordered factors from gene expression data.
Structured Sparse Canonical Correlation Analysis
In this paper, we propose to apply sparse canonical correlation analysis (sparse CCA) to an important genome-wide association study problem, eQTL mapping. Existing sparse CCA models do not incorporate structural information among variables such as pathways of genes. This work extends sparse CCA so that it can exploit either pre-given or unknown group structure via a structured-sparsity-inducing penalty. Such a structured penalty poses new challenges for optimization techniques. To address this challenge, by specializing the excessive gap framework, we develop a scalable primal-dual optimization algorithm with a fast rate of convergence. Empirical results show that the proposed optimization algorithm is more efficient than existing state-of-the-art methods. We also demonstrate the effectiveness of the structured sparse CCA on both simulated and genetic datasets.

