## Structured Sparse Principal Component Analysis

### BibTeX

@MISC{Jenatton_structuredsparseprincipal,
  author = {Rodolphe Jenatton and Guillaume Obozinski and Francis Bach},
  title  = {Structured Sparse Principal Component Analysis},
  year   = {}
}

### Abstract

We present an extension of sparse PCA, or sparse dictionary learning, where the sparsity patterns of all dictionary elements are structured and constrained to belong to a prespecified set of shapes. This structured sparse PCA is based on a structured regularization recently introduced by Jenatton et al. (2009). While classical sparse priors only deal with cardinality, the regularization we use encodes higher-order information about the data. We propose an efficient and simple optimization procedure to solve this problem. Experiments with two practical tasks, the denoising of sparse structured signals and face recognition, demonstrate the benefits of the proposed structured approach over unstructured approaches.
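The abstract's matrix factorization view of sparse PCA, min over U, V of (1/2np)‖X − UV^⊤‖_F² + λΩ(V), can be evaluated with a minimal pure-Python sketch. The function name, the unit group weights, and the group structure below are illustrative assumptions, not the authors' code:

```python
import math

def structured_pca_objective(X, U, V, groups, lam):
    """Evaluate (1/2np)*||X - U V^T||_F^2 + lam * sum_{k,G} ||V^k restricted to G||_2.
    X: n x p data (list of rows); U: n x r coefficients; V: p x r dictionary;
    groups: list of index sets over {0,...,p-1} encoding the structured penalty.
    Hypothetical simplification of the paper's Omega: unit weights on each group."""
    n, p = len(X), len(X[0])
    r = len(U[0])
    # Squared Frobenius norm of the residual X - U V^T.
    fit = 0.0
    for i in range(n):
        for j in range(p):
            pred = sum(U[i][k] * V[j][k] for k in range(r))
            fit += (X[i][j] - pred) ** 2
    fit /= 2.0 * n * p
    # Structured penalty: l2 norm of each dictionary element restricted to each group.
    pen = sum(math.sqrt(sum(V[j][k] ** 2 for j in G))
              for k in range(r) for G in groups)
    return fit + lam * pen
```

With an exact rank-1 factorization the data-fit term vanishes, and the penalty then measures only the group norms of the dictionary V.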

### Citations

988 | Spectral Graph Theory - Chung - 1997
Citation Context: ... ones, penalized respectively by (λ/2) Σ_{k=1}^r ([Ω_v(V^k)]² + [Ω_u(U^k)]²) and λ Σ_{k=1}^r Ω_v(V^k) Ω_u(U^k). Although we use the term convex informally here, it can however be made precise with the notion of convex subgraphs (Chung, 1997). The framework of Jenatton et al. (2009) can be summarized as follows: if we denote by G a subset of the power set of {1,...,p}, such that ⋃_{G∈G} G = {1,...,p}, we de... |

740 | Nonlinear Programming. Athena Scientific - Bertsekas - 1999
Citation Context: ...uence of problems that are convex in U for fixed V (and conversely, convex in V for fixed U). For this sequence of problems, we then present efficient optimization procedures based on block coordinate descent (BCD) (Bertsekas, 1995, Section 2.7). We describe these in detail in Algorithm 1. Note that we depart from the approach of Jenatton et al. (2009) who use an active set algorithm. Their approach does not indeed allow warm restarts, which is c... |
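The block coordinate descent scheme quoted above, alternating exact minimization over U with V fixed and over V with U fixed, can be sketched generically. This is a toy illustration of BCD on a function convex in each block, not Algorithm 1 from the paper:

```python
def block_coordinate_descent(argmin_u, argmin_v, u0, v0, iters=200):
    """Alternately minimize f(u, v) over each block while the other is held
    fixed (cf. Bertsekas, Nonlinear Programming, Section 2.7). Each callback
    solves its block subproblem exactly."""
    u, v = u0, v0
    for _ in range(iters):
        u = argmin_u(v)  # u <- argmin_u f(u, v)
        v = argmin_v(u)  # v <- argmin_v f(u, v)
    return u, v

# Toy example: f(u, v) = (u - 1)^2 + (v - 2)^2 + u*v, convex in each block.
# Setting each partial derivative to zero gives the closed-form block updates.
u_star, v_star = block_coordinate_descent(lambda v: 1.0 - v / 2.0,
                                          lambda u: 2.0 - u / 2.0,
                                          0.0, 0.0)
```

Here the stationarity conditions 2(u − 1) + v = 0 and 2(v − 2) + u = 0 have the unique fixed point (0, 2), which the alternating iteration approaches geometrically.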

303 | Model-based compressive sensing - Baraniuk, Cevher, et al.
Citation Context: ...tructured sparsity has highlighted the benefit of exploiting such structure in the context of regression and classification (Jenatton et al., 2009; Jacob et al., 2009; Huang et al., 2009), compressed sensing (Baraniuk et al., 2008), as well as within Bayesian frameworks (He and Carin, 2009). In particular, Jenatton et al. (2009) show that, given any intersection-closed family of patterns P of variables, such as all the rectangles on... |

203 | Efficient sparse coding algorithms - Lee, Battle, et al. - 2006 |

155 | Consistency of the group Lasso and multiple kernel learning - Bach
Citation Context: ...‖(V_i^k d_i^G)_{i∈G, k∈M}‖_2^α ]^{1/α}. (4) For the sake of clarity, we do not specify the dependence of ζ on (η^G)_{G∈G}. In fact, not surprisingly given that similar results hold for the group Lasso (Bach, 2008), it can be shown that the above extension is equivalent to the variational formulation min_{U, V, Ω_u(U^k)≤1, (η^G)_{G∈G} ∈ R_+^{|M|×|G|}} (1/2np)‖X − UV^⊤‖_F² + (λ/2) Σ_{M∈M} Σ_{k∈M} [ (V^k)^⊤ Diag(ζ^M)^{−1} V^k + ‖(η... |

96 | Learning the kernel function via regularization - Micchelli, Pontil |

95 | Structured variable selection with sparsity-inducing norms - Jenatton, Bach
Citation Context: ...of genes that are neighbors in a protein-protein interaction network. Recent research on structured sparsity has highlighted the benefit of exploiting such structure in the context of regression and classification (Jenatton et al., 2009; Jacob et al., 2009; Huang et al., 2009), compressed sensing (Baraniuk et al., 2008), as well as within Bayesian frameworks (He and Carin, 2009). In particular, Jenatton et al. (2009) show that, give... |

80 | Joint Covariate Selection and Joint Subspace Selection for Multiple Classification Problems - Obozinski, Taskar, et al.
Citation Context: ...nary elements V^k and V^{k′} share the same sparsity pattern is equivalent to imposing that V_i^k and V_i^{k′} are simultaneously zero or non-zero. Following the approach used for joint feature selection (Obozinski et al., 2009) where the ℓ1 norm is composed with an ℓ2 norm, we compose the norm Ω_α with the ℓ2 norm V_i^M = ‖(V_i^k)_{k∈M}‖_2 of all ith entries of each dictionary element of a class M of the partition M, leading t... |
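The composition described in this context, an outer structured sum over variables of the ℓ2 norm V_i^M = ‖(V_i^k)_{k∈M}‖_2 taken across the dictionary elements of a class M, can be sketched as follows. This is a simplified ℓ1/ℓ2 version with unit weights, not the full Ω_α composition from the paper:

```python
import math

def joint_pattern_penalty(V, classes):
    """Sum over classes M and variables i of ||(V_i^k)_{k in M}||_2.
    Because the inner l2 norm couples the kth entries across a class,
    the penalty encourages V_i^k and V_i^{k'} to be zero together,
    i.e. dictionary elements in a class share a sparsity pattern.
    V[i][k] is the ith entry of dictionary element k."""
    p = len(V)
    total = 0.0
    for M in classes:
        for i in range(p):
            total += math.sqrt(sum(V[i][k] ** 2 for k in M))
    return total
```

The outer sum acts like an ℓ1 norm over variables, so whole rows (V_i^k)_{k∈M} are driven to zero jointly rather than entry by entry.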

67 | Learning invariant features through topographic filter maps - Kavukcuoglu, Ranzato, et al. - 2009
Citation Context: ...over, although we focus in this work on controlling the structure of the dictionary V, we could instead impose structure on the decomposition coefficients U and study the induced effect on the dictionary V (Kavukcuoglu et al., 2009). This could be straightforward to do with the same formulation, by transposing the data matrix X. Finally, we intend to apply this structured sparsity-inducing regularization for multi-task learning, in o... |

57 | Learning with structured sparsity - Huang, Zhang, et al. - 2009
Citation Context: ...interaction network. Recent research on structured sparsity has highlighted the benefit of exploiting such structure in the context of regression and classification (Jenatton et al., 2009; Jacob et al., 2009; Huang et al., 2009), compressed sensing (Baraniuk et al., 2008), as well as within Bayesian frameworks (He and Carin, 2009). In particular, Jenatton et al. (2009) show that, given any intersection-closed family of patt... |

44 | Spectral bounds for sparse PCA: exact and greedy algorithms - Moghaddam, Weiss, et al. - 2006
Citation Context: ...t a time and perform a deflation of the covariance matrix at each step (see Mackey, 2009). The synthesis interpretation leads to non-convex global formulations (Zou et al., 2006; Mairal et al., 2009; Moghaddam et al., 2006; Lee et al., 2007) which estimate simultaneously all principal components, often drop the orthogonality constraints, and are referred to as matrix factorization problems (Singh and Gordon, 2008) in m... |

43 | Exploiting structure in wavelet-based Bayesian compressive sensing - He, Carin |

36 | Nonparametric Bayesian dictionary learning for analysis of noisy and incomplete images - Zhou, Chen, et al.
Citation Context: ...rk, we would like to investigate Bayesian frameworks that would define similar structured priors and allow the principled choice of the regularization parameter and the number of dictionary elements (Zhou et al., 2009). Moreover, although we focus in this work on controlling the structure of the dictionary V, we could instead impose structure on the decomposition coefficients U and study the induced effect on the diction... |

33 | A unified view of matrix factorization models - Singh, Gordon - 2008
Citation Context: ...2009; Moghaddam et al., 2006; Lee et al., 2007) which estimate simultaneously all principal components, often drop the orthogonality constraints, and are referred to as matrix factorization problems (Singh and Gordon, 2008) in machine learning, and dictionary learning in signal processing. The approach we propose fits more naturally in the framework of dictionary learning, whose terminology we now introduce. 2.1 Matrix... |

18 | Convex sparse matrix factorizations - Bach, Mairal, et al. - 2008 |

16 | Deflation methods for sparse PCA - Mackey
Citation Context: ...leads to sequential formulations (d’Aspremont et al., 2008; Moghaddam et al., 2006; Jolliffe et al., 2003) that consider components one at a time and perform a deflation of the covariance matrix at each step (see Mackey, 2009). The synthesis interpretation leads to non-convex global formulations (Zou et al., 2006; Mairal et al., 2009; Moghaddam et al., 2006; Lee et al., 2007) which estimate simultaneously all principal co... |

5 | Group Lasso with overlap and graph - Jacob, Obozinski, et al.
Citation Context: ... in a protein-protein interaction network. Recent research on structured sparsity has highlighted the benefit of exploiting such structure in the context of regression and classification (Jenatton et al., 2009; Jacob et al., 2009; Huang et al., 2009), compressed sensing (Baraniuk et al., 2008), as well as within Bayesian frameworks (He and Carin, 2009). In particular, Jenatton et al. (2009) show that, given any intersection-c... |

1 | Online dictionary learning for sparse coding - Mairal, Bach, et al. |

1 | A penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis - Witten, Tibshirani, et al. - 2009
Citation Context: ...ts or decomposition coefficients, the matrix product UV^⊤ is called a decomposition of X. Learning simultaneously the dictionary V and the decomposition U corresponds to a matrix factorization problem (see Witten et al., 2009, and references therein). As formulated by Bach et al. (2008) or Witten et al. (2009), it is natural, when learning a decomposition, to penalize or constrain some norms or quasi-norms of U and V, say ... |

1 | One-step sparse estimates in nonconcave penalized likelihood models - Li - 2008
Citation Context: ...here the unregularized problem, here dictionary learning, is itself non-convex. In light of recent work showing the advantages of addressing sparse problems through concave penalization (e.g., see Zou and Li, 2008), we therefore generalize Ω to a family of non-convex regularizers as follows: for α ∈ (0,1), we define the quasi-norm Ω_α for all vectors y ∈ R^p as Ω_α(y) = { Σ_{G∈G} ‖d^G ◦ y‖_2^α }^{1/α} = ‖(‖d^G ◦ y‖_2)_{G∈G}‖... |
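The quasi-norm quoted in this context, Ω_α(y) = (Σ_{G∈G} ‖d^G ◦ y‖_2^α)^{1/α}, can be computed directly from its definition. The groups and weight vectors below are illustrative inputs; α = 1 recovers the convex structured norm of Jenatton et al. (2009):

```python
import math

def omega_alpha(y, groups, weights, alpha):
    """Quasi-norm Omega_alpha(y) = (sum_G ||d^G o y||_2^alpha)^(1/alpha).
    groups: list of index sets G over {0,...,p-1};
    weights: matching list of full-length weight vectors d^G (zero outside G);
    alpha in (0, 1] -- alpha < 1 gives the non-convex, concave-penalized family."""
    total = 0.0
    for G, d in zip(groups, weights):
        # l2 norm of the elementwise product d^G o y restricted to group G.
        norm_g = math.sqrt(sum((d[j] * y[j]) ** 2 for j in G))
        total += norm_g ** alpha
    return total ** (1.0 / alpha)
```

With a single active group per term this reduces to the ordinary ℓ1/ℓ2 group norm at α = 1, while smaller α raises each group norm to a concave power before aggregating.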