## Vision, Perception and Multimedia Understanding (2010)

### BibTeX

@MISC{Mairal10vision,
  author = {Julien Mairal and Rodolphe Jenatton and Guillaume Obozinski and Francis Bach and Équipe-projet Willow},
  title = {Vision, Perception and Multimedia Understanding},
  year = {2010}
}

### Abstract

Research report

### Citations

816 | Least angle regression - Efron, Hastie, et al. - 2004 |

558 | Model selection and estimation in regression with grouped variables
- Yuan, Lin
- 2006
Citation Context ...he Lasso [13]. When these coefficients are organized in groups, a penalty encoding this prior knowledge explicitly can improve the prediction performance and/or interpretability of the learned models [12, 14, 15, 16]. Such a penalty might for example take the form Ω(w) ≜ ∑_{g∈G} ηg max_{j∈g} |wj| = ∑_{g∈G} ηg‖wg‖∞, (2) where G is a set of groups of indices, wj... |
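The penalty in Eq. (2) is simple to evaluate directly. Below is a minimal sketch, with made-up groups and weights (the paper does not prescribe these values):

```python
# Sketch of the group penalty Omega(w) = sum_g eta_g * max_{j in g} |w_j|
# from Eq. (2); the vector, groups and weights below are illustrative.

def group_linf_penalty(w, groups, etas):
    """Sum of weighted l-infinity norms over (possibly overlapping) groups."""
    return sum(eta * max(abs(w[j]) for j in g) for g, eta in zip(groups, etas))

w = [0.5, -2.0, 0.0, 1.0]
groups = [[0, 1], [1, 2, 3]]   # overlapping groups of indices
etas = [1.0, 0.5]
print(group_linf_penalty(w, groups, etas))  # 1.0*2.0 + 0.5*2.0 = 3.0
```

Note that because the groups overlap, coordinate 1 contributes to both terms of the sum.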

550 | A new approach to the maximum flow problem
- Goldberg, Tarjan
- 1986
Citation Context ...m should be less than, or equal to, the sum of the constraints on the vectors ξg. The optimal vector γ therefore gives a lower bound ‖u − γ‖₂²/2 on the optimal cost. Then, the maximum-flow step [24] tries to find a feasible flow such that the vector ξ̄ matches γ. If ξ̄ = γ, then the cost of the flow reaches the lower bound, and the flow is optimal. If ξ̄ ≠ γ, the lower bound cannot be reached,... |
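The maximum-flow step this context refers to can be illustrated with a generic Edmonds-Karp sketch (shortest augmenting paths). This is a textbook algorithm, not the paper's dedicated solver, and the example graph is invented:

```python
# Generic Edmonds-Karp max-flow sketch; the graph below is a made-up
# example, not the canonical graph of the paper.
from collections import deque

def max_flow(capacity, s, t):
    """capacity: dict[u] -> dict[v] -> capacity; returns the max s-t flow value."""
    flow = 0
    # residual capacities, initialized from the input graph, plus zero reverse arcs
    res = {u: dict(vs) for u, vs in capacity.items()}
    for u, vs in capacity.items():
        for v in vs:
            res.setdefault(v, {}).setdefault(u, 0)
    while True:
        # BFS for a shortest augmenting path in the residual graph
        parent = {s: None}
        q = deque([s])
        while q and t not in parent:
            u = q.popleft()
            for v, c in res[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            return flow               # no augmenting path left: flow is maximal
        # bottleneck along the path, then augment
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(res[u][v] for u, v in path)
        for u, v in path:
            res[u][v] -= bottleneck
            res[v][u] += bottleneck
        flow += bottleneck

cap = {'s': {'a': 3, 'b': 2}, 'a': {'t': 2}, 'b': {'t': 3}, 't': {}}
print(max_flow(cap, 's', 't'))  # 4
```

When the returned flow falls short of the target, the final BFS has already computed the reachable set, which is exactly the V+ side of a minimum (s,t)-cut.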

415 | A fast iterative shrinkage-thresholding algorithm for linear inverse problems
- Beck, Teboulle
- 2009
Citation Context ... ℓ1-norm has emerged as a powerful tool for addressing this combinatorial variable selection problem, relying on both a well-developed theory (see [1] and references therein) and efficient algorithms [2, 3, 4]. The ℓ1-norm primarily encourages sparse solutions, regardless of the potential structural relationships (e.g., spatial, temporal or hierarchical) existing between the variables. Much effort has rece... |

370 | Robust face recognition via sparse representation
- Wright, Yang, et al.
Citation Context ...d of n pixels, we model y as a sparse linear combination of p other images X ∈ R^{n×p}, plus an error term e in R^n, i.e., y ≈ Xw + e for some sparse vector w in R^p. This approach is reminiscent of [30] in the context of face recognition, where e is further made sparse to deal with small occlusions. The term Xw accounts for background parts present in both y and X, while e contains specific, or fore... |

297 | Maximal flow through a network
- Ford, Fulkerson
- 1956
Citation Context ... ξ̄ matches γ. If ξ̄ = γ, then the cost of the flow reaches the lower bound, and the flow is optimal. If ξ̄ ≠ γ, the lower bound cannot be reached, and we construct a minimum (s,t)-cut of the graph [25] that defines two disjoint sets of nodes V+ and V−; V+ is the part of the graph that can potentially receive more flow from the source, whereas all arcs linking s to V− are saturated. The proper... |

207 | Gradient methods for minimizing composite objective function
- Nesterov
- 2007
Citation Context ... ℓ1-norm has emerged as a powerful tool for addressing this combinatorial variable selection problem, relying on both a well-developed theory (see [1] and references therein) and efficient algorithms [2, 3, 4]. The ℓ1-norm primarily encourages sparse solutions, regardless of the potential structural relationships (e.g., spatial, temporal or hierarchical) existing between the variables. Much effort has rece... |

189 | Simultaneous analysis of Lasso and Dantzig selector
- Bickel, Ritov, et al.
- 2007
Citation Context ... selected to describe the data. Regularization by the ℓ1-norm has emerged as a powerful tool for addressing this combinatorial variable selection problem, relying on both a well-developed theory (see [1] and references therein) and efficient algorithms [2, 3, 4]. The ℓ1-norm primarily encourages sparse solutions, regardless of the potential structural relationships (e.g., spatial, temporal or hierarc... |

158 | On implementing the push-relabel method for the maximum flow problem
- Cherkassky, Goldberg
- 1997
Citation Context ...N]. • Efficient max-flow algorithm: We have implemented the “push-relabel” algorithm of [24] to solve our max-flow problems, using classical heuristics that significantly speed it up in practice (see [24, 27]). Our implementation uses the so-called “highest-active” vertex selection rule, global and gap heuristics (see [24, 27]), and has a worst-case complexity of O(|V|² |E|^{1/2}) for a graph (V,E,s,t). This algorith... |

148 | Convex Analysis and Nonlinear Optimization: Theory and Examples
- Borwein, Lewis
- 2000
Citation Context ...larizations [5, 17, 28]. We use it here to monitor the convergence of the proximal method through a duality gap, and define a proper optimality criterion for problem (1). We denote by f* the Fenchel conjugate of f [29], defined by f*(κ) ≜ sup_z [z⊤κ − f(z)]. The duality gap for problem (1) can be derived from standard Fenchel duality arguments [29] and it is equal to f(w) + λΩ(w) + f*(−κ) for w, κ in R^p with Ω*(κ) ≤ λ. ... |
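As an illustration of such a duality gap, consider the ℓ1 special case (the paper's Ω is a group norm; the ℓ1-norm's dual norm is ℓ∞). Scaling the residual to make it dual-feasible is a standard construction; the variable names and data below are ours, not the paper's:

```python
# Duality-gap sketch for f(w) = 0.5*||Xw - y||^2 with Omega = l1.
# A dual-feasible point nu is built by rescaling the residual so that
# ||X^T nu||_inf <= lam; the gap is then primal minus dual objective.
import numpy as np

def lasso_duality_gap(X, y, w, lam):
    r = X @ w - y                                        # residual
    scale = min(1.0, lam / max(np.abs(X.T @ r).max(), 1e-12))
    nu = scale * r                                       # dual-feasible point
    primal = 0.5 * r @ r + lam * np.abs(w).sum()
    dual = -0.5 * nu @ nu - nu @ y
    return primal - dual                                 # >= 0; 0 at the optimum

rng = np.random.default_rng(0)
X = rng.standard_normal((20, 5))
y = rng.standard_normal(20)
w = np.zeros(5)
print(lasso_duality_gap(X, y, w, lam=0.1) >= 0)  # True
```

The gap upper-bounds the primal suboptimality, which is why it serves as a stopping criterion for the proximal method.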

127 | A Fast Parametric Maximum-Flow Algorithm and Applications
- Gallo, Grigoriadis, et al.
- 1989
Citation Context ...ng groups is more difficult. Hochbaum and Hong have shown in [19] that quadratic min-cost flow problems can be reduced to a specific parametric max-flow problem, for which an efficient algorithm exists [22].⁴ While this approach could be used to solve Eq. (4), it ignores the fact that our graphs have non-zero costs only on edges leading to the sink. To take advantage of this specificity, we propose the... |

117 | Group Lasso with overlap and graph Lasso
- Jacob, Obozinski, et al.
Citation Context ...between the variables. Much effort has recently been devoted to designing sparsity-inducing regularizations capable of encoding higher-order information about allowed patterns of non-zero coefficients [5, 6, 7, 8, 9], with successful applications in bioinformatics [6, 10], topic modeling [11] and computer vision [8]. By considering sums of norms of appropriate subsets, or groups, of variables, these regularizations control t... |

99 | Structured variable selection with sparsity-inducing norms - Jenatton, Audibert, et al. |

92 | Proximal splitting methods in signal processing, in Fixed-Point Algorithms for Inverse Problems in Science
- Combettes, Pesquet
- 2010
Citation Context ...hey are well suited to minimizing the sum f + λΩ of two convex terms, a smooth function f (continuously differentiable with Lipschitz-continuous gradient) and a potentially non-smooth function λΩ (see [18] and references therein). At each iteration, the function f is linearized at the current estimate w0 and the so-called proximal problem has to be solved: min_{w∈R^p} f(w0) + (w−w0)⊤∇f(w0) + λΩ(w) + (L/2)‖w−w0... |
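For Ω = ℓ1 the proximal problem described here has the closed-form soft-thresholding solution. The sketch below runs plain ISTA on a tiny least-squares problem; for the paper's structured Ω the prox step is the min-cost flow computation instead, and the data here are illustrative:

```python
# One proximal-gradient (ISTA) iteration for Omega = l1: linearize f at w0,
# then soft-threshold. The design matrix, target and lambda are made up.
import numpy as np

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista_step(w0, grad_f, L, lam):
    """argmin_w f(w0) + (w-w0)^T grad_f + lam*||w||_1 + (L/2)*||w-w0||^2."""
    return soft_threshold(w0 - grad_f / L, lam / L)

X = np.array([[1.0, 0.0], [0.0, 2.0]])
y = np.array([1.0, 0.1])
w = np.zeros(2)
L = np.linalg.norm(X.T @ X, 2)        # Lipschitz constant of grad f
for _ in range(200):
    w = ista_step(w, X.T @ (X @ w - y), L, lam=0.2)
print(w)   # converges to the lasso solution, here (0.8, 0)
```

On this separable problem the limit can be checked by hand: coordinate 0 is shrunk to 0.8 and coordinate 1 is driven exactly to zero by the threshold.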

84 | A Unified framework for high-dimensional analysis of M-Estimators with decomposable regularizers. Statis
- Negahban, Ravikumar, et al.
- 2012
Citation Context ...puted in linear time [21]. 3.3 Computation of the Dual Norm. The dual norm Ω* of Ω, defined for any vector κ in R^p by Ω*(κ) ≜ max_{Ω(z)≤1} z⊤κ, is a key quantity to study sparsity-inducing regularizations [5, 17, 28]. We use it here to monitor the convergence of the proximal method through a duality gap, and define a proper optimality criterion for problem (1). We denote by f* the Fenchel conjugate of f [29], defined by f*(κ)... |

82 | Exploring large feature spaces with hierarchical multiple kernel learning
- Bach
- 2009
Citation Context ... selected in groups rather than individually. When the groups overlap, Ω is still a norm and sets groups of variables to zero together [5]. The latter setting was first considered for hierarchies [7, 10, 17], and then extended to general group structures [5].¹ Solving Eq. (1) in this context becomes challenging and is the topic of this paper. Following [11], who tackled the case of hierarchical groups, we... |

76 | Proximal methods for sparse hierarchical dictionary learning
- Jenatton, Mairal, et al.
- 2010
Citation Context ...ing regularizations capable of encoding higher-order information about allowed patterns of non-zero coefficients [5, 6, 7, 8, 9], with successful applications in bioinformatics [6, 10], topic modeling [11] and computer vision [8]. By considering sums of norms of appropriate subsets, or groups, of variables, these regularizations control the sparsity patterns of the solutions. The underlying optimization problem is... |

75 | The composite absolute penalties family for grouped and hierarchical variable selection
- Zhao, Rocha, et al.
Citation Context ...between the variables. Much effort has recently been devoted to designing sparsity-inducing regularizations capable of encoding higher-order information about allowed patterns of non-zero coefficients [5, 6, 7, 8, 9], with successful applications in bioinformatics [6, 10], topic modeling [11] and computer vision [8]. By considering sums of norms of appropriate subsets, or groups, of variables, these regularizations control t... |

70 | Efficient projections onto the ℓ1-ball for learning in high dimensions
- Duchi, Shalev-Shwartz, et al.
- 2008
Citation Context ...9]. One of the simplest cases, where G contains a single group g as in Figure 1(a), can be solved by an orthogonal projection on the ℓ1-ball of radius ληg. It has been shown, both in machine learning [20] and operations research [19, 21], that such a projection can be done in O(p) operations. When the group structure is a tree as in Figure 1(d), strategies developed in the two communities are also sim... |
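The projection referred to here can be sketched with the simpler sort-based O(p log p) variant; the O(p) algorithms in the cited works replace the sort with median selection. The test vector is invented:

```python
# Sort-based Euclidean projection onto the l1-ball {x : ||x||_1 <= z}.
# O(p log p) due to the sort; the cited O(p) methods use median-finding.
import numpy as np

def project_l1_ball(v, z):
    if np.abs(v).sum() <= z:
        return v.copy()                          # already inside the ball
    u = np.sort(np.abs(v))[::-1]                 # magnitudes, descending
    css = np.cumsum(u)
    # largest index rho (0-based) with u[rho] > (css[rho] - z) / (rho + 1)
    rho = np.nonzero(u * np.arange(1, len(v) + 1) > css - z)[0][-1]
    theta = (css[rho] - z) / (rho + 1.0)         # soft-threshold level
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

v = np.array([0.5, -1.0, 2.0])
p = project_l1_ball(v, 1.0)
print(p, np.abs(p).sum())  # projection lands on the ball: l1 norm = 1.0
```

The projection is itself a soft-thresholding, with the threshold θ chosen so the result sits exactly on the ball's boundary.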

66 | The benefit of group sparsity
- Huang, Zhang
- 2010
Citation Context ...he Lasso [13]. When these coefficients are organized in groups, a penalty encoding this prior knowledge explicitly can improve the prediction performance and/or interpretability of the learned models [12, 14, 15, 16]. Such a penalty might for example take the form Ω(w) ≜ ∑_{g∈G} ηg max_{j∈g} |wj| = ∑_{g∈G} ηg‖wg‖∞, (2) where G is a set of groups of indices, wj... |

65 | Tree-guided group lasso for multi-task regression with structured sparsity
- Kim, Xing
- 2010
Citation Context ...designing sparsity-inducing regularizations capable of encoding higher-order information about allowed patterns of non-zero coefficients [5, 6, 7, 8, 9], with successful applications in bioinformatics [6, 10], topic modeling [11] and computer vision [8]. By considering sums of norms of appropriate subsets, or groups, of variables, these regularizations control the sparsity patterns of the solutions. The underlying op... |

61 | Learning with structured sparsity
- Huang, Zhang, et al.
- 2009
Citation Context ...between the variables. Much effort has recently been devoted to designing sparsity-inducing regularizations capable of encoding higher-order information about allowed patterns of non-zero coefficients [5, 6, 7, 8, 9], with successful applications in bioinformatics [6, 10], topic modeling [11] and computer vision [8]. By considering sums of norms of appropriate subsets, or groups, of variables, these regularizations control t... |

38 | An O(n) Algorithm for Quadratic Knapsack Problems
- Brucker
- 1984
Citation Context ..., where G contains a single group g as in Figure 1(a), can be solved by an orthogonal projection on the ℓ1-ball of radius ληg. It has been shown, both in machine learning [20] and operations research [19, 21], that such a projection can be done in O(p) operations. When the group structure is a tree as in Figure 1(d), strategies developed in the two communities are also similar [11, 19], and solve the prob... |

28 | Linear Network Optimization
- Bertsekas
- 1991
Citation Context ...nt sets of nodes V+ and V−; V+ is the part of the graph that can potentially receive more flow from the source, whereas all arcs linking s to V− are saturated. The properties of a min (s,t)-cut [26] imply that there are no arcs from V+ to V− (arcs inside V have infinite... ⁴By definition, a parametric max-flow problem consists in solving, for every value of a parameter, a max-flow problem on a g... |

11 | Collaborative hierarchical sparse modeling
- Sprechmann, Ramirez, et al.
- 2010
Citation Context ...e pixels have values in [0,1]); no significant improvements were observed for lower levels of noise. ⁸The simplified case where Ωtree and Ωjoint are the ℓ1- and mixed ℓ1/ℓ2-norms [14] corresponds to [31]. [Figure: denoising experiment, mean square error vs. dictionary size (0–400), for “Flat”, “Tree” and “Multi-task Tree”.] |

10 | About Strongly Polynomial Time Algorithms for Quadratic Optimization over Submodular Constraints
- Hochbaum, Hong
- 1995
Citation Context ... 3.2 Computation of the Proximal Operator. Quadratic min-cost flow problems have been well studied in the operations research literature [19]. One of the simplest cases, where G contains a single group g as in Figure 1(a), can be solved by an orthogonal projection on the ℓ1-ball of radius ληg. It has been shown, both in machine learning [2... |

7 | Experimental evaluation of a parametric flow algorithm
- Babenko, Goldberg
Citation Context ...y on edges leading to the sink. To take advantage of this specificity, we propose the dedicated Algorithm 1. Our method clearly shares some similarities with a simplified version of [22] presented in [23], namely a divide-and-conquer strategy. Nonetheless, we performed an empirical comparison, described in Appendix D, which shows that our dedicated algorithm has significantly better performance in prac... |

1 | Model-based compressive sensing
- Baraniuk, Cevher, et al. |

1 | The Group-Lasso for generalized linear models: uniqueness of solutions and efficient algorithms - Fischer - 2008 |

1 | Joint covariate selection and joint subspace selection for multiple classification problems
- Obozinski, Jordan
Citation Context ...he Lasso [13]. When these coefficients are organized in groups, a penalty encoding this prior knowledge explicitly can improve the prediction performance and/or interpretability of the learned models [12, 14, 15, 16]. Such a penalty might for example take the form Ω(w) ≜ ∑_{g∈G} ηg max_{j∈g} |wj| = ∑_{g∈G} ηg‖wg‖∞, (2) where G is a set of groups of indices, wj... |

1 | Convex Optimization
- Boyd, Vandenberghe
- 2004
Citation Context ...≤ αg, with the additional |G| conic constraints ‖zg‖∞ ≤ αg. This primal problem is convex and satisfies Slater’s conditions for generalized conic inequalities, which implies that strong duality holds [32]. We now consider the Lagrangian L defined as L(z,αg,τ,γg,ξ) = κ⊤z + τ(1 − ∑_{g∈G} ηgαg) + ∑_{g∈G} (αg, zg)⊤(γg, ... with the dual variables {τ, (γg)g∈G, ξ} ∈ R₊ × R^|G| × R^{p×|G|} such that for all g ∈ G, ξgj = 0... |