


Supervised feature selection in graphs with path coding penalties and network flows (2013)

by J Mairal, B Yu
Venue: JMLR


Results 1 - 9 of 9

Convex relaxation of combinatorial penalties

by Guillaume Obozinski, Francis Bach , 2011
Abstract (Cited by 12, 8 self): In this paper, we propose a unifying view of several recently proposed structured sparsity-inducing norms. We consider the situation of a model simultaneously (a) penalized by a set-function defined on the support of the unknown parameter vector, which represents prior knowledge on supports, and (b) regularized in ℓp-norm. We show that the natural combinatorial optimization problems obtained may be relaxed into convex optimization problems, and introduce a notion, the lower combinatorial envelope of a set-function, that characterizes the tightness of our relaxations. We moreover establish links with norms based on latent representations, including the latent group Lasso and block-coding, and with norms obtained from submodular functions.

Efficient RNA isoform identification and quantification from RNA-Seq data with network flows

by Elsa Bernard, Laurent Jacob, Julien Mairal , 2013
Abstract (Cited by 8, 3 self): Several state-of-the-art methods for isoform identification and quantification are based on sparse probabilistic models, such as Lasso regression. However, explicitly listing the (possibly exponentially) large set of candidate transcripts is intractable for genes with many exons. For this reason, existing approaches using sparse models are either restricted to genes with few exons, or only run the regression algorithm on a small set of pre-selected isoforms. We introduce a new technique called FlipFlop, which can efficiently tackle the sparse estimation problem on the full set of candidate isoforms by using network flow optimization. Our technique removes the need for a pre-selection step, leading to better isoform identification while keeping the computational cost low. Experiments with synthetic and real RNA-Seq data confirm that our approach is more accurate than alternative methods and one of the fastest available. Source code is freely available as an R package at
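The Lasso formulation that FlipFlop builds on can be illustrated with a minimal proximal-gradient (ISTA) sketch. This is a generic Lasso solver on a toy design matrix, not the network-flow reformulation that is the paper's actual contribution; the data and dimensions below are purely illustrative:

```python
import numpy as np

def lasso_ista(X, y, lam, n_iter=500):
    """Minimize 0.5*||y - X w||^2 + lam*||w||_1 by ISTA
    (gradient step on the smooth part, then soft-thresholding)."""
    n, p = X.shape
    L = np.linalg.norm(X, 2) ** 2          # Lipschitz constant of the gradient
    w = np.zeros(p)
    for _ in range(n_iter):
        g = X.T @ (X @ w - y)              # gradient of the least-squares term
        z = w - g / L
        w = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # prox of lam*||.||_1
    return w

# Toy design: 3 "candidate isoforms", only the first truly expressed.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([2.0, 0.0, 0.0]) + 0.01 * rng.normal(size=50)
w = lasso_ista(X, y, lam=1.0)              # w recovers a sparse estimate
```

The ℓ1 penalty drives the two inactive coefficients to (near) zero, which is the support-selection behavior the methods above rely on.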

Citation Context

...ariables. To do so, we show that the penalized likelihood maximization can be reformulated as a convex cost network flow problem, which can be solved efficiently (Ahuja et al., 1993; Bertsekas, 1998; Mairal and Yu, 2012). The paper is organized as follows. Section 2 introduces the statistical model (Section 3) and the penalized likelihood approach (Section 2.2) we follow. Our model is similar to the one used by Xia ...

BIOINFORMATICS ORIGINAL PAPER, 2014, pages 1–9, doi:10.1093/bioinformatics/btu317

by Elsa Bernard, Laurent Jacob, Julien Mairal , 2014
Abstract: Efficient RNA isoform identification and quantification from RNA-Seq data with network flows

Editor: U.N.Known

by Jin Yu, S. V. N. Vishwanathan, Simon Günter, Nicol N. Schraudolph , 804
Abstract: We extend the well-known BFGS quasi-Newton method and its limited-memory variant LBFGS to the optimization of nonsmooth convex objectives. This is done in a rigorous fashion by generalizing three components of BFGS to subdifferentials: the local quadratic model, the identification of a descent direction, and the Wolfe line search conditions. We apply the resulting subLBFGS algorithm to L2-regularized risk minimization with the binary hinge loss. To extend our algorithm to the multiclass and multilabel settings, we develop a new, efficient, exact line search algorithm. We prove its worst-case time complexity bounds, and show that it can also extend a recently developed bundle method to the multiclass and multilabel settings. We also apply the direction-finding component of our algorithm to L1-regularized risk minimization with logistic loss. In all these contexts our methods perform comparably to or better than specialized state-of-the-art solvers on a number of publicly available datasets. Open source software implementing our algorithms is freely available for download.
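The objective subLBFGS targets can be written down concretely. The following is a plain subgradient-descent sketch of L2-regularized hinge-loss risk minimization on synthetic data; it illustrates the nonsmooth objective only, not the quasi-Newton machinery the paper develops:

```python
import numpy as np

def hinge_subgradient_step(w, X, y, lam, lr):
    """One subgradient step on lam/2*||w||^2 + mean(max(0, 1 - y*(X@w))).
    At the hinge's kink we pick the zero element of the subdifferential."""
    margins = y * (X @ w)
    active = margins < 1.0                      # points violating the margin
    g = lam * w - (X[active].T @ y[active]) / len(y)
    return w - lr * g

# Separable toy data: the label is the sign of the first coordinate.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
y = np.where(X[:, 0] > 0, 1.0, -1.0)
w = np.zeros(2)
for _ in range(300):
    w = hinge_subgradient_step(w, X, y, lam=0.01, lr=0.1)
acc = np.mean(np.sign(X @ w) == y)              # training accuracy of sign(X @ w)
```

The plain subgradient method converges slowly on nonsmooth objectives, which is precisely the gap the subLBFGS construction is meant to address.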

Grenoble - Rhône-Alpes THEME

by Université Joseph Fourier
Abstract not found

Citation Context

...rmation divergences between distributions (see Figure 4). 6.2.2. Supervised Feature Selection in Graphs with Path Coding Penalties and Network Flows Participants: Julien Mairal, Bin Yu. In this paper [6], we consider supervised learning problems where the features are embedded in a graph, such as gene expressions in a gene network. In this context, it is of much interest to automatically select a sub...

FlipFlop: Fast Lasso-based Isoform Prediction as a Flow Problem

by Elsa Bernard, Laurent Jacob, Julien Mairal , 2013
Abstract: FlipFlop implements a fast method for de novo transcript discovery and abundance estimation from RNA-Seq data. It differs from Cufflinks by simultaneously performing the transcript discovery and quantitation tasks using a penalized maximum likelihood approach, which leads to improved precision/recall. Other software packages taking this approach have an exponential complexity in the number of exons in the gene. We use a novel algorithm based on the network flow formalism, which gives us a polynomial runtime. In practice, FlipFlop was shown to outperform penalized maximum likelihood based software packages in terms of speed, and to perform transcript discovery in less than half a second even for large genes.

Citation Context

...ariables. To do so, we show that the penalized likelihood maximization can be reformulated as a convex cost network flow problem, which can be solved efficiently [Ahuja et al., 1993, Bertsekas, 1998, Mairal and Yu, 2012]. For more detail about the statistical model and method, see Bernard et al. [2013] and references therein. 2 Software features FlipFlop takes aligned reads in sam format and offers the following fun...

Structured Sparsity with Group-Graph Regularization

by Xin-yu Dai, Jian-bing Zhang, Shu-jian Huang, Jia-jun Chen, Zhi-hua Zhou
Abstract: In many learning tasks with structural properties, structural sparsity methods help induce sparse models, usually leading to better interpretability and higher generalization performance. One popular approach is to use group sparsity regularization, which enforces sparsity on clustered groups of features, while another popular approach is to adopt graph sparsity regularization, which considers sparsity on the link structure of graph-embedded features. Both group and graph structural properties co-exist in many applications. However, group sparsity and graph sparsity have not yet been considered simultaneously. In this paper, we propose a g2-regularization that takes group and graph sparsity into joint consideration, and present an effective approach for its optimization. Experiments on both synthetic and real data show that enforcing group-graph sparsity leads to better performance than using group sparsity or graph sparsity alone.
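The group-sparsity half of such penalties is commonly handled via the proximal operator of the (non-overlapping) group Lasso norm, which zeroes out whole groups at once. A minimal sketch of that operator follows; it is a standard building block, not the paper's g2-optimization procedure:

```python
import numpy as np

def prox_group_lasso(w, groups, lam):
    """Proximal operator of lam * sum_g ||w_g||_2 over non-overlapping groups:
    each group's block is shrunk jointly, so a group is kept or zeroed whole."""
    out = w.copy()
    for g in groups:
        norm = np.linalg.norm(w[g])
        scale = max(0.0, 1.0 - lam / norm) if norm > 0 else 0.0
        out[g] = scale * w[g]
    return out

w = np.array([3.0, 4.0, 0.1, 0.2])
groups = [[0, 1], [2, 3]]
z = prox_group_lasso(w, groups, lam=1.0)
# First group (norm 5) survives, scaled by 0.8; second (norm ~0.22) is zeroed.
```

Because shrinkage acts on the group norm rather than on individual coefficients, feature selection happens at the group level, which is the behavior the abstract contrasts with graph sparsity.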

Bayesian Models for Structured Sparse Estimation via Set Cover Prior
Abstract: A number of priors have recently been developed for Bayesian estimation of sparse models. In many applications the variables are simultaneously relevant or irrelevant in groups, and appropriately modeling this correlation is important for improved sample efficiency. Although group sparse priors are also available, most of them are either limited to disjoint groups, do not infer sparsity at the group level, or fail to induce appropriate patterns of support in the posterior. In this paper we tackle this problem by proposing a new framework of prior for overlapped group sparsity. It follows a hierarchical generation from group to variable, allowing group-driven shrinkage and relevance inference. It is also connected with set cover complexity in its maximum a posteriori estimate. Analysis of the shrinkage profile and conditional dependency reveals favorable statistical behavior compared with existing priors. Experimental results also demonstrate its superior performance in sparse recovery and compressive sensing.

Citation Context

...n confirms the superiority of SCP in modeling group structured sparsity in comparison to GCP and MRF. SS performs worst as it completely ignores the structure. 5.3 Network Sparsity Following [34] and [42], we next investigate the network sparsity where each node is a variable and each edge constitutes a group (i.e. all groups have size 2). We tried on four network structures: Email (p = 1, 133, #edge=...

Spatiotemporal Context Awareness for Urban Traffic Modeling and Prediction: Sparse Representation Based Variable Selection

by Su Yang, Shixiong Shi, Xiaobing Hu, Minjie Wang
Abstract: Spatial-temporal correlations among the data play an important role in traffic flow prediction. Correspondingly, traffic modeling and prediction based on big data analytics emerges due to the city-scale interactions among traffic flows. A new methodology based on sparse representation is proposed to reveal the spatial-temporal dependencies among traffic flows, so as to simplify the correlations among traffic data for the prediction task at a given sensor. Three important findings are observed in the experiments: (1) Only traffic flows immediately prior to the present time affect the formation of current traffic flows, which implies the possibility of reducing the traditional high-order predictors to a first-order model. (2) The spatial context relevant to a given prediction task is more complex than what is assumed to exist locally, and can spread out to the whole city. (3) The spatial context varies with the target sensor undergoing prediction, and enlarges with the increment of the time lag for prediction. Because the scope of human mobility is subject to travel time, identifying the varying spatial context against time lag is crucial for prediction. Since sparse representation can capture the varying spatial context to adapt to the prediction task, it outperforms traditional methods whose inputs are confined to data from a fixed number of nearby sensors. As the spatial-temporal context for any prediction task is fully detected from the traffic data in an automated manner, with no additional information regarding network topology needed, it scales well to large networks.
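Sparse-representation-based variable selection of this kind can be sketched with greedy orthogonal matching pursuit on synthetic "sensor" data. OMP stands in here for a generic sparse coding step and is not necessarily the solver used in the paper; the sensor counts and coefficients are made up for illustration:

```python
import numpy as np

def omp(X, y, n_nonzero):
    """Greedy sparse coding: repeatedly pick the column most correlated
    with the residual, then refit by least squares on the chosen support."""
    residual, support = y.copy(), []
    for _ in range(n_nonzero):
        j = int(np.argmax(np.abs(X.T @ residual)))
        support.append(j)
        coef, *_ = np.linalg.lstsq(X[:, support], y, rcond=None)
        residual = y - X[:, support] @ coef
    w = np.zeros(X.shape[1])
    w[support] = coef
    return w

# 10 "sensors"; the target flow depends on only two of them (indices 3 and 7).
rng = np.random.default_rng(2)
X = rng.normal(size=(100, 10))
y = 1.5 * X[:, 3] - 2.0 * X[:, 7]
w = omp(X, y, n_nonzero=2)   # nonzero entries identify the relevant sensors
```

The nonzero support of `w` is the selected variable set: only the sensors that actually influence the target survive, which mirrors the abstract's point that the relevant spatial context is discovered from the data rather than fixed in advance.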

Citation Context

... [12][13], for traffic flow prediction, the role of global spatial contexts should be explored. In this study, we propose to make use of the sparse representation technique as a variable selection method [27] in exploring the spatial correlations among the traffic data of the whole city. The goal of sparse representation is to obtain as small as possible fitting error with as few as possible variables by ...

