Results 1  10
of
254
Consistency of spectral clustering
, 2004
"... Consistency is a key property of statistical algorithms, when the data is drawn from some underlying probability distribution. Surprisingly, despite decades of work, little is known about consistency of most clustering algorithms. In this paper we investigate consistency of a popular family of spe ..."
Abstract

Cited by 286 (15 self)
 Add to MetaCart
Consistency is a key property of statistical algorithms, when the data is drawn from some underlying probability distribution. Surprisingly, despite decades of work, little is known about consistency of most clustering algorithms. In this paper we investigate consistency of a popular family of spectral clustering algorithms, which cluster the data with the help of eigenvectors of graph Laplacian matrices. We show that one of the two of major classes of spectral clustering (normalized clustering) converges under some very general conditions, while the other (unnormalized), is only consistent under strong additional assumptions, which, as we demonstrate, are not always satisfied in real data. We conclude that our analysis provides strong evidence for the superiority of normalized spectral clustering in practical applications. We believe that methods used in our analysis will provide a basis for future exploration of Laplacianbased methods in a statistical setting.
Consistency of the group lasso and multiple kernel learning
 JOURNAL OF MACHINE LEARNING RESEARCH
, 2007
"... We consider the leastsquare regression problem with regularization by a block 1norm, i.e., a sum of Euclidean norms over spaces of dimensions larger than one. This problem, referred to as the group Lasso, extends the usual regularization by the 1norm where all spaces have dimension one, where it ..."
Abstract

Cited by 162 (28 self)
 Add to MetaCart
We consider the leastsquare regression problem with regularization by a block 1norm, i.e., a sum of Euclidean norms over spaces of dimensions larger than one. This problem, referred to as the group Lasso, extends the usual regularization by the 1norm where all spaces have dimension one, where it is commonly referred to as the Lasso. In this paper, we study the asymptotic model consistency of the group Lasso. We derive necessary and sufficient conditions for the consistency of group Lasso under practical assumptions, such as model misspecification. When the linear predictors and Euclidean norms are replaced by functions and reproducing kernel Hilbert norms, the problem is usually referred to as multiple kernel learning and is commonly used for learning from heterogeneous data sources and for non linear variable selection. Using tools from functional analysis, and in particular covariance operators, we extend the consistency results to this infinite dimensional case and also propose an adaptive scheme to obtain a consistent model estimate, even when the necessary condition required for the non adaptive scheme is not satisfied.
Image Parsing: Unifying Segmentation, Detection, and Recognition
, 2005
"... In this paper we present a Bayesian framework for parsing images into their constituent visual patterns. The parsing algorithm optimizes the posterior probability and outputs a scene representation in a "parsing graph", in a spirit similar to parsing sentences in speech and natural language. The ..."
Abstract

Cited by 160 (18 self)
 Add to MetaCart
In this paper we present a Bayesian framework for parsing images into their constituent visual patterns. The parsing algorithm optimizes the posterior probability and outputs a scene representation in a "parsing graph", in a spirit similar to parsing sentences in speech and natural language. The algorithm constructs the parsing graph and reconfigures it dynamically using a set of reversible Markov chain jumps. This computational framework integrates two popular inference approaches  generative (topdown) methods and discriminative (bottomup) methods. The former formulates the posterior probability in terms of generative models for images defined by likelihood functions and priors. The latter computes discriminative probabilities based on a sequence (cascade) of bottomup tests/filters.
Impact of human mobility on the design of opportunistic forwarding algorithms
 In Proc. IEEE Infocom
, 2006
"... Abstract — Studying transfer opportunities between wireless devices carried by humans, we observe that the distribution of the intercontact time, that is the time gap separating two contacts of the same pair of devices, exhibits a heavy tail such as one of a power law, over a large range of value. ..."
Abstract

Cited by 155 (10 self)
 Add to MetaCart
Abstract — Studying transfer opportunities between wireless devices carried by humans, we observe that the distribution of the intercontact time, that is the time gap separating two contacts of the same pair of devices, exhibits a heavy tail such as one of a power law, over a large range of value. This observation is confirmed on six distinct experimental data sets. It is at odds with the exponential decay implied by most mobility models. In this paper, we study how this new characteristic of human mobility impacts a class of previously proposed forwarding algorithms. We use a simplified model based on the renewal theory to study how the parameters of the distribution impact the delay performance of these algorithms. We make recommendation for the design of well founded opportunistic forwarding algorithms, in the context of human carried devices. I.
Mathematical Models for Local Nontexture Inpaintings
 SIAM J. Appl. Math
, 2002
"... Inspired by the recent work of Bertalmio et al. on digital inpaintings [SIGGRAPH 2000], we develop general mathematical models for local inpaintings of nontexture images. On smooth regions, inpaintings are connected to the harmonic and biharmonic extensions, and inpainting orders are analyzed. For i ..."
Abstract

Cited by 148 (30 self)
 Add to MetaCart
Inspired by the recent work of Bertalmio et al. on digital inpaintings [SIGGRAPH 2000], we develop general mathematical models for local inpaintings of nontexture images. On smooth regions, inpaintings are connected to the harmonic and biharmonic extensions, and inpainting orders are analyzed. For inpaintings involving the recovery of edges, we study a variational model that is closely connected to the classical total variation (TV) denoising model of Rudin, Osher, and Fatemi [PhSG D, 60 (1992), pp. 259268]. Other models are also discussed based on the MumfordShah regularity [Comm. Pure Appl. Mathq XLII (1989), pp. 577685] and curvature driven di#usions (CDD) of Chan and Shen [J. Visual Comm. Image Rep., 12 (2001)]. The broad applications of the inpainting models are demonstrated through restoring scratched old photos, disocclusion in vision analysis, text removal, digital zooming, and edgebased image coding.
MAP estimation via agreement on trees: Messagepassing and linear programming
, 2002
"... We develop and analyze methods for computing provably optimal maximum a posteriori (MAP) configurations for a subclass of Markov random fields defined on graphs with cycles. By decomposing the original distribution into a convex combination of treestructured distributions, we obtain an upper bound ..."
Abstract

Cited by 132 (8 self)
 Add to MetaCart
We develop and analyze methods for computing provably optimal maximum a posteriori (MAP) configurations for a subclass of Markov random fields defined on graphs with cycles. By decomposing the original distribution into a convex combination of treestructured distributions, we obtain an upper bound on the optimal value of the original problem (i.e., the log probability of the MAP assignment) in terms of the combined optimal values of the tree problems. We prove that this upper bound is tight if and only if all the tree distributions share an optimal configuration in common. An important implication is that any such shared configuration must also be a MAP configuration for the original distribution. Next we develop two approaches to attempting to obtain tight upper bounds: (a) a treerelaxed linear program (LP), which is derived from the Lagrangian dual of the upper bounds; and (b) a treereweighted maxproduct messagepassing algorithm that is related to but distinct from the maxproduct algorithm. In this way, we establish a connection between a certain LP relaxation of the modefinding problem, and a reweighted form of the maxproduct (minsum) messagepassing algorithm.
The effect of network topology on the spread of epidemics
 IN IEEE INFOCOM
, 2005
"... Many network phenomena are well modeled as spreads of epidemics through a network. Prominent examples include the spread of worms and email viruses, and, more generally, faults. Many types of information dissemination can also be modeled as spreads of epidemics. In this paper we address the question ..."
Abstract

Cited by 115 (8 self)
 Add to MetaCart
Many network phenomena are well modeled as spreads of epidemics through a network. Prominent examples include the spread of worms and email viruses, and, more generally, faults. Many types of information dissemination can also be modeled as spreads of epidemics. In this paper we address the question of what makes an epidemic either weak or potent. More precisely, we identify topological properties of the graph that determine the persistence of epidemics. In particular, we show that if the ratio of cure to infection rates is smaller than the spectral radius of the graph, then the mean epidemic lifetime is of order log n, where n is the number of nodes. Conversely, if this ratio is bigger than a generalization of the isoperimetric constant of the graph, then the mean epidemic lifetime is of order � Ò�, for a positive constant �. We apply these results to several network topologies including the hypercube, which is a representative connectivity graph for a distributed hash table, the complete graph, which is an important connectivity graph for BGP, and the power law graph, of which the ASlevel Internet graph is a prime example. We also study the star topology and the ErdősRényi graph as their epidemic spreading behaviors determine the spreading behavior of power law graphs.
MAP estimation via agreement on (hyper)trees: Messagepassing and linear programming approaches
 IEEE Transactions on Information Theory
, 2002
"... We develop an approach for computing provably exact maximum a posteriori (MAP) configurations for a subclass of problems on graphs with cycles. By decomposing the original problem into a convex combination of treestructured problems, we obtain an upper bound on the optimal value of the original ..."
Abstract

Cited by 107 (11 self)
 Add to MetaCart
We develop an approach for computing provably exact maximum a posteriori (MAP) configurations for a subclass of problems on graphs with cycles. By decomposing the original problem into a convex combination of treestructured problems, we obtain an upper bound on the optimal value of the original problem (i.e., the log probability of the MAP assignment) in terms of the combined optimal values of the tree problems. We prove that this upper bound is met with equality if and only if the tree problems share an optimal configuration in common. An important implication is that any such shared configuration must also be a MAP configuration for the original problem. Next we present and analyze two methods for attempting to obtain tight upper bounds: (a) a treereweighted messagepassing algorithm that is related to but distinct from the maxproduct (minsum) algorithm; and (b) a treerelaxed linear program (LP), which is derived from the Lagrangian dual of the upper bounds. Finally, we discuss the conditions that govern when the relaxation is tight, in which case the MAP configuration can be obtained. The analysis described here generalizes naturally to convex combinations of hypertreestructured distributions.
Finite Markov Chains and Algorithmic Applications
 IN LONDON MATHEMATICAL SOCIETY STUDENT TEXTS
, 2001
"... ..."
Fastest Mixing Markov Chain on A Graph
 SIAM REVIEW
, 2003
"... We consider a symmetric random walk on a connected graph, where each edge is labeled with the probability of transition between the two adjacent vertices. The associated Markov chain has a uniform equilibrium distribution; the rate of convergence to this distribution, i.e. the mixing rate of the Mar ..."
Abstract

Cited by 90 (15 self)
 Add to MetaCart
We consider a symmetric random walk on a connected graph, where each edge is labeled with the probability of transition between the two adjacent vertices. The associated Markov chain has a uniform equilibrium distribution; the rate of convergence to this distribution, i.e. the mixing rate of the Markov chain, is determined by the second largest (in magnitude) eigenvalue of the transition matrix. In this paper we address the problem of assigning probabilities to the edges of the graph in such a way as to minimize the second largest magnitude eigenvalue, i.e., the problem of finding the fastest mixing Markov chain on the graph. We show that