Results 1 - 9 of 9
Finding Exemplars from Pairwise Dissimilarities via Simultaneous Sparse Recovery
Cited by 13 (3 self)
Given pairwise dissimilarities between data points, we consider the problem of finding a subset of data points, called representatives or exemplars, that can efficiently describe the data collection. We formulate the problem as a row-sparsity regularized trace minimization problem that can be solved efficiently using convex programming. The solution of the proposed optimization program finds the representatives and the probability that each data point is associated with each one of the representatives. We obtain the range of the regularization parameter for which the solution of the proposed optimization program changes from selecting one representative for all data points to selecting all data points as representatives. When data points are distributed around multiple clusters according to the dissimilarities, we show that the data points in each cluster select representatives only from that cluster. Unlike metric-based methods, our algorithm can be applied to dissimilarities that are asymmetric or violate the triangle inequality, i.e., it does not require that the pairwise dissimilarities come from a metric. We demonstrate the effectiveness of the proposed algorithm on synthetic data as well as real-world image and text data.
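For orientation, a row-sparsity regularized trace minimization of the kind this abstract describes can be written roughly as follows. The notation is an assumption and does not come from the listing: D is the N x N dissimilarity matrix whose entry d_{ij} measures how poorly point i represents point j, Z is the unknown matrix with rows z_i, lambda > 0 is the regularization parameter, and q is a row norm such as 2 or infinity.

```latex
\min_{Z \in \mathbb{R}^{N \times N}} \;
  \lambda \sum_{i=1}^{N} \lVert z_i \rVert_q \;+\; \operatorname{tr}\!\left(D^{\top} Z\right)
\quad \text{subject to} \quad
  Z \ge 0, \qquad \mathbf{1}^{\top} Z = \mathbf{1}^{\top}
```

Under these assumptions each column of Z sums to one, so z_{ij} can be read as the probability that point j is associated with point i, and the nonzero rows of Z mark the representatives. A large lambda pushes more rows to zero (fewer representatives), while a small lambda lets each point pick its own cheapest representative.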
Minimizing Energies with Hierarchical Costs
International Journal of Computer Vision, 2012
Cited by 10 (1 self)
Computer vision is full of problems elegantly expressed in terms of energy minimization. We characterize a class of energies with hierarchical costs and propose a novel hierarchical fusion algorithm. Hierarchical costs are natural for modeling an array of difficult problems. For example, in semantic segmentation one could rule out unlikely object combinations via hierarchical context. In geometric model estimation, one could penalize the number of unique model families in a solution, not just the number of models (a kind of hierarchical MDL criterion). Hierarchical fusion uses the well-known α-expansion algorithm as a subroutine, and offers a much better approximation bound in important cases.
A global approach for the detection of vanishing points and mutually orthogonal vanishing directions
In CVPR, 2013
Cited by 5 (0 self)
This article presents a new global approach for detecting vanishing points and groups of mutually orthogonal vanishing directions using lines detected in images of man-made environments. These two multi-model fitting problems are respectively cast as Uncapacitated Facility Location (UFL) and Hierarchical Facility Location (HFL) instances that are efficiently solved using a message passing inference algorithm. We also propose new functions for measuring the consistency between an edge and a putative vanishing point, and for computing the vanishing point defined by a subset of edges. Extensive experiments in both synthetic and real images show that our algorithms outperform the state-of-the-art methods while keeping computation tractable. In addition, we show for the first time results in simultaneously detecting multiple Manhattan-world configurations that can either share one vanishing direction (Atlanta world) or be completely independent.
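As background, the standard Uncapacitated Facility Location problem that this abstract refers to is usually stated as the integer program below. The mapping to vanishing points is an assumed reading, not something the listing spells out: customers i would be detected edges, facilities j would be candidate vanishing points, c_{ij} the cost of assigning edge i to candidate j, and f_j the cost of keeping candidate j in the solution.

```latex
\min_{x,\,y} \;\; \sum_{j} f_j\, y_j \;+\; \sum_{i} \sum_{j} c_{ij}\, x_{ij}
\quad \text{subject to} \quad
\sum_{j} x_{ij} = 1 \;\; \forall i, \qquad
x_{ij} \le y_j \;\; \forall i, j, \qquad
x_{ij},\, y_j \in \{0, 1\}
```

Here y_j = 1 means facility j is opened and x_{ij} = 1 means customer i is served by facility j. The opening costs penalize the number of models, which is what makes the formulation a natural fit for multi-model fitting.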
Recent developments in clustering algorithms
Proceedings of the European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning, 2012
Cited by 1 (0 self)
In this paper, we give a short review of recent developments in clustering. We briefly summarize important clustering paradigms before addressing important topics including metric adaptation in clustering, dealing with non-Euclidean data or large data sets, clustering evaluation, and learning-theoretical foundations.
Dissimilarity-based Sparse Subset Selection
Finding an informative subset of a large number of data points or models is at the center of many problems in machine learning, computer vision, bio/health informatics and image/signal processing. Given pairwise dissimilarities between the elements of a ‘source set’ and a ‘target set,’ we consider the problem of finding a subset of the source set, called representatives or exemplars, that can efficiently describe the target set. We formulate the problem as a row-sparsity regularized trace minimization problem. Since the proposed formulation is, in general, an NP-hard problem, we consider a convex relaxation. The solution of our proposed optimization program finds the representatives and the probability that each element of the target set is associated with the representatives. We analyze the solution of our proposed optimization as a function of the regularization parameter. We show that when the two sets jointly partition into multiple groups, the solution of our proposed optimization program finds representatives from all groups and reveals clustering of the sets. In addition, we show that our proposed formulation can effectively deal with outliers. Our algorithm works with arbitrary dissimilarities, which can be asymmetric or violate the triangle inequality. To efficiently implement our proposed algorithm, we consider an Alternating Direction Method of Multipliers (ADMM) framework, which results in quadratic complexity in the problem size. We show that the ADMM implementation allows the algorithm to be parallelized, hence further reducing the computational cost. Finally, by experiments on real-world datasets, we show that our proposed algorithm improves the state of the art on the two problems of scene categorization using representative images and time-series modeling and segmentation using representative models.
Index Terms: Representatives, pairwise dissimilarities, simultaneous sparse recovery, encoding, convex programming, ADMM optimization, sampling, clustering, outlier detection, model identification, time-series data, video summarization, activity clustering, scene recognition
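The abstract above says the solution yields both the representatives and the association probabilities. A minimal sketch of how such a solution might be read off is given below. It assumes a matrix `Z` of shape (source, target) that is nonnegative with columns summing to one; the function name `read_representatives`, the tolerance `tol`, and the example matrix are all hypothetical and are not taken from the paper. The sketch does not solve the optimization program itself.

```python
import numpy as np

def read_representatives(Z, tol=1e-6):
    """Interpret a (hypothetical) solution matrix Z.

    Z[i, j] is assumed to be the probability that source element i
    represents target element j, so each column sums to one.
    """
    Z = np.asarray(Z, dtype=float)
    # A source element is a representative if its row is not (numerically) zero.
    row_norms = np.linalg.norm(Z, axis=1)
    reps = np.flatnonzero(row_norms > tol)
    # Assign each target element to its most probable representative.
    assignment = reps[np.argmax(Z[reps], axis=0)]
    return reps, assignment

# Made-up example: 3 source elements, 4 target elements.
Z_example = np.array([
    [0.9, 1.0, 0.0, 0.1],
    [0.0, 0.0, 0.0, 0.0],  # all-zero row: never selected
    [0.1, 0.0, 1.0, 0.9],
])
reps, assignment = read_representatives(Z_example)
```

Thresholding the row norms is one simple choice; in practice the tolerance would depend on how accurately the solver drives unused rows to zero.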
Parallel Hierarchical Affinity Propagation with MapReduce
The accelerated evolution and explosion of the Internet and social media are generating voluminous quantities of data (on zettabyte scales). Paramount amongst the desires to manipulate and extract actionable intelligence from vast big data volumes is the need for scalable, performance-conscious analytics algorithms. To directly address this need, we propose a novel MapReduce implementation of the exemplar-based clustering algorithm known as Affinity Propagation. Our parallelization strategy extends to the multilevel Hierarchical Affinity Propagation algorithm and enables tiered aggregation of unstructured data with minimal free parameters, in principle requiring only a similarity measure between data points. We detail the linear run-time complexity of our approach, overcoming the limiting quadratic complexity of the original algorithm. Experimental validation of our clustering methodology on a variety of synthetic and real data sets (e.g. images and point data) demonstrates our competitiveness against other state-of-the-art MapReduce clustering techniques.
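For readers unfamiliar with the base algorithm, the sketch below shows plain, single-machine Affinity Propagation with the usual responsibility and availability message updates. It is not the MapReduce or hierarchical variant described in the abstract. The function name, the damping value, the iteration count, the toy data, and the preference of -10 are illustrative assumptions.

```python
import numpy as np

def affinity_propagation(S, damping=0.5, max_iter=200):
    """Plain Affinity Propagation on a similarity matrix S.

    S[i, k] is the similarity of point i to candidate exemplar k;
    the diagonal S[k, k] holds the "preference" of each point.
    """
    n = S.shape[0]
    rows = np.arange(n)
    R = np.zeros((n, n))  # responsibilities
    A = np.zeros((n, n))  # availabilities
    for _ in range(max_iter):
        # Responsibility: r(i,k) = s(i,k) - max_{k' != k} [a(i,k') + s(i,k')]
        AS = A + S
        idx = np.argmax(AS, axis=1)
        first = AS[rows, idx]
        AS[rows, idx] = -np.inf
        second = AS.max(axis=1)
        R_new = S - first[:, None]
        R_new[rows, idx] = S[rows, idx] - second
        R = damping * R + (1 - damping) * R_new

        # Availability: a(i,k) = min(0, r(k,k) + sum_{i' not in {i,k}} max(0, r(i',k)))
        # and a(k,k) = sum_{i' != k} max(0, r(i',k)).
        Rp = np.maximum(R, 0)
        np.fill_diagonal(Rp, R.diagonal())
        A_new = Rp.sum(axis=0)[None, :] - Rp
        diag = A_new.diagonal().copy()
        A_new = np.minimum(A_new, 0)
        np.fill_diagonal(A_new, diag)
        A = damping * A + (1 - damping) * A_new

    # Points whose combined self-message is positive are exemplars.
    exemplars = np.flatnonzero((A + R).diagonal() > 0)
    if exemplars.size == 0:
        return exemplars, np.full(n, -1)
    labels = exemplars[np.argmax(S[:, exemplars], axis=1)]
    labels[exemplars] = exemplars
    return exemplars, labels

# Toy data: two well-separated groups on a line.
x = np.array([0.0, 1.0, 2.0, 10.0, 11.0, 12.0])
S = -(x[:, None] - x[None, :]) ** 2   # negative squared distance
np.fill_diagonal(S, -10.0)            # illustrative preference value
exemplars, labels = affinity_propagation(S)
```

Each iteration touches every entry of the n x n message matrices, which is the quadratic cost the abstract refers to. The preference on the diagonal controls how many exemplars emerge; lower values give fewer clusters.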
Topical Structure in Long Informal Documents
This dissertation describes a research project concerned with establishing the topical structure of long informal documents. In this research, we place special emphasis on literary data, but also work with speech transcripts and several other types of data. It has long been acknowledged that discourse is more than a sequence of sentences but, for the purposes of many Natural Language Processing tasks, it is often modelled exactly in that way. In this dissertation, we propose a practical approach to modelling discourse structure, with an emphasis on it being computationally feasible and easily applicable. Instead of following one of the many linguistic theories of discourse structure, we attempt to model the structure of a document as a tree of topical segments. Each segment encapsulates a span that concentrates on a particular topic at a certain level of granularity. Each span can be further subsegmented based on finer fluctuations of topic. The lowest (most refined) level of segmentation is individual paragraphs. In our model, each topical segment is described by a segment centre – a sentence or a paragraph that best captures the contents of the segment. In this manner, the segmenter
Hierarchical Topical Segmentation with Affinity Propagation
We present a hierarchical topical segmenter for free text. Hierarchical Affinity Propagation for Segmentation (HAPS) is derived from the clustering algorithm Affinity Propagation. Given a document, HAPS builds a topical tree. The nodes at the top level correspond to the most prominent shifts of topic in the document. Nodes at lower levels correspond to finer topical fluctuations. For each segment in the tree, HAPS identifies a segment centre – a sentence or a paragraph which best describes its contents. We evaluate the segmenter on a subset of a novel manually segmented by several annotators, and on a dataset of Wikipedia articles. The results suggest that hierarchical segmentations produced by HAPS are better than those obtained by iteratively running several one-level segmenters. An additional advantage of HAPS is that it does not require the “gold standard” number of segments in advance.