Results 11–20 of 649
Multivariate information bottleneck
, 2001
Abstract

Cited by 96 (13 self)
The Information bottleneck method is an unsupervised nonparametric data organization technique. Given a joint distribution P(X, Y), this method constructs a new variable T that extracts partitions, or clusters, over the values of X that are informative about Y. The information bottleneck has already been applied to document classification, gene expression, neural code, and spectral analysis. In this paper, we introduce a general principled framework for multivariate extensions of the information bottleneck method. This allows us to consider multiple systems of data partitions that are interrelated. Our approach utilizes Bayesian networks for specifying the systems of clusters and what information each captures. We show that this construction provides insight about bottleneck variations and enables us to characterize solutions of these variations. We also present a general framework for iterative algorithms for constructing solutions, and apply it to several examples.
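As a concrete illustration of the quantity the bottleneck preserves: for a hard partition T of X's values, the relevant information I(T;Y) is the mutual information of the aggregated joint P(T,Y). A minimal sketch (function names are our own, not from the paper):

```python
import math

def mutual_information(pxy):
    """I(X;Y) in nats; pxy is a list of rows forming a joint distribution."""
    total = sum(sum(row) for row in pxy)
    pxy = [[v / total for v in row] for row in pxy]
    px = [sum(row) for row in pxy]               # marginal over rows
    py = [sum(col) for col in zip(*pxy)]         # marginal over columns
    return sum(p * math.log(p / (px[i] * py[j]))
               for i, row in enumerate(pxy)
               for j, p in enumerate(row) if p > 0)

def relevant_information(pxy, assignment):
    """I(T;Y) for a hard clustering T of X: P(t,y) = sum over x in t of P(x,y)."""
    k = max(assignment) + 1
    pty = [[0.0] * len(pxy[0]) for _ in range(k)]
    for i, t in enumerate(assignment):
        for j, v in enumerate(pxy[i]):
            pty[t][j] += v
    return mutual_information(pty)
```

A partition aligned with the structure of P(X, Y) preserves all of I(X;Y); a misaligned one destroys it, which is exactly the trade-off the bottleneck objective optimizes.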
An efficient earth mover’s distance algorithm for robust histogram comparison
 PAMI
, 2007
Abstract

Cited by 93 (5 self)
We propose EMD-L1: a fast and exact algorithm for computing the Earth Mover's Distance (EMD) between a pair of histograms. The efficiency of the new algorithm enables its application to problems that were previously prohibitive due to high time complexities. The proposed EMD-L1 significantly simplifies the original linear programming formulation of EMD. Exploiting the L1 metric structure, the number of unknown variables in EMD-L1 is reduced to O(N) from O(N²) of the original EMD for a histogram with N bins. In addition, the number of constraints is reduced by half and the objective function of the linear program is simplified. Formally, without any approximation, we prove that the EMD-L1 formulation is equivalent to the original EMD with an L1 ground distance. To perform the EMD-L1 computation, we propose an efficient tree-based algorithm, Tree-EMD. Tree-EMD exploits the fact that a basic feasible solution of the simplex-algorithm-based solver forms a spanning tree when we interpret EMD-L1 as a network flow optimization problem. We empirically show that this new algorithm has average time complexity of O(N²), which significantly improves the best reported super-cubic complexity of the original EMD. The accuracy of the proposed methods is evaluated by
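The EMD-L1 paper targets multi-dimensional histograms via a network-flow formulation; in the special one-dimensional case the same L1 structure collapses further, and the EMD with ground distance |i − j| is simply the L1 distance between the cumulative histograms. A small sketch of that 1-D special case (not the paper's Tree-EMD algorithm):

```python
def emd_1d(h1, h2):
    """EMD between two 1-D histograms of equal total mass with ground
    distance |i - j|. In 1-D this reduces to the L1 distance between the
    cumulative histograms, so no linear program is needed."""
    assert abs(sum(h1) - sum(h2)) < 1e-9, "histograms must have equal mass"
    dist, c1, c2 = 0.0, 0.0, 0.0
    for a, b in zip(h1, h2):
        c1 += a                       # running mass of h1
        c2 += b                       # running mass of h2
        dist += abs(c1 - c2)          # mass that must still flow past this bin
    return dist
```

For example, moving one unit of mass two bins costs 2, matching the transportation-problem definition.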
PhotoTOC: Automatic Clustering for Browsing Personal Photographs
, 2002
Abstract

Cited by 87 (0 self)
This paper presents Photo Table Of Contents (PhotoTOC), a system that helps users find digital photographs in their own collection of photographs. PhotoTOC is a browsing user interface that uses an overview+detail design. The detail view is a temporally ordered list of all of the user's photographs. The overview of the user's collection is automatically generated by an image clustering algorithm, which clusters on the creation time and the color of the photographs. PhotoTOC was tested on users' own photographs against three other browsers. Searching for images with PhotoTOC was subjectively rated easier than all of the other browsers. This result shows that automatic organization of personal photographs facilitates efficient and satisfying search.
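The clustering idea behind the overview can be illustrated with creation times alone: start a new event whenever the time gap to the previous photo is large. PhotoTOC's actual criterion is adaptive and also uses color; the fixed threshold below is a simplified stand-in:

```python
def cluster_by_time(timestamps, gap_seconds=3600):
    """Group photo timestamps (seconds) into events.

    A new event starts whenever the gap to the previous photo exceeds
    gap_seconds. This fixed threshold is a simplification of PhotoTOC's
    adaptive time-and-color criterion.
    """
    clusters = []
    for t in sorted(timestamps):
        if clusters and t - clusters[-1][-1] <= gap_seconds:
            clusters[-1].append(t)    # same event: small gap
        else:
            clusters.append([t])      # large gap: start a new event
    return clusters
```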
Comprehensive Survey on Distance/Similarity Measures between Probability Density Functions
 INTERNATIONAL JOURNAL OF MATHEMATICAL MODELS AND METHODS IN APPLIED SCIENCES
, 2007
Abstract

Cited by 86 (0 self)
Distance or similarity measures are essential to solve many pattern recognition problems such as classification, clustering, and retrieval problems. Various distance/similarity measures that are applicable to compare two probability density functions, pdf in short, are reviewed and categorized in both syntactic and semantic relationships. A correlation coefficient and a hierarchical clustering technique are adopted to reveal similarities among numerous distance/similarity measures.
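Three representative measures from the families such surveys cover, for discrete distributions over the same bins (a sketch, not the survey's own code):

```python
import math

def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q); assumes q > 0 wherever p > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def bhattacharyya(p, q):
    """Bhattacharyya distance: -log of the Bhattacharyya coefficient."""
    return -math.log(sum(math.sqrt(pi * qi) for pi, qi in zip(p, q)))

def histogram_intersection(p, q):
    """Similarity in [0, 1] for normalized histograms: sum of bin minima."""
    return sum(min(pi, qi) for pi, qi in zip(p, q))
```

Note the differing semantics: KL is asymmetric and unbounded, Bhattacharyya is symmetric, and intersection is a similarity rather than a distance; this is the kind of relationship the survey's categorization makes explicit.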
Diffusion distance for histogram comparison
 In CVPR '06
, 2006
Abstract

Cited by 80 (2 self)
In this paper we propose diffusion distance, a new dissimilarity measure between histogram-based descriptors. We define the difference between two histograms to be a temperature field. We then study the relationship between histogram similarity and a diffusion process, showing how diffusion handles deformation as well as quantization effects. As a result, the diffusion distance is derived as the sum of dissimilarities over scales. Being a cross-bin histogram distance, the diffusion distance is robust to deformation, lighting change and noise in histogram-based local descriptors. In addition, it enjoys linear computational complexity, which significantly improves previously proposed cross-bin distances with quadratic complexity or higher. We tested the proposed approach on both shape recognition and interest point matching tasks using several multidimensional histogram-based descriptors including shape context, SIFT, and spin images. In all experiments, the diffusion distance performs excellently in both accuracy and efficiency in comparison with other state-of-the-art distance measures. In particular, it performs as accurately as the Earth Mover's Distance with much greater efficiency.
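The "sum of dissimilarities over scales" idea can be sketched in one dimension: take the difference histogram, and repeatedly smooth and downsample it, accumulating an L1 norm at each scale. The paper uses Gaussian smoothing on multi-dimensional histograms; the [1, 2, 1]/4 kernel and stride-2 downsampling here are simplified stand-ins:

```python
def diffusion_distance(h1, h2, levels=3):
    """Simplified 1-D diffusion distance: L1 norms of the difference
    histogram summed over progressively smoothed/downsampled scales."""
    d = [a - b for a, b in zip(h1, h2)]
    total = 0.0
    for _ in range(levels):
        total += sum(abs(v) for v in d)   # dissimilarity at this scale
        if len(d) < 3:
            break
        # smooth with a [1, 2, 1]/4 kernel (edges clamped), downsample by 2
        sm = [(d[max(i - 1, 0)] + 2 * d[i] + d[min(i + 1, len(d) - 1)]) / 4
              for i in range(len(d))]
        d = sm[::2]
    return total
```

Because smoothing spreads nearby mass together, mass displaced to a nearby bin is "forgiven" at coarse scales while distant displacement keeps contributing, which is the cross-bin behavior described above.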
The power of word clusters for text classification
 In 23rd European Colloquium on Information Retrieval Research
, 2001
Abstract

Cited by 79 (7 self)
The recently introduced Information Bottleneck method [21] provides an information-theoretic framework for extracting features of one variable that are relevant for the values of another variable. Several previous works already suggested applying this method for document clustering, gene expression data analysis, spectral analysis and more. In this work we present a novel implementation of this method for supervised text classification. Specifically, we apply the information bottleneck method to find word clusters that preserve the information about document categories and use these clusters as features for classification. Previous work [1] used a similar clustering procedure to show that word clusters can significantly reduce the feature space dimensionality, with only a minor change in classification accuracy. In this work we present similar results and go further to show that when the training sample is small, word clusters can yield a significant improvement in classification accuracy over the performance using the words directly.
Alignment of protein sequences by their profiles
, 2004
Abstract

Cited by 75 (14 self)
The accuracy of an alignment between two protein sequences can be improved by including other detectably related sequences in the comparison. We optimize and benchmark such an approach that relies on aligning two multiple sequence alignments, each one including one of the two protein sequences. Thirteen different protocols for creating and comparing profiles corresponding to the multiple sequence alignments are implemented in the SALIGN command of MODELLER. A test set of 200 pairwise, structure-based alignments with sequence identities below 40% is used to benchmark the 13 protocols as well as a number of previously described sequence alignment methods, including heuristic pairwise sequence alignment by BLAST, pairwise sequence alignment by global dynamic programming with an affine gap penalty function by the ALIGN command of MODELLER, sequence-profile alignment by PSI-BLAST, Hidden Markov Model methods implemented in SAM and LOBSTER, pairwise sequence alignment relying on predicted local structure by SEA, and multiple sequence alignment by CLUSTALW and COMPASS. The alignment accuracies of the best new protocols were significantly better than those of the other tested methods. For example, the fraction of the correctly aligned residues relative to the structure-based alignment by the best protocol is 56%, which can be compared with the accuracies of 26%, 42%, 43%, 48%, 50%, 49%, 43%, and 43% for the other methods, respectively. The new method is currently applied to large-scale comparative protein structure modeling of all known sequences.
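The pairwise baseline mentioned above is global dynamic programming. A minimal score-only Needleman-Wunsch sketch with a linear gap penalty (the ALIGN command described above uses an affine gap function; a linear penalty keeps this illustration short):

```python
def global_align_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment score, linear gap penalty,
    computed with a rolling DP row in O(len(a) * len(b)) time."""
    n, m = len(a), len(b)
    prev = [j * gap for j in range(m + 1)]       # align prefix of b to nothing
    for i in range(1, n + 1):
        cur = [i * gap] + [0] * m                # align prefix of a to nothing
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            cur[j] = max(prev[j - 1] + s,        # substitute / match
                         prev[j] + gap,          # gap in b
                         cur[j - 1] + gap)       # gap in a
        prev = cur
    return prev[m]
```

Profile-profile methods like the SALIGN protocols replace the per-residue substitution score with a score between alignment columns, but the dynamic program has the same shape.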
A Unified Framework for Model-based Clustering
 Journal of Machine Learning Research
, 2003
Abstract

Cited by 74 (7 self)
Model-based clustering techniques have been widely used and have shown promising results in many applications involving complex data. This paper presents a unified framework for probabilistic model-based clustering based on a bipartite graph view of data and models that highlights the commonalities and differences among existing model-based clustering algorithms. In this view, clusters are represented as probabilistic models in a model space that is conceptually separate from the data space. For partitional clustering, the view is conceptually similar to the Expectation-Maximization (EM) algorithm. For hierarchical clustering, the graph-based view helps to visualize critical distinctions between similarity-based approaches and model-based approaches.
Using Unlabeled Data to Improve Text Classification
, 2001
Abstract

Cited by 70 (0 self)
One key difficulty with text classification learning algorithms is that they require many hand-labeled examples to learn accurately. This dissertation demonstrates that supervised learning algorithms that use a small number of labeled examples and many inexpensive unlabeled examples can create high-accuracy text classifiers. By assuming that documents are created by a parametric generative model, Expectation-Maximization (EM) finds local maximum a posteriori models and classifiers from all the data, labeled and unlabeled. These generative models do not capture all the intricacies of text; however, on some domains this technique substantially improves classification accuracy, especially when labeled data are sparse. Two problems arise from this basic approach. First, unlabeled data can hurt performance in domains where the generative modeling assumptions are too strongly violated. In this case the assumptions can be made more representative in two ways: by modeling sub-topic class structure, and by modeling super-topic hierarchical class relationships. By doing so, model probability and classification accuracy come into correspondence, allowing unlabeled data to improve classification performance. The second problem is that even with a representative model, the improvements given by unlabeled data do not sufficiently compensate for a paucity of labeled data. Here, limited labeled data provide EM initializations that lead to low-probability models. Performance can be significantly improved by using active learning to select high-quality initializations, and by using alternatives to EM that avoid low-probability local maxima.
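The basic approach described above, generative model plus EM over labeled and unlabeled documents, can be sketched with a multinomial naive Bayes model. This is a compact illustration of the loop structure, not the dissertation's full system (all names are ours):

```python
import math
from collections import Counter

def train_nb(docs, labels, n_classes, vocab):
    """Multinomial naive Bayes with Laplace smoothing.
    labels[i] is a per-class weight vector (soft labels summing to 1)."""
    prior = [1.0] * n_classes                       # smoothed class counts
    counts = [Counter() for _ in range(n_classes)]  # smoothed word counts
    for doc, w in zip(docs, labels):
        for c in range(n_classes):
            prior[c] += w[c]
            for word in doc:
                counts[c][word] += w[c]
    def log_prob(doc, c):
        denom = sum(counts[c].values()) + len(vocab)
        lp = math.log(prior[c] / sum(prior))
        for word in doc:
            lp += math.log((counts[c][word] + 1) / denom)
        return lp
    return log_prob

def em_with_unlabeled(labeled, unlabeled, n_classes, rounds=5):
    """Train on labeled docs, then repeatedly (E) soft-label the
    unlabeled docs and (M) retrain on everything."""
    vocab = {w for d, _ in labeled for w in d} | {w for d in unlabeled for w in d}
    docs = [d for d, _ in labeled]
    hard = [[1.0 if c == y else 0.0 for c in range(n_classes)] for _, y in labeled]
    log_prob = train_nb(docs, hard, n_classes, vocab)
    for _ in range(rounds):
        soft = []
        for d in unlabeled:                         # E-step: class posteriors
            lps = [log_prob(d, c) for c in range(n_classes)]
            mx = max(lps)
            ps = [math.exp(lp - mx) for lp in lps]
            z = sum(ps)
            soft.append([p / z for p in ps])
        log_prob = train_nb(docs + unlabeled, hard + soft, n_classes, vocab)
    return log_prob
```

The failure mode noted above also shows up here: when the documents in one true class are not well modeled by a single multinomial, the E-step posteriors pull the parameters toward a high-likelihood but wrongly labeled solution.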
View Selection for Volume Rendering
, 2005
Abstract

Cited by 60 (5 self)
In a visualization of a three-dimensional dataset, the insights gained depend on what is occluded and what is not. Suggestion of interesting viewpoints can improve both the speed and efficiency of data understanding. This paper presents a view selection method designed for volume rendering. It can be used to find informative views for a given scene, or to find a minimal set of representative views which capture the entire scene. It becomes particularly useful when the visualization process is non-interactive, for example when visualizing large datasets or time-varying sequences. We introduce a viewpoint "goodness" measure based on the formulation of entropy from information theory. The measure takes into account the transfer function, the data distribution and the visibility of the voxels. Combined with viewpoint properties like view likelihood and view stability, this technique can be used as a guide which suggests "interesting" viewpoints for further exploration. Domain knowledge is incorporated into the algorithm via an importance transfer function or volume. This allows users to obtain view selection behaviors tailored to their specific situations. We generate a view space partitioning, and select one representative view for each partition. Together, this set of views encapsulates the "interesting" and distinct views of the data. Viewpoints in this set can be used as starting points for interactive exploration of the data, thus reducing the human effort in visualization. In non-interactive situations, such a set can be used as a representative visualization of the dataset from all directions.
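The entropy-based "goodness" idea reduces to: build a per-voxel visual-probability distribution for the viewpoint and take its Shannon entropy. The paper derives those probabilities from the transfer function, data distribution and visibility; the sketch below takes visibilities (and an optional importance weight standing in for the importance transfer function) as given:

```python
import math

def viewpoint_entropy(visibilities, importance=None):
    """Shannon entropy (nats) of the normalized visual-probability
    distribution for one viewpoint. visibilities[i] is the visibility of
    voxel i from this view; importance optionally weights voxels. Higher
    entropy means attention is spread more evenly over important voxels."""
    if importance is None:
        importance = [1.0] * len(visibilities)
    w = [v * s for v, s in zip(visibilities, importance)]
    total = sum(w)
    return -sum((x / total) * math.log(x / total) for x in w if x > 0)
```

A view that shows all voxels equally scores log(n); a view where one voxel dominates scores near zero, so ranking candidate viewpoints by this measure prefers balanced, occlusion-free views.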