Results 1  10
of
41
Quantization
 IEEE TRANS. INFORM. THEORY
, 1998
"... The history of the theory and practice of quantization dates to 1948, although similar ideas had appeared in the literature as long ago as 1898. The fundamental role of quantization in modulation and analogtodigital conversion was first recognized during the early development of pulsecode modula ..."
Abstract

Cited by 700 (12 self)
 Add to MetaCart
The history of the theory and practice of quantization dates to 1948, although similar ideas had appeared in the literature as long ago as 1898. The fundamental role of quantization in modulation and analogtodigital conversion was first recognized during the early development of pulsecode modulation systems, especially in the 1948 paper of Oliver, Pierce, and Shannon. Also in 1948, Bennett published the first highresolution analysis of quantization and an exact analysis of quantization noise for Gaussian processes, and Shannon published the beginnings of rate distortion theory, which would provide a theory for quantization as analogtodigital conversion and as data compression. Beginning with these three papers of fifty years ago, we trace the history of quantization from its origins through this decade, and we survey the fundamentals of the theory and many of the popular and promising techniques for quantization.
Concept Decompositions for Large Sparse Text Data using Clustering
 Machine Learning
, 2000
"... . Unlabeled document collections are becoming increasingly common and available; mining such data sets represents a major contemporary challenge. Using words as features, text documents are often represented as highdimensional and sparse vectorsa few thousand dimensions and a sparsity of 95 to 99 ..."
Abstract

Cited by 327 (26 self)
 Add to MetaCart
. Unlabeled document collections are becoming increasingly common and available; mining such data sets represents a major contemporary challenge. Using words as features, text documents are often represented as highdimensional and sparse vectorsa few thousand dimensions and a sparsity of 95 to 99% is typical. In this paper, we study a certain spherical kmeans algorithm for clustering such document vectors. The algorithm outputs k disjoint clusters each with a concept vector that is the centroid of the cluster normalized to have unit Euclidean norm. As our first contribution, we empirically demonstrate that, owing to the highdimensionality and sparsity of the text data, the clusters produced by the algorithm have a certain "fractallike" and "selfsimilar" behavior. As our second contribution, we introduce concept decompositions to approximate the matrix of document vectors; these decompositions are obtained by taking the leastsquares approximation onto the linear subspace spanned...
Centroidal Voronoi tessellations: Applications and algorithms
 SIAM Rev
, 1999
"... Abstract. A centroidal Voronoi tessellation is a Voronoi tessellation whose generating points are the centroids (centers of mass) of the corresponding Voronoi regions. We give some applications of such tessellations to problems in image compression, quadrature, finite difference methods, distributio ..."
Abstract

Cited by 264 (28 self)
 Add to MetaCart
Abstract. A centroidal Voronoi tessellation is a Voronoi tessellation whose generating points are the centroids (centers of mass) of the corresponding Voronoi regions. We give some applications of such tessellations to problems in image compression, quadrature, finite difference methods, distribution of resources, cellular biology, statistics, and the territorial behavior of animals. We discuss methods for computing these tessellations, provide some analyses concerning both the tessellations and the methods for their determination, and, finally, present the results of some numerical experiments.
A divisive informationtheoretic feature clustering algorithm for text classification
 Journal of Machine Learning Research
, 2003
"... High dimensionality of text can be a deterrent in applying complex learners such as Support Vector Machines to the task of text classification. Feature clustering is a powerful alternative to feature selection for reducing the dimensionality of text data. In this paper we propose a new informationth ..."
Abstract

Cited by 118 (14 self)
 Add to MetaCart
High dimensionality of text can be a deterrent in applying complex learners such as Support Vector Machines to the task of text classification. Feature clustering is a powerful alternative to feature selection for reducing the dimensionality of text data. In this paper we propose a new informationtheoretic divisive algorithm for feature/word clustering and apply it to text classification. Existing techniques for such “distributional clustering ” of words are agglomerative in nature and result in (i) suboptimal word clusters and (ii) high computational cost. In order to explicitly capture the optimality of word clusters in an information theoretic framework, we first derive a global criterion for feature clustering. We then present a fast, divisive algorithm that monotonically decreases this objective function value. We show that our algorithm minimizes the “withincluster JensenShannon divergence ” while simultaneously maximizing the “betweencluster JensenShannon divergence”. In comparison to the previously proposed agglomerative strategies our divisive algorithm is much faster and achieves comparable or higher classification accuracies. We further show that feature clustering is an effective technique for building smaller class models in hierarchical classification. We present detailed experimental results using Naive Bayes and Support Vector Machines on the 20Newsgroups data set and a 3level hierarchy of HTML documents collected from the Open Directory project (www.dmoz.org).
Digital color imaging
 IEEE Trans. Image Process
, 1997
"... in the area of digital color imaging. In order to establish the background and lay down terminology, fundamental concepts of color perception and measurement are first presented using vectorspace notation and terminology. Presentday color recording and reproduction systems are reviewed along with ..."
Abstract

Cited by 100 (14 self)
 Add to MetaCart
in the area of digital color imaging. In order to establish the background and lay down terminology, fundamental concepts of color perception and measurement are first presented using vectorspace notation and terminology. Presentday color recording and reproduction systems are reviewed along with the common mathematical models used for representing these devices. Algorithms for processing color images for display and communication are surveyed, and a forecast of research trends is attempted. An extensive bibliography is provided. I.
A DataClustering Algorithm On Distributed Memory Multiprocessors
 In LargeScale Parallel Data Mining, Lecture Notes in Artificial Intelligence
, 2000
"... To cluster increasingly massive data sets that are common today in data and text mining, we propose a parallel implementation of the kmeans clustering algorithm based on the message passing model. The proposed algorithm exploits the inherent dataparallelism in the kmeans algorithm. We analyticall ..."
Abstract

Cited by 98 (1 self)
 Add to MetaCart
To cluster increasingly massive data sets that are common today in data and text mining, we propose a parallel implementation of the kmeans clustering algorithm based on the message passing model. The proposed algorithm exploits the inherent dataparallelism in the kmeans algorithm. We analytically show that the speedup and the scaleup of our algorithm approach the optimal as the number of data points increases. We implemented our algorithm on an IBM POWERparallel SP2 with a maximum of 16 nodes. On typical test data sets, we observe nearly linear relative speedups, for example, 15.62 on 16 nodes, and essentially linear scaleup in the size of the data set and in the number of clusters desired. For a 2 gigabyte test data set, our implementation drives the 16 node SP2 at more than 1.8 gigaflops. Keywords: kmeans, data mining, massive data sets, messagepassing, text mining. 1 Introduction Data sets measuring in gigabytes and even terabytes are now quite common in data and text minin...
Efficient Clustering Of Very Large Document Collections
, 2001
"... An invaluable portion of scientific data occurs naturally in text form. Given a large unlabeled document collection, it is often helpful to organize this collection into clusters of related documents. By using a vector space model, text data can be treated as highdimensional but sparse numerical da ..."
Abstract

Cited by 96 (10 self)
 Add to MetaCart
An invaluable portion of scientific data occurs naturally in text form. Given a large unlabeled document collection, it is often helpful to organize this collection into clusters of related documents. By using a vector space model, text data can be treated as highdimensional but sparse numerical data vectors. It is a contemporary challenge to efficiently preprocess and cluster very large document collections. In this paper we present a time and memory ecient technique for the entire clustering process, including the creation of the vector space model. This efficiency is obtained by (i) a memoryecient multithreaded preprocessing scheme, and (ii) a fast clustering algorithm that fully exploits the sparsity of the data set. We show that this entire process takes time that is linear in the size of the document collection. Detailed experimental results are presented  a highlight of our results is that we are able to effectively cluster a collection of 113,716 NSF award abstracts in 23 minutes (including disk I/O costs) on a single workstation with modest memory consumption.
Enhanced Word Clustering for Hierarchical Text Classification
, 2002
"... In this paper we propose a new informationtheoretic divisive algorithm for word clustering applied to text classification. In previous work, such "distributional clustering" of features has been found to achieve improvements over feature selection in terms of classification accuracy, espe ..."
Abstract

Cited by 47 (1 self)
 Add to MetaCart
In this paper we propose a new informationtheoretic divisive algorithm for word clustering applied to text classification. In previous work, such "distributional clustering" of features has been found to achieve improvements over feature selection in terms of classification accuracy, especially at lower number of features [2, 28]. However the existing clustering techniques are agglomerative in nature and result in (i) suboptimal word clusters and (ii) high computational cost. In order to explicitly capture the optimality of word clusters in an information theoretic framework, we first derive a global criterion for feature clustering. We then present a fast, divisive algorithm that monotonically decreases this objective function value, thus converging to a local minimum. We show that our algorithm minimizes the "withincluster JensenShannon divergence" while simultaneously maximizing the "betweencluster JensenShannon divergence". In comparison to the previously proposed agglomerative strategies our divisive algorithm achieves higher classification accuracy especially at lower number of features. We further show that feature clustering is an effective technique for building smaller class models in hierarchical classification. We present detailed experimental results using Naive Bayes and Support Vector Machines on the 20 Newsgroups data set and a 3level hierarchy of HTML documents collected from Dmoz Open Directory.
Agglomerative Hierarchical Clustering with Constraints: Theoretical and Empirical Results
 Lecture notes in computer science
, 2005
"... Abstract. We explore the use of instance and clusterlevel constraints with agglomerative hierarchical clustering. Though previous work has illustrated the benefits of using constraints for nonhierarchical clustering, their application to hierarchical clustering is not straightforward for two prim ..."
Abstract

Cited by 34 (2 self)
 Add to MetaCart
Abstract. We explore the use of instance and clusterlevel constraints with agglomerative hierarchical clustering. Though previous work has illustrated the benefits of using constraints for nonhierarchical clustering, their application to hierarchical clustering is not straightforward for two primary reasons. First, some constraint combinations make the feasibility problem (Does there exist a single feasible solution?) NPcomplete. Second, some constraint combinations when used with traditional agglomerative algorithms can cause the dendrogram to stop prematurely in a deadend solution even though there exist other feasible solutions with a significantly smaller number of clusters. When constraints lead to efficiently solvable feasibility problems and standard agglomerative algorithms do not give rise to deadend solutions, we empirically illustrate the benefits of using constraints to improve cluster purity and average distortion. Furthermore, we introduce the new γ constraint and use it in conjunction with the triangle inequality to considerably improve the efficiency of agglomerative clustering. 1
Information Theoretic Feature Clustering for Text Classification
 JOURNAL OF MACHINE LEARNING RESEARCH (JMLR), SPECIAL
, 2002
"... High dimensionality of text can become a severe deterrent in applying complex learners like Support Vector Machines to the task of text classification. Word clustering is a powerful alternative to feature selection for reducing the dimensionality of text. In this paper we propose a new informationt ..."
Abstract

Cited by 21 (0 self)
 Add to MetaCart
High dimensionality of text can become a severe deterrent in applying complex learners like Support Vector Machines to the task of text classification. Word clustering is a powerful alternative to feature selection for reducing the dimensionality of text. In this paper we propose a new informationtheoretic divisive algorithm for word clustering and apply it to text classification. Existing techniques for such "distributional clustering" of words are agglomerative in nature resulting in (i) suboptimal word clusters and (ii) high computational cost. In order to explicitly capture the optimality of word clusters in an information theoretic framework, we first derive a global criterion for feature clustering. We then