Results 11-20 of 24
Model-Based Hierarchical Clustering
In Proc. 16th Conf. Uncertainty in Artificial Intelligence, 2000
"... We present an approach to modelbased hierarchical clustering by formulating an objective function based on a Bayesian analysis. This model organizes the data into a cluster hierarchy while specifying a complex featureset partitioning that is a key component of our model. Features can have ei ..."
Cited by 23 (0 self)
Abstract:
We present an approach to model-based hierarchical clustering by formulating an objective function based on a Bayesian analysis. This model organizes the data into a cluster hierarchy while specifying a complex feature-set partitioning that is a key component of our model. Features can have either a unique distribution in every cluster or a common distribution over some (or even all) of the clusters. The cluster subsets over which these features have such a common distribution correspond to the nodes (clusters) of the tree representing the hierarchy. We apply this general model to the problem of document clustering, for which we use a multinomial likelihood function and Dirichlet priors. Our algorithm consists of a two-stage process wherein we first perform a flat clustering, followed by a modified hierarchical agglomerative merging process that includes determining the features that will have common distributions over the merged clusters. The regularization induced...
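The Bayesian merging criterion this abstract describes can be sketched with the Dirichlet-multinomial marginal likelihood it mentions: a merge is attractive when one common feature distribution explains two clusters' counts better than two unique ones. The symmetric prior, function names, and toy counts below are illustrative assumptions, not the authors' implementation.

```python
import math

def log_marginal(counts, alpha=1.0):
    # Dirichlet-multinomial log marginal likelihood of a vector of word
    # counts under a symmetric Dirichlet(alpha) prior.
    n = sum(counts)
    k = len(counts)
    out = math.lgamma(k * alpha) - math.lgamma(k * alpha + n)
    for c in counts:
        out += math.lgamma(alpha + c) - math.lgamma(alpha)
    return out

def merge_gain(ca, cb, alpha=1.0):
    # Score gain from giving two clusters one common distribution
    # instead of two unique ones; a positive gain favors merging.
    merged = [a + b for a, b in zip(ca, cb)]
    return (log_marginal(merged, alpha)
            - log_marginal(ca, alpha) - log_marginal(cb, alpha))
```

In an agglomerative pass, the pair with the highest gain would be merged first, which is the flavor of criterion the abstract's second stage applies.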
Evolutionary Model Selection in Unsupervised Learning
2002
"... Feature subset selection is important not only for the insight gained from determining relevant modeling variables but also for the improved understandability, scalability, and possibly, accuracy of the resulting models. Feature selection has traditionally been studied in supervised learning situati ..."
Cited by 17 (0 self)
Abstract:
Feature subset selection is important not only for the insight gained from determining relevant modeling variables but also for the improved understandability, scalability, and, possibly, accuracy of the resulting models. Feature selection has traditionally been studied in supervised learning situations, with some estimate of accuracy used to evaluate candidate subsets. However, we often cannot apply supervised learning for lack of a training signal. For these cases, we propose a new feature selection approach based on clustering. A number of heuristic criteria can be used to estimate the quality of clusters built from a given feature subset. Rather than combining such criteria, we use ELSA, an evolutionary local selection algorithm that maintains a diverse population of solutions approximating the Pareto front in a multidimensional objective space. Each evolved solution represents a feature subset and a number of clusters; two representative clustering algorithms, K-means and EM, are applied to form the given number of clusters based on the selected features. Experimental results on both real and synthetic data show that the method can consistently find approximate Pareto-optimal solutions through which we can identify the significant features and an appropriate number of clusters. This results in models with better and clearer semantic relevance.
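The Pareto-front idea at the core of ELSA can be illustrated with a minimal non-dominance filter. The function names and the maximize-all-objectives convention are assumptions for illustration; this is not ELSA itself.

```python
def dominates(a, b):
    # a dominates b when it is at least as good in every objective
    # (maximizing all) and strictly better in at least one.
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(solutions):
    # Keep the non-dominated solutions; ELSA maintains a diverse
    # population that approximates exactly this set.
    return [s for s in solutions if not any(dominates(t, s) for t in solutions)]
```

Each tuple here would correspond to one evolved solution's scores on the heuristic cluster-quality criteria.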
Generalized Model Selection for Unsupervised Learning in High Dimensions
Proceedings of Neural Information Processing Systems, 1999
"... In this paper we describe an approach to model selection in unsupervised learning. This approach determines both the feature set and the number of clusters. To this end we first derive an objective function that explicitly incorporates this generalization. We then evaluate two schemes for model sele ..."
Cited by 17 (2 self)
Abstract:
In this paper we describe an approach to model selection in unsupervised learning. This approach determines both the feature set and the number of clusters. To this end we first derive an objective function that explicitly incorporates this generalization. We then evaluate two schemes for model selection: one using this objective function (a Bayesian estimation scheme that selects the best model structure using the marginal or integrated likelihood), and a second based on a cross-validated likelihood criterion. In the first scheme, for a particular application in document clustering, we derive a closed-form solution of the integrated likelihood by assuming an appropriate form of the likelihood function and prior. Extensive experiments are carried out to ascertain the validity of both approaches, and all results are verified by comparison against ground truth. In our experiments the Bayesian scheme using our objective function gave better results than cross-validatio...
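The second scheme, the cross-validated likelihood criterion, can be sketched as scoring held-out counts under a smoothed multinomial fitted to training counts; model selection would keep whichever candidate structure scores highest. The smoothing constant and names below are illustrative assumptions, not the paper's formulation.

```python
import math

def heldout_loglik(train_counts, test_counts, eps=0.5):
    # Fit a smoothed multinomial to the training counts, then score the
    # held-out counts under it; a higher value indicates a better model.
    total = sum(train_counts) + eps * len(train_counts)
    return sum(c * math.log((t + eps) / total)
               for t, c in zip(train_counts, test_counts))
```

A model whose training distribution matches the held-out data scores higher than a mismatched one, which is the signal cross-validated model selection exploits.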
A semi-supervised document clustering technique for information organization
Proceedings of the ninth international conference on Information and knowledge management, 2000
"... This paper discusses a new type of semisupervised document clustering that uses partial supervision to partition a large set of documents. Most clustering methods organizes documents into groups based only on similarity measures. Unfortunately, the traditional approaches to document clustering are ..."
Cited by 10 (0 self)
Abstract:
This paper discusses a new type of semi-supervised document clustering that uses partial supervision to partition a large set of documents. Most clustering methods organize documents into groups based only on similarity measures. Unfortunately, the traditional approaches to document clustering are often unable to correctly discern structural details hidden within the document corpus because their algorithms inherently depend strongly on the documents themselves and their similarity to each other. In this paper, we attempt to isolate more semantically coherent clusters by employing the domain-specific knowledge provided by a document analyst. By using external human knowledge to guide the clustering mechanism with some flexibility when creating the clusters, clustering efficiency can be considerably enhanced. As a basic clustering strategy, we use a variant of complete-linkage agglomerative hierarchical clustering, and develop the concepts (or seeds) of requested clusters by exploiting user-relevance feedback. Although the proposed method is slow when applied to large document collections, it yields higher quality clusters. Through experiments using the Reuters-21578 corpus, we show that the proposed method outperforms an unsupervised clustering method.
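The basic complete-linkage agglomerative strategy the abstract builds on can be sketched as follows. This is a plain unsupervised version without the paper's seed and feedback mechanism; the names and the quadratic merge loop are illustrative.

```python
def complete_linkage(points, k, dist):
    # Agglomerative clustering: start from singletons and repeatedly merge
    # the pair of clusters whose two farthest members are closest
    # (complete linkage), until only k clusters remain.
    clusters = [[p] for p in points]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = max(dist(a, b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i].extend(clusters.pop(j))
    return clusters
```

The paper's variant would additionally bias these merges toward analyst-provided seed concepts.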
Variational Bayes for mixture models: Reversing EM
2000
"... Bayesian calculations for mixture models are hampered by the fact that exact calculation of the necessary parameter integrals is exponentially complex. Variational lower bounds are a simple and efficient way to approximate such integrals (Attias, 1999). This note presents a general variational metho ..."
Cited by 3 (0 self)
Abstract:
Bayesian calculations for mixture models are hampered by the fact that exact calculation of the necessary parameter integrals is exponentially complex. Variational lower bounds are a simple and efficient way to approximate such integrals (Attias, 1999). This note presents a general variational method, based on "reversing" EM, and its application to Gaussian and multinomial mixtures. Experiments show the benefits and drawbacks of lower bounds compared to Taylor expansion (Laplace's method).
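The Jensen-inequality lower bound underlying such variational methods can be illustrated for a univariate Gaussian mixture: for any responsibility distribution q over components, the bound never exceeds the exact log-likelihood, with equality at the true posterior. This is a generic sketch, not the note's "reversed EM" derivation, and all names are assumptions.

```python
import math

def gauss_logpdf(x, mu, var):
    # Log density of a univariate Gaussian.
    return -0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)

def exact_loglik(x, weights, mus, variances):
    # Exact mixture log-likelihood: log sum_k w_k N(x | mu_k, var_k).
    return math.log(sum(w * math.exp(gauss_logpdf(x, m, v))
                        for w, m, v in zip(weights, mus, variances)))

def lower_bound(x, weights, mus, variances, q):
    # Jensen lower bound for any distribution q over the components:
    # sum_k q_k (log w_k + log N(x | mu_k, var_k) - log q_k).
    return sum(qk * (math.log(w) + gauss_logpdf(x, m, v) - math.log(qk))
               for qk, w, m, v in zip(q, weights, mus, variances) if qk > 0)
```

Variational algorithms tighten this bound by choosing q, which is cheaper than the exponential exact parameter integrals the abstract refers to.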
Lightweight Document Clustering
2000
"... Alightweight document clustering method is described that operates in high dimensions, processes tens of thousands of documents and groups them into several thousand clusters, or byvarying a single parameter, into a few dozen clusters. The method uses a reduced indexing view of the original docum ..."
Cited by 2 (1 self)
Abstract:
A lightweight document clustering method is described that operates in high dimensions, processes tens of thousands of documents, and groups them into several thousand clusters or, by varying a single parameter, into a few dozen clusters. The method uses a reduced indexing view of the original documents, where only the k best keywords of each document are indexed. An efficient procedure for clustering is specified in two parts: (a) compute the k most similar documents for each document in the collection, and (b) group the documents into clusters using these similarity scores. The method has been evaluated on a database of over 50,000 customer service problem reports that are reduced to 3,000 clusters and 5,000 exemplar documents. Results demonstrate efficient clustering performance with excellent group similarity measures. Keywords: text clustering, structuring information to aid search and navigation, automated presentation of information, text data mining.
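The reduced-indexing step and part (a), ranking each document's k most similar neighbors, might look like the following sketch. Keyword-overlap similarity and all names are assumptions here; the paper's exact scoring may differ.

```python
from collections import Counter

def reduced_index(docs, k):
    # Reduced indexing: keep only the k highest-frequency terms per document.
    return {doc_id: {t for t, _ in Counter(tokens).most_common(k)}
            for doc_id, tokens in docs.items()}

def most_similar(index, doc_id, k):
    # Part (a): rank the other documents by keyword overlap with doc_id.
    me = index[doc_id]
    ranked = sorted(((len(me & kw), other) for other, kw in index.items()
                     if other != doc_id), reverse=True)
    return [other for _, other in ranked[:k]]
```

Part (b) would then group documents whose mutual similarity scores are high, using only these truncated keyword sets, which is what keeps the method lightweight in high dimensions.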
A Quantification of Cluster Novelty with an Application to Martian Topography
"... Abstract. Automated tools for knowledge discovery are frequently invoked in databases where objects already group into some known classification scheme. In the context of unsupervised learning or clustering, such tools delve inside large databases looking for alternative classification schemes that ..."
Cited by 1 (0 self)
Abstract:
Automated tools for knowledge discovery are frequently invoked in databases where objects already group into some known classification scheme. In the context of unsupervised learning or clustering, such tools delve inside large databases looking for alternative classification schemes that are both meaningful and novel. A quantification of cluster novelty can be looked upon as the degree of separation between each new cluster and its most similar class. Our approach models each cluster and class as a Gaussian distribution and estimates the degree of overlap between both distributions by measuring their intersecting area. Unlike other metrics, our method quantifies the novelty of each cluster individually and enables us to rank classes according to their similarity to each new cluster. We test our algorithm on Martian landscapes using a set of known classes called geological units; experimental results show a new interpretation for the characterization of Martian landscapes.
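The intersecting-area idea can be sketched in one dimension by numerically integrating the pointwise minimum of two Gaussian densities. The integration range and step count are illustrative choices, not the paper's method.

```python
import math

def gauss_pdf(x, mu, sigma):
    # Univariate Gaussian density.
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def overlap_area(mu1, s1, mu2, s2, steps=20000):
    # Midpoint-rule integral of min(pdf1, pdf2): 1.0 means identical
    # distributions; values near 0 mark a clearly novel cluster.
    lo = min(mu1 - 6 * s1, mu2 - 6 * s2)
    hi = max(mu1 + 6 * s1, mu2 + 6 * s2)
    dx = (hi - lo) / steps
    return sum(min(gauss_pdf(lo + (i + 0.5) * dx, mu1, s1),
                   gauss_pdf(lo + (i + 0.5) * dx, mu2, s2)) * dx
               for i in range(steps))
```

Ranking known classes by this overlap against a new cluster gives the per-cluster novelty ordering the abstract describes.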
Dimensionality Reduction
"... Dimensionality reduction studies methods that effectively reduce data dimensionality for efficient data processing tasks such as pattern recognition, machine learning, text retrieval, and data mining. We introduce the field of dimensionality reduction by dividing it into two parts: feature extracti ..."
Abstract:
Dimensionality reduction studies methods that effectively reduce data dimensionality for efficient data processing tasks such as pattern recognition, machine learning, text retrieval, and data mining. We introduce the field of dimensionality reduction by dividing it into two parts: feature extraction and feature selection. Feature extraction creates new features resulting from the combination of the original features, while feature selection produces a subset of the original features. Both attempt to reduce the dimensionality of a dataset in order to facilitate efficient data processing tasks. We introduce key concepts of feature extraction and feature selection, describe some basic methods, and illustrate their applications with practical cases. Extensive research into dimensionality reduction has been carried out over the past several decades, and demand continues to grow because of important high-dimensional applications such as gene expression data, text categorization, and document indexing.
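A minimal example of the feature-selection side is an unsupervised variance filter that keeps the k most variable columns. The criterion and names here are one common choice for illustration, not the survey's prescription.

```python
def variance(column):
    # Population variance of one feature column.
    m = sum(column) / len(column)
    return sum((x - m) ** 2 for x in column) / len(column)

def select_features(rows, k):
    # Feature selection: return the indices of the k highest-variance
    # columns of a row-major dataset (a constant column carries no signal).
    cols = list(zip(*rows))
    ranked = sorted(range(len(cols)), key=lambda j: variance(cols[j]), reverse=True)
    return sorted(ranked[:k])
```

Feature extraction, by contrast, would build new columns as combinations of the originals (e.g., principal components) rather than picking a subset.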
External Cluster Assessment
2005
"... Automated tools for knowledge discovery are frequently invoked in databases where objects already group into some known (i.e., external) classification scheme. In the context of unsupervised learning or clustering, such tools delve inside large databases looking for alternative classification scheme ..."
Abstract:
Automated tools for knowledge discovery are frequently invoked in databases where objects already group into some known (i.e., external) classification scheme. In the context of unsupervised learning or clustering, such tools delve inside large databases looking for alternative classification schemes that are meaningful and novel. An assessment of the information gained with new clusters can be effected by looking at the degree of separation between each new cluster and its most similar class. Our approach models each cluster and class as a multivariate Gaussian distribution and estimates their degree of separation through an information-theoretic measure (i.e., relative entropy, or Kullback-Leibler distance). The inherently large computational cost of this step is alleviated by first projecting all data onto the single dimension that best separates the two distributions (using Fisher's Linear Discriminant). We test our algorithm on a dataset of Martian surfaces using the traditional division into geological units as external classes and the new, hydrology-inspired, automatically performed division as novel clusters. We find the new partitioning constitutes a formally meaningful classification that deviates substantially from the traditional classification.
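The projection-then-divergence pipeline can be sketched for two 2-D Gaussians: compute Fisher's discriminant direction, project each distribution to one dimension, then apply the closed-form univariate KL divergence. This is a simplified sketch of the approach with assumed names, written out for the 2-D case only.

```python
import math

def kl_1d(m1, v1, m2, v2):
    # Closed-form KL divergence between two univariate Gaussians.
    return 0.5 * (math.log(v2 / v1) + (v1 + (m1 - m2) ** 2) / v2 - 1.0)

def fisher_projection(mu1, cov1, mu2, cov2):
    # Fisher's Linear Discriminant direction w = (S1 + S2)^-1 (mu1 - mu2),
    # with the 2x2 matrix inverse written out explicitly.
    a = [[cov1[0][0] + cov2[0][0], cov1[0][1] + cov2[0][1]],
         [cov1[1][0] + cov2[1][0], cov1[1][1] + cov2[1][1]]]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    inv = [[a[1][1] / det, -a[0][1] / det],
           [-a[1][0] / det, a[0][0] / det]]
    d = [mu1[0] - mu2[0], mu1[1] - mu2[1]]
    return [inv[0][0] * d[0] + inv[0][1] * d[1],
            inv[1][0] * d[0] + inv[1][1] * d[1]]

def project(mu, cov, w):
    # Mean and variance of a Gaussian after projecting onto direction w.
    m = w[0] * mu[0] + w[1] * mu[1]
    v = (w[0] * (cov[0][0] * w[0] + cov[0][1] * w[1])
         + w[1] * (cov[1][0] * w[0] + cov[1][1] * w[1]))
    return m, v
```

Reducing each pair of distributions to one dimension before computing the divergence is what sidesteps the expensive multivariate computation the abstract mentions.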