Results 1 - 10
of
10
Non-Redundant Data Clustering
, 2004
"... Data clustering is a popular approach for automatically finding classes, concepts, or groups of patterns. In practice this discovery process should avoid redundancies with existing knowledge about class structures or groupings, and reveal novel, previously unknown aspects of the data. In order to de ..."
Abstract
-
Cited by 46 (2 self)
- Add to MetaCart
Data clustering is a popular approach for automatically finding classes, concepts, or groups of patterns. In practice this discovery process should avoid redundancies with existing knowledge about class structures or groupings, and reveal novel, previously unknown aspects of the data. In order to deal with this problem, we present an extension of the information bottleneck framework, called coordinated conditional information bottleneck, which takes negative relevance information into account by maximizing a conditional mutual information score subject to constraints. Algorithmically, one can apply an alternating optimization scheme that can be used in conjunction with different types of numeric and non-numeric attributes. We present experimental results for applications in text mining and computer vision.
Generative model-based document clustering: a comparative study
- Knowledge and Information Systems
, 2005
"... Semi-supervised learning has become an attractive methodology for improving classification models and is often viewed as using unlabeled data to aid supervised learning. However, it can also be viewed as using labeled data to help clustering, namely, semi-supervised clustering. Viewing semi-supervis ..."
Abstract
-
Cited by 16 (0 self)
- Add to MetaCart
Semi-supervised learning has become an attractive methodology for improving classification models and is often viewed as using unlabeled data to aid supervised learning. However, it can also be viewed as using labeled data to help clustering, namely, semi-supervised clustering. Viewing semi-supervised learning from a clustering angle is useful in practical situations when the set of labels available in labeled data are not complete, i.e., unlabeled data contain new classes that are not present in labeled data. This paper analyzes several multinomial modelbased semi-supervised document clustering methods under a principled model-based clustering framework. The framework naturally leads to a deterministic annealing extension of existing semi-supervised clustering approaches. We compare three (slightly) different semi-supervised approaches for clustering documents: Seeded damnl, Constrained damnl, and Feedback-based damnl, where damnl stands for multinomial model-based deterministic annealing algorithm. The first two are extensions of the seeded k-means and constrained k-means algorithms studied by Basu et al. (2002); the last one is motivated by Cohn et al. (2003). Through empirical experiments on text datasets, we show that: (a) deterministic annealing can often significantly improve the performance of semi-supervised clustering; (b) the constrained approach is the best when available labels are complete whereas the feedback-based approach excels when available labels are incomplete.
Non-Redundant Multi-View Clustering Via Orthogonalization
"... Typical clustering algorithms output a single clustering of the data. However, in real world applications, data can often be interpreted in many different ways; data can have different groupings that are reasonable and interesting from different perspectives. This is especially true for high-dimensi ..."
Abstract
-
Cited by 15 (2 self)
- Add to MetaCart
Typical clustering algorithms output a single clustering of the data. However, in real world applications, data can often be interpreted in many different ways; data can have different groupings that are reasonable and interesting from different perspectives. This is especially true for high-dimensional data, where different feature subspaces may reveal different structures of the data. Why commit to one clustering solution while all these alternative clustering views might be interesting to the user. In this paper, we propose a new clustering paradigm for explorative data analysis: find all non-redundant clustering views of the data, where data points of one cluster can belong to different clusters in other views. We present a framework to solve this problem and suggest two approaches within this framework: (1) orthogonal clustering, and (2) clustering in orthogonal subspaces. In essence, both approaches find alternative ways to partition the data by projecting it to a space that is orthogonal to our current solution. The first approach seeks orthogonality in the cluster space, while the second approach seeks orthogonality in the feature space. We test our framework on both synthetic and high-dimensional benchmark data sets, and the results show that indeed our approaches were able to discover varied solutions that are interesting and meaningful.
COALA: A novel approach for the extraction of an alternate clustering of high quality and high dissimilarity
- in ICDM, 2006
"... Cluster analysis has long been a fundamental task in data mining and machine learning. However, traditional clustering methods concentrate on producing a single solution, even though multiple alternative clusterings may exist. It is thus difficult for the user to validate whether the given solution ..."
Abstract
-
Cited by 15 (1 self)
- Add to MetaCart
Cluster analysis has long been a fundamental task in data mining and machine learning. However, traditional clustering methods concentrate on producing a single solution, even though multiple alternative clusterings may exist. It is thus difficult for the user to validate whether the given solution is in fact appropriate, particularly for large and complex datasets. In this paper we explore the critical requirements for systematically finding a new clustering, given that an already known clustering is available and we also propose a novel algorithm, COALA, to discover this new clustering. Our approach is driven by two important factors; dissimilarity and quality. These are especially important for finding a new clustering which is highly informative about the underlying structure of data, but is at the same time distinctively different from the provided clustering. We undertake an experimental analysis and show that our method is able to outperform existing techniques, for both synthetic and real datasets. 1.
The Information Bottleneck Revisited or How to Choose a Good Distortion Measure
"... Abstract — It is well-known that the information bottleneck method and rate distortion theory are related. Here it is described how the information bottleneck can be considered as rate distortion theory for a family of probability measures where information divergence is used as distortion measure. ..."
Abstract
-
Cited by 5 (0 self)
- Add to MetaCart
Abstract — It is well-known that the information bottleneck method and rate distortion theory are related. Here it is described how the information bottleneck can be considered as rate distortion theory for a family of probability measures where information divergence is used as distortion measure. It is shown that the information bottleneck method has some properties that are not shared with rate distortion theory based on any other divergence measure. In this sense the information bottleneck method is unique. I.
Generation of Alternative Clusterings Using the CAMI Approach
"... Exploratory data analysis aims to discover and generate multiple views of the structure within a dataset. Conventional clustering techniques, however, are designed to only provide a single grouping or clustering of a dataset. In this paper, we introduce a novel algorithm called CAMI, that can uncove ..."
Abstract
-
Cited by 4 (1 self)
- Add to MetaCart
Exploratory data analysis aims to discover and generate multiple views of the structure within a dataset. Conventional clustering techniques, however, are designed to only provide a single grouping or clustering of a dataset. In this paper, we introduce a novel algorithm called CAMI, that can uncover alternative clusterings from a dataset. CAMI takes a mathematically appealing approach, combining the use of mutual information to distinguish between alternative clusterings, coupled with an expectation maximization framework to ensure clustering quality. We experimentally test CAMI on both synthetic and real-world datasets, comparing it against a variety of state-of-the-art algorithms. We demonstrate that CAMI’s performance is high and that its formulation provides a number of advantages compared to existing techniques. 1
An Information Theoretic Approach to Machine Learning
, 2005
"... In this thesis, theory and applications of machine learning systems based on information theoretic criteria as performance measures are studied. A new clustering algorithm based on maximizing the Cauchy-Schwarz (CS) divergence measure between probability density functions (pdfs) is proposed. The CS ..."
Abstract
-
Cited by 4 (1 self)
- Add to MetaCart
In this thesis, theory and applications of machine learning systems based on information theoretic criteria as performance measures are studied. A new clustering algorithm based on maximizing the Cauchy-Schwarz (CS) divergence measure between probability density functions (pdfs) is proposed. The CS divergence is estimated non-parametrically using the Parzen window technique for density estimation. The problem domain is transformed from discrete 0/1 cluster membership values to continuous membership values. A constrained gradient descent maximization algorithm is implemented. The gradients are stochastically approximated to reduce computational complexity, making the algorithm more practical. Parzen window annealing is incorporated into the algorithm to help avoid convergence to a local maximum. The clustering results obtained on synthetic and real data are encouraging. The Parzen window-based estimator for the CS divergence is shown to have a dual expression as a measure of the cosine of the angle between cluster mean vectors in a feature space determined by the eigenspectrum of a Mercer kernel matrix. A spectral clustering
Non-redundant clustering
, 2005
"... Data mining and knowledge discovery attempt to reveal concepts, patterns, relationships, and struc-tures of interest in data. Typically, data may have many such structures. Most existing data mining techniques allow the user little say in which structure will be returned from the search. Those techn ..."
Abstract
-
Cited by 2 (0 self)
- Add to MetaCart
Data mining and knowledge discovery attempt to reveal concepts, patterns, relationships, and struc-tures of interest in data. Typically, data may have many such structures. Most existing data mining techniques allow the user little say in which structure will be returned from the search. Those techniques which do allow the user control over the search typically require supervised information in the form of knowledge about a target solution. In the spirit of exploratory data mining, we consider the setting where the user does not have information about a target solution. Instead we suppose the user can provide information about solutions which are not desired. These undesired solutions may be previously obtained from data mining algorithms, or they may be known to the user a priori. The goal is then to discover novel structure in the dataset which is not redundant with respect to the known structure. Techniques should guide the search away from this known structure and towards novel, interesting structures. We describe and formally define the task of non-redundant clustering. Three different algorithmic approaches are derived for non-redundant clustering. Their performance is experimentally evaluated on data sets containing multiple cluster-ings. We explore how these techniques may be extended to systematically enumerate clusterings in a data set. Finally, we also investigate whether non-redundant approaches may be incorporated to enhance state-of-the-art supervised techniques.
Interpreting Classifiers by Multiple Views
, 2005
"... Next to prediction accuracy, interpretability is one of the fundamental performance criteria for machine learning. While high accuracy learners have intensively been explored, interpretability still poses a difficult problem. To combine ..."
Abstract
- Add to MetaCart
Next to prediction accuracy, interpretability is one of the fundamental performance criteria for machine learning. While high accuracy learners have intensively been explored, interpretability still poses a difficult problem. To combine
Learning with Multiple Views Proposal for an ICML Workshop
"... We propose to have a workshop on multi-view learning at the Twenty-Second International Conference on Machine Learning. Two main reasons lead us to the conclusion that the community would benefit from such a workshop. – Multi-view learning is a natural, yet non-standard new problem setting; in ..."
Abstract
- Add to MetaCart
We propose to have a workshop on multi-view learning at the Twenty-Second International Conference on Machine Learning. Two main reasons lead us to the conclusion that the community would benefit from such a workshop. – Multi-view learning is a natural, yet non-standard new problem setting; in

