Results 1 - 10 of 159
Data Clustering: 50 Years Beyond K-Means, 2008. Cited by 294 (7 self).
Organizing data into sensible groupings is one of the most fundamental modes of understanding and learning. As an example, a common scheme of scientific classification puts organisms into taxonomic ranks: domain, kingdom, phylum, class, etc. Cluster analysis is the formal study of algorithms and methods for grouping, or clustering, objects according to measured or perceived intrinsic characteristics or similarity. Cluster analysis does not use category labels that tag objects with prior identifiers, i.e., class labels. The absence of category information distinguishes data clustering (unsupervised learning) from classification or discriminant analysis (supervised learning). The aim of clustering is exploratory: to find structure in data. Clustering has a long and rich history in a variety of scientific fields. One of the most popular and simple clustering algorithms, K-means, was first published in 1955. Although K-means was proposed over 50 years ago and thousands of clustering algorithms have been published since then, it is still widely used. This speaks to the difficulty of designing a general-purpose clustering algorithm and to the ill-posed nature of the clustering problem. We provide a brief overview of clustering, summarize well-known clustering methods, discuss the major challenges and key issues in designing clustering algorithms, and point out some of the emerging and useful research directions, including semi-supervised clustering, ensemble clustering, simultaneous feature selection during clustering, and large-scale data clustering.
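
Since the survey centers on K-means, a minimal sketch of Lloyd's algorithm may help fix ideas. This is the textbook alternation of assignment and centroid-update steps, not code from the paper; all parameter names and defaults here are illustrative.

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Lloyd's algorithm: alternate nearest-centroid assignment and
    centroid recomputation until the centroids stop moving."""
    rng = np.random.default_rng(seed)
    # Initialize centroids with k distinct data points chosen at random.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign each point to its nearest centroid (squared Euclidean).
        d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        # Recompute each centroid as the mean of its assigned points;
        # keep the old centroid if a cluster happens to empty out.
        new = np.array([X[labels == j].mean(axis=0) if (labels == j).any()
                        else centroids[j] for j in range(k)])
        if np.allclose(new, centroids):
            break
        centroids = new
    return labels, centroids
```

In practice the random initialization above is usually replaced by a smarter seeding such as k-means++, since Lloyd's algorithm only finds a local optimum.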
Algorithms and applications for approximate nonnegative matrix factorization - Computational Statistics and Data Analysis, 2006. Cited by 204 (8 self).
In this paper we discuss the development and use of low-rank approximate nonnegative matrix factorization (NMF) algorithms for feature extraction and identification in the fields of text mining and spectral data analysis. The evolution and convergence properties of hybrid methods based on both sparsity and smoothness constraints for the resulting nonnegative matrix factors are discussed. The interpretability of NMF outputs in specific contexts is discussed, along with opportunities for future work in the modification of NMF algorithms for large-scale and time-varying datasets. Key words: nonnegative matrix factorization, text mining, spectral data analysis, email surveillance, conjugate gradient, constrained least squares.
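
For orientation, the classical multiplicative-update baseline that hybrid NMF methods build on is sketched below (Lee-Seung updates for the Frobenius objective). This is a generic sketch, not the paper's hybrid algorithm; the `eps` guard and iteration count are my additions.

```python
import numpy as np

def nmf_multiplicative(X, r, n_iter=200, eps=1e-9, seed=0):
    """Lee-Seung multiplicative updates for min ||X - WH||_F^2, W, H >= 0."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, r))
    H = rng.random((r, n))
    for _ in range(n_iter):
        # Each ratio update preserves nonnegativity and does not
        # increase the Frobenius reconstruction error.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H
```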
Orthogonal nonnegative matrix tri-factorizations for clustering - In SIGKDD, 2006. Cited by 117 (22 self).
Currently, most research on nonnegative matrix factorization (NMF) focuses on the 2-factor X = FG^T factorization. We provide a systematic analysis of 3-factor X = FSG^T NMF. While unconstrained 3-factor NMF is equivalent to unconstrained 2-factor NMF, constrained 3-factor NMF brings new features to constrained 2-factor NMF. We study the orthogonality constraint because it leads to a rigorous clustering interpretation. We provide new rules for updating F, S, G and prove the convergence of these algorithms. Experiments on 5 datasets and a real-world case study are performed to show the capability of bi-orthogonal 3-factor NMF to simultaneously cluster the rows and columns of the input data matrix. We provide a new approach to evaluating the quality of clustering on words using class aggregate distribution and multi-peak distribution. We also provide an overview of various NMF extensions and examine their relationships.
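
A sketch of what such updates look like for the bi-orthogonal tri-factorization is below. The square-root update form follows the rules derived in the paper, but this should be read as an illustrative reconstruction rather than a reference implementation; `eps` and the random initialization are my assumptions.

```python
import numpy as np

def onmtf(X, k_row, k_col, n_iter=200, eps=1e-9, seed=0):
    """Sketch of bi-orthogonal tri-factorization X ~ F S G^T with
    F, S, G >= 0; F clusters rows of X, G clusters its columns."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    F = rng.random((m, k_row))
    G = rng.random((n, k_col))
    S = rng.random((k_row, k_col))
    for _ in range(n_iter):
        # Square-root multiplicative updates of the form used for
        # orthogonality-constrained NMF.
        G *= np.sqrt((X.T @ F @ S) / (G @ G.T @ X.T @ F @ S + eps))
        F *= np.sqrt((X @ G @ S.T) / (F @ F.T @ X @ G @ S.T + eps))
        S *= np.sqrt((F.T @ X @ G) / (F.T @ F @ S @ G.T @ G + eps))
    return F, S, G
```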
Convex and Semi-Nonnegative Matrix Factorizations, 2008. Cited by 112 (10 self).
We present several new variations on the theme of nonnegative matrix factorization (NMF). Considering factorizations of the form X = FG^T, we focus on algorithms in which G is restricted to contain nonnegative entries, but allow the data matrix X to have mixed signs, thus extending the applicable range of NMF methods. We also consider algorithms in which the basis vectors of F are constrained to be convex combinations of the data points. This is used for a kernel extension of NMF. We provide algorithms for computing these new factorizations and we provide supporting theoretical analysis. We also analyze the relationships between our algorithms and clustering algorithms, and consider the implications for sparseness of solutions. Finally, we present experimental results that explore the properties of these new methods.
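
To illustrate the semi-NMF case (nonnegative G, mixed-sign X and F), here is a hedged sketch along the lines the abstract describes: F gets a closed-form least-squares update given G, while G gets a multiplicative update that separates matrices into positive and negative parts. The `pos`/`neg` helpers, the use of `pinv`, and `eps` are my assumptions.

```python
import numpy as np

def semi_nmf(X, r, n_iter=200, eps=1e-9, seed=0):
    """Sketch of semi-NMF: X (mixed signs) ~ F G^T with only G >= 0."""
    rng = np.random.default_rng(seed)
    pos = lambda A: (np.abs(A) + A) / 2  # elementwise positive part
    neg = lambda A: (np.abs(A) - A) / 2  # elementwise negative part
    G = rng.random((X.shape[1], r))
    for _ in range(n_iter):
        # F has a closed-form least-squares solution given G.
        F = X @ G @ np.linalg.pinv(G.T @ G)
        XtF, FtF = X.T @ F, F.T @ F
        # The split into positive/negative parts keeps G nonnegative
        # even though X and F carry mixed signs.
        G *= np.sqrt((pos(XtF) + G @ neg(FtF)) /
                     (neg(XtF) + G @ pos(FtF) + eps))
    return F, G
```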
Learning spectral clustering, with application to speech separation - Journal of Machine Learning Research, 2006. Cited by 70 (6 self).
Spectral clustering refers to a class of techniques which rely on the eigenstructure of a similarity matrix to partition points into disjoint clusters, with points in the same cluster having high similarity and points in different clusters having low similarity. In this paper, we derive new cost functions for spectral clustering based on measures of error between a given partition and a solution of the spectral relaxation of a minimum normalized cut problem. Minimizing these cost functions with respect to the partition leads to new spectral clustering algorithms. Minimizing with respect to the similarity matrix leads to algorithms for learning the similarity matrix from fully labelled datasets. We apply our learning algorithm to the blind one-microphone speech separation problem, casting the problem as one of segmentation of the spectrogram.
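
For context, the spectral relaxation of the minimum normalized cut that the paper's cost functions measure error against corresponds to the standard spectral clustering pipeline sketched below. This is the common baseline, not the paper's learned-similarity algorithm; the library choices are mine.

```python
import numpy as np
from scipy.linalg import eigh
from sklearn.cluster import KMeans

def spectral_clustering(W, k):
    """Standard normalized-cut relaxation: embed points via eigenvectors
    of the normalized Laplacian, then round with k-means."""
    d = W.sum(axis=1) + 1e-12
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    # Normalized Laplacian L_sym = I - D^{-1/2} W D^{-1/2}.
    L = np.eye(len(W)) - D_inv_sqrt @ W @ D_inv_sqrt
    # The bottom k eigenvectors span the relaxed cluster-indicator space.
    _, U = eigh(L, subset_by_index=[0, k - 1])
    # Row-normalize the spectral embedding and cluster it.
    U /= np.linalg.norm(U, axis=1, keepdims=True) + 1e-12
    return KMeans(n_clusters=k, n_init=10).fit_predict(U)
```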
Spectral clustering for multi-type relational data - In ICML, 2006. Cited by 60 (4 self).
Clustering on multi-type relational data has attracted increasing attention in recent years due to its high impact on various important applications, such as Web mining, e-commerce and bioinformatics. However, research on general multi-type relational data clustering is still limited and preliminary. The contribution of the paper is three-fold. First, we propose a general model, the collective factorization on related matrices, for multi-type relational data clustering. The model is applicable to relational data with various structures. Second, under this model, we derive a novel algorithm, spectral relational clustering, to cluster multi-type interrelated data objects simultaneously. The algorithm iteratively embeds each type of data object into low-dimensional spaces and benefits from the interactions among the hidden structures of the different types of data objects. Extensive experiments demonstrate the promise and effectiveness of the proposed algorithm. Third, we show that existing spectral clustering algorithms can be considered special cases of the proposed model and algorithm, which demonstrates the theoretical generality of the proposed model and algorithm.
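
The two-type special case of this setting is bipartite spectral co-clustering, which gives a feel for the joint-embedding idea. The sketch below handles a single relation matrix and is illustrative only, since the paper's spectral relational clustering handles arbitrarily many interrelated object types.

```python
import numpy as np
from sklearn.cluster import KMeans

def coclust_two_types(R, k):
    """Two-type special case: co-cluster the rows (one object type) and
    columns (another type) of a nonnegative relation matrix R via the
    SVD of the degree-normalized matrix."""
    d1 = np.sqrt(R.sum(axis=1)) + 1e-12
    d2 = np.sqrt(R.sum(axis=0)) + 1e-12
    Rn = R / d1[:, None] / d2[None, :]
    U, _, Vt = np.linalg.svd(Rn, full_matrices=False)
    # Skip the trivial leading singular pair; embed both types jointly.
    Z = np.vstack([U[:, 1:k] / d1[:, None], Vt[1:k, :].T / d2[:, None]])
    labels = KMeans(n_clusters=k, n_init=10).fit_predict(Z)
    return labels[:R.shape[0]], labels[R.shape[0]:]
```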
Adaptive dimension reduction using discriminant analysis and k-means clustering - In ICML, 2007. Cited by 55 (7 self).
We combine linear discriminant analysis (LDA) and K-means clustering into a coherent framework to adaptively select the most discriminative subspace. We use K-means clustering to generate class labels and LDA to perform subspace selection. The clustering process is thus integrated with the subspace selection process, and the data are simultaneously clustered while the feature subspaces are selected. We show the rich structure of the general LDA-Km framework by examining its variants and their relationships to earlier approaches. Extensive experimental results on real-world datasets show the effectiveness of our approach.
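
A minimal sketch of the LDA-Km alternation, using scikit-learn's KMeans and LinearDiscriminantAnalysis for concreteness. The stopping rule and round count are my choices, and the sketch assumes no cluster empties between rounds (which would reduce the number of LDA components available).

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def lda_km(X, k, n_rounds=10, seed=0):
    """Alternate: k-means labels feed LDA, whose discriminative
    subspace feeds the next k-means round."""
    labels = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(X)
    for _ in range(n_rounds):
        # LDA yields at most k-1 discriminative directions.
        lda = LinearDiscriminantAnalysis(n_components=min(k - 1, X.shape[1]))
        Z = lda.fit_transform(X, labels)
        new = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(Z)
        if np.array_equal(new, labels):  # labels stable: converged
            break
        labels = new
    return labels
```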
SVD based initialization: A head start for nonnegative matrix factorization - Pattern Recognition, 2007.
On the equivalence between non-negative matrix factorization and probabilistic latent semantic indexing - Computational Statistics and Data Analysis, 2008. Cited by 50 (4 self).
Non-negative Matrix Factorization (NMF) and Probabilistic Latent Semantic Indexing (PLSI) have been successfully applied to document clustering recently. In this paper, we show that PLSI and NMF (with the I-divergence objective function) optimize the same objective function, although PLSI and NMF are different algorithms, as verified by experiments. This provides a theoretical basis for a new hybrid method that runs PLSI and NMF alternately, each jumping out of the local minima of the other method successively, thus achieving a better final solution. Extensive experiments on five real-life datasets show relations between NMF and PLSI, and indicate that the hybrid method leads to significant improvements over NMF-only or PLSI-only methods. We also show that, at first-order approximation, NMF is identical to the χ²-statistic.
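
In symbols (my notation, not necessarily the paper's), the claimed equivalence can be written as follows. NMF with the I-divergence objective minimizes

```latex
\min_{W,H \ge 0}\; D(X \,\|\, WH)
  = \sum_{ij}\Bigl(X_{ij}\log\frac{X_{ij}}{(WH)_{ij}} - X_{ij} + (WH)_{ij}\Bigr),
\qquad
\text{while PLSI maximizes}\quad
\sum_{ij} X_{ij}\log P(w_i, d_j),\quad
P(w_i, d_j) = \sum_{k} P(w_i \mid z_k)\,P(z_k)\,P(d_j \mid z_k).
```

Identifying (WH)_{ij} with P(w_i, d_j) under the normalization that the entries of WH sum to the entries of X makes the two objectives agree up to additive constants independent of the factors, which is the sense in which the two methods optimize the same function.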
The relationships among various nonnegative matrix factorization methods for clustering - In ICDM, 2006. Cited by 48 (10 self).
Nonnegative matrix factorization (NMF) has recently been shown to be useful for clustering, and various extensions of NMF have been proposed. In this paper we present an overview and theoretically analyze the relationships among them. In addition, we clarify previously unaddressed issues, such as NMF normalization, cluster posterior probability, and NMF algorithm convergence rate. Experiments are also conducted to empirically evaluate and compare the various factorization methods.
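
One normalization convention the abstract alludes to (a common one, though not necessarily the paper's exact choice) rescales the factors so that the columns of W sum to one; this leaves the product WH unchanged and lets the columns of the rescaled H be read as soft cluster posteriors.

```python
import numpy as np

def normalize_nmf(W, H):
    """Column-normalize W and absorb the scale into H; the product is
    unchanged because W D^{-1} D H = W H for D = diag(column sums)."""
    d = W.sum(axis=0) + 1e-12
    W_norm = W / d              # columns of W now sum to 1
    H_scaled = H * d[:, None]   # compensating row scaling of H
    # Normalizing each column of H_scaled gives, per data point, a
    # distribution over clusters, i.e., a soft cluster posterior.
    posterior = H_scaled / (H_scaled.sum(axis=0, keepdims=True) + 1e-12)
    return W_norm, H_scaled, posterior
```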