Results 1–10 of 187
Evolutionary Spectral Clustering by Incorporating Temporal Smoothness
, 2007
Cited by 89 (8 self)
Evolutionary clustering is an emerging research area essential to important applications such as clustering dynamic Web and blog contents and clustering data streams. In evolutionary clustering, a good clustering result should fit the current data well while not deviating too dramatically from the recent history. To fulfill this dual purpose, a measure of temporal smoothness is integrated into the overall measure of clustering quality. In this paper, we propose two frameworks that incorporate temporal smoothness in evolutionary spectral clustering. For both frameworks, we start with intuitions gained from the well-known k-means clustering problem, and then propose and solve corresponding cost functions for the evolutionary spectral clustering problems. Our solutions to the evolutionary spectral clustering problems provide more stable and consistent clustering results that are less sensitive to short-term noise while remaining adaptive to long-term cluster drifts. Furthermore, we demonstrate that our methods provide the optimal solutions to the relaxed versions of the corresponding evolutionary k-means clustering problems. Performance experiments over a number of real and synthetic data sets illustrate that our evolutionary spectral clustering methods provide more robust clustering results that are not sensitive to noise and can adapt to data drifts.
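The temporal-smoothness idea can be sketched in miniature: blend the current similarity matrix with the previous timestep's before the standard spectral step. The blending weight `alpha` and the specific blending scheme below are illustrative assumptions, not the paper's exact cost functions.

```python
import numpy as np

def smoothed_spectral_embedding(W_t, W_prev, alpha=0.8, k=2):
    """Evolutionary spectral embedding sketch: smooth the current
    similarity matrix toward the previous timestep's, then take the
    usual normalized-Laplacian spectral embedding."""
    W = alpha * W_t + (1.0 - alpha) * W_prev        # temporal smoothing
    d = W.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(d, 1e-12)))
    L_sym = np.eye(len(W)) - D_inv_sqrt @ W @ D_inv_sqrt
    # the k smallest eigenvectors span the relaxed cluster indicators
    vals, vecs = np.linalg.eigh(L_sym)
    return vecs[:, :k]

# two noisy snapshots of the same two-block cluster structure
rng = np.random.default_rng(0)
block = np.block([[np.ones((3, 3)), np.zeros((3, 3))],
                  [np.zeros((3, 3)), np.ones((3, 3))]])
W_prev = block + 0.05 * rng.random((6, 6))
W_t = block + 0.05 * rng.random((6, 6))
W_prev = (W_prev + W_prev.T) / 2
W_t = (W_t + W_t.T) / 2
U = smoothed_spectral_embedding(W_t, W_prev)
print(U.shape)   # (6, 2)
```

Clustering the rows of `U` with k-means then yields partitions that are stable across timesteps because both snapshots contribute to the embedding.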
Cross-domain sentiment classification via spectral feature alignment
 In WWW
, 2010
Cited by 60 (10 self)
Sentiment classification aims to automatically predict the sentiment polarity (e.g., positive or negative) of user-generated sentiment data (e.g., reviews, blogs). Although traditional classification algorithms can be used to train sentiment classifiers from manually labeled text data, the labeling work can be time-consuming and expensive. Meanwhile, users often use different words when they express sentiment in different domains. If we directly apply a classifier trained in one domain to other domains, the performance will be very low due to the differences between these domains. In this work, we develop a general solution to sentiment classification when we do not have any labels in a target domain but have some labeled data in a different domain, regarded as the source domain. In this cross-domain sentiment classification setting, to bridge the gap between the domains, we propose a spectral feature alignment approach.
Discriminative cluster analysis
 In International Conference on Machine Learning
, 2006
Cited by 38 (3 self)
Clustering is one of the most widely used statistical tools for data analysis. Among all existing clustering techniques, k-means is a very popular method because of its ease of programming and because it achieves a good trade-off between performance and computational complexity. However, k-means is prone to local minima problems, and it does not scale well to high-dimensional data sets. A common approach to dealing with high-dimensional data is to cluster in the space spanned by the principal components (PC). In this paper, we show the benefits of clustering in a low-dimensional discriminative space rather than in the PC space (generative). In particular, we propose a new clustering algorithm called Discriminative Cluster Analysis (DCA). DCA jointly performs dimensionality reduction and clustering. Several toy and real examples show the benefits of DCA versus traditional PCA+k-means clustering. Additionally, a new matrix formulation is suggested and connections with related techniques such as spectral graph methods and linear discriminant analysis are provided.
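The PCA+k-means baseline that the abstract contrasts with DCA can be sketched in a few lines: project onto the top principal components (a generative subspace), then run Lloyd's iterations there. This is the baseline, not the DCA algorithm itself; all names and parameters are illustrative.

```python
import numpy as np

def pca_kmeans(X, n_components=2, k=2, iters=20, seed=0):
    """PCA+k-means baseline sketch: PCA projection followed by
    Lloyd's iterations in the reduced space."""
    Xc = X - X.mean(axis=0)
    # top principal directions via SVD of the centered data
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    Z = Xc @ Vt[:n_components].T              # low-dimensional projection
    rng = np.random.default_rng(seed)
    centers = Z[rng.choice(len(Z), k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((Z[:, None] - centers) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):           # skip empty clusters
                centers[j] = Z[labels == j].mean(axis=0)
    return labels

# two well-separated Gaussian blobs in 10-D
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.1, (20, 10)),
               rng.normal(3, 0.1, (20, 10))])
labels = pca_kmeans(X)
print(labels.shape)   # (40,)
```

The paper's point is that the PC subspace used here maximizes reconstruction, not class separability, which motivates clustering in a discriminative subspace instead.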
Parallel Spectral Clustering
Cited by 36 (2 self)
Spectral clustering has been shown to be more effective at finding clusters than most traditional algorithms. However, spectral clustering suffers from a scalability problem in both memory use and computational time when the dataset is large. To perform clustering on large datasets, we propose to parallelize both memory use and computation on distributed computers. Through an empirical study on a large document dataset of 193,844 data instances and a large photo dataset of 637,137 photos, we demonstrate that our parallel algorithm can effectively alleviate the scalability problem. Key words: parallel spectral clustering, distributed computing
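The scalability bottleneck the abstract targets is the dense n-by-n similarity matrix. A common mitigation in this line of work, used alongside parallelization, is to keep only each point's t nearest neighbors. A minimal single-machine sketch, assuming a Gaussian kernel; the distributed part is not shown, and `t` and `sigma` are illustrative.

```python
import numpy as np

def t_nearest_similarity(X, t=3, sigma=1.0):
    """Sparsified similarity sketch: keeping only t neighbors per point
    reduces the O(n^2) memory footprint to O(n*t) nonzeros."""
    n = len(X)
    D2 = ((X[:, None] - X[None]) ** 2).sum(-1)      # pairwise sq. distances
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(D2[i])[1:t + 1]           # skip self at index 0
        W[i, nbrs] = np.exp(-D2[i, nbrs] / (2 * sigma ** 2))
    return np.maximum(W, W.T)                        # symmetrize

X = np.random.default_rng(2).random((50, 4))
W = t_nearest_similarity(X)
nnz = int((W > 0).sum())
print(W.shape, nnz <= 50 * 2 * 3)   # (50, 50) True
```

In the distributed setting, rows of `D2` and `W` would be computed on separate machines, which is the parallelization over memory and computation the abstract describes.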
A Least-Squares Framework for Component Analysis
, 2009
Cited by 25 (1 self)
Component analysis (CA) techniques such as PCA, LDA, CCA, Laplacian eigenmaps (LE), and spectral clustering (SC) have been extensively used as a feature extraction step for modeling, clustering, classification, and visualization. CA techniques are appealing because many can be formulated as eigen-problems, offering great potential for learning linear and nonlinear representations of data in closed form. However, the eigen-formulation often conceals important analytic and computational drawbacks of CA techniques, such as solving generalized eigen-problems with rank-deficient matrices (e.g., the small sample size problem), the lack of an intuitive interpretation of normalization factors, and the difficulty of understanding commonalities and differences between CA methods. This paper proposes a unified least-squares framework to formulate many CA methods. We show how PCA, LDA, CCA, LE, SC, and their kernel and regularized extensions correspond to particular instances of least-squares weighted kernel reduced rank regression (LS-WKRRR). The LS-WKRRR formulation of CA methods has several benefits: (1) it provides a clean connection between many CA techniques and an intuitive framework for understanding normalization factors; (2) it yields efficient numerical schemes to solve CA techniques; (3) it overcomes the small sample size problem; (4) it provides a framework to easily extend CA methods. We derive new weighted generalizations of PCA, LDA, CCA, and SC, and several novel CA techniques.
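The framework's core observation, in miniature for PCA: PCA is the rank-k least-squares regression of the centered data onto itself, so the optimal rank-k reconstructor is the projector onto the top principal subspace (Eckart-Young). A numpy check, with illustrative variable names:

```python
import numpy as np

# PCA as least-squares reduced rank regression: the best rank-k linear
# reconstruction min ||X - C X||_F with rank(C) <= k is C = U_k U_k^T,
# the projector onto the top-k left singular subspace of centered X.
rng = np.random.default_rng(4)
X = rng.random((5, 40))
X = X - X.mean(axis=1, keepdims=True)         # center each feature (row)
U, s, Vt = np.linalg.svd(X, full_matrices=False)
k = 2
P = U[:, :k] @ U[:, :k].T                      # PCA projector
err_pca = np.linalg.norm(X - P @ X)
# Eckart-Young: the residual equals the energy in the discarded
# singular values
print(np.isclose(err_pca, np.sqrt((s[k:] ** 2).sum())))  # True
```

The full framework generalizes this identity by adding weights and kernels on both sides of the regression, which is how LDA, CCA, LE, and SC fall out as special cases.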
The uniqueness of a good optimum for k-means
 In ICML
, 2006
Cited by 24 (3 self)
If we have found a "good" clustering C of a data set, can we prove that C is not far from the (unknown) best clustering C_opt of these data? Perhaps surprisingly, the answer to this question is sometimes yes. When "goodness" is measured by the distortion of k-means clustering, this paper proves spectral bounds on the distance d(C, C_opt). The bounds exist when the data admits a low-distortion clustering.
Clustered Multi-Task Learning Via Alternating Structure Optimization
Cited by 23 (7 self)
Multi-task learning (MTL) learns multiple related tasks simultaneously to improve generalization performance. Alternating structure optimization (ASO) is a popular MTL method that learns a shared low-dimensional predictive structure on hypothesis spaces from multiple related tasks, and it has been applied successfully in many real-world applications. As an alternative MTL approach, clustered multi-task learning (CMTL) assumes that multiple tasks follow a clustered structure, i.e., tasks are partitioned into a set of groups where tasks in the same group are similar to each other, and that such a clustered structure is unknown a priori. The objectives in ASO and CMTL differ in how multiple tasks are related. Interestingly, we show in this paper that ASO and CMTL are equivalent, providing significant new insights into both methods as well as their inherent relationship. The CMTL formulation is non-convex, and we adopt a convex relaxation of it. We further establish the equivalence between the proposed convex relaxation of CMTL and an existing convex relaxation of ASO, and show that the proposed convex CMTL formulation is significantly more efficient, especially for high-dimensional data. In addition, we present three algorithms for solving the convex CMTL formulation. We report experimental results on benchmark datasets to demonstrate the efficiency of the proposed algorithms.
Real-time motion trajectory-based indexing and retrieval of video sequences
 IEEE Trans. Multimedia
, 2007
Cited by 22 (2 self)
This paper presents a novel motion trajectory-based compact indexing and efficient retrieval mechanism for video sequences. Assuming trajectory information is already available, we represent trajectories as temporal orderings of sub-trajectories. This approach solves the problem of trajectory representation when only partial trajectory information is available due to occlusion. It is achieved by a hypothesis-testing-based method applied to curvature data computed from trajectories. The sub-trajectories are then represented by their principal component analysis (PCA) coefficients for an optimally compact representation. Different techniques are integrated to index and retrieve sub-trajectories, including PCA, spectral clustering, and string matching. We assume a query-by-example mechanism where an example trajectory is presented to the system and the search system returns a ranked list of the most similar items in the dataset. Experiments based on datasets obtained from the University of California at Irvine's KDD archives and Columbia University's DVMM group demonstrate the superiority of our proposed PCA-based approaches in terms of indexing and retrieval times and precision-recall ratios, when compared to other techniques in the literature. Index Terms: principal component analysis, spectral clustering, string matching, trajectory retrieval.
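The sub-trajectory-plus-PCA representation can be sketched as follows: cut a trajectory into segments and index each segment by its leading PCA coefficients. The paper segments at points found by hypothesis testing on curvature; the fixed-length segmentation and all parameters here are simplifying assumptions.

```python
import numpy as np

def pca_index(trajectory, seg_len=8, n_coeffs=3):
    """Compact trajectory index sketch: split a 2-D trajectory into
    fixed-length sub-trajectories, flatten each, and keep only the
    leading PCA coefficients per segment."""
    T = np.asarray(trajectory)                     # shape (n_points, 2)
    n_seg = len(T) // seg_len
    segs = T[:n_seg * seg_len].reshape(n_seg, seg_len * 2)  # flatten x,y
    mean = segs.mean(axis=0)
    _, _, Vt = np.linalg.svd(segs - mean, full_matrices=False)
    return (segs - mean) @ Vt[:n_coeffs].T         # (n_seg, n_coeffs)

# a sinusoidal 2-D trajectory of 64 points -> 8 indexed sub-trajectories
t = np.linspace(0, 4 * np.pi, 64)
traj = np.stack([t, np.sin(t)], axis=1)
coeffs = pca_index(traj)
print(coeffs.shape)   # (8, 3)
```

Retrieval would then compare these coefficient rows (e.g., by Euclidean distance or string matching over quantized rows) instead of raw point sequences, which is what makes the index compact.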
Linear and Nonlinear Projective Nonnegative Matrix Factorization
Cited by 19 (2 self)
A variant of nonnegative matrix factorization (NMF) proposed earlier, called Projective Nonnegative Matrix Factorization (PNMF), is analyzed here. The method approximately factorizes a projection matrix, minimizing the reconstruction error, into a positive low-rank matrix and its transpose. The dissimilarity between the original data matrix and its approximation can be measured by the Frobenius matrix norm or the modified Kullback-Leibler divergence. Both measures are minimized by multiplicative update rules, whose convergence is proven for the first time. Enforcing orthonormality in the basic objective is shown to lead to an even more efficient update rule, which is also readily extended to nonlinear cases. The formulation of the PNMF objective is shown to be connected to a variety of existing nonnegative matrix factorization methods and clustering approaches. In addition, the derivation using Lagrangian multipliers reveals the relation between reconstruction and sparseness. For kernel principal component analysis with the binary constraint, useful in graph partitioning problems, the nonlinear kernel PNMF provides a good approximation which outperforms an existing discretization approach. An empirical study on three real-world databases shows that PNMF achieves the best, or close to the best, clustering performance. The proposed algorithm runs more efficiently than the compared nonnegative matrix factorization methods, especially for high-dimensional data. Moreover, contrary to basic NMF, the trained projection matrix can be readily applied to newly arriving samples and demonstrates good generalization.
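The Frobenius-norm multiplicative update mentioned in the abstract can be sketched as follows. The rule below follows the published PNMF update for minimizing ||X - W Wᵀ X||_F, but treat it as an illustrative re-derivation rather than the authors' implementation; the spectral-norm rescaling is a common stabilizer, not part of the objective.

```python
import numpy as np

def pnmf(X, r=2, iters=200, seed=0):
    """Projective NMF sketch: approximate X by W @ W.T @ X with a
    nonnegative W via multiplicative updates."""
    rng = np.random.default_rng(seed)
    W = rng.random((X.shape[0], r)) + 0.1         # positive initialization
    A = X @ X.T
    for _ in range(iters):
        num = 2.0 * A @ W
        den = W @ (W.T @ A @ W) + A @ W @ (W.T @ W) + 1e-12
        W *= num / den                            # multiplicative update
        W /= np.linalg.norm(W, 2)                 # rescale for stability
    return W

def recon_err(X, W):
    return np.linalg.norm(X - W @ W.T @ X)

X = np.abs(np.random.default_rng(3).random((10, 6)))
W0 = pnmf(X, iters=0)                             # untrained baseline
W = pnmf(X, iters=200)
print(W.shape)   # (10, 2)
```

Because only `W` is learned (there is no separate coefficient matrix), projecting a new sample is just `W @ W.T @ x_new`, which is the generalization property the abstract highlights.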
Hierarchical Kernel Stick-Breaking Process for Multi-Task Image Analysis
Cited by 16 (7 self)
The kernel stick-breaking process (KSBP) is employed to segment general imagery, imposing the condition that patches (small blocks of pixels) that are spatially proximate are more likely to be associated with the same cluster (segment). The number of clusters is not set a priori and is inferred from the hierarchical Bayesian model. Further, the KSBP is integrated with a shared Dirichlet process prior to simultaneously model multiple images and infer their interrelationships; this latter application may be useful for sorting and learning relationships between multiple images. The Bayesian inference algorithm is based on a hybrid of variational Bayesian analysis and local sampling. In addition to providing details on the model and the associated inference framework, example results are presented for several image-analysis problems.