Results 11  20
of
607,135
Laplacian Eigenmaps and Spectral Techniques for Embedding and Clustering
 Advances in Neural Information Processing Systems 14
, 2001
"... Drawing on the correspondence between the graph Laplacian, the LaplaceBeltrami operator on a manifold, and the connections to the heat equation, we propose a geometrically motivated algorithm for constructing a representation for data sampled from a low dimensional manifold embedded in a higher ..."
Abstract

Cited by 664 (8 self)
 Add to MetaCart
higher dimensional space. The algorithm provides a computationally efficient approach to nonlinear dimensionality reduction that has locality preserving properties and a natural connection to clustering. Several applications are considered.
Distance Metric Learning, With Application To Clustering With SideInformation
 ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 15
, 2003
"... Many algorithms rely critically on being given a good metric over their inputs. For instance, data can often be clustered in many "plausible" ways, and if a clustering algorithm such as Kmeans initially fails to find one that is meaningful to a user, the only recourse may be for the us ..."
Abstract

Cited by 799 (14 self)
 Add to MetaCart
Many algorithms rely critically on being given a good metric over their inputs. For instance, data can often be clustered in many "plausible" ways, and if a clustering algorithm such as Kmeans initially fails to find one that is meaningful to a user, the only recourse may
Text Classification from Labeled and Unlabeled Documents using EM
 MACHINE LEARNING
, 1999
"... This paper shows that the accuracy of learned text classifiers can be improved by augmenting a small number of labeled training documents with a large pool of unlabeled documents. This is important because in many text classification problems obtaining training labels is expensive, while large qua ..."
Abstract

Cited by 1033 (19 self)
 Add to MetaCart
quantities of unlabeled documents are readily available. We introduce an algorithm for learning from labeled and unlabeled documents based on the combination of ExpectationMaximization (EM) and a naive Bayes classifier. The algorithm first trains a classifier using the available labeled documents
GMRES: A generalized minimal residual algorithm for solving nonsymmetric linear systems
 SIAM J. SCI. STAT. COMPUT
, 1986
"... We present an iterative method for solving linear systems, which has the property ofminimizing at every step the norm of the residual vector over a Krylov subspace. The algorithm is derived from the Arnoldi process for constructing an l2orthogonal basis of Krylov subspaces. It can be considered a ..."
Abstract

Cited by 2046 (40 self)
 Add to MetaCart
We present an iterative method for solving linear systems, which has the property ofminimizing at every step the norm of the residual vector over a Krylov subspace. The algorithm is derived from the Arnoldi process for constructing an l2orthogonal basis of Krylov subspaces. It can be considered
Empirical Analysis of Predictive Algorithm for Collaborative Filtering
 Proceedings of the 14 th Conference on Uncertainty in Artificial Intelligence
, 1998
"... 1 ..."
Probabilistic Principal Component Analysis
 Journal of the Royal Statistical Society, Series B
, 1999
"... Principal component analysis (PCA) is a ubiquitous technique for data analysis and processing, but one which is not based upon a probability model. In this paper we demonstrate how the principal axes of a set of observed data vectors may be determined through maximumlikelihood estimation of paramet ..."
Abstract

Cited by 703 (5 self)
 Add to MetaCart
of parameters in a latent variable model closely related to factor analysis. We consider the properties of the associated likelihood function, giving an EM algorithm for estimating the principal subspace iteratively, and discuss, with illustrative examples, the advantages conveyed by this probabilistic approach
The geometry of graphs and some of its algorithmic applications
 Combinatorica
, 1995
"... In this paper we explore some implications of viewing graphs as geometric objects. This approach offers a new perspective on a number of graphtheoretic and algorithmic problems. There are several ways to model graphs geometrically and our main concern here is with geometric representations that r ..."
Abstract

Cited by 543 (20 self)
 Add to MetaCart
their geometric images. In this paper we develop efficient algorithms for embedding graphs lowdimensionally with a small distortion. Further algorithmic applications include: 0 A simple, unified approach to a number of problems on multicommodity flows, including the LeightonRae Theorem [29] and some of its
GPFS: A SharedDisk File System for Large Computing Clusters
 In Proceedings of the 2002 Conference on File and Storage Technologies (FAST
, 2002
"... GPFS is IBM's parallel, shareddisk file system for cluster computers, available on the RS/6000 SP parallel supercomputer and on Linux clusters. GPFS is used on many of the largest supercomputers in the world. GPFS was built on many of the ideas that were developed in the academic community ove ..."
Abstract

Cited by 518 (3 self)
 Add to MetaCart
existing ideas scaled well, new approaches were necessary in many key areas. This paper describes GPFS, and discusses how distributed locking and recovery techniques were extended to scale to large clusters.
Scatter/Gather: A Clusterbased Approach to Browsing Large Document Collections
, 1992
"... Document clustering has not been well received as an information retrieval tool. Objections to its use fall into two main categories: first, that clustering is too slow for large corpora (with running time often quadratic in the number of documents); and second, that clustering does not appreciably ..."
Abstract

Cited by 772 (12 self)
 Add to MetaCart
document browsing technique that employs document clustering as its primary operation. We also present fast (linear time) clustering algorithms which support this interactive browsing paradigm. 1 Introduction Document clustering has been extensively investigated as a methodology for improving document
Fast Algorithms for Mining Association Rules
, 1994
"... We consider the problem of discovering association rules between items in a large database of sales transactions. We present two new algorithms for solving this problem that are fundamentally different from the known algorithms. Empirical evaluation shows that these algorithms outperform the known a ..."
Abstract

Cited by 3551 (15 self)
 Add to MetaCart
We consider the problem of discovering association rules between items in a large database of sales transactions. We present two new algorithms for solving this problem that are fundamentally different from the known algorithms. Empirical evaluation shows that these algorithms outperform the known
Results 11  20
of
607,135