Results 11–20 of 668
Fast Monte Carlo Algorithms for Matrices II: Computing a Low-Rank Approximation to a Matrix
 SIAM JOURNAL ON COMPUTING
, 2004
"... ... matrix A. It is often of interest to find a lowrank approximation to A, i.e., an approximation D to the matrix A of rank not greater than a specified rank k, where k is much smaller than m and n. Methods such as the Singular Value Decomposition (SVD) may be used to find an approximation to A ..."
Abstract

Cited by 216 (21 self)
... matrix A. It is often of interest to find a low-rank approximation to A, i.e., an approximation D* to the matrix A of rank not greater than a specified rank k, where k is much smaller than m and n. Methods such as the Singular Value Decomposition (SVD) may be used to find an approximation to A which is the best in a well-defined sense. These methods require memory and time which are superlinear in m and n; for many applications in which the data sets are very large this is prohibitive. Two simple and intuitive algorithms are presented which, when given an m × n matrix A, compute a description of a low-rank approximation D* to A, and which are qualitatively faster than the SVD. Both algorithms have provable bounds for the error matrix A − D*. For any matrix X, let ‖X‖_F and ‖X‖_2 denote its Frobenius norm and its spectral norm, respectively. In the first algorithm, c = O(1) columns of A are randomly chosen. If the m × c matrix C consists of those c columns of A (after appropriate rescaling), then it is shown that from C^T C approximations to the top singular values and corresponding singular vectors may be computed. From the computed singular vectors a description D* of the matrix A may be computed such that rank(D*) ≤ k and such that ‖A − D*‖_ξ² ≤ min_{D: rank(D) ≤ k} ‖A − D‖_ξ² + poly(k, 1/c) ‖A‖_F² holds with high probability for both ξ = 2, F. This algorithm may be implemented without storing the matrix A in Random Access Memory (RAM), provided it can make two passes over the matrix stored in external memory and use O(m + n) additional RAM. The second algorithm is similar, except that it further approximates the matrix C by randomly sampling r = O(1) rows of C to form an r × c matrix W. Thus it has additional error, but it can be implemented in three passes over the matrix using only constant ...
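The first algorithm's column-sampling step can be sketched in a few lines of numpy (an illustrative toy with made-up dimensions and a synthetic test matrix; the paper's exact constants and pass-efficient implementation are not reproduced):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, k, c = 100, 80, 5, 20

# Synthetic low-rank-plus-noise test matrix (illustrative only).
A = rng.standard_normal((m, k)) @ rng.standard_normal((k, n)) \
    + 0.01 * rng.standard_normal((m, n))

# Sample c columns with probability proportional to squared column norms,
# rescaling each chosen column so C C^T approximates A A^T in expectation.
p = np.sum(A * A, axis=0)
p /= p.sum()
idx = rng.choice(n, size=c, p=p)
C = A[:, idx] / np.sqrt(c * p[idx])

# Top-k left singular vectors of the small matrix C stand in for those of A.
H = np.linalg.svd(C, full_matrices=False)[0][:, :k]

# Rank-k description D* = H H^T A.
D = H @ (H.T @ A)
rel_err = np.linalg.norm(A - D) / np.linalg.norm(A)
```

The key point is that only the small m × c matrix C is decomposed; A itself is touched only to compute column norms and the final product H^T A, which is what makes a two-pass external-memory implementation possible.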
A Survey of Collaborative Filtering Techniques
, 2009
"... As one of the most successful approaches to building recommender systems, collaborative filtering (CF) uses the known preferences of a group of users to make recommendations or predictions of the unknown preferences for other users. In this paper, we first introduce CF tasks and their main challenge ..."
Abstract

Cited by 205 (0 self)
As one of the most successful approaches to building recommender systems, collaborative filtering (CF) uses the known preferences of a group of users to make recommendations or predictions of the unknown preferences for other users. In this paper, we first introduce CF tasks and their main challenges, such as data sparsity, scalability, synonymy, gray sheep, shilling attacks, privacy protection, etc., and their possible solutions. We then present three main categories of CF techniques: memory-based, model-based, and hybrid CF algorithms (which combine CF with other recommendation techniques), with examples of representative algorithms in each category and an analysis of their predictive performance and their ability to address the challenges. From basic techniques to the state of the art, we attempt to present a comprehensive survey of CF techniques, which can serve as a roadmap for research and practice in this area.
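A minimal sketch of the memory-based category (user-based CF with cosine similarity; the ratings below are hypothetical toy data, not from the survey):

```python
from math import sqrt

# Toy user-item ratings; unrated items are simply absent from each dict.
ratings = {
    "alice": {"m1": 5, "m2": 3, "m3": 4},
    "bob":   {"m1": 4, "m2": 2, "m3": 5, "m4": 4},
    "carol": {"m1": 1, "m2": 5,          "m4": 2},
}

def cosine(u, v):
    # Cosine similarity over the items both users have rated.
    common = set(u) & set(v)
    if not common:
        return 0.0
    num = sum(u[i] * v[i] for i in common)
    den = sqrt(sum(u[i] ** 2 for i in common)) * sqrt(sum(v[i] ** 2 for i in common))
    return num / den

def predict(user, item):
    # Similarity-weighted average of neighbours' ratings for the item.
    sims = [(cosine(ratings[user], r), r[item])
            for name, r in ratings.items()
            if name != user and item in r]
    total = sum(s for s, _ in sims)
    return sum(s * r for s, r in sims) / total if total else None

p = predict("alice", "m4")
```

Model-based methods would instead fit a compact model (e.g. a latent-factor decomposition) to the same ratings matrix rather than scanning neighbours at prediction time.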
Criterion Functions for Document Clustering: Experiments and Analysis
, 2002
"... In recent years, we have witnessed a tremendous growth in the volume of text documents available on the Internet, digital libraries, news sources, and companywide intranets. This has led to an increased interest in developing methods that can help users to effectively navigate, summarize, and org ..."
Abstract

Cited by 201 (13 self)
In recent years, we have witnessed a tremendous growth in the volume of text documents available on the Internet, digital libraries, news sources, and company-wide intranets. This has led to an increased interest in developing methods that can help users effectively navigate, summarize, and organize this information, with the ultimate goal of helping them find what they are looking for. Fast and high-quality document clustering algorithms play an important role towards this goal, as they have been shown both to provide an intuitive navigation/browsing mechanism by organizing large amounts of information into a small number of meaningful clusters, and to greatly improve retrieval performance via cluster-driven dimensionality reduction, term weighting, or query expansion. This ever-increasing importance of document clustering and the expanded range of its applications led to the development of a number of new and novel algorithms with different complexity-quality trade-offs. Among them, a class of clustering algorithms with relatively low computational requirements are those that treat the clustering problem as an optimization process which seeks to maximize or minimize a particular clustering criterion function defined over the entire clustering solution.
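One representative algorithm of this criterion-driven kind is spherical k-means, which greedily improves a cosine-cohesion criterion over the whole clustering solution; a toy sketch (vectors and seeds are made up, and this is not one of the paper's specific criterion functions):

```python
from math import sqrt

def normalize(v):
    n = sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n else v

# Six toy "documents" over a 3-term vocabulary, unit-normalized.
docs = [normalize(v) for v in
        [[5, 1, 0], [4, 2, 0], [5, 0, 1],    # one topic
         [0, 1, 5], [1, 0, 4], [0, 2, 5]]]   # another topic

k = 2
centroids = [docs[0], docs[3]]               # one seed per topic, for the sketch
assign = [0] * len(docs)
for _ in range(10):
    # Assignment step: most similar centroid (dot product = cosine on unit vectors).
    assign = [max(range(k),
                  key=lambda r, d=d: sum(a * b for a, b in zip(d, centroids[r])))
              for d in docs]
    # Update step: each centroid becomes the normalized mean of its cluster,
    # which cannot decrease the summed document-to-centroid cosine criterion.
    for r in range(k):
        members = [d for d, a in zip(docs, assign) if a == r]
        if members:
            centroids[r] = normalize([sum(xs) / len(members) for xs in zip(*members)])
```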
Modeling local coherence: An entity-based approach
 In Proceedings of ACL 2005
, 2005
"... This paper considers the problem of automatic assessment of local coherence. We present a novel entitybased representation of discourse which is inspired by Centering Theory and can be computed automatically from raw text. We view coherence assessment as a ranking learning problem and show that the ..."
Abstract

Cited by 185 (14 self)
This paper considers the problem of automatic assessment of local coherence. We present a novel entity-based representation of discourse which is inspired by Centering Theory and can be computed automatically from raw text. We view coherence assessment as a ranking learning problem and show that the proposed discourse representation supports the effective learning of a ranking function. Our experiments demonstrate that the induced model achieves significantly higher accuracy than a state-of-the-art coherence model.
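The entity-based representation can be sketched as an entity grid whose role-transition statistics become features for the ranking function (the grid below is hand-filled toy input; the paper derives the roles automatically from parsed text):

```python
from collections import Counter
from itertools import product

# Rows of the grid are entities, columns are sentences; each cell holds the
# entity's grammatical role in that sentence: S = subject, O = object,
# X = other, '-' = absent. (Hypothetical toy values.)
grid = {
    "Microsoft": ["S", "S", "-"],
    "suit":      ["O", "-", "S"],
    "market":    ["-", "X", "-"],
}

def transition_features(grid):
    # Count role transitions between adjacent sentences, then normalize
    # into a fixed-length feature vector over all 16 possible transitions.
    counts = Counter()
    total = 0
    for roles in grid.values():
        for a, b in zip(roles, roles[1:]):
            counts[(a, b)] += 1
            total += 1
    return {t: counts[t] / total for t in product("SOX-", repeat=2)}

feats = transition_features(grid)
```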
Generic text summarization using relevance measure and latent semantic analysis
 in Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
, 2001
"... In this paper, we propose two generic text summarization methods that create text summaries by ranking and extracting sentences from the original documents. The rst method uses standard IR methods to rank sentence relevances, while the second method uses the latent semantic analysis technique to ide ..."
Abstract

Cited by 181 (2 self)
In this paper, we propose two generic text summarization methods that create text summaries by ranking and extracting sentences from the original documents. The first method uses standard IR methods to rank sentence relevance, while the second method uses the latent semantic analysis technique to identify semantically important sentences for summary creation. Both methods strive to select sentences that are highly ranked and different from each other, in an attempt to create a summary with wider coverage of the document's main content and less redundancy. Performance evaluations of the two summarization methods are conducted by comparing their summarization outputs with manual summaries generated by three independent human evaluators. The evaluations also study the influence of different VSM weighting schemes on text summarization performance. Finally, the causes of the large disparities in the evaluators' manual summarization results are investigated, and discussions on human text summarization patterns are presented.
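The LSA selection step can be sketched as follows (toy term-by-sentence counts, hypothetical; shown is the generic pick-one-sentence-per-top-singular-vector heuristic, not the paper's full evaluation pipeline):

```python
import numpy as np

# Toy term-by-sentence matrix: rows = terms, columns = sentences.
# Sentences 0 and 2 share one topic; sentences 1 and 3 share another.
A = np.array([
    [3, 0, 1, 0],   # term "matrix"
    [2, 0, 1, 0],   # term "rank"
    [0, 4, 0, 1],   # term "summary"
    [0, 3, 0, 1],   # term "sentence"
], dtype=float)

Vt = np.linalg.svd(A, full_matrices=False)[2]
k = 2
summary = []
for i in range(k):
    # The largest |component| in the i-th right singular vector marks the
    # sentence most aligned with the i-th latent topic.
    j = int(np.argmax(np.abs(Vt[i])))
    if j not in summary:
        summary.append(j)
```

Picking one sentence per singular vector is what gives wider topic coverage with little redundancy: each selected sentence represents a different latent dimension.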
WEBSOM – Self-Organizing Maps of Document Collections
 Neurocomputing
, 1997
"... Searching for relevant text documents has traditionally been based on keywords and Boolean expressions of them. Often the search results show high recall and low precision, or vice versa. Considerable efforts have been made to develop alternative methods, but their practical applicability has been l ..."
Abstract

Cited by 171 (16 self)
Searching for relevant text documents has traditionally been based on keywords and Boolean expressions of them. Often the search results show high recall and low precision, or vice versa. Considerable efforts have been made to develop alternative methods, but their practical applicability has been low. Powerful methods are needed for the exploration of miscellaneous document collections. The WEBSOM method organizes a document collection on a map display that provides an overview of the collection and facilitates interactive browsing. Interesting documents can be retrieved by a content addressable search of interesting map locations. The interesting locations could also be marked as filters for collecting interesting new documents.
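The underlying self-organizing map update can be sketched in miniature (a 1-D map over toy 2-D inputs; WEBSOM itself organizes high-dimensional document vectors on a 2-D map, and all sizes and schedules below are made up):

```python
import math
import random

random.seed(1)
units = [[random.random(), random.random()] for _ in range(5)]  # map weights
data = [[0.1, 0.1], [0.15, 0.05], [0.9, 0.9], [0.85, 0.95]]     # two clusters

def dist2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

T = 200
for t in range(T):
    x = random.choice(data)
    # Best-matching unit: the unit closest to the input.
    bmu = min(range(len(units)), key=lambda i: dist2(units[i], x))
    radius = max(1.0, 2.5 * (1 - t / T))   # shrinking neighbourhood
    lr = 0.5 * (1 - t / T)                 # decaying learning rate
    for i, w in enumerate(units):
        # Units near the BMU on the map are pulled toward the input too,
        # which is what makes nearby units represent similar inputs.
        h = math.exp(-((i - bmu) ** 2) / (2 * radius ** 2))
        units[i] = [wi + lr * h * (xi - wi) for wi, xi in zip(w, x)]

# After training, the two input clusters land on different map units.
end_a = min(range(len(units)), key=lambda i: dist2(units[i], [0.1, 0.1]))
end_b = min(range(len(units)), key=lambda i: dist2(units[i], [0.9, 0.9]))
```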
High-dimensional data analysis: The curses and blessings of dimensionality
 AMS CONFERENCE ON MATH CHALLENGES OF THE 21ST CENTURY
, 2000
"... The coming century is surely the century of data. A combination of blind faith and serious purpose makes our society invest massively in the collection and processing of data of all kinds, on scales unimaginable until recently. Hyperspectral Imagery, Internet Portals, Financial tickbytick data, a ..."
Abstract

Cited by 168 (0 self)
The coming century is surely the century of data. A combination of blind faith and serious purpose makes our society invest massively in the collection and processing of data of all kinds, on scales unimaginable until recently. Hyperspectral imagery, Internet portals, financial tick-by-tick data, and DNA microarrays are just a few of the better-known sources, feeding data in torrential streams into scientific and business databases worldwide. In traditional statistical data analysis, we think of observations of instances of particular phenomena (e.g. instance ↔ human being), these observations being a vector of values we measured on several variables (e.g. blood pressure, weight, height, ...). In traditional statistical methodology, we assumed many observations and a few, well-chosen variables. The trend today is towards more observations but even more so, to radically larger numbers of variables – voracious, automatic, systematic collection of hyper-informative detail about each observed instance. We are seeing examples where the observations gathered on individual instances are curves, or spectra, or images, or ...
Improved approximation algorithms for large matrices via random projections
 In Proc. 47th Ann. IEEE Symp. Foundations of Computer Science (FOCS
, 2006
"... ..."
(Show Context)
Fast Computation of Low Rank Matrix Approximations
, 2001
"... In many practical applications, given an m n matrix A it is of interest to nd an approximation to A that has low rank. We introduce a technique that exploits spectral structure in A to accelerate Orthogonal Iteration and Lanczos Iteration, the two most common methods for computing such approximat ..."
Abstract

Cited by 161 (4 self)
In many practical applications, given an m × n matrix A it is of interest to find an approximation to A that has low rank. We introduce a technique that exploits spectral structure in A to accelerate Orthogonal Iteration and Lanczos Iteration, the two most common methods for computing such approximations. Our technique amounts to independently sampling and/or quantizing the entries of the input matrix A, thus speeding up computation by reducing the number of non-zero entries and/or the length of their representation. Our analysis is based on observing that both sampling and quantization can be viewed as adding a random matrix E to A, where the entries of E are independent, zero-mean random variables of bounded variance. Such random matrices possess no significant linear structure, and we can thus prove that the effect of sampling and quantization nearly vanishes when a low-rank approximation to A is computed. In fact, the more prominent the linear structure in A is, the more data we can afford to discard and, ultimately, the faster we can discover it. We give bounds on the quality of our approximation in both the L2 and the Frobenius norm.