Indexing by latent semantic analysis
 JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE
, 1990
Abstract

A new method for automatic indexing and retrieval is described. The approach is to take advantage of implicit higher-order structure in the association of terms with documents (“semantic structure”) in order to improve the detection of relevant documents on the basis of terms found in queries. The particular technique used is singular-value decomposition, in which a large term-by-document matrix is decomposed into a set of ca. 100 orthogonal factors from which the original matrix can be approximated by linear combination. Documents are represented by ca. 100-item vectors of factor weights. Queries are represented as pseudo-document vectors formed from weighted combinations of terms, and documents with supra-threshold cosine values are returned. Initial tests find this completely automatic method for retrieval to be promising.
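The pipeline in this abstract (truncated SVD of a term-by-document matrix, a query folded in as a pseudo-document, cosine thresholding) can be sketched roughly as follows. This is a minimal illustration using NumPy, not the authors' implementation: the function names, the toy vocabulary, the choice of k = 2 factors, the 0.5 threshold, and the decision to scale both query and document vectors by the singular values before taking cosines are all assumptions made for the example.

```python
import numpy as np

def lsa_fit(term_doc, k):
    """Truncated SVD of a term-by-document matrix, keeping the k largest factors."""
    U, s, Vt = np.linalg.svd(term_doc, full_matrices=False)
    # Rows of U_k are term vectors, rows of V_k are document vectors.
    return U[:, :k], s[:k], Vt[:k, :].T

def fold_in_query(query_counts, U_k, s_k):
    """Map a query's term counts into factor space as a pseudo-document: q^T U_k S_k^{-1}."""
    return (query_counts @ U_k) / s_k

def retrieve(query_vec, doc_vecs, s_k, threshold):
    """Return (doc index, cosine) pairs whose cosine with the query exceeds the threshold."""
    q = query_vec * s_k            # assumed scaling by singular values
    d = doc_vecs * s_k
    sims = (d @ q) / (np.linalg.norm(d, axis=1) * np.linalg.norm(q))
    return [(i, float(c)) for i, c in enumerate(sims) if c > threshold]

# Hypothetical toy collection: rows are terms, columns are documents.
terms = ["car", "auto", "engine", "flower", "petal"]
X = np.array([
    [1, 0, 0, 0],   # car
    [0, 1, 0, 0],   # auto
    [1, 1, 0, 0],   # engine
    [0, 0, 1, 1],   # flower
    [0, 0, 1, 0],   # petal
], dtype=float)

U_k, s_k, V_k = lsa_fit(X, k=2)

# Query consisting of the single term "car".
query = np.zeros(len(terms))
query[terms.index("car")] = 1.0

hits = retrieve(fold_in_query(query, U_k, s_k), V_k, s_k, threshold=0.5)
print(hits)
```

In this toy collection the query "car" also retrieves document 1, which contains only "auto" and "engine", because both documents load on the same factor. That is the kind of term-mismatch case the abstract's "semantic structure" is meant to address.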
Indexing by Latent Semantic Analysis
 Journal of the American Society for Information Science
, 2001
Abstract
A new method for automatic indexing and retrieval is described. The approach is to take advantage of implicit higher-order structure in the association of terms with documents ("semantic structure") in order to improve the detection of relevant documents on the basis of terms found in queries. The particular technique used is singular-value decomposition, in which a large term-by-document matrix is decomposed into a set of ca. 100 orthogonal factors from which the original matrix can be approximated by linear combination. Documents are represented by ca. 100-item vectors of factor weights. Queries are represented as pseudo-document vectors formed from weighted combinations of terms, and documents with supra-threshold cosine values are returned. Initial tests find this completely automatic method for retrieval to be promising. Deerwester.
DOI: 10.1007/s11336-008-9056-1
ON THE NONEXISTENCE OF OPTIMAL SOLUTIONS AND THE OCCURRENCE OF “DEGENERACY” IN THE CANDECOMP/PARAFAC MODEL
, 2008
Abstract
The CANDECOMP/PARAFAC (CP) model decomposes a three-way array into a prespecified number of R factors and a residual array by minimizing the sum of squares of the latter. It is well known that an optimal solution for CP need not exist. We show that if an optimal CP solution does not exist, then any sequence of CP factors monotonically decreasing the CP criterion value to its infimum will exhibit the features of a so-called “degeneracy”. That is, the parameter matrices become nearly rank deficient and the Euclidean norm of some factors tends to infinity. We also show that the CP criterion function does attain its infimum if one of the parameter matrices is constrained to be columnwise orthonormal.
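For orientation, the CP criterion described in this abstract can be written out as below. The notation is an assumption chosen for illustration and is not taken from the listing: $x_{ijk}$ denotes an entry of the $I \times J \times K$ array, and $A$, $B$, $C$ are the parameter matrices with entries $a_{ir}$, $b_{jr}$, $c_{kr}$ and $R$ columns each.

```latex
% CP criterion: sum of squared residuals after fitting R factors
f(A, B, C) \;=\; \sum_{i=1}^{I} \sum_{j=1}^{J} \sum_{k=1}^{K}
  \Bigl( x_{ijk} - \sum_{r=1}^{R} a_{ir}\, b_{jr}\, c_{kr} \Bigr)^{2}

% The abstract's existence result concerns the constrained problem,
% written here with A chosen (arbitrarily) as the orthonormal matrix:
\min_{A, B, C} \; f(A, B, C)
  \quad \text{subject to} \quad A^{\top} A = I_R
```

Without the constraint, the abstract's point is that $f$ may approach its infimum only along sequences in which some factor norms diverge, so that no minimizer exists.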