Results 1 - 5 of 5
Indexing by latent semantic analysis
- JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE
, 1990
"... A new method for automatic indexing and retrieval is described. The approach is to take advantage of implicit higher-order structure in the association of terms with documents (“semantic structure”) in order to improve the detection of relevant documents on the basis of terms found in queries. The p ..."
Abstract - Cited by 2168 (30 self)
A new method for automatic indexing and retrieval is described. The approach is to take advantage of implicit higher-order structure in the association of terms with documents (“semantic structure”) in order to improve the detection of relevant documents on the basis of terms found in queries. The particular technique used is singular-value decomposition, in which a large term by document matrix is decomposed into a set of ca. 100 orthogonal factors from which the original matrix can be approximated by linear combination. Documents are represented by ca. 100 item vectors of factor weights. Queries are represented as pseudo-document vectors formed from weighted combinations of terms, and documents with supra-threshold cosine values are returned. Initial tests find this completely automatic method for retrieval to be promising.
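The pipeline this abstract describes (truncated SVD of a term-by-document matrix, queries folded in as pseudo-documents, cosine thresholding) can be sketched roughly as follows. This is a minimal illustration using NumPy, not the authors' implementation: the function names, the toy matrix, the choice of k = 2 and the 0.9 threshold are all assumptions, and the sketch compares queries and documents in the singular-value-scaled factor space, which is only one of several scaling conventions in use.

```python
import numpy as np

def lsi_fit(term_doc, k):
    """Truncated SVD of a term-by-document matrix, keeping k factors."""
    U, s, Vt = np.linalg.svd(term_doc, full_matrices=False)
    U_k, s_k = U[:, :k], s[:k]
    # One k-dimensional row per document, scaled by the singular values.
    doc_vectors = Vt[:k, :].T * s_k
    return U_k, s_k, doc_vectors

def fold_in_query(query_terms, U_k):
    """Map a term-weight vector into the same space as doc_vectors.

    A query equal to a column of the original matrix lands exactly on
    that document's vector (assumed scaling convention, see lead-in).
    """
    return query_terms @ U_k

def retrieve(query_vec, doc_vectors, threshold):
    """Indices of documents with cosine above threshold, best first."""
    norms = np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(query_vec)
    cosines = doc_vectors @ query_vec / np.where(norms == 0, 1.0, norms)
    hits = np.flatnonzero(cosines > threshold)
    return hits[np.argsort(-cosines[hits])], cosines

# Hypothetical 5-term x 4-document count matrix
# (rows: human, computer, graph, tree, minors).
A = np.array([[1, 0, 0, 0],
              [1, 1, 0, 0],
              [0, 0, 1, 1],
              [0, 0, 1, 1],
              [0, 0, 0, 1]], dtype=float)

U_k, s_k, docs = lsi_fit(A, k=2)
query = fold_in_query(np.array([1.0, 0.0, 0.0, 0.0, 0.0]), U_k)  # "human"
hits, cosines = retrieve(query, docs, threshold=0.9)
```

In this toy matrix the second document contains "computer" but not "human", yet it should still be returned for the query "human", because both terms load on the same factor. Real collections use far more factors (the abstract mentions ca. 100) and usually a sparse truncated solver rather than a full SVD.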
Enhancing Performance in Latent Semantic Indexing (LSI) Retrieval
, 1992
"... We have previously described an extension of the vector retrieval method called "Latent Semantic Indexing" (LSI) (Deerwester, et al., 1990; Dumais, et al., 1988; Furnas, et al., 1988). The LSI approach partially overcomes the problem of variability in human word choice by automatically organizing ob ..."
Abstract - Cited by 37 (0 self)
We have previously described an extension of the vector retrieval method called "Latent Semantic Indexing" (LSI) (Deerwester, et al., 1990; Dumais, et al., 1988; Furnas, et al., 1988). The LSI approach partially overcomes the problem of variability in human word choice by automatically organizing objects into a "semantic" structure more appropriate for information retrieval. This is done by modeling the implicit higher-order structure in the association of terms with objects. Initial tests find this completely automatic method to be a promising way to improve users' access to many kinds of textual materials or to objects for which textual descriptions are available. This paper describes some enhancements to the basic LSI method, including differential term weighting and relevance feedback. Appropriate term weighting improves performance by an average of 40%, and feedback based on 3 relevant documents improves performance by an average of 67%.
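The abstract names two enhancements, differential term weighting and relevance feedback, without giving formulas. As a rough illustration only, the sketch below assumes log-entropy weighting (a scheme commonly paired with LSI) and assumes feedback that replaces the query with the centroid of the vectors of documents judged relevant. Both choices and both function names are assumptions for illustration, not details confirmed by the abstract.

```python
import math

def log_entropy_weights(counts):
    """Apply log-entropy weighting to raw term frequencies.

    counts: list of term rows, each a list of per-document frequencies.
    Assumed scheme: local weight log(tf + 1), global weight
    1 + sum_j p_j * log(p_j) / log(n_docs), with p_j = tf_j / row total.
    """
    n_docs = len(counts[0])
    weighted = []
    for row in counts:
        total = sum(row)
        entropy = (
            sum((tf / total) * math.log(tf / total) for tf in row if tf > 0)
            if total else 0.0
        )
        # 1.0 for a term confined to one document, 0.0 for a term spread evenly.
        global_w = 1.0 + entropy / math.log(n_docs) if n_docs > 1 else 1.0
        weighted.append([global_w * math.log(tf + 1.0) for tf in row])
    return weighted

def feedback_query(relevant_doc_vectors):
    """Assumed feedback rule: new query = centroid of relevant document vectors."""
    n = len(relevant_doc_vectors)
    return [sum(col) / n for col in zip(*relevant_doc_vectors)]

# Hypothetical counts: one term spread evenly, one confined to a single document.
weights = log_entropy_weights([[1, 1, 1, 1], [3, 0, 0, 0]])
new_query = feedback_query([[1.0, 2.0], [3.0, 4.0]])
```

Under this scheme a term that occurs equally often in every document receives a global weight near zero, so it contributes little to the matrix that is later decomposed.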
Data-Driven Approaches To Information Access
- COGNITIVE SCIENCE
, 2003
"... This paper summarizes three lines of research that are motivated by the practical problem of helping users find information from external data sources, most notably computers. The application areas include information retrieval, text categorization, and question answering. A common theme in these app ..."
Abstract - Cited by 12 (0 self)
This paper summarizes three lines of research that are motivated by the practical problem of helping users find information from external data sources, most notably computers. The application areas include information retrieval, text categorization, and question answering. A common theme in these applications is that practical information access problems can be solved by analyzing the statistical properties of words in large volumes of real-world texts. The same statistical properties constrain human performance; thus we believe that solutions to practical information access problems can shed light on human knowledge representation and reasoning.
Indexing by Latent Semantic Analysis
- Journal of the American Society for Information Science
, 2001
"... A new method for automatic indexing and retrieval is described. The approach is to take advantage of implicit higher-order structure in the association of terms with documents ("semantic structure") in order to improve the detection of relevant documents on the basis of terms found in queries. The p ..."
Abstract
A new method for automatic indexing and retrieval is described. The approach is to take advantage of implicit higher-order structure in the association of terms with documents ("semantic structure") in order to improve the detection of relevant documents on the basis of terms found in queries. The particular technique used is singular-value decomposition, in which a large term by document matrix is decomposed into a set of ca. 100 orthogonal factors from which the original matrix can be approximated by linear combination. Documents are represented by ca. 100 item vectors of factor weights. Queries are represented as pseudo-document vectors formed from weighted combinations of terms, and documents with supra-threshold cosine values are returned. Initial tests find this completely automatic method for retrieval to be promising.
INTELLIGENCE CHINESE DOCUMENT SEMANTIC INDEXING SYSTEM
"... With the rapid growth of the Internet, how to get information from this huge information space becomes an even more important problem. In this paper, an Intelligence Chinese Document Semantic Indexing System (ICDSIS) is proposed. Some new technologies are integrated in ICDSIS to obtain good performa ..."
Abstract
With the rapid growth of the Internet, how to get information from this huge information space becomes an even more important problem. In this paper, an Intelligence Chinese Document Semantic Indexing System (ICDSIS) is proposed. Some new technologies are integrated in ICDSIS to obtain good performance. ICDSIS is composed of four key procedures: a parallel, distributed, and configurable spider is used for information gathering; a multi-hierarchy document classification approach combining the information gain initially processes gathered web documents; a swarm-intelligence-based document clustering method is used for information organization; and a concept-based retrieval interface is applied for interactive user retrieval. ICDSIS is a comprehensive solution for information retrieval on the Internet.

