Indexing by latent semantic analysis
- JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE
, 1990
Cited by 2168 (30 self)
A new method for automatic indexing and retrieval is described. The approach is to take advantage of implicit higher-order structure in the association of terms with documents (“semantic structure”) in order to improve the detection of relevant documents on the basis of terms found in queries. The particular technique used is singular-value decomposition, in which a large term by document matrix is decomposed into a set of ca. 100 orthogonal factors from which the original matrix can be approximated by linear combination. Documents are represented by ca. 100 item vectors of factor weights. Queries are represented as pseudo-document vectors formed from weighted combinations of terms, and documents with supra-threshold cosine values are returned. Initial tests find this completely automatic method for retrieval to be promising.
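The pipeline this abstract describes (truncated SVD, query as pseudo-document, cosine threshold) can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the toy matrix, the function names, the choice of 2 factors instead of ca. 100, and the 0.5 threshold are all assumptions made for the example.

```python
import numpy as np

def lsi_fit(term_doc, k):
    """Truncated SVD of a term-by-document matrix, keeping k factors."""
    u, s, vt = np.linalg.svd(term_doc, full_matrices=False)
    # Term factors, singular values, and one k-dimensional vector per document.
    return u[:, :k], s[:k], vt[:k, :].T

def fold_in_query(query_terms, u_k, s_k):
    """Project a query term vector into factor space as a pseudo-document."""
    return (query_terms @ u_k) / s_k

def cosine(a, b):
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def retrieve(query_terms, u_k, s_k, doc_vecs, threshold):
    """Return indices of documents whose cosine with the query exceeds threshold."""
    q = fold_in_query(query_terms, u_k, s_k)
    return [j for j, d in enumerate(doc_vecs) if cosine(q, d) > threshold]

# Hypothetical toy collection: rows are terms, columns are documents.
terms = ["human", "computer", "interface", "graph", "tree"]
X = np.array([
    [1, 1, 0, 0],  # human
    [1, 0, 0, 0],  # computer
    [0, 1, 0, 0],  # interface
    [0, 0, 1, 1],  # graph
    [0, 0, 1, 0],  # tree
], dtype=float)

u_k, s_k, doc_vecs = lsi_fit(X, k=2)
query = np.array([0, 1, 0, 0, 0], dtype=float)  # the single term "computer"
hits = retrieve(query, u_k, s_k, doc_vecs, threshold=0.5)
```

In this toy collection the query term "computer" occurs only in document 0, yet document 1 is also returned, because both documents load on the same factor through the shared term "human".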
Inside the search process: Information seeking from the user’s perspective
- Journal of the American Society for Information Science
, 1991
Cited by 126 (1 self)
The article discusses the users' perspective of information seeking. A model of the information search process is presented derived from a series of five studies investigating common experiences of users in information seeking situations. The cognitive and affective aspects of the process of information seeking suggest a gap between the users' natural process of information use and the information system and intermediaries' traditional patterns of information provision.
Internet Browsing and Searching: User Evaluations of Category Map and Concept Space Techniques
- JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE
, 1998
Using WordNet in a knowledge-based approach to information retrieval
, 1995
Cited by 67 (0 self)
The application of natural language processing tools and techniques to information retrieval tasks has long since been identified as potentially useful for the quality of information retrieval. Traditionally, IR has been based on matching words or terms in a query with words or terms in a document. In this paper we introduce an approach to IR based on computing a semantic distance measurement between concepts or words and using this word distance to compute a similarity between a query and a document. Two such semantic distance measures are presented in this paper and both are benchmarked on queries and documents from the TREC collection. Although our results in terms of precision and recall are disappointing, we rationalise this in terms of our experimental setup and our results show promise for future work in this area.
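The abstract does not say which two distance measures the paper uses, so the sketch below substitutes a generic edge-counting distance over a tiny hand-made is-a hierarchy. The hierarchy, the function names, and the 1 / (1 + distance) similarity are all assumptions chosen only to show how a word-level distance can be lifted to a query-document similarity.

```python
# Hypothetical toy is-a hierarchy (child -> parent); not taken from WordNet.
PARENT = {
    "dog": "canine", "wolf": "canine", "canine": "mammal",
    "cat": "feline", "feline": "mammal", "mammal": "animal",
    "car": "vehicle", "vehicle": "artifact",
}

def ancestors(word):
    """Return {node: edges up from word}, including word itself at 0."""
    depth, out = 0, {}
    while word is not None:
        out[word] = depth
        word = PARENT.get(word)
        depth += 1
    return out

def path_distance(a, b):
    """Edges on the shortest path through a common ancestor, or None if none."""
    up_a, up_b = ancestors(a), ancestors(b)
    common = set(up_a) & set(up_b)
    if not common:
        return None
    return min(up_a[c] + up_b[c] for c in common)

def word_similarity(a, b):
    """Map distance to (0, 1]; unrelated words score 0."""
    d = path_distance(a, b)
    return 0.0 if d is None else 1.0 / (1.0 + d)

def query_doc_similarity(query, doc):
    """Average, over query words, of the best similarity to any document word."""
    if not query or not doc:
        return 0.0
    return sum(max(word_similarity(q, w) for w in doc) for q in query) / len(query)
```

With this toy hierarchy, a query containing "dog" scores a document containing "wolf" above one containing "cat", and a document containing only "car" scores zero, even though none of the documents contains the query word itself.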
A Concept Space Approach to Addressing the Vocabulary Problem in Scientific Information Retrieval: An Experiment on the Worm Community System
- Journal of the American Society for Information Science
, 1997
Cited by 56 (14 self)
This research presents an algorithmic approach to addressing the vocabulary problem in scientific information retrieval and information sharing, using the molecular biology domain as an example. We first present a literature review of cognitive studies related to the vocabulary problem and vocabulary-based search aids (thesauri) and then discuss techniques for building robust and domain-specific thesauri to assist in cross-domain scientific information retrieval. Using a variation of the automatic thesaurus generation techniques, which we refer to as the concept space approach, we recently conducted an experiment in the molecular biology domain in which we created a C. elegans worm thesaurus of 7,657 worm-specific terms and a Drosophila fly thesaurus of 15,626 terms. About 30% of these terms overlapped, which created vocabulary paths
A study of information seeking and retrieving, iii: Searchers, searches, overlap
- Journal of the American Society for Information Science and Technology
, 1988
Cited by 54 (4 self)
The objectives of the study were to conduct a series of observations and experiments under as real-life a situation as possible related to: (1) user context of questions in information retrieval; (2) the structure and classification of questions; (3) cognitive traits and decision making of searchers; and (4) different searches of the same question. The study is presented in three parts: Part I presents the background of the study and describes the models, measures, methods, procedures and statistical analyses used. Part II is devoted to results related to users, questions and effectiveness measures, and Part III to results related to searchers, searches and overlap studies. A concluding summary of all results is presented in Part III.
A Parallel Computing Approach to Creating Engineering Concept Spaces for Semantic Retrieval: The Illinois Digital Library Initiative Project
- IEEE Transactions on Pattern Analysis and Machine Intelligence
, 1996
Cited by 37 (12 self)
This research presents preliminary results generated from the semantic retrieval research component of the Illinois Digital Library Initiative (DLI) project. Using a variation of the automatic thesaurus generation techniques, to which we refer as the concept space approach, we aimed to create graphs of domain-specific concepts (terms) and their weighted co-occurrence relationships for all major engineering domains. Merging these concept spaces and providing traversal paths across different concept spaces could potentially help alleviate the vocabulary (difference) problem evident in large-scale information retrieval. We have experimented previously with such a technique for a smaller molecular biology domain (Worm Community System, with 10+ MBs of document collection) with encouraging results. In order to address the scalability issue related to large-scale information retrieval and analysis for the current Illinois DLI project, we recently conducted experiments using the concept sp...
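A graph of terms with weighted co-occurrence links, as described above, can be illustrated with a very small sketch. The weighting here (the fraction of documents containing term a that also contain term b) is a simplifying assumption and is not claimed to be the weighting used in the project; the function names and sample documents are likewise hypothetical.

```python
from collections import Counter
from itertools import combinations

def build_concept_space(docs):
    """Map each term to {co-occurring term: weight}, with weight = df(a, b) / df(a)."""
    df = Counter()       # number of documents containing each term
    pair_df = Counter()  # number of documents containing each unordered pair
    for doc in docs:
        terms = set(doc)
        df.update(terms)
        for a, b in combinations(sorted(terms), 2):
            pair_df[(a, b)] += 1
    space = {t: {} for t in df}
    for (a, b), n in pair_df.items():
        # The links are asymmetric: a rare term can point strongly at a common one.
        space[a][b] = n / df[a]
        space[b][a] = n / df[b]
    return space

def suggest(space, term, top_n=3):
    """Return the top_n terms most strongly linked from `term`, ties broken by name."""
    neighbors = space.get(term, {})
    return sorted(neighbors, key=lambda t: (-neighbors[t], t))[:top_n]

# Hypothetical documents, each already reduced to a list of index terms.
docs = [
    ["gene", "protein", "worm"],
    ["gene", "worm"],
    ["gene", "fly"],
]
space = build_concept_space(docs)
```

Here `suggest(space, "gene", 2)` yields "worm" first, and the link from "worm" to "gene" (weight 1.0) is stronger than the reverse link (weight 2/3), which shows why directed weights are useful when suggesting alternative search terms.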
Enhancing Performance in Latent Semantic Indexing (LSI) Retrieval
, 1992
Cited by 37 (0 self)
We have previously described an extension of the vector retrieval method called "Latent Semantic Indexing" (LSI) (Deerwester, et al., 1990; Dumais, et al., 1988; Furnas, et al., 1988). The LSI approach partially overcomes the problem of variability in human word choice by automatically organizing objects into a "semantic" structure more appropriate for information retrieval. This is done by modeling the implicit higher-order structure in the association of terms with objects. Initial tests find this completely automatic method to be a promising way to improve users' access to many kinds of textual materials or to objects for which textual descriptions are available. This paper describes some enhancements to the basic LSI method, including differential term weighting and relevance feedback. Appropriate term weighting improves performance by an average of 40%, and feedback based on 3 relevant documents improves performance by an average of 67%.
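The abstract does not spell out how feedback is computed, so the following sketch uses a Rocchio-style update (original query plus the centroid of the documents judged relevant) as a stand-in. The `alpha` and `beta` weights, the function names, and the two-dimensional toy vectors are assumptions for illustration only.

```python
import numpy as np

def feedback_query(query_vec, relevant_doc_vecs, alpha=1.0, beta=1.0):
    """Rocchio-style update: alpha * query + beta * centroid of relevant documents."""
    q = np.asarray(query_vec, dtype=float)
    rel = np.asarray(relevant_doc_vecs, dtype=float)
    if rel.size == 0:
        return q  # no judgments yet, so the query is unchanged
    return alpha * q + beta * rel.mean(axis=0)

def rank_by_cosine(query_vec, doc_vecs):
    """Return document indices ordered from most to least similar to the query."""
    docs = np.asarray(doc_vecs, dtype=float)
    q = np.asarray(query_vec, dtype=float)
    norms = np.linalg.norm(docs, axis=1) * np.linalg.norm(q)
    sims = np.divide(docs @ q, norms, out=np.zeros(len(docs)), where=norms > 0)
    return [int(i) for i in np.argsort(-sims, kind="stable")]

# Hypothetical document vectors in a 2-factor space.
doc_vecs = [[1.0, 0.0], [0.8, 0.6], [0.0, 1.0]]
query = [1.0, 0.0]

before = rank_by_cosine(query, doc_vecs)
# Suppose the user marks documents 1 and 2 as relevant.
new_query = feedback_query(query, [doc_vecs[1], doc_vecs[2]])
after = rank_by_cosine(new_query, doc_vecs)
```

In this toy case the ranking changes from document 0 first to document 1 first, because the updated query has moved toward the region of the space occupied by the relevant documents.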
Using WordNet as a Knowledge Base for Measuring Semantic Similarity Between Words
- In Proceedings of AICS Conference
, 1994
Cited by 36 (0 self)
In this paper we propose the use of WordNet as a knowledge base in an information retrieval task. The application areas range from information filtering and document retrieval to multimedia retrieval and data sharing in large scale distributed database systems. The WordNet-derived knowledge base makes semantic knowledge available which can be used in overcoming many problems associated with the richness of natural language. A semantic similarity measure is also proposed which can be used as an alternative to pattern matching in the comparison process.
Self-Organizing Maps In Natural Language Processing
, 1997
Cited by 33 (2 self)
Kohonen's Self-Organizing Map (SOM) is one of the most popular artificial neural network algorithms. Word category maps are SOMs that have been organized according to word similarities, measured by the similarity of the short contexts of the words. Conceptually interrelated words tend to fall into the same or neighboring map nodes. Nodes may thus be viewed as word categories. Although no a priori information about classes is given, during the self-organizing process a model of the word classes emerges. The central topic of the thesis is the use of the SOM in natural language processing. The approach based on the word category maps is compared with the methods that are widely used in artificial intelligence research. Modeling gradience, conceptual change, and subjectivity of natural language interpretation are considered. The main application area is information retrieval and textual data mining for which a specific SOM-based method called the WEBSOM has been developed. The WEBSOM metho...
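The self-organizing process mentioned in this abstract can be sketched with a small one-dimensional map. This is a generic SOM training loop, not the WEBSOM method: the linear decay schedules, the Gaussian neighborhood, the parameter defaults, and the function names are all assumptions, and the input rows stand in for whatever word-context vectors one would actually use.

```python
import numpy as np

def train_som(data, n_nodes, epochs=50, lr0=0.5, sigma0=None, seed=0):
    """Train a 1-D self-organizing map and return its weights, shape (n_nodes, dim)."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    sigma0 = n_nodes / 2.0 if sigma0 is None else sigma0
    # Initialize each node from a randomly chosen input vector.
    weights = data[rng.integers(0, len(data), size=n_nodes)].copy()
    positions = np.arange(n_nodes)
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                  # learning rate decays to 0
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # neighborhood shrinks, floor 0.5
            bmu = int(np.argmin(np.linalg.norm(weights - x, axis=1)))
            # Gaussian neighborhood centered on the best-matching unit.
            h = np.exp(-((positions - bmu) ** 2) / (2.0 * sigma ** 2))
            weights += lr * h[:, None] * (x - weights)
            step += 1
    return weights

def best_matching_unit(weights, x):
    """Index of the map node closest to x; words sharing a node form a category."""
    diffs = weights - np.asarray(x, dtype=float)
    return int(np.argmin(np.linalg.norm(diffs, axis=1)))
```

After training, each word can be assigned to the node returned by `best_matching_unit`, and words whose context vectors are similar tend to land on the same or adjacent nodes, which is the sense in which nodes act as word categories.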

