Retrieving records from a gigabyte of text on a minicomputer using statistical ranking
- Journal of the American Society for Information Science, 1990
Cited by 67 (1 self)
Statistically based ranked retrieval of records using keywords provides many advantages over traditional Boolean retrieval methods, especially for end users. This approach to retrieval, however, has not seen widespread use in large operational retrieval systems. To show the feasibility of this retrieval methodology, research was done to produce very fast search techniques using these ranking algorithms, and then to test the results against large databases with many end users. The results show not only response times on the order of 1 and 1/2 seconds for 806 megabytes of text, but also very favorable user reaction. Novice users were able to consistently obtain good search results after 5 minutes of training. Additional work was done to devise new indexing techniques to create inverted files for large databases using a minicomputer. These techniques use no sorting, require a working space of only about 20% of the size of the input text, and produce indices that are about 14% of the input text size.
A Fuzzy Linguistic Model for Generating Similar Short Queries
This work presents a model for generating a set of queries useful for obtaining massive information from Web search engines. Given an information request expressed as a Boolean structure, the proposed model generates a set of related queries suitable for submission to one or several search engines in order to collect a large amount of information. The model is based on a definition of the user query with linguistic constraints that guide the selection of the best queries for a search process. It includes a reformulation process that interchanges query terms using a knowledge source that provides terms related to the original query terms. After the original query is reformulated, a filtering process is applied, based on aggregating the semantics attached to the different query terms through linguistic values. The final result of the filtering is the set of appropriate queries that can be submitted to a search engine in order to retrieve a large number of documents.

