Reexamining the Cluster Hypothesis: Scatter/Gather on Retrieval Results, 1996
Cited by 331 (5 self)
We present Scatter/Gather, a cluster-based document browsing method, as an alternative to ranked titles for the organization and viewing of retrieval results. We systematically evaluate Scatter/Gather in this context and find significant improvements over similarity search ranking alone. This result provides evidence validating the cluster hypothesis, which states that relevant documents tend to be more similar to each other than to non-relevant documents. We describe a system employing Scatter/Gather and demonstrate that users are able to use this system close to its full potential.
1 Introduction
An important service offered by an information access system is the organization of retrieval results. Conventional systems rank results based on an automatic assessment of relevance to the query [20]. Alternatives include graphical displays of interdocument similarity (e.g., [1, 22, 7]), relationship to fixed attributes (e.g., [21, 14]), and query term distribution patterns (e.g., [12]). I...
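The cluster hypothesis stated in this abstract can be illustrated with a toy check. The sketch below is not the paper's evaluation method; it assumes bag-of-words term counts, cosine similarity, and a hypothetical helper named `cluster_hypothesis_gap`, and it needs at least two relevant documents and one non-relevant document.

```python
import math
from collections import Counter
from itertools import combinations


def vectorize(text):
    """Turn a document into a bag-of-words term-count vector (an assumption)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_hypothesis_gap(relevant, non_relevant):
    """Mean relevant-to-relevant similarity minus mean relevant-to-non-relevant
    similarity. A positive value is consistent with the cluster hypothesis."""
    rel = [vectorize(d) for d in relevant]
    non = [vectorize(d) for d in non_relevant]
    within = [cosine(a, b) for a, b in combinations(rel, 2)]
    across = [cosine(a, b) for a in rel for b in non]
    return sum(within) / len(within) - sum(across) / len(across)


# Made-up documents for the query "jaguar car" (illustrative only).
relevant_docs = ["jaguar car engine", "jaguar car dealer", "car engine repair"]
other_docs = ["jaguar cat habitat", "rainforest cat habitat"]
gap = cluster_hypothesis_gap(relevant_docs, other_docs)
```

On this toy data the gap is positive, meaning the relevant documents are on average closer to each other than to the non-relevant ones.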
SONIA: A Service for Organizing Networked Information Autonomously, 1998
Cited by 38 (0 self)
The recent explosion of on-line information in Digital Libraries and on the World Wide Web has given rise to a number of query-based search engines and manually constructed topical hierarchies. However, these tools are quickly becoming inadequate as query results grow incomprehensibly large and manual classification in topic hierarchies creates an immense bottleneck. We address these problems with a system for topical information space navigation that combines the query-based and taxonomic systems. We employ machine learning techniques to create dynamic document categorizations based on the full text of articles that are retrieved in response to users' queries. Our system, named SONIA (Service for Organizing Networked Information Autonomously), has been implemented as part of the Stanford Digital Libraries Testbed. It employs a combination of technologies that takes the results of queries to networked information sources and, in real time, automatically retrieves, parses and organizes the...
Next Generation Web Search: Setting Our Sites - IEEE DATA ENGINEERING BULLETIN, 2000
Cited by 34 (2 self)
The current state of web search is most successful at directing users to appropriate web sites. Once at the site, the user has a choice of following hyperlinks or using site search, but the latter is notoriously problematic. One solution is to develop specialized search interfaces that explicitly support the types of tasks users perform using the information specific to the site. A new way to support task-based site search is to dynamically present appropriate metadata that organizes the search results and suggests what to look at next, as a personalized intermixing of search and hypertext.
SQLET: Short Query Linguistic Expansion Techniques, Palliating One-Word Queries by Providing Intermediate Structure to Text - In Proceedings of the RIAO’97, 1997
Cited by 18 (0 self)
Most people using the WWW try to find information using one- or two-word queries. The information retrieval systems derived from research models were designed for longer queries and do not provide an adequate response to the user's needs. On the other hand, recent advances in natural language processing permit the extraction of typed information that is centered on one or two words. We review a selection of this typed information and describe how it could be used to present an intermediate structure for the user, a structure fitting between their short queries and the documents found on the web. The user would first access this structure and then, having found a use corresponding to their information need, access the corresponding web pages more quickly and precisely, without random searching. An example is presented for a sample short query.
The use of categories and clusters for organizing retrieval results - Natural Language Information Retrieval, 1999
Cited by 14 (1 self)
Abstract. An important problem for information access systems is that of organizing large sets of documents that have been retrieved in response to a query. Text categorization and text clustering are two natural language processing tasks whose results can be applied to document organization. This chapter describes user interfaces that use categories and clusters to organize retrieval results, and examines the relationship between the two.
A WordNet-Based Interface to Internet Search Engines - In Proceedings of the FLAIRS-98, 1998
Cited by 7 (2 self)
A vast amount of information is available on the Internet, and naturally, many information gathering tools have been developed. Several search engines with different characteristics, such as AltaVista, Lycos, Infoseek, and others are available. However, web information retrieval technology is still in its infancy, and there is a need for considerable improvement. Some inherent difficulties are: (1) web information is diverse and highly unstructured, (2) the volume of information is large and grows at an exponential rate, and (3) current search engine technology is still rudimentary. While the first two issues are more profound and require long-term solutions, it may be possible to develop software around the search engines to improve the quality of the information retrieved. In this paper we present a natural language interface system to a search engine and discuss some of the results obtained.
Introduction
A main problem with the current search engines is the large volume ...
Combining Text-, Link-, and Classification-based Retrieval Methods to Enhance Information Discovery on the Web, 2002
Automatic Construction of N-ary Tree Based Taxonomies
Cited by 6 (2 self)
Hierarchies are an intuitive and effective organization paradigm for data. Of late there has been considerable research on automatically learning hierarchical organizations of data. In this paper, we explore the problem of learning n-ary tree based hierarchies of categories with no user-defined parameters. We propose a framework that characterizes a “good” taxonomy and also provide an algorithm to find it. This algorithm works completely automatically (with no user input) and is significantly less greedy than existing algorithms in the literature. We evaluate our approach on multiple real-life datasets from diverse domains, such as text mining, hyper-spectral analysis, and written character recognition. Our experimental results show that not only are n-ary tree based taxonomies more “natural”, but the output space decompositions induced by these taxonomies also yield, for many datasets, better classification accuracies than classification on binary tree based taxonomies.
Improving the search on the Internet by using WordNet and lexical operators - In IEEE Internet Computing, 1998
Cited by 4 (0 self)
A vast amount of information is available on the Internet, and naturally, many information gathering tools have been developed. Search engines with different characteristics, such as AltaVista, Lycos, Infoseek, and others are available. However, there are inherent difficulties associated with the task of retrieving information on the Internet: (1) web information is diverse and highly unstructured, and (2) the volume of information is large and grows at an exponential rate. While these two issues are profound and require long-term solutions, it is still possible to develop software around the search engines to improve the quality of the information retrieved. In this paper we present a natural language interface system to a search engine. The search improvement achieved by our system is based on: (1) a query extension using WordNet and (2) the use of new lexical operators that replace the classical boolean operators used by current search engines. Several tests have been performed using the TIPSTER topics collection, provided at the 6th Text Retrieval Conference (TREC-6); the results obtained are presented and discussed.
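The query extension idea mentioned in this abstract can be sketched roughly as follows. This is not the paper's system: the `SYNONYMS` table is a hypothetical, hand-written stand-in for WordNet lookups, `expand_query` is an illustrative name, and the output uses plain boolean OR/AND rather than the new lexical operators, which the abstract does not specify.

```python
# Hypothetical synonym table standing in for WordNet lookups (an assumption).
SYNONYMS = {
    "car": ["automobile", "auto"],
    "buy": ["purchase"],
}


def expand_query(query, synonyms=SYNONYMS):
    """Expand each query word into an OR-group of itself and its synonyms,
    then join the groups with AND."""
    groups = []
    for word in query.lower().split():
        alternatives = [word] + synonyms.get(word, [])
        if len(alternatives) > 1:
            groups.append("(" + " OR ".join(alternatives) + ")")
        else:
            groups.append(word)
    return " AND ".join(groups)


expanded = expand_query("buy used car")
# expanded == "(buy OR purchase) AND used AND (car OR automobile OR auto)"
```

Words without an entry in the table pass through unchanged, so the expanded query never drops a term the user typed.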
A New Visual Search Interface for Web Browsing
Cited by 4 (0 self)
We introduce a new visual search interface for search engines. The interface is a user-friendly and informative graphical front-end for organizing and presenting search results in the form of topic groups. Such a semantics-oriented search result presentation is in contrast with conventional search interfaces, which present search results according to the physical structures of the information. Given a user query, our interface first retrieves relevant online materials via a third-party search engine. We then analyze the semantics of the search results to detect latent topics in the result set. Once the topics are detected, we map the search result pages into topic clusters. According to the topic clustering result, we divide the available screen space for our visual interface into multiple topic display regions, one for each topic. For each topic’s display region, we summarize the information contained in the search results under the corresponding topic so that only key messages are displayed. With this new visual search interface, the key information in the search results is conveyed to users quickly. With the key information, users can navigate to the final, desired results with less effort and time than with conventional searching. Supplementary materials for this paper are available at
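The step of dividing screen space into one region per topic might look like the sketch below. The abstract does not say how the space is divided; allocating horizontal width in proportion to cluster size is an assumption made here for illustration, and `allocate_regions` is a hypothetical name. The sketch assumes at least one non-empty cluster.

```python
def allocate_regions(cluster_sizes, screen_width):
    """Split a horizontal strip of `screen_width` pixels into one region per
    topic, each sized in proportion to the number of results in its cluster.
    Returns {topic: (x_offset, width)}. Proportional sizing is an assumption."""
    total = sum(cluster_sizes.values())
    items = list(cluster_sizes.items())
    regions = {}
    x = 0
    for index, (topic, size) in enumerate(items):
        if index == len(items) - 1:
            # The last region absorbs any pixels lost to integer rounding.
            width = screen_width - x
        else:
            width = screen_width * size // total
        regions[topic] = (x, width)
        x += width
    return regions


# Made-up cluster sizes for illustration.
layout = allocate_regions({"sports": 5, "finance": 3, "travel": 2}, 1000)
```

Giving the rounding remainder to the last region keeps the regions contiguous and makes their widths sum exactly to the screen width.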

