ScentTrails: Integrating Browsing and Searching on the Web
- ACM Transactions on Computer-Human Interaction, 2003
Information retrieval on the Web
- ACM Computing Surveys, 2000
Cited by 58 (0 self)
In this paper we review studies of the growth of the Internet and technologies that are useful for information search and retrieval on the Web. We present data on the Internet from several different sources, e.g., current as well as projected number of users, hosts, and Web sites. Although numerical figures vary, overall trends cited
Constructing, Organizing, and Visualizing Collections of Topically Related Web Resources
- ACM Transactions on Computer-Human Interaction, 1999
Cited by 40 (5 self)
For many purposes, the Web page is too small a unit of interaction and analysis. Web sites are structured multimedia documents consisting of many pages, and users often are interested in obtaining and evaluating entire collections of topically related sites. Once such a collection is obtained, users face the challenge of exploring, comprehending, and organizing the items. We report four innovations that address these user needs:
1. We replaced the web page with the web site as the basic unit of interaction and analysis.
2. We defined a new information structure, the clan graph, that groups together sets of related sites.
3. We augment the representation of a site with a site profile, information about site structure and content that helps inform user evaluation of a site.
4. We invented a new graph visualization, the auditorium visualization, that reveals important structural and content properties of sites within a clan graph.
Detailed analysis and user studies document the utility o...
Cha-Cha: A system for organizing intranet search results
- In Proceedings of the 2nd USENIX Symposium on Internet Technologies and Systems, 1999
Cited by 32 (2 self)
Although search over World Wide Web pages has recently received much academic and commercial attention, surprisingly little research has been done on how to search the web pages within large, diverse intranets. Intranets contain the information associated with the internal workings of an organization. A standard search engine retrieves web pages that fall within a widely diverse range of information contexts, but presents these results uniformly, in a ranked list. As an alternative, the Cha-Cha system organizes web search results in such a way as to reflect the underlying structure of the intranet. In our approach, an “outline” or “table of contents” is created by first recording the shortest paths in hyperlinks from root pages to every page within the web intranet. After the user issues a query, these shortest paths are dynamically combined to form a hierarchical outline of the context in which the search results occur. The system is designed to be helpful for users with a wide range of computer skills. Preliminary user study and survey results suggest that some users find the resulting structure more helpful than the standard retrieval results display for intranet search.
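The two steps described in this abstract (recording shortest hyperlink paths from a root page, then merging the paths of the query hits into an outline) can be illustrated with a minimal sketch. This is not the Cha-Cha implementation: the breadth-first search, the single root, the in-memory `links` dictionary, the example URLs, and all function names are assumptions made for illustration only.

```python
from collections import deque

def shortest_paths_from_root(links, root):
    """Breadth-first search over the hyperlink graph.

    Returns a dict mapping each reachable page to its parent on one
    shortest path from the root (the root maps to None).
    """
    parent = {root: None}
    queue = deque([root])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in parent:  # first visit is a shortest path
                parent[target] = page
                queue.append(target)
    return parent

def path_to(page, parent):
    """Rebuild the root-to-page path by following parent pointers."""
    path = []
    while page is not None:
        path.append(page)
        page = parent[page]
    return path[::-1]

def build_outline(hits, parent):
    """Merge the shortest paths of all hit pages into a nested dict."""
    outline = {}
    for hit in hits:
        if hit not in parent:
            continue  # page is unreachable from the root, so skip it
        node = outline
        for page in path_to(hit, parent):
            node = node.setdefault(page, {})
    return outline

def render(outline, depth=0):
    """Return the outline as indented lines, one page per line."""
    lines = []
    for page, children in outline.items():
        lines.append("  " * depth + page)
        lines.extend(render(children, depth + 1))
    return lines

# Hypothetical intranet link graph: page -> pages it links to.
links = {
    "/": ["/research", "/teaching"],
    "/research": ["/research/ir", "/research/hci"],
    "/teaching": ["/teaching/cs101", "/research/ir"],
}
parent = shortest_paths_from_root(links, "/")
outline = build_outline(["/research/ir", "/teaching/cs101"], parent)
print("\n".join(render(outline)))
```

In this sketch the shortest paths are computed once, ahead of any query, and only the merge step runs per query, which mirrors the split between precomputation and dynamic combination that the abstract describes.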
Ontology-Based Web Site Mapping for Information Exploration
- In Proceedings of the 8th International Conference on Information and Knowledge Management (CIKM), 1999
Cited by 23 (5 self)
A centralized search process requires that the whole collection reside at a single site. This imposes a burden on both the system storage of the site and the network traffic near the site. The search process therefore needs to be distributed. Recently, more and more Web sites provide the ability to search their local collection of Web pages. Query brokering systems are used to direct queries to the promising sites and merge the results from these sites. Creation of meta-information of the sites plays an important role in such systems. In this article, we introduce an ontology-based web site mapping method used to produce conceptual meta-information, the Vector Space approach, and present a series of experiments comparing it with the Naive Bayes approach. We found that the Vector Space approach produces better accuracy in ontology-based web site mapping.
Keywords: Distributed collections, information brokers, text categorization, IR agents.
1. INTRODUCTION The World Wide Web (WWW)...
Ephemeral Document Clustering for Web Applications
2000
Cited by 16 (0 self)
We revisit document clustering in the context of the Web. Specifically, we investigate on-line ephemeral clustering, whereby the input document set is generated dynamically, typically by search results, and the output clustering hierarchy has a short life span, and is used for interactive browsing purposes. Ephemeral clustering for interactive use introduces several new challenges. It requires an efficient algorithm, since clustering is performed on-line. It also requires high precision, because users who are not domain experts are less tolerant to errors, and because the resulting hierarchy is fully automatically generated, as opposed to off-line clustering in which the hierarchy is often manually modified. Finally, interactive clustering requires a presentation layer that enables users to effectively browse the hierarchy, including visualization techniques and automatic annotations of the hierarchy. We present new concepts, techniques and algorithms that tailor clustering to...
Surfing the Web Backwards
- In Proc. of WWW 8 Conference, 1999
Cited by 14 (4 self)
From a user’s perspective, hypertext links on the web form a directed graph between distinct information sources. We investigate the effects of discovering “backlinks” from web resources, namely links pointing to the resource. We describe tools for backlink navigation on both the client and server side, using an applet for the client and a module for the Apache web server. We also discuss possible extensions to the HTTP protocol to facilitate the collection and navigation of backlink information in the world wide web.
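Since the abstract defines backlinks as the links pointing to a resource, they correspond to the reversed edges of the directed link graph. The following is a minimal sketch of that inversion, assuming the forward links are already available in memory; it does not reproduce the applet or Apache module mentioned above, and the `links` data and the function name `build_backlinks` are hypothetical.

```python
from collections import defaultdict

def build_backlinks(links):
    """Invert a forward link graph.

    links maps each source page to the pages it links to; the result
    maps each target page to the set of pages that link to it.
    """
    backlinks = defaultdict(set)
    for source, targets in links.items():
        for target in targets:
            backlinks[target].add(source)
    return backlinks

# Hypothetical forward links: page -> pages it points to.
links = {
    "a.html": ["b.html", "c.html"],
    "b.html": ["c.html"],
    "c.html": ["a.html"],
}
backlinks = build_backlinks(links)
print(sorted(backlinks["c.html"]))
```

The difficulty the abstract addresses is that a single server or client normally sees only its own forward links, so in practice the `links` input would have to be gathered from sources such as crawls or referrer information rather than assumed.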
Feature Reduction for Document Clustering and Classification
2000
Cited by 12 (2 self)
Often users receive search results which contain a wide range of documents, only some of which are relevant to their information needs. To address this problem, ever more systems not only locate information for users, but also organise that information on their behalf. We look at two main automatic approaches to information organisation: interactive clustering of search results and pre-categorising documents to provide hierarchical browsing structures. To be feasible in real world applications, both of these approaches require accurate yet efficient algorithms. Yet, both suffer from the curse of dimensionality: documents are typically represented by hundreds or thousands of words (features) which must be analysed and processed during clustering or classification. In this paper, we discuss feature reduction techniques and their application to document clustering and classification, showing that feature reduction improves efficiency as well as accuracy. We validate these algorithms using human relevance assignments and categorisation.
Overview and Preview Tools For Navigating the World-Wide Web
1999
Cited by 6 (4 self)
This paper examines the problems inherent in navigating the World-Wide Web. It discusses the work done by others in crafting techniques, software products, and research prototypes that attempt to improve the browsing experience through the application of information visualization in the form of sitemaps. This paper also describes an animated technique to generate previews and overviews of a web site in order to get a better understanding of its contents. The final section includes a technical description of an early prototype tool that uses this animated technique, with preliminary findings from an informal feasibility study involving 19 subjects.
Keywords: Web browsing; alternative user interfaces; web navigation; previews; overviews; web crawler; searching
Context and Problem Statement
The World-Wide Web is a constantly evolving maze of HTML, DHTML, XML, Java, JavaScript, CGI, Active Server Pages, Shockwave, Flash, and other means of generating hypertext content. It is an extremely ...
Apolo: Making Sense of Large Network Data by Combining Rich User Interaction and Machine Learning
Cited by 6 (3 self)
Extracting useful knowledge from large network datasets has become a fundamental challenge in many domains, from scientific literature to social networks and the web. We introduce Apolo, a system that uses a mixed-initiative approach, combining visualization, rich user interaction and machine learning, to guide the user to incrementally and interactively explore large network data and make sense of it. Apolo engages the user in bottom-up sensemaking to gradually build up an understanding over time by starting small, rather than starting big and drilling down. Apolo also helps users find relevant information by specifying exemplars, and then using a machine learning method called Belief Propagation to infer which other nodes may be of interest. We evaluated Apolo with twelve participants in a between-subjects study, with the task being to find relevant new papers to update an existing survey paper. As assessed by expert judges, participants using Apolo found significantly more relevant papers. Subjective feedback on Apolo was also very positive.

