Improved Algorithms for Topic Distillation in a Hyperlinked Environment (1998)

Abstract:

This paper addresses the problem of topic distillation on the World Wide Web: given a typical user query, find quality documents related to the query topic. Connectivity analysis has been shown to be useful in identifying high-quality pages within a topic-specific graph of hyperlinked documents. The essence of our approach is to augment a previous connectivity-analysis-based algorithm with content analysis. We identify three problems with the existing approach and devise algorithms to tackle them. We report the results of a user evaluation, which show that precision at 10 documents improves by at least 45% over pure connectivity analysis.
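The evaluation measure named in the abstract, precision at 10 documents, can be illustrated with a minimal sketch. The function names, document identifiers, and relevance judgments below are hypothetical and chosen only for illustration; they are not data or code from the paper. The sketch assumes the usual definition: the fraction of the top 10 ranked documents that are judged relevant.

```python
def precision_at_k(ranked_docs, relevant_docs, k=10):
    """Return the fraction of the top-k ranked documents that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked_docs[:k]
    hits = sum(1 for doc in top_k if doc in relevant_docs)
    # Divide by k (not len(top_k)), so short result lists are penalized.
    return hits / k


def relative_improvement(baseline, improved):
    """Return the relative gain of `improved` over `baseline` (0.45 means 45%)."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (improved - baseline) / baseline


# Hypothetical relevance judgments and rankings, for illustration only.
relevant = {"d1", "d3", "d5", "d7", "d9", "d11"}
connectivity_only = ["d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8", "d10", "d12"]
with_content = ["d1", "d3", "d5", "d7", "d9", "d11", "d2", "d4", "d6", "d8"]

p_base = precision_at_k(connectivity_only, relevant)
p_new = precision_at_k(with_content, relevant)
gain = relative_improvement(p_base, p_new)
print(f"baseline P@10 = {p_base:.2f}, new P@10 = {p_new:.2f}, gain = {gain:.0%}")
```

In this made-up example the baseline ranking places 4 relevant documents in its top 10 and the second ranking places 6. Read as a relative gain, "an improvement of at least 45%" would mean the second precision is at least 1.45 times the first; whether the paper reports relative or absolute gains is not stated in the abstract, so that reading is an assumption.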
