Finding Related Pages in the World Wide Web (1999) [124 citations — 1 self]

by Jeffrey Dean, Monika R. Henzinger

Abstract:

When using traditional search engines, users have to formulate queries to describe their information need. This paper discusses a different approach to web searching where the input to the search process is not a set of query terms, but instead is the URL of a page, and the output is a set of related web pages. A related web page is one that addresses the same topic as the original page. For example, www.washingtonpost.com is a page related to www.nytimes.com, since both are online newspapers. We describe two algorithms to identify related web pages. These algorithms use only the connectivity information in the web (i.e., the links between pages) and not the content of pages or usage information. We have implemented both algorithms and measured their runtime performance. To evaluate the effectiveness of our algorithms, we performed a user study comparing our algorithms with Netscape's "What's Related" service [12]. Our study showed that the precision at 10 for our two algorithms are 7...
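As an illustration of what "using only connectivity information" can look like, here is a minimal sketch of a co-citation style ranking, followed by a precision-at-k measure of the kind used in the evaluation. This is not taken from the paper and should not be read as the authors' algorithms: the function names, the in-memory `links` dictionary, and the hub URLs are all assumptions made for the example.

```python
from collections import Counter


def related_by_cocitation(links, start_url, top_k=10):
    """Rank pages by how often they are linked from the same page as start_url.

    links: hypothetical dict mapping a page URL to the set of URLs it links to.
    """
    # Parents are the pages that link to the start URL.
    parents = [page for page, outs in links.items() if start_url in outs]
    counts = Counter()
    for parent in parents:
        for sibling in links[parent]:
            if sibling != start_url:
                counts[sibling] += 1
    # Sort by descending co-citation count, then by URL for a stable order.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [url for url, _ in ranked[:top_k]]


def precision_at_k(ranked, relevant, k=10):
    """Fraction of the top k results that a judge marked as relevant."""
    if k <= 0:
        return 0.0
    hits = sum(1 for url in ranked[:k] if url in relevant)
    return hits / k


# Toy link graph; the hub pages are invented for illustration.
links = {
    "hub1.example": {"www.nytimes.com", "www.washingtonpost.com", "www.latimes.com"},
    "hub2.example": {"www.nytimes.com", "www.washingtonpost.com"},
    "hub3.example": {"www.washingtonpost.com", "unrelated.example"},
}
result = related_by_cocitation(links, "www.nytimes.com")
print(result)  # ['www.washingtonpost.com', 'www.latimes.com']
print(precision_at_k(result, {"www.washingtonpost.com"}, k=2))  # 0.5
```

In this toy graph, www.washingtonpost.com ranks first because two pages link to both it and www.nytimes.com. No page content is inspected at any point.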

Citations

1839 The Anatomy of a Large-Scale Hypertextual Web Search Engine – Brin, Page - 1998
1669 Authoritative sources in a hyperlinked environment – Kleinberg - 1999
620 Social Information Filtering: Algorithms for Automating "Word of Mouth" SIGCHI '95 – Shardanand, Maes - 1995
349 Improved algorithms for topic distillation in hyperlinked environments – Bharat, Henzinger - 1998
254 Enhanced hypertext categorization using hyperlinks – Chakrabarti, Dom, et al. - 1998
244 Automatic resource compilation by analyzing hyperlink structure and associated text – Chakrabarti, Dom, et al. - 1998
205 Silk from a sow’s ear: Extracting usable structures from the Web – Pirolli, Pitkow, et al. - 1996
171 Co-citation in the scientific literature: A new measure of the relationship between two documents – Small - 1973
127 Bibliographic coupling between scientific papers – Kessler - 1963
91 The Connectivity Server: Fast access to linkage information on the Web – Bharat, Broder, et al. - 1998
87 Parasite: Mining structural information on the web – Spertus - 1997
86 WebQuery: Searching and visualizing the Web through connectivity – Carrière, Kazman
78 WebL - A Programming Language for the Web – Kistler, Marais - 1998
67 Citation analysis as a tool in journal evaluation – Garfield - 1972
65 Applications of a Web query language – Arocena, Mendelzon, et al.
33 Life, death, and lawfulness on the electronic frontier – Pitkow, Pirolli - 1997
30 Experiments in topic distillation – Chakrabarti, Dom, et al. - 1998
22 Citation Indexing – Garfield - 1979
20 Finding and visualizing intersite clan graphs – Terveen, Hill - 1998
17 Evaluating Emergent Collaboration on the Web – Terveen, Hill - 1998
2 Introductory Statistics – Ross - 1996