Topical web crawlers: Evaluating adaptive algorithms - ACM Transactions on Internet Technology, 2004
Cited by 35 (11 self)
Topical crawlers are increasingly seen as a way to address the scalability limitations of universal search engines, by distributing the crawling process across users, queries, or even client computers. The context available to such crawlers can guide the navigation of links with the goal of efficiently locating highly relevant target pages. We developed a framework to fairly evaluate topical crawling algorithms under a number of performance metrics. Such a framework is employed here to evaluate different algorithms that have proven highly competitive among those proposed in the literature and in our own previous research. In particular we focus on the tradeoff between exploration and exploitation of the cues available to a crawler, and on adaptive crawlers that use machine learning techniques to guide their search. We find that the best performance is achieved by a novel combination of explorative and exploitative bias, and introduce an evolutionary crawler that surpasses the performance of the best non-adaptive crawler after sufficiently long crawls. We also analyze the computational complexity of the various crawlers and discuss how performance and complexity scale with available resources. Evolutionary crawlers achieve high efficiency and scalability by distributing the work across concurrent agents, resulting in the best performance/cost ratio.
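The exploration/exploitation tradeoff mentioned in this abstract can be illustrated with a small sketch. The listing does not describe the crawlers themselves, so everything below is an assumption for illustration and not the authors' implementation: a priority-queue frontier ordered by a relevance score, with a hypothetical `batch_size` parameter. A batch size of 1 is purely greedy (exploitative); larger batches expand lower-scored links before re-ranking, which is more explorative. The function name, the toy link graph, and the scores are all invented.

```python
import heapq
from typing import Callable, Dict, Iterable, List, Tuple


def crawl_best_n_first(
    seeds: Iterable[str],
    links: Dict[str, List[str]],
    score: Callable[[str], float],
    batch_size: int,
    max_pages: int,
) -> List[str]:
    """Visit up to max_pages URLs, taking the batch_size best-scored
    frontier entries per step. `links` stands in for fetching a page
    and extracting its out-links (hypothetical in-memory graph)."""
    frontier: List[Tuple[float, str]] = []  # min-heap on negated score
    seen = set()
    for url in seeds:
        if url not in seen:
            seen.add(url)
            heapq.heappush(frontier, (-score(url), url))

    visited: List[str] = []
    while frontier and len(visited) < max_pages:
        # Pop the whole batch before expanding any of its members.
        take = min(batch_size, len(frontier))
        batch = [heapq.heappop(frontier)[1] for _ in range(take)]
        for url in batch:
            if len(visited) >= max_pages:
                break
            visited.append(url)
            for out in links.get(url, []):
                if out not in seen:
                    seen.add(out)
                    heapq.heappush(frontier, (-score(out), out))
    return visited


# Toy data (invented): page "e" is highly relevant but only reachable
# through the low-scoring page "c".
toy_links = {"a": ["b", "c"], "b": ["d"], "c": ["e"], "d": [], "e": []}
toy_scores = {"a": 1.0, "b": 0.9, "c": 0.2, "d": 0.5, "e": 0.95}

greedy = crawl_best_n_first(["a"], toy_links, toy_scores.get, 1, 4)
explorative = crawl_best_n_first(["a"], toy_links, toy_scores.get, 2, 4)
print(greedy)       # ['a', 'b', 'd', 'c']
print(explorative)  # ['a', 'b', 'c', 'e']
```

On this toy graph the greedy setting spends its four-page budget without reaching "e", while the larger batch reaches it by visiting the low-scoring "c" earlier. Which setting performs better on real crawls is an empirical question that this sketch does not answer.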
Adaptive Distributed Search and Advertising for WWW, 2001
In this paper, we present the concept of, and discuss problems related to, distributed search architectures for the World Wide Web. We structure the problem area and analyse what aspects have already been covered by existing research and what needs to be done. We outline possible approaches to some of the important research issues in distributed search architectures and present the ADSA (Adaptive Distributed Search and Advertising) project which aims to resolve them.
Advanced Distributed Search for the Web, 2001
In this paper, we present the concept of, and discuss problems related to, distributed search architectures for the World Wide Web. We structure the problem area and analyse what aspects have already been covered by previous research and what needs to be done. We outline possible approaches to some of the important research issues in distributed search architectures and discuss them in the context of the ADSA (Adaptive Distributed Search and Advertising) project.

