
Complementing Search Engines with Online Web Mining Agents (2002)

by Filippo Menczer

Results 1 - 8 of 8

Topical web crawlers: Evaluating adaptive algorithms

by Filippo Menczer, Gautam Pant, Padmini Srinivasan - ACM Transactions on Internet Technology, 2004
Cited by 35 (11 self)
Topical crawlers are increasingly seen as a way to address the scalability limitations of universal search engines, by distributing the crawling process across users, queries, or even client computers. The context available to such crawlers can guide the navigation of links with the goal of efficiently locating highly relevant target pages. We developed a framework to fairly evaluate topical crawling algorithms under a number of performance metrics. Such a framework is employed here to evaluate different algorithms that have proven highly competitive among those proposed in the literature and in our own previous research. In particular we focus on the tradeoff between exploration and exploitation of the cues available to a crawler, and on adaptive crawlers that use machine learning techniques to guide their search. We find that the best performance is achieved by a novel combination of explorative and exploitative bias, and introduce an evolutionary crawler that surpasses the performance of the best non-adaptive crawler after sufficiently long crawls. We also analyze the computational complexity of the various crawlers and discuss how performance and complexity scale with available resources. Evolutionary crawlers achieve high efficiency and scalability by distributing the work across concurrent agents, resulting in the best performance/cost ratio.

A General Evaluation Framework for Topical Crawlers

by P. Srinivasan, F. Menczer, G. Pant - Information Retrieval, 2005
Cited by 28 (10 self)
Topical crawlers are becoming important tools to support applications such as specialized Web portals, online searching, and competitive intelligence. As the Web mining field matures, the disparate crawling strategies proposed in the literature will have to be evaluated and compared on common tasks through well-defined performance measures. This paper presents a general framework to evaluate topical crawlers. We identify a class of tasks that model crawling applications of different nature and difficulty. We then introduce a set of performance measures for fair comparative evaluations of crawlers along several dimensions including generalized notions of precision, recall, and efficiency that are appropriate and practical for the Web. The framework relies on independent relevance judgements compiled by human editors and available from public directories. Two sources of evidence are proposed to assess crawled pages, capturing different relevance criteria. Finally we introduce a set of topic characterizations to analyze the variability in crawling effectiveness across topics. The proposed evaluation framework synthesizes a number of methodologies in the topical crawlers literature and many lessons learned from several studies conducted by our group. The general framework is described in detail and then illustrated in practice by a case study that evaluates four public crawling algorithms. We found that the proposed framework is effective at evaluating, comparing, differentiating and interpreting the performance of the four crawlers. For example, we found the IS crawler to be most sensitive to the popularity of topics.

Implicit: An agent-based recommendation system for web search

by Er Birukov, Enrico Blanzieri, Paolo Giorgini - In Proceedings of the 4th International Conference on Autonomous Agents and Multi-Agent Systems, 2005
Cited by 19 (8 self)
The amount of information on the Internet is increasing very fast and, as a result, search is becoming an increasingly hard task. A common solution is to use authority-based search engines. However, for a community of people with similar interests, the quality of results can be improved by also exploiting implicit knowledge. We propose an agent-based recommendation system for supporting communities of people in searching the web by means of a popular search engine. Agents use data mining techniques in order to learn and discover users' behaviors, and they interact to share knowledge about the users. We also present a set of experimental results showing, in terms of precision and recall, how interaction increases the performance of the system.

Topic-Driven Crawlers: Machine Learning Issues

by Filippo Menczer, Gautam Pant, Padmini Srinivasan - ACM TOIT, Submitted, 2002
Cited by 9 (2 self)
Topic driven crawlers are increasingly seen as a way to address the scalability limitations of universal search engines, by distributing the crawling process across users, queries, or even client computers.

A Consensus-based Multi-agent Approach for Information Retrieval in Internet

by Ngoc Thanh Nguyen, Maria Ganzha, Marcin Paprzycki
Cited by 1 (0 self)
This paper presents a consensus-based approach utilized within a multi-agent system which assists users in retrieving information from the Internet. In this system consensus methods are applied for reconciling inconsistencies among independent answers generated by agents (using different search engines) for a given query. The proposed agent system has been implemented and initial experimental results are presented.

WYDZIAŁ MATEMATYKI [Faculty of Mathematics]

by Jakub Stadnik, 2008
Supervisor's signature, author's signature. Politechnika Warszawska (Warsaw University of Technology)

Contents lists available at ScienceDirect Decision Support Systems

by Michael Chau, Cho Hung Wong
journal homepage: www.elsevier.com/locate/dss

Multi-Agent System for Distributed Data Retrieval using PQR Approach

by M. Murali, Dr. R. Srinivasan, Chennai India
The paper describes the process of distributed data access from a mobile device, employing mobile agents. To answer any query in the distributed environment, the search is conducted only in the databases that are known to the systems. Transferring a database to the system where the query originated would involve high communication cost and response time, and would increase network traffic. In order to reduce the values of these parameters, and incidentally the network traffic, mobile agents are used to fetch the result from various sites. A mobile agent is a software program that migrates from one node to another where the data is located, instead of transmitting data across the network. This paper presents a study of the deployment of a multi-agent system for retrieval of data and also the management of distributed resources. The experiments conducted reveal the performance of mobile agents using the Parallel Query Retrieval (PQR) approach.