Incorporating Quality Metrics in Centralized/Distributed Information Retrieval on the World Wide Web
2000
Cited by 30 (0 self)
Most information retrieval systems on the Internet rely primarily on similarity ranking algorithms based solely on term frequency statistics. Information quality is usually ignored, so documents are retrieved without regard to their quality. We present an approach that combines similarity-based ranking with quality ranking in centralized and distributed search environments. Six quality metrics (currency, availability, information-to-noise ratio, authority, popularity, and cohesiveness) were investigated. Search effectiveness was significantly improved when the currency, availability, information-to-noise ratio, and page cohesiveness metrics were incorporated in centralized search. The improvement seen when the availability, information-to-noise ratio, popularity, and cohesiveness metrics were incorporated in site selection was also significant. Finally, incorporating the popularity metric in information fusion resulted in a significan...
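The abstract above does not give the formula used to combine the two rankings. As a minimal sketch only, assuming each quality metric is already normalized to [0, 1] and that the combination is a simple linear blend (the names `Doc`, `combined_score`, `rank`, the `weights` mapping, and the `alpha` parameter are all hypothetical, not taken from the paper), the idea could look like this:

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    doc_id: str
    similarity: float  # term-based similarity to the query, assumed in [0, 1]
    quality: dict = field(default_factory=dict)  # metric name -> score in [0, 1]


def combined_score(doc, weights, alpha=0.5):
    """Blend similarity with a weighted average of quality metrics.

    This linear blend is an illustrative assumption, not the paper's formula.
    """
    total = sum(weights.values())
    if total == 0:
        # No quality metrics selected: fall back to plain similarity ranking.
        return doc.similarity
    quality = sum(w * doc.quality.get(name, 0.0) for name, w in weights.items()) / total
    return (1 - alpha) * doc.similarity + alpha * quality


def rank(docs, weights, alpha=0.5):
    """Return documents sorted by combined score, best first."""
    return sorted(docs, key=lambda d: combined_score(d, weights, alpha), reverse=True)
```

With `alpha=0` the ordering reduces to pure similarity ranking, which makes it easy to compare the two orderings on the same result list.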
Ontology-Based Web Site Mapping for Information Exploration
In Proceedings of the 8th International Conference on Information and Knowledge Management (CIKM), 1999
Cited by 23 (5 self)
Centralized search requires that the whole collection reside at a single site. This burdens both the site's storage and the network traffic near the site, so the search process needs to be distributed. Recently, more and more Web sites provide the ability to search their local collection of Web pages. Query brokering systems are used to direct queries to promising sites and merge the results from these sites. Creating meta-information about the sites plays an important role in such systems. In this article, we introduce an ontology-based web site mapping method used to produce conceptual meta-information, the Vector Space approach, and present a series of experiments comparing it with the Naive Bayes approach. We found that the Vector Space approach produces better accuracy in ontology-based web site mapping. Keywords: Distributed collections, information brokers, text categorization, IR agents. 1. INTRODUCTION The World Wide Web (WWW)...
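The abstract above names a Vector Space approach for mapping a site to ontology concepts but does not describe its details. As a rough sketch under stated assumptions (raw term-frequency vectors, cosine similarity, and whitespace tokenization are choices made here for illustration; `tf_vector`, `cosine`, and `map_site` are hypothetical names), such a mapping might be written as:

```python
import math
from collections import Counter


def tf_vector(text):
    """Build a raw term-frequency vector from whitespace-separated tokens."""
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def map_site(site_text, category_texts):
    """Return the best-matching category name and all category scores.

    category_texts maps a category name to sample text for that category;
    both the structure and the scoring are assumptions for this sketch.
    """
    site_vec = tf_vector(site_text)
    scores = {name: cosine(site_vec, tf_vector(text))
              for name, text in category_texts.items()}
    return max(scores, key=scores.get), scores
```

A query broker could then store the top-scoring categories for each site as its conceptual meta-information and consult them when choosing sites for a query.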

