Query Type Classification for Web Document Retrieval (2003)

by In-Ho Kang
In Proceedings of the 26th annual international ACM SIGIR conference on Research and Development in Information Retrieval

Abstract:

The heterogeneous Web exacerbates IR problems, and short user queries make them worse. The contents of web documents are not enough to find good answer documents. Link information and URL information compensate for the insufficiencies of content information. However, a static combination of multiple evidences may lower retrieval performance. We need different strategies to find target documents according to the query type. We can classify user queries into three categories: the topic relevance task, the homepage finding task, and the service finding task. In this paper, a user query classification scheme is proposed. This scheme uses the difference of distribution, mutual information, the usage rate as anchor texts, and the POS information for the classification. After classifying a user query, we apply different algorithms and information to obtain better results. For the topic relevance task, we emphasize the content information; for the homepage finding task, on the other hand, we emphasize the link information and the URL information. We obtained the best performance when our proposed classification method was used with the OKAPI scoring algorithm.
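
The idea of weighting evidence by query type, rather than using one static combination, can be illustrated with a minimal sketch. This is not the paper's implementation: the linear combination, the weight values, the `Evidence` fields, and the function names are all assumptions made for illustration, and the query type is taken as already known. The service finding task is left out because the abstract does not say which evidence it emphasizes.

```python
from dataclasses import dataclass

# Hypothetical weights for illustration only; the abstract gives no values.
# Topic relevance leans on content; homepage finding leans on link and URL evidence.
WEIGHTS = {
    "topic_relevance": {"content": 0.8, "link": 0.1, "url": 0.1},
    "homepage_finding": {"content": 0.4, "link": 0.3, "url": 0.3},
}


@dataclass
class Evidence:
    """Per-document scores, each assumed to be normalized to [0, 1]."""
    content: float  # e.g. a text-matching score
    link: float     # e.g. a link-based score
    url: float      # e.g. a URL-form prior


def combined_score(ev: Evidence, query_type: str) -> float:
    """Linear combination of the evidence, with weights chosen by query type."""
    w = WEIGHTS[query_type]
    return w["content"] * ev.content + w["link"] * ev.link + w["url"] * ev.url


def rank(docs: dict, query_type: str) -> list:
    """Return document ids sorted by combined score, best first."""
    return sorted(docs, key=lambda d: combined_score(docs[d], query_type), reverse=True)


# Two made-up documents: a content-rich article and a well-linked site entry page.
docs = {
    "article": Evidence(content=0.9, link=0.1, url=0.1),
    "homepage": Evidence(content=0.5, link=0.9, url=0.9),
}
topic_ranking = rank(docs, "topic_relevance")
homepage_ranking = rank(docs, "homepage_finding")
```

With these made-up scores, the same two documents swap places depending on the query type, which is the effect a static combination cannot produce.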

Citations

1839 The Anatomy of a Large-Scale Hypertextual Web Search Engine – Brin, Page - 1998
1439 Modern Information Retrieval – Baeza-Yates, Ribeiro - 1999
1064 The PageRank Citation Ranking: Bringing Order to the Web – Page, Brin, et al. - 1999
311 Information theory and statistical mechanics – Jaynes - 1957
260 Okapi at TREC-3 – Robertson, Walker, et al. - 1992
206 Combination of multiple searches – Fox, Shaw - 1994
161 A taxonomy of Web search – Broder
127 Analyses of multiple evidence combination – Lee - 1997
69 Experiments using the lemur toolkit – Ogilvie, Callan - 2001
54 Combining approaches to information retrieval – Croft - 2000
54 Overview of the TREC-2001 web track – Hawking, Craswell - 2002
50 Engineering a multi-purpose test collection for web retrieval experiments – Bailey, Craswell, et al.
25 Retrieving web pages using content, links, urls and anchors – Westerveld, Kraaij, et al. - 2002
18 Language models for relevance feedback – Ponte - 2000
14 Combining text- and link-based retrieval methods for Web IR – Yang - 2002
1 research collections - trec web track. www.ted.cmis.csiro.au/TRECWeb – Web - 2001