Results 1 -
2 of
2
Learning to cluster web search results
- In Proc. of SIGIR ’04
, 2004
"... In web search, surfers are often faced with the problem of selecting their most wanted information from the potential huge amount of search results. The clustering of web search results is the possible solution, but the traditional content based clustering is not sufficient since it ignores many uni ..."
Abstract
-
Cited by 93 (6 self)
- Add to MetaCart
In web search, surfers are often faced with the problem of selecting their most wanted information from the potential huge amount of search results. The clustering of web search results is the possible solution, but the traditional content based clustering is not sufficient since it ignores many unique features of web pages. The link structure, authority, quality, or trustfulness of search results can play even the higher role than the actual contents of the web pages in clustering. These possible extents are reflected by Google's PageRank algorithm, HITS algorithm and etc. The main goal of this project is to integrate the authoritative information such as PageRank, link structure (e.g. in-links and out-links) into the K-Means clustering of web search results. The PageRank, inlinks and out-links can be used to extend the vector representation of web pages, and the PageRank can also be considered in the initial centroids selection, or the web page with higher PageRank influences the centroid computation to a higher degree. The relevance of this modified K-Means clustering algorithm needs to be compared to the ones obtained by the content based K-Means clustering, and the effects of different authoritative information also needs to be analyzed.
Clustering Web Search Results with Maximum Spanning Trees
"... Abstract. We present a novel method for clustering Web search results based on Word Sense Induction. First, we acquire the meanings of a query by means of a graph-based clustering algorithm that calculates the maximum spanning tree of the co-occurrence graph of the query. Then we cluster the search ..."
Abstract
-
Cited by 1 (0 self)
- Add to MetaCart
Abstract. We present a novel method for clustering Web search results based on Word Sense Induction. First, we acquire the meanings of a query by means of a graph-based clustering algorithm that calculates the maximum spanning tree of the co-occurrence graph of the query. Then we cluster the search results based on their semantic similarity to the induced word senses. We show that our approach improves classical search result clustering methods in terms of both clustering quality and degree of diversification. 1

