Criterion Functions for Document Clustering: Experiments and Analysis (2002)
Citations: 107 (4 self)
BibTeX
@TECHREPORT{Zhao02criterionfunctions,
author = {Ying Zhao and George Karypis},
title = {Criterion Functions for Document Clustering: Experiments and Analysis},
institution = {},
year = {2002}
}
Abstract
In recent years, we have witnessed a tremendous growth in the volume of text documents available on the Internet, digital libraries, news sources, and company-wide intranets. This has led to an increased interest in developing methods that can help users effectively navigate, summarize, and organize this information, with the ultimate goal of helping them find what they are looking for. Fast and high-quality document clustering algorithms play an important role toward this goal, as they have been shown both to provide an intuitive navigation/browsing mechanism by organizing large amounts of information into a small number of meaningful clusters and to greatly improve retrieval performance via cluster-driven dimensionality reduction, term-weighting, or query expansion. This ever-increasing importance of document clustering and the expanded range of its applications have led to the development of a number of novel algorithms with different complexity-quality trade-offs. Among them, a class of clustering algorithms with relatively low computational requirements treats the clustering problem as an optimization process that seeks to maximize or minimize a particular clustering criterion function defined over the entire clustering solution.







