A divisive information-theoretic feature clustering algorithm for text classification (2003)

by Inderjit S. Dhillon, Subramanyam Mallela, Rahul Kumar
Venue: Journal of Machine Learning Research
Citations: 138 (15 self)

BibTeX

@ARTICLE{Dhillon03adivisive,
    author = {Inderjit S. Dhillon and Subramanyam Mallela and Rahul Kumar},
    title = {A divisive information-theoretic feature clustering algorithm for text classification},
    journal = {Journal of Machine Learning Research},
    year = {2003},
    volume = {3},
    pages = {1265--1287}
}

Abstract

High dimensionality of text can be a deterrent in applying complex learners such as Support Vector Machines to the task of text classification. Feature clustering is a powerful alternative to feature selection for reducing the dimensionality of text data. In this paper we propose a new information-theoretic divisive algorithm for feature/word clustering and apply it to text classification. Existing techniques for such “distributional clustering” of words are agglomerative in nature and result in (i) sub-optimal word clusters and (ii) high computational cost. In order to explicitly capture the optimality of word clusters in an information-theoretic framework, we first derive a global criterion for feature clustering. We then present a fast, divisive algorithm that monotonically decreases this objective function value. We show that our algorithm minimizes the “within-cluster Jensen-Shannon divergence” while simultaneously maximizing the “between-cluster Jensen-Shannon divergence”. In comparison to the previously proposed agglomerative strategies, our divisive algorithm is much faster and achieves comparable or higher classification accuracies. We further show that feature clustering is an effective technique for building smaller class models in hierarchical classification. We present detailed experimental results using Naive Bayes and Support Vector Machines on the 20 Newsgroups data set and a 3-level hierarchy of HTML documents collected from the Open Directory project (www.dmoz.org).
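
The abstract outlines a k-means-style divisive iteration over word distributions: each word w is represented by its class distribution p(C|w) together with a word prior pi(w), and the algorithm alternates between assigning every word to the cluster whose representative distribution is closest in KL divergence and recomputing each representative as the prior-weighted mean of its members. Below is a minimal Python/NumPy sketch of that iteration; the names (divisive_cluster, word_class_probs, priors) and the random initialization are illustrative assumptions, not the authors' implementation.

import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) along the last axis, clipped to avoid log(0)."""
    p = np.clip(p, eps, None)
    q = np.clip(q, eps, None)
    return np.sum(p * np.log(p / q), axis=-1)

def divisive_cluster(word_class_probs, priors, k, n_iters=50, seed=0):
    """k-means-style divisive clustering of word distributions.

    word_class_probs : (n_words, n_classes) array; row w holds p(C|w)
    priors           : (n_words,) array of word priors pi(w)
    Returns an (n_words,) array of cluster assignments.
    """
    rng = np.random.default_rng(seed)
    n_words, n_classes = word_class_probs.shape
    assign = rng.integers(k, size=n_words)        # illustrative random init
    for _ in range(n_iters):
        # Update step: each cluster representative is the prior-weighted
        # mean of its members' class distributions.
        means = np.empty((k, n_classes))
        for j in range(k):
            members = np.flatnonzero(assign == j)
            if members.size == 0:                 # re-seed an empty cluster
                members = rng.integers(n_words, size=1)
            w = priors[members][:, None]
            means[j] = (w * word_class_probs[members]).sum(0) / w.sum()
        # Assignment step: move each word to the cluster whose representative
        # minimizes KL(p(C|w) || p(C|cluster_j)).
        dists = np.stack([kl_divergence(word_class_probs, means[j])
                          for j in range(k)], axis=1)
        new_assign = dists.argmin(axis=1)
        if np.array_equal(new_assign, assign):    # no change: converged
            break
        assign = new_assign
    return assign

Both steps can only lower the prior-weighted within-cluster divergence, which is the monotone decrease the abstract claims; in practice p(C|w) and pi(w) would be estimated from training-set co-occurrence counts.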

Keyphrases

text classification, divisive information-theoretic feature, feature clustering, support vector machine, divisive algorithm, html document, within-cluster jensen-shannon divergence, hierarchical classification, information-theoretic framework, naive bayes, text data, global criterion, 3-level hierarchy, effective technique, high dimensionality, word cluster, complex learner, new information-theoretic divisive algorithm, detailed experimental result, open directory project, high computational cost, objective function value, sub-optimal word cluster, between-cluster jensen-shannon divergence, distributional clustering, classification accuracy, class model, data set, feature word clustering, agglomerative strategy
