Text Categorization with Support Vector Machines: Learning with Many Relevant Features (1998) [1053 citations — 10 self]

Abstract:

This paper explores the use of Support Vector Machines (SVMs) for learning text classifiers from examples. It analyzes the particular properties of learning with text data and identifies why SVMs are appropriate for this task. Empirical results support the theoretical findings. SVMs achieve substantial improvements over the currently best performing methods and behave robustly over a variety of different learning tasks. Furthermore, they are fully automatic, eliminating the need for manual parameter tuning.

Citations

5044 Statistical Learning Theory – Vapnik - 1998
3356 C4.5: Programs for Machine Learning – Quinlan - 1993
1091 Support-vector networks – Cortes, Vapnik - 1995
915 Term-weighting approaches in automatic text retrieval – Salton, Buckley - 1988
594 Relevance feedback in information retrieval – Rocchio - 1971
565 A comparative study on feature selection in text categorization – Yang, Pedersen - 1997
346 An evaluation of statistical approaches to text categorization – Yang - 1999
250 A probabilistic analysis of the Rocchio algorithm with TFIDF for text categorization. ICML-97 – Joachims - 1997
49 The perceptron algorithm vs. Winnow: Linear vs. logarithmic mistake bounds when few input variables are relevant – Kivinen, Warmuth, et al. - 1997
18 Using corpus statistics to remove redundant words in text categorization – Yang, Wilbur - 1996