This work focuses on algorithms that learn from examples to perform multiclass text and speech categorization tasks. Our approach is based on a new and improved family of boosting algorithms. We describe in detail an implementation, called BoosTexter, of these boosting algorithms for text categorization tasks. We present results comparing the performance of BoosTexter with that of a number of other text-categorization algorithms on a variety of tasks. We conclude by describing the application of our system to automatic call-type identification from unconstrained spoken customer responses.