A maximal figure-of-merit (MFoM)-learning approach to robust classifier design for text categorization
ACM Transactions on Information Systems, 2006
Cited by 4 (1 self)
We propose a maximal figure-of-merit (MFoM) learning approach for robust classifier design, which directly optimizes the performance metrics of interest for different target classifiers. The proposed approach embeds the decision functions of the classifiers and the performance metrics into the overall training objective. It learns the classifier parameters in a decision-feedback manner that effectively takes both positive and negative training samples into account, thereby reducing the required amount of positive training data. It has three desirable properties: (a) the learning is oriented toward a performance metric; (b) the optimized metric is consistent across the training and evaluation sets; and (c) it is more robust and less sensitive to data variation, and can handle scenarios with insufficient training data. We evaluate it on the text categorization task using the Reuters-21578 dataset. Training an F1-based binary tree classifier using MFoM, we observed significantly improved performance and enhanced robustness compared to the baseline and SVM, especially on categories with insufficient training samples. The generality of the approach for designing other metric-based classifiers is also demonstrated by comparing the precision-, recall-, and F1-based classifiers. The results clearly show the consistency in performance
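The idea of embedding a performance metric into the training objective can be illustrated with a smoothed F1 score. The sketch below is only an illustration under stated assumptions, not the formulation used in the paper: it assumes a linear decision function, replaces hard decisions with a sigmoid so that the true-positive, false-positive, and false-negative counts become differentiable, and uses hypothetical names (`soft_f1`, `sharpness`) that do not come from the abstract.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_f1(weights, bias, samples, labels, sharpness=1.0):
    """Smoothed F1 of a linear classifier (illustrative assumption).

    Each sample contributes a soft decision in (0, 1) instead of a hard
    0/1 decision, so the score varies smoothly with the parameters and
    could be maximized with a gradient-based optimizer.
    """
    tp = fp = fn = 0.0
    for x, y in zip(samples, labels):
        score = sum(w * xi for w, xi in zip(weights, x)) + bias
        p = sigmoid(sharpness * score)  # soft "predicted positive"
        if y == 1:
            tp += p          # soft true positive
            fn += 1.0 - p    # soft false negative
        else:
            fp += p          # soft false positive
    denom = 2.0 * tp + fp + fn
    return 2.0 * tp / denom if denom > 0 else 0.0


# Toy one-dimensional data: positive samples have positive feature values.
samples = [[1.0], [2.0], [-1.0], [-2.0]]
labels = [1, 1, 0, 0]
good = soft_f1([5.0], 0.0, samples, labels)   # separates the classes
bad = soft_f1([-5.0], 0.0, samples, labels)   # inverts the classes
```

Because both positive and negative samples enter the score (through the false-positive and false-negative terms), every training sample influences the objective, which is in the spirit of the decision-feedback description above.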
Random-Walk Term Weighting for Improved Text Classification
In Proceedings of TextGraphs: 2nd Workshop on Graph Based Methods for Natural Language Processing. ACL, 2006
Cited by 2 (0 self)
This paper describes a new approach for estimating term weights in a text classification task. The approach uses term co-occurrence as a measure of dependency between word features. A random-walk model is applied to a graph encoding words and co-occurrence dependencies, resulting in scores that quantify how much a particular word feature contributes to a given context. We argue that by modeling feature weights using these scores, as opposed to the traditional frequency-based scores, we can achieve better results in a text classification task. Experiments performed on four standard classification datasets show that the new random-walk-based approach outperforms the traditional term-frequency approach to feature weighting.
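A random walk over a co-occurrence graph can be sketched as follows. This is a minimal illustration under assumptions the abstract does not state: words are linked when they appear within a fixed window of each other, the graph is undirected and unweighted, and the walk is a PageRank-style iteration with a damping factor of 0.85. The function names and parameters are hypothetical.

```python
from collections import defaultdict


def cooccurrence_graph(tokens, window=2):
    """Link each token to the distinct tokens that follow it within `window` positions."""
    neighbors = defaultdict(set)
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + 1 + window]:
            if other != word:
                neighbors[word].add(other)
                neighbors[other].add(word)
    return neighbors


def random_walk_scores(neighbors, damping=0.85, iterations=50):
    """PageRank-style scores on an undirected, unweighted graph."""
    nodes = list(neighbors)
    n = len(nodes)
    scores = {w: 1.0 / n for w in nodes}
    for _ in range(iterations):
        updated = {}
        for w in nodes:
            # Each neighbor passes on an equal share of its current score.
            incoming = sum(scores[v] / len(neighbors[v]) for v in neighbors[w])
            updated[w] = (1.0 - damping) / n + damping * incoming
        scores = updated
    return scores


tokens = "the cat sat on the mat".split()
graph = cooccurrence_graph(tokens, window=2)
scores = random_walk_scores(graph)
```

The resulting scores could then stand in for raw term frequency when building feature vectors; words that co-occur with many other words in the document receive higher weights than words with few connections.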

