Scalable Discriminative Learning for Natural Language Parsing and Translation
In Proceedings of the 2006 Neural Information Processing Systems (NIPS), 2006
Cited by 17 (1 self)
Parsing and translating natural languages can be viewed as problems of predicting tree structures. For machine learning approaches to these predictions, the diversity and high dimensionality of the structures involved mandate very large training sets. This paper presents a purely discriminative learning method that scales up well to problems of this size. Its accuracy was at least as good as other comparable methods on a standard parsing task. To our knowledge, it is the first purely discriminative learning algorithm for translation with tree-structured models. Unlike other popular methods, this method does not require a great deal of feature engineering a priori, because it performs feature selection over a compound feature space as it learns. Experiments demonstrate the method's versatility, accuracy, and efficiency. Relevant software is freely available at
Carefully Approximated Bayes Factors for Feature Selection in MaxEnt Models
2004
Feature selection is essentially a model selection problem. If we take a frequentist maximum likelihood approach, we will, in the limit, select all features
All-Topology, Semi-Abstract Syntactic Features for Text Categorisation
Good performance on Text Classification (TC) tasks depends on effective and statistically significant features. Typically, the simple bag-of-words representation is widely used because unigram counts are more likely to be significant than more compound features. This research explores the idea that the major cause of poor performance of some complex features is sparsity. Syntactic features are usually complex, being made up of both lexical and syntactic information. This paper introduces the use of a class of automatically extractable syntactic features to the TC task. These features are based on subtrees of parse trees. As such, a large number of these features are generated. Our results suggest that generating a diverse set of these features may help in increasing performance. Partial abstraction of the features also seems to boost performance by counteracting sparsity. We will show that various subsets of our syntactic features do outperform the bag-of-words representation alone.

