Results 1 - 10
of
67
Feature-Rich Part-of-Speech Tagging with a Cyclic Dependency Network
- IN PROCEEDINGS OF HLT-NAACL
, 2003
"... We present a new part-of-speech tagger that demonstrates the following ideas: (i) explicit use of both preceding and following tag contexts via a dependency network representation, (ii) broad use of lexical features, including jointly conditioning on multiple consecutive words, (iii) effective ..."
Abstract
-
Cited by 181 (12 self)
- Add to MetaCart
We present a new part-of-speech tagger that demonstrates the following ideas: (i) explicit use of both preceding and following tag contexts via a dependency network representation, (ii) broad use of lexical features, including jointly conditioning on multiple consecutive words, (iii) effective use of priors in conditional loglinear models, and (iv) fine-grained modeling of unknown word features. Using these ideas together, the resulting tagger gives a 97.24% accuracy on the Penn Treebank WSJ, an error reduction of 4.4% on the best previous single automatically learned tagging result.
Scaling to Very Very Large Corpora for Natural Language Disambiguation
, 2001
"... The amount of readily available online text has reached hundreds of billions of words and continues to grow. Yet for most core natural language tasks, algorithms continue to be optimized, tested and compared after training on corpora consisting of only one million words or less. In this pape ..."
Abstract
-
Cited by 82 (3 self)
- Add to MetaCart
The amount of readily available online text has reached hundreds of billions of words and continues to grow. Yet for most core natural language tasks, algorithms continue to be optimized, tested and compared after training on corpora consisting of only one million words or less. In this paper, we evaluate the performance of different learning methods on a prototypical natural language disambiguation task, confusion set disambiguation, when trained on orders of magnitude more labeled data than has previously been used. We are fortunate that for this particular application, correctly labeled training data is free. Since this will often not be the case, we examine methods for effectively exploiting very large corpora when labeled data comes at a cost.
Boosting Applied to Tagging and PP Attachment
- In Proceedings of the Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora
, 1999
"... Boosting is a machine learning algorithm that is not well known in computational linguistics. We apply it to part-of-speech tagging and prepositional phrase attachment. Performance is very encouraging. We also show how to improve data quality by using boosting to identify annotation errors. ..."
Abstract
-
Cited by 62 (6 self)
- Add to MetaCart
Boosting is a machine learning algorithm that is not well known in computational linguistics. We apply it to part-of-speech tagging and prepositional phrase attachment. Performance is very encouraging. We also show how to improve data quality by using boosting to identify annotation errors.
Combining Independent Modules to Solve Multiple-choice Synonym and Analogy Problems
- In Proceedings of the International Conference on Recent Advances in Natural Language Processing
, 2003
"... Existing statistical approaches to natural language problems are very coarse approximations to the true complexity of language processing. ..."
Abstract
-
Cited by 40 (8 self)
- Add to MetaCart
Existing statistical approaches to natural language problems are very coarse approximations to the true complexity of language processing.
SVMTool: A general POS tagger generator based on Support Vector Machines
, 2004
"... This report presents the svmtool , a simple, flexible, effective and efficient part--of--speech tagger based on Support Vector Machines. The svmtool offers a fairly good balance among these properties which make it really practical for current NLP applications. It is very easy to use and easily c ..."
Abstract
-
Cited by 34 (0 self)
- Add to MetaCart
This report presents the svmtool , a simple, flexible, effective and efficient part--of--speech tagger based on Support Vector Machines. The svmtool offers a fairly good balance among these properties which make it really practical for current NLP applications. It is very easy to use and easily configurable so as to perfectly fit the needs of a number of different applications. Results are also very competitive, achieving an accuracy of 97.16% for English on the Wall Street Journal corpus. It has been also successfully applied to Spanish and Catalan exhibiting a similar performance. A first release of the svmtool Perl prototype is now freely available for public use. A more efficient C++ version is coming very soon, by summer 2004.
A Simple Approach to Building Ensembles of Naive Bayesian Classifiers for Word Sense Disambiguation
, 2000
"... This paper presents a corpus-based approach to word sense disambiguation that builds an ensemble of Naive Bayesian classifiers, each of which is based on lexical features that represent co-occurring words in varying sized windows of context. Despite the simplicity of this approach, empirical results ..."
Abstract
-
Cited by 31 (1 self)
- Add to MetaCart
This paper presents a corpus-based approach to word sense disambiguation that builds an ensemble of Naive Bayesian classifiers, each of which is based on lexical features that represent co-occurring words in varying sized windows of context. Despite the simplicity of this approach, empirical results disambiguating the widely studied nouns line and interest show that such an ensemble achieves accuracy rivaling the best previously published results.
Combining Classifiers for Word Sense Disambiguation
"... Classifier combination is an effective and broadly useful method of improving system performance. This article investigates in depth a large number of both well-established and novel classifier combination approaches for the word sense disambiguation task, studied over a diverse classifier pool whic ..."
Abstract
-
Cited by 27 (2 self)
- Add to MetaCart
Classifier combination is an effective and broadly useful method of improving system performance. This article investigates in depth a large number of both well-established and novel classifier combination approaches for the word sense disambiguation task, studied over a diverse classifier pool which includes feature-enhanced Naïve Bayes, Cosine, Decision List, Transformation-based Learning and MMVC classifiers. Each classifier has access to the same rich feature space, comprised of distance weighted bag-of-lemmas, local ngram context and specific syntactic relations, such as Verb-Object and Noun-Modifier.
Language Independent, Minimally Supervised Induction of Lexical Probabilities
- In Proceedings of ACL-2000
, 2000
"... A central problem in part-of-speech tagging, especially for new languages for which limited annotated resources are available, is estimating the distribution of lexical probabilities for unknown words. This paper introduces a new paradigmatic similarity measure and presents a minimally supervised le ..."
Abstract
-
Cited by 26 (7 self)
- Add to MetaCart
A central problem in part-of-speech tagging, especially for new languages for which limited annotated resources are available, is estimating the distribution of lexical probabilities for unknown words. This paper introduces a new paradigmatic similarity measure and presents a minimally supervised learning approach combining effective selection and weighting methods based on paradigmatic and contextual similarity measures populated from large quantities of inexpensive raw text data. This approach is highly language independent and requires no modification to the algorithm or implementation to shift between languages such as French and English.
A Multi-Strategy Approach to Improving Pronunciation by Analogy
"... Pronunciation by analogy (PbA) is a data-driven method for relating letters to sound, with potential application to next-generation text-to-speech systems. This paper extends previous work on PbA in several directions. First, we have included `full' pattern matching between input letter string and d ..."
Abstract
-
Cited by 25 (3 self)
- Add to MetaCart
Pronunciation by analogy (PbA) is a data-driven method for relating letters to sound, with potential application to next-generation text-to-speech systems. This paper extends previous work on PbA in several directions. First, we have included `full' pattern matching between input letter string and dictionary entries, as well as including lexical stress in letter-to-phoneme conversion. Second, we have extended the method to phonemeto -letter conversion. Third, and most important, we have experimented with multiple, different strategies for scoring the candidate pronunciations. Individual scores for each strategy are obtained on the basis of rank and either multiplied or summed to produce a final, overall score. Five strategies have been studied and results obtained from all 31 possible combinations. The two combination methods perform comparably, with the product rule only very marginally superior to the sum rule. Nonparametric statistical analysis reveals that performance improves as more strategies are included in the combination: this trend is very highly significant ( p 0 0005). Accordingly for letter-to-phoneme conversion, best results are obtained when all five strategies are combined: word accuracy is raised to 65.5% relative to 61.7% for our best previous result and 63.0% for the best-performing single strategy. These improvements are very highly significant ( p 0 and p 0 00011 respectively). Similar results were found for phoneme-to-letter and letter-to-stress conversion, although the former was an easier problem for PbA than letter-to-phoneme conversion and the latter was harder. The main sources of error for the multi-strategy approach are very similar to those for the best single strategy, and mostly involve vowel letters and phonemes. 1
Morphological Tagging: Data vs. Dictionaries
, 2000
"... Part of Speech tagging for English seems to have reached the the human levels of error, but full morphological tagging for inflectionally rich languages, such as Romanian, Czech, or Hungarian, is still an open problem, and the results are far from being satisfactory. This paper presents results obta ..."
Abstract
-
Cited by 22 (1 self)
- Add to MetaCart
Part of Speech tagging for English seems to have reached the the human levels of error, but full morphological tagging for inflectionally rich languages, such as Romanian, Czech, or Hungarian, is still an open problem, and the results are far from being satisfactory. This paper presents results obtained by using a universalized exponential feature-based model for five such languages. It focuses on the data sparseness issue, which is especially severe for such languages (the more so that there are no extensive annotated data for those languages). In conclusion, we argue strongly that the use of an independent morphological dictionary is the preferred choice to more annotated data under such circumstances.

