@MISC{Søgaard_ensemble-basedpos, author = {Anders Søgaard}, title = {Ensemble-based POS tagging of Italian}, year = {} }
Abstract
Simple learning algorithms are used to learn which labels to assign to words, based on the predictions of two non-optimized part-of-speech taggers, a tagger with a different tag set and, possibly, an earlier run of the classifier. Our accuracy on the tagged La Repubblica corpus of Italian newspaper articles is 3.6% higher than that of our best input tagger in general, and ∼16.4% higher on unknown words. The results are generally not competitive because the input taggers are not optimized. In Sect. 3, we show (i) how results can be improved with the current input taggers (by ∼0.7% and ∼2.3%, respectively), and (ii) how more competitive results can be obtained with stronger input taggers.
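
The general idea of learning a label from the predictions of several input taggers can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract does not say which learning algorithms were used, so the sketch substitutes a plain lookup-table learner that memorizes the most frequent gold tag for each combination of input predictions. The function names `train_combiner` and `combine`, the tag names, and the fallback to the first tagger are all hypothetical.

```python
from collections import Counter, defaultdict


def train_combiner(rows):
    """Learn a lookup table from (predictions, gold_tag) pairs.

    `predictions` is a tuple holding one predicted tag per input tagger.
    Assumption: a memorization-based learner stands in for the unspecified
    "simple learning algorithms" of the abstract.
    """
    counts = defaultdict(Counter)
    for predictions, gold in rows:
        counts[tuple(predictions)][gold] += 1
    # For each observed combination, keep the most frequent gold tag.
    return {combo: c.most_common(1)[0][0] for combo, c in counts.items()}


def combine(table, predictions, fallback_index=0):
    """Return the learned tag, or one input tagger's prediction if unseen."""
    predictions = tuple(predictions)
    return table.get(predictions, predictions[fallback_index])


# Hypothetical toy data: two taggers sharing the target tag set,
# plus a third tagger that uses a different (coarser) tag set.
training = [
    (("NOUN", "VERB", "N"), "NOUN"),
    (("NOUN", "VERB", "N"), "NOUN"),
    (("NOUN", "VERB", "N"), "VERB"),
    (("ADJ", "ADJ", "A"), "ADJ"),
]
table = train_combiner(training)
print(combine(table, ("NOUN", "VERB", "N")))  # combination seen in training
print(combine(table, ("DET", "PRON", "D")))   # unseen: falls back to tagger 0
```

Falling back to a tagger that already uses the target tag set keeps the output well-formed for unseen combinations. Feeding an earlier run of the combiner back in, as the abstract mentions, would amount to adding its output as one more element of each `predictions` tuple.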