Arabic Tokenization, Part-of-Speech Tagging and Morphological Disambiguation in One Fell Swoop
We present an approach to using a morphological analyzer for tokenizing and morphologically tagging (including part-of-speech tagging) Arabic words in one process. We learn classifiers for individual morphological features, as well as ways of using these classifiers to choose among entries from the output of the analyzer. We obtain accuracy rates on all tasks in the high nineties.
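As a rough illustration of the selection step, the sketch below picks, from a list of candidate analyses, the one that agrees with the largest number of per-feature classifier predictions. The abstract does not say how the classifiers are combined, so the plain agreement count is only an assumption. The feature names (`pos`, `gender`, `number`, `clitic`), their values, and the function `choose_analysis` are hypothetical and are not taken from the analyzer or from the system described here.

```python
from typing import Dict, List

# One candidate analysis: a mapping from morphological feature name to value.
Analysis = Dict[str, str]


def choose_analysis(candidates: List[Analysis], predicted: Dict[str, str]) -> Analysis:
    """Return the candidate that agrees with the most per-feature predictions.

    Ties are resolved in favor of the earliest candidate (the behavior of max()).
    Assumption: every feature counts equally; a real system might weight them.
    """
    def agreement(analysis: Analysis) -> int:
        return sum(1 for feat, val in predicted.items() if analysis.get(feat) == val)

    return max(candidates, key=agreement)


# Hypothetical analyzer output for one word (illustrative values only).
candidates = [
    {"pos": "N", "gender": "M", "number": "SG", "clitic": "none"},
    {"pos": "V", "gender": "M", "number": "SG", "clitic": "none"},
    {"pos": "N", "gender": "F", "number": "PL", "clitic": "conj"},
]

# Hypothetical outputs of the individual feature classifiers for the same word.
predicted = {"pos": "N", "gender": "M", "number": "SG", "clitic": "none"}

best = choose_analysis(candidates, predicted)
```

Because the chosen analysis fixes the clitic segmentation, the part of speech, and the remaining features at once, a single selection of this kind yields tokenization and tagging together.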