Results 1 - 4 of 4

LEARNING POS TAGGING FROM A TAGGED MACEDONIAN TEXT CORPUS

by Viktor Vojnovski, Sašo Džeroski, Tomaž Erjavec
"... This paper presents several new linguistic resources for the Macedonian language, in particular a language corpus consisting of the digitized and annotated Orwell's “1984” in the Macedonian translation. The produced resources (morphosyntactic specification, lexicon, and corpus) are compatible ..."
Cited by 3 (1 self)

East meets West: Producing Multilingual Resources in a European Context

by Tomaž Erjavec, Ann Lawson, Laurent Romary - First International Language Resources and Evaluation Conference, 1998
"... The EU concerted action TELRI has released a two-volume CD-ROM, which contains multilingual language resources, namely corpora, lexica, and tools for language engineering. This CD-ROM provides harmonised resources for unprecedented numbers and kinds of languages, mainly from non-EU countri ..."
Cited by 10 (4 self)
and tagged novel '1984' by George Orwell and accompanying lexica in seven languages. The paper presents the CD-ROM, the methods employed in its creation and its prospective uses.

Learning to Lemmatise Slovene Words

by Sašo Džeroski, Tomaž Erjavec - Cussens and S. Džeroski, Learning Language in Logic, Number 1925 in Lecture Notes in Artificial Intelligence, 2000
"... Automatic lemmatisation is a core application for many language processing tasks. In inflectionally rich languages, such as Slovene, assigning the correct lemma to each word in a running text is not trivial: nouns and adjectives, for instance, inflect for number and case, with a complex config ..."
Cited by 6 (1 self)
the word form given the correct morphosyntactic tag. A statistics-based trigram tagger is used to learn to perform morphosyntactic tagging and a first-order decision list learning system is used to learn rules for morphological analysis. The dataset used is the 90.000 word Slovene translation of Orwell's

Learning to Lemmatise Slovene Words

by unknown authors
"... Automatic lemmatisation is a core application for many language processing tasks. In inflectionally rich languages, such as Slovene, assigning the correct lemma to each word in a running text is not trivial: nouns and adjectives, for instance, inflect for number and case, with a complex co ..."
form given the correct morphosyntactic tag. A statistics-based trigram tagger is used to learn to perform morphosyntactic tagging and a first-order decision list learning system is used to learn rules for morphological analysis. The dataset used is the 90.000 word Slovene translation of Orwell’s ‘1984