Two-Level Morphology with Composition
- In Proceedings of the 14th International Conference on Computational Linguistics (COLING'92)
, 1992
Cited by 68 (7 self)
this paper are the following: (1) Lexical representations tend to be arbitrary. Because it is difficult to write and test two-level systems that map between pairs of radically dissimilar forms, lexical representations in existing two-level analyzers tend to stay close to the surface forms. This is not a problem for morphologically simple languages like English because, for most words, inflected forms are very similar to the canonical dictionary entry. Except for a small number of irregular verbs and nouns, it is not difficult to create a two-level description for English in which lexical forms coincide with the canonical citation forms found in a dictionary. However, current analyzers for morphologically more complex languages (Finnish and Russian, for example) are not as satisfying in this respect. In these systems, lexical forms typically contain diacritic markers and special symbols; they are not real words in the language. For example, in Finnish the lexical counterpart of otin ‘I took’ might be rendered as
Finite-state Constraints
, 1993
Cited by 38 (8 self)
This paper is a report on the application of finite-state methods to phonological and morphological analysis that has brought about spectacular progress in computational morphology over the last several years. We will review the fundamental theoretical work that underlies this progress and discuss its relevance for linguistics. The two central problems in morphology are word formation and morphological alternations.
Arabic Morphology Using Only Finite-State Operations
, 1998
Cited by 28 (2 self)
Finite-state morphology has been successful in the description and computational implementation of a wide variety of natural languages. However, the particular challenges of Arabic, and the limitations of some implementations of finite-state morphology, have led many researchers to believe that finite-state power was not sufficient to handle Arabic and other Semitic morphology. This paper illustrates how the morphotactics and the variation rules of Arabic have been described using only finite-state operations and how this approach has been implemented in a significant morphological analyzer/generator.
Arabic Morphological Analysis on the Internet
- Proceedings of the International Conference on Multi-Lingual Computing (Arabic
, 1998
Cited by 18 (3 self)
[Arabic, Morphology, Finite State] This paper describes a finite-state morphological analyzer of written Modern Standard Arabic words that is available for testing on the Internet at http://www.xrce.xerox.com/research/mltt/arabic. The system consists of the analyzer proper, running on a network server, and Java applets that run on the user's machine and render words in standard Arabic orthography both for input and output. An overview of the system is provided, including the history, finite-state technology, dictionary coverage and status. 1 Introduction In 1996, the Xerox Research Centre Europe produced a large morphological analyzer for Modern Standard Arabic, henceforth Arabic (Beesley, 1996). In 1997, the rules were rewritten to more reliably support generation, and a Java user interface was added to allow users to interact with the system via the Internet in standard Arabic orthography. The analyzer-generator is based on dictionaries from an earlier project at ALPNET (Beesley,...
Consonant Spreading in Arabic Stems
, 1998
Cited by 10 (2 self)
This paper examines the phenomenon of consonant spreading in Arabic stems. Each spreading involves a local surface copying of an underlying consonant, and, in certain phonological contexts, spreading alternates productively with consonant lengthening (or gemination). The morphophonemic triggers of spreading lie in the patterns or even in the roots themselves, and the combination of a spreading root and a spreading pattern causes a consonant to be copied multiple times. The interdigitation of Arabic stems and the realization of consonant spreading are formalized using finite-state morphotactics and variation rules, and this approach has been successfully implemented in a large-scale Arabic morphological analyzer which is available for testing on the Internet.
Computing with realizational morphology
- Computational Linguistics and Intelligent Text Processing
, 2003
Cited by 6 (2 self)
Abstract. The theory of realizational morphology presented by Stump in his influential book Inflectional Morphology (2001) describes the derivation of inflected surface forms from underlying lexical forms by means of ordered blocks of realization rules. The theory presents a rich formalism for expressing generalizations about phenomena commonly found in the morphological systems of natural languages. This paper demonstrates that, in spite of the apparent complexity of Stump’s formalism, the system as a whole is no more powerful than a collection of regular relations. Consequently, a Stump-style description of the morphology of a particular language such as Lingala or Bulgarian can be compiled into a finite-state transducer that maps the underlying lexical representations directly into the corresponding surface form or forms, and vice versa, yielding a single lexical transducer. For illustration we will present an explicit finite-state implementation of an analysis of Lingala based on Stump’s description and other sources.
A short history of two-level morphology
, 2001
Cited by 2 (0 self)
Twenty years ago morphological analysis of natural language was a challenge to computational linguists. Simple cut-and-paste programs could be and were written to analyze strings in particular languages, but there was no general language-independent method available. Furthermore, cut-and-paste programs for analysis were not reversible; they could not be used to generate words. Generative phonologists of that time described morphological alternations by means of ordered rewrite rules, but it was not understood how such rules could be used for analysis. This was the situation in the spring of 1981 when Kimmo Koskenniemi came to a conference on parsing that Lauri Karttunen had organized at the University of Texas at Austin. Also at the same conference were two Xerox researchers from Palo Alto, Ronald M. Kaplan and Martin Kay. The four Ks discovered that all of them were interested in and had been working on the problem of morphological analysis. Koskenniemi went on to Palo Alto to visit Kay and Kaplan at PARC. This was the beginning of Two-Level Morphology, the first general model in the history of computational linguistics for the analysis and generation of morphologically complex languages. The language-specific components, the lexicon and the rules, were combined with a runtime engine applicable to all languages. In this article we trace the development of the finite-state technology that Two-Level Morphology is based on. 1 The Origins Traditional phonological grammars, formalized in the 1960s by Noam Chomsky and Morris Halle (Chomsky and Halle, 1968), consisted of an ordered sequence of rewrite rules that converted abstract phonological representations into surface forms through a series of intermediate representations. Such rules have the general form x -> y / z _ w, where x, y, z, and w can be arbitrarily complex strings or feature-matrices.
In mathematical linguistics (Partee et al., 1993), such rules are called CONTEXT-SENSITIVE REWRITE RULES, and they are more powerful than regular expressions or context-free rewrite rules.
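The rule schema x -> y / z _ w quoted in this abstract can be illustrated with a minimal sketch. It assumes, for simplicity, that x, y, z, and w are plain literal strings (the abstract allows arbitrarily complex strings or feature matrices), and it uses regular-expression lookaround as a stand-in for the rule's context. The names `apply_rule`, `derive`, and `RULES`, and the two sample rules, are illustrative assumptions and are not taken from the listed papers.

```python
import re

def apply_rule(form, x, y, left="", right=""):
    """Rewrite x as y wherever x is preceded by `left` and followed by `right`.

    Corresponds to the schema x -> y / left _ right, restricted here to
    literal string contexts (an assumption made for this sketch).
    """
    pattern = ""
    if left:
        pattern += "(?<=" + re.escape(left) + ")"   # left context, not consumed
    pattern += re.escape(x)
    if right:
        pattern += "(?=" + re.escape(right) + ")"   # right context, not consumed
    # A function replacement avoids backslash interpretation in y.
    return re.sub(pattern, lambda match: y, form)

def derive(form, rules):
    """Apply an ordered sequence of rules, each to the output of the previous one."""
    for x, y, left, right in rules:
        form = apply_rule(form, x, y, left, right)
    return form

# Hypothetical ordered rules: N -> m / _ p, then p -> m / m _
RULES = [
    ("N", "m", "", "p"),
    ("p", "m", "m", ""),
]

print(derive("kaNpat", RULES))  # kammat (via the intermediate form kampat)
```

The order of the rules matters: the second rule applies only because the first has already produced an m before the p, which is the sense in which the abstract speaks of a series of intermediate representations.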
2002 Gradient constraints in finite state OT: The unidirectional and the bidirectional case
- Bilkent University
, 1998
Cited by 2 (0 self)
Optimality Theory (Prince and Smolensky 1993) has offered a novel unifying perspective on different branches of linguistic research. It was originally conceived as a form of synchronic grammatical description in the tradition of generative grammar, and the first domain of application was phonology. In the meantime optimality theoretic concepts
Twenty-Five years of Finite-State Morphology
- INQUIRIES INTO WORDS, CONSTRAINTS AND CONTEXTS. FESTSCHRIFT FOR KIMMO KOSKENNIEMI ON HIS 60TH BIRTHDAY
, 2005
Inducing Domain Theories
Cited by 1 (0 self)
This thesis presents a method for learning a domain theory automatically from a corpus of parsed sentences. What is meant by a ‘domain theory’ is a collection of facts and generalisations or rules which capture what commonly happens (or does not happen) in some domain of interest. As language users we implicitly draw on such theories in various disambiguation tasks, such as anaphora resolution and prepositional phrase attachment, and formal encodings of domain theories can be used for this purpose in natural language processing. Domain theories may also be objects of interest in their own right, that is, as the output of a knowledge discovery process, providing previously unobserved information to aid with the understanding of the domain. The learning paradigm employed is Inductive Logic Programming (ILP), which generalises over examples from the domain to obtain more general patterns covering the majority of the input instances. ILP was preferred over other machine learning techniques due to the expressive power of the language specifications guiding the search for general patterns and the fact that it allows the inclusion

