Two-Level Morphology with Composition
- In Proceedings of the 14th International Conference on Computational Linguistics (COLING'92), 1992
Cited by 68 (7 self)
this paper are the following: (1) Lexical representations tend to be arbitrary. Because it is difficult to write and test two-level systems that map between pairs of radically dissimilar forms, lexical representations in existing two-level analyzers tend to stay close to the surface forms. This is not a problem for morphologically simple languages like English because, for most words, inflected forms are very similar to the canonical dictionary entry. Except for a small number of irregular verbs and nouns, it is not difficult to create a two-level description for English in which lexical forms coincide with the canonical citation forms found in a dictionary. However, current analyzers for morphologically more complex languages (Finnish and Russian, for example) are not as satisfying in this respect. In these systems, lexical forms typically contain diacritic markers and special symbols; they are not real words in the language. For example, in Finnish the lexical counterpart of otin `I took' might be rendered as
Optimality theory and the generative complexity of constraint violability
- Computational Linguistics, 1998
Cited by 59 (2 self)
It has been argued that rule-based phonological descriptions can uniformly be expressed as mappings carried out by finite-state transducers, and therefore fall within the class of rational relations. If this property of generative capacity is an empirically correct characterization of phonological mappings, it should hold of any sufficiently restrictive theory of phonology, whether it utilizes constraints or rewrite rules. In this paper, we investigate the conditions under which the phonological descriptions that are possible within the view of constraint interaction embodied in Optimality Theory (Prince and Smolensky 1993) remain within the class of rational relations. We show that this is true when GEN is itself a rational relation, and each of the constraints distinguishes among finitely many regular sets of candidates.
Arabic Finite-State Morphological Analysis and Generation
- 1996
Cited by 50 (5 self)
This paper describes a large-scale system that performs morphological analysis and generation of on-line Arabic words represented in the standard orthography, whether fully voweled, partially voweled or unvoweled. Analyses display the root, pattern and all other affixes together with feature tags indicating part of speech, person, number, mood, voice, aspect, etc. The system is based on lexicons and rules from an earlier KIMMO-style two-level morphological system, reworked extensively using Xerox Finite-State Morphology tools. The result is an Arabic Finite-State Lexical Transducer that is applied with the same runtime code used for English, French, German, Spanish, Portuguese, Dutch and Italian lexical transducers.
The Proper Treatment of Optimality in Computational Phonology
- Bilkent University, 1998
Cited by 44 (5 self)
This paper presents a novel formalization of optimality theory. Unlike previous treatments of optimality in computational linguistics, starting with Ellison (1994), the new approach does not require any explicit marking and counting of constraint violations. It is based on the notion of "lenient composition", defined as the combination of ordinary composition and priority union. If an underlying form has outputs that can meet a given constraint, lenient composition enforces the constraint; if none of the output candidates meets the constraint, lenient composition allows all of them. For the sake of greater efficiency, we may "leniently compose" the GEN relation and all the constraints into a single finite-state transducer that maps each underlying form directly into its optimal surface realizations, and vice versa. Seen from this perspective, optimality theory is surprisingly similar to the two older strains of finite-state phonology: classical rewrite systems and two-level models. In particular, the ranking of optimality constraints corresponds to the ordering of rewrite rules.
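The definition of lenient composition in the abstract above can be illustrated with a minimal Python sketch. It is not the paper's implementation: it assumes relations are small finite sets of (underlying, surface) pairs rather than finite-state transducers, and it models a constraint as a plain predicate on surface strings. The names `restrict_outputs`, `priority_union`, `lenient_compose`, and the toy data `gen`, `no_b`, `no_c` are all hypothetical.

```python
from typing import Callable, Set, Tuple

# Assumption: a relation is a finite set of (underlying, surface) pairs.
Relation = Set[Tuple[str, str]]
Constraint = Callable[[str], bool]


def restrict_outputs(rel: Relation, constraint: Constraint) -> Relation:
    """Ordinary composition with a constraint, modeled here as keeping
    only the pairs whose surface form satisfies the predicate."""
    return {(u, s) for (u, s) in rel if constraint(s)}


def priority_union(q: Relation, r: Relation) -> Relation:
    """Everything in q, plus the pairs of r whose input q does not cover."""
    q_inputs = {u for (u, _) in q}
    return q | {(u, s) for (u, s) in r if u not in q_inputs}


def lenient_compose(rel: Relation, constraint: Constraint) -> Relation:
    """Enforce the constraint wherever some candidate meets it;
    otherwise keep all candidates for that underlying form."""
    return priority_union(restrict_outputs(rel, constraint), rel)


# Toy candidate set (hypothetical data, not taken from the paper).
gen: Relation = {("ab", "ab"), ("ab", "a"), ("cc", "cc"), ("cc", "ccc")}
no_b: Constraint = lambda s: "b" not in s
no_c: Constraint = lambda s: "c" not in s

after_no_b = lenient_compose(gen, no_b)   # "ab" loses its candidate "ab"
after_no_c = lenient_compose(gen, no_c)   # every "cc" candidate fails, so all survive
ranked = lenient_compose(after_no_b, no_c)  # no_b ranked above no_c
```

Applying the constraints one after another, highest-ranked first, mirrors the abstract's remark that constraint ranking corresponds to the order of application.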
Arabic Morphology Using Only Finite-State Operations
- 1998
Cited by 28 (2 self)
Finite-state morphology has been successful in the description and computational implementation of a wide variety of natural languages. However, the particular challenges of Arabic, and the limitations of some implementations of finite-state morphology, have led many researchers to believe that finite-state power was not sufficient to handle Arabic and other Semitic morphology. This paper illustrates how the morphotactics and the variation rules of Arabic have been described using only finite-state operations and how this approach has been implemented in a significant morphological analyzer/generator.
Learning bias and phonological-rule induction
- Computational Linguistics, 1996
Cited by 25 (0 self)
A fundamental debate in the machine learning of language has been the role of prior knowledge in the learning process. Purely nativist approaches, such as the Principles and Parameters model, build parameterized linguistic generalizations directly into the learning system. Purely empirical approaches use a general, domain-independent learning rule (Error Back-Propagation, Instance-based Generalization, Minimum Description Length) to learn linguistic generalizations directly from the data. In this paper we suggest that an alternative to the purely nativist or purely empiricist learning paradigms is to represent the prior knowledge of language as a set of abstract learning biases, which guide an empirical inductive learning algorithm. We test our idea by examining the machine learning of simple Sound Pattern of English (SPE)-style phonological rules. We represent phonological rules as finite-state transducers that accept underlying forms as input and generate surface forms as output. We show that OSTIA, a general-purpose transducer induction algorithm, was incapable of learning simple phonological rules like flapping. We then augmented OSTIA with three kinds of learning biases that are specific to natural language phonology, and that are assumed explicitly or implicitly by every theory of phonology: faithfulness (underlying segments
Finite-State Non-Concatenative Morphotactics
- 2000
Cited by 21 (3 self)
Finite-state morphology in the general tradition of the Two-Level and Xerox implementations has proved very successful in the production of robust morphological analyzer-generators, including many large-scale commercial systems. However, it has long been recognized that these implementations have serious limitations in handling non-concatenative phenomena. We describe a new technique for constructing finite-state transducers that involves reapplying the regular-expression compiler to its own output. Implemented in an algorithm called compile-replace, this technique has proved useful for handling non-concatenative phenomena; and we demonstrate it on Malay full-stem reduplication and Arabic stem interdigitation.
Finite State Transducers with Predicates and Identities
- Grammars, 2001
Cited by 18 (0 self)
An extension to finite state transducers is presented, in which atomic symbols are replaced by arbitrary predicates over symbols. The extension is motivated by applications in natural language processing (but may be more widely applicable) as well as by the observation that transducers with predicates generally have fewer states and fewer transitions. Although the extension is fairly trivial for finite state acceptors, the introduction of predicates is more interesting for transducers. It is shown how various operations on transducers (e.g. composition) can be implemented, as well as how the transducer determinization algorithm can be generalized for predicate-augmented finite state transducers.
Consonant Spreading in Arabic Stems
- 1998
Cited by 10 (2 self)
This paper examines the phenomenon of consonant spreading in Arabic stems. Each spreading involves a local surface copying of an underlying consonant, and, in certain phonological contexts, spreading alternates productively with consonant lengthening (or gemination). The morphophonemic triggers of spreading lie in the patterns or even in the roots themselves, and the combination of a spreading root and a spreading pattern causes a consonant to be copied multiple times. The interdigitation of Arabic stems and the realization of consonant spreading are formalized using finite-state morphotactics and variation rules, and this approach has been successfully implemented in a large-scale Arabic morphological analyzer which is available for testing on the Internet.

