Learnability in Optimality Theory, 1995
Cited by 208 (20 self)
In this article we show how Optimality Theory yields a highly general Constraint Demotion principle for grammar learning. The resulting learning procedure specifically exploits the grammatical structure of Optimality Theory, independent of the content of substantive constraints defining any given grammatical module. We decompose the learning problem and present formal results for a central subproblem, deducing the constraint ranking particular to a target language, given structural descriptions of positive examples. The structure imposed on the space of possible grammars by Optimality Theory allows efficient convergence to a correct grammar. We discuss implications for learning from overt data only, as well as other learning issues. We argue that Optimality Theory promotes confluence of the demands of more effective learnability and deeper linguistic explanation.
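The abstract names the Constraint Demotion principle without spelling out its steps. The following is a minimal sketch of one demotion step as it is commonly described: given a winner-loser pair, every constraint that prefers the loser is moved to the stratum just below the highest-ranked constraint that prefers the winner. The data layout (a dict from constraint name to stratum index), the function name `demote`, and the constraint names in the usage lines are assumptions made for illustration, not details taken from the paper.

```python
def demote(strata, winner, loser):
    """Apply one Constraint Demotion step to a winner-loser pair.

    strata: dict mapping constraint name -> stratum index (0 = highest ranked)
    winner, loser: dicts mapping constraint name -> violation count
    Returns a new strata dict; the input is left unchanged.
    """
    # After cancelling shared violations, sort constraints by which form they prefer.
    prefers_winner = [c for c in strata if loser.get(c, 0) > winner.get(c, 0)]
    prefers_loser = [c for c in strata if winner.get(c, 0) > loser.get(c, 0)]
    if not prefers_winner:
        raise ValueError("no constraint prefers the winner; the pair cannot be ranked")

    # Highest-ranked constraint that prefers the winner.
    top = min(strata[c] for c in prefers_winner)

    updated = dict(strata)
    for c in prefers_loser:
        # Demote only constraints not already dominated by that stratum.
        if strata[c] <= top:
            updated[c] = top + 1
    return updated


# Hypothetical example: all constraints start in a single top stratum.
initial = {"ONSET": 0, "NOCODA": 0, "PARSE": 0, "FILL": 0}
winner_marks = {"FILL": 1}   # the observed form violates FILL once
loser_marks = {"ONSET": 1}   # the competitor violates ONSET once
result = demote(initial, winner_marks, loser_marks)
```

In this hypothetical run only FILL is demoted, so ONSET ends up dominating it. Repeating the step over further pairs until no ranking changes is one way to arrive at a stratified hierarchy.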
The acquisition of stress: a data-oriented approach - Computational Linguistics, 1994
Cited by 47 (20 self)
A data-oriented (empiricist) alternative to the currently pervasive (nativist) Principles and Parameters approach to the acquisition of stress assignment is investigated. A similarity-based algorithm, viz. an augmented version of Instance-Based Learning, is used to learn the system of main stress assignment in Dutch. In this nontrivial task a comprehensive lexicon of Dutch monomorphemes is used instead of the idealized and highly simplified description of the empirical data used in previous approaches. It is demonstrated that a similarity-based learning method is effective in learning the complex stress system of Dutch. The task is accomplished without the a priori knowledge assumed to pre-exist in the learner in a Principles and Parameters framework. A comparison of the system's behavior with a consensus linguistic analysis (in the framework of Metrical Phonology) shows that ease of learning correlates with decreasing degrees of markedness of metrical phenomena. It is also shown that the learning algorithm captures subregularities within the stress system of Dutch that cannot be described without going beyond some of the theoretical assumptions of metrical phonology.
Grammatical Acquisition: Inductive Bias and Coevolution of Language and the Language Acquisition Device - Language, 2000
Cited by 35 (0 self)
An account of grammatical acquisition is developed within the parameter-setting framework applied to a generalized categorial grammar (GCG). The GCG is embedded in a default inheritance network yielding a natural partial ordering (reflecting generality) of parameters which determines a partial order for parameter setting. Computational simulation shows that several resulting acquisition procedures are effective on a parameter set expressing major typological distinctions based on constituent order, and defining 70 distinct full languages and over 200 subset languages. The effects on acquisition of inductive bias, that is, of differing initial parameter settings, are explored via computational simulation. Computational simulation of populations of language learners and users instantiating the acquisition model shows: 1) that variant acquisition procedures, with differing inductive biases, exert differing selective pressures on the evolution of language(s); 2) acquisition proc...
A maximum entropy model of phonotactics and phonotactic learning, 2006
Cited by 35 (5 self)
The study of phonotactics (e.g., the ability of English speakers to distinguish possible words like blick from impossible words like *bnick) is a central topic in phonology. We propose a theory of phonotactic grammars and a learning algorithm that constructs such grammars from positive evidence. Our grammars consist of constraints that are assigned numerical weights according to the principle of maximum entropy. Possible words are assessed by these grammars based on the weighted sum of their constraint violations. The learning algorithm yields grammars that can capture both categorical and gradient phonotactic patterns. The algorithm is not provided with any constraints in advance, but uses its own resources to form constraints and weight them. A baseline model, in which Universal Grammar is reduced to a feature set and an SPE-style constraint format, suffices to learn many phonotactic phenomena. In order to learn nonlocal phenomena such as stress and vowel harmony, it is necessary to augment the model with autosegmental tiers and metrical grids. Our results thus offer novel, learning-theoretic support for such representations. We apply the model to English syllable onsets, Shona vowel harmony, quantity-insensitive stress typology, and the full phonotactics of Wargamay, showing that the learned grammars capture the distributional generalizations of these languages and accurately predict the findings of a phonotactic experiment.
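The scoring step described in this abstract (forms assessed by the weighted sum of their constraint violations) can be sketched as follows. This is an illustrative toy, not the paper's implementation: the constraint names, weights, and violation counts are invented, and normalizing over a two-form candidate set is a simplification made so the example stays small.

```python
import math


def penalty(violations, weights):
    """Weighted sum of constraint violations for a single form."""
    return sum(weights[c] * count for c, count in violations.items())


def maxent_probabilities(candidates, weights):
    """Assign each form a probability proportional to exp(-penalty).

    candidates: dict mapping form -> {constraint name: violation count}
    weights: dict mapping constraint name -> non-negative weight
    """
    unnormalized = {
        form: math.exp(-penalty(viol, weights))
        for form, viol in candidates.items()
    }
    total = sum(unnormalized.values())
    return {form: value / total for form, value in unnormalized.items()}


# Hypothetical constraints and weights, chosen only for illustration.
weights = {"*#bn": 3.0, "*#bl": 0.0}
candidates = {
    "blick": {"*#bl": 1},  # violates a constraint with zero weight
    "bnick": {"*#bn": 1},  # violates a heavily weighted constraint
}
probs = maxent_probabilities(candidates, weights)
```

With these made-up weights, blick receives a much higher probability than bnick, which mirrors the gradient contrast between possible and impossible words that the abstract describes.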
Learning bias and phonological-rule induction - Computational Linguistics, 1996
Cited by 25 (0 self)
A fundamental debate in the machine learning of language has been the role of prior knowledge in the learning process. Purely nativist approaches, such as the Principles and Parameters model, build parameterized linguistic generalizations directly into the learning system. Purely empirical approaches use a general, domain-independent learning rule (Error Back-Propagation, Instance-based Generalization, Minimum Description Length) to learn linguistic generalizations directly from the data. In this paper we suggest that an alternative to the purely nativist or purely empiricist learning paradigms is to represent the prior knowledge of language as a set of abstract learning biases, which guide an empirical inductive learning algorithm. We test our idea by examining the machine learning of simple Sound Pattern of English (SPE)-style phonological rules. We represent phonological rules as finite-state transducers that accept underlying forms as input and generate surface forms as output. We show that OSTIA, a general-purpose transducer induction algorithm, was incapable of learning simple phonological rules like flapping. We then augmented OSTIA with three kinds of learning biases that are specific to natural language phonology, and that are assumed explicitly or implicitly by every theory of phonology: faithfulness (underlying segments
Connectionist Models and Linguistic Theory: Investigations of Stress Systems in Language - Cognitive Science, 1994
Cited by 23 (4 self)
This paper discusses a perceptron model of the learning and assignment of linguistic stress, using data from nineteen human languages. First, we point out some interesting parallels between aspects of the model and the constructs and predictions of metrical phonology, the linguistic theory of stress. Second, we develop a novel analysis of linguistic stress in terms of ease of perceptron-learnability. These two sets of results suggest that simple statistical learning techniques have the potential to complement, and provide computational validation for, abstract theoretical investigations of language. We then examine why such methodologies should be of interest for linguistic theorizing. Our analysis began at a high level by observing inherent characteristics of various stress systems, much as theoretical linguistics does. However, our explanations changed substantially when we included a detailed account of the model's processing mechanisms. Our higher-level, theoretical accou...
The Acquisition of a Unification-Based Generalised Categorial Grammar, 2002
Cited by 18 (3 self)
The purpose of this work is to investigate the process of grammatical acquisition from data. In order to do that, a computational learning system is used, composed of a Universal Grammar with associated parameters, and a learning algorithm, following the Principles and Parameters Theory. The Universal Grammar is implemented as a Unification-Based Generalised Categorial Grammar, embedded in a default inheritance network of lexical types. The learning algorithm receives input from a corpus of spontaneous child-directed transcribed speech annotated with logical forms and sets the parameters based on this input. This framework is used as a basis to investigate several aspects of language acquisition. In this thesis I concentrate on the acquisition of subcategorisation frames and word order information, from data. The data to which the learner is exposed can be noisy and ambiguous, and I investigate how these factors affect the learning process. The results obtained show a robust learner converging towards the target grammar given the input data available. They also show how the amount of noise present in the input data affects the speed of convergence of the learner towards the target grammar. Future work is suggested for investigating the developmental stages of language acquisition as predicted by the learning model, with a thorough comparison with the developmental stages of a child. This is primarily a cognitive computational model of language learning that can be used to investigate and gain a better understanding of human language acquisition, and can potentially be relevant to the development of more adaptive NLP technology.
The Iterative Learning of Phonological Constraints - Computational Linguistics, 1991
Cited by 14 (0 self)
This paper presents a simplicity measure for violable phonological constraints based on the minimum message length method. This measure captures the intuitive desiderata of conciseness, accuracy and precision. A family of constraints can be specified by parameterising a specific constraint, and so forming a template. The combination of this measure with a search algorithm is a powerful learning method for finding the best constraint matching a template and fitting a corpus. This method may be applied iteratively, using the same template, to learn a number of different constraints. Five applications of an implementation show some of the successes of this learning method: from learning consonant cluster constraints to vowel harmony.
The Informational Complexity of Learning from Examples, 1996
Cited by 12 (4 self)
This thesis attempts to quantify the amount of information needed to learn certain tasks. The tasks chosen vary from learning functions in a Sobolev space using radial basis function networks to learning grammars in the principles and parameters framework of modern linguistic theory. These problems are analyzed from the perspective of computational learning theory and certain unifying perspectives emerge. Copyright © Massachusetts Institute of Technology, 1996. This report describes research done within the Center for Biological and Computational Learning in the Department of Brain and Cognitive Sciences and at the Artificial Intelligence Laboratory at the Massachusetts Institute of Technology. This research is sponsored by a grant from the National Science Foundation under contract ASC-9217041 (this award includes funds from ARPA provided under the HPCC program); and by a grant from ARPA/ONR under contract N00014-92-J-1879. Additional support has been provided by Siemens Co...

