Empirical tests of the Gradual Learning Algorithm
Linguistic Inquiry 32.45–86, 2001
Cited by 147 (27 self)
The Gradual Learning Algorithm (Boersma 1997) is a constraint ranking algorithm for learning Optimality-theoretic grammars. The purpose of this article is to assess the capabilities of the Gradual Learning Algorithm, particularly in comparison with the Constraint Demotion algorithm of Tesar and Smolensky (1993, 1996, 1998, 2000), which initiated the learnability research program for Optimality Theory. We argue that the Gradual Learning Algorithm has a number of special advantages: it can learn free variation, deal effectively with noisy learning data, and account for gradient wellformedness judgments. The case studies we examine involve Ilokano reduplication and metathesis, Finnish genitive plurals, and the distribution of English light and dark /l/.
A maximum entropy model of phonotactics and phonotactic learning
2006
Cited by 35 (5 self)
The study of phonotactics (e.g., the ability of English speakers to distinguish possible words like blick from impossible words like *bnick) is a central topic in phonology. We propose a theory of phonotactic grammars and a learning algorithm that constructs such grammars from positive evidence. Our grammars consist of constraints that are assigned numerical weights according to the principle of maximum entropy. Possible words are assessed by these grammars based on the weighted sum of their constraint violations. The learning algorithm yields grammars that can capture both categorical and gradient phonotactic patterns. The algorithm is not provided with any constraints in advance, but uses its own resources to form constraints and weight them. A baseline model, in which Universal Grammar is reduced to a feature set and an SPE-style constraint format, suffices to learn many phonotactic phenomena. In order to learn nonlocal phenomena such as stress and vowel harmony, it is necessary to augment the model with autosegmental tiers and metrical grids. Our results thus offer novel, learning-theoretic support for such representations. We apply the model to English syllable onsets, Shona vowel harmony, quantity-insensitive stress typology, and the full phonotactics of Wargamay, showing that the learned grammars capture the distributional generalizations of these languages and accurately predict the findings of a phonotactic experiment.
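The scoring step described in this abstract (a weighted sum of constraint violations) can be illustrated with a minimal sketch. The two constraints, their weights, and the function names below are hypothetical, invented for illustration; they are not taken from the paper. The exponential form exp(-penalty) is the usual maximum entropy convention and is assumed here, since the abstract itself mentions only the weighted sum.

```python
import math

# Hypothetical constraints and weights, invented for illustration only.
WEIGHTS = {"*#bn": 4.0, "*#bl": 0.5}


def violations(word):
    """Count violations of each toy constraint (bans on word-initial clusters)."""
    return {
        "*#bn": 1 if word.startswith("bn") else 0,
        "*#bl": 1 if word.startswith("bl") else 0,
    }


def penalty(word):
    """Weighted sum of constraint violations; a higher value means a worse form."""
    counts = violations(word)
    return sum(WEIGHTS[name] * counts[name] for name in WEIGHTS)


def maxent_value(word):
    """Unnormalized maxent value exp(-penalty); assumed standard form."""
    return math.exp(-penalty(word))


if __name__ == "__main__":
    for w in ("blick", "bnick"):
        print(w, penalty(w), maxent_value(w))
```

With these made-up weights, bnick receives a larger penalty than blick and therefore a smaller maxent value, mirroring the possible versus impossible contrast in the abstract. Turning the values into probabilities would require normalizing over the candidate set, which this sketch omits.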
Probabilistic Syntax
2002
Cited by 27 (1 self)
istic methods for syntax, just as for a long time McCarthy and Hayes (1969) discouraged exploration of probabilistic methods in Artificial Intelligence. Among his arguments were that: (i) Probabilistic models wrongly mix in world knowledge (New York occurs more in text than Dayton, Ohio, but for no linguistic reason), (ii) Probabilistic models don't model grammaticality (neither Colorless green ideas sleep furiously nor Furiously sleep ideas green colorless have previously been uttered -- and hence must be estimated to have probability zero, Chomsky wrongly assumes -- but the former is grammatical while the latter is not), and (iii) Use of probabilities does not meet the goal of describing the mind-internal I-language as opposed to the observed-in-the-world E-language. This chapter is not meant to be a detailed critique of Chomsky's arguments -- Abney (1996) provides a survey and a rebuttal, and Pereira (2000) has further useful discussion -- but some of these concerns are still importa
Soft Constraints Mirror Hard Constraints: Voice and Person in English and Lummi
Proceedings of the LFG 01 Conference. CSLI, 2001
Cited by 24 (8 self)
The same categorical phenomena which are attributed to hard grammatical constraints in some languages continue to show up as statistical preferences in other languages, motivating a grammatical model that can account for soft constraints. The effects of a hierarchy of person (1st, 2nd, 3rd) on grammar are categorical in some languages, most famously in languages with inverse systems, but also in languages with person restrictions on passivization. In Lummi, for example, the person of the subject argument cannot be lower than the person of a nonsubject argument. If this would happen in the active, passivization is obligatory; if it would happen in the passive, the active is obligatory (Jelinek and Demers 1983). These facts follow from the theory of harmonic alignment in OT: constraints favoring the harmonic association of prominent person (1st, 2nd) with prominent syntactic function (subject) are hypothesized to be present as subhierarchies of the grammars of all languages, but to vary ...
Learning Phonology With Substantive Bias: An Experimental and Computational Study of Velar Palatalization
2006
Cited by 22 (1 self)
There is an active debate within the field of phonology concerning the cognitive status of substantive phonetic factors such as ease of articulation and perceptual distinctiveness. A new framework is proposed in which substance acts as a bias, or prior, on phonological learning. Two experiments tested this framework with a method in which participants are first provided highly impoverished evidence of a new phonological pattern, and then tested on how they extend this pattern to novel contexts and novel sounds. Participants were found to generalize velar palatalization (e.g., the change from [k] as in keep to [tʃ] as in cheap) in a way that accords with linguistic typology, and that is predicted by a cognitive bias in favor of changes that relate perceptually similar sounds. Velar palatalization was extended from the mid front vowel context (i.e., before [e] as in cape) to the high front vowel context (i.e., before [i] as in keep), but not vice versa. The key explanatory notion of perceptual similarity is quantified with a psychological model of categorization, and the substantively biased framework is formalized as a conditional random field. Implications of these results for the debate on substance, theories of phonological generalization, and the formalization of similarity are discussed.
Against formal phonology
Language, 2005
Cited by 16 (10 self)
Chomsky and Halle (1968) and many formal linguists rely on the notion of a universally available phonetic space defined in discrete time. This assumption plays a central role in phonological theory. Discreteness at the phonetic level guarantees the discreteness of all other levels of language. But decades of phonetics research demonstrate that there exists no universal inventory of phonetic objects. We discuss three kinds of evidence: first, phonologies differ incommensurably. Second, some phonetic characteristics of languages depend on intrinsically temporal patterns, and, third, some linguistic sound categories within a language are different from each other despite a high degree of overlap that precludes distinctness. Linguistics has mistakenly presumed that speech can always be spelled with letter-like tokens. A variety of implications of these conclusions for research in phonology are discussed.
The generative paradigm of language description (Chomsky 1964, 1965, Chomsky & Halle 1968) has dominated linguistic thinking in the United States for many years. Its specific claims about the phonetic basis of linguistic analysis still provide the cornerstone of most linguistic research. Many criticisms have been raised against the phonetic claims of the Sound pattern of English (Chomsky & Halle 1968), some from early on
Phonology Competes with Syntax: Experimental Evidence for the Interaction of Word Order and Accent Placement in the Realization of Information Structure
Cognition, 2000
Cited by 14 (8 self)
In this paper, we investigate the interaction of phonological and syntactic constraints on the realization of Information Structure in Greek, a free word order language. We use magnitude estimation as our experimental paradigm, which allows us to quantify the influence of a given linguistic constraint on the acceptability of a sentence. We present results from two experiments. In the first experiment, we focus on the interaction of word order and context. In the second experiment, we investigate the additional effect of accent placement and clitic doubling. The results show that word order, in contrast to standard assumptions in the theoretical literature, plays only a secondary role in marking the Information Structure of a sentence. Order preferences are relatively weak and can be overridden by constraints on accent placement and clitic doubling. Our experiments also demonstrate that a null context shows the same preference pattern as an all focus context, indicating that `...
Bi-Directional Optimality Theory: An Application of Game Theory
Journal of Semantics, 2000
Cited by 13 (0 self)
Optimality Theory has caught on in linguistics, first in phonology, then in syntax, and recently also at the semantics/pragmatics interface. In this paper we point out some parallels between principles employed in optimality theoretic interpretation and notions from the well-established field of Game Theory. Optimality theoretic interpretation can be defined as what we call an "interpretation game", and optimality itself can be viewed as a solution concept for a game. More specifically, optimality can be characterized in terms of the game-theoretical notion of a "Nash Equilibrium".

