Results 1 - 10
of
88
Learnability in Optimality Theory
, 1995
"... In this article we show how Optimality Theory yields a highly general Constraint Demotion principle for grammar learning. The resulting learning procedure specifically exploits the grammatical structure of Optimality Theory, independent of the content of substantive constraints defining any given gr ..."
Abstract
-
Cited by 208 (20 self)
- Add to MetaCart
In this article we show how Optimality Theory yields a highly general Constraint Demotion principle for grammar learning. The resulting learning procedure specifically exploits the grammatical structure of Optimality Theory, independent of the content of substantive constraints defining any given grammatical module. We decompose the learning problem and present formal results for a central subproblem, deducing the constraint ranking particular to a target language, given structural descriptions of positive examples. The structure imposed on the space of possible grammars by Optimality Theory allows efficient convergence to a correct grammar. We discuss implications for learning from overt data only, as well as other learning issues. We argue that Optimality Theory promotes confluence of the demands of more effective learnability and deeper linguistic explanation.
The induction of dynamical recognizers
- Machine Learning
, 1991
"... A higher order recurrent neural network architecture learns to recognize and generate languages after being "trained " on categorized exemplars. Studying these networks from the perspective of dynamical systems yields two interesting discoveries: First, a longitudinal examination of the learning pro ..."
Abstract
-
Cited by 197 (15 self)
- Add to MetaCart
A higher order recurrent neural network architecture learns to recognize and generate languages after being "trained " on categorized exemplars. Studying these networks from the perspective of dynamical systems yields two interesting discoveries: First, a longitudinal examination of the learning process illustrates a new form of mechanical inference: Induction by phase transition. A small weight adjustment causes a "bifurcation" in the limit behavior of the network. This phase transition corresponds to the onset of the network’s capacity for generalizing to arbitrary-length strings. Second, a study of the automata resulting from the acquisition of previously published training sets indicates that while the architecture is not guaranteed to find a minimal finite automaton consistent with the given exemplars, which is an NP-Hard problem, the architecture does appear capable of generating non-regular languages by exploiting fractal and chaotic dynamics. I end the paper with a hypothesis relating linguistic generative capacity to the behavioral regimes of non-linear dynamical systems.
Functional Phonology -- Formalizing the interactions between articulatory and perceptual drives
, 1998
"... ..."
Learning Semantic Grammars with Constructive Inductive Logic Programming
- In Proceedings of the Eleventh National Conference on Artificial Intelligence
, 1993
"... Automating the construction of semantic grammars is a difficult and interesting problem for machine learning. This paper shows how the semantic-grammar acquisition problem can be viewed as the learning of search-control heuristics in a logic program. Appropriate control rules are learned using a new ..."
Abstract
-
Cited by 63 (13 self)
- Add to MetaCart
Automating the construction of semantic grammars is a difficult and interesting problem for machine learning. This paper shows how the semantic-grammar acquisition problem can be viewed as the learning of search-control heuristics in a logic program. Appropriate control rules are learned using a new first-order induction algorithm that automatically invents useful syntactic and semantic categories. Empirical results show that the learned parsers generalize well to novel sentences and out-perform previous approaches based on connectionist techniques. Introduction Designing computer systems to "understand" natural language input is a difficult task. The laboriously hand-crafted computational grammars supporting natural language applications are often inefficient, incomplete and ambiguous. The difficulty in constructing adequate grammars is an example of the "knowledge acquisition bottleneck" which has motivated much research in machine learning. While numerous researchers have studied ...
Acquiring Disambiguation Rules From Text
, 1989
"... An effective procedure for automatically acquiring a new set of disambiguation rules for an existing deterministic parser on the basis of tagged text is presented. Performance of the automatically acquired rules is much better than the existing handwritten disambiguation rules. The success of the ac ..."
Abstract
-
Cited by 61 (0 self)
- Add to MetaCart
An effective procedure for automatically acquiring a new set of disambiguation rules for an existing deterministic parser on the basis of tagged text is presented. Performance of the automatically acquired rules is much better than the existing handwritten disambiguation rules. The success of the acquired rules depends on using the linguistic information encoded in the parser; enhancements to various components of the parser improves the acquired rule set. This work suggests a path toward more robust and comprehensive syntactic analyz- er8.
Language Acquisition in the Absence of Explicit Negative Evidence: How Important is Starting Small?
- COGNITION
, 1999
"... It is commonly assumed that innate linguistic constraints are necessary to learn a natural language, based on the apparent lack of explicit negative evidence provided to children and on Gold's proof that, under assumptions of virtually arbitrary positive presentation, most interesting classes of ..."
Abstract
-
Cited by 59 (5 self)
- Add to MetaCart
It is commonly assumed that innate linguistic constraints are necessary to learn a natural language, based on the apparent lack of explicit negative evidence provided to children and on Gold's proof that, under assumptions of virtually arbitrary positive presentation, most interesting classes of languages are not learnable. However, Gold's results do not apply under the rather common assumption that language presentation may be modeled as a stochastic process. Indeed, Elman (Elman, J.L., 1993. Learning and development in neural networks: the importance of starting small. Cognition 48, 71--99) demonstrated that a simple recurrent connectionist network could learn an artificial grammar with some of the complexities of English, including embedded clauses, based on performing a word prediction task within a stochastic environment. However, the network was successful only when either embedded sentences were initially withheld and only later introduced gradually, or when the network itself was given initially limited memory which only gradually improved. This finding has been taken as support for Newport's `less is more' proposal, that child language acquisition may be aided rather than hindered by limited cognitive resources. The current article reports on connectionist simulations which indicate, to the contrary, that starting with simplified inputs or limited memory is not necessary in training recurrent networks to learn pseudonatural languages; in fact, such restrictions hinder acquisition as the languages are made more English-like by the introduction of semantic as well as syntactic constraints. We suggest that, under a statistical model of the language environment, Gold's theorem and the possible lack of explicit negative evidence do not implicate i...
Maltparser: A language-independent system for data-driven dependency parsing
- In Proc. of the Fourth Workshop on Treebanks and Linguistic Theories
, 2005
"... ..."
Practical Unification-based Parsing of Natural Language
, 1993
"... The thesis describes novel techniques and algorithms for the practical parsing of realistic Natural Language (NL) texts with a wide-coverage unification-based grammar of English. The thesis tackles two of the major problems in this area: firstly, the fact that parsing realistic inputs with such gr ..."
Abstract
-
Cited by 46 (7 self)
- Add to MetaCart
The thesis describes novel techniques and algorithms for the practical parsing of realistic Natural Language (NL) texts with a wide-coverage unification-based grammar of English. The thesis tackles two of the major problems in this area: firstly, the fact that parsing realistic inputs with such grammars can be computationally very expensive, and secondly, the observation that many analyses are often assigned to an input, only one of which usually forms the basis of the correct interpretation. The thesis starts by presenting a new unification algorithm, justifies why it is well-suited to practical NL parsing, and describes a bottom-up active chart parser which employs this unification algorithm together with several other novel processing and optimisation techniques. Empirical results demonstrate that an implementation of this parser has significantly better practical
The Power of Vacillation in Language Learning
, 1992
"... Some extensions are considered of Gold's influential model of language learning by machine from positive data. Studied are criteria of successful learning featuring convergence in the limit to vacillation between several alternative correct grammars. The main theorem of this paper is that there are ..."
Abstract
-
Cited by 44 (11 self)
- Add to MetaCart
Some extensions are considered of Gold's influential model of language learning by machine from positive data. Studied are criteria of successful learning featuring convergence in the limit to vacillation between several alternative correct grammars. The main theorem of this paper is that there are classes of languages that can be learned if convergence in the limit to up to (n+1) exactly correct grammars is allowed but which cannot be learned if convergence in the limit is to no more than n grammars, where the no more than n grammars can each make finitely many mistakes. This contrasts sharply with results of Barzdin and Podnieks and, later, Case and Smith, for learnability from both positive and negative data. A subset principle from a 1980 paper of Angluin is extended to the vacillatory and other criteria of this paper. This principle, provides a necessary condition for circumventing overgeneralization in learning from positive data. It is applied to prove another theorem to the eff...
The acquisition and use of context-dependent grammars for English
- Computational Linguistics
, 1993
"... This paper introduces a paradigm of context-dependent grammar (CDG) and an acquisition system that, through interactive teaching sessions, accumulates the CDG rules. The resulting context-sensitive rules are used by a stack-based, shift~reduce parser to compute unambiguous syntactic structures of se ..."
Abstract
-
Cited by 37 (0 self)
- Add to MetaCart
This paper introduces a paradigm of context-dependent grammar (CDG) and an acquisition system that, through interactive teaching sessions, accumulates the CDG rules. The resulting context-sensitive rules are used by a stack-based, shift~reduce parser to compute unambiguous syntactic structures of sentences. The acquisition system and parser have been applied to the phrase structure and case analyses of 345 sentences, mainly from newswire stories, with 99 % accuracy. Extrapolation from our current grammar predicts that about 25 thousand CDG rule examples will be sufficient to train the system in phrase structure analysis of most news stories. Overall, this research concludes that CDG is a computationally and conceptually tractable approach for the construction of sentence grammar for large subsets of natural language text. 1.

