Results 1 - 10 of 153
The induction of dynamical recognizers
 Machine Learning
, 1991
Abstract

Cited by 218 (14 self)
A higher order recurrent neural network architecture learns to recognize and generate languages after being "trained" on categorized exemplars. Studying these networks from the perspective of dynamical systems yields two interesting discoveries: First, a longitudinal examination of the learning process illustrates a new form of mechanical inference: induction by phase transition. A small weight adjustment causes a "bifurcation" in the limit behavior of the network. This phase transition corresponds to the onset of the network’s capacity for generalizing to arbitrary-length strings. Second, a study of the automata resulting from the acquisition of previously published training sets indicates that while the architecture is not guaranteed to find a minimal finite automaton consistent with the given exemplars, which is an NP-hard problem, the architecture does appear capable of generating non-regular languages by exploiting fractal and chaotic dynamics. I end the paper with a hypothesis relating linguistic generative capacity to the behavioral regimes of nonlinear dynamical systems.
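The "higher order" architecture in this abstract can be illustrated with a second-order recurrent update, where the next state depends multiplicatively on the current state and the input symbol. A minimal sketch, assuming a sigmoid activation and random toy weights (not the paper's actual configuration):

```python
import numpy as np

def step(W, state, symbol):
    # Second-order update: s'_i = sigmoid(sum_{j,k} W[i, j, k] * state[j] * symbol[k])
    z = np.einsum('ijk,j,k->i', W, state, symbol)
    return 1.0 / (1.0 + np.exp(-z))

def recognize(W, start_state, symbols, accept_unit=0, threshold=0.5):
    # Run a string of one-hot symbol vectors; accept if the designated
    # state unit ends above the threshold.
    s = start_state
    for x in symbols:
        s = step(W, s, x)
    return bool(s[accept_unit] > threshold)

# Toy setup: 4 state units, a 2-symbol alphabet, random weights.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 4, 2))
s0 = np.zeros(4); s0[0] = 1.0
string = [np.eye(2)[i] for i in (0, 1, 1, 0)]
decision = recognize(W, s0, string)
```

With random weights the decision is arbitrary; the paper's point is that training such weights can push the state dynamics through a bifurcation that yields generalization.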
Supertagging: An Approach to Almost Parsing
 Computational Linguistics
, 1999
Abstract

Cited by 163 (23 self)
In this paper, we have proposed novel methods for robust parsing that integrate the flexibility of linguistically motivated lexical descriptions with the robustness of statistical techniques. Our thesis is that the computation of linguistic structure can be localized if lexical items are associated with rich descriptions (supertags) that impose complex constraints in a local context. The supertags are designed such that only those elements on which the lexical item imposes constraints appear within a given supertag. Further, each lexical item is associated with as many supertags as the number of different syntactic contexts in which the lexical item can appear. This makes the number of different descriptions for each lexical item much larger than when the descriptions are less complex, thus increasing the local ambiguity for a parser. But this local ambiguity can be resolved by using statistical distributions of supertag co-occurrences collected from a corpus of parses. We have explored these ideas in the context of the Lexicalized Tree-Adjoining Grammar (LTAG) framework. The supertags in LTAG combine both phrase structure information and dependency information in a single representation. Supertag disambiguation results in a representation that is effectively a parse (an almost parse), and the parser need `only' combine the individual supertags. This method of parsing can also be used to parse sentence fragments, such as in spoken utterances, where the disambiguated supertag sequence may not combine into a single structure.
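Supertag disambiguation as described here is sequence tagging with a very large tagset. A minimal sketch of the statistical step, assuming log-probability lexical scores and bigram co-occurrence scores combined by Viterbi decoding (the toy tag inventory and scores below are illustrative, not from the paper):

```python
def viterbi_supertag(words, candidates, lex_score, bigram_score):
    """Return the highest-scoring supertag sequence for `words`."""
    # best[tag] = (score of best sequence ending in tag, that sequence)
    best = {t: (lex_score(words[0], t), [t]) for t in candidates(words[0])}
    for w in words[1:]:
        new = {}
        for t in candidates(w):
            # Best predecessor under the bigram co-occurrence score.
            score, path = max(
                (prev_score + bigram_score(prev, t), path)
                for prev, (prev_score, path) in best.items()
            )
            new[t] = (score + lex_score(w, t), path + [t])
        best = new
    return max(best.values())[1]

# Toy supertag lexicon and log-score tables (hypothetical values).
cands = {"the": ["Det"], "dog": ["NP", "Nmod"], "barks": ["Vintr", "N"]}
lex = {("the", "Det"): 0.0, ("dog", "NP"): -0.2, ("dog", "Nmod"): -1.5,
       ("barks", "Vintr"): -0.3, ("barks", "N"): -1.2}
big = {("Det", "NP"): -0.1, ("Det", "Nmod"): -2.0, ("NP", "Vintr"): -0.1,
       ("NP", "N"): -2.0, ("Nmod", "Vintr"): -2.0, ("Nmod", "N"): -0.5}

tags = viterbi_supertag(["the", "dog", "barks"],
                        lambda w: cands[w],
                        lambda w, t: lex[(w, t)],
                        lambda p, t: big[(p, t)])
```

After disambiguation each word carries a single supertag, so the parser's remaining job is only to combine the chosen elementary structures.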
Grammatical Framework: A Type-Theoretical Grammar Formalism
, 2003
Abstract

Cited by 90 (22 self)
Grammatical Framework (GF) is a special-purpose functional language for defining grammars. It uses a Logical Framework (LF) for a description of abstract syntax, and adds to this a notation for defining concrete syntax. GF grammars themselves are purely declarative, but can be used both for linearizing syntax trees and parsing strings. GF can describe both formal and natural languages. The key notion of this description is a grammatical object, which is not just a string, but a record that contains all information on inflection and inherent grammatical features such as number and gender in natural languages, or precedence in formal languages. Grammatical objects have a type system, which helps to eliminate runtime errors in language processing. In the same way as an LF, GF uses...
Applying Co-Training methods to Statistical Parsing
, 2001
Abstract

Cited by 60 (3 self)
We propose a novel Co-Training method for statistical parsing. The algorithm takes as input a small corpus (9695 sentences) annotated with parse trees, a dictionary of possible lexicalized structures for each word in the training set and a large pool of unlabeled text. The algorithm iteratively labels the entire data set with parse trees. Using empirical results based on parsing the Wall Street Journal corpus we show that training a statistical parser on the combined labeled and unlabeled data strongly outperforms training only on the labeled data.
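The iterative labeling procedure can be sketched as a generic co-training loop: each of two models labels the unlabeled pool, and its most confident labelings become training data for the other model. A sketch under the assumption that each model exposes a train function returning a predictor with confidences (this interface is illustrative, not the authors'):

```python
def co_train(train_a, train_b, labeled, unlabeled, rounds=5, k=10):
    """Generic co-training loop.
    train_*(examples) -> predictor; predictor(x) -> (label, confidence)."""
    data_a, data_b = list(labeled), list(labeled)
    pool = list(unlabeled)
    for _ in range(rounds):
        if not pool:
            break
        fa, fb = train_a(data_a), train_b(data_b)
        # Each model labels the pool; its k most confident labelings become
        # new training examples for the *other* model.
        conf_a = sorted(pool, key=lambda x: -fa(x)[1])[:k]
        conf_b = sorted(pool, key=lambda x: -fb(x)[1])[:k]
        data_b += [(x, fa(x)[0]) for x in conf_a]
        data_a += [(x, fb(x)[0]) for x in conf_b]
        moved = set(conf_a) | set(conf_b)
        pool = [x for x in pool if x not in moved]
    return train_a(data_a), train_b(data_b)
```

In the parsing setting the "examples" would be sentences, the predictors parsers, and the confidence a parse score; the loop structure is the same.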
Factoring predicate argument and scope semantics: Underspecified semantics with LTAG
 12th Amsterdam Colloquium. Proceedings
, 1999
Abstract

Cited by 59 (13 self)
Abstract. In this paper we propose a compositional semantics for Lexicalized Tree-Adjoining Grammar (LTAG). Tree-local multicomponent derivations allow separation of the semantic contribution of a lexical item into one component contributing to the predicate argument structure and a second component contributing to scope semantics. Based on this idea a syntax-semantics interface is presented where the compositional semantics depends only on the derivation structure. It is shown that the derivation structure (and indirectly the locality of derivations) allows an appropriate amount of underspecification. This is illustrated by investigating underspecified representations for quantifier scope ambiguities and related phenomena such as adjunct scope and island constraints. Key words: computational semantics, lexicalized tree-adjoining grammar, quantifier scope, underspecification
Bidirectional Parsing Of Lexicalized Tree Adjoining Grammars
, 1991
Abstract

Cited by 21 (1 self)
In this paper a bidirectional parser for Lexicalized Tree Adjoining Grammars will be presented. The algorithm takes advantage of a peculiar characteristic of Lexicalized TAGs, i.e. that each elementary tree is associated with a lexical item, called its anchor. The algorithm employs a mixed strategy: it works bottom-up from the lexical anchors and then expands (partial) analyses making top-down predictions. Although such an algorithm does not improve the worst-case time bounds of already known TAG parsing methods, it could be relevant from the perspective of linguistic information processing, because it employs lexical information in a more direct way.
Chinese Number-Names, Tree Adjoining Languages, and Mild Context-Sensitivity
 COMPUTATIONAL LINGUISTICS
, 1991
Abstract

Cited by 21 (0 self)
... this paper that the number-name system of Chinese is generated neither by this formalism nor by any other equivalent or weaker ones, suggesting that such a task might require the use of the more powerful Indexed Grammar formalism. Given that our formal results apply only to a proper subset of Chinese, we extensively discuss the issue of whether they have any implications for the whole of that natural language. We conclude that our results bear directly either on the syntax of Chinese or on the interface between Chinese and the cognitive component responsible for arithmetic reasoning. Consequently, either Tree Adjoining Grammars, as currently defined, fail to generate the class of natural languages in a way that discriminates between linguistically warranted sublanguages, or formalisms with generative power equivalent to Tree Adjoining Grammar cannot serve as a basis for the interface between the human linguistic and mathematical faculties.
Discriminative Learning and Spanning Tree Algorithms for Dependency Parsing
, 2006
Abstract

Cited by 21 (1 self)
In this thesis we develop a discriminative learning method for dependency parsing using online large-margin training combined with spanning tree inference algorithms. We will show that this method provides state-of-the-art accuracy, is extensible through the feature set and can be implemented efficiently. Furthermore, we display the language-independent nature of the method by evaluating it on over a dozen diverse languages as well as show its practical applicability through integration into a sentence compression system.

We start by presenting an online large-margin learning framework that is a generalization of the work of Crammer and Singer [34, 37] to structured outputs, such as sequences and parse trees. This will lead to the heart of this thesis: discriminative dependency parsing. Here we will formulate dependency parsing in a spanning tree framework, yielding efficient parsing algorithms for both projective and non-projective tree structures. We will then extend the parsing algorithm to incorporate features over larger substructures without an increase in computational complexity for the projective case. Unfortunately, the non-projective problem then becomes NP-hard, so we provide structurally motivated approximate algorithms. Having defined a set of parsing algorithms, we will also define a rich feature set and train various parsers using the online large-margin learning framework. We then compare our trained dependency parsers to other state-of-the-art parsers on 14 diverse languages: Arabic, Bulgarian, Chinese, Czech, Danish, Dutch, English, German, Japanese, Portuguese, Slovene, Spanish, Swedish and Turkish.

Having built an efficient and accurate discriminative dependency parser, this thesis will then turn to improving and applying the parser. First we will show how additional resources can provide useful features to increase parsing accuracy and to adapt parsers to new domains. We will also argue that the robustness of discriminative inference-based learning algorithms lends itself well to dependency parsing when feature representations or structural constraints do not allow for tractable parsing algorithms. Finally, we integrate our parsing models into a state-of-the-art sentence compression system to show its applicability to a real-world problem.
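The spanning tree formulation reduces non-projective parsing to finding a maximum spanning arborescence over arc scores. A minimal sketch of the greedy first step of Chu-Liu/Edmonds, in which each word picks its best-scoring head and any resulting cycle would then need to be contracted (cycle contraction is omitted here for brevity; the score matrix is a toy assumption):

```python
def greedy_heads(score):
    """score[h][d] = score of arc from head h to dependent d; index 0 is the root.
    Each word d >= 1 greedily picks its highest-scoring head."""
    n = len(score)
    return {d: max((h for h in range(n) if h != d), key=lambda h: score[h][d])
            for d in range(1, n)}

def has_cycle(heads):
    """Detect a cycle among the chosen arcs (the case Chu-Liu/Edmonds
    must repair by contracting the cycle and rescoring)."""
    for start in heads:
        node, seen = start, set()
        while node in heads:        # the root (0) has no head, ending the walk
            if node in seen:
                return True
            seen.add(node)
            node = heads[node]
    return False

# Toy 3-word sentence: root + positions 1..3, hypothetical arc scores.
score = [
    [0, 3, 10, 3],   # arcs out of the root
    [0, 0, 2, 2],
    [0, 8, 0, 7],
    [0, 2, 2, 0],
]
heads = greedy_heads(score)
```

If `has_cycle(heads)` were true, the full algorithm would contract the cycle into a single node and repeat; when it is false, the greedy choice already is the maximum spanning arborescence.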
On languages piecewise testable in the strict sense
In Proceedings of the 11th Meeting of the Association for Mathematics of Language
, 2009
Abstract

Cited by 18 (12 self)
Abstract. In this paper we explore the class of Strictly Piecewise languages, originally introduced to characterize long-distance phonotactic patterns by Heinz [1] as the Precedence Languages. We provide a series of equivalent abstract characterizations, discuss their basic properties, locate them relative to other well-known subregular classes and provide algorithms for translating between the grammars defined here and finite-state automata, as well as an algorithm for deciding whether a regular language is Strictly Piecewise.
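Under the subsequence ("precedence") reading of Strictly Piecewise languages, a grammar is a set of forbidden subsequences and recognition is a simple scan. A sketch assuming that reading (the toy long-distance constraint below is illustrative, not from the paper):

```python
def is_subsequence(pattern, string):
    """True if pattern's symbols occur in string in order, not necessarily adjacently."""
    it = iter(string)
    return all(sym in it for sym in pattern)   # membership consumes the iterator

def accepts(forbidden, string):
    """A Strictly Piecewise grammar accepts a string iff it contains
    none of the grammar's forbidden subsequences."""
    return not any(is_subsequence(p, string) for p in forbidden)

# Toy long-distance constraint: no 's' may precede an 'S' anywhere
# later in the word, i.e. forbid the subsequence "sS".
grammar = ["sS"]
```

Because the check is about subsequences rather than substrings, intervening material never blocks the constraint, which is what makes the class suitable for long-distance phonotactics.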
Mildly context-sensitive dependency languages
In: 45th Annual Meeting of the Association for Computational Linguistics (ACL)
, 2007
Abstract

Cited by 15 (4 self)
Dependency-based representations of natural language syntax require a fine balance between structural flexibility and computational complexity. In previous work, several constraints have been proposed to identify classes of dependency structures that are well-balanced in this sense; the best-known but also most restrictive of these is projectivity. Most constraints are formulated on fully specified structures, which makes them hard to integrate into models where structures are composed from lexical information. In this paper, we show how two empirically relevant relaxations of projectivity can be lexicalized, and how combining the resulting lexicons with a regular means of syntactic composition gives rise to a hierarchy of mildly context-sensitive dependency languages.
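The projectivity constraint mentioned here can be checked directly on a fully specified structure: a dependency tree is projective just in case no two of its arcs cross. A minimal sketch, assuming 1-based word positions with 0 as an artificial root:

```python
def is_projective(heads):
    """heads[d] = head position of word d (words at positions 1..n, root at 0).
    Projective iff no two arcs, drawn as intervals, cross each other."""
    arcs = [(min(h, d), max(h, d)) for d, h in heads.items()]
    for i, (a, b) in enumerate(arcs):
        for (c, e) in arcs[i + 1:]:
            # Two arcs cross iff exactly one endpoint of the second
            # lies strictly inside the span of the first.
            if a < c < b < e or c < a < e < b:
                return False
    return True
```

Relaxations of projectivity such as those the paper lexicalizes permit limited, controlled violations of exactly this no-crossing condition.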