Results 1 - 10 of 238
Wide-coverage efficient statistical parsing with CCG and log-linear models
 COMPUTATIONAL LINGUISTICS
, 2007
Cited by 153 (35 self)
This paper describes a number of log-linear parsing models for an automatically extracted lexicalized grammar. The models are "full" parsing models in the sense that probabilities are defined for complete parses, rather than for independent events derived by decomposing the parse tree. Discriminative training is used to estimate the models, which requires incorrect parses for each sentence in the training data as well as the correct parse. The lexicalized grammar formalism used is Combinatory Categorial Grammar (CCG), and the grammar is automatically extracted from CCGbank, a CCG version of the Penn Treebank. The combination of discriminative training and an automatically extracted grammar leads to a significant memory requirement (over 20 GB), which is satisfied using a parallel implementation of the BFGS optimisation algorithm running on a Beowulf cluster. Dynamic programming over a packed chart, in combination with the parallel implementation, allows us to solve one of the largest-scale estimation problems in the statistical parsing literature in under three hours. A key component of the parsing system, for both training and testing, is a Maximum Entropy supertagger which assigns CCG lexical categories to words in a sentence. The supertagger makes the discriminative training feasible, and also leads to a highly efficient parser. Surprisingly,
Combinators and Grammars
 Categorial Grammars and Natural Language Structures
, 1988
Cited by 84 (3 self)
The term Categorial Grammar (CG) names a group of theories of natural language syntax and semantics in which the main responsibility for defining syntactic form is borne by the lexicon. CG is therefore one of the oldest and purest examples of a class of “lexicalized” theories of grammar which also includes HPSG, LFG, TAG, Montague Grammar, Relational Grammar and certain recent versions of the Chomskean theory. The various modern versions of CG are characterized by a much freer notion of derivational syntactic structure than is assumed under most other formal or generative theories of grammar. All forms of CG also follow Montague 1974 in sharing a strong commitment to the Principle of Compositionality, that is, to the assumption that syntax and interpretation are homomorphically related, and may be derived in tandem. Significant contributions have been made by Categorial Grammarians to the study of semantics, syntax, morphology, intonational phonology, computational linguistics and human sentence processing. Since the problem of formalizing the grammar of natural languages was first defined in its modern form in the 1950’s, there have been two styles. Chomsky 1957 and much subsequent work in generative grammar begins by capturing the basic facts of English constituent order exemplified in (1) in a Context-free Phrase Structure Grammar (CFPSG) or system of rewrite rules or “productions” like (2), which have their origin in early work in recursion theory by Post, among others.
Multi-Modal Combinatory Categorial Grammar
, 2003
Cited by 61 (21 self)
The paper shows how Combinatory Categorial Grammar (CCG) can be adapted to take advantage of the extra resource-sensitivity provided by the Categorial Type Logic framework. The resulting reformulation, Multi-Modal CCG, supports lexically specified control over the applicability of combinatory rules, permitting a universal rule component and shedding the need for language-specific restrictions on rules. We discuss some of the linguistic motivation for these changes, define the Multi-Modal CCG system and demonstrate how it works on some basic examples. We furthermore outline some possible extensions and address computational aspects of Multi-Modal CCG.
Coupling CCG and Hybrid Logic Dependency Semantics
 IN PROC. ACL 2002
, 2002
Cited by 53 (25 self)
Categorial grammar has traditionally used the λ-calculus to represent meaning. We present an alternative, dependency-based perspective on linguistic meaning and situate it in the computational setting. This perspective is formalized in terms of hybrid logic and has a rich yet perspicuous propositional ontology that enables a wide variety of semantic phenomena to be represented in a single meaning formalism. Finally,
Substructural Logics on Display
, 1998
Cited by 38 (16 self)
Substructural logics are traditionally obtained by dropping some or all of the structural rules from Gentzen's sequent calculi LK or LJ. It is well known that the usual logical connectives then split into more than one connective. Alternatively, one can start with the (intuitionistic) Lambek calculus, which contains these multiple connectives, and obtain numerous logics like: exponential-free linear logic, relevant logic, BCK logic, and intuitionistic logic, in an incremental way. Each of these logics also has a classical counterpart, and some also have a "cyclic" counterpart. These logics have been studied extensively and are quite well understood. Generalising further, one can start with intuitionistic Bi-Lambek logic, which contains the dual of every connective from the Lambek calculus. The addition of the structural rules then gives Bi-linear, Bi-relevant, Bi-BCK and Bi-intuitionistic logic, again in an incremental way. Each of these logics also has a classical counterpart, and som...
CCGbank: A Corpus of CCG Derivations and Dependency Structures Extracted from the Penn Treebank
, 2007
Cited by 31 (0 self)
This article presents an algorithm for translating the Penn Treebank into a corpus of Combinatory Categorial Grammar (CCG) derivations augmented with local and long-range word–word dependencies. The resulting corpus, CCGbank, includes 99.4% of the sentences in the Penn Treebank. It is available from the Linguistic Data Consortium, and has been used to train wide-coverage statistical parsers that obtain state-of-the-art rates of dependency recovery. In order to obtain linguistically adequate CCG analyses, and to eliminate noise and inconsistencies in the original annotation, an extensive analysis of the constructions and annotations in the Penn Treebank was called for, and a substantial number of changes to the Treebank were necessary. We discuss the implications of our findings for the extraction of other linguistically expressive grammars from the Treebank, and for the design of future treebanks.
Remnant Movement and Complexity
, 1998
Cited by 28 (7 self)
this paper both Lex and F are finite. Given a grammar that is formalized in this way, we will consider: What is the expressive power of grammars that derive the verbal complexes (what sets are definable)? What is the structural complexity of the verbal complexes in these grammars? Since structure building is driven by lexical features, is there a useful representation of derivations as graphs of feature checking relations? [Footnote 2: Keenan and Stabler (1996) define languages this way in order to consider, for example, what relations are preserved by automorphisms of L that fix F, just as in semantics we can consider the similar question: what is preserved by automorphisms of E that fix E,(2, #). But in this paper, languages are defined as closures just for simplicity.] Expressive power is familiar but expression complexity is perhaps not so familiar, so we quickly review the ideas we will use. What is the size of a sentence like the following? every student criticized some teacher When comparing sentences, one possible measure is simply the count of the characters, the length of the sequence. In this case, we have 37 characters (counting the spaces between words). To get a slightly more universal measure, it is common to consider how many binary choices are required to specify each element of the sequence. In the ASCII coding scheme, each character is specified by 7 binary choices, 7 bits. So in the ASCII coding scheme, the sequence is represented with 259 bits. When we have a grammar that includes the sentence, we make available another way of specifying the sentence. The sentence can be specified by specifying its shortest derivation. Let's see how this works. Suppose we have the following grammar G = ⟨Lex, F⟩, with 8 lexical items consisting of strin...
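The character-count measure in the excerpt above can be checked with a short sketch. The function and constant names here are hypothetical and chosen only for illustration; the 7-bits-per-character figure is the one the excerpt gives for ASCII. At 37 characters and 7 bits each, the sentence comes to 259 bits.

```python
# Illustrative sketch only: names are hypothetical, not taken from the paper.
SENTENCE = "every student criticized some teacher"
BITS_PER_ASCII_CHAR = 7  # 7-bit ASCII, as stated in the excerpt


def char_count(s: str) -> int:
    """Length of the sequence, counting spaces between words."""
    return len(s)


def ascii_bits(s: str) -> int:
    """Number of binary choices needed to spell out s in 7-bit ASCII."""
    return len(s) * BITS_PER_ASCII_CHAR


if __name__ == "__main__":
    print(char_count(SENTENCE), "characters")
    print(ascii_bits(SENTENCE), "bits")
```

This covers only the surface-string measure; the derivation-based measure the excerpt goes on to introduce depends on the grammar G, which is cut off in this listing.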
Incremental processing and acceptability
 Computational Linguistics
, 2000
Cited by 26 (4 self)
We describe a left-to-right incremental procedure for the processing of Lambek categorial grammar by proof net construction. A simple metric of complexity, the profile in time of the number of unresolved valencies, correctly predicts a wide variety of performance phenomena including garden pathing, the unacceptability of center embedding, preference for lower attachment, left-to-right quantifier scope preference, and heavy noun phrase shift.
Polarization and abstraction of grammatical formalisms as methods for lexical disambiguation
 In CoLing’2004, 2004
, 2004
"... for lexical disambiguation ..."