Results 1 - 10 of 131
Wide-coverage efficient statistical parsing with CCG and log-linear models
Computational Linguistics, 2007
Cited by 87 (20 self)
This paper describes a number of log-linear parsing models for an automatically extracted lexicalized grammar. The models are "full" parsing models in the sense that probabilities are defined for complete parses, rather than for independent events derived by decomposing the parse tree. Discriminative training is used to estimate the models, which requires incorrect parses for each sentence in the training data as well as the correct parse. The lexicalized grammar formalism used is Combinatory Categorial Grammar (CCG), and the grammar is automatically extracted from CCGbank, a CCG version of the Penn Treebank. The combination of discriminative training and an automatically extracted grammar leads to a significant memory requirement (over 20 GB), which is satisfied using a parallel implementation of the BFGS optimisation algorithm running on a Beowulf cluster. Dynamic programming over a packed chart, in combination with the parallel implementation, allows us to solve one of the largest-scale estimation problems in the statistical parsing literature in under three hours. A key component of the parsing system, for both training and testing, is a Maximum Entropy supertagger which assigns CCG lexical categories to words in a sentence. The supertagger makes the discriminative training feasible, and also leads to a highly efficient parser. Surprisingly,
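As a rough illustration of what a "full" log-linear parsing model means, the sketch below scores each complete candidate parse by a weighted feature sum and normalises over all candidates for the sentence. It enumerates candidates explicitly, whereas the abstract describes dynamic programming over a packed chart; the feature names and the helper functions `parse_probabilities` and `log_likelihood_gradient` are hypothetical, chosen only for this example.

```python
import math

def parse_probabilities(feature_vectors, weights):
    """Return P(parse | sentence) for each candidate parse under a log-linear model."""
    scores = [
        sum(weights.get(name, 0.0) * value for name, value in features.items())
        for features in feature_vectors
    ]
    # Subtract the maximum score before exponentiating for numerical stability.
    top = max(scores)
    exps = [math.exp(score - top) for score in scores]
    normaliser = sum(exps)
    return [e / normaliser for e in exps]

def log_likelihood_gradient(feature_vectors, gold_index, weights):
    """Gradient of log P(gold parse | sentence): observed minus expected feature values."""
    probs = parse_probabilities(feature_vectors, weights)
    gradient = dict(feature_vectors[gold_index])
    for prob, features in zip(probs, feature_vectors):
        for name, value in features.items():
            gradient[name] = gradient.get(name, 0.0) - prob * value
    return gradient

# Two hypothetical candidate parses of one sentence; index 0 is the correct one.
candidates = [
    {"dep:likes->Warren": 1.0, "rule:application": 2.0},
    {"dep:likes->Dexter": 1.0, "rule:application": 2.0},
]
weights = {}  # all-zero weights give a uniform distribution over candidates
probs = parse_probabilities(candidates, weights)
grad = log_likelihood_gradient(candidates, 0, weights)
```

The gradient shows why discriminative training needs the incorrect parses as well as the correct one: the expected feature values are computed over every candidate for the sentence.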
Categorial Grammar
1998
Cited by 76 (3 self)
tem of rewrite rules or "productions" like (2), which have their origin in early work in recursion theory by Post, among others. (1) Dexter likes Warren. (2) S → NP VP; VP → TV NP; TV → {likes, sees, ...} Categorial Grammar (CG), together with its close cousin Dependency Grammar (which also originated in the 1950s, in work by Tesnière) stems from an alternative approach to context-free grammar pioneered by Bar-Hillel 1953 and Lambek 1958, with earlier antecedents in Ajdukiewicz 1935 and still earlier work by Husserl and Russell in category theory and the theory of types. Categorial Grammars capture the same information by associating a functional type or category with all grammatical entities. For example, all transitive verbs are associated via the lexicon with a category that can be written as follows: (3) likes := (S\NP)/NP The no
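A minimal sketch of how a functional category such as (S\NP)/NP combines with its arguments, assuming the result-leftmost slash notation and only the two application rules. The class names `Atom` and `Functor` and the functions `forward_apply` and `backward_apply` are hypothetical names for this illustration, not taken from the paper.

```python
from dataclasses import dataclass
from typing import Optional, Union

@dataclass(frozen=True)
class Atom:
    name: str

@dataclass(frozen=True)
class Functor:
    result: "Category"
    slash: str  # "/" seeks its argument to the right, "\\" seeks it to the left
    arg: "Category"

Category = Union[Atom, Functor]

def forward_apply(left: Category, right: Category) -> Optional[Category]:
    """X/Y followed by Y yields X; otherwise None."""
    if isinstance(left, Functor) and left.slash == "/" and left.arg == right:
        return left.result
    return None

def backward_apply(left: Category, right: Category) -> Optional[Category]:
    """Y followed by X\\Y yields X; otherwise None."""
    if isinstance(right, Functor) and right.slash == "\\" and right.arg == left:
        return right.result
    return None

S, NP = Atom("S"), Atom("NP")
LEXICON = {
    "Dexter": NP,
    "Warren": NP,
    "likes": Functor(Functor(S, "\\", NP), "/", NP),  # (S\NP)/NP
}

# "likes Warren" combines first, giving S\NP; "Dexter" then combines from the left.
verb_phrase = forward_apply(LEXICON["likes"], LEXICON["Warren"])
sentence = backward_apply(LEXICON["Dexter"], verb_phrase)
```

Here the lexical category of the verb does the work that the productions S → NP VP and VP → TV NP do in the rewrite system.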
Multi-Modal Combinatory Categorial Grammar
2003
Cited by 49 (16 self)
The paper shows how Combinatory Categorial Grammar (CCG) can be adapted to take advantage of the extra resource-sensitivity provided by the Categorial Type Logic framework. The resulting reformulation, Multi-Modal CCG, supports lexically specified control over the applicability of combinatory rules, permitting a universal rule component and shedding the need for language-specific restrictions on rules. We discuss some of the linguistic motivation for these changes, define the Multi-Modal CCG system and demonstrate how it works on some basic examples. We furthermore outline some possible extensions and address computational aspects of Multi-Modal CCG.
Type Logical Grammar
1994
Cited by 47 (0 self)
The canonical linguistic process is the cycle of the speech-circuit [Saussure, 1915]. A speaker expresses a psychological idea by means of a physiological articulation. The signal is transmitted through the medium by a physical process incident on a hearer who from the consequent physiological impression recovers the psychological idea. The hearer may then reply, swapping the roles of speaker and hearer, and so the circuit cycles. For communication to be successful speakers and hearers must have shared associations between forms (signifiers) and meanings (signifieds). De Saussure called such a pairing of signifier and signified a sign. The relation is one-to-many (ambiguity) and many-to-one (paraphrase). Let us call a stable totality of such associations a language. It would be arbitrary to propose that there is a longest expression (where would we propose to cut off I know that you know that I know that you know...?) therefore language is an infinite abstraction over the finite number of acts of communication that can ever occur. The program of formal syntax [Chomsky, 1957] is to define the set of all and only
Coupling CCG and Hybrid Logic Dependency Semantics
In Proc. ACL 2002, 2002
Cited by 41 (16 self)
Categorial grammar has traditionally used the λ-calculus to represent meaning. We present an alternative, dependency-based perspective on linguistic meaning and situate it in the computational setting. This perspective is formalized in terms of hybrid logic and has a rich yet perspicuous propositional ontology that enables a wide variety of semantic phenomena to be represented in a single meaning formalism. Finally,
Substructural Logics on Display
1998
Cited by 36 (16 self)
Substructural logics are traditionally obtained by dropping some or all of the structural rules from Gentzen's sequent calculi LK or LJ. It is well known that the usual logical connectives then split into more than one connective. Alternatively, one can start with the (intuitionistic) Lambek calculus, which contains these multiple connectives, and obtain numerous logics like: exponential-free linear logic, relevant logic, BCK logic, and intuitionistic logic, in an incremental way. Each of these logics also has a classical counterpart, and some also have a "cyclic" counterpart. These logics have been studied extensively and are quite well understood. Generalising further, one can start with intuitionistic Bi-Lambek logic, which contains the dual of every connective from the Lambek calculus. The addition of the structural rules then gives Bi-linear, Bi-relevant, Bi-BCK and Bi-intuitionistic logic, again in an incremental way. Each of these logics also has a classical counterpart, and som...
Remnant Movement and Complexity
1998
Cited by 24 (7 self)
this paper both Lex and F are finite. Given a grammar that is formalized in this way, we will consider [Footnote 2: Keenan and Stabler (1996) define languages this way in order to consider, for example, what relations are preserved by automorphisms of L that fix F, just as in semantics we can consider the similar question: what is preserved by automorphisms of E that fix E,(2, #). But in this paper, languages are defined as closures just for simplicity.] What is the expressive power of grammars that derive the verbal complexes? (what sets are definable) What is the structural complexity of the verbal complexes in these grammars? Since structure building is driven by lexical features, is there a useful representation of derivations as graphs of feature checking relations? Expressive power is familiar but expression complexity is perhaps not so familiar, so we quickly review the ideas we will use. What is the size of a sentence like the following? every student criticized some teacher When comparing sentences, one possible measure is simply the count of the characters, the length of the sequence. In this case, we have 37 characters (counting the spaces between words). To get a slightly more universal measure, it is common to consider how many binary choices are required to specify each element of the sequence. In the ASCII coding scheme, each character is specified by 7 binary choices, 7 bits. So in the ASCII coding scheme, the sequence is represented with 259 bits. [Footnote 3] When we have a grammar that includes the sentence, we make available another way of specifying the sentence. The sentence can be specified by specifying its shortest derivation. Let's see how this works. Suppose we have the following grammar G = ⟨Lex, F⟩, with 8 lexical items consisting of strin...
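The size calculation in this excerpt can be checked directly. The short sketch below counts the characters of the example sentence and multiplies by 7 bits per ASCII character, giving 37 × 7 = 259 bits; the variable names are chosen only for this illustration.

```python
sentence = "every student criticized some teacher"

BITS_PER_ASCII_CHAR = 7  # each ASCII character is specified by 7 binary choices

# Length of the sequence, counting the spaces between words.
n_chars = len(sentence)

# Total number of binary choices needed to specify the sequence in ASCII.
n_bits = n_chars * BITS_PER_ASCII_CHAR

print(n_chars, n_bits)
```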
Incremental processing and acceptability
Computational Linguistics, 2000
Cited by 17 (3 self)
We describe a left-to-right incremental procedure for the processing of Lambek categorial grammar by proof net construction. A simple metric of complexity, the profile in time of the number of unresolved valencies, correctly predicts a wide variety of performance phenomena including garden pathing, the unacceptability of center embedding, preference for lower attachment, left-to-right quantifier scope preference, and heavy noun phrase shift.
Memoisation of Categorial Proof Nets: Parallelism in Categorial Processing
1996
Cited by 16 (2 self)
We introduce a method of memoisation of categorial proof nets. Exploiting the planarity of non-commutative proof nets, and unifiability as a correctness criterion, parallelism is simulated through construction of a proof net matrix of most general unifiers for modules, in a manner analogous to the Cocke-Younger-Kasami algorithm for context free grammar.
1 Introduction
If the evolutionary tendency of grammatical formalisms could be summed up in one word, that word could well be lexicalism. The lexicon was once considered the locus of all and only idiosyncratic information; it may be hard now to find any proponents at all of such a view. Rather, one hears of the balance or tradeoff between lexicon and syntax: the tenet that the lexicon should comprise only what is idiosyncratic is, simply, no longer held. The notion that there is a compromise to be struck between lexicon and syntax is in turn rejected in the...
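For readers unfamiliar with the Cocke-Younger-Kasami algorithm that the abstract uses as its point of comparison, here is a minimal recogniser for a context-free grammar in Chomsky normal form. It illustrates only the tabular, span-by-span memoisation idea, not the paper's proof net matrix; the toy grammar and the function name `cyk_recognize` are assumptions made for this sketch.

```python
from itertools import product

def cyk_recognize(words, lexical_rules, binary_rules, start="S"):
    """Return True if the grammar derives the word sequence from the start symbol."""
    n = len(words)
    if n == 0:
        return False
    # chart[i][j] memoises the set of nonterminals that derive words[i:j].
    chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, word in enumerate(words):
        chart[i][i + 1] = set(lexical_rules.get(word, ()))
    # Fill in longer spans from shorter ones, trying every split point k.
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for left, right in product(chart[i][k], chart[k][j]):
                    chart[i][j] |= binary_rules.get((left, right), set())
    return start in chart[0][n]

# Hypothetical toy grammar: S -> NP VP, VP -> TV NP, plus three lexical entries.
LEXICAL = {"Dexter": {"NP"}, "Warren": {"NP"}, "likes": {"TV"}}
BINARY = {("NP", "VP"): {"S"}, ("TV", "NP"): {"VP"}}

accepted = cyk_recognize("Dexter likes Warren".split(), LEXICAL, BINARY)
rejected = cyk_recognize("likes Dexter Warren".split(), LEXICAL, BINARY)
```

Each chart cell is computed once and reused by every larger span that contains it, which is the memoisation property the abstract draws on.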
Remnant Movement and Structural Complexity
Constraints and Resources in Natural Language, Studies in Logic, Language and Information. CSLI, 1998
Cited by 12 (0 self)
In some recent efforts to reduce the theoretical machinery of transformational syntax, all structures have the underlying order "specifier-head-complement", all movement is leftward, feature-driven, phrasal, and overt. With these developments, the movement of constituents from which material has already been extracted, "remnant movement," is increasingly common. This paper shows that these restricted transformational frameworks remain very expressive in terms of their generative power, and that although the structures they define look complex and require new parsing strategies, their coding complexity is no higher than that of traditional analyses. Furthermore, the derivations of these structures have a simplicity which is revealed by representing them as graphs of matching pairs (feature checking relations), as is done in the "proof nets" of the type logical tradition.

