Results 1 - 10
of
65
Stochastic Inversion Transduction Grammars and Bilingual Parsing of Parallel Corpora
, 1997
"... ..."
Wide-coverage efficient statistical parsing with CCG and log-linear models
- COMPUTATIONAL LINGUISTICS
, 2007
"... This paper describes a number of log-linear parsing models for an automatically extracted lexicalized grammar. The models are "full" parsing models in the sense that probabilities are defined for complete parses, rather than for independent events derived by decomposing the parse tree. Discriminativ ..."
Abstract
-
Cited by 87 (20 self)
- Add to MetaCart
This paper describes a number of log-linear parsing models for an automatically extracted lexicalized grammar. The models are "full" parsing models in the sense that probabilities are defined for complete parses, rather than for independent events derived by decomposing the parse tree. Discriminative training is used to estimate the models, which requires incorrect parses for each sentence in the training data as well as the correct parse. The lexicalized grammar formalism used is Combinatory Categorial Grammar (CCG), and the grammar is automatically extracted from CCGbank, a CCG version of the Penn Treebank. The combination of discriminative training and an automatically extracted grammar leads to a significant memory requirement (over 20 GB), which is satisfied using a parallel implementation of the BFGS optimisation algorithm running on a Beowulf cluster. Dynamic programming over a packed chart, in combination with the parallel implementation, allows us to solve one of the largest-scale estimation problems in the statistical parsing literature in under three hours. A key component of the parsing system, for both training and testing, is a Maximum Entropy supertagger which assigns CCG lexical categories to words in a sentence. The supertagger makes the discriminative training feasible, and also leads to a highly efficient parser. Surprisingly,
Parsing Inside-Out
, 1998
"... Probabilistic Context-Free Grammars (PCFGs) and variations on them have recently become some of the most common formalisms for parsing. It is common with PCFGs to compute the inside and outside probabilities. When these probabilities are multiplied together and normalized, they produce the probabili ..."
Abstract
-
Cited by 65 (2 self)
- Add to MetaCart
Probabilistic Context-Free Grammars (PCFGs) and variations on them have recently become some of the most common formalisms for parsing. It is common with PCFGs to compute the inside and outside probabilities. When these probabilities are multiplied together and normalized, they produce the probability that any given non-terminal covers any piece of the input sentence. The traditional use of these probabilities is to improve the probabilities of grammar rules. In this thesis we show that these values are useful for solving many other problems in Statistical Natural Language Processing. We give a framework for describing parsers. The framework generalizes the inside and outside values to semirings. It makes it easy to describe parsers that compute a wide variety of interesting quantities, including the inside and outside probabilities, as well as related quantities such as Viterbi probabilities and n-best lists. We also present three novel uses for the inside and outside probabilities. T...
Practical Unification-based Parsing of Natural Language
, 1993
"... The thesis describes novel techniques and algorithms for the practical parsing of realistic Natural Language (NL) texts with a wide-coverage unification-based grammar of English. The thesis tackles two of the major problems in this area: firstly, the fact that parsing realistic inputs with such gr ..."
Abstract
-
Cited by 46 (7 self)
- Add to MetaCart
The thesis describes novel techniques and algorithms for the practical parsing of realistic Natural Language (NL) texts with a wide-coverage unification-based grammar of English. The thesis tackles two of the major problems in this area: firstly, the fact that parsing realistic inputs with such grammars can be computationally very expensive, and secondly, the observation that many analyses are often assigned to an input, only one of which usually forms the basis of the correct interpretation. The thesis starts by presenting a new unification algorithm, justifies why it is well-suited to practical NL parsing, and describes a bottom-up active chart parser which employs this unification algorithm together with several other novel processing and optimisation techniques. Empirical results demonstrate that an implementation of this parser has significantly better practical
Parsing and hypergraphs
- In IWPT
, 2001
"... While symbolic parsers can be viewed as deduction systems, this view is less natural for probabilistic parsers. We present a view of parsing as directed hypergraph analysis which naturally covers both symbolic and probabilistic parsing. We illustrate the approach by showing how a dynamic extension o ..."
Abstract
-
Cited by 42 (3 self)
- Add to MetaCart
While symbolic parsers can be viewed as deduction systems, this view is less natural for probabilistic parsers. We present a view of parsing as directed hypergraph analysis which naturally covers both symbolic and probabilistic parsing. We illustrate the approach by showing how a dynamic extension of Dijkstra’s algorithm can be used to construct a probabilistic chart parser with an Ç Ò time bound for arbitrary PCFGs, while preserving as much of the flexibility of symbolic chart parsers as allowed by the inherent ordering of probabilistic dependencies. 1
Recognition can be Harder than Parsing
- Computational Intelligence
, 1992
"... this paper is to discuss the scope and limitations of this approach, and to examine the suitability of several syntactic formalisms on the criterion of their ability to handle it. 2 Parsing as intersection ..."
Abstract
-
Cited by 34 (0 self)
- Add to MetaCart
this paper is to discuss the scope and limitations of this approach, and to examine the suitability of several syntactic formalisms on the criterion of their ability to handle it. 2 Parsing as intersection
Relating complexity to practical performance in parsing with wide-coverage unification grammars
, 1994
"... The paper demonstrates that exponential complexities with respect to grammar size and input length have little impact on the performance of three unification-based parsing algorithms, using a wide-coverage grammar. The results imply that tile study and optimisation of unification-based parsing must ..."
Abstract
-
Cited by 30 (6 self)
- Add to MetaCart
The paper demonstrates that exponential complexities with respect to grammar size and input length have little impact on the performance of three unification-based parsing algorithms, using a wide-coverage grammar. The results imply that tile study and optimisation of unification-based parsing must rely on empirical data until complexity theory can more accurately predict the practical behaviour of such parsers.
Parsing Incomplete Sentences
, 1988
"... An efficient context-free parsing algorithln is preseuted that can parse sentences with unknown parts of unknown length. It produc in finite form all possible parses (often infinite in number) that could account for the missing parts. The algorithm is a variation on the construction due to Earl ..."
Abstract
-
Cited by 24 (2 self)
- Add to MetaCart
An efficient context-free parsing algorithln is preseuted that can parse sentences with unknown parts of unknown length. It produc in finite form all possible parses (often infinite in number) that could account for the missing parts. The algorithm is a variation on the construction due to Earley. ltowever, its presentation is such that it can readily be adapted to any chart parsing schema (top- down, bottom-up, etc...).
A graph grammar approach to graphical parsing
- In Proc. 1995 IEEE Symposium Visual Languages
, 1995
"... We present a new graph grammar based approach for defining the syntax of visual languages and for generating visual language parsers. Its main advantage ⎯ in comparison to other visual language parsing approaches ⎯ is its ability to handle context-sensitive productions which may replace more than on ..."
Abstract
-
Cited by 23 (2 self)
- Add to MetaCart
We present a new graph grammar based approach for defining the syntax of visual languages and for generating visual language parsers. Its main advantage ⎯ in comparison to other visual language parsing approaches ⎯ is its ability to handle context-sensitive productions which may replace more than one non-terminal at the same time and which may contain very complex context requirements. Its implementation will be part of a forthcoming parsing toolkit for visual languages and the already existing graph grammar programming environment PROGRES. In reading the proceedings of the past VL workshops or any book on software engineering one cannot help but notice that a large variety of visual languages exists of which only a few

