Generalized Probabilistic LR Parsing of Natural Language (Corpora) with Unification-Based Grammars
Computational Linguistics, 1993
"... ..."
Practical Unification-based Parsing of Natural Language
1993
"... The thesis describes novel techniques and algorithms for the practical parsing of realistic Natural Language (NL) texts with a widecoverage unificationbased grammar of English. The thesis tackles two of the major problems in this area: firstly, the fact that parsing realistic inputs with such gr ..."
Abstract

Cited by 48 (7 self)
The thesis describes novel techniques and algorithms for the practical parsing of realistic Natural Language (NL) texts with a wide-coverage unification-based grammar of English. The thesis tackles two of the major problems in this area: firstly, the fact that parsing realistic inputs with such grammars can be computationally very expensive, and secondly, the observation that many analyses are often assigned to an input, only one of which usually forms the basis of the correct interpretation. The thesis starts by presenting a new unification algorithm, justifies why it is well-suited to practical NL parsing, and describes a bottom-up active chart parser which employs this unification algorithm together with several other novel processing and optimisation techniques. Empirical results demonstrate that an implementation of this parser has significantly better practical
Robust Stochastic Parsing Using the Inside-Outside Algorithm
1992
"... this paper, we discuss the application of the Viterbi algorithm and the BaumWelch algorithm (in wide use for speech recognition) to the parsing problem and describe a recent experiment designed to produce a simple, robust, probabilistic parser which selects an appropriate analysis frequently enough ..."
Abstract

Cited by 41 (0 self)
In this paper, we discuss the application of the Viterbi algorithm and the Baum-Welch algorithm (in wide use for speech recognition) to the parsing problem and describe a recent experiment designed to produce a simple, robust, probabilistic parser which selects an appropriate analysis frequently enough to be useful and deals effectively with the problem of undergeneration. We focus on the application of these stochastic algorithms here because, although other statistically based approaches have been proposed (e.g. Sampson et al., 1989; Garside & Leech, 1985; Magerman & Marcus, 1991a,b), these appear most promising as they are computationally tractable (in principle) and well-integrated with formal language / automata theory. The Viterbi algorithm and Baum-Welch algorithm are optimised algorithms (with polynomial computational complexity) which can be used in conjunction with stochastic regular grammars (finite-state automata, i.e. (hidden) Markov models; Baum, 1972) and with probabilistic context-free grammars (Baker, 1982; Fujisaki
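The Viterbi search the abstract refers to can be sketched for a toy hidden Markov model. This is a minimal illustration, not the paper's parser; the two-tag state set and all probabilities below are invented for the example:

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Most probable state sequence for an observation sequence (toy HMM)."""
    # V[t][s]: probability of the best state path ending in state s at time t
    V = [{s: start_p[s] * emit_p[s][obs[0]] for s in states}]
    back = [{}]
    for t in range(1, len(obs)):
        V.append({})
        back.append({})
        for s in states:
            # Best predecessor state for s at time t
            prev = max(states, key=lambda r: V[t - 1][r] * trans_p[r][s])
            V[t][s] = V[t - 1][prev] * trans_p[prev][s] * emit_p[s][obs[t]]
            back[t][s] = prev
    # Backtrace from the most probable final state
    last = max(V[-1], key=V[-1].get)
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    return list(reversed(path)), V[-1][last]

# Hypothetical two-tag example: N(oun) vs. V(erb)
states = ["N", "V"]
start_p = {"N": 0.6, "V": 0.4}
trans_p = {"N": {"N": 0.3, "V": 0.7}, "V": {"N": 0.8, "V": 0.2}}
emit_p = {"N": {"dog": 0.9, "barks": 0.1}, "V": {"dog": 0.2, "barks": 0.8}}
path, p = viterbi(["dog", "barks"], states, start_p, trans_p, emit_p)
```

The dynamic program runs in time polynomial in sentence length and state-set size, which is the tractability property the abstract highlights.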
An Efficient Implementation of a New DOP Model
In EACL, 2003
"... Two apparently opposing DOP models exist in the literature: one which computes the parse tree involving the most frequent subtrees from a treebank and one which computes the parse tree involving the fewest subtrees from a treebank. This paper proposes an integration of the two models which ou ..."
Abstract

Cited by 30 (6 self)
Two apparently opposing DOP models exist in the literature: one which computes the parse tree involving the most frequent subtrees from a treebank and one which computes the parse tree involving the fewest subtrees from a treebank. This paper proposes an integration of the two models which outperforms each of them separately. Together with a PCFG-reduction of DOP we obtain improved accuracy and efficiency on the Wall Street Journal treebank. Our results show an 11% relative reduction in error rate over previous models, and an average processing time of 3.6 seconds per WSJ sentence.
Efficient Parsing of DOP with PCFG-reductions
2003
"... Contents R. Bod, R. Scha and K. Sima'an PART I: The Basic DataOriented Parsing Model 1. A DOP model for phrasestructure trees R. Bod and R. Scha 2. Probability models for DOP 3. Encoding frequency information in stochastic parsing models J. Carroll and D. Weir PART II: Computational Is ..."
Abstract

Cited by 23 (0 self)
Contents (R. Bod, R. Scha and K. Sima'an)
PART I: The Basic Data-Oriented Parsing Model
1. A DOP model for phrase-structure trees (R. Bod and R. Scha)
2. Probability models for DOP
3. Encoding frequency information in stochastic parsing models (J. Carroll and D. Weir)
PART II: Computational Issues
1. Computational complexity of disambiguation under DOP
2. Parsing DOP with Monte Carlo techniques (J. Chappelier and M. Rajman)
3. Towards efficient Monte Carlo parsing
4. Efficient parsing of DOP with PCFG-reductions (J. Goodman)
5. An approximation of DOP through memory-based learning (G. de Pauw)
6. Compositional partial parsing by memory-based sequence learning (I. Dagan and Y. Krymolowsky)
PART III: Richer Models
1. A head-driven data-oriented approach to lexical dependency
2. A DOP model for Lexical-Functional Grammar representations (R. Bod and R. Kaplan)
3. A data-driven approach to Head-driven Phrase-Structure (G. Neumann)
4. Tree-Adjoining Grammars and its applic
Efficient Disambiguation by means of Stochastic Tree Substitution Grammars
1994
"... In Stochastic Tree Substitution Grammars (STSGs), a parse(tree) of an input sentence can be generated by (exponentially) many derivations. Each of these derivations is the result of a different combination of STSG elementarytrees and therefore receives a distinct probability; the probability of the ..."
Abstract

Cited by 21 (9 self)
In Stochastic Tree Substitution Grammars (STSGs), a parse (tree) of an input sentence can be generated by (exponentially) many derivations. Each of these derivations is the result of a different combination of STSG elementary trees and therefore receives a distinct probability; the probability of the parse is defined as the sum of the probabilities of all derivations which generate that parse. Therefore, some methods of Stochastic Context-Free Grammars (SCFGs), e.g. the Viterbi algorithm for finding the most probable parse (MPP) of an input sentence, are not applicable to STSGs. In this paper we study the problem of efficient disambiguation by means of STSGs under the Data Oriented Parsing model (DOP) [Bod, 1993c]. We present polynomial algorithms for computing the probability of a parse and the probability of an input sentence and its most probable derivation (MPD). In addition, we present a Viterbi-like optimization technique for search algorithms for the MPP. A major concern in desi...
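The distinction the abstract draws between the most probable parse (a sum over derivations) and the most probable derivation (a single product) can be illustrated with a minimal sketch. As a simplifying assumption, each derivation is represented only by the list of probabilities of the elementary trees it combines; the trees themselves are abstracted away:

```python
from math import prod

def derivation_prob(derivation):
    # A derivation's probability is the product of its elementary-tree probabilities
    return prod(derivation)

def parse_prob(derivations):
    # The parse probability is the sum over all derivations that generate it
    return sum(derivation_prob(d) for d in derivations)

def most_probable_derivation(derivations):
    # The MPD maximises a product, while the MPP maximises a sum of such
    # products; this is why standard SCFG Viterbi search does not transfer
    return max(derivations, key=derivation_prob)

# Two hypothetical derivations of the same parse tree
derivations = [[0.6, 0.3], [0.1]]
```

Here the parse's probability is 0.6 * 0.3 + 0.1 = 0.28, strictly larger than the probability of either derivation alone.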
Probabilistic Normalisation and Unpacking of Packed Parse Forests for Unification-based Grammars
In Proceedings of the AAAI Fall Symposium on Probabilistic Approaches to Natural Language, 1992
"... The research described below forms part of a wider programme to develop a practical parser for naturallyoccurring natural language input which is capable of returning the nbest syntacticallydeterminate analyses, containing that which is semantically and pragmatically most appropriate (preferably ..."
Abstract

Cited by 18 (3 self)
The research described below forms part of a wider programme to develop a practical parser for naturally-occurring natural language input which is capable of returning the n-best syntactically-determinate analyses, containing that which is semantically and pragmatically most appropriate (preferably as the highest ranked) from the exponential (in sentence length) syntactically legitimate possibilities (Church & Patil 1983), which can frequently run into the thousands with realistic sentences and grammars. We have opted to develop a domain-independent solution to this problem based on integrating statistical Markov modelling techniques, which offer the potential for rapid tuning to different sublanguages / corpora on the basis of supervised training, with linguistically-adequate grammatical (language) models, capable of returning analyses detailed enough to support semantic interpretation.
Data-Oriented Language Processing: An Overview
Corpus-Based Methods in Language and Speech Processing, 1997
"... Dataoriented models of language processing embody the assumption that human language perception and production works with representations of concrete past language experiences, rather than with abstract grammar rules. Such models therefore maintain large corpora of linguistic representations of pre ..."
Abstract

Cited by 15 (2 self)
Data-oriented models of language processing embody the assumption that human language perception and production works with representations of concrete past language experiences, rather than with abstract grammar rules. Such models therefore maintain large corpora of linguistic representations of previously occurring utterances. When processing a new input utterance, analyses of this utterance are constructed by combining fragments from the corpus; the occurrence frequencies of the fragments are used to estimate which analysis is the most probable one. This paper motivates the idea of data-oriented language processing by considering the problem of syntactic disambiguation. One relatively simple parsing/disambiguation model that implements this idea is described in some detail. This model assumes a corpus of utterances annotated with labelled phrase-structure trees, and parses new input by combining subtrees from the corpus; it selects the most probable parse of an input utterance by considering the sum of the probabilities of all its derivations. The paper discusses some experiments carried out with this model. Finally, it reviews some other models that instantiate the data-oriented processing approach. Many of these models also employ labelled phrase-structure trees, but use different criteria for extracting subtrees from the corpus or employ different disambiguation strategies; other models use richer formalisms for their corpus annotations.
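The frequency-based estimate described above (fragment probabilities derived from corpus counts) is commonly implemented as a relative frequency conditioned on the fragment's root label. The sketch below is a toy illustration under that assumption; the `(root_label, fragment)` pair representation is a placeholder, not the paper's actual data structure:

```python
from collections import Counter

def subtree_probs(subtrees):
    """Relative-frequency probabilities for treebank fragments.

    subtrees: a list of (root_label, fragment) pairs extracted from a corpus.
    Each fragment's probability is its frequency divided by the total
    frequency of fragments sharing the same root label, so the
    probabilities of all fragments with a given root sum to one.
    """
    counts = Counter(subtrees)
    root_totals = Counter(root for root, _ in subtrees)
    return {t: n / root_totals[t[0]] for t, n in counts.items()}

# Hypothetical extracted fragments: two distinct S-rooted ones, one NP-rooted
probs = subtree_probs([("S", "a"), ("S", "a"), ("S", "b"), ("NP", "x")])
```

A derivation then combines such fragments, and its probability is the product of the fragment probabilities, matching the disambiguation criterion the abstract describes.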
The Problem of Computing the Most Probable Tree in Data-Oriented Parsing and Stochastic Tree Grammars
 In Proceedings of the Seventh Conference of the European Chapter of the ACL
"... We deal with the question as to whether there exists a polynomial time algorithm for computing the most probable parse tree of a sentence generated by a dataoriented parsing (DOP) model. (Scha, 1990; Bod, 1992, 1993a). Therefore we describe DOP as a stochastic treesubstitution grammar (STSG) ..."
Abstract

Cited by 14 (2 self)
We deal with the question as to whether there exists a polynomial time algorithm for computing the most probable parse tree of a sentence generated by a data-oriented parsing (DOP) model (Scha, 1990; Bod, 1992, 1993a). To this end, we describe DOP as a stochastic tree-substitution grammar (STSG). In STSG, a tree can be generated by exponentially many derivations involving different elementary trees. The probability of a tree is equal to the sum of the probabilities of all its derivations.
Snippet Search: a Single Phrase Approach to Text Access
In Proceedings of the 1991 Joint Statistical Meetings, American Statistical Association, 1991
"... this paper. In the worst case, the inner loop of this algorithm is executed ..."
Abstract

Cited by 14 (1 self)
this paper. In the worst case, the inner loop of this algorithm is executed