Results 1  10
of
43
HeadDriven Statistical Models for Natural Language Parsing
, 1999
"... Mitch Marcus was a wonderful advisor. He gave consistently good advice, and allowed an ideal level of intellectual freedom in pursuing ideas and research topics. I would like to thank the members of my thesis committee Aravind Joshi, Mark Liberman, Fernando Pereira and Mark Steedman  for the remar ..."
Abstract

Cited by 955 (16 self)
 Add to MetaCart
Mitch Marcus was a wonderful advisor. He gave consistently good advice, and allowed an ideal level of intellectual freedom in pursuing ideas and research topics. I would like to thank the members of my thesis committee Aravind Joshi, Mark Liberman, Fernando Pereira and Mark Steedman  for the remarkable breadth and depth of their feedback. I had countless impromptu but in uential discussions with Jason Eisner, Dan Melamed and Adwait Ratnaparkhi in the LINC lab. They also provided feedback on many drafts of papers and thesis chapters. Paola Merlo pushed me to think about many new angles of the research. Dimitrios Samaras gave invaluable feedback on many portions of the work. Thanks to James Brooks for his contribution to the work that comprises chapter 5 of this thesis. The community of faculty, students and visitors involved with the Institute for Research in Cognitive Science at Penn provided an intensely varied and stimulating environment. I would like to thank them collectively. Some deserve special mention for discussions that contributed quite directly to this research: Breck Baldwin, Srinivas Bangalore, Dan
Parsing Algorithms and Metrics
 IN PROCEEDINGS OF THE 34TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS
, 1996
"... Many different metrics exist for evaluating parsing results, including Viterbi, Crossing Brackets Rate, Zero Crossing Brackets Rate, and several others. However, most parsing algorithms, including the Viterbi algorithm, attempt to optimize the same metric, namely the probability of getting th ..."
Abstract

Cited by 90 (5 self)
 Add to MetaCart
Many different metrics exist for evaluating parsing results, including Viterbi, Crossing Brackets Rate, Zero Crossing Brackets Rate, and several others. However, most parsing algorithms, including the Viterbi algorithm, attempt to optimize the same metric, namely the probability of getting the correct labelled tree. By choosing a parsing algorithm appropriate for the evaluation metric, better performance can be achieved. We present two new algorithms: the "Labelled Recall Algorithm," which maximizes the expected Labelled Recall Rate, and the "Bracketed Recall Algorithm," which maximizes the Bracketed Recall Rate. Experimental results are given, showing that the two new algorithms have improved performance over the Viterbi algorithm on many criteria, especially the ones that they optimize.
Parsing InsideOut
, 1998
"... Probabilistic ContextFree Grammars (PCFGs) and variations on them have recently become some of the most common formalisms for parsing. It is common with PCFGs to compute the inside and outside probabilities. When these probabilities are multiplied together and normalized, they produce the probabili ..."
Abstract

Cited by 82 (2 self)
 Add to MetaCart
Probabilistic ContextFree Grammars (PCFGs) and variations on them have recently become some of the most common formalisms for parsing. It is common with PCFGs to compute the inside and outside probabilities. When these probabilities are multiplied together and normalized, they produce the probability that any given nonterminal covers any piece of the input sentence. The traditional use of these probabilities is to improve the probabilities of grammar rules. In this thesis we show that these values are useful for solving many other problems in Statistical Natural Language Processing. We give a framework for describing parsers. The framework generalizes the inside and outside values to semirings. It makes it easy to describe parsers that compute a wide variety of interesting quantities, including the inside and outside probabilities, as well as related quantities such as Viterbi probabilities and nbest lists. We also present three novel uses for the inside and outside probabilities. T...
Designing Statistical Language Learners: Experiments on Noun Compounds
, 1995
"... Statistical language learning research takes the view that many traditional natural language processing tasks can be solved by training probabilistic models of language on a sufficient volume of training data. The design of statistical language learners therefore involves answering two questions: (i ..."
Abstract

Cited by 79 (0 self)
 Add to MetaCart
Statistical language learning research takes the view that many traditional natural language processing tasks can be solved by training probabilistic models of language on a sufficient volume of training data. The design of statistical language learners therefore involves answering two questions: (i) Which of the multitude of possible language models will most accurately reflect the properties necessary to a given task? (ii) What will constitute a sufficient volume of training data? Regarding the first question, though a variety of successful models have been discovered, the space of possible designs remains largely unexplored. Regarding the second, exploration of the design space has so far proceeded without an adequate answer. The goal of this thesis is to advance the exploration of the statistical language learning design space. In pursuit of that goal, the thesis makes two main theoretical contributions: it identifies a new class of designs by providing a novel theory of statistical natural language processing, and it presents the foundations for a predictive theory of data requirements to assist in future design explorations. The first of these contributions is called the meaning distributions theory. This theory
A probabilistic corpusdriven model for lexicalfunctional analysis
 Proceedings COLINGACL'98
, 1998
"... rens.bod @ let.uva.nl Wc develop a l)ataOricntcd Parsing (DOP) model based on the syntactic representations of Lexicalf;unctional Grammar (LFG). We start by summarizing the original DOP model for tree representations and then show how it can be extended with corresponding functional structures. ..."
Abstract

Cited by 59 (16 self)
 Add to MetaCart
rens.bod @ let.uva.nl Wc develop a l)ataOricntcd Parsing (DOP) model based on the syntactic representations of Lexicalf;unctional Grammar (LFG). We start by summarizing the original DOP model for tree representations and then show how it can be extended with corresponding functional structures. The resulting LFGDOP model triggers a new, corpusbased notion of grammaticality, and its probability models exhibit interesting behavior with respect to specificity and the interpretation of illformed strings. 1.
Efficient Algorithms for Parsing the DOP Model
, 1996
"... Excellent results have been reported for DataOriented Parsing (DOP) of natural language texts (Bod, 1993c). Unfortunately, existing algorithms are both computationally intensive and difficult to implement. Previous algorithms are expensive due to two factors: the exponential number of rules that mus ..."
Abstract

Cited by 58 (4 self)
 Add to MetaCart
Excellent results have been reported for DataOriented Parsing (DOP) of natural language texts (Bod, 1993c). Unfortunately, existing algorithms are both computationally intensive and difficult to implement. Previous algorithms are expensive due to two factors: the exponential number of rules that must be generated and the use of a Monte Carlo p arsing algorithm. In this paper we solve the first problem by a novel reduction of the DOP model toga small, equivalent probabilistic contextfree grammar. We solve the second problem by a novel deterministic parsing strategy that maximizes the expected number of correct con stituents, rather than the probability of a correct parse tree. Using ithe optimizations, experiments yield a 97% crossing brackets rate and 88% zero crossing brackets rate. This differs significantly from the results reported by Bod, and is compara ble to results from a duplication of Pereira and Schabes's (1992) experiment on the same data. We show that Bod's results are at least partially due to an extremely fortuitous choice of test data, and partially due to using cleaner data than other researchers.
Unsupervised Language Acquisition: Theory and Practice
, 2001
"... In this thesis I present various algorithms for the unsupervised machine learning of aspects of natural languages using a variety of statistical models. The scientific object of the work is to examine the validity of the socalled Argument from the Poverty of the Stimulus advanced in favour of the p ..."
Abstract

Cited by 40 (0 self)
 Add to MetaCart
In this thesis I present various algorithms for the unsupervised machine learning of aspects of natural languages using a variety of statistical models. The scientific object of the work is to examine the validity of the socalled Argument from the Poverty of the Stimulus advanced in favour of the proposition that humans have languagespecific innate knowledge. I start by examining an a priori argument based on Gold's theorem, that purports to prove that natural languages cannot be learned, and some formal issues related to the choice of statistical grammars rather than symbolic grammars. I present three novel algorithms for learning various parts of natural languages: first, an algorithm for the induction of syntactic categories from unlabelled text using distributional information, that can deal with ambiguous and rare words; secondly, a set of algorithms for learning morphological processes in a variety of languages, including languages such as Arabic with nonconcatenative morphology; thirdly an algorithm for the unsupervised induction of a contextfree grammar from tagged text. I carefully examine the interaction between the various components, and show how these algorithms can form the basis for a empiricist model of language acquisition. I therefore conclude that the Argument from the Poverty of the Stimulus is unsupported by the evidence.
A DOP Model for Semantic Interpretation
 Proceedings ACL/EACL97
, 1997
"... In dataoriented language processing, an annotated language corpus is used as a stochastic grammar. The most probable analysis of a new sentence is constructed by combining fragments from the corpus in the most probable way. This approach has been successfully used for syntactic analysis, usi ..."
Abstract

Cited by 37 (14 self)
 Add to MetaCart
In dataoriented language processing, an annotated language corpus is used as a stochastic grammar. The most probable analysis of a new sentence is constructed by combining fragments from the corpus in the most probable way. This approach has been successfully used for syntactic analysis, using corpora with syntactic annota tions such as the Penn Treebank. If a cor pus with semantically annotated sentences is used, the same approach can also gen erate the most probable semantic interpretation of an input sentence. The present paper explains this semantic interpretation method. A dataoriented semantic inter pretation algorithm was tested on two semantically annotated corpora: the English ATIS corpus and the Dutch OVIS corpus.
Parsing with the Shortest Derivation
 Proceedings COLING2000
, 2000
"... tens @ scs.lecd s.ac.uk Common wisdom has it that tile bias of stochastic grammars in favor of shorter deriwttions of a sentence is hamfful and should be redressed. We show that the common wisdom is wrong for stochastic grammars that use elementary trees instead o1 ' conlextl'ree rules, such as Sto ..."
Abstract

Cited by 36 (14 self)
 Add to MetaCart
tens @ scs.lecd s.ac.uk Common wisdom has it that tile bias of stochastic grammars in favor of shorter deriwttions of a sentence is hamfful and should be redressed. We show that the common wisdom is wrong for stochastic grammars that use elementary trees instead o1 ' conlextl'ree rules, such as Stochastic TreeSubstitution Grammars used by DataOriented Parsing models. For such grammars a nonprobabilistic metric based on tile shortest derivation outperforms a probabilistic metric on the ATIS and OVIS corpora, while it obtains competitive results on the Wall Street Journal (WSJ) corpus. This paper also contains the first publislmd experiments with DOP on the WSJ. 1.