Learning Accurate, Compact, and Interpretable Tree Annotation
 In ACL ’06
, 2006
Cited by 291 (37 self)
We present an automatic approach to tree annotation in which basic nonterminal symbols are alternately split and merged to maximize the likelihood of a training treebank. Starting with a simple X-bar grammar, we learn a new grammar whose nonterminals are subsymbols of the original nonterminals. In contrast with previous work, we are able to split various terminals to different degrees, as appropriate to the actual complexity in the data. Our grammars automatically learn the kinds of linguistic distinctions exhibited in previous work on manual tree annotation. On the other hand, our grammars are much more compact and substantially more accurate than previous work on automatic annotation. Despite its simplicity, our best grammar achieves an F1 of 90.2% on the Penn Treebank, higher than fully lexicalized systems.
Improved Inference for Unlexicalized Parsing
, 2007
Cited by 187 (24 self)
We present several improvements to unlexicalized parsing with hierarchically state-split PCFGs. First, we present a novel coarse-to-fine method in which a grammar’s own hierarchical projections are used for incremental pruning, including a method for efficiently computing projections of a grammar without a treebank. In our experiments, hierarchical pruning greatly accelerates parsing with no loss in empirical accuracy. Second, we compare various inference procedures for state-split PCFGs from the standpoint of risk minimization, paying particular attention to their practical tradeoffs. Finally, we present multilingual experiments which show that parsing with hierarchical state-splitting is fast and accurate in multiple languages and domains, even without any language-specific tuning.
Parsing Inside-Out
, 1998
Cited by 85 (2 self)
Probabilistic Context-Free Grammars (PCFGs) and variations on them have recently become some of the most common formalisms for parsing. It is common with PCFGs to compute the inside and outside probabilities. When these probabilities are multiplied together and normalized, they produce the probability that any given nonterminal covers any piece of the input sentence. The traditional use of these probabilities is to improve the probabilities of grammar rules. In this thesis we show that these values are useful for solving many other problems in Statistical Natural Language Processing. We give a framework for describing parsers. The framework generalizes the inside and outside values to semirings. It makes it easy to describe parsers that compute a wide variety of interesting quantities, including the inside and outside probabilities, as well as related quantities such as Viterbi probabilities and n-best lists. We also present three novel uses for the inside and outside probabilities. T...
Probabilistic CFG with Latent Annotations
, 2005
Cited by 71 (1 self)
This paper defines a generative probabilistic model of parse trees, which we call PCFG-LA. This model is an extension of PCFG in which nonterminal symbols are augmented with latent variables. Fine-grained CFG rules are automatically induced from a parsed corpus by training a PCFG-LA model using an EM-algorithm. Because exact parsing with a PCFG-LA is NP-hard, several approximations are described and empirically compared. In experiments using the Penn WSJ corpus, our automatically trained model gave a performance of 86.6% (F1, sentences ≤ 40 words), which is comparable to that of an unlexicalized PCFG parser created using extensive manual feature selection.
Statistical Machine Translation by Parsing
, 2004
Cited by 65 (7 self)
In an ordinary syntactic parser, the input is a string, and the grammar ranges over strings. This paper explores generalizations of ordinary parsing algorithms that allow the input to consist of string tuples and/or the grammar to range over string tuples. Such algorithms can infer the synchronous structures hidden in parallel texts. It turns out that these generalized parsers can do most of the work required to train and apply a syntax-aware statistical machine translation system.
A DOP Model for Semantic Interpretation
 Proceedings ACL/EACL-97
, 1997
Cited by 37 (14 self)
In data-oriented language processing, an annotated language corpus is used as a stochastic grammar. The most probable analysis of a new sentence is constructed by combining fragments from the corpus in the most probable way. This approach has been successfully used for syntactic analysis, using corpora with syntactic annotations such as the Penn Treebank. If a corpus with semantically annotated sentences is used, the same approach can also generate the most probable semantic interpretation of an input sentence. The present paper explains this semantic interpretation method. A data-oriented semantic interpretation algorithm was tested on two semantically annotated corpora: the English ATIS corpus and the Dutch OVIS corpus.
Parsing with the Shortest Derivation
 Proceedings COLING-2000
, 2000
Cited by 36 (14 self)
Common wisdom has it that the bias of stochastic grammars in favor of shorter derivations of a sentence is harmful and should be redressed. We show that the common wisdom is wrong for stochastic grammars that use elementary trees instead of context-free rules, such as Stochastic Tree-Substitution Grammars used by Data-Oriented Parsing models. For such grammars a non-probabilistic metric based on the shortest derivation outperforms a probabilistic metric on the ATIS and OVIS corpora, while it obtains competitive results on the Wall Street Journal (WSJ) corpus. This paper also contains the first published experiments with DOP on the WSJ.
Novel Estimation Methods for Unsupervised Discovery of Latent Structure in Natural Language Text
, 2006
Cited by 30 (8 self)
This thesis is about estimating probabilistic models to uncover useful hidden structure in data; specifically, we address the problem of discovering syntactic structure in natural language text. We present three new parameter estimation techniques that generalize the standard approach, maximum likelihood estimation, in different ways. Contrastive estimation maximizes the conditional probability of the observed data given a “neighborhood” of implicit negative examples. Skewed deterministic annealing locally maximizes likelihood using a cautious parameter search strategy that starts with an easier optimization problem than likelihood, and iteratively moves to harder problems, culminating in likelihood. Structural annealing is similar, but starts with a heavy bias toward simple syntactic structures and gradually relaxes the bias. Our estimation methods do not make use of annotated examples. We consider their performance in both an unsupervised model selection setting, where models trained under different initialization and regularization settings are compared by evaluating the training objective on a small set of unseen, unannotated development data, and supervised model selection, where the most accurate model on the development set (now with annotations)
An Efficient Implementation of a New DOP Model
 In EACL
, 2003
Cited by 30 (6 self)
Two apparently opposing DOP models exist in the literature: one which computes the parse tree involving the most frequent subtrees from a treebank and one which computes the parse tree involving the fewest subtrees from a treebank. This paper proposes an integration of the two models which outperforms each of them separately. Together with a PCFG-reduction of DOP we obtain improved accuracy and efficiency on the Wall Street Journal treebank. Our results show an 11% relative reduction in error rate over previous models, and an average processing time of 3.6 seconds per WSJ sentence.
Building a Tree-Bank of Modern Hebrew Text
, 2001
Cited by 29 (2 self)
This paper describes the process of building the first treebank for Modern Hebrew texts. A major concern in this process is the need for reducing the cost of manual annotation by the use of automatic means. To this end, the joint utility of an automatic morphological analyzer, a probabilistic parser and a small manually annotated treebank was explored.