Bayesian grammar induction for language modeling
In Proceedings of ACL, 1995
Cited by 49 (2 self)
We describe a corpus-based induction algorithm for probabilistic context-free grammars. The algorithm employs a greedy heuristic search within a Bayesian framework, and a post-pass using the Inside-Outside algorithm. We compare the performance of our algorithm to n-gram models and the Inside-Outside algorithm in three language modeling tasks. In two of the tasks, the training data is generated by a probabilistic context-free grammar and in both tasks our algorithm outperforms the other techniques. The third task involves naturally-occurring data, and in this task our algorithm does not perform as well as n-gram models but vastly outperforms the Inside-Outside algorithm.
Applying Co-Training methods to Statistical Parsing
2001
Cited by 48 (3 self)
We propose a novel Co-Training method for statistical parsing. The algorithm takes as input a small corpus (9695 sentences) annotated with parse trees, a dictionary of possible lexicalized structures for each word in the training set and a large pool of unlabeled text. The algorithm iteratively labels the entire data set with parse trees. Using empirical results based on parsing the Wall Street Journal corpus we show that training a statistical parser on the combined labeled and unlabeled data strongly outperforms training only on the labeled data.
Head Automata and Bilingual Tiling: Translation with Minimal Representations
1996
Cited by 40 (3 self)
We present a language model consisting of a collection of costed bidirectional finite state automata associated with the head words of phrases. The model is suitable for incremental application of lexical associations in a dynamic programming search for optimal dependency tree derivations. We also
Stochastic Lexicalized Context-Free Grammar
1993
Cited by 40 (6 self)
Stochastic lexicalized context-free grammar (SLCFG) is an attractive compromise between the parsing efficiency of stochastic context-free grammar (SCFG) and the lexical sensitivity of stochastic lexicalized tree-adjoining grammar (SLTAG). SLCFG is a restricted form of SLTAG that can only generate context-free languages and can be parsed in cubic time. However, SLCFG retains the lexical sensitivity of SLTAG and is therefore a much better basis for capturing distributional information about words than SCFG.
Can Subcategorisation Probabilities Help a Statistical Parser?
In Proceedings of the 6th ACL/SIGDAT Workshop on Very Large Corpora, 1998
Cited by 39 (5 self)
Research into the automatic acquisition of lexical information from corpora is starting to produce large-scale computational lexicons containing data on the relative frequencies of subcategorisation alternatives for individual verbal predicates. However, the empirical question of whether this type of frequency information can in practice improve the accuracy of a statistical parser has not yet been answered. In this paper we describe an experiment with a wide-coverage statistical grammar and parser for English and subcategorisation frequencies acquired from ten million words of text which shows that this information can significantly improve parse accuracy.
Global Thresholding and Multiple-Pass Parsing
1997
Cited by 36 (3 self)
We present a variation on classic beam thresholding techniques that is up to an order of magnitude faster than the traditional method, at the same performance level. We also present a new thresholding technique, global thresholding, which, combined with the new beam thresholding, gives an additional factor of two improvement, and a novel technique, multiple pass parsing, that can be combined with the others to yield yet another 50% improvement. We use a new search algorithm to simultaneously optimize the thresholding parameters of the various algorithms.
Statistical Parsing With an Automatically-Extracted Tree Adjoining Grammar
2000
Cited by 33 (1 self)
We discuss the advantages of lexicalized tree-adjoining grammar as an alternative to lexicalized PCFG for statistical parsing, describing the induction of a probabilistic LTAG model from the Penn Treebank and evaluating its parsing performance. We find that this induction method is an improvement over the EM-based method of [Hwa, 1998], and that the induced model yields results comparable to lexicalized PCFG.
Generation as Dependency Parsing
In Proceedings of the 40th ACL, 2002
Cited by 33 (3 self)
Natural-Language Generation from flat semantics is an NP-complete problem.
Probabilistic constraint logic programming
1999
Cited by 29 (2 self)
This paper addresses two central problems for probabilistic processing models: parameter estimation from incomplete data and efficient retrieval of most probable analyses. These questions have been answered satisfactorily only for probabilistic regular and context-free models. We address these problems for a more expressive probabilistic constraint logic programming model. We present a log-linear probability model for probabilistic constraint logic programming. On top of this model we define an algorithm to estimate the parameters and to select the properties of log-linear models from incomplete data. This algorithm is an extension of the improved iterative scaling algorithm of Della Pietra, Della Pietra, and Lafferty (1995). Our algorithm applies to log-linear models in general and is accompanied by suitable approximation methods when applied to large data spaces. Furthermore, we present an approach for searching for most probable analyses of the probabilistic constraint logic programming model. This method can be applied to the ambiguity resolution problem in natural language processing applications.

