Results 1-10 of 49
Inside-outside reestimation from partially bracketed corpora
In Proceedings of the 30th Annual Meeting of the ACL, 1992
Cited by 294 (3 self)
The inside-outside algorithm for inferring the parameters of a stochastic context-free grammar is extended to take advantage of constituent information (constituent bracketing) in a partially parsed corpus. Experiments on formal and natural language parsed corpora show that the new algorithm can achieve faster convergence and better modeling of hierarchical structure than the original one. In particular, over 90% test set bracketing accuracy was achieved for grammars inferred by our algorithm from a training set of hand-parsed part-of-speech strings for sentences in the Air Travel Information System spoken language corpus. Finally, the new algorithm has better time complexity than the original one when sufficient bracketing is provided.
An Efficient Probabilistic Context-Free Parsing Algorithm that Computes Prefix Probabilities
Computational Linguistics, 2002
Cited by 218 (5 self)
this article can compute solutions to all four of these problems in a single framework, with a number of additional advantages over previously presented isolated solutions
Introduction to the special issue on computational linguistics using large corpora
Computational Linguistics, 1993
Part-of-Speech Tagging and Partial Parsing
Corpus-Based Methods in Language and Speech, 1996
Cited by 108 (0 self)
m we can carve off next. 'Partial parsing' is a cover term for a range of different techniques for recovering some but not all of the information contained in a traditional syntactic analysis. Partial parsing techniques, like tagging techniques, aim for reliability and robustness in the face of the vagaries of natural text, by sacrificing completeness of analysis and accepting a low but nonzero error rate.
1 Tagging
The earliest taggers [35, 51] had large sets of hand-constructed rules for assigning tags on the basis of words' character patterns and on the basis of the tags assigned to preceding or following words, but they had only small lexica, primarily for exceptions to the rules. TAGGIT [35] was used to generate an initial tagging of the Brown corpus, which was then hand-edited. (Thus it provided the data that has since been used to train other taggers [20].) The tagger described by Garside [56, 34], CLAWS, was a probabilistic version of TAGGIT, and the DeRose tagger improved on
Parsing Inside-Out
1998
Cited by 97 (2 self)
Probabilistic Context-Free Grammars (PCFGs) and variations on them have recently become some of the most common formalisms for parsing. It is common with PCFGs to compute the inside and outside probabilities. When these probabilities are multiplied together and normalized, they produce the probability that any given nonterminal covers any piece of the input sentence. The traditional use of these probabilities is to improve the probabilities of grammar rules. In this thesis we show that these values are useful for solving many other problems in Statistical Natural Language Processing. We give a framework for describing parsers. The framework generalizes the inside and outside values to semirings. It makes it easy to describe parsers that compute a wide variety of interesting quantities, including the inside and outside probabilities, as well as related quantities such as Viterbi probabilities and n-best lists. We also present three novel uses for the inside and outside probabilities. T...
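The statement that inside and outside probabilities, multiplied and normalized, give the probability that a nonterminal covers a span can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not taken from the thesis: the toy grammar (S -> S S with probability 0.4, S -> 'a' with probability 0.6), the function names, and the restriction to Chomsky normal form with ordinary probabilities rather than the semiring generalization the abstract mentions.

```python
from collections import defaultdict

# Hypothetical toy PCFG in Chomsky normal form (illustration only).
BINARY = {("S", "S", "S"): 0.4}   # P(S -> S S)
LEXICAL = {("S", "a"): 0.6}       # P(S -> 'a')


def inside(words, binary, lexical):
    """beta[i, j, A]: probability that A derives words[i:j]."""
    n = len(words)
    beta = defaultdict(float)
    for i, w in enumerate(words):
        for (a, word), p in lexical.items():
            if word == w:
                beta[i, i + 1, a] += p
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length
            for k in range(i + 1, j):
                for (a, b, c), p in binary.items():
                    beta[i, j, a] += p * beta[i, k, b] * beta[k, j, c]
    return beta


def outside(words, binary, beta, start):
    """alpha[i, j, A]: probability of everything outside words[i:j] given A there."""
    n = len(words)
    alpha = defaultdict(float)
    alpha[0, n, start] = 1.0
    # Visit parent spans from longest to shortest so each parent's
    # outside value is complete before it is pushed down to its children.
    for length in range(n, 1, -1):
        for i in range(n - length + 1):
            j = i + length
            for k in range(i + 1, j):
                for (a, b, c), p in binary.items():
                    parent = alpha[i, j, a]
                    if parent == 0.0:
                        continue
                    alpha[i, k, b] += parent * p * beta[k, j, c]
                    alpha[k, j, c] += parent * p * beta[i, k, b]
    return alpha


def span_posterior(words, binary, lexical, start, i, j, label):
    """P(label covers words[i:j] | sentence) = alpha * beta / P(sentence)."""
    beta = inside(words, binary, lexical)
    alpha = outside(words, binary, beta, start)
    return alpha[i, j, label] * beta[i, j, label] / beta[0, len(words), start]


sentence = ["a", "a", "a"]
left_pair = span_posterior(sentence, BINARY, LEXICAL, "S", 0, 2, "S")
```

For the sentence "a a a" this grammar has two equally likely parses, ((a a) a) and (a (a a)), so `left_pair` is the posterior probability that the first two words form a constituent.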
Bayesian grammar induction for language modeling
In Proceedings of ACL, 1995
Cited by 59 (1 self)
We describe a corpus-based induction algorithm for probabilistic context-free grammars. The algorithm employs a greedy heuristic search within a Bayesian framework, and a post-pass using the Inside-Outside algorithm. We compare the performance of our algorithm to n-gram models and the Inside-Outside algorithm in three language modeling tasks. In two of the tasks, the training data is generated by a probabilistic context-free grammar and in both tasks our algorithm outperforms the other techniques. The third task involves naturally-occurring data, and in this task our algorithm does not perform as well as n-gram models but vastly outperforms the Inside-Outside algorithm.
Induction of probabilistic synchronous tree-insertion grammars for machine translation
Proceedings of the 7th Conference of the Association for Machine Translation in the Americas (AMTA 2006), 2006
Cited by 27 (3 self)
The more expressive and flexible a base formalism for machine translation is, the less efficient parsing of it will be. However, even among formalisms with the same parse complexity, some formalisms better realize the desired characteristics for machine translation formalisms than others. We introduce a particular formalism, probabilistic synchronous tree-insertion grammar (PSTIG) that we argue satisfies the desiderata optimally within the class of formalisms that can be parsed no less efficiently than context-free grammars and demonstrate that it outperforms state-of-the-art word-based and phrase-based finite-state translation models on training and test data taken from the EuroParl corpus (Koehn, 2005). We then argue that a higher level of translation quality can be achieved by hybridizing our induced model with elementary structures produced using supervised techniques such as those of Groves et al. (2004).
M.: Probability and statistics in computational linguistics, a brief review
In: Mathematical foundations of speech and language processing, 2003
Cited by 15 (1 self)
Computational linguistics studies the computational processes involved in language learning, production, and comprehension. Computational linguists believe that the essence of these processes (in humans and machines) is a computational manipulation of information. Computational psycholinguistics
Integrating Language Models with Speech Recognition
In Proceedings of the AAAI-94 Workshop on the Integration of Natural Language and Speech Processing, 1994
Cited by 14 (5 self)
The question of how to integrate language models with speech recognition systems is becoming more important as speech recognition technology matures. For the purposes of this paper, we have classified the level of integration of current and past approaches into three categories: tightly-coupled, loosely-coupled, or semi-coupled systems. We then argue that loose coupling is more appropriate given the current state of the art and given that it allows one to measure more precisely which components of the language model are most important. We will detail how the speech component in our approach interacts with the language model and discuss why we chose our language model.
1 Introduction
State of the art speech recognition systems achieve high recognition accuracies only on tasks that have low perplexities. The perplexity of a task is, roughly speaking, the average number of choices at any decision point. The perplexity of a task is at a minimum when the true language model is known and co...
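The reading of perplexity as "the average number of choices at any decision point" can be checked with a small Python sketch. It uses the standard definition (2 raised to the average negative log2 probability per token), which is an assumption here since the excerpt does not give a formula; the function name and the uniform eight-way example are hypothetical.

```python
import math


def perplexity(token_probs):
    """Perplexity of a model, given the probability it assigned to each observed token.

    Standard definition: 2 ** (-(1/N) * sum(log2 p_i)).
    """
    if not token_probs:
        raise ValueError("need at least one token probability")
    avg_log2 = sum(math.log2(p) for p in token_probs) / len(token_probs)
    return 2 ** (-avg_log2)


# Hypothetical example: a model that always spreads its probability
# evenly over 8 possible next words.
uniform_eight = perplexity([1 / 8] * 5)
```

Under a uniform choice among 8 words at every step, the result is a perplexity of 8, matching the "average number of choices" intuition; a model that concentrates probability on the words that actually occur scores lower.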