Results 1–10 of 97

Generation and Synchronous Tree-Adjoining Grammars
, 1990
"... Treeadjoining grammars (TAG) have been proposed as a formalism for generation based on the intuition that the extended domain of syntactic locality that TAGs provide should aid in localizing semantic dependencies as well, in turn serving as an aid to generation from semantic representations. We dem ..."
Abstract

Cited by 635 (39 self)
 Add to MetaCart
Tree-adjoining grammars (TAG) have been proposed as a formalism for generation based on the intuition that the extended domain of syntactic locality that TAGs provide should aid in localizing semantic dependencies as well, in turn serving as an aid to generation from semantic representations. We demonstrate that this intuition can be made concrete by using the formalism of synchronous tree-adjoining grammars. The use of synchronous TAGs for generation provides solutions to several problems with previous approaches to TAG generation. Furthermore, the semantic monotonicity requirement previously advocated for generation grammars as a computational aid is seen to be an inherent property of synchronous TAGs.
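To make the "syntax and semantics built in lockstep" intuition concrete, here is a minimal Python sketch of synchronized derivation. The tuple encoding, node names, and the restriction to substitution (full TAG derivations also use adjunction) are illustrative assumptions, not the paper's formalism.

```python
# Toy synchronous pairing: elementary structures are *pairs* of trees whose
# substitution sites are linked, so composing the syntax tree simultaneously
# composes the semantic tree. Hypothetical encoding, substitution only.

# Trees are nested tuples (label, child, ...); leaves are strings.
# Linked substitution sites carry matching indices: "NP@1" pairs with "ARG@1".
PAIR_LOVES = (("S", "NP@1", ("VP", "loves", "NP@2")),
              ("love", "ARG@1", "ARG@2"))
PAIR_JOHN = ("John", "john'")
PAIR_MARY = ("Mary", "mary'")

def substitute(tree, site, replacement):
    """Replace every occurrence of `site` in `tree` by `replacement`."""
    if tree == site:
        return replacement
    if isinstance(tree, tuple):
        return tuple(substitute(t, site, replacement) for t in tree)
    return tree

def pair_substitute(pair, link, subpair):
    """Substitute at a *linked* site: syntax and semantics compose together."""
    syn, sem = pair
    return (substitute(syn, f"NP@{link}", subpair[0]),
            substitute(sem, f"ARG@{link}", subpair[1]))

syn, sem = pair_substitute(pair_substitute(PAIR_LOVES, 1, PAIR_JOHN), 2, PAIR_MARY)
print(syn)  # ('S', 'John', ('VP', 'loves', 'Mary'))
print(sem)  # ('love', "john'", "mary'")
```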
Dynamic Bayesian Networks: Representation, Inference and Learning
, 2002
"... Modelling sequential data is important in many areas of science and engineering. Hidden Markov models (HMMs) and Kalman filter models (KFMs) are popular for this because they are simple and flexible. For example, HMMs have been used for speech recognition and biosequence analysis, and KFMs have bee ..."
Abstract

Cited by 564 (3 self)
 Add to MetaCart
Modelling sequential data is important in many areas of science and engineering. Hidden Markov models (HMMs) and Kalman filter models (KFMs) are popular for this because they are simple and flexible. For example, HMMs have been used for speech recognition and biosequence analysis, and KFMs have been used for problems ranging from tracking planes and missiles to predicting the economy. However, HMMs and KFMs are limited in their “expressive power”. Dynamic Bayesian Networks (DBNs) generalize HMMs by allowing the state space to be represented in factored form, instead of as a single discrete random variable. DBNs generalize KFMs by allowing arbitrary probability distributions, not just (unimodal) linear-Gaussian. In this thesis, I will discuss how to represent many different kinds of models as DBNs, how to perform exact and approximate inference in DBNs, and how to learn DBN models from sequential data.
In particular, the main novel technical contributions of this thesis are as follows: a way of representing hierarchical HMMs as DBNs, which enables inference to be done in O(T) time instead of O(T³), where T is the length of the sequence; an exact smoothing algorithm that takes O(log T) space instead of O(T); a simple way of using the junction tree algorithm for online inference in DBNs; new complexity bounds on exact online inference in DBNs; a new deterministic approximate inference algorithm called factored frontier; an analysis of the relationship between the BK algorithm and loopy belief propagation; a way of applying Rao-Blackwellised particle filtering to DBNs in general, and the SLAM (simultaneous localization and mapping) problem in particular; a way of extending the structural EM algorithm to DBNs; and a variety of different applications of DBNs. However, perhaps the main value of the thesis is its catholic presentation of the field of sequential data modelling.
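As a concrete reference point for the models being generalized, the sketch below implements exact filtering in a two-state HMM with the standard forward recursion; the model and its numbers are hypothetical. A DBN replaces the single discrete state variable here with a set of factored variables.

```python
# Exact filtering in the simplest DBN, an HMM, via the forward algorithm.
# Toy two-state model; runtime is O(T * K^2), i.e. linear in sequence length.
import numpy as np

A = np.array([[0.9, 0.1],    # transition matrix: A[i, j] = P(x_t = j | x_{t-1} = i)
              [0.2, 0.8]])
B = np.array([[0.7, 0.3],    # emission matrix: B[i, y] = P(y_t = y | x_t = i)
              [0.1, 0.9]])
pi = np.array([0.5, 0.5])    # initial state distribution

def forward_filter(obs):
    """Return the filtered belief P(x_t | y_1..t) for each time step t."""
    alpha = pi * B[:, obs[0]]
    alpha /= alpha.sum()
    beliefs = [alpha]
    for y in obs[1:]:
        alpha = (A.T @ alpha) * B[:, y]   # predict forward, then condition on y
        alpha /= alpha.sum()              # normalize to a proper distribution
        beliefs.append(alpha)
    return np.array(beliefs)

print(forward_filter([0, 0, 1, 1, 1]))
```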
Inside-outside reestimation from partially bracketed corpora
 In Proceedings of the 30th Annual Meeting of the ACL
, 1992
"... The insideoutside algorithm for inferring the parameters of a stochastic contextfree grammar is extended to take advantage of constituent information (constituent bracketing) in a partially parsed corpus. Experiments on formal and natural language parsed corpora show that the new algorithm can ach ..."
Abstract

Cited by 275 (3 self)
 Add to MetaCart
The inside-outside algorithm for inferring the parameters of a stochastic context-free grammar is extended to take advantage of constituent information (constituent bracketing) in a partially parsed corpus. Experiments on formal and natural language parsed corpora show that the new algorithm can achieve faster convergence and better modeling of hierarchical structure than the original one. In particular, over 90% test set bracketing accuracy was achieved for grammars inferred by our algorithm from a training set of hand-parsed part-of-speech strings for sentences in the Air Travel Information System spoken language corpus. Finally, the new algorithm has better time complexity than the original one when sufficient bracketing is provided.
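The key constraint the abstract describes, restricting reestimation to constituents compatible with the partial bracketing, can be sketched as a span filter; the indexing conventions below are mine, not the paper's pseudocode.

```python
# A candidate constituent span (i, j) contributes probability mass during
# inside-outside reestimation only if it crosses no bracket of the partial
# parse (nesting is fine, overlap without nesting is not).

def crosses(i, j, k, l):
    """True if spans (i, j) and (k, l) overlap without one nesting the other."""
    return (i < k < j < l) or (k < i < l < j)

def span_allowed(i, j, brackets):
    """Keep the candidate span (i, j) iff it crosses no given bracket."""
    return not any(crosses(i, j, k, l) for (k, l) in brackets)

brackets = [(0, 3), (3, 5)]          # partial bracketing of a 5-word sentence
print(span_allowed(1, 3, brackets))  # True: nested inside (0, 3)
print(span_allowed(2, 4, brackets))  # False: crosses both brackets
```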
Three New Probabilistic Models for Dependency Parsing: An Exploration
, 1996
"... After presenting a novel O(n³) parsing algorithm for dependency grammar, we develop three contrasting ways to stochasticize it. We propose (a) a lexical affinity model where words struggle to modify each other, (b) a sense tagging model where words fluctuate randomly in their selectional prefe ..."
Abstract

Cited by 254 (12 self)
 Add to MetaCart
After presenting a novel O(n³) parsing algorithm for dependency grammar, we develop three contrasting ways to stochasticize it. We propose (a) a lexical affinity model where words struggle to modify each other, (b) a sense tagging model where words fluctuate randomly in their selectional preferences, and (c) a generative model where the speaker fleshes out each word's syntactic and conceptual structure without regard to the implications for the hearer. We also give preliminary empirical results from evaluating the three models' parsing performance on annotated Wall Street Journal training text (derived from the Penn Treebank). In these results, the generative model performs significantly better than the others, and does about equally well at assigning part-of-speech tags.
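To give a flavor of model (a), the lexical affinity idea, here is a toy scorer that rates a candidate dependency tree by smoothed head-to-modifier attachment probabilities; the counts, smoothing scheme, and vocabulary size are hypothetical stand-ins for the paper's actual models.

```python
# Score a dependency tree as a product of P(modifier | head) attachment
# probabilities, estimated from (hypothetical) treebank counts.
from collections import Counter
import math

counts = Counter({("ate", "John"): 3, ("ate", "apple"): 5,
                  ("apple", "an"): 4, ("ate", "an"): 1})
head_totals = Counter()
for (h, m), c in counts.items():
    head_totals[h] += c

def affinity(head, mod, alpha=0.5, vocab=1000):
    """Add-alpha smoothed P(mod | head) over a nominal vocabulary size."""
    return (counts[(head, mod)] + alpha) / (head_totals[head] + alpha * vocab)

def tree_logprob(edges):
    """Log-probability of a dependency tree given as (head, modifier) edges."""
    return sum(math.log(affinity(h, m)) for h, m in edges)

good = [("ate", "John"), ("ate", "apple"), ("apple", "an")]
bad  = [("ate", "John"), ("ate", "an"), ("apple", "an")]
print(tree_logprob(good) > tree_logprob(bad))  # True: plausible attachments win
```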
Recognition of visual activities and interactions by stochastic parsing
 IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
, 2000
"... This paper describes a probabilistic syntactic approach to the detection and recognition of temporally extended activities and interactions between multiple agents. The fundamental idea is to divide the recognition problem into two levels. The lower level detections are performed using standard inde ..."
Abstract

Cited by 236 (5 self)
 Add to MetaCart
This paper describes a probabilistic syntactic approach to the detection and recognition of temporally extended activities and interactions between multiple agents. The fundamental idea is to divide the recognition problem into two levels. The lower-level detections are performed using standard independent probabilistic event detectors to propose candidate detections of low-level features. The outputs of these detectors provide the input stream for a stochastic context-free grammar parsing mechanism. The grammar and parser provide longer-range temporal constraints, disambiguate uncertain low-level detections, and allow the inclusion of a priori knowledge about the structure of temporal events in a given domain. To achieve such a system we: 1) provide techniques for generating a discrete symbol stream from continuous low-level detectors; 2) extend stochastic context-free parsing to handle uncertainty in the input symbol stream; 3) augment a run-time parsing algorithm to enforce inter-symbol constraints, such as requiring temporal consistency between primitives; and 4) extend the consistency filtering to maintain consistent multi-object interactions. We develop a real-time system and demonstrate the approach in several experiments on gesture recognition and in video surveillance. In the surveillance application, we show how the system correctly interprets activities of multiple, interacting objects.
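The division of labor described above, uncertain low-level detections rescored by higher-level structure, can be illustrated with a deliberately tiny stand-in: a flat set of activity patterns playing the role of the grammar. The detectors, patterns, and probabilities below are hypothetical.

```python
# Low-level detectors emit *uncertain* symbols; a structural prior over
# symbol sequences (a toy, flat stand-in for a stochastic grammar) rescores
# them, so structure can override a weak detector.

# Each detection: {candidate_symbol: detector_likelihood}
stream = [{"enter": 0.9, "exit": 0.1},
          {"pickup": 0.4, "putdown": 0.6},
          {"exit": 0.8, "enter": 0.2}]

# Allowed activity patterns with prior probabilities
patterns = {("enter", "pickup", "exit"): 0.7,
            ("enter", "putdown", "exit"): 0.3}

def best_interpretation(stream, patterns):
    """Pick the pattern maximizing prior * product of detector likelihoods."""
    scored = {}
    for pattern, prior in patterns.items():
        p = prior
        for det, sym in zip(stream, pattern):
            p *= det.get(sym, 0.0)
        scored[pattern] = p
    return max(scored.items(), key=lambda kv: kv[1])

print(best_interpretation(stream, patterns))
```

With these numbers the structural prior overrides the detector's weak preference for "putdown" at step 2, which is the disambiguation role the grammar plays in the full system.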
An Efficient Probabilistic Context-Free Parsing Algorithm that Computes Prefix Probabilities
 Computational Linguistics
, 2002
"... this article can compute solutions to all four of these problems in a single flamework, with a number of additional advantages over previously presented isolated solutions ..."
Abstract

Cited by 188 (6 self)
 Add to MetaCart
… this article can compute solutions to all four of these problems in a single framework, with a number of additional advantages over previously presented isolated solutions.
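For intuition about what a prefix probability is (the quantity in the title), the brute-force sketch below sums the probability of every sentence of a toy PCFG that starts with a given prefix. Stolcke's algorithm computes this efficiently with an Earley-style chart; this exhaustive enumeration only terminates for non-recursive grammars and is purely illustrative.

```python
# Brute-force prefix probability for a tiny, non-recursive PCFG.
import itertools

rules = {"S":  [(["NP", "VP"], 1.0)],
         "NP": [(["the", "dog"], 0.6), (["the", "cat"], 0.4)],
         "VP": [(["barks"], 0.7), (["sleeps"], 0.3)]}

def expansions(symbol):
    """Yield (word_list, probability) for every derivation of `symbol`."""
    if symbol not in rules:               # terminal symbol
        yield [symbol], 1.0
        return
    for rhs, p in rules[symbol]:
        parts = [list(expansions(s)) for s in rhs]
        for combo in itertools.product(*parts):
            words = [w for ws, _ in combo for w in ws]
            prob = p
            for _, q in combo:
                prob *= q
            yield words, prob

def prefix_probability(prefix):
    """Total probability of all sentences beginning with `prefix`."""
    return sum(p for words, p in expansions("S")
               if words[:len(prefix)] == list(prefix))

print(prefix_probability(["the", "dog"]))           # ≈ 0.6
print(prefix_probability(["the", "dog", "barks"]))  # ≈ 0.42
```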
The Bayesian image retrieval system, PicHunter: Theory, implementation, and psychophysical experiments
 IEEE TRANSACTIONS ON IMAGE PROCESSING
, 2000
"... This paper presents the theory, design principles, implementation, and performance results of PicHunter, a prototype contentbased image retrieval (CBIR) system that has been developed over the past three years. In addition, this document presents the rationale, design, and results of psychophysica ..."
Abstract

Cited by 181 (2 self)
 Add to MetaCart
This paper presents the theory, design principles, implementation, and performance results of PicHunter, a prototype content-based image retrieval (CBIR) system that has been developed over the past three years. In addition, this document presents the rationale, design, and results of psychophysical experiments that were conducted to address some key issues that arose during PicHunter’s development. The PicHunter project makes four primary contributions to research on content-based image retrieval. First, PicHunter represents a simple instance of a general Bayesian framework we describe for using relevance feedback to direct a search. With an explicit model of what users would do, given what target image they want, PicHunter uses Bayes’s rule to predict what target they want, given their actions. This is done via a probability distribution over possible image targets, rather than by refining a query. Second, an entropy-minimizing display algorithm is described that attempts to maximize the information obtained from a user at each iteration of the search. Third, PicHunter makes use of hidden annotation rather than a possibly inaccurate or inconsistent annotation structure that the user must learn and make queries in. Finally, PicHunter introduces two experimental paradigms to quantitatively evaluate the performance of the system, and psychophysical experiments are presented that support the theoretical claims.
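The first contribution, Bayesian relevance feedback, can be sketched as a belief update over which database image is the target. The feature vectors and the softmax user model below are hypothetical stand-ins for PicHunter's actual components.

```python
# Maintain a distribution over which database image is the user's target and
# update it with Bayes' rule after each click on a displayed image.
import numpy as np

rng = np.random.default_rng(0)
features = rng.random((50, 8))          # 50 database images, 8-dim features

def p_action_given_target(chosen, shown, target, beta=5.0):
    """Toy user model: probability the user clicks `chosen` among `shown`
    when `target` is what they want; softmax over similarity to the target."""
    sims = -np.linalg.norm(features[shown] - features[target], axis=1)
    w = np.exp(beta * sims)
    return w[shown.index(chosen)] / w.sum()

def update(belief, chosen, shown):
    """Bayes' rule: P(target | action) is proportional to
    P(action | target) * P(target)."""
    like = np.array([p_action_given_target(chosen, shown, t)
                     for t in range(len(belief))])
    post = like * belief
    return post / post.sum()

belief = np.full(len(features), 1.0 / len(features))   # uniform prior
belief = update(belief, chosen=3, shown=[3, 17, 42, 8])
print(belief.argmax(), belief.max())  # posterior now favors images like #3
```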
Two decades of statistical language modeling: Where do we go from here?
 Proceedings of the IEEE
, 2000
"... Statistical Language Models estimate the distribution of various natural language phenomena for the purpose of speech recognition and other language technologies. Since the first significant model was proposed in 1980, many attempts have been made to improve the state of the art. We review them here ..."
Abstract

Cited by 147 (1 self)
 Add to MetaCart
Statistical language models estimate the distribution of various natural language phenomena for the purpose of speech recognition and other language technologies. Since the first significant model was proposed in 1980, many attempts have been made to improve the state of the art. We review them here, point to a few promising directions, and argue for a Bayesian approach to integration of linguistic theories with data.
Statistical language modeling (SLM) is the attempt to capture regularities of natural language for the purpose of improving the performance of various natural language applications. By and large, statistical language modeling amounts to estimating the probability distribution of various linguistic units, such as words, sentences, and whole documents. Statistical language modeling is crucial for a large variety of language technology applications. These include speech recognition (where SLM got its start), machine translation, document classification and routing, optical character recognition, information retrieval, handwriting recognition, spelling correction, and many more. In machine translation, for example, purely statistical approaches have been introduced in [1], but even researchers using rule-based approaches have found it beneficial to introduce some elements of SLM and statistical estimation [2]. In information retrieval, a language modeling approach was recently proposed by [3], and a statistical/information-theoretical approach was developed by [4]. SLM employs statistical estimation techniques using language training data, that is, text. Because of the categorical nature of language, and the large vocabularies people naturally use, statistical techniques must estimate a large number of parameters, and consequently depend critically on the availability of large amounts of training data.
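As a minimal instance of the kind of model the survey covers, here is a bigram language model with add-one smoothing; the corpus is a toy stand-in.

```python
# Bigram language model with add-one (Laplace) smoothing.
from collections import Counter
import math

corpus = "the cat sat on the mat the dog sat on the log".split()
vocab = set(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))   # counts of adjacent word pairs
unigrams = Counter(corpus[:-1])              # counts of bigram left contexts

def p_bigram(w2, w1):
    """Add-one smoothed P(w2 | w1)."""
    return (bigrams[(w1, w2)] + 1) / (unigrams[w1] + len(vocab))

def sentence_logprob(words):
    return sum(math.log(p_bigram(b, a)) for a, b in zip(words, words[1:]))

print(sentence_logprob("the cat sat".split()))
print(sentence_logprob("sat the cat".split()))  # lower: unseen bigram
```

Real systems replace add-one smoothing with better-performing schemes such as Katz back-off or Kneser-Ney, and use far larger n-gram orders and corpora.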
Inducing probabilistic grammars by Bayesian model merging
 In: Int. Conf. Grammatical Inference. URL: citeseer.nj.nec.com/stolcke94inducing.html
, 1994
"... We describe a framework for inducing probabilistic grammars from corpora of positive samples. First, samples are incorporated by adding adhoc rules to a working grammar; subsequently, elements of the model (such as states or nonterminals) are merged to achieve generalization and a more compact repr ..."
Abstract

Cited by 130 (0 self)
 Add to MetaCart
We describe a framework for inducing probabilistic grammars from corpora of positive samples. First, samples are incorporated by adding ad hoc rules to a working grammar; subsequently, elements of the model (such as states or nonterminals) are merged to achieve generalization and a more compact representation. The choice of what to merge and when to stop is governed by the Bayesian posterior probability of the grammar given the data, which formalizes a trade-off between a close fit to the data and a default preference for simpler models (‘Occam’s Razor’). The general scheme is illustrated using three types of probabilistic grammars: hidden Markov models, class-based n-grams, and stochastic context-free grammars.
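The merge-and-score loop can be made concrete on a deliberately simple model class: a partition of symbols in which each class spreads its probability uniformly over its members. The model, prior penalty, and greedy search below are illustrative assumptions, not the paper's HMM or SCFG instantiations.

```python
# Greedy Bayesian model merging on a toy model: start with one class per
# symbol (most specific), then merge classes while the posterior, a data-fit
# term plus an Occam prior favoring fewer classes, keeps improving.
import math
from collections import Counter
from itertools import combinations

data = list("aaabbbcd")
counts, N = Counter(data), len(data)

def log_posterior(classes, penalty=0.5):
    """Log-likelihood under P(w) = P(class) / |class|, minus a size prior."""
    ll = sum(counts[w] * math.log(sum(counts[x] for x in cls) / (N * len(cls)))
             for cls in classes for w in cls)
    return ll - penalty * len(classes)

classes = [frozenset(w) for w in counts]   # one singleton class per symbol
while True:
    current = log_posterior(classes)
    candidates = [[c for c in classes if c not in (a, b)] + [a | b]
                  for a, b in combinations(classes, 2)]
    best = max(candidates, key=log_posterior, default=None)
    if best is None or log_posterior(best) <= current:
        break                              # no merge improves the posterior
    classes = best

print([sorted(c) for c in classes])        # equal-frequency symbols merge
```

With the penalty above, the loop merges the equal-frequency symbols (which costs no likelihood) and then stops, since further merging would lose more fit than the simplicity prior rewards.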