Results 1 - 10 of 44
Dynamic Bayesian Networks: Representation, Inference and Learning
, 2002
Cited by 393 (4 self)
Modelling sequential data is important in many areas of science and engineering. Hidden Markov models (HMMs) and Kalman filter models (KFMs) are popular for this because they are simple and flexible. For example, HMMs have been used for speech recognition and bio-sequence analysis, and KFMs have been used for problems ranging from tracking planes and missiles to predicting the economy. However, HMMs and KFMs are limited in their “expressive power”. Dynamic Bayesian Networks (DBNs) generalize HMMs by allowing the state space to be represented in factored form, instead of as a single discrete random variable. DBNs generalize KFMs by allowing arbitrary probability distributions, not just (unimodal) linear-Gaussian. In this thesis, I will discuss how to represent many different kinds of models as DBNs, how to perform exact and approximate inference in DBNs, and how to learn DBN models from sequential data.
In particular, the main novel technical contributions of this thesis are as follows: a way of representing Hierarchical HMMs as DBNs, which enables inference to be done in O(T) time instead of O(T^3), where T is the length of the sequence; an exact smoothing algorithm that takes O(log T) space instead of O(T); a simple way of using the junction tree algorithm for online inference in DBNs; new complexity bounds on exact online inference in DBNs; a new deterministic approximate inference algorithm called factored frontier; an analysis of the relationship between the BK algorithm and loopy belief propagation; a way of applying Rao-Blackwellised particle filtering to DBNs in general, and the SLAM (simultaneous localization and mapping) problem in particular; a way of extending the structural EM algorithm to DBNs; and a variety of different applications of DBNs. However, perhaps the main value of the thesis is its catholic presentation of the field of sequential data modelling.
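To make the "linear in T" cost of inference concrete, the following is a minimal sketch of the standard forward (filtering) recursion for a plain discrete HMM. It is not the thesis's hierarchical-HMM or DBN algorithm; the function name `forward_filter` and the two-state parameters are hypothetical, chosen only for illustration.

```python
import math


def forward_filter(prior, trans, emit, observations):
    """Filter a discrete HMM with one pass over the sequence.

    prior[i]    : P(state_1 = i)
    trans[i][j] : P(state_t = j | state_{t-1} = i)
    emit[j][o]  : P(obs_t = o | state_t = j)
    Returns the list of filtered distributions P(state_t | obs_1..t)
    and the log-likelihood of the whole observation sequence.
    """
    n = len(prior)
    belief = list(prior)
    log_likelihood = 0.0
    filtered = []
    for t, obs in enumerate(observations):
        if t > 0:
            # Predict: push the previous belief through the transition model.
            belief = [sum(belief[i] * trans[i][j] for i in range(n))
                      for j in range(n)]
        # Update: weight each state by how well it explains the observation.
        unnormalised = [belief[j] * emit[j][obs] for j in range(n)]
        evidence = sum(unnormalised)
        log_likelihood += math.log(evidence)
        belief = [u / evidence for u in unnormalised]
        filtered.append(belief)
    return filtered, log_likelihood


# Hypothetical two-state, two-symbol model (illustrative numbers only).
prior = [0.5, 0.5]
trans = [[0.9, 0.1], [0.2, 0.8]]
emit = [[0.8, 0.2], [0.3, 0.7]]
beliefs, loglik = forward_filter(prior, trans, emit, [0, 1, 0])
```

Each time step costs a fixed amount of work that depends only on the number of states, so the total cost grows linearly with the sequence length T.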
Context-Based Vision System for Place and Object Recognition
, 2003
Cited by 169 (4 self)
While navigating in an environment, a vision system has to be able to recognize where it is and what the main objects in the scene are. In this paper we present a context-based vision system for place and object recognition. The goal is to identify familiar locations (e.g., office 610, conference room 941, Main Street), to categorize new environments (office, corridor, street) and to use that information to provide contextual priors for object recognition (e.g., table, chair, car, computer). We present a low-dimensional global image representation that provides relevant information for place recognition and categorization, and show how such contextual information introduces strong priors that simplify object recognition. We have trained the system to recognize over 60 locations (indoors and outdoors) and to suggest the presence and locations of more than 20 different object types. The algorithm has been integrated into a mobile system that provides real-time feedback to the user.
This work was sponsored by the Air Force under Air Force Contract F19628-00-C-0002. Opinions, interpretations, conclusions, and recommendations are those of the author and are not necessarily endorsed by the U.S. Government.
Structure Learning in Conditional Probability Models via an Entropic Prior and Parameter Extinction
, 1998
Cited by 59 (0 self)
We introduce an entropic prior for multinomial parameter estimation problems and solve for its maximum...
Graphical models and automatic speech recognition
- Mathematical Foundations of Speech and Language Processing
, 2003
Cited by 49 (10 self)
Graphical models provide a promising paradigm to study both existing and novel techniques for automatic speech recognition. This paper first provides a brief overview of graphical models and their uses as statistical models. It is then shown that the statistical assumptions behind many pattern recognition techniques commonly used as part of a speech recognition system can be described by a graph – this includes Gaussian distributions, mixture models, decision trees, factor analysis, principal component analysis, linear discriminant analysis, and hidden Markov models. Moreover, this paper shows that many advanced models for speech recognition and language processing can also be simply described by a graph, including many at the acoustic-, pronunciation-, and language-modeling levels. A number of speech recognition techniques born directly out of the graphical-models paradigm are also surveyed. Additionally, this paper includes a novel graphical analysis regarding why derivative (or delta) features improve hidden Markov model-based speech recognition by improving structural discriminability. It also includes an example where a graph can be used to represent language model smoothing constraints. As will be seen, the space of models describable by a graph is quite large. A thorough exploration of this space should yield techniques that ultimately will supersede the hidden Markov model.
Gradient Flow in Recurrent Nets: the Difficulty of Learning Long-Term Dependencies
, 2001
Cited by 33 (20 self)
Recurrent networks (crossreference Chapter 12) can, in principle, use their feedback connections to store representations of recent input events in the form of activations. The most widely used algorithms for learning what to put in short-term memory, however, take too much time to be feasible or do not work well at all, especially when minimal time lags between inputs and corresponding teacher signals are long. Although theoretically fascinating, they do not provide clear practical advantages over, say, backprop in feedforward networks with limited time windows (see crossreference Chapters 11 and 12). With conventional "algorithms based on the computation of the complete gradient", such as "Back-Propagation Through Time" (BPTT, e.g., [22, 27, 26]) or "Real-Time Recurrent Learning" (RTRL, e.g., [21]) error signals "flowing backwards in time" tend to either (1) blow up or (2) vanish: the temporal evolution of the backpropagated error ex
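The "blow up or vanish" behaviour of backpropagated error can be illustrated with a deliberately simplified case: a single linear recurrent unit whose error signal is multiplied by the same recurrent weight at every step back in time. This is a sketch under that assumption, not the chapter's analysis; the function name `backpropagated_error_scale` is hypothetical, and a real network would also multiply in the derivative of the activation function at each step.

```python
def backpropagated_error_scale(recurrent_weight, steps):
    """Factor by which an error signal is scaled after flowing back
    `steps` time steps through one linear unit with a fixed
    self-connection weight (activation derivative assumed to be 1)."""
    scale = 1.0
    for _ in range(steps):
        scale *= recurrent_weight
    return scale


# Weights below 1 in magnitude shrink the error geometrically,
# weights above 1 grow it geometrically.
shrinking = backpropagated_error_scale(0.5, 50)
growing = backpropagated_error_scale(1.5, 50)
```

Only a weight of magnitude exactly 1 keeps the error scale constant in this toy setting, which is why long time lags are hard for gradient-based training.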
Unsupervised Language Acquisition: Theory and Practice
, 2001
Cited by 32 (0 self)
In this thesis I present various algorithms for the unsupervised machine learning of aspects of natural languages using a variety of statistical models. The scientific object of the work is to examine the validity of the so-called Argument from the Poverty of the Stimulus advanced in favour of the proposition that humans have language-specific innate knowledge. I start by examining an a priori argument based on Gold's theorem, that purports to prove that natural languages cannot be learned, and some formal issues related to the choice of statistical grammars rather than symbolic grammars. I present three novel algorithms for learning various parts of natural languages: first, an algorithm for the induction of syntactic categories from unlabelled text using distributional information, that can deal with ambiguous and rare words; secondly, a set of algorithms for learning morphological processes in a variety of languages, including languages such as Arabic with nonconcatenative morphology; thirdly, an algorithm for the unsupervised induction of a context-free grammar from tagged text. I carefully examine the interaction between the various components, and show how these algorithms can form the basis for an empiricist model of language acquisition. I therefore conclude that the Argument from the Poverty of the Stimulus is unsupported by the evidence.
Hatching by Example: a Statistical Approach
, 2002
Cited by 26 (0 self)
We present a new approach to synthetic (computer-aided) drawing with patches of strokes. Grouped strokes convey the local intensity level that is desired in drawing. The key point of our approach is learning by example: the system does not know a priori the distribution of the strokes. Instead, by analyzing a sample (training) patch of strokes, our system is able to synthesize freely an arbitrary sequence of strokes that "looks like" the given sample. Strokes are considered as parametrical curves represented by a vector of random variables following a Markovian distribution. Our method is based on Shannon's N-gram approach and is a direct extension of Efros's texture synthesis models [EL99; EF01]. Nevertheless, one major difference between our method and traditional texture synthesis is the use of such curves as a basic element instead of pixels. We define a statistical metric for comparison between different patches containing various layouts of strokes. We hope that our method performs a first step towards capturing a very difficult notion of style in drawing -- hatching style in our case. We illustrate our method by varied examples, ranging from typical hatching in traditional drawing to highly heterogeneous sets of strokes.
Learning Dynamics for Exemplar-based Gesture Recognition
- In IEEE Computer Society Conference on Computer Vision and Pattern Recognition
, 2003
Cited by 19 (2 self)
This paper addresses the problem of capturing the dynamics for exemplar-based recognition systems. A traditional HMM provides a probabilistic tool to capture system dynamics, and in the exemplar paradigm HMM states are typically coupled with the exemplars. Alternatively, we propose a non-parametric HMM approach that uses a discrete HMM with arbitrary states (decoupled from exemplars) to capture the dynamics over a large exemplar space, where a non-parametric estimation approach is used to model the exemplar distribution. This reduces the need for lengthy and non-optimal training of the HMM observation model. We used the proposed approach for view-based recognition of gestures. The approach is based on representing each gesture as a sequence of learned body poses (exemplars). The gestures are recognized through a probabilistic framework for matching these body poses and for imposing temporal constraints between different poses using the proposed non-parametric HMM.
Connectionist temporal classification: Labelling unsegmented sequence data with recurrent neural networks
- In Proceedings of the International Conference on Machine Learning, ICML 2006
, 2006
Cited by 13 (9 self)
Many real-world sequence learning tasks require the prediction of sequences of labels from noisy, unsegmented input data. In speech recognition, for example, an acoustic signal is transcribed into words or sub-word units. Recurrent neural networks (RNNs) are powerful sequence learners that would seem well suited to such tasks. However, because they require pre-segmented training data, and post-processing to transform their outputs into label sequences, their applicability has so far been limited. This paper presents a novel method for training RNNs to label unsegmented sequences directly, thereby solving both problems. An experiment on the TIMIT speech corpus demonstrates its advantages over both a baseline HMM and a hybrid HMM-RNN.
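The abstract does not describe how per-frame network outputs become a label sequence. As commonly presented for connectionist temporal classification, a frame-level path is reduced to a labelling by merging repeated symbols and then dropping a special blank symbol. The sketch below illustrates only that reduction step, under that assumption; the function name `collapse_path` and the choice of "-" as the blank are hypothetical.

```python
def collapse_path(path, blank="-"):
    """Reduce a frame-level path to a label sequence by merging
    consecutive repeats and then removing blank symbols."""
    labels = []
    previous = None
    for symbol in path:
        # Keep a symbol only when it starts a new run and is not the blank.
        if symbol != previous and symbol != blank:
            labels.append(symbol)
        previous = symbol
    return labels


# A blank between two identical symbols keeps them as separate labels.
word = "".join(collapse_path("--hh-e-ll-lo"))
```

Because many different frame-level paths reduce to the same labelling, no alignment between input frames and labels has to be supplied in advance.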
Discrete-Time, Discrete-Valued Observable Operator Models: A Tutorial
, 1998
Cited by 11 (0 self)
This tutorial gives a basic yet rigorous introduction to observable operator models (OOMs). OOMs are a recently discovered class of models of stochastic processes. They are mathematically simple in that they require only concepts from elementary linear algebra. The linear algebra nature gives rise to an efficient, consistent, unbiased, constructive learning procedure for estimating models from empirical data. The tutorial describes in detail the mathematical foundations and the practical use of OOMs for identifying and predicting discrete-time, discrete-valued processes, both for output-only and input-output systems.
Key words: stochastic time series, system identification, observable operator models
Summary (translated from the German): This tutorial offers a thorough introduction to observable operator models (OOMs). OOMs are a recently discovered class of models of stochastic processes. They can be represented with the tools of elementary linear algebra. The simplicity of the representation leads...

