Results 1–10 of 14
A General Framework for Adaptive Processing of Data Structures
 IEEE TRANSACTIONS ON NEURAL NETWORKS
, 1998
"... A structured organization of information is typically required by symbolic processing. On the other hand, most connectionist models assume that data are organized according to relatively poor structures, like arrays or sequences. The framework described in this paper is an attempt to unify adaptive ..."
Abstract

Cited by 127 (50 self)
A structured organization of information is typically required by symbolic processing. On the other hand, most connectionist models assume that data are organized according to relatively poor structures, like arrays or sequences. The framework described in this paper is an attempt to unify adaptive models like artificial neural nets and belief nets for the problem of processing structured information. In particular, relations between data variables are expressed by directed acyclic graphs, where both numerical and categorical values coexist. The general framework proposed in this paper can be regarded as an extension of both recurrent neural networks and hidden Markov models to the case of acyclic graphs. In particular, we study the supervised learning problem as the problem of learning transductions from an input structured space to an output structured space, where transductions are assumed to admit a recursive hidden state-space representation. We introduce a graphical formalism for r...
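The recursive state-space idea in this abstract can be sketched concretely. The toy below is not the paper's model: all names, dimensions, and weights are illustrative assumptions. It computes a hidden state for each node of a small DAG from the node's label and its children's states, using one weight set shared across all nodes, then reads the whole structure's representation off the root.

```python
import numpy as np

# Minimal sketch (assumed shapes, random untrained weights): a recursive
# network that folds a DAG bottom-up into fixed-size hidden states.
rng = np.random.default_rng(0)
LABEL_DIM, STATE_DIM, MAX_CHILDREN = 3, 4, 2

W_label = rng.standard_normal((STATE_DIM, LABEL_DIM)) * 0.1
W_child = rng.standard_normal((STATE_DIM, MAX_CHILDREN * STATE_DIM)) * 0.1

def node_state(label, child_states):
    """Hidden state of a node from its label and its children's states."""
    # Pad missing children with zero states up to the maximum out-degree.
    kids = list(child_states) + [np.zeros(STATE_DIM)] * (MAX_CHILDREN - len(child_states))
    return np.tanh(W_label @ label + W_child @ np.concatenate(kids))

# A tiny DAG: two leaves feeding one root, processed in topological order.
leaf_a = node_state(np.array([1.0, 0.0, 0.0]), [])
leaf_b = node_state(np.array([0.0, 1.0, 0.0]), [])
root = node_state(np.array([0.0, 0.0, 1.0]), [leaf_a, leaf_b])
print(root.shape)  # fixed-size state summarizing the whole structure
```

A supervised transduction would attach an output network to each node's state and train all weights jointly; that step is omitted here.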
Robust Automatic Speech Recognition With Unreliable Data
, 1999
"... Theoretical and practical issues of some of the problems in robust automatic speech recognition (ASR) and some of the techniques that address them are presented in this report. The problem of the robustness of the ASR in reallife (as opposed to laboratory) conditions is paramount to the widespread ..."
Abstract

Cited by 3 (0 self)
Theoretical and practical issues of some of the problems in robust automatic speech recognition (ASR), and some of the techniques that address them, are presented in this report. The problem of the robustness of ASR in real-life (as opposed to laboratory) conditions is paramount to the widespread deployment of speech-enabled products. The report reviews techniques used so far for robust ASR, ranging from simple spectrum subtraction to various types of model adaptation. A possible connection of robust ASR with computational auditory scene analysis (CASA), methods for local Signal-to-Noise Ratio (SNR) estimation, and classification/scoring with online-adapted statistical models is discussed. The main focus is on the techniques that would allow for incorporation of CASA and local SNR estimates (used as methods for speech/non-speech separation) into the present prevailing stochastic pattern matching paradigms: Hidden Markov models (HMM) and artificial neural networks (ANN). Th...
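As a concrete anchor for the simplest baseline technique mentioned above, here is a hedged sketch of spectral subtraction: estimate the noise magnitude spectrum from noise-only frames, subtract it from each noisy frame's magnitude spectrum, floor the result, and resynthesize with the noisy phase. Frame sizes, noise levels, and the zero floor are all illustrative assumptions.

```python
import numpy as np

def spectral_subtract(frames, noise_frames, floor=0.0):
    """frames, noise_frames: 2-D arrays (n_frames, frame_len) of time-domain frames."""
    spec = np.fft.rfft(frames, axis=1)
    # Average magnitude spectrum of the noise-only frames.
    noise_mag = np.abs(np.fft.rfft(noise_frames, axis=1)).mean(axis=0)
    mag = np.maximum(np.abs(spec) - noise_mag, floor)  # subtract, then floor
    phase = np.angle(spec)                             # keep the noisy phase
    return np.fft.irfft(mag * np.exp(1j * phase), axis=1)

rng = np.random.default_rng(1)
clean = np.sin(2 * np.pi * 0.05 * np.arange(256)).reshape(1, 256)
noise = 0.3 * rng.standard_normal((10, 256))           # noise-only segment
noisy = clean + 0.3 * rng.standard_normal((1, 256))
denoised = spectral_subtract(noisy, noise)
print(denoised.shape)
```

Real front ends add overlap-add windowing, over-subtraction factors, and spectral floors to limit musical noise; those refinements are omitted here.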
A Solution for Missing Data in Recurrent Neural Networks With an Application to Blood Glucose Prediction
, 1998
"... We consider neural network models for stochastic nonlinear dynamical systems where measurements of the variable of interest are only available at irregular intervals i.e. most realizations are missing. Difculties arise since the solutions for prediction and maximum likelihood learning with missi ..."
Abstract

Cited by 3 (2 self)
We consider neural network models for stochastic nonlinear dynamical systems where measurements of the variable of interest are only available at irregular intervals, i.e., most realizations are missing. Difficulties arise since the solutions for prediction and maximum likelihood learning with missing data lead to complex integrals, which even for simple cases cannot be solved analytically. In this paper we propose a specific combination of a nonlinear recurrent neural predictive model and a linear error model which leads to tractable prediction and maximum likelihood adaptation rules. In particular, the recurrent neural network can be trained using the real-time recurrent learning rule and the linear error model can be trained by an EM adaptation rule, implemented using forward-backward Kalman filter equations. The model is applied to predict the glucose/insulin metabolism of a diabetic patient where blood glucose measurements are only available a few times a day at irregular intervals. The new model shows considerable improvement with respect to both recurrent neural networks trained with teacher forcing or in a free-running mode and various linear models.
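The core filtering trick for irregular measurements can be illustrated with a toy linear state-space model: run the predict step at every time tick, and apply the Kalman update only at ticks where a measurement exists. This sketch substitutes a scalar random walk for the paper's recurrent neural predictor, and all noise constants are assumptions.

```python
import numpy as np  # imported for consistency; the toy itself is pure Python

def kalman_missing(y, q=0.01, r=0.1):
    """y: sequence of floats or None (missing). Returns filtered state means."""
    x, p = 0.0, 1.0          # state mean and variance
    out = []
    for obs in y:
        p = p + q            # predict: random-walk state, process noise q
        if obs is not None:  # update only when a measurement is available
            k = p / (p + r)  # Kalman gain
            x = x + k * (obs - x)
            p = (1 - k) * p
        out.append(x)
    return out

# Sparse, irregular measurements with gaps (None), as in the glucose setting.
est = kalman_missing([1.0, None, None, 1.2, None, 0.9])
print([round(v, 3) for v in est])
```

Between measurements the mean is simply propagated while its uncertainty grows, which is exactly what makes the missing-data integrals tractable in the linear-Gaussian part of the model.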
Discriminative Recurrent Sparse Auto-Encoders
, 2013
"... We present the discriminative recurrent sparse autoencoder model, comprising a recurrent encoder of rectified linear units, unrolled for a fixed number of iterations, and connected to two linear decoders that reconstruct the input and predict its supervised classification. Training via backpropagat ..."
Abstract

Cited by 1 (0 self)
We present the discriminative recurrent sparse auto-encoder model, comprising a recurrent encoder of rectified linear units, unrolled for a fixed number of iterations, and connected to two linear decoders that reconstruct the input and predict its supervised classification. Training via backpropagation-through-time initially minimizes an unsupervised sparse reconstruction error; the loss function is then augmented with a discriminative term on the supervised classification. The depth implicit in the temporally-unrolled form allows the system to exhibit far more representational power, while keeping the number of trainable parameters fixed. From an initially unstructured network, the hidden units differentiate into categorical-units, each of which represents an input prototype with a well-defined class, and part-units representing deformations of these prototypes. The learned organization of the recurrent encoder is hierarchical: part-units are driven directly by the input, whereas the activity of categorical-units builds up over time through interactions with the part-units. Even using a small number of hidden units per layer, discriminative recurrent sparse auto-encoders achieve excellent performance on MNIST.
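The forward pass described above can be sketched in a few lines. Shapes, initialisation, and the absence of training (the BPTT objective with sparse and discriminative terms) are illustrative simplifications, not the authors' implementation: a ReLU encoder with one shared recurrent weight matrix is unrolled for T iterations and feeds two linear decoders.

```python
import numpy as np

rng = np.random.default_rng(0)
D, H, C, T = 16, 8, 3, 5    # input dim, hidden units, classes, unroll steps

W_in  = rng.standard_normal((H, D)) * 0.1  # input -> hidden
W_rec = rng.standard_normal((H, H)) * 0.1  # hidden -> hidden, shared across steps
W_dec = rng.standard_normal((D, H)) * 0.1  # linear reconstruction decoder
W_cls = rng.standard_normal((C, H)) * 0.1  # linear classification decoder

def forward(x):
    h = np.zeros(H)
    for _ in range(T):  # recurrent encoder unrolled a fixed number of times
        h = np.maximum(0.0, W_in @ x + W_rec @ h)  # rectified linear units
    return W_dec @ h, W_cls @ h  # reconstruction and class scores

x = rng.standard_normal(D)
recon, scores = forward(x)
print(recon.shape, scores.shape)
```

Because W_rec is reused at every iteration, the effective depth grows with T while the parameter count stays fixed, which is the property the abstract highlights.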
unknown title
"... The continuous latent variable modelling formalism This chapter gives the theoretical basis for continuous latent variable models. Section 2.1 defines intuitively the concept of latent variable models and gives a brief historical introduction to them. Section 2.2 uses a simple example, inspired by t ..."
Abstract
The continuous latent variable modelling formalism This chapter gives the theoretical basis for continuous latent variable models. Section 2.1 defines intuitively the concept of latent variable models and gives a brief historical introduction to them. Section 2.2 uses a simple example, inspired by the mechanics of a mobile point, to justify and explain latent variables. Section 2.3 gives a more rigorous definition, which we will use throughout this thesis. Section 2.6 describes the most important specific continuous latent variable models and section 2.7 defines mixtures of continuous latent variable models. The chapter discusses other important topics, including parameter estimation, identifiability, interpretability and marginalisation in high dimensions. Section 2.9 on dimensionality reduction will be the basis for part II of the thesis. Section 2.10 very briefly mentions some applications of continuous latent variable models for dimensionality reduction. Section 2.11 shows a worked example of a simple continuous latent variable model. Section 2.12 gives some complementary mathematical results, in particular the derivation of a diagonal noise GTM model and of its EM algorithm. 2.1 Introduction and historical overview of latent variable models Latent variable models are probabilistic models that try to explain a (relatively) high-dimensional process in
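The generic model this chapter formalises can be illustrated with its simplest linear-Gaussian instance, factor analysis: a low-dimensional latent z generates a high-dimensional x via x = Wz + noise, and the marginal covariance of x is W Wᵀ + Ψ. The check below samples from the model and compares the empirical covariance with that analytic form; dimensions and parameters are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
D, L, N = 5, 2, 200_000        # observed dim, latent dim, sample count

W = rng.standard_normal((D, L))  # factor loadings
psi = np.full(D, 0.1)            # diagonal noise variances (Psi)

z = rng.standard_normal((N, L))                           # latent: z ~ N(0, I)
x = z @ W.T + rng.standard_normal((N, D)) * np.sqrt(psi)  # x = W z + noise

emp_cov = np.cov(x, rowvar=False)
model_cov = W @ W.T + np.diag(psi)  # analytic marginal covariance
print(np.max(np.abs(emp_cov - model_cov)))  # small sampling error
```

The same generative template, with a nonlinear mapping in place of W, covers the other continuous latent variable models the chapter surveys, including GTM.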
Chapter 4 Dimensionality reduction
"... This chapter introduces and defines the problem of dimensionality reduction, discusses the topics of the curse of the dimensionality and the intrinsic dimensionality and then surveys nonprobabilistic methods for dimensionality reduction, that is, methods that do not define a probabilistic model for ..."
Abstract
This chapter introduces and defines the problem of dimensionality reduction, discusses the topics of the curse of dimensionality and the intrinsic dimensionality, and then surveys non-probabilistic methods for dimensionality reduction, that is, methods that do not define a probabilistic model for the data. These include linear methods (PCA, projection pursuit), nonlinear auto-associators, kernel methods, local dimensionality reduction, principal curves, vector quantisation methods (elastic net, self-organising map) and multidimensional scaling methods. One of these methods (the elastic net) does define a probabilistic model but not a continuous dimensionality reduction mapping. If one is interested in stochastically modelling the dimensionality reduction mapping then the natural choice is latent variable models, discussed in chapter 2. We close the chapter with a summary and with some thoughts on dimensionality reduction with discrete variables. Consider an application in which a system processes data in the form of a collection of real-valued vectors: speech signals, images, etc. Suppose that the system is only effective if the dimension of each individual vector—the number of components of the vector—is not too high, where high depends on the particular application. The problem of dimensionality reduction appears when the data are in fact of a higher dimension
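The first of the surveyed methods, PCA, can be sketched as follows: project the centred data onto the leading eigenvectors of its covariance matrix. The toy data set, 3-D points lying near a 2-D plane (intrinsic dimensionality 2), is an illustrative assumption.

```python
import numpy as np

def pca_reduce(X, k):
    """Reduce (n, d) data X to its top-k principal components."""
    Xc = X - X.mean(axis=0)                    # centre the data
    cov = np.cov(Xc, rowvar=False)
    vals, vecs = np.linalg.eigh(cov)           # eigh returns ascending eigenvalues
    top = vecs[:, np.argsort(vals)[::-1][:k]]  # k leading eigenvectors
    return Xc @ top                            # projected coordinates

rng = np.random.default_rng(0)
# 3-D observations generated from a 2-D latent plane plus small noise.
latent = rng.standard_normal((500, 2))
X = latent @ rng.standard_normal((2, 3)) + 0.01 * rng.standard_normal((500, 3))
Y = pca_reduce(X, 2)
print(Y.shape)
```

Unlike the latent variable models of chapter 2, this mapping is deterministic and defines no density over the data, which is precisely the distinction the chapter draws.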
unknown title
, 2001
"... Continuous latent variable models for dimensionality reduction and sequential data reconstruction by ..."
Abstract
Continuous latent variable models for dimensionality reduction and sequential data reconstruction by