A General Framework for Adaptive Processing of Data Structures
 IEEE TRANSACTIONS ON NEURAL NETWORKS
, 1998
"... A structured organization of information is typically required by symbolic processing. On the other hand, most connectionist models assume that data are organized according to relatively poor structures, like arrays or sequences. The framework described in this paper is an attempt to unify adaptive ..."
Abstract

Cited by 149 (61 self)
A structured organization of information is typically required by symbolic processing. On the other hand, most connectionist models assume that data are organized according to relatively poor structures, like arrays or sequences. The framework described in this paper is an attempt to unify adaptive models like artificial neural nets and belief nets for the problem of processing structured information. In particular, relations between data variables are expressed by directed acyclic graphs, where both numerical and categorical values coexist. The general framework proposed in this paper can be regarded as an extension of both recurrent neural networks and hidden Markov models to the case of acyclic graphs. In particular, we study the supervised learning problem as the problem of learning transductions from an input structured space to an output structured space, where transductions are assumed to admit a recursive hidden state-space representation. We introduce a graphical formalism for r...
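The recursive hidden state-space idea can be sketched in a few lines: each node of the DAG receives a hidden state computed from its own label and the states of its children, visiting nodes in topological order, exactly as a recurrent net is unrolled along a sequence. A minimal sketch; the tanh transition, dimensions, and names are illustrative assumptions, not the paper's formulation:

```python
import numpy as np

def dag_states(topo_order, children, labels, W, U):
    # Hidden state of node v: tanh(W @ label(v) + U @ sum of child states).
    # Visiting nodes in topological order guarantees every child state
    # exists before its parent needs it -- the DAG analogue of unrolling
    # a recurrent network along a sequence.
    d = U.shape[0]
    h = {}
    for v in topo_order:
        kids = sum((h[c] for c in children[v]), np.zeros(d))
        h[v] = np.tanh(W @ labels[v] + U @ kids)
    return h

rng = np.random.default_rng(0)
d, k = 4, 3                                    # state / label dimensions (arbitrary)
W = rng.normal(size=(d, k))
U = rng.normal(size=(d, d))
children = {"a": [], "b": [], "c": ["a", "b"]}  # "c" is the root of a tiny DAG
labels = {v: rng.normal(size=k) for v in children}
h = dag_states(["a", "b", "c"], children, labels, W, U)
```

The state of the root "c" then summarizes the whole structure and could feed an output transduction.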
Continuous latent variable models for dimensionality reduction and sequential data reconstruction
, 2001
"... ..."
Training Energy-Based Models for Time-Series Imputation
"... Imputing missing values in high dimensional timeseries is a difficult problem. This paper presents a strategy for training energybased graphical models for imputation directly, bypassing difficulties probabilistic approaches would face. The training strategy is inspired by recent work on optimizat ..."
Abstract

Cited by 5 (0 self)
Imputing missing values in high-dimensional time-series is a difficult problem. This paper presents a strategy for training energy-based graphical models for imputation directly, bypassing difficulties probabilistic approaches would face. The training strategy is inspired by recent work on optimization-based learning (Domke, 2012) and allows complex neural models with convolutional and recurrent structures to be trained for imputation tasks. In this work, we use this training strategy to derive learning rules for three substantially different neural architectures. Inference in these models is done by either truncated gradient descent or variational mean-field iterations. In our experiments, we found that the training methods outperform the Contrastive Divergence learning algorithm. Moreover, the training methods can easily handle missing values in the training data itself during learning. We demonstrate the performance of this learning scheme and the three models we introduce on one artificial and two real-world data sets.
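The "inference by truncated gradient descent" step can be illustrated with a toy quadratic energy standing in for a neural one (the energy, step size, and names below are assumptions for illustration): observed coordinates stay clamped while only the missing ones are updated by descending the energy.

```python
import numpy as np

def impute_by_descent(x, missing, A, steps=500, lr=0.05):
    # Truncated gradient descent on the toy energy E(x) = 0.5 * x^T A x,
    # clamping observed entries and updating only the missing ones.
    x = x.copy()
    for _ in range(steps):
        grad = A @ x                     # dE/dx for the quadratic energy
        x[missing] -= lr * grad[missing]
    return x

rng = np.random.default_rng(1)
B = rng.normal(size=(5, 5))
A = B.T @ B / 5 + np.eye(5)              # positive definite toy energy
x0 = rng.normal(size=5)
missing = np.array([False, True, False, True, False])
x_hat = impute_by_descent(x0, missing, A)
```

At convergence the missing entries sit at the energy minimum consistent with the clamped observed values; truncating the descent to a fixed number of steps is what makes the whole inference differentiable for training.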
Discriminative Recurrent Sparse Auto-Encoders
, 2013
"... We present the discriminative recurrent sparse autoencoder model, comprising a recurrent encoder of rectified linear units, unrolled for a fixed number of iterations, and connected to two linear decoders that reconstruct the input and predict its supervised classification. Training via backpropagat ..."
Abstract

Cited by 4 (2 self)
We present the discriminative recurrent sparse auto-encoder model, comprising a recurrent encoder of rectified linear units, unrolled for a fixed number of iterations, and connected to two linear decoders that reconstruct the input and predict its supervised classification. Training via backpropagation through time initially minimizes an unsupervised sparse reconstruction error; the loss function is then augmented with a discriminative term on the supervised classification. The depth implicit in the temporally unrolled form allows the system to exhibit far more representational power, while keeping the number of trainable parameters fixed. From an initially unstructured network, the hidden units differentiate into categorical-units, each of which represents an input prototype with a well-defined class, and part-units representing deformations of these prototypes. The learned organization of the recurrent encoder is hierarchical: part-units are driven directly by the input, whereas the activity of categorical-units builds up over time through interactions with the part-units. Even using a small number of hidden units per layer, discriminative recurrent sparse auto-encoders achieve excellent performance on MNIST.
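The forward pass of such an encoder can be caricatured as follows (the weight shapes, scaling, and the unrolling depth T are illustrative assumptions): one ReLU encoder applied recurrently with tied weights for a fixed number of steps, feeding two linear decoders.

```python
import numpy as np

def drsae_forward(x, We, S, Wd, Wc, T=5):
    # Recurrent ReLU encoder unrolled for T steps with tied weights:
    # effective depth grows with T, but the parameter count stays fixed.
    z = np.zeros(S.shape[0])
    for _ in range(T):
        z = np.maximum(0.0, We @ x + S @ z)
    return Wd @ z, Wc @ z                # reconstruction, class scores

rng = np.random.default_rng(2)
n_in, n_hid, n_cls = 8, 16, 10
We = rng.normal(size=(n_hid, n_in)) * 0.1
S = rng.normal(size=(n_hid, n_hid)) * 0.1
Wd = rng.normal(size=(n_in, n_hid)) * 0.1
Wc = rng.normal(size=(n_cls, n_hid)) * 0.1
x = rng.normal(size=n_in)
x_rec, scores = drsae_forward(x, We, S, Wd, Wc)
```

Training would backpropagate a reconstruction loss on `x_rec` and a classification loss on `scores` through the T unrolled steps.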
Robust Automatic Speech Recognition With Unreliable Data
, 1999
"... Theoretical and practical issues of some of the problems in robust automatic speech recognition (ASR) and some of the techniques that address them are presented in this report. The problem of the robustness of the ASR in reallife (as opposed to laboratory) conditions is paramount to the widespread ..."
Abstract

Cited by 3 (0 self)
Theoretical and practical issues of some of the problems in robust automatic speech recognition (ASR), and some of the techniques that address them, are presented in this report. The problem of the robustness of ASR in real-life (as opposed to laboratory) conditions is paramount to the widespread deployment of speech-enabled products. The report reviews techniques used so far for robust ASR, ranging from simple spectrum subtraction to various types of model adaptation. A possible connection of robust ASR with computational auditory scene analysis (CASA), methods for local Signal-to-Noise Ratio (SNR) estimation, and classification/scoring with online adapted statistical models is discussed. The main focus is on the techniques that would allow for incorporation of CASA and local SNR estimates (used as methods for speech/non-speech separation) into the present prevailing stochastic pattern matching paradigms: Hidden Markov models (HMM) and artificial neural networks (ANN). Th...
A Solution for Missing Data in Recurrent Neural Networks With an Application to Blood Glucose Prediction
, 1998
"... We consider neural network models for stochastic nonlinear dynamical systems where measurements of the variable of interest are only available at irregular intervals i.e. most realizations are missing. Difculties arise since the solutions for prediction and maximum likelihood learning with missi ..."
Abstract

Cited by 3 (2 self)
We consider neural network models for stochastic nonlinear dynamical systems where measurements of the variable of interest are only available at irregular intervals, i.e. most realizations are missing. Difficulties arise since the solutions for prediction and maximum likelihood learning with missing data lead to complex integrals, which even for simple cases cannot be solved analytically. In this paper we propose a specific combination of a nonlinear recurrent neural predictive model and a linear error model which leads to tractable prediction and maximum likelihood adaptation rules. In particular, the recurrent neural network can be trained using the real-time recurrent learning rule and the linear error model can be trained by an EM adaptation rule, implemented using forward-backward Kalman filter equations. The model is applied to predict the glucose/insulin metabolism of a diabetic patient where blood glucose measurements are only available a few times a day at irregular intervals. The new model shows considerable improvement with respect to both recurrent neural networks trained with teacher forcing or in a free-running mode and various linear models.
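The role of the Kalman filter in handling irregular measurements can be illustrated with a scalar linear-Gaussian toy model (the parameters a, q, r below are made up): the filter always runs its prediction step, but skips the measurement update whenever the observation is missing, so the state uncertainty grows between the sparse observations.

```python
def kalman_missing(ys, a=0.9, q=0.1, r=0.2, m=0.0, p=1.0):
    # Scalar Kalman filter over x_t = a * x_{t-1} + noise(q), y_t = x_t + noise(r).
    # A value of None in ys marks a missing measurement: predict, don't update.
    out = []
    for y in ys:
        m, p = a * m, a * a * p + q          # predict step (always runs)
        if y is not None:                     # update step (only when observed)
            k = p / (p + r)
            m, p = m + k * (y - m), (1.0 - k) * p
        out.append((m, p))
    return out

states = kalman_missing([1.0, None, None, 0.5])
```

Between the two observations the filtered variance accumulates process noise, and the next observation shrinks it again; the forward-backward (smoothing) pass used for EM adds a symmetric backward recursion over the same quantities.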
unknown title
, 2001
"... Continuous latent variable models for dimensionality reduction and sequential data reconstruction by ..."
Abstract
Continuous latent variable models for dimensionality reduction and sequential data reconstruction by
unknown title
"... The continuous latent variable modelling formalism This chapter gives the theoretical basis for continuous latent variable models. Section 2.1 defines intuitively the concept of latent variable models and gives a brief historical introduction to them. Section 2.2 uses a simple example, inspired by t ..."
Abstract
The continuous latent variable modelling formalism. This chapter gives the theoretical basis for continuous latent variable models. Section 2.1 defines intuitively the concept of latent variable models and gives a brief historical introduction to them. Section 2.2 uses a simple example, inspired by the mechanics of a mobile point, to justify and explain latent variables. Section 2.3 gives a more rigorous definition, which we will use throughout this thesis. Section 2.6 describes the most important specific continuous latent variable models and section 2.7 defines mixtures of continuous latent variable models. The chapter discusses other important topics, including parameter estimation, identifiability, interpretability and marginalisation in high dimensions. Section 2.9 on dimensionality reduction will be the basis for part II of the thesis. Section 2.10 very briefly mentions some applications of continuous latent variable models for dimensionality reduction. Section 2.11 shows a worked example of a simple continuous latent variable model. Section 2.12 gives some complementary mathematical results, in particular the derivation of a diagonal noise GTM model and of its EM algorithm. 2.1 Introduction and historical overview of latent variable models. Latent variable models are probabilistic models that try to explain a (relatively) high-dimensional process in