Results 1 - 7 of 7
A General Framework for Adaptive Processing of Data Structures
 IEEE Transactions on Neural Networks, 1998
Abstract

Cited by 117 (46 self)
A structured organization of information is typically required by symbolic processing. On the other hand, most connectionist models assume that data are organized according to relatively poor structures, like arrays or sequences. The framework described in this paper is an attempt to unify adaptive models like artificial neural nets and belief nets for the problem of processing structured information. In particular, relations between data variables are expressed by directed acyclic graphs, where both numerical and categorical values coexist. The general framework proposed in this paper can be regarded as an extension of both recurrent neural networks and hidden Markov models to the case of acyclic graphs. In particular, we study the supervised learning problem as the problem of learning transductions from an input structured space to an output structured space, where transductions are assumed to admit a recursive hidden state-space representation. We introduce a graphical formalism for r...
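The recursive hidden-state idea sketched in this abstract can be illustrated in a few lines: states are computed children-first over a directed acyclic graph, with a combining function standing in for the learned model. This is a minimal sketch; the function names and the toy combining rule are illustrative, not from the paper.

```python
def recursive_states(dag, labels, combine):
    """Toy recursive state computation over a DAG, children first.

    dag: {node: [child, ...]}, assumed acyclic; labels: {node: value};
    combine(label, child_states) stands in for the learned recursive model.
    """
    states = {}

    def visit(node):
        # Each node's state is computed once, even when it is shared
        # by several parents (a DAG, not a tree).
        if node not in states:
            states[node] = combine(labels[node],
                                   [visit(c) for c in dag[node]])
        return states[node]

    for node in dag:
        visit(node)
    return states

# Node 'b' is a shared child of both 'a' and 'c'.
dag = {'a': ['b', 'c'], 'b': [], 'c': ['b']}
labels = {'a': 1, 'b': 2, 'c': 3}
states = recursive_states(dag, labels, lambda l, ch: l + sum(ch))
print(states['a'])  # 1 + 2 + (3 + 2) = 8
```

With a trainable `combine` (e.g. a small neural net) in place of the summation, this is the shape of the transductions the framework studies.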
Sample Complexity for Learning Recurrent Perceptron Mappings
 IEEE Transactions on Information Theory, 1996
Abstract

Cited by 23 (10 self)
Recurrent perceptron classifiers generalize the classical perceptron model. They take into account those correlations and dependences among input coordinates which arise from linear digital filtering. This paper provides tight bounds on the sample complexity associated with the fitting of such models to experimental data.
Keywords: perceptrons, recurrent models, neural networks, learning, Vapnik-Chervonenkis dimension
1 Introduction
One of the most popular approaches to binary pattern classification, underlying many statistical techniques, is based on perceptrons or linear discriminants; see for instance the classical reference [9]. In this context, one is interested in classifying k-dimensional input patterns $v = (v_1, \dots, v_k)$ into two disjoint classes $A^+$ and $A^-$. A perceptron P which classifies vectors into $A^+$ and $A^-$ is characterized by a vector (of "weights") $\vec{c} \in \mathbb{R}^k$, and operates as follows. One forms the inner product $\vec{c} \cdot v = c_1 v_1 + \dots + c_k v_k$. I...
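The classical decision rule described in the introduction can be sketched directly: a weight vector classifies a pattern by the sign of their inner product. A minimal sketch, with illustrative names and data:

```python
def perceptron_classify(c, v):
    """Classify v into A+ or A- by the sign of the inner product c.v."""
    s = sum(ci * vi for ci, vi in zip(c, v))
    return '+' if s > 0 else '-'

print(perceptron_classify([1.0, -2.0], [3.0, 1.0]))  # 3 - 2 = 1 > 0, so '+'
print(perceptron_classify([1.0, -2.0], [1.0, 1.0]))  # 1 - 2 = -1 < 0, so '-'
```

The recurrent variant the paper analyzes first passes the input through a linear filter before this sign test, which is what ties the sample-complexity bounds to linear digital filtering.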
A delay damage model selection algorithm for NARX neural networks
 IEEE Transactions on Signal Processing, 1997
Abstract

Cited by 9 (1 self)
Recurrent neural networks have become popular models for system identification and time series prediction. Nonlinear autoregressive models with exogenous inputs (NARX) neural network models are a popular subclass of recurrent networks and have been used in many applications. Although embedded memory can be found in all recurrent network models, it is particularly prominent in NARX models. We show that using intelligent memory order selection through pruning and good initial heuristics significantly improves the generalization and predictive performance of these nonlinear systems on problems as diverse as grammatical inference and time series prediction.
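The embedded memory of a NARX model is explicit in its update equation: the next output is a static nonlinearity applied to tapped delay lines of past outputs and past exogenous inputs. A minimal single-neuron sketch, with hypothetical weights (the paper's models are full networks, and the memory orders are what its pruning algorithm selects):

```python
import math

def narx_step(y_past, u_past, w_y, w_u, bias):
    """One NARX prediction: tanh of a weighted sum of past outputs
    y_past = [y(t-1), ..., y(t-n)] and inputs u_past = [u(t), ..., u(t-m)]."""
    s = bias
    s += sum(w * y for w, y in zip(w_y, y_past))
    s += sum(w * u for w, u in zip(w_u, u_past))
    return math.tanh(s)

# Illustrative call: output memory order 2, input memory order 1.
y_next = narx_step([0.5, -0.1], [1.0], w_y=[0.8, 0.2], w_u=[0.3], bias=0.0)
print(y_next)  # tanh(0.4 - 0.02 + 0.3) = tanh(0.68)
```

Pruning a memory order corresponds to deleting entries of `w_y` or `w_u`, which is the kind of "delay damage" the selection algorithm exploits.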
Blind Source Separation and Deconvolution of Fast Sampled Signals
 Proceedings of the International Conference on Neural Information Processing, ICONIP'97, New Zealand, Vol. I, 1997
Abstract

Cited by 1 (0 self)
In real-world implementations of blind source separation and deconvolution, the mixing takes place in continuous time. In the models normally considered, discrete-time sampling is implicitly assumed to provide a mixing filter matrix from a suitable demixing filter matrix which can be learned given an appropriate algorithm. In this paper, we consider the implications of trying to separate and deconvolve signals, some of which may be low-frequency compared to the sample rate. It is shown that if a fast sampling rate is used to obtain the discrete-time observed data, learning to solve blind source separation and deconvolution tasks can be very difficult. This is due to the data covariance matrix becoming almost singular. We propose a discrete-time model based on alternative discrete-time operators which is capable of overcoming these problems and giving significantly improved performance under the conditions described.
1 Introduction
An important topic of growing intere...
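The near-singularity claim is easy to demonstrate numerically: when a signal is slow relative to the sample rate, consecutive samples are almost identical, so the covariance matrix of lagged samples is nearly rank-deficient. A sketch under assumed toy signals (the frequencies and signal lengths below are illustrative):

```python
import math

def covariance_det(signal):
    """Determinant of the 2x2 covariance of [x(t), x(t-1)]; near zero
    (almost singular) when successive samples barely differ."""
    a = signal[1:]
    b = signal[:-1]
    n = len(a)
    ma = sum(a) / n
    mb = sum(b) / n
    caa = sum((x - ma) ** 2 for x in a) / n
    cbb = sum((x - mb) ** 2 for x in b) / n
    cab = sum((x - ma) * (y - mb) for x, y in zip(a, b)) / n
    return caa * cbb - cab * cab

# One sinusoid slow relative to the sample rate, one fast.
slow = [math.sin(2 * math.pi * 0.001 * t) for t in range(1000)]
fast = [math.sin(2 * math.pi * 0.2 * t) for t in range(1000)]
print(covariance_det(slow) < covariance_det(fast))  # True: slow case is near-singular
```

An almost-singular covariance is exactly the condition under which gradient-based demixing algorithms converge poorly, motivating the alternative-operator model.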
Complete Memory Structures for Approximating Nonlinear Discrete-Time Mappings
Abstract

Cited by 1 (1 self)
This paper introduces a general structure that is capable of approximating input-output maps of nonlinear discrete-time systems. The structure is comprised of two stages, a dynamical stage followed by a memoryless nonlinear stage. A theorem is presented which gives a simple necessary and sufficient condition for a large set of structures of this form to be capable of modeling a wide class of nonlinear discrete-time systems. In particular, we introduce the concept of a "complete memory." A structure with a complete memory dynamical stage and a sufficiently powerful memoryless stage is shown to be capable of approximating arbitrarily well a wide class of continuous, causal, time-invariant, approximately-finite-memory mappings between discrete-time signal spaces. Furthermore, we show that any bounded-input bounded-output, time-invariant, causal memory structure has such an approximation capability if and only if it is a complete memory. Several examples of linear and nonlinear complete mem...
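The two-stage structure can be sketched concretely with the simplest dynamical stage, a tapped delay line, feeding a static nonlinearity. This is one illustrative instance of the general structure, not the paper's definition of a complete memory; names and weights are hypothetical.

```python
import math

def delay_line_stage(u, depth):
    """Dynamical stage: a tapped delay line. At each t it emits
    [u(t), u(t-1), ..., u(t-depth+1)], zero-padded before t = 0."""
    padded = [0.0] * (depth - 1) + list(u)
    return [padded[t:t + depth][::-1] for t in range(len(u))]

def memoryless_stage(taps, weights):
    """Memoryless nonlinear stage applied to one tap vector."""
    return math.tanh(sum(w * x for w, x in zip(weights, taps)))

taps = delay_line_stage([1.0, 2.0, 3.0], depth=2)
outputs = [memoryless_stage(t, weights=[0.5, -0.5]) for t in taps]
print(taps)  # [[1.0, 0.0], [2.0, 1.0], [3.0, 2.0]]
```

The paper's theorem characterizes which choices of the first (dynamical) stage preserve enough information that a rich second stage can approximate the target mapping.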
Alternative Discrete-Time Operators: An Algorithm for Optimal Selection of Parameters
 IEEE Transactions on Signal Processing, 1999
Abstract

Cited by 1 (0 self)
In this note, we consider the issue of parameter sensitivity in models based on alternative discrete-time operators (ADTOs). A generic first-order ADTO is proposed which encompasses all the known first-order ADTOs introduced so far in the literature. New bounds on the operator parameters are derived, and a new algorithm is given for optimally selecting the parameters to give minimum parameter sensitivity.
Keywords: digital filters, low sensitivity, delta operator, shift operator, gamma operator.
I. INTRODUCTION
Important aspects in the development of linear models are practical properties such as parameter convergence, model robustness, parameter sensitivity, and parsimony of representation. In this note we consider the problem of parameter sensitivity which occurs when using a finite wordlength representation. Suppose we have a hardware implementation using two's complement arithmetic. Then any real-valued number can be represented in bits as ...
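The finite-wordlength effect behind parameter sensitivity can be demonstrated with a simple fixed-point rounding model. This is an illustrative sketch, not the paper's algorithm; the bit width and coefficients are hypothetical.

```python
def quantize(x, bits):
    """Round x to the nearest value on a fixed-point grid with the
    given number of bits of fractional resolution (two's-complement style)."""
    step = 2.0 ** -(bits - 1)
    return round(x / step) * step

# With 8 bits the grid step is 1/128, so nearby coefficients collapse
# onto the same representable value.
print(quantize(0.130, 8))  # 0.1328125
print(quantize(0.132, 8))  # 0.1328125
```

When a model's behaviour changes sharply with small coefficient perturbations (high sensitivity), this collapse translates directly into modeling error, which is what the optimal parameter selection aims to minimize.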
Alternative discrete-time operators and their application to nonlinear models
 1997
Abstract
The shift operator, defined as $q\,x(t) = x(t+1)$, is the basis for almost all discrete-time models. It has been shown, however, that linear models based on the shift operator suffer problems when used to model lightly-damped low-frequency (LDLF) systems, with poles near (1, 0) on the unit circle in the complex plane. This problem occurs under fast sampling conditions. As the sampling rate increases, coefficient sensitivity and round-off noise become a problem as the difference between successive sampled inputs becomes smaller and smaller. The resulting coefficients of the model approach the coefficients obtained in a binomial expansion, regardless of the underlying continuous-time system. This implies that for a given finite wordlength, severe inaccuracies may result. Wordlengths for the coefficients may also need to be made longer to accommodate models which have low-frequency characteristics, corresponding to poles in the neighbourhood of (1, 0).

These problems also arise in neural network models which comprise linear parts and nonlinear neural activation functions. Various alternative discrete-time operators can be introduced which offer numerical computational advantages over the conventional shift operator. The alternative discrete-time operators have been proposed independently of each other in the fields of digital filtering, adaptive control and neural networks. These include the delta, rho, gamma and bilinear operators. In this paper we first review these operators and examine some of their properties. An analysis of the TDNN and FIR MLP network structures is given which shows their susceptibility to parameter sensitivity problems. Subsequently, it is shown that models may be formulated using alternative discrete-time operators which have low-sensitivity properties. Consideration is given to the problem of finding parameters for stable alternative discrete-time operators.
A learning algorithm which adapts the alternative discrete-time operators' parameters online is presented for MLP neural network models based on alternative discrete-time operators. It is shown that neural network models which use these alternative discrete-time operators perform better than those using the shift operator alone.
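The fast-sampling pathology this abstract describes, and the fix offered by the delta operator $\delta = (q - 1)/\Delta$, can be seen on the simplest example: sampling the continuous system $\dot{x} = -a\,x$. This is an illustrative sketch under that assumed system, not code from the paper.

```python
import math

def shift_pole(a, dt):
    """Shift-operator pole of the sampled system x' = -a x; it crowds
    toward the point (1, 0) as the sampling interval dt shrinks."""
    return math.exp(-a * dt)

def delta_coeff(a, dt):
    """Equivalent delta-operator coefficient (pole - 1)/dt; it stays
    near the continuous-time value -a regardless of dt."""
    return (math.exp(-a * dt) - 1.0) / dt

for dt in (1.0, 0.01, 0.0001):
    print(dt, shift_pole(1.0, dt), delta_coeff(1.0, dt))
```

At dt = 0.0001 the shift pole is indistinguishable from 1 in a short wordlength, while the delta coefficient remains close to -1.0, well-scaled; this is the low-sensitivity property the alternative operators share.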