Results 1-10 of 47
Learning long-term dependencies in NARX recurrent neural networks
, 1996
Cited by 46 (5 self)
It has recently been shown that gradient-descent learning algorithms for recurrent neural networks can perform poorly on tasks that involve long-term dependencies, i.e. those problems for which the desired output depends on inputs presented at times far in the past. We show that the long-term dependencies problem is lessened for a class of architectures called NARX recurrent neural networks, which have powerful representational capabilities. We have previously reported that gradient-descent learning can be more effective in NARX networks than in recurrent neural network architectures that have "hidden states" on problems including grammatical inference and nonlinear system identification. Typically, the network converges much faster and generalizes better than other networks. The results in this paper are consistent with this phenomenon. We present some experimental results which show that NARX networks can often retain information for two to three times as long as conventi...
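As a rough illustration of the NARX idea mentioned in this abstract (feedback comes from delayed outputs rather than from hidden state), the following minimal sketch wires an arbitrary function `f` to two tapped delay lines. The names `make_narx`, `input_delays`, and `output_delays` are hypothetical, and the toy `f` used in the usage note is a hand-written stand-in, not a trained network or the authors' model.

```python
from collections import deque


def make_narx(f, input_delays, output_delays):
    """Build a step function computing
    y(t) = f([u(t), ..., u(t-Du)], [y(t-1), ..., y(t-Dy)]).

    `f` is any callable taking two lists; in a real NARX network it would be
    a trained feedforward net (an assumption of this sketch).
    """
    # Delay lines start at zero; the newest value sits at index 0.
    u_taps = deque([0.0] * (input_delays + 1), maxlen=input_delays + 1)
    y_taps = deque([0.0] * output_delays, maxlen=output_delays)

    def step(u):
        u_taps.appendleft(u)        # oldest input falls off the far end
        y = f(list(u_taps), list(y_taps))
        y_taps.appendleft(y)        # feed the output back through its delay line
        return y

    return step


def toy_f(u_hist, y_hist):
    """Toy map: current input plus the output from Dy steps ago."""
    return u_hist[0] + y_hist[-1]
```

With `step = make_narx(toy_f, input_delays=0, output_delays=3)`, feeding a single pulse followed by zeros makes the pulse reappear every three steps, which shows how the output delay line alone can carry information forward in time.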
Action Reaction Learning: Automatic Visual Analysis and Synthesis of Interactive Behaviour
 in Proc. International Conference on Vision Systems
, 1999
Cited by 38 (3 self)
We propose Action-Reaction Learning as an approach for analyzing and synthesizing human behaviour. This paradigm uncovers causal mappings between past and future events or between an action and its reaction by observing time sequences. We apply this method to analyze human interaction and to subsequently synthesize human behaviour. Using a time series of perceptual measurements, a system automatically uncovers correlations between past gestures from one human participant (an action) and a subsequent gesture (a reaction) from another participant. A probabilistic model is trained from data of the human interaction using a novel estimation technique, Conditional Expectation Maximization (CEM). The estimation uses general bounding and maximization to monotonically find the maximum conditional likelihood solution. The learning system drives a graphical interactive character which probabilistically predicts a likely response to a user's behaviour and performs it interactively. Thus, after analyzing human interaction in a pair of participants, the system is able to replace one of them and interact with a single remaining user.
Comparative study of stock trend prediction using time delay, recurrent and probabilistic neural networks
 IEEE TRANSACTIONS ON NEURAL NETWORKS
, 1998
Cited by 36 (0 self)
Three networks are compared for low false alarm stock trend predictions. Short-term trends, particularly attractive for neural network analysis, can be used profitably in scenarios such as option trading, but only with significant risk. Therefore, we focus on limiting false alarms, which improves the risk/reward ratio by preventing losses. To predict stock trends, we exploit time delay, recurrent, and probabilistic neural networks (TDNN, RNN, and PNN, respectively), utilizing conjugate gradient and multistream extended Kalman filter training for TDNN and RNN. We also discuss different predictability analysis techniques and perform an analysis of predictability based on a history of daily closing price. Our results indicate that all the networks are feasible, the primary preference being one of convenience.
Predictions with Confidence Intervals (Local Error Bars)
, 1994
Cited by 23 (3 self)
We present a new method for obtaining local error bars, i.e., estimates of the confidence in the predicted value that depend on the input. We approach this problem of nonlinear regression in a maximum likelihood framework. We demonstrate our technique first on computer-generated data with locally varying, normally distributed target noise. We then apply it to the laser data from the Santa Fe Time Series Competition. Finally, we extend the technique to estimate error bars for iterated predictions, and apply it to the exact competition task where it gives the best performance to date.
1 Obtaining Error Bars Using a Maximum Likelihood Framework
1.1 Motivation and Concept
Feedforward artificial neural networks are widely used and well-suited for nonlinear regression. They can be interpreted as predicting the expected value of the conditional target distribution as a function of (or "conditioned on") the input pattern (e.g., Buntine & Weigend, 1991). This target distribution in re...
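To make the "maximum likelihood framework" with normally distributed target noise concrete, here is a minimal sketch of the standard Gaussian negative log-likelihood, in which both the predicted mean and the predicted variance may depend on the input. The function names `gaussian_nll` and `error_bar` are hypothetical, and nothing here is taken from the paper's own implementation; it only shows why minimizing this cost pushes the predicted variance toward the squared residual.

```python
import math


def gaussian_nll(target, mean, variance):
    """Negative log-likelihood of `target` under a normal N(mean, variance).

    In a local-error-bar setup, `mean` and `variance` would both be outputs
    of a model evaluated at the same input (an assumption of this sketch).
    """
    if variance <= 0.0:
        raise ValueError("variance must be positive")
    return 0.5 * (math.log(2.0 * math.pi * variance)
                  + (target - mean) ** 2 / variance)


def error_bar(variance, z=1.0):
    """Half-width of a +/- z standard deviation error bar."""
    return z * math.sqrt(variance)
```

For a fixed residual of 2, the cost is smallest when the variance equals 4, the squared residual: `gaussian_nll(2.0, 0.0, 4.0)` is lower than the value at variance 3 or 5. Predicting too small a variance is penalized by the squared-error term, and predicting too large a variance is penalized by the log term.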
Learning a Class of Large Finite State Machines with a Recurrent Neural Network
, 1995
Cited by 20 (11 self)
One of the issues in any learning model is how it scales with problem size. The problem of learning finite state machines (FSMs) from examples with recurrent neural networks has been extensively explored. However, these results are somewhat disappointing in the sense that the machines that can be learned are too small to be competitive with existing grammatical inference algorithms. We show that a type of recurrent neural network (Narendra & Parthasarathy, 1990, IEEE Trans. Neural Networks, 1, 427) which has feedback but no hidden state neurons can learn a special type of FSM called a finite memory machine (FMM) under certain constraints. These machines have a large number of states (simulations are for 256 and 512 state FMMs) but have minimal order, relatively small depth and little logic when the FMM is implemented as a sequential machine,
An Anytime Approach To Connectionist Theory Refinement: Refining The Topologies Of Knowledge-Based Neural Networks
, 1995
Cited by 19 (3 self)
Many scientific and industrial problems can be better understood by learning from samples of the task at hand. For this reason, the machine learning and statistics communities devote considerable research effort to generating inductive-learning algorithms that try to learn the true "concept" of a task from a set of its examples. Oftentimes, however, one has additional resources readily available, but largely unused, that can improve the concept that these learning algorithms generate. These resources include available computer cycles, as well as prior knowledge describing what is currently known about the domain. Effective utilization of available computer time is important since for most domains an expert is willing to wait for weeks, or even months, if a learning system can produce an improved concept. Using prior knowledge is important since it can contain information not present in the current set of training examples. In this thesis, I present three "anytime" approaches to connec...
Learning long-term dependencies is not as difficult with NARX recurrent neural networks
, 1996
Cited by 18 (3 self)
It has recently been shown that gradient-descent learning algorithms for recurrent neural networks can perform poorly on tasks that involve long-term dependencies, i.e. those problems for which the desired output depends on inputs presented at times far in the past. In this paper we explore the long-term dependencies problem for a class of architectures called NARX recurrent neural networks, which have powerful representational capabilities. We have previously reported that gradient-descent learning is more effective in NARX networks than in recurrent neural network architectures that have "hidden states" on problems including grammatical inference and nonlinear system identification. Typically, the network converges much faster and generalizes better than other networks. The results in this paper are an attempt to explain this phenomenon. We present some experimental results which show that NARX networks can often retain information for two to three times as long as conventional rec...
Predicting the Stock Market
, 1998
Cited by 16 (1 self)
This paper presents a tutorial introduction to predictions of stock time series. The various approaches of technical and fundamental analysis are presented and the prediction problem is formulated as a special case of inductive learning. The problems with performance evaluation of near-random-walk processes are illustrated with examples together with guidelines for avoiding the risk of data-snooping. The connections to concepts like "the bias-variance dilemma", overtraining and model complexity are further covered. Existing benchmarks and testing metrics are surveyed and some new measures are introduced.
Diagrammatic Derivation of Gradient Algorithms for Neural Networks
 in Neural Computation
, 1994
Cited by 15 (1 self)
Deriving gradient algorithms for time-dependent neural network structures typically requires numerous chain rule expansions, diligent bookkeeping, and careful manipulation of terms. In this paper, we show how to use the principle of Network Reciprocity to derive such algorithms via a set of simple block diagram manipulation rules. The approach provides a common framework to derive popular algorithms including backpropagation and backpropagation-through-time without a single chain rule expansion. Additional examples are provided for a variety of complicated architectures to illustrate both the generality and the simplicity of the approach.
1 Introduction
Deriving the appropriate gradient descent algorithm for a new network architecture or system configuration normally involves brute force derivative calculations. For example, the celebrated backpropagation algorithm for training feedforward neural networks was derived by repeatedly applying chain rule expansions backward through the ne...
Time-Delay Neural Networks: Representation and Induction of Finite State Machines
 IEEE Transactions on Neural Networks
, 1997
Cited by 15 (6 self)
In this work, we characterize and contrast the capabilities of the general class of time-delay neural networks (TDNN) with those of input delay neural networks (IDNN), the subclass of TDNNs with delays limited to the inputs. Each class of networks is capable of representing the same set of languages, those embodied by the definite memory machines (DMM), a subclass of finite state machines. We demonstrate the close affinity between TDNNs and DMM languages by learning a very large DMM (2048 states) using only a few training examples. Even though both architectures are capable of representing the same class of languages, they have distinguishable learning biases. Intuition suggests that general TDNNs which include delays in hidden layers should perform well, compared to IDNNs, on problems in which the output can be expressed as a function on narrow input windows which repeat in time. On the other hand, these general TDNNs should perform poorly when the input windows are wide, or there is little r...
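For readers unfamiliar with definite memory machines, the sketch below shows the defining property referred to in this abstract: the output at each step depends only on a fixed-length window of the most recent inputs, with no feedback. The helper `run_dmm`, its parameters, and the choice to emit 0 until the window fills are all assumptions made for illustration, not details from the paper.

```python
from collections import deque


def run_dmm(symbols, order, window_fn):
    """Run a definite memory machine of the given order over `symbols`.

    The output at each step is window_fn(last `order` inputs). Until the
    window is full, this sketch emits 0 (an illustrative convention).
    """
    window = deque(maxlen=order)    # keeps only the most recent `order` inputs
    outputs = []
    for s in symbols:
        window.append(s)
        if len(window) == order:
            outputs.append(window_fn(tuple(window)))
        else:
            outputs.append(0)
    return outputs


def ends_with_101(window):
    """Example window function: 1 when the last three bits are 1, 0, 1."""
    return int(window == (1, 0, 1))
```

Calling `run_dmm([1, 0, 1, 0, 1, 1, 0, 1], order=3, window_fn=ends_with_101)` flags each position where the last three bits are 1, 0, 1. An input-delay network plays the role of `window_fn` here: it sees only the tapped input window and never its own past outputs.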