Results 1–10 of 75
Learning long-term dependencies in NARX recurrent neural networks
 IEEE Transactions on Neural Networks
, 1996
Cited by 68 (5 self)
Abstract: It has recently been shown that gradient-descent learning algorithms for recurrent neural networks can perform poorly on tasks that involve long-term dependencies, i.e., those problems for which the desired output depends on inputs presented at times far in the past. We show that the long-term dependencies problem is lessened for a class of architectures called Nonlinear AutoRegressive models with eXogenous inputs (NARX) recurrent neural networks, which have powerful representational capabilities. We have previously reported that gradient descent learning can be more effective in NARX networks than in recurrent neural network architectures that have “hidden states” on problems including grammatical inference and nonlinear system identification. Typically, the network converges much faster and generalizes better than other networks. The results in this paper are consistent with this phenomenon. We present some experimental results which show that NARX networks can often retain information for two to three times as long as conventional recurrent neural networks. We show that although NARX networks do not circumvent the problem of long-term dependencies, they can greatly improve performance on long-term dependency problems. We also describe in detail some of the assumptions regarding what it means to latch information robustly and suggest possible ways to loosen these assumptions.
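The defining structure of a NARX model — the next output computed from tapped delay lines over the exogenous input and the network's own past outputs, with no hidden state — can be sketched as below. The tanh hidden layer, delay orders, and random weights are illustrative assumptions, not the paper's exact setup:

```python
import numpy as np

rng = np.random.default_rng(0)

def narx_step(u_taps, y_taps, W, b, v, c):
    """One NARX step: the next output is a nonlinear function of
    delayed inputs u and delayed outputs y (output feedback only)."""
    z = np.concatenate([u_taps, y_taps])
    h = np.tanh(W @ z + b)          # single hidden layer (illustrative choice)
    return float(v @ h + c)

def run_narx(u, d_u=3, d_y=3, n_hidden=5):
    """Run a randomly initialized NARX model over an input sequence."""
    W = rng.normal(scale=0.5, size=(n_hidden, d_u + d_y))
    b = rng.normal(scale=0.1, size=n_hidden)
    v = rng.normal(scale=0.5, size=n_hidden)
    c = 0.0
    y = [0.0] * d_y                 # zero-initialized output taps
    for t in range(d_u - 1, len(u)):
        u_taps = u[t - d_u + 1 : t + 1]
        y_taps = np.array(y[-d_y:])
        y.append(narx_step(u_taps, y_taps, W, b, v, c))
    return np.array(y[d_y:])

out = run_narx(np.sin(np.linspace(0, 6, 50)))
print(out.shape)  # one prediction per input step once the delay line is full
```

Because the feedback taps carry past outputs directly, a gradient flowing back k steps crosses far fewer nonlinearities than in a hidden-state recurrent network — the intuition behind the paper's result.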
Comparative study of stock trend prediction using time delay, recurrent and probabilistic neural networks
 IEEE Transactions on Neural Networks
, 1998
Cited by 53 (0 self)
Three networks are compared for low false alarm stock trend predictions. Short-term trends, particularly attractive for neural network analysis, can be used profitably in scenarios such as option trading, but only with significant risk. Therefore, we focus on limiting false alarms, which improves the risk/reward ratio by preventing losses. To predict stock trends, we exploit time delay, recurrent, and probabilistic neural networks (TDNN, RNN, and PNN, respectively), utilizing conjugate gradient and multi-stream extended Kalman filter training for TDNN and RNN. We also discuss different predictability analysis techniques and perform an analysis of predictability based on a history of daily closing prices. Our results indicate that all three networks are feasible, the primary preference being one of convenience.
Action Reaction Learning: Automatic Visual Analysis and Synthesis of Interactive Behaviour
 in Proc. International Conference on Vision Systems
, 1999
Cited by 44 (3 self)
We propose Action-Reaction Learning as an approach for analyzing and synthesizing human behaviour. This paradigm uncovers causal mappings between past and future events or between an action and its reaction by observing time sequences. We apply this method to analyze human interaction and to subsequently synthesize human behaviour. Using a time series of perceptual measurements, a system automatically uncovers correlations between past gestures from one human participant (an action) and a subsequent gesture (a reaction) from another participant. A probabilistic model is trained from data of the human interaction using a novel estimation technique, Conditional Expectation Maximization (CEM). The estimation uses general bounding and maximization to monotonically find the maximum conditional likelihood solution. The learning system drives a graphical interactive character which probabilistically predicts a likely response to a user's behaviour and performs it interactively. Thus, after analyzing human interaction in a pair of participants, the system is able to replace one of them and interact with a single remaining user.
Predicting the Stock Market
, 1998
Cited by 31 (1 self)
This paper presents a tutorial introduction to predictions of stock time series. The various approaches of technical and fundamental analysis are presented, and the prediction problem is formulated as a special case of inductive learning. The problems with performance evaluation of near-random-walk processes are illustrated with examples, together with guidelines for avoiding the risk of data snooping. The connections to concepts like "the bias-variance dilemma", overtraining, and model complexity are further covered. Existing benchmarks and testing metrics are surveyed and some new measures are introduced.
Predictions with Confidence Intervals (Local Error Bars)
, 1994
Cited by 27 (3 self)
We present a new method for obtaining local error bars, i.e., estimates of the confidence in the predicted value that depend on the input. We approach this problem of nonlinear regression in a maximum likelihood framework. We demonstrate our technique first on computer-generated data with locally varying, normally distributed target noise. We then apply it to the laser data from the Santa Fe Time Series Competition. Finally, we extend the technique to estimate error bars for iterated predictions, and apply it to the exact competition task, where it gives the best performance to date.

1 Obtaining Error Bars Using a Maximum Likelihood Framework

1.1 Motivation and Concept

Feedforward artificial neural networks are widely used and well-suited for nonlinear regression. They can be interpreted as predicting the expected value of the conditional target distribution as a function of (or "conditioned on") the input pattern (e.g., Buntine & Weigend, 1991). This target distribution in re...
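The maximum-likelihood view of input-dependent error bars can be illustrated with the per-pattern negative log-likelihood of a Gaussian whose mean and variance are both predicted from the input. This is a minimal sketch of the loss such a network would minimize, not the paper's full training procedure:

```python
import numpy as np

def gaussian_nll(y, mu, log_var):
    """Per-pattern negative log-likelihood of a Gaussian N(mu, var)
    with input-dependent variance: a larger predicted variance widens
    the local error bar but is penalized by the log-variance term."""
    var = np.exp(log_var)
    return 0.5 * (np.log(2 * np.pi) + log_var + (y - mu) ** 2 / var)

# For a fixed residual, the loss is minimized when the predicted
# variance matches the squared error, i.e. at log_var = log((y - mu)^2).
y, mu = 1.0, 0.0
losses = [gaussian_nll(y, mu, lv) for lv in (-2.0, 0.0, 2.0)]
print(min(range(3), key=lambda i: losses[i]))  # → 1 (log_var = 0, var = 1)
```

Training a network with two output heads (mean and log-variance) on this loss yields the input-dependent error bars described above; the two-head parameterization here is a common choice, assumed for illustration.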
Spectral basis neural networks for realtime travel time forecasting
 ASCE Journal of Transportation Engineering
, 1999
Cited by 25 (3 self)
Abstract: This paper examines how real-time information gathered as part of intelligent transportation systems can be used to predict link travel times for one through five time periods ahead (of 5-min duration). The study employed a spectral basis artificial neural network (SNN) that utilizes a sinusoidal transformation technique to increase the linear separability of the input features. Link travel times from Houston that had been collected as part of the automatic vehicle identification system of the TranStar system were used as a test bed. It was found that the SNN outperformed a conventional artificial neural network and gave similar results to those of modular neural networks. However, the SNN requires significantly less effort on the part of the modeler than modular neural networks. The results of the best SNN were compared with conventional link travel time prediction techniques, including a Kalman filtering model, an exponential smoothing model, a historical profile, and a real-time profile. It was found that the SNN gave the best overall results.
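One way to realize a sinusoidal transformation of input features is to augment each (already scaled) feature with sin(kπx) terms, giving a downstream linear layer more separable coordinates. The harmonic count and scaling here are assumptions for illustration, not the paper's exact basis:

```python
import numpy as np

def sinusoidal_expand(X, n_harmonics=2):
    """Append sinusoidal transforms sin(k*pi*x), k = 1..n_harmonics,
    to each column of X (features assumed scaled to [0, 1])."""
    feats = [X]
    for k in range(1, n_harmonics + 1):
        feats.append(np.sin(k * np.pi * X))
    return np.hstack(feats)

X = np.array([[0.0, 0.5], [0.25, 1.0]])
print(sinusoidal_expand(X).shape)  # (2, 6): 2 original + 4 sinusoidal columns
```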
Learning a Class of Large Finite State Machines with a Recurrent Neural Network
, 1995
Cited by 22 (11 self)
One of the issues in any learning model is how it scales with problem size. The problem of learning finite state machines (FSMs) from examples with recurrent neural networks has been extensively explored. However, these results are somewhat disappointing in the sense that the machines that can be learned are too small to be competitive with existing grammatical inference algorithms. We show that a type of recurrent neural network (Narendra & Parthasarathy, 1990, IEEE Trans. Neural Networks, 1, 427) which has feedback but no hidden state neurons can learn a special type of FSM called a finite memory machine (FMM) under certain constraints. These machines have a large number of states (simulations are for 256- and 512-state FMMs) but have minimal order, relatively small depth, and little logic when the FMM is implemented as a sequential machine.
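A finite memory machine can be simulated directly as a pure function of the last n inputs and last m outputs — which is why a feedback network with no hidden state neurons suffices to represent it. A toy sketch (the XOR lookup table is an illustrative choice, not one of the paper's benchmark machines):

```python
from itertools import product

def make_fmm(table, n, m):
    """A finite memory machine: the next output is a lookup over the
    last n inputs and last m outputs; no hidden state is kept."""
    def run(inputs, init_out=0):
        ins, outs = [0] * n, [init_out] * m
        for x in inputs:
            ins = ins[1:] + [x]               # shift the input taps
            y = table[tuple(ins + outs)]      # pure function of the taps
            outs = outs[1:] + [y]             # shift the output taps
        return outs[-1]
    return run

# Binary-alphabet FMM with n = m = 1: output = XOR of last input and
# last output, i.e. the running parity of the input sequence.
table = {(i, o): i ^ o for i, o in product((0, 1), repeat=2)}
fmm = make_fmm(table, 1, 1)
print(fmm([1, 1, 0, 1]))  # → 1 (three ones seen: odd parity)
```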
An Anytime Approach To Connectionist Theory Refinement: Refining The Topologies Of KnowledgeBased Neural Networks
, 1995
Cited by 20 (3 self)
Many scientific and industrial problems can be better understood by learning from samples of the task at hand. For this reason, the machine learning and statistics communities devote considerable research effort to generating inductive-learning algorithms that try to learn the true "concept" of a task from a set of its examples. Oftentimes, however, one has additional resources readily available, but largely unused, that can improve the concept that these learning algorithms generate. These resources include available computer cycles, as well as prior knowledge describing what is currently known about the domain. Effective utilization of available computer time is important since for most domains an expert is willing to wait for weeks, or even months, if a learning system can produce an improved concept. Using prior knowledge is important since it can contain information not present in the current set of training examples. In this thesis, I present three "anytime" approaches to connec...
Diagrammatic Derivation of Gradient Algorithms for Neural Networks
 in Neural Computation
, 1994
Cited by 20 (1 self)
Deriving gradient algorithms for time-dependent neural network structures typically requires numerous chain rule expansions, diligent bookkeeping, and careful manipulation of terms. In this paper, we show how to use the principle of Network Reciprocity to derive such algorithms via a set of simple block diagram manipulation rules. The approach provides a common framework to derive popular algorithms, including backpropagation and backpropagation-through-time, without a single chain rule expansion. Additional examples are provided for a variety of complicated architectures to illustrate both the generality and the simplicity of the approach.

1 Introduction

Deriving the appropriate gradient descent algorithm for a new network architecture or system configuration normally involves brute force derivative calculations. For example, the celebrated backpropagation algorithm for training feedforward neural networks was derived by repeatedly applying chain rule expansions backward through the ne...
Local Learning for Iterated Time Series Prediction
 In
, 1999
Cited by 20 (7 self)
We introduce and discuss a local method to learn one-step-ahead predictors for iterated time series forecasting. For each single one-step-ahead prediction, our method selects among different alternatives a local model representation on the basis of a local cross-validation procedure. In the literature, local learning is generally used for function estimation tasks which do not take temporal behaviors into account. Our technique extends this approach to the problem of long-horizon prediction by proposing a local model selection based on an iterated version of the PRESS leave-one-out statistic. In order to show the effectiveness of our method, we present the results obtained on two time series from the Santa Fe competition and on a time series proposed in a recent international contest.

1 Introduction

The use of local memory-based approximators for time series analysis has been the focus of numerous studies in the literature [5, 14]. Memory-based approaches do not estimate a global model...
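For a local linear model, the plain (non-iterated) PRESS leave-one-out statistic on which such model selection rests has a closed form via the hat matrix, so no refitting is needed: each leave-one-out residual is the ordinary residual divided by (1 − h_ii). A minimal sketch, not the paper's iterated variant:

```python
import numpy as np

def press_statistic(X, y):
    """PRESS (predicted residual error sum of squares) for a linear
    least-squares fit, computed from the hat matrix H = X (X'X)^+ X':
    the i-th leave-one-out residual is e_i / (1 - h_ii)."""
    H = X @ np.linalg.pinv(X.T @ X) @ X.T
    resid = y - H @ y
    loo = resid / (1.0 - np.diag(H))
    return float(loo @ loo)

rng = np.random.default_rng(1)
X = np.column_stack([np.ones(20), rng.normal(size=20)])   # intercept + one regressor
y = 2.0 + 3.0 * X[:, 1] + 0.1 * rng.normal(size=20)
print(press_statistic(X, y))  # small positive value for this near-linear data
```

Among candidate local models, one would keep the model with the smallest PRESS; the iterated version described above extends this criterion across the prediction horizon.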