## Comparative study of stock trend prediction using time delay, recurrent and probabilistic neural networks (1998)

Venue: IEEE Transactions on Neural Networks

Citations: 37 (0 self)

### BibTeX

@ARTICLE{Saad98comparativestudy,

author = {Emad W. Saad and Danil V. Prokhorov and Donald C. Wunsch II},

title = {Comparative study of stock trend prediction using time delay, recurrent and probabilistic neural networks},

journal = {IEEE TRANSACTIONS ON NEURAL NETWORKS},

year = {1998},

volume = {9},

number = {6},

pages = {1456--1470}

}

### Abstract

Three networks are compared for low false alarm stock trend predictions. Short-term trends, particularly attractive for neural network analysis, can be used profitably in scenarios such as option trading, but only with significant risk. Therefore, we focus on limiting false alarms, which improves the risk/reward ratio by preventing losses. To predict stock trends, we exploit time delay, recurrent, and probabilistic neural networks (TDNN, RNN, and PNN, respectively), utilizing conjugate gradient and multistream extended Kalman filter training for TDNN and RNN. We also discuss different predictability analysis techniques and perform an analysis of predictability based on a history of daily closing price. Our results indicate that all the networks are feasible, the primary preference being one of convenience.

### Citations

3675 |
Neural Networks: A Comprehensive Foundation
- Haykin
Citation Context ...nts a threshold. The nonlinear activation function of the neuron then produces an output; i.e., for all neurons we used the hyperbolic tangent activation function, which usually accelerates training [5]. A. Cost Function: Conventional training of TDNN consists of minimizing a cost function, the mean square error of the network. Since we are willing to forego some profit opportunities ...

2678 |
Introduction to statistical pattern recognition
- Fukunaga
- 1990
Citation Context ...orecasting using the above techniques appear in Section VII. III. PROBABILISTIC NEURAL NETWORK: The probabilistic neural network (PNN) [17] is an algorithm for approximating the Bayesian decision rule [18]. We use a PNN with four layers of dedicated nodes (Fig. 2). Twenty-nine input nodes are fully connected with the next layer of pattern nodes. Input nodes distribute components of the input. The PNN ...

719 | Methods of conjugate gradients for solving linear systems - Hestenes, Stiefel - 1952 |

511 |
Detecting Strange Attractors in Turbulence
- Takens
- 1981
Citation Context ...ble and its lagged versions is called embedding [32]. A system with a d-dimensional attractor can be embedded in an m-dimensional space provided that m ≥ 2d + 1 (25). This is according to Takens' embedding theorem [33]. The attractor dimension can be estimated in different ways, including the method explained in the following section. The denominator is simply the variance, and it serves for...
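
The delay-embedding construction the snippet describes can be sketched in a few lines. This is an illustrative reconstruction, not code from the paper; the function name, parameters, and sine-wave example are my own. Given a scalar series, it stacks lagged copies to form delay-coordinate vectors.

```python
import numpy as np

def delay_embed(series, dim, lag):
    """Build the delay-coordinate matrix [x(t), x(t+lag), ..., x(t+(dim-1)*lag)]."""
    n = len(series) - (dim - 1) * lag
    if n <= 0:
        raise ValueError("series too short for this embedding")
    return np.column_stack([series[i * lag : i * lag + n] for i in range(dim)])

# Example: embed a noiseless sine wave in 3 dimensions with lag 5.
x = np.sin(np.linspace(0, 20 * np.pi, 1000))
emb = delay_embed(x, dim=3, lag=5)
print(emb.shape)  # (990, 3)
```

Each row of `emb` is one point of the reconstructed phase-space trajectory; Takens' theorem says that, for `dim ≥ 2d + 1`, this reconstruction preserves the attractor's topology.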

326 |
Phoneme recognition using time-delay neural networks
- Waibel, Hanazawa, et al.
- 1989
Citation Context ... study are feedforward multilayer perceptrons, where the internal weights are replaced by finite impulse response (FIR) filters (Fig. 1). This builds an internal memory for time series prediction [4]–[8]. Our goal is not price prediction but rather trend prediction, which can be formulated as a problem of pattern classification. An output of “1” corresponds to an upward trend of 2% or more, while an ...

274 |
Backpropagation through time: what does it do and how to do it
- Werbos
- 1990
Citation Context ... we compute derivatives of the RNN’s outputs, rather than output errors, with respect to the weights [24]. These derivatives are obtained through backpropagation through time or its truncated version [25], [26]. We store them in a set of matrices of appropriate dimensions. This set is obtained by truncated backpropagation through time, with depth 20, meaning that we do not use more than 20 cop...

255 |
Probabilistic neural networks
- Specht
- 1990
Citation Context ...izing conjugate gradient training [5], [9]–[16]. Results of TDNN forecasting using the above techniques appear in Section VII. III. PROBABILISTIC NEURAL NETWORK: The probabilistic neural network (PNN) [17] is an algorithm for approximating the Bayesian decision rule [18]. We use a PNN with four layers of dedicated nodes (Fig. 2). Twenty-nine input nodes are fully connected with the next layer of patter...

226 |
Independent coordinates for strange attractors from mutual information
- Fraser, Swinney
Citation Context ...ization. We then find the lag which gives the first zero of the correlation, and use it as the lag for the phase space. An alternative method is to choose the lag which gives the first minimum of the mutual information [35]. This is supported by the argument that the different coordinates should be uncorrelated (see also Section VI-D1). Unfortunately, according to (25), the method of plotting the phase diagram ...
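
The first lag-selection heuristic mentioned here (first zero of the autocorrelation) is easy to sketch. This is my own illustration on a synthetic cosine, not the paper's stock data, and the function name is hypothetical:

```python
import numpy as np

def first_zero_autocorr_lag(series, max_lag=100):
    """Return the smallest lag at which the sample autocorrelation first crosses zero."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()
    var = np.dot(x, x)
    for lag in range(1, max_lag):
        r = np.dot(x[:-lag], x[lag:]) / var
        if r <= 0.0:
            return lag
    return max_lag  # no zero crossing found within max_lag

# For a pure cosine of period 100 samples, the autocorrelation
# first crosses zero near a quarter period (lag ~ 25).
t = np.arange(2000)
x = np.cos(2 * np.pi * t / 100)
print(first_zero_autocorr_lag(x))
```

The mutual-information alternative cited from [35] replaces the linear correlation with the mutual information between the series and its lagged copy and takes its first minimum; the selection loop has the same shape.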

213 | Function minimization by conjugate gradients - Fletcher, Reeves - 1963 |

198 |
Procaccia I. Characterization of strange attractors
- Grassberger
- 1983
Citation Context ...ce unusable for most economic systems due to their high dimensionality. B. Correlation Dimension: This is one of the most popular measures of chaos. It was introduced by Grassberger and Procaccia [36], and it is a measure of the fractal dimension of a strange attractor. The name fractal comes from the fact that the dimension is not an integer [37]. For example, the attractor of a stable system is ...
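
The Grassberger–Procaccia estimator referred to here can be illustrated compactly. This is a hedged sketch (function names and the uniform-line example are mine): count the fraction of point pairs within radius r, then read the dimension off the slope of log C(r) versus log r.

```python
import numpy as np

def correlation_sum(points, r):
    """Fraction of distinct point pairs closer than r: the correlation sum C(r)."""
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    n = len(points)
    return (np.sum(d < r) - n) / (n * (n - 1))  # subtract the n zero self-distances

def correlation_dimension(points, r1, r2):
    """Slope of log C(r) between two radii approximates the fractal dimension."""
    c1, c2 = correlation_sum(points, r1), correlation_sum(points, r2)
    return (np.log(c2) - np.log(c1)) / (np.log(r2) - np.log(r1))

# Points uniform on a line segment have correlation dimension near 1.
rng = np.random.default_rng(0)
line = np.column_stack([rng.uniform(size=1500), np.zeros(1500)])
print(correlation_dimension(line, 0.01, 0.1))
```

In practice one computes C(r) over many radii and fits the slope of the linear scaling region rather than using just two radii; the two-point version above only shows the principle.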

196 | A theory of networks for approximation and learning - Poggio, Girosi - 1989 |

189 |
Predicting the Future: A Connectionist Approach, submitted to the
- Weigend, Rumelhart
Citation Context ...work, might be an improvement. If the conventional methods fail to calculate the system dimension, we can minimize the output error of a neural network as a function of the number of hidden neurons [40], [41]. This number can estimate the system dimension. 2) Multistep Prediction: Chaotic systems are characterized by the exponential divergence between two closely starting paths. This makes the multistep p...

173 |
ARTMAP: Supervised Real-Time Learning and Classification of Nonstationary Data by a Self-Organizing Neural Network
- Carpenter, Grossberg, et al.
- 1991
Citation Context ...ctions are in the forward direction. No derivatives are calculated. Therefore it is the fastest network in training. Other architectures exist which similarly have the advantage of fast training [42]–[46]. The weights between the input and pattern units are directly determined by training patterns. The weights of the output neuron are set according to (10). Training time is determined by duration of t...

172 | A time-delay neural network architecture for isolated word recognition - Lang, Waibel, et al. - 1990 |

135 | Fast learning in networks of locally tuned processing units - Moody, Darken - 1989 |

118 | Gradient-based learning algorithms for recurrent networks and their computational complexity
- Williams, Zipser
- 1992
Citation Context ...mpute derivatives of the RNN’s outputs, rather than output errors, with respect to the weights [24]. These derivatives are obtained through backpropagation through time or its truncated version [25], [26]. We store them in a set of matrices of appropriate dimensions. This set is obtained by truncated backpropagation through time, with depth 20, meaning that we do not use more than 20 copies of...

100 |
Numerical Recipes in C: The Art of Scientific Computing
- Press, et al.
- 1992
Citation Context ...ndence of the future step costs on the current step state. A detailed description of the training method can be found in [5]. We further enhance our system by utilizing conjugate gradient training [5], [9]–[16]. Results of TDNN forecasting using the above techniques appear in Section VII. III. PROBABILISTIC NEURAL NETWORK: The probabilistic neural network (PNN) [17] is an algorithm for approximating the...

84 | Economic Prediction Using Neural Networks: The Case of IBM Daily Stock Returns - White - 1988 |

83 | Modular construction of time-delay neural networks for speech recognition - Waibel - 1989 |

81 | Neural Network Time Series Forecasting of Financial Markets - Azoff - 1994 |

71 | Note sur la convergence de méthodes de directions conjuguées - Polak, Ribière - 1969 |

69 |
Progress in supervised neural networks
- Hush, Horne
- 1993
Citation Context ...g appear in Section VII. IV. RECURRENT NEURAL NETWORK AND ITS TRAINING: The recurrent neural network (RNN) considered in this paper (Fig. 3) is a type of discrete-time recurrent multilayer perceptron [23]. Temporal representation capabilities of this RNN can be better than those of purely feedforward networks, even with tapped-delay lines. Unlike other networks, the RNN is capable of representing and enco...

62 | Time series prediction by using a connectionist network with internal delays
- Wan
- 1994
Citation Context ...this study are feedforward multilayer perceptrons, where the internal weights are replaced by finite impulse response (FIR) filters (Fig. 1). This builds an internal memory for time series prediction [4]–[8]. Our goal is not price prediction but rather trend prediction, which can be formulated as a problem of pattern classification. An output of “1” corresponds to an upward trend of 2% or more, while...

57 |
Chaos and Nonlinear Dynamics: An Introduction for Scientists and Engineers, Oxford University Press
- Hilborn
- 1994
Citation Context ...ns, which is the main shortcoming of this technique. Each dimension is called the embedding dimension. The process of representing a system by one variable and its lagged versions is called embedding [32]. A system with a d-dimensional attractor can be embedded in an m-dimensional space provided that m ≥ 2d + 1 (25). This is according to Takens' embedding theorem [33]. The attractor dimension can be estimated by di...

49 | Nonlinear time sequence analysis - Grassberger, Schreiber, et al. - 1991 |

43 |
Conjugate gradient methods with inexact searches
- Shanno
- 1978
Citation Context ...ce of the future step costs on the current step state. A detailed description of the training method can be found in [5]. We further enhance our system by utilizing conjugate gradient training [5], [9]–[16]. Results of TDNN forecasting using the above techniques appear in Section VII. III. PROBABILISTIC NEURAL NETWORK: The probabilistic neural network (PNN) [17] is an algorithm for approximating the Baye...

33 | Neural networks in financial engineering: A study in methodology - Refenes, Burgess, et al. - 1997 |

31 | Stock performance modeling using neural networks: a comparative study with regression models. Neural Networks 7:375–388 - Refenes, Zapranis, et al. - 1994 |

31 | Bond rating: A nonconservative application of neural networks - Dutta, Shekhar - 1988 |

30 | Designing a neural network for forecasting financial and economic time series - Kaastra, Boyd - 1996 |

21 |
Deterministic chaos: the science and the fiction
- Ruelle
- 1990
Citation Context ..., this indicates that we have a chaotic series. 1) Effect of Finite Number of Data Points: If we use a series of finite length, it establishes an upper limit on the calculated correlation dimension [38]. Consider the slope of the correlation integral taken over a range of orders of magnitude in the radius; we then find (32). The lower limit of the correlation sum is set by the finite number of point pairs, and its upper limit is one. For large N, the calc...
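
The finite-sample limit referred to here can be reconstructed from Ruelle's standard argument; this is my reconstruction, as the paper's own equation (32) is not recoverable from the snippet. With $N$ points, the correlation sum is bounded by

$$\frac{2}{N(N-1)} \le C(r) \le 1,$$

so a slope read off over $k$ decades of the radius $r$ cannot exceed

$$D_{\max} = \frac{\log_{10} 1 - \log_{10}\frac{2}{N(N-1)}}{k} \approx \frac{2\log_{10} N}{k}.$$

For example, with $N = 1000$ points and one decade of scaling, no correlation dimension above about $2\log_{10} 1000 = 6$ can be resolved, whatever the true dimension of the system.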

18 | An application of a multiple neural network learning system to emulation of mortgage underwriting judgments - Collins, Ghosh, et al. - 1988 |

18 | Neural networks for bond rating improved by multiple hidden layers - Surkan, Singleton - 1990 |

16 |
Dynamic neural network methods applied to on-vehicle idle speed control
- Puskorius, Feldkamp, et al.
- 1996
Citation Context ... deeply hidden states, in which a network’s output depends on an arbitrary number of previous inputs. Among many methods proposed for training RNNs, extended Kalman filter (EKF) training stands out [24]. EKF training is a parameter identification technique for a nonlinear dynamic system (RNN). This method adapts weights of the network pattern-by-pattern, accumulating training information in approxima...

15 |
The Mathematical Theory of Probabilities
- Fisher
- 1923
Citation Context ...o our problem. This establishes the lower limit of performance when comparing results of various neural networks (see Section VII). The linear classifier we used is the Fisher linear classifier [18], [30], which has the form (18), where the input is the vector to be classified. It consists of a delay line of length 50, which carries the stock price on the day of purchase as well as on the previous 49 days. ...
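
As a baseline of the kind described, a Fisher linear classifier can be sketched as follows. This is a hedged illustration on synthetic Gaussian data, not the paper's 50-day price windows; all names are my own.

```python
import numpy as np

def fisher_fit(X0, X1):
    """Fisher discriminant: w = Sw^{-1}(m1 - m0), threshold at the midpoint projection."""
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # Within-class scatter: sum of the two class scatter matrices.
    Sw = np.cov(X0, rowvar=False) * (len(X0) - 1) + np.cov(X1, rowvar=False) * (len(X1) - 1)
    w = np.linalg.solve(Sw, m1 - m0)
    b = -0.5 * w @ (m0 + m1)
    return w, b

def fisher_predict(X, w, b):
    """Label 1 (e.g. 'up trend') when the projection exceeds the midpoint threshold."""
    return (X @ w + b > 0).astype(int)

# Two well-separated Gaussian classes in 4 dimensions.
rng = np.random.default_rng(1)
X0 = rng.normal(loc=-1.0, size=(500, 4))
X1 = rng.normal(loc=+1.0, size=(500, 4))
w, b = fisher_fit(X0, X1)
acc = np.mean(np.r_[fisher_predict(X0, w, b) == 0, fisher_predict(X1, w, b) == 1])
print(acc)
```

Because the classifier is linear, its accuracy on the stock-trend task gives the lower bound against which the TDNN, RNN, and PNN results are compared.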

13 | Testability of the arbitrage pricing theory by neural networks - Ahmadi - 1990 |

11 |
Chaos: A Program Collection for the PC
- Korsch, Jodl
- 1994
Citation Context ...t has been introduced by Grassberger and Procaccia [36], and it is a measure of the fractal dimension of a strange attractor. The name fractal comes from the fact that the dimension is not an integer [37]. For example, the attractor of a stable system is a point which has zero dimension. The attractor of an oscillating system is a circle which has two dimensions. These attractors have integer dimensio...

10 |
The future of time series: Learning and understanding, in Time Series Prediction: Forecasting the Future and Understanding the Past, Addison-Wesley
- Gershenfeld, Weigend
- 1993
Citation Context ...ay network, might be an improvement. If the conventional methods fail to calculate the system dimension, we can minimize the output error of a neural network as a function of the number of hidden neurons [40], [41]. This number can estimate the system dimension. 2) Multistep Prediction: Chaotic systems are characterized by the exponential divergence between two closely starting paths. This makes the multi...

10 |
Potential function algorithms for pattern recognition learning machines,” Automation and Remote Control
- Bashkirov, Braverman, et al.
- 1964
Citation Context ...connections are in the forward direction. No derivatives are calculated. Therefore it is the fastest network in training. Other architectures exist which similarly have the advantage of fast training [42]–[46]. The weights between the input and pattern units are directly determined by training patterns. The weights of the output neuron are set according to (10). Training time is determined by duration...

8 | A commodity trading model based on a neural network- expert system hybrid - Bergerson, Wunsch |

6 |
Probabilistic Neural Networks and General Regression Neural Networks, in Fuzzy Logic and Neural Network Handbook, Ch. 3, McGraw-Hill
- Specht
- 1995
Citation Context ...set. Each pattern node outputs a Gaussian kernel of the distance between the input and its training pattern, governed by a smoothing parameter. Other alternatives are available [17], including kernels with adaptable [19] and full covariance matrices [20]. In (9), a cost ratio gives the losses associated with false alarms relative to those associated with missed profit opportunities. We have used a value emphasizing the importance of av...
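
The PNN decision described here can be sketched as a minimal Parzen-window classifier. This sketch omits the paper's asymmetric loss weighting from (9)-(10) and uses a single global smoothing parameter; all names and the toy data are my own.

```python
import numpy as np

def pnn_classify(x, patterns, labels, sigma):
    """PNN decision: sum a Gaussian kernel per training pattern, pick the class with the larger sum."""
    d2 = np.sum((patterns - x) ** 2, axis=1)    # squared distance to each stored pattern
    k = np.exp(-d2 / (2.0 * sigma ** 2))        # pattern-node outputs
    scores = np.bincount(labels, weights=k)     # summation-node outputs, one per class
    return int(np.argmax(scores))

# Two Gaussian clusters: class 0 around (-1,-1), class 1 around (+1,+1).
rng = np.random.default_rng(2)
train = np.r_[rng.normal(-1, 0.5, (50, 2)), rng.normal(+1, 0.5, (50, 2))]
lab = np.r_[np.zeros(50, int), np.ones(50, int)]
print(pnn_classify(np.array([0.9, 1.1]), train, lab, sigma=0.5))  # query sits in the class-1 cluster
```

"Training" is just storing the patterns, which is why the snippet calls the PNN the fastest network to train; the cost is borne at classification time, where every stored pattern is evaluated.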

6 | Predicting Japanese Corporate Bankruptcy in terms of financial data using neural networks - Tsukada, Baba - 1994 |

5 | Classifying Trend Movements in the MSCI U.S.A. Capital Market Index - A Comparison of Regression, ARIMA and Neural Network Methods - Wood, Dasgupta - 1996 |

5 | Stock Price Prediction Using Neural Networks: An Empirical Test - Schoneburg - 1991 |

4 |
Probabilistic and time-delay neural-network techniques for conservative short-term stock trend prediction
- Tan, Prokhorov, et al.
- 1995
Citation Context ...niversity, Lubbock, TX 79409-3102 USA. D. V. Prokhorov is with the Scientific Research Laboratory, Ford Motor Co., Dearborn, MI 48121-2053 USA. Publisher Item Identifier S 1045-9227(98)07353-6. works [1]–[3]. This paper compares the three networks and evaluates them against a conventional method of prediction. Also, a predictability analysis of the stock data is presented and related to the neural-ne...

4 |
Time-delay neural network for small time series data sets
- Kreesuradej, Wunsch, et al.
- 1994
Citation Context ...ediction according to relation (39), which relates the first- and n-th-step prediction errors. We have previously compared neural and conventional (ARMA) techniques for price prediction [2], as opposed to the trend classification considered in this paper. While performance of both TDNN and ARMA is similar for single-step predictions, TDNN outperforms ARMA in accuracy of multistep prediction...

4 |
Training controllers for robustness: multi-stream DEKF
- Feldkamp, Puskorius
- 1994
Citation Context ...training with a batch-like update, without violating consistency between the weights and the approximate error covariance matrices, the multistream training approach was first proposed and tested in [27]. We assume a number of data streams, and it is then useful to consider that many copies of the same RNN (weights are identical for all the copies). Each copy is assigned a separate stream. We apply each copy of the RNN to a t...

4 | Input Variable Selection for Neural Networks; Application to Predicting the - Moody, Rehfuss, et al. - 1995 |

4 | Risk assessment of mortgage applications with a neural network system: An update as the test portfolio ages - Reilly, Collins, et al. - 1991 |

3 |
Advanced neural-network training methods for low false alarm stock trend prediction
- Saad, Prokhorov, et al.
- 1996
Citation Context ...rsity, Lubbock, TX 79409-3102 USA. D. V. Prokhorov is with the Scientific Research Laboratory, Ford Motor Co., Dearborn, MI 48121-2053 USA. Publisher Item Identifier S 1045-9227(98)07353-6. works [1]–[3]. This paper compares the three networks and evaluates them against a conventional method of prediction. Also, a predictability analysis of the stock data is presented and related to the neural-networ...