Long Short Term Memory (1995)

by Sepp Hochreiter, Jürgen Schmidhuber
Citations: 178 (51 self)

BibTeX

@MISC{Hochreiter95longshort,
    author = {Sepp Hochreiter and Jürgen Schmidhuber},
    title = {Long Short Term Memory},
    year = {1995}
}

Abstract

"Recurrent backprop" for learning to store information over extended time periods takes too long. The main reason is insufficient, decaying error back flow. We describe a novel, efficient "Long Short Term Memory" (LSTM) that overcomes this and related problems. Unlike previous approaches, LSTM can learn to bridge arbitrary time lags by enforcing constant error flow. Using gradient descent, LSTM explicitly learns when to store information and when to access it. In experimental comparisons with "Real-Time Recurrent Learning", "Recurrent Cascade-Correlation", "Elman nets", and "Neural Sequence Chunking", LSTM leads to many more successful runs, and learns much faster. Unlike its competitors, LSTM can solve tasks involving minimal time lags of more than 1000 time steps, even in noisy environments.

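As a rough illustration of the "decaying error back flow" mentioned in the abstract, the sketch below shows how an error signal is rescaled when it is propagated back through a single self-connected unit. It is not code from the paper: the function name `backflow_factor`, its parameters, and the simplifying assumption that the activation slope is the same at every time step are all hypothetical choices made for this example.

```python
# Illustrative sketch only; not code from the paper.
# Assumption: a single self-connected unit, linearised so that the slope of
# its activation function is identical at every time step.

def backflow_factor(recurrent_weight: float, activation_slope: float, steps: int) -> float:
    """Factor by which an error signal is scaled after flowing back `steps` time steps."""
    # Each backward step multiplies the error by (weight * slope),
    # so the total scaling is that product raised to the number of steps.
    return (recurrent_weight * activation_slope) ** steps


if __name__ == "__main__":
    # Compare a weight below 1, exactly 1, and above 1 over a 100-step lag.
    for weight in (0.9, 1.0, 1.1):
        print(weight, backflow_factor(weight, 1.0, 100))
```

Under these assumptions, a per-step product below 1 makes the error shrink exponentially with the time lag, a product above 1 makes it grow exponentially, and a product of exactly 1 leaves it unchanged, which is the "constant error flow" case the abstract describes.
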
Citations

1313 Finding structure in time - Elman - 1990
214 Learning long-term dependencies with gradient descent is difficult - Bengio, Simard, et al. - 1994
151 A time-delay neural network architecture for isolated word recognition - Lang, Waibel, et al. - 1990
150 Learning state space trajectories in recurrent neural networks - Pearlmutter - 1989
119 Gradient calculations for dynamic recurrent neural networks: A survey - Pearlmutter - 1995
105 An efficient gradient-based algorithm for on-line training of recurrent network trajectories - Williams, Peng - 1990
82 Induction of finite-state languages using second-order recurrent networks - Watrous, Kuhn - 1992
76 The utility driven dynamic error propagation network - Robinson, Fallside - 1987
66 A focused back-propagation algorithm for temporal pattern recognition - Mozer - 1989
60 Neurocontrol of nonlinear dynamical systems with Kalman filter trained recurrent networks - Puskorius, Feldkamp - 1994
54 Induction of multiscale temporal structure - Mozer - 1992
54 Learning complex, extended sequences using the principle of history compression - Schmidhuber - 1992
31 A fixed size storage O(n^3) time complexity learning algorithm for fully recurrent continually running networks - Schmidhuber - 1992
30 Untersuchungen zu dynamischen neuronalen Netzen. Diploma thesis - Hochreiter - 1991
28 Experimental comparison of the effect of order in recurrent neural networks - Miller, Giles - 1993
27 The recurrent cascade-correlation learning algorithm - Fahlman - 1991
26 Learning sequential tasks by incrementally adding higher orders - Ring - 1993
25 Credit assignment through time: Alternatives to backpropagation - Bengio, Frasconi - 1994
25 Adaptive neural oscillator using continuous-time backpropagation learning - Doya, Yoshizawa - 1989
22 Finite-state automata and simple recurrent networks - Cleeremans, Servan-Schreiber, et al. - 1989
21 Complexity of exact gradient computation algorithms for recurrent neural networks (Tech. Rep - Williams - 1989
17 Learning sequential structures with the real-time recurrent learning algorithm - Smith, Zipser - 1989
16 A theory for neural networks with time delays - Vries, Principe
15 Learning long-term dependencies is not as difficult with NARX recurrent neural networks - Lin, Horne, et al. - 1996
14 The Neural Bucket Brigade: A local learning algorithm for dynamic feedforward and recurrent networks - Schmidhuber - 1989
14 Guessing can outperform many long time lag algorithms - Schmidhuber, Hochreiter - 1996
10 Holographic recurrent networks - Plate - 1992
9 Time warping invariant neural networks - Sun, Chen, et al. - 1993
6 Netzwerkarchitekturen, Zielfunktionen und Kettenregel - Schmidhuber - 1993
3 Language induction by phase transition in dynamical recognizers - Pollack - 1991