Learning long-term dependencies with gradient descent is difficult (1994)

by Y Bengio, P Frasconi, P Simard
Venue: IEEE Transactions on Neural Networks