Learning long-term dependencies with gradient descent is difficult (1994)

by Y Bengio, P Simard, P Frasconi
Venue: IEEE Transactions on Neural Networks