Learning Long-Term Dependencies with Gradient Descent is Difficult

by Yoshua Bengio , Patrice Simard , Paolo Frasconi
Venue: TO APPEAR IN THE SPECIAL ISSUE ON RECURRENT NETWORKS OF THE IEEE TRANSACTIONS ON NEURAL NETWORKS
Citations:256 - 24 self