## Diffusion of context and credit information in Markovian models (1995)

Venue: Journal of Artificial Intelligence Research

Citations: 18 (2 self)

@ARTICLE{Bengio95diffusionof,

author = {Yoshua Bengio and Paolo Frasconi},

title = {Diffusion of context and credit information in Markovian models},

journal = {Journal of Artificial Intelligence Research},

year = {1995},

volume = {3},

pages = {249--270}

}

### Abstract

This paper studies the problem of ergodicity of transition probability matrices in Markovian models, such as hidden Markov models (HMMs), and how it makes very difficult the task of learning to represent long-term context for sequential data. This phenomenon hurts the forward propagation of long-term context information, as well as learning a hidden state representation to represent long-term context, which depends on propagating credit information backwards in time. Using results from Markov chain theory, we show that this problem of diffusion of context and credit is reduced when the transition probabilities approach 0 or 1, i.e., the transition probability matrices are sparse and the model essentially deterministic. The results found in this paper apply to learning approaches based on continuous optimization, such as gradient descent and the Baum-Welch algorithm.
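The diffusion effect described in the abstract can be illustrated with a small numerical sketch. The two 2-state transition matrices below are illustrative assumptions, not taken from the paper, and `row_spread` is a hypothetical helper that measures how much the t-step transition probabilities still depend on the starting state. When the rows of the t-step matrix become nearly identical, information about the initial state has been lost.

```python
def mat_mul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def mat_power(a, t):
    """Return the t-step transition matrix a^t by repeated multiplication."""
    n = len(a)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(t):
        result = mat_mul(result, a)
    return result


def row_spread(a):
    """Largest gap between any two rows within a column (hypothetical helper).

    A value near 0 means every starting state leads to almost the same
    distribution, i.e. the context has diffused away.
    """
    return max(max(col) - min(col) for col in zip(*a))


# Illustrative matrices (assumptions, not from the paper).
diffuse = [[0.6, 0.4], [0.3, 0.7]]        # probabilities far from 0 and 1
near_det = [[0.99, 0.01], [0.01, 0.99]]   # nearly deterministic

for t in (1, 5, 20):
    print(t, row_spread(mat_power(diffuse, t)), row_spread(mat_power(near_det, t)))
```

For a 2-state chain the spread after t steps equals the second eigenvalue raised to the power t, so the first matrix decays as 0.3^t while the second decays only as 0.98^t. This is one simple way to see why nearly deterministic transitions preserve long-term context for longer.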

