Results 11–20 of 215
Formal Theory of Creativity, Fun, and Intrinsic Motivation (1990–2010)
Abstract

Cited by 34 (14 self)
The simple but general formal theory of fun & intrinsic motivation & creativity (1990) is based on the concept of maximizing intrinsic reward for the active creation or discovery of novel, surprising patterns allowing for improved prediction or data compression. It generalizes the traditional field of active learning, and is related to old but less formal ideas in aesthetics theory and developmental psychology. It has been argued that the theory explains many essential aspects of intelligence, including autonomous development, science, art, music, and humor. This overview first describes theoretically optimal (but not necessarily practical) ways of implementing the basic computational principles on exploratory, intrinsically motivated agents or robots, encouraging them to provoke event sequences exhibiting previously unknown but learnable algorithmic regularities. Emphasis is put on the importance of limited computational resources for online prediction and compression. Discrete and continuous time formulations are given. Previous practical but non-optimal implementations (1991, 1995, 1997–2002) are reviewed, as well as several recent variants by others (2005). A simplified typology addresses current confusion concerning the precise nature of intrinsic motivation.
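The compression-based reward can be caricatured with an off-the-shelf compressor standing in for the agent's adaptive world model. This is only a rough proxy: zlib does not learn, so it rewards discovered redundancy in a single pass rather than the theory's true quantity, the model's compression *progress* over time (all names below are illustrative):

```python
import random
import zlib

def C(data: bytes) -> int:
    """Compressed size in bytes -- a crude stand-in for the agent's model cost."""
    return len(zlib.compress(data, 9))

def intrinsic_reward(history: bytes, obs: bytes) -> int:
    """Bytes saved by compressing the observation together with the history:
    a proxy for how much learnable regularity the two share."""
    return C(history) + C(obs) - C(history + obs)

rng = random.Random(0)
history = bytes(rng.randrange(256) for _ in range(400))
familiar = history                                      # a re-occurring pattern
noise = bytes(rng.randrange(256) for _ in range(400))   # unrelated randomness

r_familiar = intrinsic_reward(history, familiar)  # large: regularity found
r_noise = intrinsic_reward(history, noise)        # near zero: nothing learnable
```

Under this proxy, an observation sharing structure with the history earns a large reward, while incompressible noise earns almost none, mirroring the theory's claim that both the already-known and the unlearnable are boring.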
New Results on Recurrent Network Training: Unifying the Algorithms and Accelerating Convergence
 IEEE TRANS. NEURAL NETWORKS
, 2000
Abstract

Cited by 33 (1 self)
How to efficiently train recurrent networks remains a challenging and active research topic. Most of the proposed training approaches are based on computational ways to efficiently obtain the gradient of the error function, and can generally be grouped into five major classes. In this study we present a derivation that unifies these approaches. We demonstrate that the approaches are only five different ways of solving a particular matrix equation. The second goal of this paper is to develop a new algorithm based on the insights gained from the novel formulation. The new algorithm, which is based on approximating the error gradient, has lower computational complexity in computing the weight update than the competing techniques for most typical problems. In addition, it reaches the error minimum in a much smaller number of iterations. A desirable characteristic of recurrent network training algorithms is the ability to update the weights in an online fashion. We have also developed an online version of the proposed algorithm, based on updating the error gradient approximation in a recursive manner.
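The unification claim, that the gradient algorithms are different orderings of one computation, can be checked on a toy scalar recurrent model, where forward-mode (RTRL-style) and backward-mode (BPTT-style) accumulation must yield the same gradient (a sketch, not the paper's algorithm; all constants are illustrative):

```python
# Scalar recurrent net: h[t] = w*h[t-1] + x[t], loss E = (h[T] - y)^2.
w, y = 0.7, 1.5
xs = [0.2, -0.4, 0.9, 0.1]

# Forward pass.
h = [0.0]
for x in xs:
    h.append(w * h[-1] + x)

# RTRL-style: carry the sensitivity dh/dw forward through time.
dhdw = 0.0
for t in range(len(xs)):
    dhdw = h[t] + w * dhdw          # d/dw of (w*h[t-1] + x[t])
g_rtrl = 2 * (h[-1] - y) * dhdw

# BPTT-style: propagate the error signal backward through time.
lam = 2 * (h[-1] - y)
g_bptt = 0.0
for t in range(len(xs), 0, -1):
    g_bptt += lam * h[t - 1]
    lam *= w
```

Both loops multiply out the same chain-rule terms, just in opposite orders, so `g_rtrl` and `g_bptt` agree to floating-point precision.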
Reinforcement Learning with Long Short-Term Memory
 In NIPS
, 2002
Abstract

Cited by 32 (4 self)
This paper presents reinforcement learning with a Long Short-Term Memory recurrent neural network: RL-LSTM. Model-free RL-LSTM using Advantage(λ) learning and directed exploration can solve non-Markovian tasks with long-term dependencies between relevant events. This is demonstrated in a T-maze task, as well as in a difficult variation of the pole balancing task.
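The Advantage-learning update that the paper combines with the LSTM can be illustrated in tabular form on a one-step task (a toy stand-in: the paper's version uses an LSTM value function, eligibility traces, and directed exploration; the constants here are illustrative):

```python
# Tabular advantage learning (Harmon & Baird): the target for A(s,a) is
#   max_a' A(s,a') + (r + gamma * max_a'' A(s',a'') - max_a' A(s,a')) / k
# which, for k < 1, widens the gap between optimal and suboptimal actions.
alpha, gamma, k = 0.1, 0.9, 0.3
A = [0.0, 0.0]                 # one state, two actions
rewards = [0.0, 1.0]           # action 1 pays off; the episode then ends

for _ in range(500):
    for a in (0, 1):
        m = max(A)
        # Next state is terminal, so the bootstrapped value is 0.
        target = m + (rewards[a] + gamma * 0.0 - m) / k
        A[a] += alpha * (target - A[a])
```

After training, the rewarding action's advantage converges to the true value 1 while the other action is pushed well below it, the sharpened action gap being the point of advantage learning.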
Backpropagation-Decorrelation: online recurrent learning with O(N) complexity
Abstract

Cited by 31 (3 self)
We introduce a new learning rule for fully recurrent neural networks which we call the Backpropagation-Decorrelation rule (BPDC). It combines two important principles: one-step backpropagation of errors and the use of temporal memory in the network dynamics by means of decorrelation of activations. The BPDC rule is derived and theoretically justified by regarding learning as a constrained optimization problem, and it applies uniformly in discrete and continuous time. It is very easy to implement and has a minimal complexity of 2N multiplications per time step in the single-output case. Nevertheless, we obtain fast tracking and excellent performance on some benchmark problems, including the Mackey-Glass time series.
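The flavor of the single-output rule, an O(N)-per-step update of only the output weights, scaled by the squared state norm, can be sketched as a normalized-LMS readout on a fixed random recurrent reservoir. This is a simplification for illustration, not Steil's exact BPDC derivation, and every constant below is an assumption:

```python
import math
import random

rng = random.Random(0)
N = 20
# Fixed random recurrent weights: only the readout is trained, mirroring the
# regime in which BPDC's cheap update does almost all of the work.
W = [[rng.uniform(-1, 1) * 0.3 for _ in range(N)] for _ in range(N)]
w_true = [rng.uniform(-1, 1) for _ in range(N)]   # hidden target readout
w = [0.0] * N                                     # trained readout
eta, eps = 0.5, 1e-8

x = [0.0] * N
errors = []
for t in range(500):
    u = math.sin(0.3 * t)                         # driving input
    x = [math.tanh(sum(W[i][j] * x[j] for j in range(N)) + u) for i in range(N)]
    target = sum(wt * xi for wt, xi in zip(w_true, x))
    e = target - sum(wi * xi for wi, xi in zip(w, x))
    errors.append(abs(e))
    norm2 = sum(xi * xi for xi in x) + eps
    # O(N) per step: one scalar error, state-normalized, applied to N weights.
    w = [wi + eta * e * xi / norm2 for wi, xi in zip(w, x)]
```

The per-step cost is a handful of length-N dot products, and the tracking error shrinks rapidly because the normalization makes each update a damped projection onto the current state direction.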
Architectural Bias in Recurrent Neural Networks – Fractal Analysis
 IEEE Transactions on Neural Networks
, 2004
Abstract

Cited by 28 (7 self)
We have recently shown that when initialized with "small" weights, recurrent neural networks (RNNs) with standard sigmoid-type activation functions are inherently biased towards Markov models, i.e., even prior to any training, RNN dynamics can be readily used to extract finite memory machines (Hammer & Tino, 2002; Tino, Cernansky & Benuskova, 2002; Tino, Cernansky & Benuskova, 2002a). Following Christiansen and Chater (1999), we refer to this phenomenon as the architectural bias of RNNs. In this paper we further extend our work on the architectural bias in RNNs by performing a rigorous fractal analysis of recurrent activation patterns. We assume the network is driven by sequences obtained by traversing an underlying finite-state transition diagram, a scenario that has been frequently considered in the past, e.g., when studying RNN-based learning and implementation of regular grammars and finite-state transducers. We obtain lower and upper bounds on various types of fractal dimensions, such as box-counting and Hausdorff dimensions. It turns out that not only can the recurrent activations inside RNNs with small initial weights be exploited to build Markovian predictive models, but the activations also form fractal clusters whose dimension can be bounded by the scaled entropy of the underlying driving source. The scaling factors are fixed and are given by the RNN parameters.
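Box-counting dimension, the simplest of the fractal dimensions bounded in the paper, is the slope of log N(ε) against log(1/ε), where N(ε) is the number of ε-boxes covering the set. The estimator can be demonstrated exactly on a textbook self-similar set (the middle-third Cantor set stands in here for the paper's clusters of recurrent activations; integer ternary arithmetic keeps the box counts exact):

```python
import math
from itertools import product

# Left endpoints of the depth-m middle-third Cantor construction,
# represented exactly as integers in units of 3**-m.
m = 10
points = [sum(d * 3**(m - 1 - i) for i, d in enumerate(digits))
          for digits in product((0, 2), repeat=m)]

def n_boxes(k: int) -> int:
    """Number of boxes of side 3**-k needed to cover the points."""
    return len({p // 3**(m - k) for p in points})

# Slope of log N(eps) vs log(1/eps) between two scales.
k1, k2 = 2, 8
dim = math.log(n_boxes(k2) / n_boxes(k1)) / math.log(3**(k2 - k1))
```

For the Cantor set N(3**-k) = 2**k exactly, so the estimated slope recovers the theoretical dimension ln 2 / ln 3 ≈ 0.631.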
A general framework for unsupervised processing of structured data
 NEUROCOMPUTING
, 2004
Short-term memory for serial order: A recurrent neural network model
 Psychological Review
, 2006
Abstract

Cited by 26 (3 self)
Despite a century of research, the mechanisms underlying shortterm or working memory for serial order remain uncertain. Recent theoretical models have converged on a particular account, based on transient associations between independent item and context representations. In the present article, the authors present an alternative model, according to which sequence information is encoded through sustained patterns of activation within a recurrent neural network architecture. As demonstrated through a series of computer simulations, the model provides a parsimonious account for numerous benchmark characteristics of immediate serial recall, including data that have been considered to preclude the application of recurrent neural networks in this domain. Unlike most competing accounts, the model deals naturally with findings concerning the role of background knowledge in serial recall and makes contact with relevant neuroscientific data. Furthermore, the model gives rise to numerous testable predictions that differentiate it from competing theories. Taken together, the results presented indicate that recurrent neural networks may offer a useful framework for understanding shortterm memory for serial order.
Accelerated Neural Evolution through Cooperatively Coevolved Synapses
Abstract

Cited by 26 (8 self)
Many complex control problems require sophisticated solutions that are not amenable to traditional controller design. Not only is it difficult to model real world systems, but often it is unclear what kind of behavior is required to solve the task. Reinforcement learning (RL) approaches have made progress by using direct interaction with the task environment, but have so far not scaled well to large state spaces and environments that are not fully observable. In recent years, neuroevolution, the artificial evolution of neural networks, has had remarkable success in tasks that exhibit these two properties. In this paper, we compare a neuroevolution method called Cooperative Synapse Neuroevolution (CoSyNE), that uses cooperative coevolution at the level of individual synaptic weights, to a broad range of reinforcement learning algorithms on very difficult versions of the pole balancing problem that involve large (continuous) state spaces and hidden state. CoSyNE is shown to be significantly more efficient and powerful than the other methods on these tasks.
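The core CoSyNE mechanism, separate subpopulations per synapse, candidate networks assembled by drawing one value from each, and fitness credited back to the participating values, can be sketched on a toy fitness function (an illustrative stand-in for the paper's pole-balancing benchmarks; population sizes and mutation scale are assumptions):

```python
import random

rng = random.Random(0)
N_WEIGHTS, POP, GENS = 3, 20, 150
target = [0.5, -0.3, 0.8]                  # "optimal" weights of the toy task

def fitness(net):                          # higher is better
    return -sum((w - t) ** 2 for w, t in zip(net, target))

# One subpopulation of candidate values per synapse -- the CoSyNE idea.
subpops = [[rng.uniform(-1, 1) for _ in range(POP)] for _ in range(N_WEIGHTS)]

best_net = [sp[0] for sp in subpops]
best_fit = init_fit = fitness(best_net)

for _ in range(GENS):
    # Assemble networks: permute each subpopulation, take the i-th value of each.
    perms = [rng.sample(sp, POP) for sp in subpops]
    scored = []
    for i in range(POP):
        net = [perm[i] for perm in perms]
        f = fitness(net)
        scored.append((f, net))
        if f > best_fit:
            best_fit, best_net = f, net
    # Credit fitness to the values that took part: keep the better half,
    # refill with mutated copies (cooperative coevolution at synapse level).
    scored.sort(reverse=True)
    keep = [net for _, net in scored[: POP // 2]]
    for j in range(N_WEIGHTS):
        survivors = [net[j] for net in keep]
        subpops[j] = survivors + [v + rng.gauss(0, 0.1) for v in survivors]
```

Each synapse value is never evaluated alone; it is judged only by how well the complete networks it joins perform, which is what "cooperative" coevolution means here.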
Abandoning emotion classes – towards continuous emotion recognition with modelling of long-range dependencies
 in Proceedings Interspeech
, 2008
Abstract

Cited by 26 (16 self)
Class-based emotion recognition from speech, as performed in most works up to now, entails many restrictions for practical applications. Human emotion is a continuum, and an automatic emotion recognition system must be able to recognise it as such. We present a novel approach for continuous emotion recognition based on Long Short-Term Memory Recurrent Neural Networks, which include modelling of long-range dependencies between observations and thus outperform techniques like Support Vector Regression. Transferring the innovative concept of additionally modelling emotional history to the classification of discrete levels for the emotional dimensions "valence" and "activation", we also apply Conditional Random Fields, which prevail over the commonly used Support Vector Machines. Experiments conducted on data that was recorded while humans interacted with a Sensitive Artificial Listener prove that for activation the derived classifiers perform as well as human annotators.
Incremental Syntactic Parsing of Natural Language Corpora with Simple Synchrony Networks
 IEEE Transactions on Knowledge and Data Engineering
, 2001
Abstract

Cited by 25 (4 self)
This article explores the use of Simple Synchrony Networks (SSNs) for learning to parse English sentences drawn from a corpus of naturally occurring text. Parsing natural language sentences requires taking a sequence of words and outputting a hierarchical structure representing how those words fit together to form constituents. Feedforward and Simple Recurrent Networks have had great difficulty with this task, in part because the number of relationships required to specify a structure is too large for the number of unit outputs they have available. SSNs have the representational power to output the necessary O(n²) possible structural relationships, because SSNs extend the O(n) incremental outputs of Simple Recurrent Networks with the O(n) entity outputs provided by Temporal Synchrony Variable Binding. This article presents an incremental representation of constituent structures which allows SSNs to make effective use of both these dimensions. Experiments on learning to ...