Results 1  10
of
77
A General Framework for Adaptive Processing of Data Structures
 IEEE TRANSACTIONS ON NEURAL NETWORKS
, 1998
"... A structured organization of information is typically required by symbolic processing. On the other hand, most connectionist models assume that data are organized according to relatively poor structures, like arrays or sequences. The framework described in this paper is an attempt to unify adaptive ..."
Abstract

Cited by 150 (61 self)
 Add to MetaCart
A structured organization of information is typically required by symbolic processing. On the other hand, most connectionist models assume that data are organized according to relatively poor structures, like arrays or sequences. The framework described in this paper is an attempt to unify adaptive models like artificial neural nets and belief nets for the problem of processing structured information. In particular, relations between data variables are expressed by directed acyclic graphs, where both numerical and categorical values coexist. The general framework proposed in this paper can be regarded as an extension of both recurrent neural networks and hidden Markov models to the case of acyclic graphs. In particular we study the supervised learning problem as the problem of learning transductions from an input structured space to an output structured space, where transductions are assumed to admit a recursive hidden statespace representation. We introduce a graphical formalism for r...
An experimental unification of reservoir computing methods
, 2007
"... Three different uses of a recurrent neural network (RNN) as a reservoir that is not trained but instead read out by a simple external classification layer have been described in the literature: Liquid State Machines (LSMs), Echo State Networks (ESNs) and the Backpropagation Decorrelation (BPDC) lea ..."
Abstract

Cited by 70 (10 self)
 Add to MetaCart
Three different uses of a recurrent neural network (RNN) as a reservoir that is not trained but instead read out by a simple external classification layer have been described in the literature: Liquid State Machines (LSMs), Echo State Networks (ESNs) and the Backpropagation Decorrelation (BPDC) learning rule. Individual descriptions of these techniques exist, but a overview is still lacking. Here, we present a series of experimental results that compares all three implementations, and draw conclusions about the relation between a broad range of reservoir parameters and network dynamics, memory, node complexity and performance on a variety of benchmark tests with different characteristics. Next, we introduce a new measure for the reservoir dynamics based on Lyapunov exponents. Unlike previous measures in the literature, this measure is dependent on the dynamics of the reservoir in response to the inputs, and in the cases we tried, it indicates an optimal value for the global scaling of the weight matrix, irrespective of the standard measures. We also describe the Reservoir Computing Toolbox that was used for these experiments, which implements all the types of Reservoir Computing and allows the easy simulation of a wide range of reservoir topologies for a number of benchmarks.
Approximating the Semantics of Logic Programs by Recurrent Neural Networks
"... In [18] we have shown how to construct a 3layered recurrent neural network that computes the fixed point of the meaning function TP of a given propositional logic program P, which corresponds to the computation of the semantics of P. In this article we consider the first order case. We define a no ..."
Abstract

Cited by 61 (10 self)
 Add to MetaCart
In [18] we have shown how to construct a 3layered recurrent neural network that computes the fixed point of the meaning function TP of a given propositional logic program P, which corresponds to the computation of the semantics of P. In this article we consider the first order case. We define a notion of approximation for interpretations and prove that there exists a 3layered feed forward neural network that approximates the calculation of TP for a given first order acyclic logic program P with an injective level mapping arbitrarily well. Extending the feed forward network by recurrent connections we obtain a recurrent neural network whose iteration approximates the fixed point of TP. This result is proven by taking advantage of the fact that for acyclic logic programs the function TP is a contraction mapping on a complete metric space defined by the interpretations of the program. Mapping this space to the metric space IR with Euclidean distance, a real valued function fP can be defined which corresponds to TP and is continuous as well as a contraction. Consequently it can be approximated by an appropriately chosen class of feed forward neural networks.
Natural language grammatical inference with recurrent neural networks
 IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING
, 1998
"... This paper examines the inductive inference of a complex grammar with neural networks  specifically, the task considered is that of training a network to classify natural language sentences as grammatical or ungrammatical, thereby exhibiting the same kind of discriminatory power provided by the P ..."
Abstract

Cited by 58 (1 self)
 Add to MetaCart
This paper examines the inductive inference of a complex grammar with neural networks  specifically, the task considered is that of training a network to classify natural language sentences as grammatical or ungrammatical, thereby exhibiting the same kind of discriminatory power provided by the Principles and Parameters linguistic framework, or GovernmentandBinding theory. Neural networks are trained, without the division into learned vs. innate components assumed by Chomsky, in an attempt to produce the same judgments as native speakers on sharply grammatical/ungrammatical data. How a recurrent neural network could possess linguistic capability and the properties of various common recurrent neural network architectures are discussed. The problem exhibits training behavior which is often not present with smaller grammars and training was initially difficult. However, after implementing several techniques aimed at improving the convergence of the gradient descent backpropagationthroughtime training algorithm, significant learning was possible. It was found that certain architectures are better able to learn an appropriate grammar. The operation of the networks and their training is analyzed. Finally, the extraction of rules in the form of deterministic finite state automata is investigated.
On the Effect of Analog Noise in DiscreteTime Analog Computations
 Neural Computation
, 1997
"... We introduce a model for noiserobust analog computations with discrete time that is flexible enough to cover the most important concrete cases, such as computations in noisy analog neural nets and networks of noisy spiking neurons. We show that the presence of arbitrarily small amounts of analog no ..."
Abstract

Cited by 56 (14 self)
 Add to MetaCart
We introduce a model for noiserobust analog computations with discrete time that is flexible enough to cover the most important concrete cases, such as computations in noisy analog neural nets and networks of noisy spiking neurons. We show that the presence of arbitrarily small amounts of analog noise reduces the power of analog computational models to that of finite automata, and we also prove a new type of upper bound for the VCdimension of computational models with analog noise. 1 Introduction Analog noise is a serious issue in practical analog computation. However there exists no formal model for reliable computations by noisy analog systems which allows us to address this issue in an adequate manner. The investigation of noisetolerant digital computations in the presence of stochastic failures of gates or wires had been initiated by [von Neumann, 1956]. We refer to [Cowan, 1966] and [Pippenger, 1989] for a small sample of the numerous results that have been achieved in this d...
Computational Capabilities of Recurrent NARX Neural Networks
 IEEE Trans. on Systems, Man and Cybernetics
, 1997
"... Abstract—Recently, fully connected recurrent neural networks have been proven to be computationally rich—at least as powerful as Turing machines. This work focuses on another network which is popular in control applications and has been found to be very effective at learning a variety of problems. T ..."
Abstract

Cited by 54 (9 self)
 Add to MetaCart
(Show Context)
Abstract—Recently, fully connected recurrent neural networks have been proven to be computationally rich—at least as powerful as Turing machines. This work focuses on another network which is popular in control applications and has been found to be very effective at learning a variety of problems. These networks are based upon Nonlinear AutoRegressive models with eXogenous Inputs (NARX models), and are therefore called NARX networks. As opposed to other recurrent networks, NARX networks have a limited feedback which comes only from the output neuron rather than from hidden states. They are formalized by y(t) =9(u(t0nu);111;u(t01); u(t);y(t0ny);111;y(t01)) where u(t) and y(t) represent input and output of the network at time t, nu and ny are the input and output order, and the function 9 is the mapping performed by a Multilayer Perceptron. We constructively prove that the NARX networks with a finite number of parameters are computationally as strong as fully connected recurrent networks and thus Turing machines. We conclude that in theory one can use the NARX models, rather than conventional recurrent networks without any computational loss even though their feedback is limited. Furthermore, these results raise the issue of what amount of feedback or recurrence is necessary for any network to be Turing equivalent and what restrictions on feedback limit computational power. I.
Analog Neural Nets with Gaussian or other Common Noise Distributions cannot Recognize Arbitrary Regular Languages
 NEURAL COMPUTATION
, 1998
"... We consider recurrent analog neural nets where the output of each gate is subject to Gaussian noise, or any other common noise distribution that is nonzero on a large set. We show that many regular languages cannot be recognized by networks of this type, and we give a precise characterization of ..."
Abstract

Cited by 52 (5 self)
 Add to MetaCart
We consider recurrent analog neural nets where the output of each gate is subject to Gaussian noise, or any other common noise distribution that is nonzero on a large set. We show that many regular languages cannot be recognized by networks of this type, and we give a precise characterization of those languages which can be recognized. This result implies severe constraints on possibilities for constructing recurrent analog neural nets that are robust against realistic types of analog noise. On the other hand we present a method for constructing feedforward analog neural nets that are robust with regard to analog noise of this type.
Representation of Finite State Automata in Recurrent Radial Basis Function Networks
, 1996
"... to :hs paper we propose some techniques ft>r injccling linite Stale automata rate l.ec:rr,zn Radial Basis Functlt>n networks (R2BF). When providing proper hints and constraining the v,oght space prlpe'ly. we show that thc,e nelworks behave as automata. A teebraque is snggcsted /"t eb ..."
Abstract

Cited by 42 (6 self)
 Add to MetaCart
to :hs paper we propose some techniques ft>r injccling linite Stale automata rate l.ec:rr,zn Radial Basis Functlt>n networks (R2BF). When providing proper hints and constraining the v,oght space prlpe'ly. we show that thc,e nelworks behave as automata. A teebraque is snggcsted /"t ebrorag the lemmng process re develop aulomata representationq that is based on adding a pro)per penalty tunelton to the mdinary cost. Successful experinental results are shown for tuducttvc mcrenc.' 1 regular gramrnar Keywords: Attemala, backpropagation t[rough trine, high(rder neural networks, induclix. c reference. learning item hints. radial basis ftlnctions, rectarent radial basra tnnclmns. recurrent netw(>rks 1. introduction The ability (>f learning fiom examples is certainly lhe most appealing l'eature c)f neu ral networks. In the last lw years, several researchers have used conncctontst models for solving different kinds ol probfoms ranging from robot control to pattern recogmtioa Coping wilh optimization of [unctions with several thousands of x, ariablcs s quite common Surprisingly, in many practical cases, global or near global r)ptimization is attained also wth non sophistteated numertcal methods. For example, successlul applications of neural nets fi)r recognition of handwritten characters (le Cun, 189) md for phoncmc discrimination (Waibcl c al., 1989) ave bccn proposed which d() n<,t report serious convergence problems Some attempts to understand the theoretical reasons )r lhc successes and atlures of supervised }earrang schemes have been carried oat which explain when such schemes are likely to succeed in discovering oplmal solutions (Bmnchini cl al.. 1994; Gori & Tesi, 1992; Yu, 192), and to gencrali7c to new examples (Baum & Haussler. 1989L These results give st>me ...
On the Complexity of Computing and Learning with Multiplicative Neural Networks
 NEURAL COMPUTATION
"... In a great variety of neuron models neural inputs are combined using the summing operation. We introduce the concept of multiplicative neural networks that contain units which multiply their inputs instead of summing them and, thus, allow inputs to interact nonlinearly. The class of multiplicative n ..."
Abstract

Cited by 38 (3 self)
 Add to MetaCart
In a great variety of neuron models neural inputs are combined using the summing operation. We introduce the concept of multiplicative neural networks that contain units which multiply their inputs instead of summing them and, thus, allow inputs to interact nonlinearly. The class of multiplicative neural networks comprises such widely known and well studied network types as higherorder networks and product unit networks. We investigate the complexity of computing and learning for multiplicative neural networks. In particular, we derive upper and lower bounds on the VapnikChervonenkis (VC) dimension and the pseudo dimension for various types of networks with multiplicative units. As the most general case, we consider feedforward networks consisting of product and sigmoidal units, showing that their pseudo dimension is bounded from above by a polynomial with the same order of magnitude as the currently best known bound for purely sigmoidal networks. Moreover, we show that this bound holds even in the case when the unit type, product or sigmoidal, may be learned. Crucial for these results are calculations of solution set components bounds for new network classes. As to lower bounds we construct product unit networks of fixed depth with superlinear VC dimension. For sigmoidal networks of higher order we establish polynomial bounds that, in contrast to previous results, do not involve any restriction of the network order. We further consider various classes of higherorder units, also known as sigmapi units, that are characterized by connectivity constraints. In terms of these we derive some asymptotically tight bounds.
Rule Extraction from Recurrent Neural Networks: a Taxonomy and Review
 Neural Computation
, 2005
"... this paper, the progress of this development is reviewed and analysed in detail. In order to structure the survey and to evaluate the techniques, a taxonomy, specifically designed for this purpose, has been developed. Moreover, important open research issues are identified, that, if addressed pr ..."
Abstract

Cited by 37 (5 self)
 Add to MetaCart
this paper, the progress of this development is reviewed and analysed in detail. In order to structure the survey and to evaluate the techniques, a taxonomy, specifically designed for this purpose, has been developed. Moreover, important open research issues are identified, that, if addressed properly, possibly can give the field a significant push forward