Results 1–10 of 178
On The Problem Of Local Minima In Backpropagation
IEEE Transactions on Pattern Analysis and Machine Intelligence, 1992
Cited by 95 (18 self)
Supervised Learning in Multi-Layered Neural Networks (MLNs) has recently been proposed through the well-known Backpropagation algorithm. This is a gradient method which can get stuck in local minima, as simple examples can show. In this paper, some conditions on the network architecture and the learning environment are proposed which ensure the convergence of the Backpropagation algorithm. It is proven in particular that the convergence holds if the classes are linearly separable. In this case, the experience gained in several experiments shows that MLNs exceed perceptrons in generalization to new examples. Index Terms: Multi-Layered Networks, learning environment, Backpropagation, pattern recognition, linearly separable classes. I. Introduction. Supervised learning in Multi-Layered Networks can be accomplished thanks to Backpropagation (BP) ([19, 25, 31]). Its application to several different subjects [25], and, particularly, to pattern recognition ([3, 6, 8, 20, 27, 29]), has bee...
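The gradient method the abstract describes can be sketched minimally. The following one-hidden-layer network, trained by plain backpropagation on a linearly separable problem (logical AND), is an illustrative toy, not the paper's exact setup; layer sizes, learning rate, and iteration count are assumptions.

```python
import numpy as np

# Toy backpropagation on a linearly separable problem (logical AND).
# Architecture, learning rate, and iteration count are illustrative.
rng = np.random.default_rng(0)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [0], [0], [1]], dtype=float)  # AND is linearly separable

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

W1 = rng.normal(scale=0.5, size=(2, 3)); b1 = np.zeros(3)
W2 = rng.normal(scale=0.5, size=(3, 1)); b2 = np.zeros(1)
lr = 2.0

for _ in range(5000):
    h = sigmoid(X @ W1 + b1)              # forward pass
    out = sigmoid(h @ W2 + b2)
    d_out = (out - y) * out * (1 - out)   # gradient of squared error
    d_h = (d_out @ W2.T) * h * (1 - h)    # backpropagated to the hidden layer
    W2 -= lr * h.T @ d_out; b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h;   b1 -= lr * d_h.sum(axis=0)

pred = (out > 0.5).astype(float)
```

On a non-separable task the same loop can stall in a local minimum, which is exactly the failure mode the paper's conditions rule out.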
Designing a Neural Network for Forecasting Financial and Economic Time Series
1996
Cited by 76 (0 self)
Artificial neural networks are universal and highly flexible function approximators first used in the fields of cognitive science and engineering. In recent years, neural network applications in finance for such tasks as pattern recognition, classification, and time series forecasting have dramatically increased. However, the large number of parameters that must be selected to develop a neural network forecasting model has meant that the design process still involves much trial and error. The objective of this paper is to provide a practical introductory guide to the design of a neural network for forecasting economic time series data. An eight-step procedure to design a neural network forecasting model is explained, including a discussion of tradeoffs in parameter selection, some common pitfalls, and points of disagreement among practitioners.
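One of the design steps such a guide covers is choosing the number of lagged inputs, i.e. framing the series as a supervised learning problem. A minimal sketch; the window size and toy series are illustrative assumptions, not from the paper:

```python
# Frame a univariate time series as a supervised learning problem:
# each input is the n_lags previous values, the target is the next value.
def make_lagged_dataset(series, n_lags):
    X, y = [], []
    for t in range(n_lags, len(series)):
        X.append(series[t - n_lags:t])
        y.append(series[t])
    return X, y

series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]  # toy data
X, y = make_lagged_dataset(series, n_lags=2)
# X[0] is [1.0, 2.0] and predicts y[0] == 3.0
```

The choice of `n_lags` is precisely the kind of parameter the paper notes is usually settled by trial and error.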
Sastry, "Varieties of Learning Automata: An Overview"
IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART B: CYBERNETICS, 2002
Cited by 75 (0 self)
Abstract—Automata models of learning systems introduced in the 1960s were popularized as learning automata (LA) in a survey paper in 1974 [1]. Since then, there have been many fundamental advances in the theory as well as applications of these learning models. In the past few years, the structure of LA has been modified in several directions to suit different applications. Concepts such as parameterized learning automata (PLA), generalized learning automata (GLA), and continuous action-set learning automata (CALA) have been proposed, analyzed, and applied to solve many significant learning problems. Furthermore, groups of LA forming teams and feedforward networks have been shown to converge to desired solutions under appropriate learning algorithms. Modules of LA have been used for parallel operation with consequent increase in speed of convergence. All of these concepts and results are relatively new and are scattered in technical literature. An attempt has been made in this paper to bring together the main ideas involved in a unified framework and provide pointers to relevant references. Index Terms—Continuous action-set learning automata (CALA), generalized learning automata (GLA), modules of learning automata, parameterized learning automata (PLA), teams and networks of learning automata. I.
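As a hedged sketch of one of the basic models such surveys cover, the classical linear reward-inaction (L_R-I) update for a finite-action automaton can be written as follows; the two-action environment, step size, and iteration count are illustrative assumptions.

```python
import random

random.seed(1)
probs = [0.5, 0.5]     # action probabilities maintained by the automaton
success = [0.2, 0.8]   # environment's reward probabilities (unknown to it)
step = 0.02

for _ in range(10000):
    a = 0 if random.random() < probs[0] else 1
    if random.random() < success[a]:  # reward-inaction: update on reward only
        for i in range(len(probs)):
            if i == a:
                probs[i] += step * (1.0 - probs[i])
            else:
                probs[i] *= 1.0 - step
# probs drifts toward the action with the higher reward probability
```

The update preserves the probability simplex exactly, and with a small step size the automaton converges to the better action with probability close to one.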
Approximation theory of the MLP model in neural networks
ACTA NUMERICA, 1999
Cited by 59 (3 self)
In this survey we discuss various approximation-theoretic problems that arise in the multilayer feedforward perceptron (MLP) model in neural networks. Mathematically it is one of the simpler models. Nonetheless the mathematics of this model is not well understood, and many of these problems are approximation-theoretic in character. Most of the research we will discuss is of very recent vintage. We will report on what has been done and on various unanswered questions. We will not be presenting practical (algorithmic) methods. We will, however, be exploring the capabilities and limitations of this model. In the first ...
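A tiny illustration of the kind of question studied here: a single steep sigmoid unit, the building block of the MLP model, already approximates a step function pointwise. The slope and threshold below are illustrative assumptions.

```python
import math

def one_unit(x, slope=50.0, threshold=0.5):
    # a single sigmoidal ridge: sigma(slope * (x - threshold))
    return 1.0 / (1.0 + math.exp(-slope * (x - threshold)))

left = one_unit(0.2)    # far below the threshold: close to 0
right = one_unit(0.8)   # far above the threshold: close to 1
```

Sums of such ridges are the one-hidden-layer networks whose density and degree-of-approximation properties the survey analyzes.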
Continuous Speech Recognition Using Hidden Markov Models
IEEE ASSP MAGAZINE, 1990
Cited by 54 (9 self)
Stochastic signal processing techniques have profoundly changed our perspective on speech processing. We have witnessed a progression from heuristic algorithms to detailed statistical approaches based on iterative analysis techniques. Markov modeling provides a mathematically rigorous approach to developing robust statistical signal models. Since the introduction of Markov models to speech processing in the middle 1970s, continuous speech recognition technology has come of age. Dramatic advances have been made in characterizing the temporal and spectral evolution of the speech signal. At the same time, our appreciation of the need to explain complex acoustic manifestations by integration of application constraints into low-level signal processing has grown. In this paper, we review the use of Markov models in continuous speech recognition. Markov models are presented as a generalization of their predecessor technology, Dynamic Programming. A unified view is offered in which both linguistic decoding and acoustic matching are integrated into a single optimal network search framework.
Are artificial neural networks black boxes?
IEEE Trans. Neural Networks, 1997
Cited by 50 (5 self)
Abstract — Artificial neural networks are efficient computing models which have shown their strengths in solving hard problems in artificial intelligence. They have also been shown to be universal approximators. Notwithstanding, one of the major criticisms is their being black boxes, since no satisfactory explanation of their behavior has been offered. In this paper, we provide such an interpretation of neural networks so that they will no longer be seen as black boxes. This is stated after establishing the equality between a certain class of neural nets and fuzzy rule-based systems. This interpretation is built with fuzzy rules using a new fuzzy logic operator which is defined after introducing the concept of f-duality. In addition, this interpretation offers an automated knowledge acquisition procedure. Index Terms — Equality between neural nets and fuzzy rule-based systems, f-duality, fuzzy additive systems, interpretation of neural nets, i-or operator. I.
GAL: Networks that grow when they learn and shrink when they forget
INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 1991
Cited by 29 (5 self)
Learning when limited to modification of some parameters has a limited scope; the capability to modify the system structure is also needed to get a wider range of the learnable. In the case of artificial neural networks, learning by iterative adjustment of synaptic weights can only succeed if the network designer predefines an appropriate network structure, i.e., number of hidden layers, units, and the size and shape of their receptive and projective fields. This paper advocates the view that the network structure should not, as usually done, be determined by trial and error but should be computed by the learning algorithm. Incremental learning algorithms can modify the network structure by addition and/or removal of units and/or links. A survey of current connectionist literature is given on this line of thought. "Grow and Learn" (GAL) is a new algorithm that learns an association in one shot because it is incremental and uses a local representation. During the so-called...
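The grow-on-error idea can be sketched with an incremental nearest-prototype classifier that adds a unit only when the current set misclassifies an example. This is a simplification for illustration, not the exact GAL algorithm; the toy samples are assumptions.

```python
# Incremental nearest-prototype learning: grow a unit only on error.
def classify(prototypes, x):
    """Label of the nearest stored prototype, or None if the set is empty."""
    if not prototypes:
        return None
    nearest = min(prototypes,
                  key=lambda p: sum((a - b) ** 2 for a, b in zip(p[0], x)))
    return nearest[1]

def train_incremental(samples):
    prototypes = []
    for x, label in samples:
        if classify(prototypes, x) != label:  # grow when the network errs
            prototypes.append((x, label))
    return prototypes

samples = [((0.0, 0.0), "A"), ((0.1, 0.0), "A"),
           ((1.0, 1.0), "B"), ((0.9, 1.0), "B")]
protos = train_incremental(samples)  # only 2 of 4 samples become units
```

GAL additionally prunes ("forgets") units that later prove redundant; the sketch above shows only the growing half.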
Degraded Text Recognition Using Visual And Linguistic Context
1995
Cited by 26 (2 self)
Recognition of degraded text is a challenging problem. To improve the performance of an OCR system on degraded images of text, postprocessing techniques are critical. The objective of postprocessing is to correct errors or to resolve ambiguities in OCR results by using contextual information. Depending on the extent of context used, there are different levels of postprocessing. In current commercial OCR systems, word-level postprocessing methods, such as dictionary lookup, have been applied successfully. However, many OCR errors cannot be corrected by word-level postprocessing. To overcome this limitation, passage-level postprocessing, in which global contextual information is utilized, is necessary. In most current studies on passage-level postprocessing, linguistic context is the major resource to be exploited. This thesis addresses problems in degraded text recognition and discusses potential solutions through passage-level postprocessing. The objective is to develop a postprocessin...
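Word-level postprocessing by dictionary lookup can be sketched as nearest-dictionary-word search under edit distance; the tiny dictionary and the OCR confusion below are illustrative assumptions.

```python
# Word-level OCR postprocessing: replace an output word with the
# nearest dictionary word under Levenshtein (edit) distance.
def edit_distance(a, b):
    """Classic dynamic-programming Levenshtein distance."""
    dp = [[i + j if i * j == 0 else 0 for j in range(len(b) + 1)]
          for i in range(len(a) + 1)]
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            dp[i][j] = min(dp[i - 1][j] + 1,          # deletion
                           dp[i][j - 1] + 1,          # insertion
                           dp[i - 1][j - 1] + (a[i - 1] != b[j - 1]))  # substitution
    return dp[len(a)][len(b)]

def correct(word, dictionary):
    """Nearest dictionary word; ties broken by dictionary order."""
    return min(dictionary, key=lambda w: edit_distance(word, w))

dictionary = ["recognition", "regional", "degraded", "context"]
fixed = correct("reeognition", dictionary)  # a 'c' -> 'e' OCR confusion
```

This is exactly the level of correction the thesis says is insufficient on its own, motivating passage-level use of linguistic context.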
Seismic discrimination with artificial neural networks: Preliminary results with regional spectral data
Bulletin of the Seismological Society of America, 1990
Cited by 24 (1 self)
An application of artificial neural networks (ANN) for discrimination between natural earthquakes and underground nuclear explosions has been studied using distance-corrected spectral data of regional seismic phases. Pn, Pg, and Lg spectra have been analyzed from 83 western U.S. earthquakes and 87 Nevada Test Site explosions recorded at the four broadband seismic stations operated by Lawrence Livermore National Laboratory. Distance corrections are applied to the raw spectra using existing frequency-dependent Q models for the Basin and Range. The spectra are sampled logarithmically at 41 points between 0.1 and 10 Hz for each phase and checked for adequate signal-to-noise ratios (S/N > 2). The ANN was implemented on a SUN 4/110 workstation using a backpropagation feedforward architecture. We find that, using even simple ANN architectures (82 input units, 1 hidden unit, and 2 output units), powerful discrimination systems can be designed. In order to regionalize the data characteristics, a separate neural network was assigned to each station. For this data set, the rate of correct recognition for untrained data is over 93 per cent for both earthquakes and explosions at any single station. Using a majority voting scheme with a network of four stations, the rate of correct recognition is over 97 per cent. Although the performance of the ANN is similar to that of the Fisher linear discriminant, the ANN exhibits a number of computational advantages over the conventional method. Finally, examination of the network weights suggests that, in addition to spectral shape, a criterion that the ANN utilized to discriminate between the two populations was the Lg/Pg spectral amplitude ratios.
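The majority-voting scheme over the four stations' network outputs amounts to taking the most common per-station label; a minimal sketch, with illustrative votes.

```python
from collections import Counter

def majority_vote(votes):
    """Most common label among the per-station decisions."""
    return Counter(votes).most_common(1)[0][0]

# four stations, one disagreeing (e.g. poor signal-to-noise at that site):
label = majority_vote(["earthquake", "earthquake", "explosion", "earthquake"])
```

Combining independent per-station classifiers this way is what lifts the reported recognition rate from 93 to over 97 per cent.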
Feedforward Neural Networks for Nonparametric Regression
1998
Cited by 23 (0 self)
Feedforward neural networks (FFNN) with an unconstrained random number of hidden neurons define flexible nonparametric regression models. In Müller and Rios Insua (1998) we argued that variable-architecture models with a random-size hidden layer significantly reduce the posterior multimodality typical of posterior distributions in neural network models. In this chapter we review the model proposed in Müller and Rios Insua (1998) and extend it to a nonparametric model by allowing unconstrained size of the hidden layer. This is made possible by introducing a Markov chain Monte Carlo posterior simulation scheme using reversible jump (Green 1995) steps to move between architectures of different sizes.
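In the special case where a proposed new hidden unit's parameters are drawn from their prior, the reversible-jump acceptance probability for a "birth" move reduces to a likelihood ratio times the ratio of move probabilities (prior and proposal densities cancel and the dimension-matching Jacobian is one). A hedged sketch of just that acceptance rule with toy numbers, not the chapter's full sampler:

```python
import math

def birth_acceptance(loglik_new, loglik_old, p_death, p_birth):
    # min(1, likelihood ratio * P(death move) / P(birth move)),
    # valid when the new unit's parameters are proposed from their prior
    return min(1.0, math.exp(loglik_new - loglik_old) * p_death / p_birth)

# a proposed unit that raises the log-likelihood is (almost) always accepted:
accept = birth_acceptance(-10.0, -10.5, p_death=0.5, p_birth=0.5)
```

The matching "death" move uses the reciprocal ratio, so the pair of moves leaves the posterior over architectures invariant.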