
## Evolving Optimal Neural Networks Using Genetic Algorithms with Occam's Razor, by Byoung-Tak Zhang

### Citations

10045 |
Genetic Algorithms
- Goldberg
- 1989
Citation Context ...on problems, such as finding a network architecture appropriate for the application at hand, and finding an optimal set of weight values for the network to solve the problem. Genetic algorithms (GAs) [8, 5, 20] have been used to solve each of these optimization problems [36]. In weight optimization, the set of weights is represented as a chromosome, and a genetic search is applied to ... Email: zhang@gmd.de ... |

3669 |
Genetic programming: on the programming of computers by means of natural selection
- Koza
- 1992
Citation Context ...ctly specified by a graph-generation grammar that is evolved by GAs. All of these methods use the backpropagation algorithm [29], a gradient-descent method, to train the weights of the network. Koza [12] provides an alternative approach to the representation of neural networks, under the framework of genetic programming (GP), which enables modification of not only the weights of a neural network, b... |

3623 |
Learning internal representations by error propagation
- Rumelhart, Hinton, et al.
- 1986
Citation Context ... have suggested encoding schemes in which a network configuration is indirectly specified by a graph-generation grammar that is evolved by GAs. All of these methods use the backpropagation algorithm [29], a gradient-descent method, to train the weights of the network. Koza [12] provides an alternative approach to the representation of neural networks, under the framework of genetic programming (GP),... |

1128 |
A logical calculus of the ideas immanent in nervous activity
- McCulloch, Pitts
- 1943
Citation Context ...ts. Figure 1 compares a typical multilayer perceptron to a more general architecture as adopted in this work. There are many types of neural units; we confine ourselves to McCulloch-Pitts neurons [14], although the method we describe can be extended easily to other types of neurons. The McCulloch-Pitts neuron is a binary device. Each neuron has a threshold. The neuron can receive inputs from exc... |
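The McCulloch-Pitts unit described in the context above is easy to sketch in code: a binary device that fires when its weighted input sum reaches a threshold. The weight convention (+1 excitatory, -1 inhibitory) and the gate example below are illustrative assumptions, not details taken from the paper.

```python
def mcp_neuron(inputs, weights, threshold):
    """McCulloch-Pitts unit: fire (1) iff the weighted input sum
    reaches the threshold, otherwise stay silent (0)."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= threshold else 0

# AND gate: both excitatory inputs must be active to reach threshold 2.
print(mcp_neuron([1, 1], [1, 1], 2))  # -> 1
print(mcp_neuron([1, 0], [1, 1], 2))  # -> 0
```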

814 |
Networks for approximation and learning
- Poggio, Girosi
- 1990
Citation Context ...etermining the speed and accuracy of learning. In addition, large weights generally should be penalized, in the hope of achieving a smoother or simpler mapping; this technique is called regularization [26, 13]. We define the complexity C of a network as C(W | A) = Σ_{k=1}^{K} w_k², where K is the number of free parameters. Note that K can be arbitrarily large, because we fit the architectures as well. In the... |
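The complexity measure quoted in that context, a sum of squared free parameters, can be computed directly; the weight values below are made up for illustration.

```python
def network_complexity(weights):
    """C(W | A) = sum over the K free parameters of w_k squared.
    Large weights are penalized quadratically, favoring smoother mappings."""
    return sum(w * w for w in weights)

print(network_complexity([0.5, -1.0, 2.0]))  # -> 5.25
```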

810 |
Evolutionsstrategie: Optimierung technischer Systeme nach Prinzipien der biologischen Evolution. Frommann-Holzboog
- Rechenberg
- 1973
Citation Context ...le genetic algorithms typically model a natural evolution, the BGA models a rational selection performed by human breeders. The BGA can be considered as a recombination of evolution strategies [27, 30] and GAs [8, 5]. The BGA uses truncation selection as performed by breeders. This selection scheme is similar to the (μ, λ)-strategy in [30]. The search process of the BGA is mainly driven by rec... |
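Truncation selection as used by the BGA, per the context above, keeps only the best fraction of the population as the breeding pool. The truncation rate and the toy fitness in this sketch are illustrative assumptions, not the paper's settings.

```python
def truncation_selection(population, fitness, rate=0.3):
    """Keep the best `rate` fraction of the population as parents,
    as a breeder would: rank by fitness, truncate the rest."""
    ranked = sorted(population, key=fitness, reverse=True)
    keep = max(1, int(len(ranked) * rate))
    return ranked[:keep]

# Toy fitness: larger numbers are fitter; top 30% of 0..9 survive.
print(truncation_selection(list(range(10)), fitness=lambda x: x))  # -> [9, 8, 7]
```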

669 |
Perceptrons: An Introduction to Computational Geometry
- Minsky, Papert
- 1969
Citation Context ...followed by an analysis of fitness landscapes in section 6, and discussion in section 7. 2. Representing neural networks as trees. Multilayer feedforward neural networks (or multilayer perceptrons) [28, 16, 29] are networks of simple processing elements, called neurons or units, organized in layers. The external inputs are presented to the input layer and are fed forward via one or more layers of hidden un... |

609 |
Numerical Optimization of Computer Models
- Schwefel
- 1981
Citation Context ...le genetic algorithms typically model a natural evolution, the BGA models a rational selection performed by human breeders. The BGA can be considered as a recombination of evolution strategies [27, 30] and GAs [8, 5]. The BGA uses truncation selection as performed by breeders. This selection scheme is similar to the (μ, λ)-strategy in [30]. The search process of the BGA is mainly driven by rec... |

408 |
An overview of evolutionary algorithms for parameter optimization
- Bäck, Schwefel
- 1993
Citation Context ...terms of both architecture and weight values). Most existing search methods, including iterated hillclimbing methods [4, 18, 31], simulated annealing [10], backpropagation [29], and even other GAs [2], work on a search space of fixed size, while our search space is of variable size. This difference of ability, combined with the difference in parameters used in each algorithm, makes the comparison ... |

353 |
Designing neural networks using genetic algorithms with graph generation system
- Kitano
- 1990
Citation Context ...schemes in which the anatomical properties of the network structure are encoded as bit-strings. A similar representation has been used by Whitley et al. [36] to prune unnecessary connections. Kitano [11] and Gruau [6] have suggested encoding schemes in which a network configuration is indirectly specified by a graph-generation grammar that is evolved by GAs. All of these methods use the backpropagat... |

274 |
Designing neural networks using genetic algorithms
- Miller, Todd, et al.
- 1989
Citation Context ...y Mühlenbein and Kindermann [24]. Recent works, however, have used GAs separately in each optimization problem, primarily focusing on optimizing network topology. Harp et al. [7] and Miller et al. [15] have described representation schemes in which the anatomical properties of the network structure are encoded as bit-strings. A similar representation has been used by Whitley et al. [36] to prune u... |

271 |
Towards a general theory of adaptive walks on rugged landscapes
- Kauffman, Levin
- 1987
Citation Context ... to speed up the genetic search, an investigation of its fitness landscapes is necessary. 6. Analysis of fitness landscapes. Fitness landscapes have been analyzed for Boolean N-K networks by Kauffman [3], for random traveling salesman problems (TSPs) by Kirkpatrick et al. [10], and for Euclidean TSPs by Mühlenbein [21]. The general characterization of a fitness landscape is very difficult. The num... |

246 |
Training feedforward neural networks using genetic algorithms
- Montana, Davis
- 1989
Citation Context ...Mühlenbein ... the encoded representation to find a set of weights that best fits the training data. Some encouraging results have been reported that are comparable with conventional learning algorithms [17]. In architecture optimization, the topology of the network is encoded as a chromosome, and genetic operators are applied to find an architecture that best fits the specified task (according to ... |
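Weight optimization by genetic search, as summarized in that context, treats the weight vector itself as the chromosome and ranks candidates by training error. The sketch below is a minimal, mutation-only variant under made-up settings (population size, mutation scale, and the toy error function are assumptions, not the operators used in the paper).

```python
import random

def evolve_weights(error_fn, n_weights, pop_size=20, generations=60,
                   sigma=0.1, seed=0):
    """Evolve a real-valued weight chromosome that minimizes error_fn."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-1.0, 1.0) for _ in range(n_weights)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=error_fn)                 # rank by training error
        parents = pop[: pop_size // 2]         # truncation selection
        children = [[w + rng.gauss(0.0, sigma) for w in rng.choice(parents)]
                    for _ in range(pop_size - len(parents))]
        pop = parents + children               # elitist replacement
    return min(pop, key=error_fn)

# Toy task: recover the target weight vector [0.5, -0.5].
best = evolve_weights(lambda w: (w[0] - 0.5) ** 2 + (w[1] + 0.5) ** 2, 2)
```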

209 |
The breakout method for escaping from local minima
- Morris
- 1993
Citation Context ...00) = 0.0024. No general learning method is yet known to find such a solution (in terms of both architecture and weight values). Most existing search methods, including iterated hillclimbing methods [4, 18, 31], simulated annealing [10], backpropagation [29], and even other GAs [2], work on a search space of fixed size, while our search space is of variable size. This difference of ability, combined with ... |

175 | Bayesian Methods for Adaptive Models
- MacKay
- 1992
Citation Context ...d Cj is completely replaced by another syntactically correct subtree. 4. Fitness function with Occam's razor. Occam's razor states that simpler models should be preferred to unnecessarily complex ones [13, 33]. This section complies with Occam's razor by giving a quantitative method for using GAs to construct neural networks of minimal complexity. In defining criteria for minimality, it is important that t... |
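The Occam's-razor criterion described in that context can be phrased as a fitness that trades data misfit against model complexity. The additive form and the penalty constant alpha below are illustrative choices, not the paper's exact weighting scheme.

```python
def occam_fitness(error, complexity, alpha=0.01):
    """Score to minimize: training error plus a complexity penalty.
    alpha (assumed here) controls how strongly simplicity is preferred."""
    return error + alpha * complexity

# A slightly worse-fitting but much smaller network can win overall:
small = occam_fitness(error=0.10, complexity=5)    # ~0.15
large = occam_fitness(error=0.08, complexity=20)   # ~0.28
print(small < large)  # -> True
```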

148 |
Genetic algorithms and neural networks: optimizing connections and connectivity
- Whitley, Starkweather, et al.
- 1990
Citation Context ... application at hand, and finding an optimal set of weight values for the network to solve the problem. Genetic algorithms (GAs) [8, 5, 20] have been used to solve each of these optimization problems [36]. In weight optimization, the set of weights is represented as a chromosome, and a genetic search is applied to ... Email: zhang@gmd.de; Email: muehlen@gmd.de ... B. T. Zhang and H. Mühlenbein ... the e... |

135 | Towards an understanding of hill-climbing procedures for SAT
- Gent, Walsh
Citation Context ...00) = 0.0024. No general learning method is yet known to find such a solution (in terms of both architecture and weight values). Most existing search methods, including iterated hillclimbing methods [4, 18, 31], simulated annealing [10], backpropagation [29], and even other GAs [2], work on a search space of fixed size, while our search space is of variable size. This difference of ability, combined with ... |

90 |
Polynomial theory of complex systems
- Ivakhnenko
- 1971
Citation Context ...of the group method of data handling (GMDH) in which additional terms are incrementally added to the existing polynomial approximator to achieve a minimal description length model of a complex system [9, 34]. The performance of the BGP method on noisy data was tested using the majority problem with nine inputs. In each run, we used a training set of 256 examples with 5% noise (which means that, on avera... |

60 |
Genetic Synthesis of Boolean Neural Networks with a Cell Rewriting Development
- Gruau
- 1992
Citation Context ...h the anatomical properties of the network structure are encoded as bit-strings. A similar representation has been used by Whitley et al. [36] to prune unnecessary connections. Kitano [11] and Gruau [6] have suggested encoding schemes in which a network configuration is indirectly specified by a graph-generation grammar that is evolved by GAs. All of these methods use the backpropagation algorithm ... |

54 |
The Vapnik-Chervonenkis Dimension: Information Versus Complexity
- Abu-Mostafa
- 1989
Citation Context ...usually leads to overfitting of the training data. On the other hand, a small network will achieve a good generalization if it converges, but it needs, in general, a large amount of training time [1, 32]. Therefore, the size of a network should be as small as possible, yet sufficiently large to ensure an accurate fitting of the training set. A general method for evolving genetic neural networks was ... |

44 |
Consistent inference of probabilities in layered networks: prediction and generalization
- Tishby, Levin, et al.
- 1989
Citation Context ...P(y | x, W, A) = exp(−βE(y | x, W, A)) / Z(β), (10) where β is a positive constant which determines the sensitivity of the probability to the error value, and Z(β) = ∫ exp(−βE(y | x, W, A)) dy (11) is a normalizing constant (see [35]). Under the assumption of the Gaussian error model (i.e., if the true output is expected to include additive Gaussian noise with standard deviation σ), we have P(y | x, W, A) = (1/Z) exp(−E(y | x, W, A) / (2σ²))... |

32 | New Solutions to the Mapping Problem of Parallel Systems: The Evolution Approach, Parallel Computing 4 - Mühlenbein, Gorges-Schleuter - 1987 |

29 | Genetic programming of minimal neural nets using Occam's razor - Zhang, Mühlenbein - 1993 |

26 |
Self-organizing network for optimum supervised learning
- Tenorio, Lee
- 1990
Citation Context ...of the group method of data handling (GMDH) in which additional terms are incrementally added to the existing polynomial approximator to achieve a minimal description length model of a complex system [9, 34]. The performance of the BGP method on noisy data was tested using the majority problem with nine inputs. In each run, we used a training set of 256 examples with 5% noise (which means that, on avera... |

24 | Neural Networks that Teach Themselves through Genetic Discovery of Novel Examples - Zhang, Veenker - 1991 |

13 | Focused incremental learning for improved generalization with reduced training sets - Zhang, Veenker - 1991 |

8 | A Quantitative Occam's Razor
- Sorkin
- 1983
Citation Context ...d Cj is completely replaced by another syntactically correct subtree. 4. Fitness function with Occam's razor. Occam's razor states that simpler models should be preferred to unnecessarily complex ones [13, 33]. This section complies with Occam's razor by giving a quantitative method for using GAs to construct neural networks of minimal complexity. In defining criteria for minimality, it is important that t... |

4 |
Optimization by Simulated Annealing
- Kirkpatrick, Gelatt, et al.
- 1983
Citation Context ...ethod is yet known to find such a solution (in terms of both architecture and weight values). Most existing search methods, including iterated hillclimbing methods [4, 18, 31], simulated annealing [10], backpropagation [29], and even other GAs [2], work on a search space of fixed size, while our search space is of variable size. This difference of ability, combined with the difference in parameters... |

3 |
The dynamics of evolution and learning: Towards genetic neural networks
- Mühlenbein, Kindermann
- 1989
Citation Context ... be as small as possible, yet sufficiently large to ensure an accurate fitting of the training set. A general method for evolving genetic neural networks was suggested by Mühlenbein and Kindermann [24]. Recent works, however, have used GAs separately in each optimization problem, primarily focusing on optimizing network topology. Harp et al. [7] and Miller et al. [15] have described representation... |

2 |
Evolution in time and space: the parallel genetic algorithm
- Mühlenbein
- 1991
Citation Context ...on problems, such as finding a network architecture appropriate for the application at hand, and finding an optimal set of weight values for the network to solve the problem. Genetic algorithms (GAs) [8, 5, 20] have been used to solve each of these optimization problems [36]. In weight optimization, the set of weights is represented as a chromosome, and a genetic search is applied to ... Email: zhang@gmd.de ... |

2 |
Parallel Genetic Algorithms in Combinatorial Optimization
- Mühlenbein
Citation Context ...scapes Fitness landscapes have been analyzed for Boolean N-K networks by Kauffman [3], for random traveling salesman problems (TSPs) by Kirkpatrick et al. [10], and for Euclidean TSPs by Mühlenbein [21]. The general characterization of a fitness landscape is very difficult. The number of local optima, their distribution, and the basins of attraction are some of the important parameters which desc... |

2 | Accelerated learning by active example selection," forthcoming - Zhang - 1994 |

1 |
Towards the Genetic Synthesis of Neural Networks
- Harp, Samad, Guha
- 1989
Citation Context ...tworks was suggested by Mühlenbein and Kindermann [24]. Recent works, however, have used GAs separately in each optimization problem, primarily focusing on optimizing network topology. Harp et al. [7] and Miller et al. [15] have described representation schemes in which the anatomical properties of the network structure are encoded as bit-strings. A similar representation has been used by Whitley... |

1 | Darwin's Continental Cycle and Its Simulation by the Prisoner's Dilemma - Mühlenbein - 1991 |

1 |
Evolutionary Algorithms: Theory and Applications," in Local Search in Combinatorial Optimization, edited by
- Mühlenbein
- 1993
Citation Context ...nary functions of size n, there exists no simple optimization method that performs better than any other. To be effective, every sophisticated optimization method must be tuned to the application [22]. In order to assess the complexity of an optimization problem, and to speed up the genetic search, an investigation of its fitness landscapes is necessary. 6. Analysis of fitness landscapes. Fitness ... |

1 |
Predictive Models for the Breeder Genetic Algorithm I: Continuous Parameter Optimization
- Mühlenbein, Schlierkamp-Voosen
- 1993
Citation Context ...reeding of neural networks 3.1 Breeder genetic programming (BGP). For the evolution of optimal neural networks, we use the concepts based on the breeder genetic algorithm (BGA) of Mühlenbein et al. [25]. While genetic algorithms typically model a natural evolution, the BGA models a rational selection performed by human breeders. The BGA can be considered as a recombination of evolution strate... |

1 |
Rosenblatt
- unknown authors
- 1962
Citation Context ...followed by an analysis of fitness landscapes in section 6, and discussion in section 7. 2. Representing neural networks as trees. Multilayer feedforward neural networks (or multilayer perceptrons) [28, 16, 29] are networks of simple processing elements, called neurons or units, organized in layers. The external inputs are presented to the input layer and are fed forward via one or more layers of hidden un... |

1 |
An Empirical Study of Greedy Local Search for Satisfiability Testing
- Selman, Kautz
- 1993
Citation Context ...00) = 0.0024. No general learning method is yet known to find such a solution (in terms of both architecture and weight values). Most existing search methods, including iterated hillclimbing methods [4, 18, 31], simulated annealing [10], backpropagation [29], and even other GAs [2], work on a search space of fixed size, while our search space is of variable size. This difference of ability, combined with ... |

1 |
Neural Network Constructive Algorithms: Trading Generalization for Learning Efficiency
- unknown authors
- 1993
Citation Context ...usually leads to overfitting of the training data. On the other hand, a small network will achieve a good generalization if it converges, but it needs, in general, a large amount of training time [1, 32]. Therefore, the size of a network should be as small as possible, yet sufficiently large to ensure an accurate fitting of the training set. A general method for evolving genetic neural networks was ... |

1 | Learning by Genetic Neural Evolution (in German), ISBN 3-929037-16-5 (Sankt Augustin) - Zhang - 1992