## Evolutionary multi-objective optimization of neural networks for face detection (2004)

Venue: | INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE AND APPLICATIONS |

Citations: | 9 - 4 self |

### BibTeX

@ARTICLE{Wiegand04evolutionarymulti-objective,

author = {Stefan Wiegand and Christian Igel and Uwe Handmann},

title = {Evolutionary multi-objective optimization of neural networks for face detection},

journal = {INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE AND APPLICATIONS},

year = {2004},

volume = {4},

number = {3},

pages = {237--253}

}

### Abstract

For face recognition from video streams, speed and accuracy are vital. The first decision, whether a preprocessed image region represents a human face or not, is often made by a feed-forward neural network (NN), e.g., in the Viisage-FaceFINDER video surveillance system. We describe the optimization of such an NN by a hybrid algorithm combining evolutionary multi-objective optimization (EMO) and gradient-based learning. The evolved solutions perform considerably faster than an expert-designed architecture without loss of accuracy. We compare an EMO and a single-objective approach, both with online search strategy adaptation. EMO turns out to be preferable to the single-objective approach in several respects.

### Citations

1227 |
Multi-objective optimization using evolutionary algorithms
- Deb
- 2001
Citation Context ...${}^{(t,\tau)}_o / q^{(t,\tau)}_{\mathrm{all}} + (1-\zeta)\cdot\tilde{p}^{(t)}_o$ if $q^{(t,\tau)}_{\mathrm{all}} > 0$, and $\zeta/\lvert\Omega\rvert + (1-\zeta)\cdot\tilde{p}^{(t)}_o$ otherwise (8), and $p^{(t+1)}_o := p_{\min} + (1 - \lvert\Omega\rvert \cdot p_{\min})\,\tilde{p}^{(t+1)}_o \big/ \sum_{o' \in \Omega} \tilde{p}^{(t+1)}_{o'}$. (9) The factor $q^{(t,\tau)}_{\mathrm{all}} := \sum_{o' \in \Omega} q^{(t,\tau)}_{o'}$ is used for normalization and $\tilde{p}^{(t+1)}_o$ stores the weighted average of the quality of the operator $o$, where the influence of previous adaptation cycles decreases exponentially.... |

1017 | Neural Network-Based Face Detection - Rowley, Baluja, et al. - 1998 |

963 | Face recognition: A literature survey
- Zhao, Chellappa, et al.
- 2002
Citation Context ... Then all individuals $a \in P^{(t)} \cup O^{(t)}$ are sorted in ascending order according to the partial order $\geq_n$ defined by $a_i \geq_n a_j \Leftrightarrow \bigl(R^{(t)}(a_i) < R^{(t)}(a_j)\bigr) \lor \bigl(R^{(t)}(a_i) = R^{(t)}(a_j) \land C(a_i) \geq C(a_j)\bigr)$ (4) and the first $\lvert P \rvert$ individuals form the new parent population $P^{(t+1)}$. We refer to the described selection method as NSGA-II selection throughout this article. 2.2.5. Search strategy adaptation: Adju... |
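The ordering in (4) can be illustrated with a minimal Python sketch. It assumes that the non-domination rank $R^{(t)}(a)$ and the crowding value $C(a)$ have already been computed for every candidate; the `Individual` type and the function name `nsga2_select` are hypothetical and are not taken from the article.

```python
from typing import List, NamedTuple


class Individual(NamedTuple):
    """Hypothetical container: rank plays the role of R(a), crowding of C(a)."""
    name: str
    rank: int
    crowding: float


def nsga2_select(candidates: List[Individual], n_parents: int) -> List[Individual]:
    """Sort by ascending rank, break ties by larger crowding value, keep the first n_parents."""
    ordered = sorted(candidates, key=lambda a: (a.rank, -a.crowding))
    return ordered[:n_parents]


# Parents and offspring pooled together, as in P(t) united with O(t).
pool = [
    Individual("a", 2, 0.5),
    Individual("b", 1, 0.2),
    Individual("c", 1, float("inf")),  # boundary solutions often get an infinite crowding value
    Individual("d", 1, 0.9),
]
selected = nsga2_select(pool, 3)
```

Computing the ranks and crowding values themselves (non-dominated sorting and crowding distance) is outside the scope of this sketch.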

933 | A fast and elitist multiobjective genetic algorithm: NSGA–II
- Deb, Pratap, et al.
Citation Context ...ference indicator $D_{A,B} := H_{A+B} - H_B$.$^{29}$ It reflects the size of the objective space that is weakly dominated by the set $A$ but not by $B$, see Fig. 3 (right). It holds $(D_{A,B} > 0 \text{ and } D_{B,A} = 0) \Leftrightarrow (A \lhd B)$. (12) The coverage difference $D_{A,B}$ also allows to draw conclusions of the form $(D_{A,B} = 0) \land (D_{B,A} = 0) \Leftrightarrow (A = B)$, and $(D_{A,B} > 0) \land (D_{B,A} > 0) \Leftrightarrow (A \parallel B)$, where $A \parallel B$ denotes that $A$ and $B$ are incomparable. Fol... |
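As a rough illustration of the coverage difference, the following Python sketch computes $D_{A,B}$ for two objectives. It rests on several assumptions that the excerpt does not confirm: both objectives are minimized, $H_{A+B}$ is read as the hypervolume of the union of the two fronts, and the hypervolume is taken as the plain (unnormalized) area bounded by a reference point `ref`. All function names are hypothetical.

```python
from typing import Iterable, Tuple

Point = Tuple[float, float]


def hypervolume_2d(front: Iterable[Point], ref: Point) -> float:
    """Area weakly dominated by `front` and bounded by `ref` (two minimized objectives)."""
    # Only points strictly better than the reference point in both objectives contribute.
    pts = sorted(p for p in front if p[0] < ref[0] and p[1] < ref[1])
    area = 0.0
    best_z2 = ref[1]
    for z1, z2 in pts:
        if z2 < best_z2:
            # New horizontal strip between the previous best z2 and this z2.
            area += (ref[0] - z1) * (best_z2 - z2)
            best_z2 = z2
    return area


def coverage_difference(a: Iterable[Point], b: Iterable[Point], ref: Point) -> float:
    """D_{A,B}: area dominated by A but not by B, assuming H_{A+B} means the union."""
    a, b = list(a), list(b)
    return hypervolume_2d(a + b, ref) - hypervolume_2d(b, ref)


ref = (4, 4)
front_a = [(1, 3), (3, 1)]
front_b = [(2, 2)]
d_ab = coverage_difference(front_a, front_b, ref)
d_ba = coverage_difference(front_b, front_a, ref)
```

In this example both differences are positive, which corresponds to the incomparable case $A \parallel B$ in the excerpt.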

684 | Evolutionary Computation: Towards a New Philosophy of Machine Intelligence - Fogel - 1995 |

664 | Detecting faces in images: A survey
- Yang, Kriegman, et al.
- 2002
Citation Context ...absolute deviation of the quantities $\Delta_{A_i,B_j} := D_{A_i,B_j} - D_{B_j,A_i} = -\Delta_{B_j,A_i}$ and $D_{A_i,B_j}$. Furthermore we calculate for $1 \le i, j \le T$ $P_{A_i \lhd B} := \lvert \{ (A_i, B_j) : D_{A_i,B_j} > 0 \land D_{B_j,A_i} = 0,\ 1 \le j \le T \} \rvert \cdot 1/T$, (14) $P_{A_i \parallel B} := \lvert \{ (A_i, B_j) : D_{A_i,B_j} > 0 \land D_{B_j,A_i} > 0,\ 1 \le j \le T \} \rvert \cdot 1/T$, (15) $P_{B_j \lhd A} := \lvert \{ (A_i, B_j) : D_{A_i,B_j} = 0 \land D_{B_j,A_i} > 0,\ 1 \le i \le T \} \rvert \cdot 1/T$, (16) $P_{B_j \parallel A} := \lvert \{ (A_i, B_j) : D_{A_i,B_j} > 0 \land D_{B_j,}$... |

573 |
Evolutionary Algorithms for Solving Multi-Objective Problems
- Coello, Veldhuizen
- 2002
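Citation Context ...The operator probabilities $p^{(t+1)}_o$ are adjusted every $\tau$ generations according to equations $\tilde{p}^{(t+1)}_o := \zeta \cdot q^{(t,\tau)}_o / q^{(t,\tau)}_{\mathrm{all}} + (1-\zeta)\cdot\tilde{p}^{(t)}_o$ if $q^{(t,\tau)}_{\mathrm{all}} > 0$, and $\tilde{p}^{(t+1)}_o := \zeta/\lvert\Omega\rvert + (1-\zeta)\cdot\tilde{p}^{(t)}_o$ otherwise (8), and $p^{(t+1)}_o := p_{\min} + (1 - \lvert\Omega\rvert \cdot p_{\min})\,\tilde{p}^{(t+1)}_o \big/ \sum_{o' \in \Omega} \tilde{p}^{(t+1)}_{o'}$. (9) The factor $q^{(t,\tau)}_{\mathrm{all}} := \sum_{o' \in \Omega} q^{(t,\tau)}_{o'}$ is used for normalization and $\tilde{p}^{(t+1)}_o$ stores the weighted ... |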
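The update in (8) and (9) might be sketched in Python as follows. This is only an illustration under stated assumptions: the normalization factor is taken to be the sum of the operator qualities, the learning rate `zeta` lies in (0, 1], and the operator names `"add"` and `"delete"` as well as the function name are hypothetical placeholders.

```python
from typing import Dict, Tuple


def update_operator_probabilities(
    q: Dict[str, float],
    p_tilde: Dict[str, float],
    zeta: float,
    p_min: float,
) -> Tuple[Dict[str, float], Dict[str, float]]:
    """Return (new p_tilde, new operator probabilities) following (8) and (9)."""
    ops = list(p_tilde)
    n_ops = len(ops)
    if not 0.0 <= n_ops * p_min <= 1.0:
        raise ValueError("p_min must satisfy 0 <= |Omega| * p_min <= 1")
    q_all = sum(q[o] for o in ops)  # assumed normalization: sum of operator qualities
    new_tilde = {}
    for o in ops:
        # Equation (8): relative quality if any operator improved, uniform share otherwise.
        share = q[o] / q_all if q_all > 0 else 1.0 / n_ops
        new_tilde[o] = zeta * share + (1.0 - zeta) * p_tilde[o]
    total = sum(new_tilde.values())
    # Equation (9): rescale so that every operator keeps at least probability p_min.
    probs = {o: p_min + (1.0 - n_ops * p_min) * new_tilde[o] / total for o in ops}
    return new_tilde, probs


new_tilde, probs = update_operator_probabilities(
    q={"add": 0.3, "delete": 0.1},
    p_tilde={"add": 0.5, "delete": 0.5},
    zeta=0.5,
    p_min=0.1,
)
```

Because the exponentially weighted values are renormalized in (9), the resulting probabilities sum to one and none falls below `p_min`.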

534 | Distortion invariant object recognition in the dynamic link architecture
- Lades, Vorbruggen, et al.
- 1993
Citation Context ...zation. The elements of the objective space are partially ordered by the dominance relation $\prec$ ($z$ dominates $z'$) that is defined by $z \prec z' \in \mathbb{R}^n \Leftrightarrow \forall\, 1 \le i \le n : z_i \le z'_i \land \exists\, 1 \le j \le n : z_j < z'_j$ (3) stating that vector $z$ performs better than $z'$ iff $z$ is at least as good as $z'$ in all objectives and better with respect to at least one objective. Considering a set $M$ (2)... |
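A minimal Python sketch of the dominance test in (3), assuming that all objectives are minimized (smaller values are better); the function name `dominates` is hypothetical.

```python
from typing import Sequence


def dominates(z: Sequence[float], z_prime: Sequence[float]) -> bool:
    """Return True if z dominates z_prime: no worse everywhere, strictly better somewhere."""
    if len(z) != len(z_prime):
        raise ValueError("objective vectors must have the same length")
    no_worse = all(a <= b for a, b in zip(z, z_prime))
    strictly_better = any(a < b for a, b in zip(z, z_prime))
    return no_worse and strictly_better


better = dominates((1.0, 2.0), (1.0, 3.0))        # equal in z1, better in z2
same = dominates((1.0, 2.0), (1.0, 2.0))          # identical vectors do not dominate each other
incomparable = dominates((1.0, 3.0), (2.0, 2.0))  # better in z1, worse in z2
```

The relation is a strict partial order, so two vectors can be mutually non-dominated, as in the last example.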

454 | Evolving artificial neural networks - Yao - 1999 |

355 | The GENITOR algorithm and selection pressure – why rank-based allocation of reproduction trials is best - Whitley - 1989 |

334 | An Evolutionary Algorithm for Multiobjective Optimisation: the Strength Pareto Approach. TIK-Report No. 43. Institut für Technische Informatik und Kommunikationsnetze - Zitzler, Thiele - 1998 |

256 | Parameter control in evolutionary algorithms - Eiben, Hinterding, et al. - 1999 |

188 | Adapting operator probabilities in genetic algorithms - Davis - 1989 |

181 | The sample complexity of pattern classification with neural networks: the size of the weights is more important than the size of the network
- Bartlett
- 1998
Citation Context ...${}_{B_j} > 0 \land D_{B_j,A_i} = 0,\ 1 \le j \le T \} \rvert \cdot 1/T$, (14) $P_{A_i \parallel B} := \lvert \{ (A_i, B_j) : D_{A_i,B_j} > 0 \land D_{B_j,A_i} > 0,\ 1 \le j \le T \} \rvert \cdot 1/T$, (15) $P_{B_j \lhd A} := \lvert \{ (A_i, B_j) : D_{A_i,B_j} = 0 \land D_{B_j,A_i} > 0,\ 1 \le i \le T \} \rvert \cdot 1/T$, (16) $P_{B_j \parallel A} := \lvert \{ (A_i, B_j) : D_{A_i,B_j} > 0 \land D_{B_j,A_i} > 0,\ 1 \le i \le T \} \rvert \cdot 1/T$, (17) that is, in (14) the average number of trials from algorithm (B) that perform worse than trial $A_i$, in (15) the average n... |

82 | Supervised Learning in Multilayer Perceptrons - from Backpropagation to Adaptive Learning Techniques - Riedmiller - 1994 |

64 | Performance Assessment of Multiobjective Optimizers: An Analysis - Zitzler, Thiele, et al. |

61 | Empirical evaluation of the improved Rprop learning algorithms - Igel, Husken - 2003 |

58 | Neural networks for classification: a survey
- Zhang
- 2000
Citation Context ...${}^{\min}_1, z^{\min}_2$ empirically by looking at all Pareto fronts obtained. In each trial of our optimization algorithm, we maintain an additional archive $A^{(t+1)} := P_{\bigcup_{t'=0}^{t+1} P^{(t')}} = P_{A^{(t)} \cup P^{(t+1)}}$ (13) starting from $A^{(0)} = P_{P^{(0)}}$. In our optimization scenario $\lvert A^{(t)} \rvert$ is always small and we do not need to discard any non-dominated solutions. We regard the final archive as the outcome of an optimi... |
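The archive update in (13) can be sketched in Python as follows. The sketch assumes that the operator applied to a set returns its non-dominated subset (its Pareto front), that all objectives are minimized, and that solutions are represented directly by their objective vectors; the function names are hypothetical.

```python
from typing import Iterable, List, Tuple

Vector = Tuple[float, ...]


def dominates(z: Vector, z_prime: Vector) -> bool:
    """z dominates z_prime: no worse in all objectives, strictly better in at least one."""
    return all(a <= b for a, b in zip(z, z_prime)) and any(a < b for a, b in zip(z, z_prime))


def pareto_front(points: Iterable[Vector]) -> List[Vector]:
    """Non-dominated subset of `points`, with duplicates removed and input order kept."""
    unique = list(dict.fromkeys(tuple(p) for p in points))
    return [p for p in unique if not any(dominates(q, p) for q in unique)]


def update_archive(archive: Iterable[Vector], population: Iterable[Vector]) -> List[Vector]:
    """Archive for the next generation: Pareto front of the old archive plus the new parents."""
    return pareto_front(list(archive) + list(population))


archive = pareto_front([(1, 4), (3, 3), (4, 1)])       # initial archive from the first population
archive = update_archive(archive, [(2, 2), (5, 5)])    # (2, 2) displaces (3, 3); (5, 5) is dominated
```

No size limit is applied here, in line with the remark in the excerpt that the archive stays small and no non-dominated solution needs to be discarded.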

48 | Operator and parameter adaptation in genetic algorithms - Smith, Fogarty - 1997 |

43 |
Face detection: a survey, Computer Vision and Image Understanding 83
- Hjelmas, Low
- 2001
Citation Context ...${}_{i,B_j}$. Furthermore we calculate for $1 \le i, j \le T$ $P_{A_i \lhd B} := \lvert \{ (A_i, B_j) : D_{A_i,B_j} > 0 \land D_{B_j,A_i} = 0,\ 1 \le j \le T \} \rvert \cdot 1/T$, (14) $P_{A_i \parallel B} := \lvert \{ (A_i, B_j) : D_{A_i,B_j} > 0 \land D_{B_j,A_i} > 0,\ 1 \le j \le T \} \rvert \cdot 1/T$, (15) $P_{B_j \lhd A} := \lvert \{ (A_i, B_j) : D_{A_i,B_j} = 0 \land D_{B_j,A_i} > 0,\ 1 \le i \le T \} \rvert \cdot 1/T$, (16) $P_{B_j \parallel A} := \lvert \{ (A_i, B_j) : D_{A_i,B_j} > 0 \land D_{B_j,A_i} > 0,\ 1 \le i \le T \} \rvert \cdot 1/T$, (17) that is, in (14) the average number of tria... |

34 | Locating and tracking of human faces with neural networks
- Hunke
- 1994
Citation Context ..., (5) where $B^{(t)}(a)$ represents a quality measure proportional to some kind of fitness improvement. This is for the scalar value based selection scheme, case (A), $B^{(t)}(a) := \Phi(a) - \Phi(\mathrm{parent}(a))$ (6) and for the vector-valued selection scheme, case (B), $B^{(t)}(a) := R^{(t)}(\mathrm{parent}(a)) - R^{(t)}(a)$ (7) respectively, where $\mathrm{parent}(a)$ denotes the parent of an offspring $a$. The operator probabilities p (... |

30 | A critical survey of performance indices for multi-objective optimisation - Okabe, Jin, et al. - 2003 |

25 | An image processing system for driver assistance
- Handmann, Kalinke, et al.
- 2000
Citation Context ... an adaptation cycle. The average performance achieved by the operator $o$ over an adaptation cycle is measured by $q^{(t,\tau)}_o := \sum_{i=0}^{\tau-1} \sum_{a \in O^{(t-i)}_o} \max\bigl(0, B^{(t)}(a)\bigr) \Big/ \sum_{i=0}^{\tau-1} \lvert O^{(t-i)}_o \rvert$, (5) where $B^{(t)}(a)$ represents a quality measure proportional to some kind of fitness improvement. This is for the scalar value based selection scheme, case (A), $B^{(t)}(a) := \Phi(a) - \Phi(\mathrm{parent}(a))$ (6) and ... |
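The operator quality in (5) amounts to averaging the positive part of the improvement over all offspring an operator produced during one adaptation cycle. A minimal Python sketch follows, assuming the improvements $B^{(t)}(a)$ have already been computed per generation; the function name and the choice to return 0 when the operator produced no offspring are assumptions, not part of the excerpt.

```python
from typing import List, Sequence


def operator_quality(improvements_per_generation: Sequence[Sequence[float]]) -> float:
    """Average of max(0, B(a)) over all offspring of one operator in the last tau generations."""
    total = sum(max(0.0, b) for gen in improvements_per_generation for b in gen)
    count = sum(len(gen) for gen in improvements_per_generation)
    # Assumption: an operator that created no offspring in the cycle gets quality 0.
    return total / count if count > 0 else 0.0


# Two generations (tau = 2): three offspring in total, one of them a deterioration.
history: List[List[float]] = [[0.2, -0.1], [0.4]]
quality = operator_quality(history)
```

Clipping at zero means that deteriorations do not count against an operator; they only lower its average by contributing nothing.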

16 | Neural network regularization and ensembling using multi-objective evolutionary algorithms
- Jin, Okabe, et al.
- 2004
Citation Context ... unary quantitative indicators $I_i : \mathcal{P}(\mathbb{R}^n) \to \mathbb{R}$, $1 \le i \le m$, such that in general $I_1(A) > I_1(B) \land \dots \land I_m(A) > I_m(B) \Leftrightarrow A \lhd B$. However, we can achieve $(A \lhd B \Rightarrow I(A) > I(B))$ and $(I(A) > I(B) \Rightarrow B \ntriangleright A)$, (11) for example when using the hypervolume-indicator$^{29}$ $H_P$ explained in Fig. 3 (right). It measures the portion of object... |

15 |
Overfitting in neural networks: backpropagation, conjugate gradient, and early stopping
- Caruana, Lawrence, et al.
Citation Context ...${}_{B_j} > 0 \land D_{B_j,A_i} > 0,\ 1 \le j \le T \} \rvert \cdot 1/T$, (15) $P_{B_j \lhd A} := \lvert \{ (A_i, B_j) : D_{A_i,B_j} = 0 \land D_{B_j,A_i} > 0,\ 1 \le i \le T \} \rvert \cdot 1/T$, (16) $P_{B_j \parallel A} := \lvert \{ (A_i, B_j) : D_{A_i,B_j} > 0 \land D_{B_j,A_i} > 0,\ 1 \le i \le T \} \rvert \cdot 1/T$, (17) that is, in (14) the average number of trials from algorithm (B) that perform worse than trial $A_i$, in (15) the average number of trials from algorithm (B) that are incomparable to trial $A_i$, in (16) t... |

14 | Operator adaptation in evolutionary computation and its application to structure optimization of neural networks - Igel, Kreutz |

13 | Fast network pruning and feature extraction using the unit-OBS algorithm - Stahlberger, Riedmiller - 1997 |

9 | Neuronale Netze. Optimierung durch Lernen und Evolution - Braun - 1997 |

8 | Evolutionary optimization of neural networks for face detection
- Wiegand, Igel, et al.
- 2004
Citation Context ...t. This is for the scalar value based selection scheme, case (A), $B^{(t)}(a) := \Phi(a) - \Phi(\mathrm{parent}(a))$ (6) and for the vector-valued selection scheme, case (B), $B^{(t)}(a) := R^{(t)}(\mathrm{parent}(a)) - R^{(t)}(a)$ (7) respectively, where $\mathrm{parent}(a)$ denotes the parent of an offspring $a$. The operator probabilities $p^{(t+1)}_o$ are adjusted every $\tau$ generations according to equations $\tilde{p}^{(t+1)}_o := \zeta \cdot q^{(t,\tau)}_o / q^{(t,\tau)}$... |

7 | Early Stopping - But When? In: Neural Networks: Tricks of the Trade - Prechelt - 1998 |

5 |
Speeding up backpropagation using multiobjective evolutionary algorithms
- Abbass
- 2003
Citation Context ...n. A set $A \in \mathcal{P}(\mathbb{R}^n)$ can be defined to be better than a front $B \in \mathcal{P}(\mathbb{R}^n)$ by the relation $A \lhd B$ ($A$ weakly dominates $B$) given by $A \lhd B$ iff $A \neq B$ and $\forall\, b \in B : \exists\, a \in A$ : $b$ is weakly dominated by $a$. (10) Weak dominance of $z$ compared to $z'$ means that objective vector $z$ is not worse than $z'$ in all objectives. We would also like to make some quantitative statements about how much Pareto fronts outperf... |

3 |
Evaluation of implicit 3D modeling for pose invariant face recognition
- Hüsken, Brauckmann, et al.
- 2004
Citation Context ...$\land\ \exists\, 1 \le j \le n : z_j < z'_j$ (3) stating that vector $z$ performs better than $z'$ iff $z$ is at least as good as $z'$ in all objectives and better with respect to at least one objective. Considering a set $M$ (2) ... Fig. 3. The figure on the left... |

1 | Evolution and learning in neural networks, in The Handbook of Brain Theory and Neural - Nolfi - 2002 |