## Abstraction-Based Genetic Programming (2009)

### BibTeX

    @MISC{Binard09abstraction-basedgenetic,
        author = {Franck J. L. Binard},
        title = {Abstraction-Based Genetic Programming},
        year = {2009}
    }

### Abstract

This thesis describes a novel method for representing and automatically generating computer programs in an evolutionary computation context. Abstraction-Based Genetic Programming (ABGP) is a typed Genetic Programming representation system that uses System F, an expressive λ-calculus, to represent the computational components from which the evolved programs are assembled. ABGP is based on the manipulation of closed, independent modules expressing computations with effects that have the ability to affect the whole genotype. These modules are plugged into other modules according to precisely defined rules to form complete computer programs. The use of System F allows the straightforward representation and use of many typical computational structures and behaviors (such as iteration, recursion, lists and trees) in modular form. This is done without introducing additional external symbols in the set of predefined functions and terminals of the system. In fact, programming structures typically included in GP terminal sets, such as if-then-else, may be removed and represented as abstractions in ABGP for the same problems. ABGP also provides a search space partitioning system based on the structure of the genotypes, similar to the species partitioning system of living organisms and derived from the Curry-Howard isomorphism. This thesis also presents the results obtained by applying this method to a set of problems.
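
The abstract's claim that a construct such as if-then-else can be represented as an abstraction, rather than supplied as a primitive, can be illustrated with Church-encoded booleans. The sketch below is untyped Python, not System F and not the thesis's implementation; the names `TRUE`, `FALSE`, `if_then_else`, and `church_and` are illustrative assumptions. In System F both boolean terms would carry the polymorphic type ∀X. X → X → X.

```python
# Church booleans: a boolean is a function that selects one of two arguments.
# In System F both terms would have the type  forall X. X -> X -> X.
TRUE = lambda t: lambda e: t
FALSE = lambda t: lambda e: e


def if_then_else(cond, then_value, else_value):
    """Conditional expressed purely as application of a Church boolean."""
    return cond(then_value)(else_value)


def church_and(p, q):
    """Logical AND built from the same abstractions, with no extra primitive."""
    return p(q)(FALSE)


if __name__ == "__main__":
    print(if_then_else(TRUE, "then", "else"))
    print(if_then_else(church_and(TRUE, FALSE), "then", "else"))
```

Because Python evaluates arguments eagerly, both branch values are computed before the boolean selects one; passing zero-argument functions as branches would be one way to defer evaluation.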

### Citations

3037 | Adaptation in Natural and Artificial Systems
- Holland
- 1975
Citation Context ...ts. In any event, sensicality is one of the most salient issues of the classical representation scheme. 2.2.2.3 Search Space Partitioning Since John Holland’s work in the 1970s and his schema theorem =-=[37]-=-, schemata are often used to explain why genetic algorithms (GA) work. Schemata are “similarity templates” and the schema theorem describes how they are expected to propagate generation after generati... |

2061 | Genetic algorithms in search, optimization, and machine learning
- Goldberg
- 1989
Citation Context ...heir fitness and exhibit less and less phenotypic variation. An alternative approach to understanding how GP searches is based on the idea of dividing the search space into subspaces (called schemata =-=[38, 30]-=-). The original idea for schemata originates from Genetic Algorithms (GA), a branch of evolutionary computation that concerns itself with the evolution of solutions to problems (in contrast with GP th... |

1626 | The Definition of Standard ML
- Milner, Tofte, et al.
- 1990
Citation Context ...g it to the compiler to check whether a type can be assigned to the program. This will be the case if the program is correct. A well known example of a programming language that uses this style is ML =-=[56]-=-. 2. In the Church style, typing is explicit and the terms are annotated versions of the type-free terms. Each term has a type that is usually unique up to α-equivalence and that type is derivable fro... |

1314 | Handbook of Genetic Algorithms
- Davis
- 1991
Citation Context ... primarily because it is historically accepted as a versatile technique that has been widely implemented. In addition, this selection method has also proven successful in a variety of problem domains =-=[43, 20]-=- and the selection algorithm itself is relatively straightforward. Fitness-proportionate selection has the shortcoming of over-selection: individuals with super-fitness tend to get selected too often, le... |
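
The fitness-proportionate selection mentioned in this excerpt can be sketched as a roulette wheel. This is a minimal illustration under stated assumptions (non-negative fitness values where higher is better, and a hypothetical function name `fitness_proportionate_select`); it is not the selection code used in the thesis.

```python
import random


def fitness_proportionate_select(population, fitnesses, rng=random):
    """Pick one individual with probability proportional to its fitness.

    Assumes every fitness value is non-negative and at least one is positive.
    """
    total = sum(fitnesses)
    if total <= 0:
        raise ValueError("fitness values must sum to a positive number")
    pick = rng.random() * total  # uniform point on the wheel, in [0, total)
    running = 0.0
    for individual, fitness in zip(population, fitnesses):
        running += fitness
        if pick < running:
            return individual
    return population[-1]  # guard against floating-point rounding


if __name__ == "__main__":
    rng = random.Random(0)
    names = ["super", "a", "b", "c"]
    scores = [97.0, 1.0, 1.0, 1.0]
    draws = [fitness_proportionate_select(names, scores, rng) for _ in range(1000)]
    print({name: draws.count(name) for name in names})
```

The demo shows the over-selection effect described above: an individual holding most of the total fitness receives most of the draws, which leaves little room for the others.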

1305 | On computable numbers with an application to the Entscheidungsproblem
- Turing
- 1936
Citation Context ...vably total functions. However, System F’s strong normalization property implies that its programs will always eventually terminate. This in turn implies (by the unsolvability of the halting problem =-=[75]-=-) that there are computable functions that cannot be represented in System F. This is not so bad as it sounds because as [3] puts it, in order to find computable functions that cannot be represented i... |

1184 | The Lambda Calculus: Its Syntax and Semantics
- Barendregt
- 1984
Citation Context ...tem dealing with functions became a successful model for the computable functions. Representing computable functions as expressions in a λ-calculus gives rise to what is called functional programming =-=[4]-=-. α-Conversion and α-Equivalence α-conversion is a bound variable renaming operation. It expresses the notion that the names of the bound variables are unimportant; two expressions are α-equivalent wh... |

879 | A formulation of the simple theory of types
- Church
- 1940
Citation Context ...tion. In fact, all computable recursive functions can be represented in the type-free λ-calculus (see [35] and [4]). 5.1.2 Typed λ-calculi Typed versions of the λ-calculus were introduced in [17] and =-=[12]-=-. The two original papers of Curry and Church introducing typed versions of the λ-calculus give rise to two different families of systems. 1. In the Curry style, typing is implicit and terms are those... |

798 | On understanding types, data abstraction, and polymorphism
- Cardelli, Wegner
- 1985
Citation Context ...on of a universal quantification operation on types. 1.2.1 Types Types arise naturally, even starting from untyped universes, in any domain to categorize objects according to their usage and behavior =-=[10]-=-. A type is a collection of values that share some properties [57]. A type system has as its major purpose to avoid embarrassing questions about representations, and to forbid situations where these q... |

760 | Types and Programming Languages
- Pierce
- 2002
Citation Context ... the function has been written, it can be instantiated as needed, by providing values for the parameters in each case. Functional abstraction is a key feature of essentially all programming languages =-=[63]-=-. System F uses the symbol λ to denote anonymous function abstraction. For example, given a function f of type [A → A] (a function that takes an object of type [A] as argument and outputs another obje... |

663 | Some studies in machine learning using the game of checkers
- Samuel
- 1959
Citation Context ...l Concepts Computer scientists have wanted to give computers the ability to learn since the 1950s. The term Machine Learning (ML) was coined in 1959 by Samuel to mean computers programming themselves =-=[68]-=-. A good contemporary definition of ML is due to Mitchell: “Machine learning is the study of computer algorithms that improve automatically through experience” [58]. Genetic Programming (GP), a subset... |

552 | Lambda calculi with types
- Barendregt
- 1992
Citation Context ...minate. This in turn implies (by the unsolvability of the halting problem [75]) that there are computable functions that cannot be represented in System F. This is not so bad as it sounds because as =-=[3]-=- puts it, in order to find computable functions that cannot be represented in F, “one has to stand on one’s head”. In theory, we could do all the programming we would ever need without going outside t... |

515 | Interactive Theorem Proving and Program Development. Coq’Art: The Calculus of Inductive Constructions. Texts in Theoretical Computer Science. An EATCS series
- Bertot, Castéran
- 2004
Citation Context ...evelopment of the calculus of constructions [15] and the logical framework LF [62]. A number of popular computer-based proof systems are based on type theory, for example NuPRL [1], LEGO [64] and Coq =-=[5]-=-. The work presented in this thesis extends the applicability of type theory to GP. 1.2.4 Abstract Data Types (ADT) An Abstract Data Type is a description of a common representation of data. It is a b... |

469 | The formulae-as-types notion of construction
- Howard
- 1980
Citation Context ... to form genotypes from typed terminals. This method is applicable to all versions of typed GP. It is a contribution of the work presented in this thesis. It relies on the Curry-Howard correspondence =-=[39]-=-, the generalization of which is the following claim: a proof is a program, the formula it proves is a type for the program. This can be seen as an analogy which states that the return type of a funct... |

378 | Towards a theory of type structure
- Reynolds
- 1974
Citation Context ...egy related to GP in which the genotypes are compositions of computational blocks, assembled from a pattern derived from a second-order logic proof. The blocks (which are called alleles) are System F =-=[28, 66]-=- terms. System F is an extension of the simply typed λ-calculus. 1.1 MOTIVATIONS AND CONTRIBUTIONS The scientific contributions of this work are: 1. The formulation of ABGP, a GP variant that addresse... |

360 | The GENITOR algorithm and selection pressure: Why rank-based allocation of reproductive trials is best
- Whitley
- 1989
Citation Context ...hold do not produce offspring. Unlike fitness proportionate selection there are no chances for weaker solutions to survive the selection process. 2.5.3 Ranking-based Selection Ranking-based selection =-=[79, 32]-=- sorts the genotypes of a generation according to the raw fitness score of their associated genotypes. The probability of a genotype being selected for reproduction depends only on its position in ter... |
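
The idea that selection probability depends only on rank position, not on the magnitude of the fitness score, can be sketched as follows. Linear ranking (weight equal to rank, with rank 1 assigned to the worst individual) is an assumption made here for illustration, as are the names `rank_probabilities` and `rank_select`; the excerpt does not specify the weighting scheme.

```python
import random


def rank_probabilities(fitnesses):
    """Selection probability of each individual under linear ranking.

    Rank 1 goes to the lowest raw fitness, rank n to the highest.
    Ties are broken by position in the list (an arbitrary choice).
    """
    n = len(fitnesses)
    order = sorted(range(n), key=lambda i: fitnesses[i])
    total = n * (n + 1) // 2  # sum of the ranks 1..n
    probs = [0.0] * n
    for rank, index in enumerate(order, start=1):
        probs[index] = rank / total
    return probs


def rank_select(population, fitnesses, rng=random):
    """Draw one individual using the rank-based probabilities."""
    weights = rank_probabilities(fitnesses)
    return rng.choices(population, weights=weights, k=1)[0]


if __name__ == "__main__":
    # The same ordering yields the same probabilities, whatever the magnitudes.
    print(rank_probabilities([1.0, 2.0, 3.0]))
    print(rank_probabilities([1.0, 2.0, 3000.0]))
```

Compared with fitness-proportionate selection, a single very fit individual cannot dominate the draw, since its weight is capped by its rank.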

354 | Intuitionistic Type Theory
- Martin-Löf
- 1984
Citation Context ...nter-related meanings of types. In this work, we use all of them at different times and in different contexts, so we will clarify immediately the nuances. 1.2.3 Type Theory Intuitionistic type theory =-=[55]-=- is a logical system and a set theory based on the principles of mathematical constructivism. Introduced by Per Martin-Löf in 1972, intuitionistic type theory is based on the analogy between propositi... |

342 | Foundations for programming languages
- Mitchell
- 1996
Citation Context ...ypes arise naturally, even starting from untyped universes, in any domain to categorize objects according to their usage and behavior [10]. A type is a collection of values that share some properties =-=[57]-=-. A type system has as its major purpose to avoid embarrassing questions about representations, and to forbid situations where these questions might come up. In mathematics as in programming, types im... |

320 | Lambda-calculus notation with nameless dummies: a tool for automatic formula manipulation with application to the Church-Rosser theorem
- de Bruijn
- 1972
Citation Context ...am’ as Λ 2. ‘lam’ as λ 3. ‘TT’ as Π 4. ‘->’ as → 6.2 PRIMITIVES To represent System F terms and types, we use the implementation of the de Bruijn canonical representation of variables and expressions =-=[8]-=- described in [63]. It has the advantage of making α-equivalence the same as syntactic equality. In this representation scheme, named variables are replaced by natural numbers, each index is a number ... |
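
The de Bruijn representation mentioned in this excerpt can be sketched for untyped λ-terms. The tuple encoding, the 0-based index convention (an index counts the binders between a variable occurrence and its own binder), and the name `to_de_bruijn` are assumptions made for illustration; the thesis's representation of System F terms and types is not reproduced here.

```python
def to_de_bruijn(term, bound=()):
    """Convert a closed, named lambda term to de Bruijn form.

    Named terms:     ("var", name) | ("lam", name, body) | ("app", fn, arg)
    De Bruijn terms: ("var", index) | ("lam", body)      | ("app", fn, arg)
    `bound` lists the binder names in scope, innermost first.
    """
    kind = term[0]
    if kind == "var":
        name = term[1]
        if name not in bound:
            raise ValueError(f"free variable: {name}")
        # The first match is the innermost binder, so shadowing is respected.
        return ("var", bound.index(name))
    if kind == "lam":
        _, name, body = term
        return ("lam", to_de_bruijn(body, (name,) + bound))
    if kind == "app":
        _, fn, arg = term
        return ("app", to_de_bruijn(fn, bound), to_de_bruijn(arg, bound))
    raise ValueError(f"unknown term kind: {kind}")


if __name__ == "__main__":
    k_xy = ("lam", "x", ("lam", "y", ("var", "x")))
    k_ab = ("lam", "a", ("lam", "b", ("var", "a")))
    # Alpha-equivalent terms become syntactically equal.
    print(to_de_bruijn(k_xy) == to_de_bruijn(k_ab))
```

With names erased, checking α-equivalence reduces to an ordinary equality test on the converted terms, which is the advantage the excerpt points out.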

247 | Strongly typed genetic programming
- Montana
- 1995
Citation Context ...t associative, so [A → (B → C)] is equivalent to [A → B → C]. 1.2.2 GP and Types A problem with using GP to solve large and complex problems is the considerable size of the search space [33]. Montana =-=[59]-=- illustrated how the size of the search space of possible parse trees might be in the order of 10^27 parse trees even for small problems. A type in a metaphysical sense is a category of being. For exam... |

237 | Foundations of Genetic Programming
- Langdon, Poli
- 2002
Citation Context ... individuals as they are modified by the genetic operators and are evaluated by the fitness function [80]. 2. It does not provide any obvious, mathematically natural way to partition the search space =-=[53]-=-. The scheme constructs tree-like genotypes such as the one of figure 2.1 by assembling “functions” (inner nodes) and “terminals” (leaves): Figure 2.1: Genotype formed using the classical GP represe... |

232 | Une extension de l'interprétation de Gödel à l'analyse, et son application à l'élimination des coupures dans l'analyse et la théorie des types
- Girard
- 1971
Citation Context ...egy related to GP in which the genotypes are compositions of computational blocks, assembled from a pattern derived from a second-order logic proof. The blocks (which are called alleles) are System F =-=[28, 66]-=- terms. System F is an extension of the simply typed λ-calculus. 1.1 MOTIVATIONS AND CONTRIBUTIONS The scientific contributions of this work are: 1. The formulation of ABGP, a GP variant that addresse... |

222 | Combinatory Logic I
- Curry, Feys
- 1958
Citation Context ...xponentiation. In fact, all computable recursive functions can be represented in the type-free λ-calculus (see [35] and [4]). 5.1.2 Typed λ-calculi Typed versions of the λ-calculus were introduced in =-=[17]-=- and [12]. The two original papers of Curry and Church introducing typed versions of the λ-calculus give rise to two different families of systems. 1. In the Curry style, typing is implicit and terms ... |

196 | The evolution of evolvability in genetic programming
- Altenberg
- 1994
Citation Context ...fitness landscape would be a three-dimensional map with the fitness of the genotypes as the height. Evolution causes populations to move along a fitness landscape in particular ways. The evolvability =-=[2]-=- of an evolutionary system can be seen as movement toward local peaks in the fitness landscape. A local peak is not necessarily the highest point in the fitness landscape, but any small movement away ... |

168 | Genetic Programming: A Paradigm for Genetically Breeding Populations of Computer Programs to Solve Problems
- Koza
- 1990
Citation Context ...Chapter 1 Introduction Genetic Programming (GP) =-=[45, 46, 48]-=- is an Evolutionary Computation (EC) search strategy in which solutions are represented as executable parse trees. GP systems evolve populations of parse trees using a selection process linked to the ... |

160 | Fundamental concepts in programming languages
- Strachey
- 1967
Citation Context ...nd “generic programming” in the imperative programming world (where confusingly polymorphism means something else). There is a particularly strong kind of polymorphism, called parametric polymorphism =-=[73]-=-. Parametric polymorphism allows the definition of functions which have uniform behavior for all types. For example, a function f, defined in English as: “a function that takes two arguments of the sa... |

159 | Mathematical games: The fantastic combinations of John Conway’s new solitaire game “life”
- Gardner
- 1970
Citation Context ...ch pathways is that very simple systems with very simple rules can produce very complicated behavior. Of course, most simple systems produce simple behavior, but every once in a while, a game of life =-=[27]-=- or a recursive function such as: Q(1) = Q(2) = 1 , Q(n) = Q(n − Q(n − 1)) + Q(n − Q(n − 2)) taken from page 137 of [36] proves itself capable of producing very complicated behavior. Often, the people... |
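
The recursive function quoted in this excerpt, Q(1) = Q(2) = 1 and Q(n) = Q(n − Q(n − 1)) + Q(n − Q(n − 2)), is commonly known as Hofstadter's Q-sequence. A short sketch makes its irregular behavior easy to inspect; the function name `hofstadter_q` and the guard against out-of-range references are assumptions added for illustration.

```python
def hofstadter_q(count):
    """Return the terms Q(1), ..., Q(count) of the recurrence
    Q(1) = Q(2) = 1,  Q(n) = Q(n - Q(n-1)) + Q(n - Q(n-2)).
    """
    q = [0, 1, 1]  # index 0 is unused padding so that q[n] == Q(n)
    for n in range(3, count + 1):
        left, right = n - q[n - 1], n - q[n - 2]
        if left < 1 or right < 1:
            # The recurrence would refer to an undefined term.
            raise ValueError(f"Q({n}) refers to a term before Q(1)")
        q.append(q[left] + q[right])
    return q[1:count + 1]


if __name__ == "__main__":
    # First ten terms: 1, 1, 2, 3, 3, 4, 5, 5, 6, 6
    print(hofstadter_q(10))
```

The values are computed iteratively from a table rather than by direct recursion, since each term depends on earlier terms whose positions are themselves data-dependent.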

159 | A Learning System Based On Genetic Adaptive Algorithms
- Smith
- 1980
Citation Context ...omputer programs that improve automatically as they experience the data on which they are trained. The first results obtained using GP methodology were reported in Stephen F. Smith’s PhD dissertation =-=[71]-=- in 1980. In 1981, Richard Forsyth [26] described the evolution of small programs with applications to forensic science. The first modern statement of GP was given by Nichael L. Cramer [16] in 1985. I... |

135 | Correspondence between ALGOL 60 and Church’s lambda-notation: Part I
- Landin
- 1965
Citation Context ...stigate function definition and application as well as recursion. It has since emerged as a valuable tool in computability or recursion theory. Most programming languages are rooted in the λ-calculus =-=[52]-=-, which provides the basic mechanisms for procedural abstraction and procedure (subprogram) application. The calculus is an idealized, minimalist programming language capable of expressing any algorit... |

126 | Hierarchical genetic algorithms operating on populations of computer programs
- Koza
- 1989
Citation Context ...Chapter 1 Introduction Genetic Programming (GP) =-=[45, 46, 48]-=- is an Evolutionary Computation (EC) search strategy in which solutions are represented as executable parse trees. GP systems evolve populations of parse trees using a selection process linked to the ... |

117 | How genetic algorithms work: A critical look at implicit parallelism
- Grefenstette, Baker
- 1989
Citation Context ...hold do not produce offspring. Unlike fitness proportionate selection there are no chances for weaker solutions to survive the selection process. 2.5.3 Ranking-based Selection Ranking-based selection =-=[79, 32]-=- sorts the genotypes of a generation according to the raw fitness score of their associated genotypes. The probability of a genotype being selected for reproduction depends only on its position in ter... |

113 | A set of postulates for the foundation of logic
- Church
- 1933
Citation Context ... for the related concepts of α-conversion and α-equivalence). 5.1 λ-CALCULI A λ-calculus is a formalism that uses the symbol λ to denote anonymous function abstraction. Originally conceived by Church =-=[11]-=-, the first λ-calculus turned out to be inconsistent [44], but the subsystem dealing with functions became a successful model for the computable functions. Representing computable functions as express... |

111 | Genetic programming for feature discovery and image discrimination
- Tackett
- 1993
Citation Context ...hin that set. It was also observed that most runs achieved a point where the size and complexity of trees eventually began to grow increasingly larger, while performance tapered off to lower values =-=[65, 86, 74]-=-. The fitness value of a genotype in a current generation is a synthesis of the information needed by the evolution function to decide how much of the genotype’s genetic material should be present in ... |

110 | Evolutionary Computation: The Fossil Record
- Fogel
- 1998
Citation Context ...ve the more optimum computation. This leads to stagnation. This can often be detected early by a quick and chronic lack of diversity in the population. Evolvability is related to population diversity =-=[25]-=-. One of the major obstacles to achieving sustainable evolution is lack of diversity: some essential building blocks are missing in the population. Diversity is necessary for the existence of sufficie... |

106 | The mathematical language Automath, its usage and some of its extensions
- de Bruijn
- 1970
Citation Context ...ch is exactly equivalent to a second-order Π-elim rule. The idea that a type [A] can also be viewed as a proposition and a term of [A] as a proof of this proposition is independently due to de Bruijn =-=[22]-=- and Howard [40]. Both papers were conceived in 1968, but hints to this propositions-as-types interpretation were given as early as 1958 in [18] and in [54]. Several systems of proof checking are now ... |

88 | Balancing accuracy and parsimony in genetic programming
- Zhang, Mühlenbein
- 1995
Citation Context ...hin that set. It was also observed that most runs achieved a point where the size and complexity of trees eventually began to grow increasingly larger, while performance tapered off to lower values =-=[65, 86, 74]-=-. The fitness value of a genotype in a current generation is a synthesis of the information needed by the evolution function to decide how much of the genotype’s genetic material should be present in ... |

85 | The troubling aspects of a building block hypothesis for genetic programming
- O’Reilly, Oppacher
- 1994
Citation Context ...notype in which it is embedded. This is necessary so as to allow the representation of certain types of computations (such as recursive applications) that require such a mechanism. 2. Schema capture: =-=[60]-=- found that in the common GP representation, the probability of disruption of a schema changes so drastically from generation to generation that it can best be represented as a random variable. A modu... |

77 | Strongly typed genetic programming in evolving cooperation strategies
- Haynes, Wainwright, et al.
- 1995
Citation Context ... arrow is right associative, so [A → (B → C)] is equivalent to [A → B → C]. 1.2.2 GP and Types A problem with using GP to solve large and complex problems is the considerable size of the search space =-=[33]-=-. Montana [59] illustrated how the size of the search space of possible parse trees might be in the order of 10^27 parse trees even for small problems. A type in a metaphysical sense is a category of b... |

74 | Constructions: A higher order proof system for mechanizing mathematics
- Coquand, Huet
Citation Context ...widely in use in theories of semantics of natural language. Intuitionistic type theory developed the notion of dependent types and directly influenced the development of the calculus of constructions =-=[15]-=- and the logical framework LF [62]. A number of popular computer-based proof systems are based on type theory, for example NuPRL [1], LEGO [64] and Coq [5]. The work presented in this thesis extends t... |

69 | The Theory of LEGO: A Proof Checker for the Extended Calculus of Constructions
- Pollack
- 1994
Citation Context ...luenced the development of the calculus of constructions [15] and the logical framework LF [62]. A number of popular computer-based proof systems are based on type theory, for example NuPRL [1], LEGO =-=[64]-=- and Coq [5]. The work presented in this thesis extends the applicability of type theory to GP. 1.2.4 Abstract Data Types (ADT) An Abstract Data Type is a description of a common representation of dat... |

67 | Analysis of Complexity Drift in Genetic Programming
- Rosca
- 1997
Citation Context ...hin genotypes. This implies that schemata do not partition the search space, as a genotype may belong to several schemata. Some research has tried to correct this, such as Rosca’s Rooted Tree Theorem =-=[67]-=- in which the definition of a schema is a rooted tree fragment. For example, the schema (plus # time) includes all genotypes whose root node is a plus and whose second argument is time. With this sche... |
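
The rooted-schema example in this excerpt, (plus # time), can be sketched as a matcher over tuple-encoded trees. Treating `#` as a wildcard that matches any whole subtree is an assumption here, as are the tuple encoding and the name `matches_rooted_schema`; the excerpt is truncated before it gives the formal definition.

```python
WILDCARD = "#"  # assumed "don't care" symbol that matches any subtree


def matches_rooted_schema(schema, tree):
    """Return True if `tree` is an instance of the rooted `schema`.

    Trees are nested tuples (function, arg1, arg2, ...) with strings as leaves.
    Matching starts at the root, so the schema cannot float inside the tree.
    """
    if schema == WILDCARD:
        return True
    if isinstance(schema, tuple) and isinstance(tree, tuple):
        return len(schema) == len(tree) and all(
            matches_rooted_schema(s, t) for s, t in zip(schema, tree)
        )
    return schema == tree


if __name__ == "__main__":
    schema = ("plus", "#", "time")
    print(matches_rooted_schema(schema, ("plus", "x", "time")))
    print(matches_rooted_schema(schema, ("plus", ("times", "x", "x"), "time")))
    print(matches_rooted_schema(schema, ("minus", "x", "time")))
```

Anchoring the match at the root is what distinguishes this from the non-rooted component schemata, which may match anywhere in a genotype.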

65 | An Analysis of Genetic Programming
- O’Reilly
- 1995
Citation Context ...hema varies from generation to generation during a run. 3.1.1 Program Component Schema Theories Component analysis concentrates on the propagation of sub-components of genotypes. In some definitions, =-=[47, 60, 77, 78, 61]-=- components of a schema are non-rooted in the sense that the schema can potentially match components anywhere in the tree. The first definition of a schema in the context of GP appears in [47] and it ... |

62 | Grammatically-based genetic programming
- Whigham
- 1995
Citation Context ...are allowed to be the child nodes of functions in GP program trees. 3.2.2 Context-free Grammar Approach Also noting that the requirement of closure makes many program structures difficult to express, =-=[76]-=- proposed the use of context free grammars (CFGs) to specify the structure of the system’s programs. A context free grammar describes the admissible constructs of a language. Note in the following def... |

52 | Simultaneous evolution of programs and their control structures
- Spector
- 1996
Citation Context ...re mechanism (section 3.2.1). It was found that genetic programs with ADFs have an advantage on some problems, particularly when the problem has a high level of regularity in its solution. 3.3.2 ADMs =-=[72]-=- has proposed the use of Automatically Defined Macros (ADMs). An ADM is evaluated in the main program global environment, unlike an ADF where the evaluation is performed in its local environment. The ... |

50 | Type inheritance in strongly typed genetic programming
- Haynes, Schoenefeld, et al.
- 1996
Citation Context ... that the reduced search space is the cause of the performance improvements. They also showed that the programs generated by STGP tend to be easier to understand. Following these encouraging results, =-=[34]-=- proposes the extension of STGP with a type hierarchy mechanism and describes an application to the problem of finding all the cliques in an undirected or directed graph. 3.2.3.2 Polymorphic STGP The ... |

44 | The Nuprl open logical environment
- Allen, Constable, et al.
Citation Context ...rectly influenced the development of the calculus of constructions [15] and the logical framework LF [62]. A number of popular computer-based proof systems are based on type theory, for example NuPRL =-=[1]-=-, LEGO [64] and Coq [5]. The work presented in this thesis extends the applicability of type theory to GP. 1.2.4 Abstract Data Types (ADT) An Abstract Data Type is a description of a common representa... |

44 | Dependent types in logic programming
- Pfenning
- 1992
Citation Context ...tics of natural language. Intuitionistic type theory developed the notion of dependent types and directly influenced the development of the calculus of constructions [15] and the logical framework LF =-=[62]-=-. A number of popular computer-based proof systems are based on type theory, for example NuPRL [1], LEGO [64] and Coq [5]. The work presented in this thesis extends the applicability of type theory to... |

41 | The selfish gene. New edition
- Dawkins
- 1989
Citation Context ...s and another at the level of genes. Future Work: In biology, the smallest entity within the hierarchy of biological organization that is subject to natural selection is called the unit of selection. =-=[21]-=- proposed the gene as the biological unit of selection based on the observation that genes that improve the survival or reproductive chances of the organisms that carry them also improve their own cha... |

39 | A schema theorem for context-free grammars
- Whigham
- 1995
Citation Context ...hema varies from generation to generation during a run. 3.1.1 Program Component Schema Theories Component analysis concentrates on the propagation of sub-components of genotypes. In some definitions, =-=[47, 60, 77, 78, 61]-=- components of a schema are non-rooted in the sense that the schema can potentially match components anywhere in the tree. The first definition of a schema in the context of GP appears in [47] and it ... |

36 | Grammatical Bias for Evolutionary Learning
- Whigham
- 1996
Citation Context ...hema varies from generation to generation during a run. 3.1.1 Program Component Schema Theories Component analysis concentrates on the propagation of sub-components of genotypes. In some definitions, =-=[47, 60, 77, 78, 61]-=- components of a schema are non-rooted in the sense that the schema can potentially match components anywhere in the tree. The first definition of a schema in the context of GP appears in [47] and it ... |

32 | Evolutionary principles in self-referential learning, or on learning how to learn: the meta-meta-… hook, Institut für Informatik, Technische Universität München, [Online], http://www.idsia.ch/~juergen/diploma.html
- Schmidhuber
- 1987
Citation Context ...85. Independently, Jürgen Schmidhuber, an undergraduate student, evolved computer programs through genetic algorithms. The method was published in 1987 as one of the first papers in the emerging field =-=[69]-=-. John Koza [45, 46, 48] later explored program creation by means of evolution and established the field of GP. Since then, there has been extensive demonstration of GP as a domain-independent method ... |