Competent Program Evolution
- An excellent start, excellent good description + meta-description / review of existing literature.
- He thinks about things in a slightly different way - separates what I call solutions and objective functions "post- and pre-representational levels" (respectively).
- The thesis focuses on post-representational search/optimization, not pre-representational (though, I believe that both should meet in the middle - eg. pre-representational levels/ objective functions tuned iteratively during post-representational solution creation. This is what a human would do!)
- The primary difficulty in competent program evolution is the intense non-decomposability of programs: every variable, constant, branch effects the execution of every other little bit.
- Competent program creation is possible - humans create programs significantly shorter than lookup tables - hence it should be possible to make a program to do the same job.
- One solution to the problem is representation - formulate the program creation as a set of 'knobs' that can be twiddled (here he means both gradient-descent partial-derivative optimization and simplex or heuristic one-dimensional probabilistic search, of which there are many good algorithms.)
- pp 27: outline of his MOSES program. Read it for yourself, but looks like:
- The representation step above "explicitly addresses the underlying (semantic) structure of program space independently of the search for any kind of modularity or problem decomposition."
- In MOSES, optimization does not operate directly on program space, but rather on subspaces defined by the representation-building process. These subspaces may be considered as being defined by templates assigning values to some of the underlying dimensions (e.g., they restrict the size and shape of any resulting trees).
- In chapter 3 he examines the properties of the boolean programming space, which is claimed to be a good model of larger/more complicated programming spaces in that:
- Simpler functions are much more heavily sampled - e.g. he generated 1e6 samples of 100-term boolean functions, then reduced them to minimal form using standard operators. The vast majority of the resultant minimum length (compressed) functions were simple - tautologies or of a few terms.
- A corollary is that simply increasing syntactic sample length is insufficient for increasing program behavioral complexity / variety.
- Actually, as random program length increases, the percentage with interesting behaviors decreases due to the structure of the minimum length function distribution.
- Also tests random perturbations to large boolean formulae (variable replacement/removal, operator swapping) - ~90% of these do nothing.
- These randomly perturbed programs show a similar structure to above: most of them have very similar behavior to their neighbors; only a few have unique behaviors. makes sense.
- Run the other way: "syntactic space of large programs is nearly uniform with respect to semantic distance." Semantically similar (boolean) programs are not grouped together.
- Results somehow seem a let-down: the program does not scale to even moderately large problem spaces. No loops, only functions with conditional evalutation - Jacques Pitrat's results are far more impressive. {815}
- Seems that, still, there were a lot of meta-knobs to tweak in each implementation. Perhaps this is always the case?
- My thought: perhaps you can run the optimization not on program representations, but rather program codepaths. He claims that one problem is that behavior is loosely or at worst chaotically related to program structure - which is true - hence optimization on the program itself is very difficult. This is why Moshe runs optimization on the 'knobs' of a representational structure.
|