{1529} revision 0 modified: 02-01-2021 18:39 gmt

DreamCoder: Growing generalizable, interpretable knowledge with wake-sleep Bayesian program learning

  • Kevin Ellis, Catherine Wong, Maxwell Nye, Mathias Sable-Meyer, Luc Cary, Lucas Morales, Luke Hewitt, Armando Solar-Lezama, Joshua B. Tenenbaum

This paper describes a system for adaptively finding programs which succinctly and accurately produce desired output.  These desired outputs are provided by the user / test system, and come from a number of domains:

  • list (as in lisp) processing,
  • text editing,
  • regular expressions,
  • line graphics,
  • 2d lego block stacking,
  • symbolic regression (ish),
  • functional programming,
  • and physical laws.
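
To make the task format concrete, here is a toy sketch in Python (the names and the plain-Python "program" are my own invention, not DreamCoder's actual API): a task is just a handful of input-output examples, and the synthesizer's job is to find a short program in its DSL that reproduces all of them.

    # Hypothetical illustration of a program-induction task; names are mine.
    # A task is a set of input-output examples the program must reproduce.
    double_evens = [
        ([1, 2, 3, 4], [4, 8]),
        ([5, 6],       [12]),
        ([7],          []),
    ]

    # One candidate solution, written as ordinary Python standing in for a DSL
    # expression like (map (* 2) (filter even? xs)).
    candidate = lambda xs: [2 * x for x in xs if x % 2 == 0]

    # The synthesizer must find a succinct candidate consistent with every example.
    assert all(candidate(i) == o for i, o in double_evens)
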
Some of these domains are naturally toy-like, e.g. the text processing, but others are deeply impressive: the system was able to "re-derive" basic physical laws of vector calculus in the process of looking for S-expression forms of cheat-sheet physics equations.  These advances come out of a long lineage of work, perhaps starting with the Helmholtz machine PMID-7584891 introduced by Peter Dayan, Geoff Hinton and others, in which one model is trained to generate patterns given a context, while a second, recognition module is trained to invert this model: to derive the context from the patterns.  The two work in tandem to allow model exploration in high dimensions.
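In caricature, that wake-sleep recipe looks something like the following schematic (my own caricature, not the paper's training code; the model-fitting pieces are passed in as callables rather than implemented here):

    # Schematic of the Helmholtz-machine / wake-sleep idea (my sketch only).
    def wake_sleep(infer, fit_generative, sample_fantasy, fit_recognition,
                   observations, rounds=10):
        for _ in range(rounds):
            # Wake: explain real observations with the current recognition model,
            # then update the generative model to reproduce them from those causes.
            causes = [infer(x) for x in observations]
            fit_generative(causes, observations)

            # Sleep: dream up (context, pattern) pairs from the generative model
            # and train the recognition model to invert them: pattern -> context.
            fantasies = [sample_fantasy() for _ in observations]
            fit_recognition([p for _, p in fantasies], [c for c, _ in fantasies])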

Also in the lineage is the EC2 algorithm, which most of the same authors published in 2018.  EC2 centers on the idea of "explore - compress": explore solutions to your program-induction problems during the 'wake' phase, then compress the discovered programs into a library by extracting / factoring out commonalities during the 'sleep' phase.  This is, of course, one of the core algorithms of human learning: explore options, keep track of both what worked and what didn't, search for commonalities among the options and their effects, and use these inferred laws or heuristics to further guide search and goal-setting, thereby building a buffer against the curse of dimensionality.  Making the inferred laws themselves functions in a programming library allows the search task to be factored hierarchically, making exploration of unbounded spaces possible.  This advantage is unique to the program-synthesis approach.
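A toy version of the 'compress' half might look like the sketch below (mine, not the authors' version-space machinery): find a subtree that recurs across the wake-phase solutions and promote it to a named library primitive.  The real algorithm evaluates refactorings and weighs description-length savings under a Bayesian criterion rather than raw subtree frequency.

    from collections import Counter

    # Toy sleep-phase compression (my sketch, not the paper's algorithm).
    # Programs are nested tuples over a lisp-like primitive set.
    solutions = [
        ("map", ("lambda", ("*", "x", 2)), ("filter", "even?", "xs")),
        ("sum", ("map", ("lambda", ("*", "x", 2)), "xs")),
        ("map", ("lambda", ("*", "x", 2)), "ys"),
    ]

    def subtrees(expr):
        yield expr
        if isinstance(expr, tuple):
            for child in expr:
                yield from subtrees(child)

    # Find the most frequently repeated subtree across all solutions ...
    counts = Counter(t for p in solutions for t in subtrees(p) if isinstance(t, tuple))
    new_primitive, freq = counts.most_common(1)[0]   # ("lambda", ("*", "x", 2)), 3

    # ... and rewrite the solutions in terms of the newly named primitive.
    def rewrite(expr, target, name):
        if expr == target:
            return name
        if isinstance(expr, tuple):
            return tuple(rewrite(c, target, name) for c in expr)
        return expr

    library_addition = {"double": new_primitive}
    compressed = [rewrite(p, new_primitive, "double") for p in solutions]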

This much is said in the introduction, though perhaps with more clarity.  DreamCoder is an improved, more accessible version of EC2, though the underlying ideas are the same.  It differs in that the method for constructing libraries has been improved through the addition of a powerful version-space method for enumerating and evaluating refactorings of the solutions generated during the wake phase.  (I will admit that I don't much understand the version-space system.)  This version space allows DreamCoder to collapse the search space of refactorings by many orders of magnitude, and seems to be a clear advance.  Furthermore, DreamCoder incorporates a second phase of sleep: "dreaming", hence the moniker.  During dreaming, the library is used to create 'dreams' consisting of combinations of the library primitives, which are then executed with training data as input.  These dreams are then used to train a neural network to predict which library and atomic objects to use in a given context.  Context in this case is where in the parse tree a given object is being inserted (its parent and which argument slot it fills); how the data context is incorporated into this decision is not clear to me (???).
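My mental model of the dreaming phase, as a toy sketch (again mine, not the authors' code): sample random compositions of library primitives, run them on example inputs, and keep the resulting (examples, program) pairs as supervised training data for the recognition network.

    import random

    # Toy dream generator (my caricature).  The real system samples from a
    # probabilistic grammar over the library and handles arbitrary program shapes.
    library = {
        "double":  lambda xs: [2 * x for x in xs],
        "evens":   lambda xs: [x for x in xs if x % 2 == 0],
        "reverse": lambda xs: xs[::-1],
    }

    def sample_dream(inputs, max_len=2):
        """Compose a few random primitives and record what they do to the inputs."""
        names = random.sample(list(library), random.randint(1, max_len))
        outputs = []
        for xs in inputs:
            for name in names:
                xs = library[name](xs)
            outputs.append(xs)
        return list(zip(inputs, outputs)), names   # (task shown to net, target program)

    dreams = [sample_dream([[1, 2, 3], [4, 5]]) for _ in range(1000)]
    # `dreams` is the recognition network's training set: from the examples,
    # predict which primitives were used (and, in the real system, where in the
    # parse tree -- conditioned on the parent node and argument slot).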

This dream- and replay-trained neural network is either a GRU recurrent net with 64 hidden units, or a convolutional network feeding into an RNN.  The final stage is a linear layer with a ReLU (???); again, it is not clear to me how this feeds into the prediction of "which unit to use when".  The authors clearly demonstrate that the network, or the probabilistic context-free grammar that it controls (?), is capable of straightforward optimizations, like breaking symmetries due to commutativity, avoiding adding zero, avoiding multiplying by one, etc.  Beyond this, they demonstrate via an ablation study that the presence of the neural network affords significant algorithmic leverage in all of the problem domains tested.  The network also seems to learn a reasonable representation of the sub-type of task encountered -- but a thorough investigation of how it works, or how it might be made to work better, remains to be done.
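For what it's worth, my guess at the overall shape of that recognition model, as a PyTorch sketch (the dimensions and the exact output head are assumptions on my part, not the authors' code):

    import torch
    import torch.nn as nn

    # Guessed recognition model (my sketch): a GRU with 64 hidden units encodes
    # the serialized input-output examples, and a linear head scores every
    # (parent primitive, argument slot, child primitive) production -- i.e. the
    # "which unit to use in which parse-tree context" prediction.
    class RecognitionNet(nn.Module):
        def __init__(self, vocab_size, n_primitives, max_arity, hidden=64):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, hidden)
            self.gru = nn.GRU(hidden, hidden, batch_first=True)
            self.head = nn.Linear(hidden, n_primitives * max_arity * n_primitives)
            self.shape = (n_primitives, max_arity, n_primitives)

        def forward(self, example_tokens):          # (batch, seq) of token ids
            h = self.embed(example_tokens)
            _, last = self.gru(h)                    # (1, batch, hidden)
            logits = self.head(last.squeeze(0))      # (batch, P * A * P)
            return logits.view(-1, *self.shape)      # (batch, parent, arg, child)

    net = RecognitionNet(vocab_size=50, n_primitives=20, max_arity=3)
    scores = net(torch.randint(0, 50, (8, 30)))      # 8 dreamed tasks, 30 tokens each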

I've spent a little time looking around the code, which is a mix of high-level Python experimental-control code and lower-level OCaml code responsible for running (emulating) the lisp-like DSL, inferring types in its polymorphic type system / reconciling types in evaluated program instances, maintaining the library, and recompressing it using the aforementioned version spaces.  The code, like many experimental things, is clearly a work in progress, with some old or unused code scattered about, glue to run the many experiments and record / analyze the data, and personal notes from the first author for preparing his job talks (! :).  The description in the supplemental materials, which is satisfyingly thorough (if again impenetrable with respect to version spaces), is readily understandable, suggesting that one (presumably the first) author has a clear understanding of the system.  It doesn't appear that much is being hidden or glossed over, which is not the case for all scientific papers.


With the caveat that I don't claim to understand the system completely, there are some clear areas where it could be augmented further.  The 'recognition' or perceptual module, which guides the actual synthesis of candidate programs, could realistically use all of the information available in DreamCoder: full lexical and semantic scope, full input-output specifications, type information, and possibly the run-time bindings of variables when filling holes.  This is motivated by the way humans solve problems, at least as observed by introspection:
  • Examine problem, specification; extract patterns (via perceptual modules)
  • Compare the patterns with an existing library (memory) of compositionally factored 'useful solutions' (this is identical to the library in DreamCoder).
  • Do something like beam search or quasi-stochastic search over the selected useful solutions.  This is the same as DreamCoder, except that human engineers make decisions progressively, at runtime so to speak: you fill not one hole per cycle, but many.  The addition of recursion to DreamCoder, given a wider breadth of input information, could support this functionality.
  • Run the program to observe its input-output behavior, but also observe the inner workings of the program, e.g. its dataflow patterns.  These dataflow patterns are useful to human engineers both when debugging and when learning by inspection what library elements do.  DreamCoder does not really have this facility (see the sketch below).
  • Compare the current program's results to the desired output.  Make a stochastic decision whether to try to fix it, or to try another beam in the search.  Since this would be on a computer, it could be done in parallel (as DreamCoder does); but the ability to 'fix' or change a DUT (device under test) is simply absent from DreamCoder.  As a 'deeply philosophical' aside, this loop itself might be the effect of running a language-of-thought program, as was suggested by pioneers in AI (ref).  The loop itself would then be subject to modification and replacement based on goal-seeking success in the domain of interest, in a deeply satisfying and deeply recursive manner ...
At each stage in the pipeline, the perceptual modules would have access to the relevant variables in the current problem-solving context.  This is modeled on Jacques Pitrat's work.  Humans, of course, are even more flexible than that -- context includes roughly the whole brain, and if anything we are mushy about which level of the hierarchy we are working at.
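As a tiny sketch of the dataflow-observation idea above (my speculation, not anything DreamCoder currently does): wrap each primitive so that execution leaves behind a trace of intermediate values, which a perceptual / repair module could then condition on when deciding whether to fix a candidate program or abandon that beam.

    # Sketch of the dataflow-trace idea (my speculation, not a DreamCoder feature):
    # wrap primitives so execution records (primitive, inputs, output) triples.
    trace = []

    def traced(name, fn):
        def wrapper(*args):
            out = fn(*args)
            trace.append((name, args, out))
            return out
        return wrapper

    double = traced("double", lambda xs: [2 * x for x in xs])
    evens  = traced("evens",  lambda xs: [x for x in xs if x % 2 == 0])

    result = double(evens([1, 2, 3, 4]))
    # trace == [("evens", ([1, 2, 3, 4],), [2, 4]), ("double", ([2, 4],), [4, 8])]
    # A module with access to `trace` sees how the output was produced, which is
    # exactly what a human uses when debugging or when learning by inspection
    # what a library element does.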

Critical to making this work is to have, as I wrote in my notes many years ago, a 'self-compressing and factorizing memory'.  The version-space magic + library could be considered a working example of this.  In the realm of ANNs, per recent OpenAI results with CLIP and Dall-E, really big transformers also seem to have strong compositional abilities, with the caveat that they need to be trained on large swaths of the whole web.  (This wouldn't be an issue here, as DreamCoder generates a lot of its own training data via dreams.)  Despite the data inefficiency of DNNs / transformers, they should be sufficient for making something in the spirit of the above, given a lot of compute, at least until more efficient models are available (which they should be shortly; see AlphaZero vs MuZero).