You are not authenticated, login.
text: sort by
tags: modified
type: chronology
{889} is owned by tlh24.{868} is owned by tlh24.{797} is owned by tlh24.{851} is owned by tlh24.{845} is owned by tlh24.{426} is owned by tlh24.
hide / / print
ref: -0 tags: ocaml application functional programming date: 10-11-2022 21:36 gmt revision:2 [1] [0] [head]


From this I learned that in ocaml you can return not just functions (e.g. currying) but appliations of yet-to-be named functions.

let sum f = f 0 ;;
let arg a b c = c ( b + a ) ;;
let z a = a ;;


sum (arg 1) ;; 

is well-typed as (int -> `a) -> `a = <fun> e.g. an application of a function that converts int to `a. Think of it as the application of Xa to argument ( 0 + 1 ), where Xa is the argument (per type signature). Zero is supplied by the definition of 'sum'.

 sum (arg 1) (arg 2);; 

can be parsed as

(sum (arg 1)) (arg 2) ;; 

'(arg 2)' outputs an application of an int & a yet-to be determined function to 'a,

E.g. it's typed as int -> (int -> `a) -> `a = <fun>. So, you can call it Xa passed to above.

Or, Xa = Xb( ( 0 + 1 ) + 2)

where, again, Xb is a yet-to-be defined function that is supplied as an argument.

Therefore, you can collapse the whole chain with the identity function z. But, of course, it could be anything else -- square root perhaps for MSE?

All very clever.

hide / / print
ref: work-0 tags: distilling free-form natural laws from experimental data Schmidt Cornell automatic programming genetic algorithms date: 12-30-2021 05:11 gmt revision:7 [6] [5] [4] [3] [2] [1] [head]

Distilling free-form natural laws from experimental data

  • The critical step was to use the full set of all pairs of partial derivatives ( δx/δy\delta x / \delta y ) to evaluate the search for invariants.
  • The selection of which partial derivatives are held to be independent / which variables are dependent is a bit of a trick too -- see the supplemental information.
    • Even yet, with a 4D data set the search for natural laws took ~ 30 hours.
  • This was via a genetic algorithm, distributed among 'islands' on different CPUs, with mutation and single-point crossover.
  • Not sure what the IL is, but it appears to be floating-point assembly.
  • Timeseries data is smoothed with Loess smoothing, which fits a polynomial to the data, and hence allows for smoother / more analytic derivative calculation.
    • Then again, how long did it take humans to figure out these invariants? (Went about it in a decidedly different way..)
    • Further, how long did it take for biology to discover similar 'design equations'?
      • The same algorithm has been applied to biological data - a metabolic pathway - with some success pub 2011.
      • Of course evolution had to explore a much larger space - proteins and regulatory pathways, not simpler mathematical expressions / linkages.

Since his Phd, Michael Schmidt has gone on to found Nutonian, which produced Eurequa software, apparently without dramatic new features other than being able to use the cloud for equation search. (Probably he improved many other detailed facets of the software..). Nutonian received $4M in seed funding, according to Crunchbase.

In 2017, Nutonian was acquired by Data Robot (for an undisclosed amount), where Michael has worked since, rising to the title of CTO.

Always interesting to follow up on the authors of these classic papers!

hide / / print
ref: -0 tags: inductive logic programming deepmind formal propositions prolog date: 11-21-2020 04:07 gmt revision:0 [head]

Learning Explanatory Rules from Noisy Data

  • From a dense background of inductive logic programming (ILP): given a set of statements, and rules for transformation and substitution, generate clauses that satisfy a set of 'background knowledge'.
  • Programs like Metagol can do this using search and simplify logic built into Prolog.
    • Actually kinda surprising how very dense this program is -- only 330 lines!
  • This task can be transformed into a SAT problem via rules of logic, for which there are many fast solvers.
  • The trick here (instead) is that a neural network is used to turn 'on' or 'off' clauses that fit the background knowledge
    • BK is typically very small, a few examples, consistent with the small size of the learned networks.
  • These weight matrices are represented as the outer product of composed or combined clauses, which makes the weight matrix very large!
  • They then do gradient descent, while passing the cross-entropy errors through nonlinearities (including clauses themselves? I think this is how recursion is handled.) to update the weights.
    • Hence, SGD is used as a means of heuristic search.
  • Compare this to Metagol, which is brittle to any noise in the input; unsurprisingly, due to SGD, this is much more robust.
  • Way too many words and symbols in this paper for what it seems to be doing. Just seems to be obfuscating the work (which is perfectly good). Again: Metagol is only 330 lines!

hide / / print
ref: -0 tags: automatic programming inductive functional igor date: 07-29-2014 02:07 gmt revision:0 [head]

Inductive Rule Learning on the Knowledge Level.

  • 2011.
  • v2 of their IGOR inductive-synthesis program.
  • Quote: The general idea of learning domain specific problem solving strategies is that first some small sample problems are solved by means of some planning or problem solving algorithm and that then a set of generalized rules are learned from this sample experience. This set of rules represents the competence to solve arbitrary problems in this domain.
  • My take is that, rather than using heuristic search to discover programs by testing specifications, they use memories of the output to select programs directly (?)
    • This is allegedly a compromise between the generate-and-test and analytic strategies.
  • Description is couched in CS-lingo which I am inexperienced in, and is perhaps too high-level, a sin I too am at times guilty of.
  • It seems like a good idea, though the examples are rather unimpressive as compared to MagicHaskeller.

hide / / print
ref: -0 tags: automatic programming date: 05-18-2014 07:02 gmt revision:1 [0] [head]


  • http://www.lighttable.com/2014/05/16/pain-we-forgot/ -- approaching the problem by making better tools / interfaces.
  • http://pchiusano.blogspot.com/2013/05/the-future-of-software-end-of-apps-and.html -- approaching the problem by making the web one giant functional & typed language, distributed over all clients.
    • Strongly advocates the utility and need for typed languages to provide contexts for actions.
    • Interesting, but altogether dreamy and unlikely.
  • Bloom language
    • "the standard data structures in Bloom are disorderly collections, rather than scalar variables and structures. These data structures reflect the realities of non-deterministic ordering inherent in distributed systems. Bloom provides simple, familiar syntax for manipulating these structures. In the Bud prototype, much of this syntax comes straight from Ruby, with a taste of MapReduce and SQL." perfect.
    • From Berkeley.
    • Based on Daedalus data language, which specifies temporal ordering?
      • "The basic idea is that Time (meaning both the sequentiality of program steps in a single “thread”, and coordination of steps across threads/machines) is needed for only one purpose: to prevent multiple possible states from co-occurring. I.e. the purpose of time is to seal fate at each instantaneous moment." src

hide / / print
ref: -0 tags: Moshe looks automatic programming google tech talk links date: 11-07-2012 07:38 gmt revision:3 [2] [1] [0] [head]

List of links from Moshe Looks google tech talk:

hide / / print
ref: -0 tags: matlab STL boost programming C++ date: 11-06-2012 21:58 gmt revision:2 [1] [0] [head]

Recently decided to move myopen's sorting program from sqlite-based persistent state to matlab persistent state, for better interoperability with lab rigs & easier introspection. For this I wrote a class for serializing STL / boost based one, two, and three dimensional resizeable containers to and from matlab files.

The code has been tested (albeit not extensively), and therefore may be of use to someone else, even if as an example. See: http://code.google.com/p/myopen/source/browse/trunk/common_host/matStor.cpp

As you can see from the header (below), the interface is nice and concise!

#ifndef __MATSTOR_H__
#define __MATSTOR_H__

class MatStor{
        typedef boost::multi_array<float, 3> array3; 
        typedef boost::multi_array<float, 2> array2; 
        std::string m_name; //name of the file.
        std::map<std::string, std::vector<float> > m_dat1; //one dimensional
        std::map<std::string, array2> m_dat2; //two dimensions
        std::map<std::string, array3> m_dat3; //three!
        MatStor(const char* fname); 
        void save(); 
        void setValue(int ch, const char* name, float val);
        void setValue2(int ch, int un, const char* name, float val);
        void setValue3(int ch, int un, const char* name, float* val, int siz);
        float getValue(int ch, const char* name, float def);
        float getValue2(int ch, int un, const char* name, float def);
        void getValue3(int ch, int un, const char* name, float* val, int siz);

hide / / print
ref: -0 tags: loops feedback arcs video game programming date: 04-30-2012 15:12 gmt revision:0 [head]

I highly agree with this philosophy / this deconstruction of the flow of information in human structures: http://www.lostgarden.com/2012/04/loops-and-arcs.html

On criticism as a meta-arc game:

"In the past I've discussed criticism as a game that attempts to revisit an arc repeatedly and embellish it with additional meaning. The game is to generate essays superficially based on some piece of existing art. In turn, other players generate additional essays based off the first essays. This acts as both a referee mechanism and judge. Score is accumulated via reference counts and by rising through an organization hierarchy. It is a deliciously political game of wit that is both impenetrable to outsiders and nearly independent of the actual source arcs. Here creating an arc becomes a move in the larger game. "

hide / / print
ref: -0 tags: automatic programming synthesis loops examples date: 10-03-2011 22:28 gmt revision:1 [0] [head]

Open letter proposing some ideas on how to automate programming: simulate a human! Rather from a neuro background, and rather sketchy (as in vague, not as in the present slang usage).


hide / / print
ref: -0 tags: automatic programming Greg date: 01-11-2011 07:00 gmt revision:0 [head]

The goal is to make a system that is capable of automatic debugging / design. It seems to me that a good bit of the work of designing and debugging such a program - which I imagine to be basically a set of heuristics triggered by certain situations - is mechanical, and hence could be automated.

First, debugging: Say your program doesn't compile. You look at the error message + location, look at the code, and change variables either there or in causally-connected code to fix the error. The knowledge of what to change basically involves a forward model of how the computer runs the program and experiential knowledge of what worked in the past. The former need not be encoded - the computer can simply run slightly different versions of the program and see what happens itself (aka 'taking the partial derivatives of a program'). The latter can be stored in a well-designed database.

Then your program crashes. Usually, you look at the code and run it on a model of the computer in your mind, checking at every point if the given data-path and code-path to see if it diverges from acceptable behavior. Maybe you instrument the code with printfs. Then, you look at the causally-connected code to see what can be changed to alleviated the error condition, or to make the code/datapath align with well-known programming patterns. Again, the computer does not need a model of itself - the computer can simply running the code as an interpreted language; everything is then instrumented. The well-known patterns can be entered by a human, or learned via experimentation (analogous to 'school' and 'hacking' to a person).

Now your program runs and compiles. But, it doesn't do what you want it to do - you see in one particular case there is an error. So, you look at the program, run it in your mind, and look at all the variables and codepaths that are causally related to producing that case, imagining what changes to each will do to the error. Possibly this involves changing the AST, adding an if-statement, etc. You poke around; eventually something works, if partially. The computer can do the same thing, via the well-instrumented interpreted language.

Eventually this produces spaghetti code (unless you're really smart, and predicted all this case-programming), and small tweaks break more things than they fix. Time to refactor - apply well-known invariant code transformations, or more generally look for simpler codepaths with the same effective datapaths. Perhaps you avoided it by using good initial programming patterns, found mostly through analogy with the real world (object-orientation, MVC etc). Again this is where a good memory of programming-patterns will serve well.

That's the gist. There is much more - namely, the importance of finding transforms that render problems and sub-problems linear, and the practical consideration that the end goal in much computer programming is itself found through iteration - but this email is long enough. What I've outlined probably doesn't look close to GA / EA, but in practice will most likely involve intense and random experimentation within each of the steps above, something that GA does a lot of. What should make it faster is that individual random experimentation is more constrained - smaller space to explore - and when possible memory can replace expending time and power on experimentation. To look at it another way, the project should move a feedback loop (write compile observe) from the human to the computer; this is OK as all the needed info is within the computer already; no outside interaction is needed (unlike, say physical evolution).

hide / / print
ref: ai-0 tags: automatic programming journal notes date: 12-31-2010 05:24 gmt revision:4 [3] [2] [1] [0] [head]

This evening, on the drive back from wacky (and difficult) Russian-style yoga, I got a chance to explain to my brother what I really want to be working on, the thing that really tickles my fancy. My brother and I, so much as genetic commonality and common upbringing seem to effect, have very similar styles of thinking, which made explaining things a bit easier. For you, dear readier, I'll expand a bit.

I'd like to write a program that writes other programs, iteratively, given some objective function / problem statement / environment in which to interact. The present concrete goal is to have a said program make a program that is able to lay out PCBs with quality similar to that of humans. The overarching framework that I'm planning on using is genetic/evolutionary algorithms (the latter does not have crossover, fyi), but no one has applied GA to the problem in this way: most people use GA to solve a particular instance of a problem. Rubbish, i say, this is energy wasteful!

Rubbish, you may return: the stated problem requires a degree of generalization and disconnect from the 'real world' (the PCB) that makes GAs extremely unlikely to come up with any solutions. Expressed another way: the space to be explored is too large (program begets program begets solution). This is a very sensible critique; there is no way in hell a GA can solve this problem. They are notably pathetic at exploring space in a energy-efficient way (to conclude a paragraph again with energy... ).

There are known solutions for this: memory -- cache the results, in terms of algorithm & behavior, of all 'hypotheses' or individuals tried out by a GA. This is what humans do -- they remember the results of their experiment, and substitute the result rather than running a test again. But humans do something far more sophisticated and interesting than just memory - they engineer systems; engineering is an iterative process that often goes down wrong design paths, yet it nonetheless delivers awesome things like Saabs and such.

As I described to K--, engineering is not magic and can be (has been?) described mechanistically. First of all, most engineering artifacts start off from established, well-characterized components, aggregated through the panoply of history. Some of these components describe how other components are put together, things that are either learned in school or by taking things apart. Every engineer, ala Newton, stands on the vast shoulders of the designers before; hence any program must also have these shoulders available. The components are assembled into a system in a seemingly ad-hoc and iterative procedure: sometimes you don't know what you want, so you play with the parts sorta randomly, and see what interesting stuff comes out. Other times you know damn well what you / your boss / the evil warlord who holds you captive wants. Both modes are interesting (and the dichotomy is artificial), but the latter is more computer-like, hence to be modeled.

Often the full details of the objective function or desired goal is very unclear in the hands of the boss / evil warlord (1), despite how reluctant they may be to admit this. Such an effect is well documented in Fred Brooks' book, __The Design of Design__. Likewise, how to get to a solution is unclear in the mind of an engineer, so he/she shuffles things around in the mind (2),

  1. looking for components that deliver particular desired features (e.g. in an electronic system, gain makes me think of an op-amp)
  2. looking for components that remove undesirable features (e.g. a recent noise problem on my wireless headstage made me think of a adaptive decorrelating filter I made once)
  3. looking for transforms that make the problem solvable in a linear space, something that Moshe Looks calls knob-twiddling.
    1. this is from both sides -- transforms that convert the problem or the nascent solution.
    2. An example would be the FFT. This makes it easy to see spectral features.
    3. Another example, used even more recently, is coordinate transforms - it makes thinks like line-line intersection much easier.
    4. When this doesn't work, you can do far more powerful automatic coordinate transform - math, calculus. This is ultimately what I needed when figuring out the shortest line segment between a line segment and a ellipse. Don't ask.

This search is applied iteratively, apparently a good bit of the time subconsciously. A component exists in our mind as a predictive model of how the thing behaves, so we simulate it on input, observe output, and check to see if anything there is correlated / decorrelated with target features. (One would imagine that our general purpose modeling ability grew from needing to model and predict the world and all the yummy food/dangerous animals/warlords in it). The bigger the number of internal models in the engineers mind, the bigger the engineers passion for the project, the more components can be simulated and selected for. Eventually progress is made, and a new subproblem is attacked in the same way, with a shorter path and different input/output to model/regress against.

This is very non-magical, which may appall the more intuitive designers among us. It is also a real issue, because it doesn't (or poorly) explains really interesting engineering: e.g. the creation of the Fourier transform, the creation of the expectation-maximization algorithm, all the statistical and mathematical hardware that lends beauty and power to our design lives. When humans create these things, they are at the height of their creative ability, and thus it's probably a bit ridiculous to propose having a computer program do the same. That does not prevent me from poking at the mystery here, though: perhaps it is something akin to random component assembly (and these must be well known components (highly accurate, fast internal models); most all innovations were done by people exceptionally familiar with their territory), with verification against similarly intimately known data (hence, all things in memory - fast 'iteration cycles'). This is not dissimilar to evolutionary approaches to deriving laws. A Cornell physicist / computer scientist was able to generate natural laws via a calculus-infused GA {842}, and other programs were able to derive Copernicus' laws from planetary data. Most interesting scientific formulae are short, which makes them accessible to GAs (and also aesthetically pleasurable, and/or memelike, but hey!). In contrast engineering has many important design patterns that are borrowed by analogy from real-world phenomena, such as the watermark algorithm, sorting, simulated annealing, the MVC framework, object-oriented programming, WIMP interface, verb/noun interface, programming language, even GAs themselves! Douglas Hofstadter has much more to say about analogies, so I defer to him here.

Irregardless, as K-- pointed out, without some model for creativity (even one as soulless as the one above), any proposed program-creating program will never come up with anything really new. To use a real-world analogy, at his work the boss is extremely crazy - namely, he mistook a circuit breaker for an elevator (in a one-story factory!). But, this boss also comes up with interminable and enthusiastic ideas, which he throws against the wall of his underlings a few dozen times a day. Usually these ideas are crap, but sometimes they are really good, and they stick. According to K--, the way his mind works is basically opaque and illogical (I've met a few of these myself), yet he performs an essential job in the company - he spontaneously creates new ideas. Without such a boss, he claimed, the creations of a program-creating-program will impoverished.

And perhaps hence this should be the first step. Tonight I also learned that at the company (a large medical devices firm) they try to start projects at the most difficult step. That way, projects that are unlikely to succeed are killed as soon as possible. The alternate strategy, which I have previously followed, is to start with the easiest things first, so you get some motivation to continue. Hmm...

The quandary to shuffle your internal models over tonight then, dear readers, is this: is creativity actually (or accurately modeled by) random component-combination creation (boss), followed by a selection/rejection (internal auditing, or colleague auditing)? (3)

  • (1) Are there any beneficent warlords?
  • (2) Yet: as I was educated in a good postmodernist tradition, this set of steps ('cultural software') is not the only way to design. I'm just using it since, well, best to start with something that already works.
  • (3) If anyone reads this and wants to comment, just edit this. Perhaps you want to draw a horizontal line and write comments below it? Anyway, writing is active thinking, so thanks for helping me think.

hide / / print
ref: -0 tags: automatic programming Herbert Simon date: 12-29-2010 04:43 gmt revision:1 [0] [head]

Whether Software Engineering Needs to Be Artificially Intelligent By Herbert Simon, author of __The Sciences of the Artificial__

  • He talks about many tasks & goals that, at least in 2010, (25 years later) seem relatively obvious:
    • There is aboundary between man and machine, and the automatic programming can make steps by moving the boundary toward peeople: e.g. incorporating shortcuts / macros whatever for the menial trivial tasks of programming.
      • I would imagine that garbage collection, automatic type inference to be two of these menial tasks. Alternately, you can look at the automatic GUI generators like Glade and MFC. But why progress so slow on this part? We haven't automated that much..
    • Must have a better language for formalizing desired program state, desired results. I agree! Programming languages still conflate the tasks of describing the data structures with the task of describing the desired end state..
      • He talks about writing desires, objectives in natural language. I don't find that all to likely..
    • Need a database of data representations from which to draw, code for the automatic programmer to copy into new programs.
      • Have thought about this, and storing what program snippets do seems very ill defined! We can remember in our vague memory, but perhaps that's because we choose relatively few building blocks for constructing larger structures (and these structures, in turn, are relatively few...)
  • A good example of moving the boundary between man and machine is the telephone exchange system. Initially, it was only operated by humans; later, it's mostly machines - it's become hard to find a human operator on the other end. The steps to this have been gradual, possibly due to the required effort, possibly due to institutional intertia.
    • In design "the informality gradually gets squeezed out".
  • The author tells that medical diagnosis, e.g. at the hands of caduceus, is well functional; looking back & at the present absence of expert systems in medicine, this statement seems overly optimistic.
    • That said, i do agree heartily with that a useful expert system "must be designed in an interactive mode for progressive modification of the sort that I have been describing"
  • Random: While moves in chess can be searched by traversing a tree, backgammon is not susceptible to the same approach. Hans Berliner, apparently in the 1980's devised a production program that was able to beat the world champion backgammon player. This was at least 10 years before Tesauro!

hide / / print
ref: -0 tags: automatic programming myths prospects date: 12-29-2010 04:42 gmt revision:3 [2] [1] [0] [head]

Automatic Programming: Myths and Prospects by Charles Rich and Richard C Waters, 1988

  • "End user oriented automatic programming systems must be domain experts. In particular, it is not possible to describe something briefly unless the hearer knows as much about the domain as the speaker does and can therefore understand the words and fill in what was left out.
  • Authors enumerate three general classes of automatic programming:
    • Bottom up. Eg. writing in very high-level programming languages; as in {853}, tries to push the boundary of human-machine toward the human.
    • Narrow domain. Eg. expert systems in a limited domain.
    • Assistant. Eg. IDEs, documentation: possibly the most popular approach.
  • Back in the day (1958) automatic programming consisted of what we now call assemblers and compilers. hah!
  • "The problem with reqirements (and contracts) is that they cannot be complete." (There is implicit, unstated information, as well as straight unspecified information).
    • Applied to practical things which must interact with people - they give the example of an ATM - i think so much is in the head of the human that automatic programming / using some potentially stochastic means of generating code makes no sense. Better to use templates and libraries! I'm more interested in the automatic creation of heuristic systems.
  • Program design is not serial: it involves continuous interaction between the programmer and client, which serves to communicate the client's desire and DSK to the programmer.
    • An automatic progarmming system must pay close attention to the medium that supports the dialog between human and program. The meduim must be, effectively, a powerful tool. (think of the palm..)
  • "Most of the productivity improvements introduced by automatic programming will almost certainly be used to atack enormous programs, rather than huge programs". So, we'll forever be in need of more powerful programming tools!
  • At some point, a high-level programming language description of a problem & it's algorithm of solution is as short or shorter than the English prose description of what the programm is/was to do :: what other degrees of compression can be extracted?
    • That said, "An interesting research direction is the design of artificial languages that intentionally allow informality" (ala English).
    • These will most likely not be natural languages, though: look at schematics, mathematical formula. These are really DSLs...
  • Regarding example-based automatic programming (despite the fact that humans communicate through examples and analogies extensively, just look at this blog (for example!)): "Unfortunately, exept for toy problems, no one has been able to make an example based automatic programming system work, and there is reason to believe this failure is fundamental."
    • Why? No matter how many examples are provided, there is no way to gurantee generalization. Oh, but I should contest this!
  • "The problem of sythesizing a program satisfying a given specification is formally equivalent to finding a constructive proof of the specifications satisfiability. This is the idea behind deductive logic based automatic programmers"
  • "Transformational methods in some form are certain to be part of all future automatic programming systems."
  • Programers think in cliches - and in fact, this is a useful technique for solving many problems by inspection. Should be possible to embed cliches in an expert-auto-programming system.

hide / / print
ref: -0 tags: meta learning Artificial intelligence competent evolutionary programming Moshe Looks MOSES date: 08-07-2010 16:30 gmt revision:6 [5] [4] [3] [2] [1] [0] [head]

Competent Program Evolution

  • An excellent start, excellent good description + meta-description / review of existing literature.
  • He thinks about things in a slightly different way - separates what I call solutions and objective functions "post- and pre-representational levels" (respectively).
  • The thesis focuses on post-representational search/optimization, not pre-representational (though, I believe that both should meet in the middle - eg. pre-representational levels/ objective functions tuned iteratively during post-representational solution creation. This is what a human would do!)
  • The primary difficulty in competent program evolution is the intense non-decomposability of programs: every variable, constant, branch effects the execution of every other little bit.
  • Competent program creation is possible - humans create programs significantly shorter than lookup tables - hence it should be possible to make a program to do the same job.
  • One solution to the problem is representation - formulate the program creation as a set of 'knobs' that can be twiddled (here he means both gradient-descent partial-derivative optimization and simplex or heuristic one-dimensional probabilistic search, of which there are many good algorithms.)
  • pp 27: outline of his MOSES program. Read it for yourself, but looks like:
  • The representation step above "explicitly addresses the underlying (semantic) structure of program space independently of the search for any kind of modularity or problem decomposition."
    • In MOSES, optimization does not operate directly on program space, but rather on subspaces defined by the representation-building process. These subspaces may be considered as being defined by templates assigning values to some of the underlying dimensions (e.g., they restrict the size and shape of any resulting trees).
  • In chapter 3 he examines the properties of the boolean programming space, which is claimed to be a good model of larger/more complicated programming spaces in that:
    • Simpler functions are much more heavily sampled - e.g. he generated 1e6 samples of 100-term boolean functions, then reduced them to minimal form using standard operators. The vast majority of the resultant minimum length (compressed) functions were simple - tautologies or of a few terms.
    • A corollary is that simply increasing syntactic sample length is insufficient for increasing program behavioral complexity / variety.
      • Actually, as random program length increases, the percentage with interesting behaviors decreases due to the structure of the minimum length function distribution.
  • Also tests random perturbations to large boolean formulae (variable replacement/removal, operator swapping) - ~90% of these do nothing.
    • These randomly perturbed programs show a similar structure to above: most of them have very similar behavior to their neighbors; only a few have unique behaviors. makes sense.
    • Run the other way: "syntactic space of large programs is nearly uniform with respect to semantic distance." Semantically similar (boolean) programs are not grouped together.
  • Results somehow seem a let-down: the program does not scale to even moderately large problem spaces. No loops, only functions with conditional evalutation - Jacques Pitrat's results are far more impressive. {815}
    • Seems that, still, there were a lot of meta-knobs to tweak in each implementation. Perhaps this is always the case?
  • My thought: perhaps you can run the optimization not on program representations, but rather program codepaths. He claims that one problem is that behavior is loosely or at worst chaotically related to program structure - which is true - hence optimization on the program itself is very difficult. This is why Moshe runs optimization on the 'knobs' of a representational structure.

hide / / print
ref: work-0 tags: functional programming compilation ocaml date: 08-24-2009 14:33 gmt revision:0 [head]

The implementation of functional programming languages - book!

hide / / print
ref: work-0 tags: ocaml mysql programming functional date: 07-03-2009 19:16 gmt revision:2 [1] [0] [head]

Foe my work I store a lot of analyzed data in SQL databases. In one of these, I have stored the anatomical target that the data was recorded from - namely, STN or VIM thalamus. After updating the analysis programs, I needed to copy the anatomical target data over to the new SQL tables. Where perl may have been my previous go-to language for this task, I've had enuogh of its strange quiks, hence decided to try it in Ruby (worked, but was not so elegant, as I don't actually know Ruby!) and then Ocaml.

#use "topfind"
#require "mysql"

(* this function takes a query and a function that converts entries 
in a row to Ocaml tuples *)
let read_table db query rowfunc =
	let r = Mysql.exec db query in
	let col = Mysql.column r in
	let rec loop = function
		| None      -> []
		| Some x    -> rowfunc col x :: loop (Mysql.fetch r)
	loop (Mysql.fetch r)

let _ = 
	let db = Mysql.quick_connect ~host:"crispy" ~database:"turner" ~password:"" ~user:"" () in
	let nn = Mysql.not_null in
	(* this function builds a table of files (recording sessions) from a given target, then 
	uses the mysql UPDATE command to propagate to the new SQL database. *)
	let propagate targ = 
		let t = read_table db 
			("SELECT file, COUNT(file) FROM `xcor2` WHERE target='"^targ^"' GROUP BY file")
			(fun col row -> (
				nn Mysql.str2ml (col ~key:"file" ~row), 
				nn Mysql.int2ml (col ~key:"COUNT(file)" ~row) )
		List.iter (fun (fname,_) -> 
			let query = "UPDATE `xcor3` SET `target`='"^targ^
				"' WHERE STRCMP(`file`,'"^fname^"')=0" in
			print_endline query ;
			ignore( Mysql.exec db query )
		) t ;
	propagate "STN" ; 
	propagate "VIM" ; 
	propagate "CTX" ; 
	Mysql.disconnect db ;;

Interacting with MySQL is quite easy with Ocaml - though the type system adds a certain overhead, it's not too bad.

hide / / print
ref: notes-0 tags: programming excellence norvig 10 years date: 04-07-2009 20:26 gmt revision:0 [head]

Teach yourself programming in 10 years

  • points out that, in order to be excellent at any difficult skill/art, you must practice 10 years or 10,000 hours, and this practice must be focused and deliberate.
    • quote: "have shown it takes about ten years to develop expertise in any of a wide variety of areas, including chess playing, music composition, telegraph operation, painting, piano playing, swimming, tennis, and research in neuropsychology and topology"
    • possibly this is partially due to competition - most other people drop out after 10 years!
    • Or this is due to the fact that, for general purpose behaviors, we are really no better than the present gradient descent & reinforcement learning algorithms which require repeated presentation of patterns and behaviors. Where humans achieve sub-gradient/RL performance is where evolution has supplied us with hardware or 'prior assumptions' to bias for a correct solution / correct solution space. These prior assumptions are (part of) that which the make study brain interesting!
  • "Life is short, [the] craft long, opportunity fleeting, experiment treacherous, judgment difficult." -- Hippocrates.

hide / / print
ref: bookmark-0 tags: code laws lawyers programming date: 11-28-2008 04:54 gmt revision:0 [head]

http://www.linux-mag.com/id/7187 -- has a very interesting and very well applied analogy between programs and laws. I am inclined to believe that they really are not all that different; legalese is structured and convoluted the way it is because it is, in effect, a programming language for laws, hence must be precise and unambiguous. Furthermore, the article is well written and evidences structured and balanced thought (via appropriate references to the real world). And he uses Debian ;-)

hide / / print
ref: -0 tags: differential dynamic programming machine learning date: 09-24-2008 23:39 gmt revision:2 [1] [0] [head]

excellent bibliography.

  • Jacobson, D. and Mayne, D., Differential Dynamic Programming, Elsevier, New York, 1970. in Perkins library.
  • Bertsekas, Dimitri P. Dynamic programming and optimal control Ford Library.
  • Receding horizon differential dynamic programming
    • good for high-dimensional problems. for this paper, they demonstrate control of a swimming robot.
    • webpage, including a animated gif of the swimmer
    • above is a quote from the conclusions -- very interesting!

hide / / print
ref: bookmark-0 tags: C lisp programming unix worse better XML JSON date: 04-29-2008 18:21 gmt revision:1 [0] [head]

hide / / print
ref: bookmark-0 tags: Ocaml python paradox programming finance date: 03-10-2008 21:29 gmt revision:2 [1] [0] [head]

  • this trading firm used OCaml, apparently to exclusion: http://www.janestcapital.com/yaron_minsky-cufp_2006.pdf
  • they also reference the python paradox. interesting, I'll have to check into Ocaml.
  • or, rather, Lisp. this article is quite convincing!
    • quote: If you're trying to solve a hard problem with a language that's too low-level, you reach a point where there is just too much to keep in your head at once.
    • quote: Any sufficiently complicated C or Fortran program contains an ad hoc informally-specified bug-ridden slow implementation of half of Common Lisp.
      • well, yes, this happened a bit in BMI with variables that were indexed by name :(
  • also see this excellent, extensive article.

hide / / print
ref: bookmark-0 tags: optimization function search matlab linear nonlinear programming date: 08-09-2007 02:21 gmt revision:0 [head]


very nice collection of links!!

hide / / print
ref: math notes-0 tags: linear_algebra BLAS FFT library programming C++ matrix date: 02-21-2007 15:48 gmt revision:1 [0] [head]

Newmat11 -- nice, elegant BLAS / FFT and matrix library, with plenty of syntactic sugar.

hide / / print
ref: bookmark-0 tags: Bayes Baysian_networks probability probabalistic_networks Kalman ICA PCA HMM Dynamic_programming inference learning date: 0-0-2006 0:0 revision:0 [head]

http://www.cs.ubc.ca/~murphyk/Bayes/bnintro.html very, very good! many references, well explained too.

hide / / print
ref: bookmark-0 tags: OpenGL shaders vertex pixel fragment GLSL Linux programming C++ date: 0-0-2006 0:0 revision:0 [head]


easy to compile on my debian system - all the development libraries had debian packages!

also of interest :