you are not logged in, login. new entry
text: sort by
tags: modified
type: chronology
hide / edit[1] / print
ref: -0 tags: computational biology evolution metabolic networks andreas wagner genotype phenotype network date: 06-12-2017 19:35 gmt revision:1 [0] [head]

Evolutionary Plasticity and Innovations in Complex Metabolic Reaction Networks

  • ‘’João F. Matias Rodrigues, Andreas Wagner ‘’
  • Our observations suggest that the robustness of the Escherichia coli metabolic network to mutations is typical of networks with the same phenotype.
  • We demonstrate that networks with the same phenotype form large sets that can be traversed through single mutations, and that single mutations of different genotypes with the same phenotype can yield very different novel phenotypes
  • Entirely computational study.
    • Examines what is possible given known metabolic building-blocks.
  • Methodology: collated a list of all metabolic reactions in E. Coli (726 reactions, excluding 205 transport reactions) out of 5870 possible reactions.
    • Then ran random-walk mutation experiments to see where the genotype + phenotype could move. Each point in the genotype had to be viable on either a rich (many carbon source) or minimal (glucose) growth medium.
    • Viability was determined by Flux-balance analysis (FBA).
      • In our work we use a set of biochemical precursors from E. coli 47-49 as the set of required compounds a network needs to synthesize, ‘’’by using linear programming to optimize the flux through a specific objective function’’’, in this case the reaction representing the production of biomass precursors we are able to know if a specific metabolic network is able to synthesize the precursors or not.
      • Used Coin-OR and Ilog to optimize the metabolic concentrations (I think?) per given network.
    • This included the ability to synthesize all required precursor biomolecules; see supplementary information.
    • ‘’’“Viable” is highly permissive -- non-zero biomolecule concentration using FBA and linear programming. ‘’’
    • Genomic distances = hamming distance between binary vectors, where 1 = enzyme / reaction possible; 0 = mutated off; 0 = identical genotype, 1 = completely different genotype.
  • Between pairs of viable genetic-metabolic networks, only a minority (30 - 40%) of reactions are essential,
    • Which naturally increases with increasing carbon source diversity:
    • When they go back an examine networks that can sustain life on any of (up to) 60 carbon sources, and again measure the distance from the original E. Coli genome, they find this added robustness does not significantly constrain network architecture.

Summary thoughts: This is a highly interesting study, insofar that the authors show substantial support for their hypotheses that phenotypes can be explored through random-walk non-lethal mutations of the genotype, and this is somewhat invariant to the source of carbon for known biochemical reactions. What gives me pause is the use of linear programming / optimization when setting the relative concentrations of biomolecules, and the permissive criteria for accepting these networks; real life (I would imagine) is far more constrained. Relative and absolute concentrations matter.

Still, the study does reflect some robustness. I suggest that a good control would be to ‘fuzz’ the list of available reactions based on statistical criteria, and see if the results still hold. Then, go back and make the reactions un-biological or less networked, and see if this destroys the measured degrees of robustness.

hide / edit[1] / print
ref: bookmark-0 tags: intrinsic evolution FPGA GPU optimization algorithm genetic date: 01-27-2013 22:27 gmt revision:1 [0] [head]


  • http://evolutioninmaterio.com/ - using FPGAs in intrinsic evolution, e.g. the device is actually programmed and tested.
  • - Adrian Thompson's homepage. There are many PDFs of his work on his homepage.
  • Parallel genetic algorithms on programmable graphics hardware
    • basically deals with optimizing mutation and fitness evaluation using the parallel arcitecture of a GPU: larger populations can be evaluated at one time.
    • does not concern the intrinsic evolution of algorithms to the GPU, as in the Adrian's work.
    • uses a linear conguent generator to produce random numbers.
    • used a really simple problem: Colville minimization problem which need only search through a four-dimensional space.
  • Cellular genetic algoritms and local search for 3-SAT problem on Graphic Hardware
    • concerning SAT: satisfiabillity technique: " many practical problems, such as graph coloring, job-shop scheduling, and real-world scheduling can be represented as a SAT problem.
    • SAT-3 refers to the length of the search clause. length 3 is apparently very hard..
    • they use a combination of greedy search (flip the bit that increases the fitness the largest ammount) and random-walk via point mutations to keep the algorithm away from local minima.
    • also use cellular genetic algorithm which works better on a GPU): select the optimal neignbor, not global, individual.
    • only used a GeForce 6200 gpu, but it was still 5x faster than a p4 2.4ghz.
  • Evolution of a robot controller using cartesian genetic programming
    • cartesian programming has many advantages over traditional tree based methods - e.g. blot-free evolution & faster evolution through neutral search.
    • cartesian programming is characterized by its encoding of a graph as a string of integers that represent the functions and connections between graph nodes, and program inputs and outputs.
      • this encoding was developed in the course of evolving electronic circuits, e.g. above ?
      • can encode a non-connected graph. the genetic material that is not utilized is analogous to biological junk DNA.
    • even in converged populations, small mutations can produce large changes in phenotypic behavior.
    • in this work he only uses directed graphs - there are no cycles & an organized flow of information.
    • mentions automatically defined functions - what is this??
    • used diffusion to define the fitness values of particular locations in the map. the fewer particles there eventually were in a grid location, the higher the fitness value of the robot that managed to get there.
  • Hardware evolution: on the nature of artifically evolved circuits - doctoral dissertation.
    • because evolved circuits utilize the parasitic properties of devices, they have little tolerance of the value of components. Reverse engineering of the circuits evolved to improve tolerance is not easy.

hide / edit[1] / print
ref: -0 tags: evolutionary psychology human mating sexuality discrimination wedlock date: 01-09-2011 18:22 gmt revision:1 [0] [head]

From Why Beautiful people have more daughters:

"Abuse, degradation, and intimidation are all part of men's unfortunate repertoire of tactics employed in competitive situations. In other words, men are not harassing women because they are treating them differently than men (which is the definition of discrimination under which harassment legally falls), but the exact opposite: men harass women because they are not discriminating between men and women."

Interesting argument. But in sexual discrimination cases, the women are not being treated the way they want to be treated - this is more a problem than the inequality.

The author then goes on to pose that current sexual discrimination law and policy in US corporations actually inhibits welcome sexual/romantic interest/advances. Many people do find partners at work. Again, I beg to differ: if there is passion between people, things will fall as they should; if policy and culture serves to make this more civilized (provided it's not completely inhibited, as the author suggests), then all the better.

In related news: An Analysis of Out-Of-Wedlock Births in the United States

Central hypothesis: Contraceptive technology shifted the balance of power between the sexes: prior the pill, women could force the men into promising to marry; in the case of preganancy, cultural standards forced marriage - shotgun marriage. Men accepted these terms because they were uniform across all women - sex implies pregnancy implies child rearing. When contraception became available, this was decoupled, as sex did not beget pregnancy; those women who negotiated on the old terms were likely to lose their mate, hence shotgun marriages (the result of such negotiations) gradually disappeared from culture.

The author generally approves of the idea of shotgun marriage, and suggests that a governmental body should enforce a form of it through child support payments. Presently about 40% of children in the US are born out of wedlock.

Finally, Serial monogamy increases reproductive success in men but not in women. It rests upon data, only recently gathered, that supports that having multiple partners increases reproductive success more strongly in male than in female humans. This implies that the variance of the fertility of men should be higher than that of women - again, which is borne out in the data, but only weakly: men have 10% higher variance in # of offspring than women. This effect is correlated to serial monogamy - "Compared with men with 1 spouse, men with 3 or more spouses had 19% more children in the total sample". This did not hold with women, nor did varying spouse number in men change the survival rate of their offspring.

Irregardless, this reading was spurred by someone mentioning that a genetic analysis of human populations reveals that while 80% of women reached reproductive success, only 40% of men did - implying that historically a few more successful men fathered a large fraction of children. I was unable to find evidence to support this on the internet (and indeed the Behavioral Ecology article gives much less dramatic figures), but it makes intuitive sense, especially in light of some patterns of male behavior.

hide / edit[6] / print
ref: -0 tags: meta learning Artificial intelligence competent evolutionary programming Moshe Looks MOSES date: 08-07-2010 16:30 gmt revision:6 [5] [4] [3] [2] [1] [0] [head]

Competent Program Evolution

  • An excellent start, excellent good description + meta-description / review of existing literature.
  • He thinks about things in a slightly different way - separates what I call solutions and objective functions "post- and pre-representational levels" (respectively).
  • The thesis focuses on post-representational search/optimization, not pre-representational (though, I believe that both should meet in the middle - eg. pre-representational levels/ objective functions tuned iteratively during post-representational solution creation. This is what a human would do!)
  • The primary difficulty in competent program evolution is the intense non-decomposability of programs: every variable, constant, branch effects the execution of every other little bit.
  • Competent program creation is possible - humans create programs significantly shorter than lookup tables - hence it should be possible to make a program to do the same job.
  • One solution to the problem is representation - formulate the program creation as a set of 'knobs' that can be twiddled (here he means both gradient-descent partial-derivative optimization and simplex or heuristic one-dimensional probabilistic search, of which there are many good algorithms.)
  • pp 27: outline of his MOSES program. Read it for yourself, but looks like:
  • The representation step above "explicitly addresses the underlying (semantic) structure of program space independently of the search for any kind of modularity or problem decomposition."
    • In MOSES, optimization does not operate directly on program space, but rather on subspaces defined by the representation-building process. These subspaces may be considered as being defined by templates assigning values to some of the underlying dimensions (e.g., they restrict the size and shape of any resulting trees).
  • In chapter 3 he examines the properties of the boolean programming space, which is claimed to be a good model of larger/more complicated programming spaces in that:
    • Simpler functions are much more heavily sampled - e.g. he generated 1e6 samples of 100-term boolean functions, then reduced them to minimal form using standard operators. The vast majority of the resultant minimum length (compressed) functions were simple - tautologies or of a few terms.
    • A corollary is that simply increasing syntactic sample length is insufficient for increasing program behavioral complexity / variety.
      • Actually, as random program length increases, the percentage with interesting behaviors decreases due to the structure of the minimum length function distribution.
  • Also tests random perturbations to large boolean formulae (variable replacement/removal, operator swapping) - ~90% of these do nothing.
    • These randomly perturbed programs show a similar structure to above: most of them have very similar behavior to their neighbors; only a few have unique behaviors. makes sense.
    • Run the other way: "syntactic space of large programs is nearly uniform with respect to semantic distance." Semantically similar (boolean) programs are not grouped together.
  • Results somehow seem a let-down: the program does not scale to even moderately large problem spaces. No loops, only functions with conditional evalutation - Jacques Pitrat's results are far more impressive. {815}
    • Seems that, still, there were a lot of meta-knobs to tweak in each implementation. Perhaps this is always the case?
  • My thought: perhaps you can run the optimization not on program representations, but rather program codepaths. He claims that one problem is that behavior is loosely or at worst chaotically related to program structure - which is true - hence optimization on the program itself is very difficult. This is why Moshe runs optimization on the 'knobs' of a representational structure.

hide / edit[2] / print
ref: work-0 tags: no free lunch wolpert coevolution date: 07-19-2010 12:54 gmt revision:2 [1] [0] [head]


  • Just discovered this. It makes perfect sense - bias free learning is 'futile'. Learning need be characterized by its biases, which enable faster or better results in particular problem domains.
  • Equivalently: any two algorithms are equivalent when their performance is averaged across all possible problems. (This is not as strong as it sounds, as most problems will never be encountered).
  • Wolper 1996 provides an excellent geometric interpretation of this: the quality of the search/optimization algorithm within a particular domain iis proporational to the inner product of its expected search stream with the actual (expected?) probability distribution of the data.
  • However! with coevolutionary algorithms, there can be a free lunch - "in coevolution some algorithms have better performance than other algorithms, averaged across all possible problems." Wolpert 2005
    • claims that this does not (??) hold in biological evolution, where there is no champion. Yet biology seems all about co-evolution.
    • coevolution of a backgammon player details how it may be coevolution + the structure of the backgammon game, not reinforcement learning, which led Tesauro to his championship-level player. Specifically, coevolutionary algorithms tend to get stuck in local minima - where both contestants play mediocre and draw - but this is not possible in backgammon; there is only one winner, and the games must terminate eventually.
      • These authors introduce a very interesting twist to improve coevolutionary bootstrapping: Firstly, the games are played in pairs, with the order of play reversed and the same random seed used to generate the dice rolls for both games. This washes out some of the unfairness due to the dice rolls when the two networks are very close - in particular, if they were identical, the result would always be one win each.

hide / edit[3] / print
ref: work-0 tags: differential evolution function optimization date: 07-09-2010 14:46 gmt revision:3 [2] [1] [0] [head]

Differential evolution (DE) is an optimization method, somewhat like Neidler-Mead or simulated annealing (SA). Much like genetic algorithms, it utilizes a population of solutions and selection to explore and optimize the objective function. However, it instead of perturbing vectors randomly or greedily descending the objective function gradient, it uses the difference between individual population vectors to update hypothetical solutions. See below for an illustration.

At my rather cursory reading, this serves to adapt the distribution of hypothetical solutions (or population of solutions, to use the evolutionary term) to the structure of the underlying function to be optimized. Judging from images/821_1.pdf Price and Storn (the inventors), DE works in situations where simulated annealing (which I am using presently, in the robot vision system) fails, and is applicable to higher-dimensional problems than simplex methods or SA. The paper tests DE on 100 dimensional problems, and it is able to solve these with on the order of 50k function evaluations. Furthermore, they show that it finds function extrema quicker than stochastic differential equations (SDE, alas from 85) which uses the gradient of the function to be optimized.

I'm surprised that this method slipped under my radar for so long - why hasn't anyone mentioned this? Is it because it has no proofs of convergence? has it more recently been superseded? (the paper is from 1997). Yet, I'm pleased because it means that there are also many other algorithms equally clever and novel (and simple?), out their in the literature or waiting to be discovered.

hide / edit[2] / print
ref: -0 tags: chess evolution machine learning 2004 partial derivative date: 10-26-2009 04:07 gmt revision:2 [1] [0] [head]

A Self-learning Evolutionary Chess Program

  • The evolved program is able to perform at near master level!
  • Used object networks (neural networks that can be moved about according to the symmetries of the problem space). Paul Werbos apparently invented these, too.
  • Approached the problem by assigning values to having pieces at particular places on the board (PVT, positional value tables). The value of a move was the value of the resulting global valuation (sum of value of pieces - value of opponents pieces) + PVT. They used these valuations to look a set number of moves in the future, using an alpha-beta search.
    • Used 4-plys (search depth) while in normal genetic evolution; 6 when pawns would be upgraded.
  • The neural networks looked at the first 2 rows, the last two rows, and a 4x4 square in the middle of the board - areas known to matter in real games. (The main author is a master-level chess player and chess teacher).
  • The outputs of the three neural networks were added to the material and PVT values to assess a hypothetical board position.
  • Genetic selection operated on the PVT values, neural network weights, piece valuation, and biases of the neural networks. These were initialized semi-randomly; PVT values were initialized based on open-source programs.
  • Performed 50 generations of 20 players each. The top 10 players from each generation survived.
  • Gary Kasparov was consulted in this research. Cool!
  • I wonder what would happen if you allowed the program to propose (genetically or otherwise) alternate algorithmic structures. What they describe is purely a search through weight space - what about a genetic search through algorithmic structure space? Too difficult of a search?
  • I mean, that's what humans (the authors) do while they were designing this program/algorithm. The lead author, as mentioned, is already a very good chess player, and hence he could imbue the initial program with a lot of good 'filters' 'kernels' or 'glasses' for looking at the chess board. And how did he arrive at these ideas? Practice (raw data) and communication (other peoples kernels extracted from more raw data, and validated). And how does he play? By using his experience and knowledge to predict probable moves into the future, evaluating their value, and selecting the best. And how does he evaluate his algorithmic? The same way! By using his knowledge of both chess and computer science to simulate hypothetical designs in his head, seeing how he thinks they will perform, and selecting the best one.
  • The problem with present algorithms is that they have no sense of artistic beauty - no love of symmetry, whether it be simple geometric symmetry (beautiful people have symmetric faces) or more fractal (fractional-dimensioned) symmetry, e.g. music, fractals (duh), human art. I think symmetry can enormously cut down the dimension of the search space in learning, hence is frequently worthy of its own search.
    • Algorithms do presently have a good sense of parsimony, at least, through the AIC / regularization / SVD / bayes net's priors / etc. Parsimony can be beauty, too.
  • Another notable discrepancy is that humans can reason in a concrete way - they actively search for the thing that is causing the problem, the thing that is contributing greatly to either good or bad results. They do this by the scientific method, sorta - hold all other things constant, perturb some section of the system, measure the output. This is the same as taking a partial derivative. Such derivative are used heavily/exclusively in training neural networks - weights are changed based on the partial derivative of that weight wrt the output-referenced error. So reasoning is similar to non-parallel backprop? Or a really slow way of taking partial derivatives? Maybe. The goal of both is to assign valuation/causation to a given weight/subsystem.
  • Human reasoning involves dual valuation pathways - internal, based on a model of the world, and external, which of course involves experimentation and memory (and perhaps scholarly journal papers etc). The mammalian cortex-basal ganglia-thalamus loop seems designed for running these sorts of simulations because it is the dual of the problem of selecting appropriate behaviors. (there! I said it!) In internal simulation, you take world state, apply forward transform with perturbation, then evaluate the result - see if your perturbation (partial derivative) yields information. In motor behavior, you take the body state, apply forward transformation with perturbation (muscle contraction), and evaluate the result. Same thing. Of course you don't have to do this too much, as the cortex will remember the input-perturbation-result.
  • Understanding seems to be related to this input-transform-evaluate cycle, too, except here what is changing is the forward transform, and the output is compared to known output - does a given kernel (concept) predict the output/observed data?
  • Now what would happen if you applied this input-transform-evaluate to itself, e.g. you allowed the system to evaluate itself. Nothing? Recursion? (recursion is a very beautiful concept.) Some degree of awareness?
  • Surely someone has thought of this before, and tried to simulate it on a computer. Wasn't AI research all about this in the 70's-80's? People have said that their big problem was that AI was then entirely/mostly symbolic and insufficiently probabilistic or data-intensive; the 90's-21st century seems to have solved that. This field is unfamiliar to me, it'll take some sussing about before I can grok the academic landscape.
    • Even more surely, someone is doing it right now! This is the way the world advances. Same thing happened to me with GPGPU stuff, which I was doing in 2003. Now everyone is up to that shiznit.
  • It seems that machine-learning is transitioning from informing my personal philosophy, to becoming my philosophy. Good/bad? Feel free to edit this entry!
  • It's getting late and I'm tried -> rant ends.

hide / edit[0] / print
ref: work-0 tags: covariance matrix adaptation learning evolution continuous function normal gaussian statistics date: 06-30-2009 15:07 gmt revision:0 [head]


  • Details a method of sampling + covariance matrix approximation to find the extrema of a continuous (but intractable) fitness function
  • HAs flavors of RLS / Kalman filtering. Indeed, i think that kalman filtering may be a more principled method for optimization?
  • Can be used in high-dimensional optimization problems like finding optimal weights for a neural network.
  • Optimum-seeking is provided by weighting the stochastic samples (generated ala a particle filter or unscented kalman filter) by their fitness.
  • Introductory material is quite good, actually...