use https for features.
text: sort by
tags: modified
type: chronology
hide / / print
ref: -2016 tags: spiking neural network self supervised learning date: 12-10-2019 03:41 gmt revision:2 [1] [0] [head]

PMID: Spiking neurons can discover predictive features by aggregate-label learning

  • This is a meandering, somewhat long-winded, and complicated paper, even for the journal Science. It's not been cited a great many times, but none-the-less is of interest.
  • The goal of the derived network is to detect fixed-pattern presynaptic sequences, and fire a prespecified number of spikes to each occurrence.
  • One key innovation is the use of a spike-threshold-surface for a 'tempotron' [12], the derivative of which is used to update the weights of synapses after trials. As the author says, spikes are hard to differentiate; the STS makes this more possible. This is hence standard gradient descent: if the neuron missed a spike then the weight is increased based on aggregate STS (for the whole trial -- hence the neuron / SGD has to perform temporal and spatial credit assignment).
    • As common, the SGD is appended with a momentum term.
  • Since STS differentiation is biologically implausible -- where would the memory lie? -- he also implements a correlational synaptic eligibility trace. The correlation is between the postsynaptic voltage and the EPSC, which seems kinda circular.
    • Unsurprisingly, it does not work as well as the SGD approximation. But does work...
  • Second innovation is the incorporation of self-supervised learning: a 'supervisory' neuron integrates the activity of a number (50) of feature detector neurons, and reinforces them to basically all fire at the same event, WTA style. This effects a unsupervised feature detection.
  • This system can be used with sort-of lateral inhibition to reinforce multiple features. Not so dramatic -- continuous feature maps.

Editorializing a bit: I said this was interesting, but why? The first part of the paper is another form of SGD, albeit in a spiking neural network, where the gradient is harder compute hence is done numerically.

It's the aggregate part that is new -- pulling in repeated patterns through synaptic learning rules. Of course, to do this, the full trace of pre and post synaptic activity must be recorded (??) for estimating the STS (i think). An eligibility trace moves in the right direction as a biologically plausible approximation, but as always nothing matches the precision of SGD. Can the eligibility trace be amended with e.g. neuromodulators to push the performance near that of SGD?

The next step of adding self supervised singular and multiple features is perhaps toward the way the brain organizes itself -- small local feedback loops. These features annotate repeated occurrences of stimuli, or tile a continuous feature space.

Still, the fact that I haven't seen any follow-up work is suggestive...

Editorializing further, there is a limited quantity of work that a single human can do. In this paper, it's a great deal of work, no doubt, and the author offers some good intuitions for the design decisions. Yet still, the total complexity that even a very determined individual can amass is limited, and likely far below the structural complexity of a mammalian brain.

This implies that inference either must be distributed and compositional (the normal path of science), or the process of evaluating & constraining models must be significantly accelerated. This later option is appealing, as current progress in neuroscience seems highly technology limited -- old results become less meaningful when the next wave of measurement tools comes around, irrespective of how much work went into it. (Though: the impedtus for measuring a particular thing in biology is only discovered through these 'less meaningful' studies...).

A third option, perhaps one which many theoretical neuroscientists believe in, is that there are some broader, physics-level organizing principles to the brain. Karl Friston's free energy principle is a good example of this. Perhaps at a meta level some organizing theory can be found, or likely a set of theories; but IMHO, you'll need at least one theory per brain area, at least, just the same as each area is morphologically, cytoarchitecturaly, and topologically distinct. (There may be only a few theories of the cortex, despite all the areas, which is why so many are eager to investigate it!)

So what constitutes a theory? Well, you have to meaningfully describe what a brain region does. (Why is almost as important; how more important to the path there.) From a sensory standpoint: what information is stored? What processing gain is enacted? How does the stored information impress itself on behavior? From a motor standpoint: how are goals selected? How are the behavioral segments to attain them sequenced? Is the goal / behavior even a reasonable way of factoring the problem?

Our dual problem, building the bridge from the other direction, is perhaps easier. Or it could be a lot more money has gone into it. Either way, much progress has been made in AI. One arm is deep function approximation / database compression for fast and organized indexing, aka deep learning. Many people are thinking about that; no need to add to the pile; anyway, as OpenAI has proven, the common solution to many problems is to simply throw more compute at it. A second is deep reinforcement learning, which is hideously sample and path inefficient, hence ripe for improvement. One side is motor: rather than indexing raw motor variables (LRUD in a video game, or joint torques with a robot..) you can index motor primitives, perhaps hierarchically built; likewise, for the sensory input, the model needs to infer structure about the world. This inference should decompose overwhelming sensory experience into navigable causes ...

But how can we do this decomposition? The cortex is more than adept at it, but now we're at the original problem, one that the paper above purports to make a stab at.

hide / / print
ref: -0 tags: meta compilation self-hostying ACM date: 12-30-2015 07:52 gmt revision:2 [1] [0] [head]

META II: Digital Vellum in the Digital Scriptorium: Revisiting Schorre's 1962 compiler-compiler

  • Provides high-level commentary about re-implementing the META-II self-reproducing compiler, using Python as a backend, and mountain climbing as an analogy. Good read.
  • Original paper
  • What it means to be self-reproducing: The original compiler was written in assembly (in this case, a bytecode assembly). When this compiler is run and fed the language description (figure 5 in the paper), it outputs bytecode which is identical (or almost nearly so) to the hand-coded compiler. When this automatically-generated compiler is run and fed the language description (again!) it reproduces itself (same bytecode) perfectly.
    • See section "How the Meta II compiler was written"

hide / / print
ref: Cho-2007.03 tags: SOM self organizing maps Prinicpe neural signal reconstruction recording compression date: 01-03-2012 00:59 gmt revision:2 [1] [0] [head]

PMID-17234384[0] Self-organizing maps with dynamic learning for signal reconstruction.

  • They use a dynamically-learning self-organizing map to compress (encode) continuous neural signals so they can be sent over a wireless link. In this way, you do not have to sort and bin on the device (but this is relatively easy; it seems that their SOM is more computationally expensive than simple thresholding.) Nonetheless, it is an interesting approach.


[0] Cho J, Paiva AR, Kim SP, Sanchez JC, Príncipe JC, Self-organizing maps with dynamic learning for signal reconstruction.Neural Netw 20:2, 274-84 (2007 Mar)

hide / / print
ref: work-0 tags: metacognition AI bootstrap machine learning Pitrat self-debugging date: 08-07-2010 04:36 gmt revision:7 [6] [5] [4] [3] [2] [1] [head]

Jacques Pitrat seems to have many of the same ideas that I've had (only better, and he's implemented them!)--

A Step toward and Artificial Scientist

  • The overall structure seems good - difficult problems are attacked by 4 different levels. First level tries to solve the problem semi-directly, by writing a program to solve combinatorial problems (all problems here are constraint based; constraints are used to pare the tree of possible solutions; these trees are tested combinatorially); second level monitors lower level performance and decides which hypotheses to test (which branch to pursue on the tree) and/or which rules to apply to the tree; third level directs the second level and restarts the whole process if a snag or inconsistency is found, forth level gauges the interest of a given problem and looks for new problems to solve within a family so as to improve the skill of the 3 lower levels.
    • This makes sense, but why 4? Seems like in humans we only need 2 - the actor and the critic, bootstrapping forever.
    • Also includes a "Zeus" module that periodically checks for infinite loops of the other programs, and recompiles with trace instructions if an infinite loop is found within a subroutine.
  • Author claims that the system is highly efficient - it codes constraints and expert knowledge using a higher level language/syntax that is then converted to hundreds of thousands of lines of C code. The active search program runs runtime-generated C programs to evaluate and find solutions, wow!
  • This must have taken a decade or more to create! Very impressive. (seems it took 2 decades, at least according to http://tunes.org/wiki/jacques_20pitrat.html)
    • Despite all this work, he is not nearly done - it has not "learning" module.
    • Quote: In this paper, I do not describe some parts of the system which still need to be developed. For instance, the system performs experiments, analyzes them and finds surprising results; from these results, it is possible to learn some improvements, but the learning module, which would be able to find them, is not yet written. In that case, only a part of the system has been implemented: on how to find interesting data, but still not on how to use them.
  • Only seems to deal with symbolic problems - e.g. magic squares, magic cubes, self-referential integer series. Alas, no statistical problems.
  • The whole CAIA system can effectively be used as a tool for finding problems of arbitrary difficulty with arbitrary number of solutions from a set of problem families or meta-families.
  • Has hypothesis based testing and backtracking; does not have problem reformulation or re-projection.
  • There is mention of ALICE, but not the chatbot A.L.I.C.E - some constraint-satisfaction AI program from the 70's.
  • Has a C source version of MALICE (his version of ALICE) available on the website. Amazingly, there is no Makefile - just gcc *.c -rdynamic -ldl -o malice.
  • See also his 1995 Paper: AI Systems Are Dumb Because AI Researchers Are Too Clever images/815_1.pdf

Artificial beings - his book.