m8ta
|
{1571} | |||||
One model for the learning of language
A more interesting result is Deep symbolic regression for recurrent sequences, where the authors (Facebook/Meta) use a Transformer -- in this case, directly taken from Vaswani 2017 (8-head, 8-layer QKV with a latent dimension of 512) -- to do both symbolic (estimate the algebraic recurrence relation) and numeric (estimate the rest of the sequence) training / evaluation. Symbolic regression generalizes better, unsurprisingly. But both can be made to work even in the presence of (log-scaled) noise! While the language-learning paper shows that small generative programs can be inferred from a few samples, the Meta symbolic regression shows that Transformers can evince either amortized memory (less likely) or algorithms for perception -- both new and interesting. It suggests that 'even' abstract symbolic learning tasks are sufficiently decomposable that the sorts of algorithms available to an 8-layer transformer can give a useful search heuristic. (N.B. the transformer doesn't spit out perfect symbolic or numerical results directly -- it also needs post-processing search. Also, the transformer has search (in the form of softmax) baked into its architecture.) This is not a light architecture: they trained the transformer for 250 epochs, where each epoch was 5M equations in batches of 512. Each epoch took 1 hour on 16 Volta GPUs with 32GB of memory. So, 4k GPU-hours x ~10 TFlop/s = 1.4e20 Flops. Compare this with the grammar learning above: 7 days on 32 cores operating at ~3 Gops/sec is 1.8e15 ops -- much, much smaller compute. All of this suggests a central theme of computer science: a continuum between search and memorization.
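Sanity-checking the back-of-envelope arithmetic in the note (orders of magnitude only; I take the ~3 Gop/s as an aggregate rate, which is what reproduces the 1.8e15 figure):

```python
# Rough compute comparison from the note above.
transformer_flops = 4000 * 3600 * 10e12   # 4k GPU-hours at ~10 TFlop/s -> ~1.4e20
grammar_ops = 7 * 24 * 3600 * 3e9         # 7 days at ~3 Gop/s aggregate -> ~1.8e15
ratio = transformer_flops / grammar_ops   # ~8e4x more compute for the transformer
```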
Most interesting for a visual neuroscientist (not that I'm one per se, but bear with me) is where on these axes (search, heuristic, memory) visual perception is. Clearly there is a high degree of recurrence, and a high degree of plasticity / learning. But is there search or local optimization? Is this coupled to the recurrence via some form of energy-minimizing system? Is recurrence approximating E-M? | |||||
{1568} | |||||
Burst-dependent synaptic plasticity can coordinate learning in hierarchical circuits
| |||||
{1544} | |||||
The HSIC Bottleneck: Deep learning without Back-propagation
In this work, the authors use a kernelized estimate of statistical independence as part of an 'information bottleneck' to set per-layer objective functions for learning useful features in a deep network. They use the HSIC, or Hilbert-Schmidt independence criterion, as the independence measure. The information bottleneck was proposed by Tishby, Pereira, and Bialek (of 'Spikes') in 1999, and aims to increase the mutual information between the layer representation and the labels while minimizing the mutual information between the representation and the input:

min I(X; Z_i) - β I(Z_i; Y)
where Z_i is the hidden representation at layer i (later the output), X is the layer input, and Y are the labels. By replacing the mutual information with the HSIC, and some derivation (?), they show that the per-layer objective becomes

min HSIC(X, Z_i) - β HSIC(Z_i, Y),   with   HSIC(X, Y) = (m-1)^{-2} tr(K_X H K_Y H)
where the (x_i, y_i) are samples and labels, and (K_X)_{ij} = k(x_i, x_j), (K_Y)_{ij} = k(y_i, y_j) -- that is, it's the kernel function applied to all pairs of (vectoral) input variables. H is the centering matrix, H = I - (1/m) 1 1^T. The kernel is simply a Gaussian, k(x, y) = exp(-||x - y||^2 / (2σ^2)). So, if all the x and y are on average independent, then the inner product will be mean zero, the kernel will be mean one, and after centering will lead to zero trace. If the inner product is large within the realm of the derivative of the kernel, then the HSIC will be large (and negative, I think). In practice they use three different widths for their kernel, and they also center the kernel matrices. But still, the feedback is an aggregate measure (the trace) of the product of two kernelized (a nonlinearity) outer-product spaces of similarities between inputs. It's not unimaginable that feedback networks could be doing something like this... For example, a neural network could calculate & communicate aspects of joint statistics to reward / penalize weights within a layer of a network; this is parallelizable, per-layer, and adaptable to an unsupervised learning regime. Indeed, that was done almost exactly by this paper: Kernelized information bottleneck leads to biologically plausible 3-factor Hebbian learning in deep networks, albeit in a much less intelligible way.

Robust Learning with the Hilbert-Schmidt Independence Criterion is another, later, paper using the HSIC. Their interpretation: "This loss-function encourages learning models where the distribution of the residuals between the label and the model prediction is statistically independent of the distribution of the instances themselves." Hence, given the above nomenclature, the loss is something like HSIC(X, Y - f(X)). (I'm not totally sure about the weighting, but it might be required given the definition of the HSIC.) As I understand it, the HSIC loss is a kernelized loss between the input, output, and labels that encourages a degree of invariance to the input ('covariate shift').
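A minimal numpy sketch of the biased HSIC estimator, as I understand it (function names and the single kernel width are my simplifications; the paper uses three widths):

```python
import numpy as np

def gaussian_gram(X, sigma=1.0):
    # Gaussian kernel applied to all pairs of rows of X
    sq = np.sum(X * X, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-d2 / (2.0 * sigma ** 2))

def hsic(X, Y, sigma=1.0):
    # biased estimator: (m-1)^-2 tr(Kx H Ky H), H the centering matrix
    m = X.shape[0]
    H = np.eye(m) - np.ones((m, m)) / m
    Kx = gaussian_gram(X, sigma)
    Ky = gaussian_gram(Y, sigma)
    return np.trace(Kx @ H @ Ky @ H) / (m - 1) ** 2

rng = np.random.default_rng(0)
x = rng.normal(size=(200, 3))
y_indep = rng.normal(size=(200, 3))          # independent of x
y_dep = x + 0.1 * rng.normal(size=(200, 3))  # strongly dependent on x
# dependence yields a much larger HSIC than independence
```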
This is useful, but I'm unconvinced that making the layer output independent of the input is absolutely essential (??) | |||||
{1552} | |||||
Modularizing Deep Learning via Pairwise Learning With Kernels
I think in general this is an important result, even if it's not wholly unique / somewhat anticipated (it's a year old at the time of writing). Modular training of neural networks is great for efficiency, parallelization, and biological implementations! Transport of weights between layers is hence non-essential. Class labels still are, but I wonder if temporal continuity can solve some of these problems? (There is plenty of other effort in this area -- see also {1544}) | |||||
{1547} | |||||
Meta-Learning Update Rules for Unsupervised Representation Learning
This is a clearly-written, easy to understand paper. The results are not highly compelling, but as a first set of experiments, it's successful enough. I wonder what more constraints (fewer parameters, per the genome), more options for architecture modifications (e.g. different feedback schemes, per neurobiology), and a black-box optimization algorithm (evolution) would do? | |||||
{1546} | |||||
Local synaptic learning rules suffice to maximize mutual information in a linear network
x = randn(1000, 10);
Q = x' * x;             % input correlation matrix (10x10)
a = 0.001;              % rate; spectral radius of (eye(10) - a*Q) must be < 1
Y = randn(10, 1);
y = zeros(10, 1);
for i = 1:1000
    y = Y + (eye(10) - a*Q)*y;  % linear recurrence
end
y - pinv(Q)*Y / a       % should be ~zero: the fixed point solves a*Q*y = Y
To this is added a 'sensing' learning phase and a 'noise' unlearning phase -- one maximizes the objective when the network is driven by signal, the other minimizes it when driven by noise. Everything is then applied, similar to before, to a Gaussian-filtered one-dimensional white-noise stimulus. He shows this results in bandpass filter behavior -- quite weak sauce in an era where ML papers are expected to test on five or so datasets. Even if this was 1992 (nearly thirty years ago!), it would have been nice to see this applied to a more realistic dataset; perhaps some of the following papers? Olshausen & Field came out in 1996 -- but they applied their algorithm to real images. In both Olshausen & this work, no affordances are made for multiple layers. There have to be solutions out there... | |||||
{1545} | |||||
Self-organization in a perceptual network
One may critically challenge the infomax idea: we very much need to (and do) throw away spurious or irrelevant information in our sensory streams; what upper layers 'care about' when making decisions is certainly relevant to the lower layers. This credit assignment is neatly solved by backprop, and there are a number of 'biologically plausible' means of performing it, but both this and infomax are maybe avoiding the problem. What might the upper layers really care about? Likely 'care about' is an emergent property of the interacting local learning rules and network structure. Can you search directly in these domains, within biological limits, and motivated by statistical reality, to find unsupervised-learning networks? You'll still need a way to rank the networks, hence an objective 'care about' function. Sigh. Either way, I don't per se put a lot of weight in the infomax principle. It could be useful, but it is only part of the story. Otherwise Linsker's discussion is accessible, lucid, and prescient. Lol. | |||||
{1543} |
ref: -2019
tags: backprop neural networks deep learning coordinate descent alternating minimization
date: 07-21-2021 03:07 gmt
revision:1
[0] [head]
|
||||
Beyond Backprop: Online Alternating Minimization with Auxiliary Variables
This is interesting in that the weight updates can be done in parallel -- perhaps more efficient -- but you are still propagating errors backward, albeit via optimizing 'codes'. Given the vast infrastructure devoted to auto-diff + backprop, I can't see this being adopted broadly. That said, the idea of alternating minimization (which is used e.g. for EM clustering) is powerful, and this paper does describe (though I didn't read it) how there are guarantees on the convergence of the alternating minimization. Likewise, the authors show how to improve the performance of the online / minibatch algorithm by keeping around memory variables, in the form of covariance matrices. | |||||
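A cartoon of the alternating-minimization idea (my own reduction, not the paper's algorithm): a two-layer linear net where we alternate exact minimizations over an auxiliary 'code' A and over the weights, with no gradient ever propagated through both layers at once. The sizes, penalty lam, and the least-squares solves are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 20))                 # inputs (samples x dims)
Ytrue = X @ rng.normal(size=(20, 5))          # realizable linear targets
W1 = 0.1 * rng.normal(size=(20, 10))
W2 = 0.1 * rng.normal(size=(10, 5))
lam = 1.0

def loss():
    return np.mean((X @ W1 @ W2 - Ytrue) ** 2)

loss0 = loss()
for _ in range(20):
    # 1) codes: argmin_A  lam*||A - X W1||^2 + ||A W2 - Y||^2  (closed form)
    A = (lam * X @ W1 + Ytrue @ W2.T) @ np.linalg.inv(lam * np.eye(10) + W2 @ W2.T)
    # 2) weights: each layer solved by least squares against the codes
    W1 = np.linalg.lstsq(X, A, rcond=None)[0]
    W2 = np.linalg.lstsq(A, Ytrue, rcond=None)[0]
```

Each sub-problem is convex even though the joint problem is not, which is the appeal of the approach.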
{1541} | |||||
Like this blog but 100% better! | |||||
{1540} | |||||
Two Routes to Scalable Credit Assignment without Weight Symmetry
This paper looks at five different learning rules, three purely local and two non-local, to see if they can work as well as backprop in training a deep convolutional net on ImageNet. The local learning networks all feature forward weights W and backward weights B; the forward weights (+ nonlinearities) pass the information to lead to a classification; the backward weights pass the error, which is used to locally adjust the forward weights. Hence, each fake neuron locally has the forward activation, the backward error (or loss gradient), the forward weight, backward weight, and Hebbian terms thereof (e.g. the outer product of the in-out vectors for both forward and backward passes). From these available variables, they construct the local learning rules:
Each of these serves as a "regularizer term" on the feedback weights, which governs their learning dynamics. In the case of backprop, the backward weights B are just the instantaneous transpose of the forward weights W. A good local learning rule approximates this transpose progressively. They show that, with proper hyperparameter settings, this does indeed work nearly as well as backprop when training a ResNet-18 network. But hyperparameter settings don't translate to other network topologies. To address this, they add in non-local learning rules:
In "Symmetric Alignment", the Self and Decay rules are employed. This is similar to backprop (the backward weights will track the forward ones) with L2 regularization, which is not new. It performs very similarly to backprop. In "Activation Alignment", the Amp and Sparse rules are employed. I assume this is supposed to be more biologically plausible -- the Hebbian term can track the forward weights, while the Sparse rule regularizes and stabilizes the learning, such that the overall dynamics allow the gradient to flow even if W and B aren't transposes of each other. Surprisingly, they find Symmetric Alignment to be more robust to the injection of Gaussian noise during training than backprop. Both SA and AA achieve similar accuracies on the ResNet benchmark. The authors then go on to explain the plausibility of non-local but approximate learning rules with regression discontinuity design, a la Spiking allows neurons to estimate their causal effect. This is a decent paper, reasonably well written. They thought through what variables are available to affect learning, and parameterized five combinations that work. Could they have done the full matrix of combinations, optimizing them just the same as the metaparameters? Perhaps, but that would be even more work... Regarding the desire to reconcile backprop and biology, this paper does not bring us much (if at all) closer. Biological neural networks have specific and local uses for error; even invoking 'error' has limited explanatory power on activity. Learning and firing dynamics, of course of course. Is the brain then just an overbearing mess of details and overlapping rules? Yes, probably, but that doesn't mean that we humans can't find something simpler that works. The algorithms in this paper, for example, are well described by a bit of linear algebra, and yet they are performant. | |||||
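A cartoon of the symmetric-alignment idea (my own reduction, not the paper's exact parameterization): the feedback matrix B is nudged by a Self+Decay-style term, i.e. the gradient of ||B - W^T||^2, and converges to the transpose of the drifting forward weights without any instantaneous weight transport:

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.normal(size=(4, 3))   # forward weights
B = rng.normal(size=(3, 4))   # backward weights, independently initialized

for _ in range(200):
    W += 0.01 * rng.normal(size=W.shape)  # W drifts as it 'learns'
    B += 0.5 * (W.T - B)                  # Self+Decay: descend ||B - W^T||^2

# B now tracks W^T closely, so backward passes approximate backprop's
alignment_error = np.linalg.norm(B - W.T)
```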
{1537} |
ref: -0
tags: cortical computation learning predictive coding reviews
date: 02-23-2021 20:15 gmt
revision:2
[1] [0] [head]
|
||||
PMID-30359606 Predictive Processing: A Canonical Cortical Computation
PMID-23177956 Canonical microcircuits for predictive coding
Control of synaptic plasticity in deep cortical networks
| |||||
{1523} |
ref: -0
tags: tennenbaum compositional learning character recognition one-shot learning
date: 02-23-2021 18:56 gmt
revision:2
[1] [0] [head]
|
||||
One-shot learning by inverting a compositional causal process
| |||||
{1534} | |||||
Going in circles is the way forward: the role of recurrence in visual inference
I think the best part of this article is the references -- a nicely complete listing of, well, the current opinion in Neurobiology! (Note that this issue is edited by our own Karel Svoboda, hence there are a good number of Janelians in the author list..) The gestalt of the review is that deep neural networks need to be recurrent, not purely feed-forward. This results in savings in overall network size, and an increase in the achievable computational complexity, perhaps via the incorporation of priors and temporal-spatial information. All this again makes perfect sense and matches my sense of prevailing opinion. Of course, we are left wanting more: all this recurrence ought to be structured in some way. To me, a rather naive way of thinking about it is that feed-forward layers cause weak activations, which are 'amplified' or 'selected for' in downstream neurons. These neurons proximally code for 'causes' or local reasons, based on the supported hypothesis that the brain has a good temporal-spatial model of the visuo-motor world. The causes then can either explain away the visual input, leading to balanced E-I, or fail to explain it, in which case the excess activity is either rectified by engaging more circuits or by engaging synaptic plasticity. A critical part of this hypothesis is some degree of binding / disentanglement / spatio-temporal re-assignment. While not all models of computation require registers / variables -- RNNs are Turing-complete, e.g. -- I remain stuck on the idea that, to explain phenomenological experience and practical cognition, the brain must have some means of 'binding'. A reasonable place to look is the apical tuft dendrites, which are capable of storing temporary state (calcium spikes, NMDA spikes), undergo rapid synaptic plasticity, and are so dense that they can reasonably store the outer-product space of binding.
Mounting evidence for apical tufts working independently / in parallel comes from investigations of high-gamma in ECoG: PMID-32851172 Dissociation of broadband high-frequency activity and neuronal firing in the neocortex. "High gamma" shows little correlation with MUA when you differentiate early-deep and late-superficial responses, "consistent with the view it reflects dendritic processing separable from local neuronal firing". | |||||
{1531} | |||||
PMID-24204224 The Convallis rule for unsupervised learning in cortical networks 2013 - Pierre Yger, Kenneth D Harris
This paper aims to unify and reconcile experimental evidence of in-vivo learning rules with established STDP rules. In particular, the STDP rule fails to accurately predict the change in strength in response to spike triplets, e.g. pre-post-pre or post-pre-post. Their model instead involves the competition between two coincidence detectors with different time constants, one of which controls LTD and the other LTP; it is thus an extension of the classical BCM rule. (BCM: inputs below a threshold will weaken a synapse; those above it will strengthen it.) They derive the model from an optimization criterion: neurons should try to maximize the skewness of the distribution of their membrane potential -- much time spent either firing spikes or strongly inhibited. This maps to an objective function F that looks like a valley, hence the 'Convallis' in the name (Latin for valley); the objective is differentiated to yield a weighting function for weight changes; they also add a shrinkage function (line + Heaviside function) to gate weight changes 'off' at resting membrane potential. A network of firing neurons successfully groups correlated rate-encoded inputs, better than the STDP rule. It can also cluster auditory inputs of spoken digits converted into a cochleogram. But this all seems relatively toy-like: of course algorithms can associate inputs that co-occur. The same result was found for a recurrent balanced E-I network with the same cochleogram, and Convallis performed better than STDP. Meh. Perhaps the biggest thing I got from the paper was how poorly STDP fares with spike triplets: pre following post does not 'necessarily' cause LTD; it's more complicated than that, and more consistent with the two coincidence detectors with different time constants.
This is satisfying as it allows for apical dendritic depolarization to serve as a contextual binding signal - without negatively impacting the associated synaptic weights. | |||||
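The classical BCM threshold behavior underlying the Convallis rule can be sketched minimally (the fixed threshold and constants are my own toy choices; real BCM slides theta with the running average of squared activity):

```python
import numpy as np

def bcm_dw(x, w, theta, eta=0.1):
    # BCM update: postsynaptic activity below theta depresses active synapses,
    # above theta potentiates them.
    y = w @ x
    return eta * y * (y - theta) * x

w = 0.5 * np.ones(4)
theta = 1.0
x_weak = 0.1 * np.ones(4)    # drives y = 0.2 < theta  -> depression
x_strong = 1.0 * np.ones(4)  # drives y = 2.0 > theta  -> potentiation
```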
{1522} | |||||
Schema networks: zero-shot transfer with a generative causal model of intuitive physics
| |||||
{1500} | |||||
PMID-31942076 A distributional code for value in dopamine based reinforcement learning
| |||||
{1505} | |||||
Scalable and sustainable deep learning via randomized hashing
| |||||
{1495} | |||||
Why multifactor?
| |||||
{1497} | |||||
PMID-26659050 Human level concept learning through probabalistic program induction
| |||||
{1493} | |||||
PMID-27690349 Nonlinear Hebbian Learning as a Unifying Principle in Receptive Field Formation
| |||||
{1492} | |||||
PMID: Spiking neurons can discover predictive features by aggregate-label learning
Editorializing a bit: I said this was interesting, but why? The first part of the paper is another form of SGD, albeit in a spiking neural network, where the gradient is harder to compute, hence is done numerically. It's the aggregate part that is new -- pulling in repeated patterns through synaptic learning rules. Of course, to do this, the full trace of pre- and post-synaptic activity must be recorded (??) for estimating the STS (I think). An eligibility trace moves in the right direction as a biologically plausible approximation, but as always nothing matches the precision of SGD. Can the eligibility trace be amended with e.g. neuromodulators to push the performance near that of SGD? The next step of adding self-supervised singular and multiple features is perhaps toward the way the brain organizes itself -- small local feedback loops. These features annotate repeated occurrences of stimuli, or tile a continuous feature space. Still, the fact that I haven't seen any follow-up work is suggestive... Editorializing further, there is a limited quantity of work that a single human can do. In this paper, it's a great deal of work, no doubt, and the author offers some good intuitions for the design decisions. Yet still, the total complexity that even a very determined individual can amass is limited, and likely far below the structural complexity of a mammalian brain. This implies that inference either must be distributed and compositional (the normal path of science), or the process of evaluating & constraining models must be significantly accelerated. This latter option is appealing, as current progress in neuroscience seems highly technology-limited -- old results become less meaningful when the next wave of measurement tools comes around, irrespective of how much work went into them. (Though: the impetus for measuring a particular thing in biology is only discovered through these 'less meaningful' studies...)
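A cartoon of the eligibility-trace idea mentioned above, as a three-factor rule on a single linear unit (the decay constant, learning rate, and always-on neuromodulator r are illustrative assumptions, not the paper's):

```python
import numpy as np

rng = np.random.default_rng(3)
target = np.array([1.0, 0.0, 1.0, 0.0, 1.0])  # hidden 'true' weights
w = np.zeros(5)
e = np.zeros(5)          # eligibility trace
lam, eta = 0.9, 0.01
errs = []
for t in range(500):
    x = rng.random(5)
    err = target @ x - w @ x
    errs.append(err ** 2)
    e = lam * e + err * x    # decaying trace of (input x error)
    r = 1.0                  # scalar neuromodulatory gate, here always 'on'
    w += eta * r * e         # weight change only where trace and gate agree
```

The trace lets credit assignment tolerate a delay between activity and the gating signal, at the cost of the precision a full SGD gradient would give.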
A third option, perhaps one which many theoretical neuroscientists believe in, is that there are some broader, physics-level organizing principles to the brain. Karl Friston's free energy principle is a good example of this. Perhaps at a meta level some organizing theory can be found, or more likely a set of theories; but IMHO, you'll need at least one theory per brain area, just the same as each area is morphologically, cytoarchitecturally, and topologically distinct. (There may be only a few theories of the cortex, despite all its areas, which is why so many are eager to investigate it!) So what constitutes a theory? Well, you have to meaningfully describe what a brain region does. (Why is almost as important; how is more important to the path there.) From a sensory standpoint: what information is stored? What processing gain is enacted? How does the stored information impress itself on behavior? From a motor standpoint: how are goals selected? How are the behavioral segments to attain them sequenced? Is the goal / behavior even a reasonable way of factoring the problem? Our dual problem, building the bridge from the other direction, is perhaps easier. Or it could be that a lot more money has gone into it. Either way, much progress has been made in AI. One arm is deep function approximation / database compression for fast and organized indexing, aka deep learning. Many people are thinking about that; no need to add to the pile; anyway, as OpenAI has proven, the common solution to many problems is to simply throw more compute at it. A second is deep reinforcement learning, which is hideously sample- and path-inefficient, hence ripe for improvement. One side is motor: rather than indexing raw motor variables (LRUD in a video game, or joint torques with a robot..) you can index motor primitives, perhaps hierarchically built; likewise, for the sensory input, the model needs to infer structure about the world.
This inference should decompose overwhelming sensory experience into navigable causes ... But how can we do this decomposition? The cortex is more than adept at it, but now we're at the original problem, one that the paper above purports to make a stab at. | |||||
{1485} | |||||
PMID-26352471 Labelling and optical erasure of synaptic memory traces in the motor cortex
| |||||
{1482} | |||||
Rapid learning or feature reuse? Towards understanding the effectiveness of MAML
| |||||
{208} | |||||
PMID-22388818 Corticostriatal plasticity is necessary for learning intentional neuroprosthetic skills.
| |||||
{1463} | |||||
All-optical spiking neurosynaptic networks with self-learning capabilities
| |||||
{1441} | |||||
Assessing the Scalability of Biologically-Motivated Deep Learning Algorithms and Architectures
| |||||
{1415} | |||||
PMID-28777724 Active inference, curiosity and insight. Karl J. Friston, Marco Lin, Christopher D. Frith, Giovanni Pezzulo,
| |||||
{1423} | |||||
PMID-27824044 Random synaptic feedback weights support error backpropagation for deep learning.
Our proof says that weights W0 and W evolve to equilibrium manifolds, but simulations (Fig. 4) and analytic results (Supplementary Proof 2) hint at something more specific: that when the weights begin near 0, feedback alignment encourages W to act like a local pseudoinverse of B around the error manifold. This fact is important because if B were exactly W+ (the Moore-Penrose pseudoinverse of W), then the network would be performing Gauss-Newton optimization (Supplementary Proof 3). We call this update rule for the hidden units pseudobackprop and denote it by ∆h_PBP = W+ e. Experiments with the linear network show that the angle ∆h_FA ∠ ∆h_PBP quickly becomes smaller than ∆h_FA ∠ ∆h_BP (Fig. 4b, c; see Methods). In other words feedback alignment, despite its simplicity, displays elements of second-order learning.
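The core feedback-alignment claim is easy to demo on a toy problem: route the backward pass through a fixed random matrix B instead of W2^T, and the loss still falls. A minimal sketch (sizes, rates, and the linear setup are my own choices, not the paper's experiments):

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.normal(size=(100, 8))
T = X @ rng.normal(size=(8, 2))        # realizable linear targets
W1 = 0.1 * rng.normal(size=(8, 6))
W2 = 0.1 * rng.normal(size=(6, 2))
B = rng.normal(size=(2, 6))            # fixed random feedback weights
eta = 0.05

def mse():
    return np.mean((X @ W1 @ W2 - T) ** 2)

loss0 = mse()
for _ in range(1000):
    E = X @ W1 @ W2 - T                # output error
    W2 -= eta * (X @ W1).T @ E / len(X)
    W1 -= eta * X.T @ (E @ B) / len(X) # error routed through B, not W2^T
```

During training W aligns with B's pseudoinverse-like direction, which is exactly the phenomenon the quoted passage describes.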
{1419} | |||||
All-optical machine learning using diffractive deep neural networks
| |||||
{1422} | |||||
PMID-29205151 Towards deep learning with segregated dendrites https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5716677/
| |||||
{1416} | |||||
Learning data manifolds with a Cutting Plane method
| |||||
{1413} | |||||
PMID-24711417 Evidence for a causal inverse model in an avian cortico-basal ganglia circuit
| |||||
{1412} |
ref: -0
tags: deeplabcut markerless tracking DCN transfer learning
date: 10-03-2018 23:56 gmt
revision:0
[head]
|
||||
Markerless tracking of user-defined features with deep learning
| |||||
{1411} | |||||
PMID-20544831 The decade of the dendritic NMDA spike.
| |||||
{1408} | |||||
LDMNet: Low dimensional manifold regularized neural nets.
| |||||
{1345} |
ref: -0
tags: nucleus accumbens caudate stimulation learning enhancement MIT
date: 09-20-2016 23:51 gmt
revision:1
[0] [head]
|
||||
| |||||
{1333} | |||||
| |||||
{913} | |||||
PMID-21499255[0] Reversible large-scale modification of cortical networks during neuroprosthetic control.
Other notes:
____References____
| |||||
{1169} |
ref: -0
tags: artificial intelligence projection episodic memory reinforcement learning
date: 08-15-2012 19:16 gmt
revision:0
[head]
|
||||
Projective simulation for artificial intelligence
| |||||
{696} |
ref: Jarosiewicz-2008.12
tags: Schwartz BMI learning perturbation
date: 03-07-2012 17:11 gmt
revision:2
[1] [0] [head]
|
||||
PMID-19047633[0] Functional network reorganization during learning in a brain-computer interface paradigm.
____References____
| |||||
{166} | |||||
PMID-19744484 What can man do without basal ganglia motor output? The effect of combined unilateral subthalamotomy and pallidotomy in a patient with Parkinson's disease.
| |||||
{58} | |||||
PMID-16271465 The basal ganglia: learning new tricks and loving it
| |||||
{1144} | |||||
PMID-15242667 Anatomical funneling, sparse connectivity and redundancy reduction in the neural networks of the basal ganglia
PMID-15233923 Coincident but distinct messages of midbrain dopamine and striatal tonically active neurons.
| |||||
{1076} | |||||
PMID-17017503[0] Synchronizing activity of basal ganglia and pathophysiology of Parkinson's disease.
____References____
| |||||
{1124} | |||||
PMID-20850966[0] Basal ganglia contributions to motor control: a vigorous tutor.
____References____
| |||||
{843} | |||||
PMID-19286561[0] Human Substantia Nigra Neurons Encode Unexpected Financial Rewards
____References____
| |||||
{1077} | |||||
PMID-17962524[0] Hold your horses: impulsivity, deep brain stimulation, and medication in parkinsonism.
____References____
| |||||
{165} |
ref: Lehericy-2005.08
tags: fMRI motor_learning basal_ganglia STN subthalamic
date: 01-25-2012 00:20 gmt
revision:2
[1] [0] [head]
|
||||
PMID-16107540[0] Distinct basal ganglia territories are engaged in early and advanced motor sequence learning
____References____
| |||||
{1084} | |||||
PMID-19416950[0] Reward-learning and the novelty-seeking personality: a between- and within-subjects study of the effects of dopamine agonists on young Parkinson's patients
____References____
| |||||
{1085} | |||||
PMID-21603228[0] Dopaminergic Balance between Reward Maximization and Policy Complexity.
____References____
| |||||
{255} |
ref: BarGad-2003.12
tags: information dimensionality reduction reinforcement learning basal_ganglia RDDR SNR globus pallidus
date: 01-16-2012 19:18 gmt
revision:3
[2] [1] [0] [head]
|
||||
PMID-15013228[] Information processing, dimensionality reduction, and reinforcement learning in the basal ganglia (2003)
____References____ | |||||
{911} | |||||
PMID-19621062 Emergence of a stable cortical map for neuroprosthetic control.
| |||||
{318} | |||||
PMID-14624244[0] Learning to control a brain-machine interface for reaching and grasping by primates.
____References____ | |||||
{904} | |||||
PMID-6769536[0] Operant control of precentral neurons: Control of modal interspike intervals
____References____
| |||||
{341} | |||||
PMID-4196269[0] Operantly conditioned patterns on precentral unit activity and correlated responses in adjacent cells and contralateral muscles
____References____ | |||||
{888} | |||||
Experiment: you have a key. You want that key to learn to control a BMI, but you do not want the BMI to learn how the key does things, as
Given this, I propose a very simple groupweight: one axis is controlled by the summed action of a certain population of neurons, the other by a second, disjoint, population; a third population serves as control. The task of the key is to figure out what does what: how does the firing of a given unit translate to movement (forward model). Then the task during actual behavior is to invert this: given movement end, what sequence of firings should be generated? I assume, for now, that the brain has inbuilt mechanisms for inverting models (not that it isn't incredibly interesting -- and I'll venture a guess that it's related to replay, perhaps backwards replay of events). This leaves us with the task of inferring the tool-model from behavior, a task that can be done now with our modern (though here-mentioned quite simple) machine learning algorithms. Specifically, it can be done through supervised learning: we know the input (neural firing rates) and the output (cursor motion), and need to learn the transform between them. I can think of many ways of doing this on a computer:
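One computer way of doing the supervised forward-model inference described above is plain least squares: simulate the proposed group weighting (two disjoint populations driving the two axes, plus a control population) and recover the mapping from firing rates to cursor motion. All names and sizes here are illustrative:

```python
import numpy as np

rng = np.random.default_rng(5)
rates = rng.poisson(5.0, size=(500, 30)).astype(float)  # 30 units' firing rates
M = np.zeros((30, 2))
M[:10, 0] = 0.2    # population 1 -> x axis
M[10:20, 1] = 0.2  # population 2 -> y axis (units 20-29 are controls)
cursor = rates @ M + rng.normal(0.0, 0.1, size=(500, 2))

# supervised fit: recover the tool-model from observed (rates, cursor) pairs
Mhat, *_ = np.linalg.lstsq(rates, cursor, rcond=None)
```

The recovered Mhat is the forward model; inverting it (here, another pseudoinverse) would give the firing pattern needed for a desired movement.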
{i need to think more about model-building, model inversion, and songbird learning?} | |||||
{902} | |||||
bibtex:Olson-2005 Evidence of a mechanism of neural adaptation in the closed loop control of directions
| |||||
{788} |
ref: -0
tags: reinforcement learning basis function policy specialization
date: 01-03-2012 02:37 gmt
revision:1
[0] [head]
|
||||
To read: | |||||
{623} | |||||
Reinforcement learning in the cortex (a web scour/crawl):
| |||||
{5} |
ref: bookmark-0
tags: machine_learning research_blog parallel_computing bayes active_learning information_theory reinforcement_learning
date: 12-31-2011 19:30 gmt
revision:3
[2] [1] [0] [head]
|
||||
hunch.net interesting posts:
| |||||
{612} | |||||
PMID-17187065[0] Separate neural substrates for skill learning and performance in the ventral and dorsal striatum.
____References____ | |||||
{69} | |||||
PMID-17057705 Long-term motor cortex plasticity induced by an electronic neural implant.
____References____ | |||||
{194} | |||||
PMID-9658025[0] Predictive reward signal of dopamine neurons.
____References____
| |||||
{135} | |||||
PMID-16212764[0] Incremental online learning in high dimensions ideas:
____References____
| |||||
{24} | |||||
PMID-15537672[0] On the Benefits of not Trying: Brain Activity and Connectivity Reflecting the Interactions of Explicit and Implicit Sequence Learning quote: "under certain circumstances, automatic learning may be attenuated by explicit memory processes": explicit attempts to learn a difficult sequence (compared to a control) produce a failure in implicit learning, and this failure is caused by the suppression of learning rather than of its expression. There is a deleterious effect of explicit search on implicit learning.
____References____
| |||||
{300} | |||||
Motor learning by field approximation.
____References____ | |||||
{323} |
ref: Loewenstein-2006.1
tags: reinforcement learning operant conditioning neural networks theory
date: 12-07-2011 03:36 gmt
revision:4
[3] [2] [1] [0] [head]
|
||||
PMID-17008410[0] Operant matching is a generic outcome of synaptic plasticity based on the covariance between reward and neural activity
____References____ | |||||
{699} |
ref: Harris-2008.03
tags: retroaxonal retrosynaptic Harris learning cortex backprop
date: 12-07-2011 02:34 gmt
revision:2
[1] [0] [head]
|
||||
PMID-18255165[0] Stability of the fittest: organizing learning through retroaxonal signals
____References____
| |||||
{702} | |||||
PMID-8670641[0] The hippocampo-neocortical dialogue.
| |||||
{723} |
ref: notes-0
tags: data effectiveness Norvig google statistics machine learning
date: 12-06-2011 07:15 gmt
revision:1
[0] [head]
|
||||
The unreasonable effectiveness of data.
| |||||
{906} | |||||
PMID-6772272 Operant control of precentral neurons: bilateral single unit conditioning.
| |||||
{914} | |||||
PMID-10681435 Cortical correlates of learning in monkey adapting to a new dynamical environment. | |||||
{903} | |||||
PMID-7140894 Short-term changes in cell activity of areas 4 and 5 during operant conditioning.
| |||||
{871} | |||||
http://www.autonlab.org/tutorials/ -- excellent; http://energyfirefox.blogspot.com/2010/12/data-mining-with-ubuntu.html -- apt-get! | |||||
{859} | |||||
Learning by Playing: Video Games in the Classroom
| |||||
{858} | |||||
Notes & responses to evolutionary psychologists John Tooby and Leda Cosmides' - authors of The Adapted Mind - essay in This Will Change Everything
| |||||
{838} |
ref: -0
tags: meta learning Artificial intelligence competent evolutionary programming Moshe Looks MOSES
date: 08-07-2010 16:30 gmt
revision:6
[5] [4] [3] [2] [1] [0] [head]
|
||||
| |||||
{815} | |||||
Jacques Pitrat seems to have many of the same ideas that I've had (only better, and he's implemented them!) -- A Step toward an Artificial Scientist
Artificial beings - his book. | |||||
{817} | |||||
My letter to a friend regarding images/817_1.pdf The free-energy principle: a unified brain theory? PMID-20068583 -- like all critics, I feel the world will benefit from my criticism ;-)

Hey, I did read that paper on the plane, and wrote down some comments, but haven't had a chance to actually send them until now. err..anyway.. might as well send them since I did bother writing stuff down:

I thought the paper was interesting, but rather specious, especially the way the author makes 'surprise' something to be minimized. This is blatantly false! Humans and other mammals (at least) like being surprised (in the normal meaning of the word). He says things like: "This is where free energy comes in: free energy is an upper bound on surprise, which means that if agents minimize free energy, they implicitly minimize surprise" -- a huge logical jump, and not one that I'm willing to accept. I feel like this author is trying to capitalize on some recent developments, like variational Bayes and ensemble learning, without fully understanding them or having the mathematical chops (like Hayen) to flesh them out. So far as I understand, large theories (as this proposes to be) are useful in that they permit derivation of particular update equations; variational Bayes, for example, takes the Kullback-Leibler divergence & a factorization of the posterior to create EM update equations. So, even if the free energy idea is valid, the author uses it at such a level as to make no useful, mathy predictions.

One area where I agree with him is that the nervous system creates an internal model of the world, for the purpose of prediction. Yes, maybe this allows 'surprise' to be minimized. But animals minimize surprise not because of free energy, but rather for the much more quotidian reason that surprise can be dangerous.

Finally, I wholly reject the idea that value and surprise can be equated, or are even similar. They seem orthogonal to me! Value is assigned to things that help an animal survive and multiply; surprise is what its nervous system does not expect. All these things make sense when cast against the theories of evolution and selection. Perhaps, perhaps selection is a consequence of decreasing free energy - this intuitively and somewhat amorphously/mystically makes sense (the aggregate consequence of life on earth is somehow order, harmony, and other 'goodstuff' (but this is an anthropocentric view)) - but if so the author should be able to make more coherent / mathematical predictions of observed phenomena, e.g. why animals locally violate the second law of thermodynamics.

Despite my critique, thanks for sending the article; it made me think. Maybe you don't want to read it now and I saved you some time ;-) | |||||
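For reference, the 'free energy is an upper bound on surprise' claim is just the standard variational bound; sketched here in the usual variational-Bayes notation (my notation, not the paper's), for an observation x and an approximate posterior q(z):

```latex
\begin{aligned}
F &= \mathbb{E}_{q(z)}\!\left[\log q(z) - \log p(x, z)\right] \\
  &= -\log p(x) + \mathrm{KL}\!\left(q(z) \,\|\, p(z \mid x)\right) \\
  &\ge -\log p(x),
\end{aligned}
```

since the KL divergence is nonnegative; $-\log p(x)$ is the surprisal. Minimizing $F$ over $q$ tightens the bound (the E-step), and minimizing over the model parameters is the M-step -- which is the sense in which variational Bayes 'creates EM update equations'.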
{796} | |||||
An interesting field in ML is nonlinear dimensionality reduction - data may appear to be in a high-dimensional space, but mostly lies along a nonlinear lower-dimensional subspace or manifold. (Linear subspaces are easily discovered with PCA or SVD(*)). Dimensionality reduction projects high-dimensional data into a low-dimensional space with minimum information loss -> maximal reconstruction accuracy; nonlinear dim reduction does this (surprise!) using nonlinear mappings. These techniques set out to find the manifold(s):
(*) SVD maps into 'concept space', an interesting interpretation as per Leskovec's lecture presentation. | |||||
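To make the linear case concrete, here's a small numpy sketch (the toy data, dimensions, and seeds are all my own illustrative choices): SVD on centered data recovers the latent subspace, and the singular values expose the intrinsic dimensionality.

```python
import numpy as np

rng = np.random.default_rng(0)
# 200 points that really live on a 2-D plane embedded in 5-D, plus a little noise
latent = rng.normal(size=(200, 2))
basis = np.linalg.qr(rng.normal(size=(5, 2)))[0].T   # orthonormal 2-D basis in 5-D
X = latent @ basis + 0.01 * rng.normal(size=(200, 5))

Xc = X - X.mean(axis=0)                 # center before taking the SVD
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

# Singular values reveal the intrinsic dimensionality: two large, three tiny
print(S)
Y = Xc @ Vt[:2].T                       # project into the 2-D 'concept space'
print(Y.shape)
```

Nonlinear methods (Isomap, LLE, etc.) are what you reach for when the tiny singular values don't appear, i.e. the manifold is curved.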
{795} |
ref: work-0
tags: machine learning reinforcement genetic algorithms
date: 10-26-2009 04:49 gmt
revision:1
[0] [head]
|
||||
I just had dinner with Jesse, and we had a good/productive discussion/brainstorm about algorithms, learning, and neurobio. Two things worth repeating, one simpler than the other:

1. Gradient descent / Newton-Raphson like techniques should be tried with genetic algorithms. As of my current understanding, genetic algorithms perform a semi-directed search, randomly exploring the space of solutions with natural selection exerting a pressure to improve. What if you took the partial derivative of each of the organism's genes, and used that to direct mutation, rather than random selection of the mutated element? What if you looked before mating and crossover? Seems like this would speed up the algorithm greatly (though it might get it stuck in local minima, too). Not sure if this has been done before - if it has, edit this to indicate where!

2. Most supervised machine learning algorithms seem to rely on one single, externally applied objective function which they then attempt to optimize. (Rather, this is what convex programming is. Unsupervised learning of course exists, like PCA, ICA, and other means of learning correlative structure.) There are a great many ways to do optimization, but all are exactly that - optimization, search through a space for some set of weights / set of rules / decision tree that maximizes or minimizes an objective function. What Jesse and I have arrived at is that there is no real utility function in the world (Corollary #1: life is not an optimization problem (**)) -- we generate these utility functions, just as we generate our own behavior. What would happen if an algorithm iteratively estimated, checked, and cross-validated its utility function based on the small rewards actually found in the world / its synthetic environment? Would we get generative behavior greater than the complexity of the inputs? (Jesse and I also had an in-depth talk about information generation / destruction in non-linear systems.)

Put another way, perhaps part of learning is to structure internal valuation / utility functions to set up reinforcement learning problems where the reinforcement signal comes according to satisfaction of sub-goals (= local utility functions). Or, the gradient signal comes by evaluating partial derivatives of actions wrt these sub-goals. Creating these goals is natural but not always easy, which is one reason (of very many!) why sports are so great - the utility function is clean, external, and immutable. The recursive, introspective creation of valuation / utility functions is what drives a lot of my internal monologues, mixed with a hefty dose of taking partial derivatives (see {780}) based on models of the world. (Stated this way, they seem so similar that perhaps they are the same thing?)

To my limited knowledge, there has been some recent work on the creation of sub-goals in reinforcement learning. One paper I read used a system to look for states that had a high ratio of ultimately rewarded paths to unrewarded paths, and selected these as subgoals (e.g. rewarded the agent when this state was reached). I'm not talking about these sorts of sub-goals. In those systems, there is an ultimate goal that the researcher wants the agent to achieve, and it is the algorithm's (or s') task to make a policy for generating/selecting behavior. Rather, I'm interested in even more unstructured tasks - make a utility function, and a behavioral policy, based on small continuous (possibly irrelevant?) rewards in the environment.

Why would I want to do this? The pet project I have in mind is a 'cognitive' PCB part placement / layout / routing algorithm to add to my pet project, kicadocaml, to finally get some people to use it (the attention economy :-). In the course of thinking about how to do this, I've realized that a substantial problem is simply determining what board layouts are good, and what are not. I have a rough aesthetic idea + some heuristics that I learned from my dad + some heuristics I've learned through practice of what is good layout and what is not - but how to code these up? And what if these aren't the best rules, anyway? If I just code up the rules I've internalized as utility functions, then the board layout will be pretty much as I do it - boring!

Well, I've stated my sub-goal in the form of a problem statement and some criteria to meet. Now, to go and search for a decent solution to it. (Have to keep this blog m8ta!) (Or, realistically, to go back and see if the problem statement is sensible.)

(**) Corollary #2 - There is no god. nod, Dawkins. | |||||
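Idea #1 above can be sketched in a few lines. This is a toy illustration only (the quadratic fitness function, population size, and rates are all arbitrary assumptions): finite-difference partial derivatives of fitness with respect to each gene direct the mutation, instead of mutating a random element.

```python
import numpy as np

def fitness(genes):
    # toy objective: best organism has every gene equal to 1.7
    return -np.sum((genes - 1.7) ** 2)

def directed_mutation(genes, rng, step=0.1, eps=1e-4):
    # estimate the partial derivative of fitness w.r.t. each gene...
    base = fitness(genes)
    grad = np.zeros_like(genes)
    for i in range(len(genes)):
        bumped = genes.copy()
        bumped[i] += eps
        grad[i] = (fitness(bumped) - base) / eps
    # ...and mutate along that gradient, plus a little noise to keep exploring
    return genes + step * grad + 0.01 * rng.normal(size=len(genes))

rng = np.random.default_rng(0)
pop = [rng.normal(size=4) for _ in range(8)]
for _ in range(100):
    pop.sort(key=fitness, reverse=True)          # selection pressure
    pop = pop[:4] + [directed_mutation(g, rng) for g in pop[:4]]

best = max(pop, key=fitness)
print(round(fitness(best), 3))
```

On a smooth unimodal objective like this, the directed mutations converge quickly; the worry noted above (getting stuck in local minima) would show up on a multimodal fitness landscape, where the random-mutation version keeps more diversity.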
{780} | |||||
A Self-learning Evolutionary Chess Program
| |||||
{792} | |||||
http://www.cs.cmu.edu/~wcohen/slipper/
| |||||
{787} | |||||
My theory on the Flynn effect - human intelligence IS increasing, and this is NOT stopping. Look at it from a ML perspective: there is more free time to get data, the data (and world) has almost unlimited complexity, the data is much higher quality and much easier to get (the vast internet & world! (travel)), and there is (hopefully) more fuel to process that data (food!). Therefore, we are getting more complex, sophisticated, and intelligent. Also, the idea that less-intelligent people having more kids will somehow 'dilute' our genetic IQ is bullshit - intelligence is mostly a product of environment and education, and is tailored to the tasks we need to do; it is not (or only very weakly, except at the extremes) tied to the wetware. Besides, things are changing far too fast for genetics to follow. Regarding social media like facebook and others, you could posit that social intelligence is increasing, along similar arguments to the above: social data is seemingly more prevalent, more available, and people spend more time examining it. Yet this feels to be a weaker argument, as people have always been socializing, talking, etc., and I'm not sure any of these social media have really increased that. Regardless, people enjoy it - that's the important part. My utopia for today :-) | |||||
{762} |
ref: work-0
tags: covariance matrix adaptation learning evolution continuous function normal gaussian statistics
date: 06-30-2009 15:07 gmt
revision:0
[head]
|
||||
http://www.lri.fr/~hansen/cmatutorial.pdf
| |||||
{761} | |||||
http://www.nytimes.com/2009/05/01/opinion/01brooks.html?_r=1 -- the 'modern view' of genius. Makes sense to me.
| |||||
{666} | |||||
PMID-15286181[0] Providing explicit information disrupts implicit motor learning after basal ganglia stroke.
____References____
| |||||
{715} |
ref: Legenstein-2008.1
tags: Maass STDP reinforcement learning biofeedback Fetz synapse
date: 04-09-2009 17:13 gmt
revision:5
[4] [3] [2] [1] [0] [head]
|
||||
PMID-18846203[0] A Learning Theory for Reward-Modulated Spike-Timing-Dependent Plasticity with Application to Biofeedback
____References____
| |||||
{706} | |||||
PMID-8987766[0] Functional Stages in the Formation of Human Long-Term Motor Memory
____References____
| |||||
{712} | |||||
PMID-19245368[0] The influence of learning on sleep slow oscillations and associated spindles and ripples in humans and rats
____References____
| |||||
{685} |
ref: BrashersKrug-1996.07
tags: motor learning sleep offline consolidation Bizzi Shadmehr
date: 03-24-2009 15:39 gmt
revision:1
[0] [head]
|
||||
PMID-8717039[0] Consolidation in human motor memory.
____References____
| |||||
{701} | |||||
PMID-18836440[0] Pharmacological REM sleep suppression paradoxically improves rather than impairs skill memory
____References____
| |||||
{707} | |||||
PMID-11691982[0] The Role of Sleep in Learning and Memory
____References____
| |||||
{297} | |||||
PMID-17182912[0] Skill Representation in the Primary Motor Cortex After Long-Term Practice
____References____ | |||||
{703} | |||||
PMID-17167082[0] Elevated sleep spindle density after learning or after retrieval in rats.
____References____
| |||||
{700} | |||||
PMID-11691983[0] Sleep, Learning, and Dreams: Off-line Memory Reprocessing
____References____
| |||||
{693} | |||||
PMID-16794848[9] Bilateral basal ganglia activation associated with sensorimotor adaptation.
| |||||
{695} | |||||
Alopex: A Correlation-Based Learning Algorithm for Feed-Forward and Recurrent Neural Networks (1994)
| |||||
{694} |
ref: Diedrichsen-2005.1
tags: Shadmehr error learning basal ganglia cerebellum motor cortex
date: 03-09-2009 19:26 gmt
revision:0
[head]
|
||||
PMID-16251440[0] Neural correlates of reach errors.
____References____ | |||||
{692} | |||||
PMID-17189946[0] Cortico-hippocampal interaction during up-down states and memory consolidation.
____References____ | |||||
{680} | |||||
PMID-17406665[0] Daytime naps, motor memory consolidation and regionally specific sleep spindles.
____References____ | |||||
{683} | |||||
PMID-14983183[0] Off-line replay maintains declarative memories in a model of hippocampal-neocortical interactions
____References____ | |||||
{686} | |||||
PMID-17855611 Motor Force Field Learning Influences Visual Processing of Target Motion
| |||||
{679} | |||||
PMID-18274267[0] Fast sleep spindle (13-15 hz) activity correlates with sleep-dependent improvement in visuomotor performance.
____References____ | |||||
{672} | |||||
PMID-18714787[0] Motor sequence learning increases sleep spindles and fast frequencies in post-training sleep.
____References____ | |||||
{671} | |||||
PMID-18951924[0] Consciousness and the consolidation of motor learning
____References____ | |||||
{651} | |||||
PMID-18482830[0] Reinforcement learning of motor skills with policy gradients
____References____ | |||||
{676} | |||||
PMID-18578851 Overconfidence in an objective anticipatory motor task.
| |||||
{674} |
ref: notes-0
tags: Barto Hierarchical Reinforcement Learning
date: 02-17-2009 05:38 gmt
revision:1
[0] [head]
|
||||
Recent Advances in Hierarchical Reinforcement Learning
| |||||
{673} |
ref: Vasilaki-2009.02
tags: associative learning prefrontal cortex model hebbian
date: 02-17-2009 03:37 gmt
revision:2
[1] [0] [head]
|
||||
PMID-19153762 Learning flexible sensori-motor mappings in a complex network.
| |||||
{669} | |||||
PMID-19191602 A New Hypothesis for Sleep: Tuning for Criticality.
| |||||
{653} | |||||
PMID-12371511[0] Dopamine: generalization and bonuses
____References____ | |||||
{652} |
ref: notes-0
tags: policy gradient reinforcement learning aibo walk optimization
date: 12-09-2008 17:46 gmt
revision:0
[head]
|
||||
Policy Gradient Reinforcement Learning for Fast Quadrupedal Locomotion
| |||||
{636} | |||||
PMID-9448252[0] The acquisition of skilled motor performance: Fast and slow experience-driven changes in primary motor cortex
____References____ | |||||
{635} | |||||
"Seeing" through the tongue: cross-modal plasticity in the congenitally blind
| |||||
{631} | |||||
PMID-16563737[0] The computational neurobiology of learning and reward
____References____ | |||||
{629} | |||||
PMID-11257908[0] Multiple Reward Signals in the Brain
____References____ | |||||
{628} | |||||
PMID-10731222[0] Reward processing in primate orbitofrontal cortex and basal ganglia
____References____ | |||||
{627} | |||||
PMID-9530495[0] Cortical plasticity: from synapses to maps
____References____ | |||||
{625} | |||||
PMID-8423485[0] Plasticity in the frequency representation of primary auditory cortex following discrimination training in adult owl monkeys
____References____ | |||||
{618} | |||||
PMID-11506661[0] Parallel cortico-basal ganglia mechanisms for acquisition and execution of visuomotor sequences - a computational approach.
____References____ | |||||
{617} | |||||
PMID-12015240[0] Central mechanisms of motor skill learning
____References____ | |||||
{126} | |||||
PMID-8091209[0] The basal ganglia and adaptive motor control (I couldn't find the pdf for this)
____References____ | |||||
{613} | |||||
PMID-12383782[0] Reward, motivation, and reinforcement learning.
____References____ | |||||
{236} | |||||
PMID-8985875 Neural information transferred from the putamen to the globus pallidus during learned movement in the monkey.
| |||||
{67} | |||||
PMID-16271465[] The basal ganglia: Learning new tricks and loving it
____References____ | |||||
{611} | |||||
PMID-18667540[0] Learning a novel myoelectric-controlled interface task.
____References____ | |||||
{609} |
ref: -0
tags: differential dynamic programming machine learning
date: 09-24-2008 23:39 gmt
revision:2
[1] [0] [head]
|
||||
| |||||
{289} | |||||
PMID-11395017[0] Neuronal correlates of motor performance and motor learning in the primary motor cortex of monkeys adapting to an external force field
____References____ | |||||
{608} | |||||
PMID-14511525 Probing changes in neural interaction during adaptation.
| |||||
{264} | |||||
PMID-15588812[0] Tools for the body schema See also PMID-8951846[1] Coding of modified body schema during tool use by macaque postcentral neurones. ____References____ | |||||
{329} | |||||
PMID-17234689[0] Volitional control of neural activity: implications for brain-computer interfaces (part of a symposium)
humm.. this paper came out a month ago, and despite the fact that he is much older and more experienced than i, we have arrived at the same conclusions by looking at the same set of data/papers. so: that's good, i guess. ____References____ | |||||
{442} | |||||
http://mirror.mricon.com/french/french.html -- "how i learned french in a year"
| |||||
{71} |
ref: Francis-2005.11
tags: Joe_Francis motor_learning reaching humans delay intertrial interval
date: 04-09-2007 22:48 gmt
revision:1
[0] [head]
|
||||
PMID-16132970[0] The Influence of the Inter-Reach-Interval on Motor Learning. Previous studies have demonstrated changes in motor memories with the passage of time on the order of hours. We sought to further this work by determining the influence that time on the order of seconds has on motor learning by changing the duration between successive reaches (inter-reach-interval IRI). Human subjects made reaching movements to visual targets while holding onto a robotic manipulandum that presented a viscous curl field. We tested four experimental groups that differed with respect to the IRI (0.5, 5, 10 or 20 sec). The 0.5 sec IRI group performed significantly worse with respect to a learning index than the other groups over the first set of 192 reaches. Each group demonstrated significant learning during the first set. There was no significant difference with respect to the learning index between the 5, 10 or 20 sec IRI groups. During the second and third set of 192 reaches the 0.5 sec IRI group's performance became indistinguishable from the other groups indicating that fatigue did not cause the initial poor performance and that with continued training the initial deficit in performance could be overcome. ____References____ | |||||
{129} | |||||
PMID-10607637[0] Internal models for motor control and trajectory planning
____References____ | |||||
{333} |
ref: BrashersKrug-1996.07
tags: consolidation motor learning Shadmehr Bizzi
date: 04-09-2007 14:35 gmt
revision:2
[1] [0] [head]
|
||||
PMID-8717039[0] Consolidation in human motor memory
____References____ | |||||
{245} |
ref: AnguianoRodrAguez-2007.02
tags: serotonin learning dopamine
date: 03-12-2007 02:30 gmt
revision:0
[head]
|
||||
PMID-17126827 Striatal serotonin depletion facilitates rat egocentric learning via dopamine modulation. facilitates - they get better! (more awake than controls? inability to forget?) | |||||
{244} | |||||
PMID-17216714 Motor and cognitive functions of the neostriatum during bilateral blocking of its dopamine receptors
| |||||
{197} | |||||
PMID-15151178[0] Sequential Rearrangements of the Ensemble Activity of Putamen Neurons in the Monkey Brain as a Correlate of Continuous Behavior
____References____ | |||||
{7} |
ref: bookmark-0
tags: book information_theory machine_learning bayes probability neural_networks mackay
date: 0-0-2007 0:0
revision:0
[head]
|
||||
http://www.inference.phy.cam.ac.uk/mackay/itila/book.html -- free! (but i liked the book, so I bought it :) | |||||
{22} |
ref: Brown-2001.11
tags: Huntingtons motor_learning intentional implicit cognitive deficits
date: 0-0-2007 0:0
revision:0
[head]
|
||||
PMID-11673321 http://brain.oxfordjournals.org/cgi/content/full/124/11/2188 :
| |||||
{29} | |||||
Iterative Linear Quadratic regulator design for nonlinear biological movement systems
| |||||
{37} |
ref: bookmark-0
tags: Unscented sigma_point kalman filter speech processing machine_learning SDRE control UKF
date: 0-0-2007 0:0
revision:0
[head]
|
||||
| |||||
{108} | |||||
http://www.berndporr.me.uk/iso3_sab/
| |||||
{109} | |||||
http://www.bcs.rochester.edu/people/alex/bcs547/readings/WolpertGhahr00.pdf
| |||||
{110} | |||||
Iso learning approximates a solution to the inverse controller problem in an unsupervised behavioral paradigm http://hardm.ath.cx/pdf/isolearning2002.pdf
| |||||
{140} | |||||
PMID-15649663 Composite adaptive control with locally weighted statistical learning.
| |||||
{151} | |||||
PMID-11741014 Computational approaches to motor control. Tamar Flash and Terry Sejnowski.
| |||||
{153} |
ref: Stefani-1995.09
tags: electrophysiology dopamine basal_ganglia motor learning
date: 0-0-2007 0:0
revision:0
[head]
|
||||
PMID-8539419 Electrophysiology of dopamine D-1 receptors in the basal ganglia: old facts and new perspectives.
| |||||
{8} |
ref: bookmark-0
tags: machine_learning algorithm meta_algorithm
date: 0-0-2006 0:0
revision:0
[head]
|
||||
Boost learning, or AdaBoost - the idea is to update the discrete distribution used in training any base algorithm so as to emphasize those points that were misclassified by the previous fit of the classifier. Sensitive to outliers, but resistant to overfitting. | |||||
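A minimal sketch of that reweighting loop, using decision stumps on toy 1-D data (the data, the stump learner, and all parameters are illustrative assumptions, not from any particular source):

```python
import numpy as np

rng = np.random.default_rng(1)
X = np.sort(rng.normal(size=200))
y = np.where(X > 0.3, 1, -1)
y[::10] *= -1                          # flip 10% of the labels: irreducible noise

def fit_stump(X, y, w):
    # choose the threshold and polarity minimizing the weighted error
    best = (np.inf, 0.0, 1)
    for t in X:
        for s in (1, -1):
            pred = s * np.where(X > t, 1, -1)
            err = w[pred != y].sum()
            if err < best[0]:
                best = (err, t, s)
    return best

w = np.full(len(X), 1.0 / len(X))      # the 'discrete distribution' over points
F = np.zeros(len(X))                   # running weighted vote of the stumps
for _ in range(5):
    err, t, s = fit_stump(X, y, w)
    alpha = 0.5 * np.log((1 - err) / max(err, 1e-10))  # this stump's vote weight
    pred = s * np.where(X > t, 1, -1)
    w = w * np.exp(-alpha * y * pred)  # misclassified points gain weight
    w = w / w.sum()
    F = F + alpha * pred

accuracy = np.mean(np.sign(F) == y)
print(accuracy)
```

The outlier sensitivity is visible directly: after a few rounds the flipped points carry a large share of the distribution, so a noisy point keeps dragging the later stumps toward itself.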
{20} |
ref: bookmark-0
tags: neural_networks machine_learning matlab toolbox supervised_learning PCA perceptron SOM EM
date: 0-0-2006 0:0
revision:0
[head]
|
||||
http://www.ncrg.aston.ac.uk/netlab/index.php n.b. kinda old. (or does that just mean well established?) | |||||
{36} | |||||
{40} |
ref: bookmark-0
tags: Bayes Baysian_networks probability probabalistic_networks Kalman ICA PCA HMM Dynamic_programming inference learning
date: 0-0-2006 0:0
revision:0
[head]
|
||||
http://www.cs.ubc.ca/~murphyk/Bayes/bnintro.html very, very good! many references, well explained too. | |||||
http://www.iovs.org/cgi/reprint/46/4/1322.pdf A related machine learning classifier, the relevance vector machine (RVM), has recently been introduced, which, unlike SVM, incorporates probabilistic output (probability of membership) through Bayesian inference. Its decision function depends on fewer input variables than SVM's, possibly allowing better classification for small data sets with high dimensionality.
| |||||
{61} |
ref: bookmark-0
tags: smith predictor motor control wolpert cerebellum machine_learning prediction
date: 0-0-2006 0:0
revision:0
[head]
|
||||
http://prism.bham.ac.uk/pdf_files/SmithPred_93.PDF
| |||||
{66} |
ref: bookmark-0
tags: machine_learning classification entropy information
date: 0-0-2006 0:0
revision:0
[head]
|
||||
http://iridia.ulb.ac.be/~lazy/ -- Lazy Learning. | |||||
{72} |
ref: abstract-0
tags: tlh24 error signals in the cortex and basal ganglia reinforcement_learning gradient_descent motor_learning
date: 0-0-2006 0:0
revision:0
[head]
|
||||
Title: Error signals in the cortex and basal ganglia. Abstract: Numerous studies have found correlations between measures of neural activity, from single unit recordings to aggregate measures such as EEG, to motor behavior. Two general themes have emerged from this research: neurons are generally broadly tuned and are often arrayed in spatial maps. It is hypothesized that these are two features of a larger hierarchical structure of spatial and temporal transforms that allow mappings to produce complex behaviors from abstract goals, or similarly, complex sensory information to produce simple percepts. Much theoretical work has proved the suitability of this organization to both generate behavior and extract relevant information from the world. It is generally agreed that most transforms enacted by the cortex and basal ganglia are learned rather than genetically encoded. Therefore, it is the characterization of the learning process that describes the computational nature of the brain; the descriptions of the basis functions themselves are more descriptive of the brain’s environment. Here we hypothesize that learning in the mammalian brain is a stochastic maximization of reward and transform predictability, and a minimization of transform complexity and latency. It is probable that the optimizations employed in learning include both components of gradient descent and competitive elimination, which are two large classes of algorithms explored extensively in the field of machine learning. The former method requires the existence of a vectoral error signal, while the latter is less restrictive, and requires at least a scalar evaluator. We will look for the existence of candidate error or evaluator signals in the cortex and basal ganglia during force-field learning where the motor error is task-relevant and explicitly provided to the subject. 
By simultaneously recording large populations of neurons from multiple brain areas we can probe the existence of error or evaluator signals by measuring the stochastic relationship and predictive ability of neural activity to the provided error signal. From this data we will also be able to track dependence of neural tuning trajectory on trial-by-trial success; if the cortex operates under minimization principles, then tuning change will have a temporal relationship to reward. The overarching goal of this research is to look for one aspect of motor learning – the error signal – with the hope of using this data to better understand the normal function of the cortex and basal ganglia, and how this normal function is related to the symptoms caused by disease and lesions of the brain. |