ref: tlh24-2011 tags: motor learning models BMI date: 01-06-2012 00:19 gmt

Experiment: you have a monkey. You want that monkey to learn to control a BMI, but you do not want the BMI to learn how the monkey does things, because:

  1. That is not applicable when you don't have training data -- amputees, paraplegics.
  2. It does not tell us much about motor learning, which is what we are interested in.

Given this, I propose a very simple group-weight scheme: one axis is controlled by the summed activity of one population of neurons, the other axis by a second, disjoint population; a third population serves as a control. The task of the monkey is to figure out what does what: how the firing of a given unit translates into movement (a forward model). The task during actual behavior is then to invert this: given a desired movement endpoint, what sequence of firings should be generated? I assume, for now, that the brain has inbuilt mechanisms for inverting models (not that this isn't incredibly interesting -- I'll venture a guess that it's related to replay, perhaps backwards replay of events). This leaves us with the task of inferring the tool model from behavior, a task that can be done now with modern (though, as considered here, quite simple) machine-learning algorithms. Specifically, it can be done through supervised learning: we know the input (neural firing rates) and the output (cursor motion), and need to learn the transform between them. I can think of many ways of doing this on a computer:

  1. Linear regression -- This is obvious given the problem statement and the knowledge that the model is inherently linear and separable (no multiplicative interactions between the input vectors). In Matlab you'd just use mldivide (the backslash operator) -- but! this requires storing all behavior to date. Does the brain do this? I doubt it, but for a linear BMI this model is optimal. (You could extend it to be Bayesian if you want confidence intervals -- but that won't make it faster.) See the first sketch after this list.
  2. Gradient descent -- During online performance, you (or the brain) adjust the weight estimates per neuron to minimize the error between observed and estimated behavior (the estimated behavior constituting a forward model). This is just LMS; it works, but has exponential convergence and may get stuck in local minima. This model makes predictions about which neurons change their relevance to the behavior (become more needed for acquiring reward) based on continuous-time updates. See the second sketch after this list.
  3. Batched gradient descent -- Hypothetically, one could speed up learning by running batches of data through a gradient-descent algorithm multiple times. The brain could very well do this offline (during sleep), and we can observe this. Such a mechanism would improve performance after sleep, which has been observed behaviorally in people (and primates?).
  4. Gated gradient descent -- This is halfway between reinforcement learning and gradient descent: the brain only updates the weights when something of motivational / sensory salience occurs, e.g. juice reward. It differs from raw reinforcement learning in that there is still a multiplication between sensory and motor data plus a subsequent derivative.
  5. Reinforcement learning -- Neurons are 'rewarded' at the instant juice is delivered; they adjust their behavior based on the behavioral context (a target), which presumably (given how long we train our monkeys) is present in the brain at the same time the cursor enters the target. Sensory data and model-building are largely absent.
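
To make this concrete, here is a minimal sketch (my own illustration; the population sizes, firing statistics, and noise level are all assumptions, not anything measured): it simulates the group-weight decoder described above -- two disjoint populations whose summed firing drives the x and y cursor axes, plus an unconnected control population -- and recovers the forward model by plain least squares, the numpy equivalent of Matlab's mldivide (option 1).

    import numpy as np

    rng = np.random.default_rng(0)

    n_x, n_y, n_ctl = 10, 10, 10            # units driving x, units driving y, controls
    n_units = n_x + n_y + n_ctl
    T = 2000                                # observed time bins

    # True (hidden) decoder: x-velocity is the summed rate of the first population,
    # y-velocity the summed rate of the second; the control population does nothing.
    W_true = np.zeros((n_units, 2))
    W_true[:n_x, 0] = 1.0
    W_true[n_x:n_x + n_y, 1] = 1.0

    rates = rng.poisson(5.0, size=(T, n_units)).astype(float)     # firing rates per bin
    cursor = rates @ W_true + rng.normal(0.0, 0.5, size=(T, 2))   # observed cursor motion

    # Option 1: linear regression over all behavior to date (optimal for this linear
    # decoder, but requires storing everything seen so far).
    W_ls, *_ = np.linalg.lstsq(rates, cursor, rcond=None)
    print("max weight error after least squares:", np.abs(W_ls - W_true).max())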

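The iterative options (2-5) can all be written as variants of a single update rule; below is a purely illustrative sketch, assuming the simulated rates and cursor from the snippet above. The learning rate and the reward gate are my assumptions. One LMS step gives option 2, replaying the data offline gives option 3 (the 'sleep' batch), and restricting updates to reward-marked bins gives the gated / reinforcement flavors of options 4 and 5.

    import numpy as np

    def lms_step(W, r, v_obs, lr=1e-4):
        """One LMS / gradient-descent update: nudge the forward-model estimate W
        toward the observed cursor motion (option 2)."""
        err = v_obs - r @ W               # observed minus predicted motion
        return W + lr * np.outer(r, err)  # descend the squared prediction error

    def run_online(rates, cursor, W0, lr=1e-4, gate=None):
        """Stream through the data once. With gate=None every bin updates (option 2);
        with a boolean gate only 'salient' bins, e.g. juice reward, update (options 4-5)."""
        W = W0.copy()
        for t in range(len(rates)):
            if gate is None or gate[t]:
                W = lms_step(W, rates[t], cursor[t], lr)
        return W

    def run_batched(rates, cursor, W0, lr=1e-4, passes=10):
        """Offline 'sleep' replay (option 3): run the same data through LMS several times."""
        W = W0.copy()
        for _ in range(passes):
            W = run_online(rates, cursor, W, lr)
        return W

For example, run_online(rates, cursor, np.zeros((rates.shape[1], 2))) plays the role of option 2, run_batched(...) of option 3, and passing gate=(rng.random(len(rates)) < 0.1) as a crude stand-in for reward delivery gives options 4 and 5.
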
{I need to think more about model-building, model inversion, and songbird learning?}