m8ta

{5}
ref: bookmark-0 tags: machine_learning research_blog parallel_computing bayes active_learning information_theory reinforcement_learning date: 12-31-2011 19:30 gmt revision:3

hunch.net interesting posts:

  • debugging your brain - how to discover what you don't understand. a very intelligent viewpoint, worth rereading + the comments. look at the data, stupid
    • quote: how to represent the problem is perhaps even more important in research since human brains are not as adept as computers at shifting and using representations. Significant initial thought on how to represent a research problem is helpful. And when it’s not going well, changing representations can make a problem radically simpler.
  • automated labeling - great way to use a human 'oracle' to bootstrap us into good performance, esp. if the predictor can output a certainty value and hence ask the oracle all the 'tricky questions' (see the sketch after this list).
  • The design of an optimal research environment
    • Quote: Machine learning is a victim of it’s common success. It’s hard to develop a learning algorithm which is substantially better than others. This means that anyone wanting to implement spam filtering can do so. Patents are useless here—you can’t patent an entire field (and even if you could it wouldn’t work).
  • More recently: http://hunch.net/?p=2016
    • Problem is that online courses only imperfectly emulate the social environment of a college, which IMHO is useful for cultivating diligence.
  • The unrealized potential of the research lab - Quote: Muthu Muthukrishnan says “it’s the incentives”. In particular, people who invent something within a research lab have little personal incentive in seeing it’s potential realized so they fail to pursue it as vigorously as they might in a startup setting.
    • The motivation (money!) is just not there.
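As a concrete illustration of the automated-labeling idea above, here is a minimal uncertainty-sampling sketch. It is my own toy example, not code from the hunch.net post: the synthetic data, the logistic-regression predictor, and the query budget are all assumptions, and the human oracle is simulated by the known labels.

```python
# Toy active-learning loop (my sketch, not from the hunch.net post):
# the predictor's certainty value is used to ask a simulated human
# 'oracle' only the trickiest questions (p closest to 0.5).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
y_true = (X[:, 0] + X[:, 1] > 0).astype(int)   # stands in for the oracle

# seed with five labeled examples of each class
labeled = list(np.where(y_true == 0)[0][:5]) + list(np.where(y_true == 1)[0][:5])
unlabeled = [i for i in range(len(X)) if i not in labeled]

for it in range(15):
    clf = LogisticRegression().fit(X[labeled], y_true[labeled])
    proba = clf.predict_proba(X[unlabeled])[:, 1]
    # least-certain point = the 'tricky question' for the oracle
    i = unlabeled[int(np.argmin(np.abs(proba - 0.5)))]
    labeled.append(i)          # oracle answers; we pay for one label
    unlabeled.remove(i)
    print(it, "accuracy:", clf.score(X, y_true))
```

The point of the certainty value is that labels are spent where the decision boundary is most ambiguous, which typically reaches a given accuracy with far fewer oracle queries than labeling at random.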

{194}
ref: Schultz-1998.07 tags: dopamine reward reinforcement_learning review date: 12-07-2011 04:16 gmt revision:1

PMID-9658025[0] Predictive reward signal of dopamine neurons.

  • hot article.
  • reasons why midbrain DA is involved in reward: lesions, receptor blocking, electrical self-stimulation, and drugs of abuse.
  • DA neurons show phasic responses to both primary reward and reward-predicting stimuli.
  • All responses to rewards and reward-predicting stimuli depend on event predictability.
  • Just think of the MFB work with the rats... and how powerful it is.
  • most deficits following dopamine-depleting lesions are not easily explained by a defective reward signal (e.g. parkinson's, huntington's) -> implying that DA has two uses: the labeling of reward, and the tonic enabling of postsynaptic neurons.
    • I just anticipated this, which is good :)
    • It is still a mystery how the neurons in the midbrain determine when to fire - the pathways between reward and behavior must be very carefully segregated, otherwise we would be able to self-stimulate.
      • the pure expectation part of it is bound to play a part in this - if we know that a certain event will be rewarding, then the expectation will diminish DA release.
  • predictive eye movements ameliorate behavioral performance through advance focusing. (interesting)
  • predictions are used in industry:
    • Internal Model Control is used in industry to predict future system states before they actually occur. for example, the fly-by-wire technique in aviation makes decisions to do particular maneuvers based on predictable forthcoming states of the plane. (Like a human)
  • if you learn a reaction/reflex based on a conditioned stimulus, the presentation of that stimulus sets the internal state to one motivated to achieve the primary reward. there is a transfer back in time, which, generally, is what neural systems are for (see the TD sketch after this list).
  • animals avoid foods that fail to influence important plasma/brain parameters, for example foods lacking essential amino acids like histidine, threonine, or methionine. In the case of food, the appearance/structure would be used to predict the slower plasma effects, and hence influence motivation to eat it. (of course!)
  • midbrain groups:
    • A8 = dorsal to lateral substantia nigra
    • A9 = pars compacta of substantia nigra, SNc
    • A10 = VTA, medial to substantia nigra.
  • The characteristic polyphasic, relatively long impulses discharged at low frequencies make dopamine neurons easily distinguishable from other midbrain neurons.
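The 'transfer back in time' of the dopamine response maps directly onto temporal-difference learning. Below is a minimal TD(0) sketch (my illustration, not a model from the paper); the trial length, learning rate, and the frozen pre-cue state are assumptions made for the demo.

```python
# TD(0) toy (my illustration, not Schultz's): a cue at t=0 predicts reward
# at t=5. The prediction error delta - the analogue of phasic DA firing -
# starts out at the reward and ends up at the cue, while the fully
# predicted reward itself evokes no error.
import numpy as np

T, gamma, alpha = 6, 1.0, 0.2
V = np.zeros(T + 1)          # value of each time step; V[T] is terminal
for trial in range(300):
    deltas = []
    for t in range(T):
        r = 1.0 if t == T - 1 else 0.0        # reward on the last step
        delta = r + gamma * V[t + 1] - V[t]   # prediction error ~ phasic DA
        if t > 0:     # pre-cue state stays at 0: cue timing is unpredicted
            V[t] += alpha * delta
        deltas.append(delta)
    if trial in (0, 299):
        print("trial", trial, "delta per step:", np.round(deltas, 2))
```

Early trials show delta = 1 at the reward; late trials show delta = 1 at the cue and 0 at the now-predicted reward, matching the depend-on-predictability observation above.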

____References____

[0] Schultz W, Predictive reward signal of dopamine neurons. J Neurophysiol 80:1, 1-27 (1998 Jul)

{67}
ref: Graybiel-2005.12 tags: graybiel motor_learning reinforcement_learning basal_ganglia striatum thalamus cortex date: 10-03-2008 17:04 gmt revision:3

PMID-16271465[0] The basal ganglia: Learning new tricks and loving it

  • learning-related changes occur significantly earlier in the striatum than the cortex in a cue-reversal task. she says that this is because the basal ganglia instruct the cortex. I rather think that they select output dimensions from that variance-generator, the cortex.
  • dopamine agonist treatment improves learning with positive reinforcers but not learning with negative reinforcers.
  • there is a strong hyperdirect pathway that projects directly to the subthalamic nucleus from the motor cortex. this controls the output of the inhibitory pathway (GPi).
  • GABA input from the GPi to the thalamus can induce rebound spikes with precise timing. (the outputs are therefore not only inhibitory).
  • striatal neurons have up and down states. recommended action: simultaneous on-line recording of dopamine release and spike activity.
  • interesting generalization: cerebellum = supervised learning, striatum = reinforcement learning. and yet! the cerebellum has a strong disynaptic projection to the putamen. of course, there is a continuous gradient between fully-supervised and fully-reinforcement models (see the sketch below); the question is how to formulate both in a stable loop.
  • striosomal = striatum to the SNc
  • http://en.wikipedia.org/wiki/Substantia_nigra SNc is not a disorganized mass: the dopaminergic neurons of the pars compacta project to the striatum in a topographic map; dopaminergic neurons of the fringes (the lowest) go to the sensorimotor striatum, and the highest to the associative striatum.
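To make the supervised-vs-reinforcement contrast above concrete, here is a toy sketch (my own, not from Graybiel): the same linear mapping is learned (a) from a full vector error, cerebellar-style, and (b) from only a scalar evaluator via weight perturbation, a simple stand-in for reinforcement learning. The data, learning rates, and iteration counts are all assumptions.

```python
# Learn y = Wx two ways: (a) delta rule with a vector error ('supervised'),
# (b) weight perturbation scored by a scalar evaluator ('reinforcement').
import numpy as np

rng = np.random.default_rng(1)
W_true = rng.normal(size=(3, 5))
X = rng.normal(size=(200, 5))
Y = X @ W_true.T

def loss(W):                      # scalar evaluator: mean squared error
    return np.mean((X @ W.T - Y) ** 2)

# (a) supervised: exact gradient from the per-output error vector
Ws = np.zeros_like(W_true)
for _ in range(1000):
    err = X @ Ws.T - Y            # vector error signal
    Ws -= 0.05 * (err.T @ X) / len(X)

# (b) reinforcement: random perturbation, kept only if reward improves
Wr = np.zeros_like(W_true)
for _ in range(20000):
    noise = 0.05 * rng.normal(size=Wr.shape)
    if loss(Wr + noise) < loss(Wr):   # scalar feedback only
        Wr += noise

print("supervised loss:", loss(Ws), " reinforcement loss:", loss(Wr))
```

Both routes converge on the same mapping; the scalar-feedback route just needs far more trials, which is one way to read the continuous gradient between the two regimes.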

____References____

[0] Graybiel AM, The basal ganglia: learning new tricks and loving it. Curr Opin Neurobiol 15:6, 638-44 (2005 Dec)

{72}
ref: abstract-0 tags: tlh24 error signals in the cortex and basal ganglia reinforcement_learning gradient_descent motor_learning date: 0-0-2006 0:0 revision:0

Title: Error signals in the cortex and basal ganglia.

Abstract: Numerous studies have found correlations between measures of neural activity - from single unit recordings to aggregate measures such as EEG - and motor behavior. Two general themes have emerged from this research: neurons are generally broadly tuned and are often arrayed in spatial maps. It is hypothesized that these are two features of a larger hierarchical structure of spatial and temporal transforms that allow mappings to produce complex behaviors from abstract goals, or similarly, complex sensory information to produce simple percepts. Much theoretical work has demonstrated the suitability of this organization to both generate behavior and extract relevant information from the world. It is generally agreed that most transforms enacted by the cortex and basal ganglia are learned rather than genetically encoded. Therefore, it is the characterization of the learning process that describes the computational nature of the brain; the descriptions of the basis functions themselves are more descriptive of the brain’s environment. Here we hypothesize that learning in the mammalian brain is a stochastic maximization of reward and transform predictability, and a minimization of transform complexity and latency. It is probable that the optimizations employed in learning include both components of gradient descent and competitive elimination, two large classes of algorithms explored extensively in the field of machine learning. The former method requires the existence of a vectorial error signal, while the latter is less restrictive, requiring only a scalar evaluator. We will look for the existence of candidate error or evaluator signals in the cortex and basal ganglia during force-field learning where the motor error is task-relevant and explicitly provided to the subject. By simultaneously recording large populations of neurons from multiple brain areas we can probe the existence of error or evaluator signals by measuring the stochastic relationship and predictive ability of neural activity with respect to the provided error signal. From these data we will also be able to track the dependence of neural tuning trajectories on trial-by-trial success; if the cortex operates under minimization principles, then tuning change will have a temporal relationship to reward. The overarching goal of this research is to look for one aspect of motor learning – the error signal – with the hope of using this data to better understand the normal function of the cortex and basal ganglia, and how this normal function is related to the symptoms caused by disease and lesions of the brain.
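A hedged sketch of the proposed analysis (all names and parameters here are my own inventions for illustration, not part of the abstract): simulate a recorded population in which a few units carry the trial-by-trial motor error, then rank units by the predictive ability of their activity for that error.

```python
# Toy version of the proposed error-signal search: rank recorded units by
# how well their firing predicts the explicitly provided motor error.
import numpy as np

rng = np.random.default_rng(2)
n_trials, n_units = 400, 50
error = rng.normal(size=n_trials)              # task-relevant motor error
rates = rng.normal(size=(n_trials, n_units))   # baseline firing variability
rates[:, :5] += 0.8 * error[:, None]           # units 0-4 encode the error

# predictive ability: correlation of each unit's rate with the error
r = np.array([np.corrcoef(rates[:, u], error)[0, 1] for u in range(n_units)])
candidates = np.argsort(-np.abs(r))[:8]
print("candidate error-coding units:", candidates)
print("correlations:", np.round(r[candidates], 2))
```

On real data the same screening could use cross-validated regression rather than raw correlation, and the tuning-trajectory analysis would add a time axis; this sketch shows only the core ranking step.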