ISO learning reflex inverse controller Porr 2007

ISO learning approximates a solution to the inverse controller problem in an unsupervised behavioral paradigm: http://hardm.ath.cx/pdf/isolearning2002.pdf

  1. the robot/actor (whatever) has a reflex that fires after the presentation of a reward.
  2. the ISO learning mechanism learns to expect its own reflex -> it can anticipate actions and react at an appropriate time.
    1. a fixed reflex loop prevents arbitrariness by defining an initial behavioral goal.
  3. ISO stands for isotropic sequence order. isotropic: all inputs are treated the same, and all can be used for learning.
  4. learning is proportional to the derivative of the output.
  • the central advantage of an (ideal) feed-forward controller is that it acts without the feedback-induced delay. The fatally damaging sluggishness of feedback systems makes this a highly desirable feature.
  • see figure 4 in the local paper. this basically looks like the cerebellum.. sorta. the controller takes a predictive signal and, with this prior information, is able to learn the correct response to the disturbance.
  • they also include an interesting comparison to Sutton & Barto's reinforcement learning:
    • in ISO learning, the weights stabilize if a particular input condition is achieved;
    • in reinforcement learning, the weights are stabilized when a certain output condition is reached.
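
A minimal sketch of the two points above (weight change proportional to the derivative of the output, and weights that stabilize on an input condition), not the paper's implementation. Assumptions for illustration only: the filter is a damped sinusoid with made-up constants `a` and `b`, the reflex weight is fixed at 1, the weight is held constant within a trial, and the closed loop is replaced by a flag `reflex_present` that says whether the reflex input still occurs. All function and parameter names are hypothetical.

```python
import math

def impulse_response(n, a=0.05, b=0.1):
    # Assumed band-pass-like filter: damped sinusoid, h(0) = 0.
    return [math.exp(-a * t) * math.sin(b * t) for t in range(n)]

def filtered_pulse(n, onset, h):
    # Filter response to a unit pulse at `onset` (zero before the pulse).
    return [h[t - onset] if t >= onset else 0.0 for t in range(n)]

def run_trial(rho1, reflex_present, mu=0.01, n=400, t_pred=50, delay=10):
    """One trial: predictive input at t_pred, reflex input `delay` steps later.

    Returns the updated weight of the predictive input.
    """
    h = impulse_response(n)
    u1 = filtered_pulse(n, t_pred, h)  # filtered predictive input
    if reflex_present:
        u0 = filtered_pulse(n, t_pred + delay, h)  # filtered reflex input
    else:
        u0 = [0.0] * n  # reflex input no longer occurs
    # Output: reflex pathway has fixed weight 1 (assumption), predictor has rho1.
    v = [u0[t] + rho1 * u1[t] for t in range(n)]
    # Weight change = mu * sum over time of (filtered input * d(output)/dt),
    # with the derivative taken as a central difference.
    delta = mu * sum(u1[t] * (v[t + 1] - v[t - 1]) / 2.0 for t in range(1, n - 1))
    return rho1 + delta

def train(rho1, n_trials, reflex_present):
    for _ in range(n_trials):
        rho1 = run_trial(rho1, reflex_present)
    return rho1

if __name__ == "__main__":
    w_learned = train(0.0, 20, reflex_present=True)
    w_later = train(w_learned, 20, reflex_present=False)
    print("after paired trials:", w_learned)
    print("after trials without reflex input:", w_later)
```

In this sketch the weight of the predictive input grows while the reflex input follows it, and stops changing once the reflex input is absent, because the product of the filtered input with its own derivative sums to roughly zero over a trial. That is the "input condition" sense of stabilization, shown open-loop; in the real setup the learned output itself is what removes the reflex input.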