m8ta
{674}
ref: notes-0 tags: Barto Hierarchal Reinforcement Learning date: 02-17-2009 05:38 gmt revision:1 [0] [head]

Recent Advances in Hierarchical Reinforcement Learning

  • RL with good function-approximation methods for evaluating the value function or policy function solves many problems, yet...
  • RL is bedeviled by the curse of dimensionality: the number of parameters grows exponentially with the size of a compact encoding of state.
  • Recent research has tackled the problem by exploiting temporal abstraction: decisions are not required at each step; instead, each decision invokes a temporally extended sub-policy. This is somewhat similar to a macro or subroutine in programming.
  • This is fundamentally similar to adding detailed domain-specific knowledge to the controller / policy.
  • Ron Parr seems to have made significant advances in this field with 'hierarchies of abstract machines'.
    • I'm still looking for a cognitive (predictive) extension to these RL methods ... these all are about extension through programmer knowledge.
  • They also talk about concurrent RL, where agents can pursue multiple actions (or options) at the same time, and assess the value of each upon completion.
  • Next are partially observable Markov decision processes (POMDPs), where you have to estimate the present state (belief state) as well as a policy. It is known that an optimal solution to this task is intractable. They propose using hierarchical suffix memory as a solution; I can't really see what these are about.
    • It is also possible to attack the problem using hierarchical POMDPs, which break the task into higher and lower level 'tasks'. Little mention is given to the even harder problem of breaking sequences up into tasks.
  • Good review altogether, reasonable balance between depth and length.
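
To make the temporal-abstraction point concrete for myself, here is a minimal sketch of a temporally extended sub-policy (an 'option') treated as one decision. Everything in it is an assumption for illustration and not taken from the review: the 1-D corridor environment, the names `Option`, `step`, `run_option`, `go_to_door` and `go_to_goal`, and the discount of 0.9.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class Option:
    """A temporally extended action: a sub-policy plus a termination test."""
    policy: Callable[[int], int]       # state -> primitive action (-1 or +1)
    terminates: Callable[[int], bool]  # state -> True when the option ends


def step(state: int, action: int, goal: int = 10) -> Tuple[int, float]:
    """Hypothetical corridor with states 0..goal; reward 1 on first reaching goal."""
    nxt = max(0, min(goal, state + action))
    reward = 1.0 if (nxt == goal and state != goal) else 0.0
    return nxt, reward


def run_option(state: int, option: Option, gamma: float = 0.9) -> Tuple[int, float, int]:
    """Run an option until it terminates.

    Returns (final state, discounted reward collected, number of primitive steps).
    The option always takes at least one primitive step.
    """
    total, discount, k = 0.0, 1.0, 0
    while True:
        state, r = step(state, option.policy(state))
        total += discount * r
        discount *= gamma
        k += 1
        if option.terminates(state):
            return state, total, k


# Two hand-written options: the 'programmer knowledge' the notes mention.
go_to_door = Option(policy=lambda s: 1 if s < 5 else -1, terminates=lambda s: s == 5)
go_to_goal = Option(policy=lambda s: 1, terminates=lambda s: s == 10)

# The high-level controller makes two decisions instead of ten.
s, r1, k1 = run_option(0, go_to_door)
s, r2, k2 = run_option(s, go_to_goal)
```

The high-level policy only sees the triple returned by `run_option`, so it chooses among a handful of options rather than at every primitive step. Note that the options themselves are still written by hand here, which is exactly the limitation noted above.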
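
For the POMDP bullet, the belief state is just a probability distribution over hidden states, updated after each observation with Bayes' rule: b'(s') ∝ O(o | s') Σ_s T(s' | s) b(s) for a fixed action. The sketch below is my own illustration under assumed numbers (a two-state problem with a noisy sensor that is right 85% of the time); the function name `belief_update` and the matrices `T` and `O` are hypothetical and not from the review.

```python
from typing import List


def belief_update(belief: List[float],
                  T: List[List[float]],
                  O: List[List[float]],
                  obs: int) -> List[float]:
    """One Bayes-filter step for a fixed action.

    T[s][s2] is the probability of moving from state s to s2.
    O[s2][obs] is the probability of seeing obs when in state s2.
    """
    n = len(belief)
    # Prediction: push the current belief through the transition model.
    predicted = [sum(T[s][s2] * belief[s] for s in range(n)) for s2 in range(n)]
    # Correction: weight each state by how well it explains the observation.
    unnorm = [O[s2][obs] * predicted[s2] for s2 in range(n)]
    z = sum(unnorm)
    if z == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return [u / z for u in unnorm]


# Assumed toy problem: the state never changes, the sensor is 85% accurate.
T = [[1.0, 0.0],
     [0.0, 1.0]]
O = [[0.85, 0.15],
     [0.15, 0.85]]

b0 = [0.5, 0.5]
b1 = belief_update(b0, T, O, obs=0)  # belief shifts toward state 0
b2 = belief_update(b1, T, O, obs=0)  # a second matching observation sharpens it
```

Tracking the belief is the cheap part; the intractability mentioned above comes from planning over the continuous space of such beliefs.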