m8ta
{1408}
ref: -2018 tags: machine learning manifold deep neural net geometry regularization date: 08-29-2018 14:30 gmt

LDMNet: Low dimensional manifold regularized neural nets.

  • Synopsis of the math:
    • Fit a manifold to the concatenated input and output variables, and use this to set the loss of (and hence train) a deep convolutional neural network.
      • The manifold is fit via the point integral method.
      • This requires both SGD and variational steps -- alternate between fitting the parameters and fitting the manifold (see the sketch after this list).
      • Uses a standard deep neural network.
    • Measure the dimensionality of this manifold to regularize the network, using an 'elegant trick', whatever that means.
  • Still, the results, in terms of error, do not seem significantly better than previous work (compared against weight decay, which is weak sauce, and dropout).
    • That said, the results in terms of feature projection, figures 1 and 2, do look clearly better.
    • Of course, they apply the regularizer to the same image recognition / classification problems (MNIST), and it might well be better adapted to something else.
  • Not completely thorough analysis, perhaps due to space and deadlines.
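
A minimal sketch of the alternating scheme above, in Python/PyTorch. This is not the paper's point integral method: the manifold-fitting step is replaced by a crude stand-in -- a global PCA fit of the concatenated (input, feature) points -- and the regularizer simply pulls those points toward their low-dimensional projections. The tiny network, the fake data, and all hyperparameters are made up for illustration.

  import torch
  import torch.nn as nn

  torch.manual_seed(0)
  X = torch.randn(256, 20)                      # fake inputs
  y = (X[:, 0] > 0).long()                      # fake binary labels

  net = nn.Sequential(nn.Linear(20, 16), nn.ReLU(), nn.Linear(16, 2))
  feat = lambda x: net[1](net[0](x))            # hidden-layer "output" features
  opt = torch.optim.SGD(net.parameters(), lr=0.1)
  lam, d = 0.1, 3                               # penalty weight, target manifold dimension

  for outer in range(20):
      # "manifold" step: fit a d-dimensional subspace to the (input, feature) points
      with torch.no_grad():
          P = torch.cat([X, feat(X)], dim=1)
          mu = P.mean(0)
          V = torch.linalg.svd(P - mu, full_matrices=False).Vh[:d]  # top-d directions
          target = mu + (P - mu) @ V.T @ V                          # projections onto the subspace

      # SGD step: task loss plus a pull toward the low-dimensional projections
      for _ in range(10):
          opt.zero_grad()
          P = torch.cat([X, feat(X)], dim=1)
          loss = nn.functional.cross_entropy(net(X), y) + lam * ((P - target) ** 2).mean()
          loss.backward()
          opt.step()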

{1333}
ref: -0 tags: deep reinforcement learning date: 04-12-2016 17:19 gmt

Prioritized experience replay

  • In general, experience replay can reduce the amount of experience required to learn, and replace it with more computation and more memory – which are often cheaper resources than the RL agent’s interactions with its environment.
  • Transitions (between states) may be more or less
    • surprising (does the system in question have a model of the environment? It does have a model of the expected reward of state-action pairs, since it's Q-learning),
    • redundant, or
    • task-relevant
  • Some sundry neuroscience links:
    • Sequences associated with rewards appear to be replayed more frequently (Atherton et al., 2015; Ólafsdóttir et al., 2015; Foster & Wilson, 2006). Experiences with high magnitude TD error also appear to be replayed more often (Singer & Frank, 2009 PMID-20064396 ; McNamara et al., 2014).
  • They pose a useful example where the task is to learn (effectively) a random series of bits -- the 'Blind Cliffwalk'. By choosing the replayed experiences properly (via an oracle), you can get an exponential speedup in learning.
  • Prioritized replay introduces bias because it changes [the sampled state-action] distribution in an uncontrolled fashion, and therefore changes the solution that the estimates will converge to (even if the policy and state distribution are fixed). We can correct this bias by using importance-sampling (IS) weights.
    • These weights are (up to normalization) the inverse of the sampling priorities; the correction matters less at the beginning of training, when things are more stochastic, so the controlling exponent is annealed from a small initial value toward 1.
  • There are two ways of setting the priorities (see the sketch after this list):
    • Direct: proportional to the magnitude of the TD error encountered when visiting a transition.
    • Rank-based: transitions are kept in a data structure ordered by error and sampled with probability proportional to 1/rank.
  • Somewhat illuminating is how deep TD / Q-learning is unable to even scratch the surface of Tetris or Montezuma's Revenge.
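
A minimal Python sketch of the prioritized buffer described above, under simplifying assumptions: plain numpy arrays stand in for the sum-tree / sorted structure a real implementation would use for efficient sampling, and the hyperparameter values (alpha, beta, eps) are illustrative rather than taken from the paper.

  import numpy as np

  class PrioritizedReplay:
      def __init__(self, capacity, alpha=0.6, eps=1e-6):
          self.capacity, self.alpha, self.eps = capacity, alpha, eps
          self.data = []
          self.prio = np.zeros(capacity, dtype=np.float64)
          self.pos = 0

      def add(self, transition):
          # new transitions get the current max priority so they are replayed at least once
          p = self.prio[:len(self.data)].max() if self.data else 1.0
          if len(self.data) < self.capacity:
              self.data.append(transition)
          else:
              self.data[self.pos] = transition
          self.prio[self.pos] = p
          self.pos = (self.pos + 1) % self.capacity

      def sample(self, batch_size, beta=0.4):
          # proportional variant: P(i) ~ priority_i ** alpha (the rank-based variant uses 1/rank instead)
          p = self.prio[:len(self.data)] ** self.alpha
          probs = p / p.sum()
          idx = np.random.choice(len(self.data), batch_size, p=probs)
          # importance-sampling weights correct the sampling bias; beta is annealed toward 1
          w = (len(self.data) * probs[idx]) ** (-beta)
          w /= w.max()
          return [self.data[i] for i in idx], idx, w

      def update_priorities(self, idx, td_errors):
          # after each learning step, priority = |TD error| + eps
          self.prio[idx] = np.abs(td_errors) + self.eps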

{1269}
ref: -0 tags: hinton convolutional deep networks image recognition 2012 date: 01-11-2014 20:14 gmt

ImageNet Classification with Deep Convolutional Neural Networks

{1174}
ref: -0 tags: Hinton google tech talk dropout deep neural networks Boltzmann date: 11-09-2012 18:01 gmt

http://www.youtube.com/watch?v=DleXA5ADG78

  • Hinton believes in the power of crowds -- he thinks that the brain fits many, many different models to the data, then selects afterward.
    • Random forests, as used in Predator, are an example of this: they average many simple-to-fit, simple-to-run decision trees. (This is apparently what the Kinect does.)
  • The talk focuses on dropout, a clever new form of model averaging where only half of the units in the hidden layers are trained for a given example (see the sketch at the end of this entry).
    • He is inspired by biological evolution, where sexual reproduction constantly remixes genes, hence individual genes or small groups of linked genes must be self-sufficient. This equates to a 'rugged individualism' of units.
    • Likewise, dropout forces neurons to be robust to the loss of co-workers.
    • This is also great for parallelization: each unit or sub-network can be trained independently, on its own core, with little need for communication! Later, the units can be combined via genetic algorithms and then re-trained.
  • Hinton then observes that sending a real value p (output of logistic function) with probability 0.5 is the same as sending 0.5 with probability p. Hence, it makes sense to try pure binary neurons, like biological neurons in the brain.
    • Indeed, if you replace the backpropagation with single bit propagation, the resulting neural network is trained more slowly and needs to be bigger, but it generalizes better.
    • Neurons (allegedly) do something very similar to this by Poisson spiking. Hinton claims this is the right thing to do (rather than sending real numbers via precise spike timing) if you want to robustly fit models to data.
      • Sending stochastic spikes is a very good way to average over the large number of models fit to incoming data.
      • Yes but this really explains little in neuroscience...
  • Paper referred to in the intro: Livnat, Papadimitriou and Feldman, PMID-19073912, and later by the same authors, PMID-20080594.
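
A minimal numpy sketch of the dropout idea discussed above: at training time each hidden unit is kept with probability 0.5, so every example trains a different 'thinned' sub-network; at test time all units are used but their outputs are halved, approximating an average over all the thinned networks. The tiny two-layer net and the data are made up for illustration, and no training loop is shown.

  import numpy as np

  rng = np.random.default_rng(0)
  W1, b1 = rng.normal(size=(20, 50)) * 0.1, np.zeros(50)   # input -> hidden
  W2, b2 = rng.normal(size=(50, 1)) * 0.1, np.zeros(1)     # hidden -> output

  def forward(x, train=True, p_keep=0.5):
      h = np.maximum(x @ W1 + b1, 0.0)           # hidden layer (ReLU)
      if train:
          mask = rng.random(h.shape) < p_keep    # drop ~half the units, independently per example
          h = h * mask                           # a different thinned sub-network each call
      else:
          h = h * p_keep                         # rescale to match expected training-time activity
      return h @ W2 + b2

  x = rng.normal(size=(4, 20))
  print(forward(x, train=True))    # stochastic: one sampled sub-network
  print(forward(x, train=False))   # deterministic: approximate model average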