LDMNet: Low dimensional manifold regularized neural nets.
 Synopsis of the math:
 Fit a manifold formed from the concatenated input ‘’and’’ output variables, and use this set the loss of (hence, train) a deep convolutional neural network.
 Manifold is fit via point integral method.
 This requires both SGD and variational steps  alternate between fitting the parameters, and fitting the manifold.
 Uses a standard deep neural network.
 Measure the dimensionality of this manifold to regularize the network. Using a 'elegant trick', whatever that means.
 Still yet he results, in terms of error, seem not very significantly better than previous work (compared to weight decay, which is weak sauce, and dropout)
 That said, the results in terms of feature projection, figures 1 and 2, ‘’do’’ look clearly better.
 Of course, they apply the regularizer to same image recognition / classification problems (MNIST), and this might well be better adapted to something else.
 Not completely thorough analysis, perhaps due to space and deadlines.

An interesting field in ML is nonlinear dimensionality reduction  data may appear to be in a highdimensional space, but mostly lies along a nonlinear lowerdimensional subspace or manifold. (Linear subspaces are easily discovered with PCA or SVD(*)). Dimensionality reduction projects highdimensional data into a lowdimensional space with minimum information loss > maximal reconstruction accuracy; nonlinear dim reduction does this (surprise!) using nonlinear mappings. These techniques set out to find the manifold(s):
 Spectral Clustering
 Locally Linear Embedding
 related: The manifold ways of perception
 Would be interesting to run nonlinear dimensionality reduction algorithms on our data! What sort of space does the motor system inhabit? Would it help with prediction? Am quite sure people have looked at Kohonen maps for this purpose.
 Random irrelevant thought: I haven't been watching TV lately, but when I do, I find it difficult to recognize otherwise recognizable actors. In real life, I find no difficulty recognizing people, even some whom I don't know personally  is this a data thing (little training data), or mapping thing (not enough time training my TVnoteyes facial recognition).
 A Global Geometric Framework for Nonlinear Dimensionality Reduction method:
 map the points into a graph by connecting each point with a certain number of its neighbors or all neighbors within a certain radius.
 estimate geodesic distances between all points in the graph by finding the shortest graph connection distance
 use MDS (multidimensional scaling) to embed the original data into a smallerdimensional euclidean space while preserving as much of the original geometry.
 Doesn't look like a terribly fast algorithm!
(*) SVD maps into 'concept space', an interesting interpretation as per Leskovec's lecture presentation. 