m8ta

{1545}
ref: -1988 tags: Linsker infomax linear neural network hebbian learning unsupervised date: 08-03-2021 06:12 gmt revision:2 [1] [0] [head]

Self-organization in a perceptual network

  • Ralph Linsker, 1988.
  • One of the first (verbose, slightly diffuse) investigations of the properties of linear projection neurons (e.g. dot-product; no non-linearity) to express useful tuning functions.
  • 'Useful' here means information-preserving, in the face of noise or dimensional bottlenecks (like PCA).
  • Starts with Hebbian learning functions, and shows that with this + white-noise sensory input + some local topology, you can get simple and complex visual cell responses.
    • Ralph notes that neurons in primate visual cortex are tuned in utero -- prior to real-world visual experience! Wow. (Who did these studies?)
    • This is a very minimalistic starting point; there isn't even structured stimuli (!)
    • Single neuron (and later, multiple neurons) are purely feed-forward; author cautions that a lack of feedback is not biologically realistic.
      • Also note that this was back in the Motorola 680x0 days ... computers were not that powerful (but certainly could handle more than 1-2 neurons!)
  • Linear algebra shows that Hebbian synapses cause a linear layer to learn the covariance function of their inputs, $Q$, with no dependence on the actual layer activity (a numerical sketch follows this list).
  • When looked at in terms of an energy function, this is equivalent to gradient descent to maximize the layer-output variance.
  • He also hits on:
    • Hopfield networks,
    • PCA,
    • Oja's constrained Hebbian rule $\delta w_i \propto \langle L_2(L_1 - L_2 w_i) \rangle$ (that is, a quadratic constraint on the weights to keep $\Sigma w^2 \sim 1$)
    • Optimal linear reconstruction in the presence of noise
    • Mutual information between layer input and output (I found this to be a bit hand-wavey)
      • Yet he notes critically: "but it is not true that maximum information rate and maximum activity variance coincide when the probability distribution of signals is arbitrary".
        • Indeed. The world is characterized by very non-Gaussian structured sensory stimuli.
    • Redundancy and diversity in 2-neuron coding model.
    • Role of infomax in maximizing the determinant of the weight matrix, sorta.
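
To make the covariance point concrete, here is a minimal numpy sketch (mine, not Linsker's simulation): a single linear neuron trained with Oja's rule on inputs drawn through a fixed random mixing matrix. The weight vector converges toward the leading eigenvector of the input covariance, i.e. the direction of maximal output variance. Sizes and learning rate are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

# Correlated inputs: white noise passed through a fixed mixing matrix.
n_in, n_samples = 8, 20000
mix = rng.normal(size=(n_in, n_in))
X = rng.normal(size=(n_samples, n_in)) @ mix.T
C = np.cov(X, rowvar=False)                  # the input covariance, Q

# Oja's rule: dw ~ y*(x - y*w): the Hebbian term y*x plus a quadratic
# decay that keeps sum(w^2) ~ 1.
w = 0.1 * rng.normal(size=n_in)
eta = 1e-3
for x in X:
    y = w @ x                                # linear (dot-product) neuron
    w += eta * y * (x - y * w)

# The learned weight vector aligns with the top eigenvector of Q,
# i.e. the direction of maximal output variance.
evals, evecs = np.linalg.eigh(C)
v_top = evecs[:, -1]
print("cosine with top eigenvector:", abs(w @ v_top) / np.linalg.norm(w))
```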

One may critically challenge the infomax idea: we very much need to (and do) throw away spurious or irrelevant information in our sensory streams; what upper layers 'care about' when making decisions is certainly relevant to the lower layers. This credit-assignment is neatly solved by backprop, and there are a number of 'biologically plausible' means of performing it, but both this and infomax are maybe avoiding the problem. What might the upper layers really care about? Likely 'care about' is an emergent property of the interacting local learning rules and network structure. Can you search directly in these domains, within biological limits, and motivated by statistical reality, to find unsupervised-learning networks?

You'll still need a way to rank the networks, hence an objective 'care about' function. Sigh. Either way, I don't per se put a lot of weight in the infomax principle. It could be useful, but is only part of the story. Otherwise Linsker's discussion is accessible, lucid, and prescient.

Lol.

{1543}
ref: -2019 tags: backprop neural networks deep learning coordinate descent alternating minimization date: 07-21-2021 03:07 gmt revision:1 [0] [head]

Beyond Backprop: Online Alternating Minimization with Auxiliary Variables

  • This paper is sort-of interesting: rather than back-propagating the errors, you optimize auxiliary variables, pre-nonlinearity 'codes', in a last-to-first layer order (a toy sketch follows this list). The optimization is done to minimize a multinomial logistic loss function; the math is not worked out for other loss functions, but presumably this is not a fundamental limitation. The loss function also includes a quadratic term on the weights.
  • After the 'codes' are set, optimization can proceed in parallel on the weights. This is done with either straight SGD or adaptive ADAM.
  • Weight L2 penalty is scheduled over time.
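
A toy sketch (mine; not the paper's exact objective, which uses a multinomial logistic loss and online memory variables) of the alternating-minimization idea for a two-layer net: introduce pre-nonlinearity 'codes' for each layer, update the codes last-to-first with the weights frozen, then fit each layer's weights independently given the codes. The coupling weight mu, learning rate, and sizes are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
relu = lambda z: np.maximum(z, 0.0)

# Toy regression problem.
X = rng.normal(size=(256, 10))
Y = np.sin(X @ rng.normal(size=(10, 1)))

W1 = 0.1 * rng.normal(size=(10, 32))
W2 = 0.1 * rng.normal(size=(32, 1))
mu, lr = 1.0, 1e-2

for epoch in range(100):
    # Auxiliary 'codes' = pre-nonlinearity activations, initialized by a forward pass.
    a1 = X @ W1
    a2 = relu(a1) @ W2

    # 1) Optimize the codes, last layer to first, with weights frozen.
    for _ in range(20):
        r2 = a2 - relu(a1) @ W2                      # layer-2 reconstruction residual
        a2 -= lr * ((a2 - Y) + mu * r2)
        r2 = a2 - relu(a1) @ W2
        a1 -= lr * (mu * (a1 - X @ W1) - mu * (r2 @ W2.T) * (a1 > 0))

    # 2) Weight updates decouple across layers given the codes (parallelizable);
    #    here each is just a least-squares fit to its layer's code.
    W1 = np.linalg.lstsq(X, a1, rcond=None)[0]
    W2 = np.linalg.lstsq(relu(a1), a2, rcond=None)[0]

print("final MSE:", float(np.mean((relu(X @ W1) @ W2 - Y) ** 2)))
```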

This is interesting in that the weight updates can be done in parallel - perhaps more efficient - but you are still propagating errors backward, albeit via optimizing 'codes'. Given the vast infrastructure devoted to auto-diff + backprop, I can't see this being adopted broadly.

That said, the idea of alternating minimization (which is used eg for EM clustering) is powerful, and this paper does describe (though I didn't read it) how there are guarantees on the convexity of the alternating minimization. Likewise, the authors show how to improve the performance of the online / minibatch algorithm by keeping around memory variables, in the form of covariance matrices.

{1534}
ref: -2020 tags: current opinion in neurobiology Kriegeskorte review article deep learning neural nets circles date: 02-23-2021 17:40 gmt revision:2 [1] [0] [head]

Going in circles is the way forward: the role of recurrence in visual inference

I think the best part of this article is the references -- a nicely complete listing of, well, the current opinion in Neurobiology! (Note that this issue is edited by our own Karel Svoboda, hence there are a good number of Janelians in the author list..)

The gestalt of the review is that deep neural networks need to be recurrent, not purely feed-forward. This results in savings in overall network size, and an increase in the achievable computational complexity, perhaps via the incorporation of priors and temporal-spatial information. All this again makes perfect sense and matches my sense of prevailing opinion. Of course, we are left wanting more: all this recurrence ought to be structured in some way.

To me, a rather naive way of thinking about it is that feed-forward layers cause weak activations, which are 'amplified' or 'selected for' in downstream neurons. These neurons proximally code for 'causes' or local reasons, based on the supported hypothesis that the brain has a good temporal-spatial model of the visuo-motor world. The causes then can either explain away the visual input, leading to balanced E-I, or fail to explain it, in which case the excess activity is either rectified by engaging more circuits or by engaging synaptic plasticity.

A critical part of this hypothesis is some degree of binding / disentanglement / spatio-temporal re-assignment. While not all models of computation require registers / variables -- RNNs are Turing-complete, e.g. -- I remain stuck on the idea that, to explain phenomenological experience and practical cognition, the brain must have some means of 'binding'. A reasonable place to look is the apical tuft dendrites, which are capable of storing temporary state (calcium spikes, NMDA spikes), undergo rapid synaptic plasticity, and are so dense that they can reasonably store the outer-product space of binding.

There is mounting evidence for apical tufts working independently / in parallel from investigations of high-gamma in ECoG: PMID-32851172 Dissociation of broadband high-frequency activity and neuronal firing in the neocortex. "High gamma" shows little correlation with MUA when you differentiate early-deep and late-superficial responses, "consistent with the view it reflects dendritic processing separable from local neuronal firing".

{1530}
ref: -2017 tags: deep neuroevolution jeff clune Uber genetic algorithms date: 02-18-2021 18:27 gmt revision:1 [0] [head]

Deep Neuroevolution: genetic algorithms are a competitive alternative for training deep neural networks for reinforcement learning -- Uber AI labs; Jeff Clune.

  • In this paper, they used a (fairly generic) genetic algorithm to tune the weights of a relatively large (4M parameters) convolutional neural net to play 13 atari games. 
  • The GA used truncation selection, population of ~ 1k individuals, no crossover, and gaussian mutation.
  • To speed up and streamline this algo, they encoded the weights not directly but as an initialization seed to the RNG (log2 of the number of parameters, approximately), plus seeds to generate the per-generation mutation (~ 28 bits). This substantially decreased the required storage space and communication costs when running the GA in parallel on their cluster; they only had to transmit the rng seed sequence (see the seed-chain sketch after this list).
  • Quite surprisingly, the GA was good at typically 'hard' games like frostbite and skiing, whereas it fared poorly on games like atlantis (which is a fixed-gun shooter game) and assault
  • Performance was compared to Deep-Q-networks (DQN), Evolutionary search (which used stochastic gradient approximates), Asynchronous Advantage Actor-critic (A3C), and random search (RS)
  • They surmise that some games were thought to be hard, but are actually fairly easy, albeit with many local minima. This is why search around the origin (near the initialization of the networks, which was via the Xavier method) is sufficient to solve the tasks.
  • Also noted that frequently the GA would find individuals with good performance in ~10 generations, further supporting the point above. 
  • The GA provide very consistent performance across the entirety of a trial, which, they suggest, may offer a cleaner signal to selection as to the quality of each of the individuals (debatable!).
  • Of course, for some tasks, the GA fails woefully; it was not able to quickly learn to control a humanoid robot, which involves mapping a ~370-dimensional vector into ~17 joint torques.  Evolutionary search was able to perform this task, which is not surprising as the gradient here should be smooth.
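
A minimal reconstruction (mine; the parameter count, mutation sigma, and seed width are placeholders) of the seed-chain encoding: an individual is stored as a list of RNG seeds, and its full parameter vector is recovered by replaying the seeded initialization plus the seeded Gaussian mutations, so only a handful of integers ever needs to be stored or transmitted.

```python
import numpy as np

N_PARAMS = 4_000_000     # e.g. a ~4M-parameter conv net, flattened
SIGMA = 0.005            # mutation standard deviation

def decode(seeds):
    """Rebuild a parameter vector from its seed chain: first seed -> init,
    remaining seeds -> successive Gaussian mutations."""
    theta = 0.05 * np.random.default_rng(seeds[0]).normal(size=N_PARAMS)
    for s in seeds[1:]:
        theta += SIGMA * np.random.default_rng(s).normal(size=N_PARAMS)
    return theta

def mutate(seeds, rng):
    """Child = parent's seed chain plus one fresh ~28-bit mutation seed."""
    return seeds + [int(rng.integers(2**28))]

# Example: a lineage 10 generations deep is fully described by 11 integers.
rng = np.random.default_rng(0)
individual = [int(rng.integers(2**28))]
for _ in range(10):
    individual = mutate(individual, rng)
print(len(individual), "seeds describe", N_PARAMS, "parameters")
theta = decode(individual)
```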

The result is indeed surprising, but it also feels lazy -- the total effort or information that they put into writing the actual algorithm is small; as mentioned in the introduction, this is a case of old algorithms with modern levels of compute.  Analogously, compare Go-Explore, also by Uber AI labs, vs Agent57 by DeepMind; the Agent57 paper blithely dismisses the otherwise breathless Go-Explore result as feature engineering and unrealistic free backtracking / game-resetting (which is true..) It's strange that they did not incorporate crossover aka recombination, as David MacKay clearly shows that recombination allows for much higher mutation rates and much better transmission of information through a population.  (Chapter 'Why have sex').  They also perhaps more reasonably omit developmental encoding, where network weights are tied or controlled through development, again in an analogy to biology. 

A better solution, as they point out, would be some sort of hybrid GA / ES / A3C system which used both gradient-based tuning, random stochastic gradient-based exploration, and straight genetic optimization, possibly all in parallel, with global selection as the umbrella.  They mention this, but to my current knowledge this has not been done. 

{1522}
ref: -2017 tags: schema networks reinforcement learning atari breakout vicarious date: 09-29-2020 02:32 gmt revision:2 [1] [0] [head]

Schema networks: zero-shot transfer with a generative causal model of intuitive physics

  • Like a lot of papers, the title has more flash than the actual results.
  • Results which would be state of the art (as of 2017) in playing Atari breakout, then transferring performance to modifications of the game (paddle moved up a bit, wall added in the middle of the bricks, brick respawning, juggling).
  • Schema network is based on 'entities' (objects) which have binary 'attributes'. These attributes can include continuous-valued signals, in which case each binary variable is like a place field (I think).
    • This is clever and interesting -- rather than just low-level features pointing to high-level features, this means that high-level entities can have records of low-level features -- an arrow pointing in the opposite direction, one which can (also) be learned.
    • The same idea is present in other Vicarious work, including the CAPTCHA paper and more-recent (and less good) Bio-RNN paper.
  • Entities and attributes are propagated forward in time based on 'ungrounded schemas' -- basically free-floating transition matrices. The grounded schemas are entities and action groups that have evidence in observation.
    • There doesn't seem to be much math describing exactly how this works; only exposition. Or maybe it's all hand-waving over the actual, much simpler math.
      • Get the impression that the authors are reaching for a level of formalism when in fact they just made something that works for the breakout task... I infer Dileep prefers the empirical to the formal, so this is likely primarily the first author.
  • There are no perceptual modules here -- game state is fed to the network directly as entities and attributes (and, to be fair, to the A3C model).
  • Entity-attribute vectors are concatenated into a column vector of length $NT$, where $N$ is the number of entities and $T$ the number of time slices.
    • For each entity of $N$ over time $T$, a row-vector is made of length $MR$, where $M$ is the number of attributes (fixed per task) and $R-1$ is the number of neighbors within a fixed radius. That is, each entity is related to its neighbors' attributes over time.
    • This is a (large, sparse) binary matrix, $X$.
  • $y$ is the vector of actions; the task is to predict actions from $X$.
    • How is X learned?? Very unclear in the paper vs. figure 2.
  • The solution is approximated as $y = X W \bar{1}$, where $W$ is a binary weight matrix (a sketch of this prediction step follows these notes).
    • Minimize the solution based on an objective function on the error and the complexity of $w$.
    • This is found via linear programming relaxation. "This procedure monotonically decreases the prediction error of the overall schema network, while increasing its complexity".
      • As it's an issue of binary conjunctions, this seems like a SAT problem!
    • Note that it's not probabilistic: "For this algorithm to work, no contradictions can exist in the input data" -- they instead remove them!
  • Actual behavior includes maximum-product belief propagation, to look for series of transitions that set the reward variable without setting the fail variable.
    • Because the network is loopy, this has to occur several times to set the entity variables -- e.g. it includes backtracking.

  • Have there been any further papers exploring schema networks? What happened to this?
  • The later paper from Vicarious on zero-shot task transfer is rather less interesting (to me) than this.
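
My reading of the prediction step, sketched below (this is not the paper's LP-relaxation learning, just the forward evaluation as I understand it): each column of the binary weight matrix W is a schema, i.e. a conjunction over binary entity-attribute features, and the prediction is an OR over the schemas that are fully satisfied.

```python
import numpy as np

def schema_predict(X, W):
    """X: (n_samples, n_features) binary entity-attribute matrix.
    W: (n_features, n_schemas) binary; column j marks the features that
    schema j requires (a conjunction / AND).
    A schema fires when all of its required features are present;
    the prediction is the OR over schemas."""
    required = W.sum(axis=0)                    # number of features per schema
    satisfied = (X @ W) == required             # AND: all required features on
    return satisfied.any(axis=1).astype(int)    # OR over schemas

# Tiny example: two schemas over 4 features.
X = np.array([[1, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 0, 1, 1]])
W = np.array([[1, 0],
              [1, 0],
              [0, 1],
              [0, 1]])                          # schema 0: f0&f1, schema 1: f2&f3
print(schema_predict(X, W))                     # -> [1 0 1]
```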

{1517}
ref: -2015 tags: spiking neural networks causality inference demixing date: 07-22-2020 18:13 gmt revision:1 [0] [head]

PMID-26621426 Causal Inference and Explaining Away in a Spiking Network

  • Rubén Moreno-Bote & Jan Drugowitsch
  • Use linear non-negative mixing plus noise to generate a series of sensory stimuli.
  • Pass these through a one-layer spiking or non-spiking neural network with adaptive global inhibition and adaptive reset voltage to solve this quadratic programming problem with non-negative constraints.
  • N causes, one observation: $\mu = \Sigma_{i=1}^{N} u_i r_i + \epsilon$,
    • $r_i \geq 0$ -- causes can be present or not present, but not negative.
    • cause coefficients drawn from a truncated (positive only) Gaussian.
  • Linear spiking network with symmetric weight matrix $J = -U^T U - \beta I$ (see the figure in the original paper)
    • That is ... J looks like a correlation matrix!
    • $U$ is M x N; columns are the mixing vectors.
    • U is known beforehand and not learned
      • That said, as a quasi-correlation matrix, it might not be so hard to learn. See ref [44].
  • Can solve this problem by minimizing the negative log-posterior function: $$ L(\mu, r) = \frac{1}{2}(\mu - Ur)^T(\mu - Ur) + \alpha1^Tr + \frac{\beta}{2}r^Tr $$
    • That is, want to maximize the joint probability of the data and observations given the probabilistic model $p(\mu, r) \propto exp(-L(\mu, r)) \Pi_{i=1}^{N} H(r_i)$
    • First term quadratically penalizes difference between prediction and measurement.
    • The second term, with $\alpha$, is an L1 regularization, and the third term, with $\beta$, is an L2 regularization.
  • The negative log-likelihood is then converted to an energy function (linear algebra): $W = -U^T U$, $h = U^T \mu$, then $E(r) = 0.5 r^T W r - r^T h + \alpha 1^T r + 0.5 \beta r^T r$ (a numerical sketch of this optimization follows the list).
    • This is where they get the weight matrix J or W. If the vectors U are linearly independent, then it is negative semidefinite.
  • The dynamics of individual neurons w/ global inhibition and variable reset voltage serves to minimize this energy -- hence, solve the problem. (They gloss over this derivation in the main text).
  • Next, show that a spike-based network can similarly 'relax' or descent the objective gradient to arrive at the quadratic programming solution.
    • Network is N leaky integrate and fire neurons, with variable synaptic integration kernels.
    • $\alpha$ translates then to global inhibition, and $\beta$ to lowered reset voltage.
  • Yes, it can solve the problem .. and do so in the presence of firing noise in a finite period of time .. but a little bit meh, because the problem is not that hard, and there is no learning in the network.
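
A numerical sketch (mine; no spiking dynamics) of the underlying optimization: minimize the negative log-posterior $L(\mu, r)$ above by projected gradient descent with the constraint $r \geq 0$, which is the quadratic program the network dynamics are claimed to descend. Sizes and regularizer values are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)

# Generative model from the setup above: mu = U r + noise, with r >= 0.
M, N = 20, 5                                  # observation dim, number of causes
U = np.abs(rng.normal(size=(M, N)))
U /= np.linalg.norm(U, axis=0)                # unit-norm mixing vectors (columns)
r_true = np.maximum(rng.normal(size=N), 0.0)  # truncated-Gaussian causes
mu = U @ r_true + 0.01 * rng.normal(size=M)

alpha, beta = 0.05, 0.05                      # L1 and L2 penalties
lr = 0.1

# Projected gradient descent on
#   L(r) = 0.5*||mu - U r||^2 + alpha*1'r + 0.5*beta*r'r,   s.t. r >= 0
r = np.zeros(N)
for _ in range(2000):
    grad = -U.T @ (mu - U @ r) + alpha + beta * r
    r = np.maximum(r - lr * grad, 0.0)        # enforce non-negativity

# The L1/L2 penalties shrink the estimate slightly, so recovery is approximate.
print("true causes     :", np.round(r_true, 3))
print("recovered causes:", np.round(r, 3))
```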

{1516}
ref: -2017 tags: GraphSAGE graph neural network GNN date: 07-16-2020 15:49 gmt revision:2 [1] [0] [head]

Inductive representation learning on large graphs

  • William L. Hamilton, Rex Ying, Jure Leskovec
  • Problem: given a graph where each node has a set of (possibly varied) attributes, create a 'embedding' vector at each node that describes both the node and the network that surrounds it.
  • To this point (2017) there were two ways of doing this -- through matrix factorization methods, and through graph convolutional networks.
    • The matrix factorization methods or spectral methods (similar to multi-dimensional scaling, where points are projected onto a plane to preserve a distance metric) are transductive : they work entirely within-data, and don't directly generalize to new data.
      • This is parsimonious in some sense, but doesn't work well in the real world, where datasets are constantly changing and frequently growing.
  • Their approach is similar to graph convolutional networks, where (I think) the convolution is indexed by node distances.
  • General idea: each node starts out with an embedding vector = its attribute or feature vector.
  • Then, all neighboring nodes are aggregated by sampling a fixed number of the nearest neighbors (fixed for computational reasons).
    • Aggregation can be mean aggregation, LSTM aggregation (on random permutations of the neighbor nodes), or MLP -> nonlinearity -> max-pooling. Pooling has the most wins, though all seem to work (a sketch of the mean aggregator follows this list)...
  • The aggregated vector is concatenated with the current node feature vector, and this is fed through a learned weighting matrix and nonlinearity to output the feature vector for the current pass.
  • Passes proceed from out to in... I think.
  • Algorithm is inspired by the Weisfeiler-Lehman Isomorphism Test, which updates neighbor counts per node to estimate if graphs are isomorphic. They do a similar thing here, only with vectors not scalars, and similarly take into account the local graph structure.
    • All the aggregator functions, and of course the nonlinearities and weighting matrices, are differentiable -- so the structure is trained in a supervised way with SGD.
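
A sketch of one round of the mean-aggregation variant in numpy (mine; sizes, sampling, and the toy graph are made up, and in the real algorithm the weights are trained by SGD over K rounds):

```python
import numpy as np

rng = np.random.default_rng(3)

def sage_mean_layer(H, adj_list, W_self, W_neigh, n_sample=5):
    """One GraphSAGE-style pass with a mean aggregator.
    H: (n_nodes, d_in) current node embeddings (initially the raw features).
    adj_list: list of neighbor-index arrays, one per node.
    Returns (n_nodes, d_out) updated, L2-normalized embeddings."""
    H_new = np.zeros((H.shape[0], W_self.shape[1]))
    for v, neigh in enumerate(adj_list):
        if len(neigh) > n_sample:                    # fixed-size neighbor sample
            neigh = rng.choice(neigh, size=n_sample, replace=False)
        h_neigh = H[neigh].mean(axis=0) if len(neigh) else np.zeros(H.shape[1])
        # Concatenation followed by one weight matrix == two weight blocks.
        h = np.maximum(H[v] @ W_self + h_neigh @ W_neigh, 0.0)   # ReLU
        H_new[v] = h / (np.linalg.norm(h) + 1e-8)
    return H_new

# Toy graph: 6 nodes, 4-d features, 8-d embeddings.
H = rng.normal(size=(6, 4))
adj = [np.array(a) for a in ([1, 2], [0, 2, 3], [0, 1], [1, 4, 5], [3, 5], [3, 4])]
W_self, W_neigh = rng.normal(size=(4, 8)), rng.normal(size=(4, 8))
print(sage_mean_layer(H, adj, W_self, W_neigh).shape)            # (6, 8)
```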

This is a well-put together paper, with some proofs of convergence etc -- but it still feels only lightly tested. As with many of these papers, could benefit from a positive control, where the generating function is known & you can see how well the algorithm discovers it.

Otherwise, the structure / algorithm feels rather intuitive; surprising to me that it was not developed before the matrix factorization methods.

Worth comparing this to word2vec embeddings, where local words are used to predict the current word & the resulting vector in the neck-down of the NN is the representation.

{1511}
ref: -2020 tags: evolution neutral drift networks random walk entropy population date: 04-08-2020 00:48 gmt revision:0 [head]

Localization of neutral evolution: selection for mutational robustness and the maximal entropy random walk

  • The take-away of the paper is that, with larger populations, random mutation and recombination make areas of the graph that take several steps to reach (in the paper's example, Maynard Smith's four-letter word-mutation game) less likely to be visited.
  • This is because the recombination serves to make the population adhere more closely to the 'giant' mode. In Maynard Smith's game, this is the 2268 words (of 2405 meaningful words) that can be reached by successive letter changes.
  • The author extends it to van Nimwegen's 1999 paper / RNA genotype-secondary structure. It's not as bad as Maynard Smith's game, but still has much lower graph-theoretic entropy than the actual population.
    • He suggests that if the entropic size of the giant component is much smaller than its dictionary size, then populations are likely to be trapped there.

  • Interesting, but I'd prefer to have an expert peer-review it first :)

{1507}
ref: -2015 tags: winner take all sparsity artificial neural networks date: 03-28-2020 01:15 gmt revision:0 [head]

Winner-take-all Autoencoders

  • During training of fully connected layers, they enforce a winner-take all lifetime sparsity constraint.
    • That is: when training using mini-batches, they keep the largest k% of activations of a given hidden unit across all samples presented in the mini-batch. The remainder of the activations are set to zero. The units are not competing with each other; they are competing with themselves (a sketch follows this list).
    • The rest of the network is a stack of ReLU layers (upon which the sparsity constraint is applied) followed by a linear decoding layer (which makes interpretation simple).
    • They stack them via sequential training: train one layer from the output of another & not backprop the errors.
  • Works, with lower sparsity targets, also for RBMs.
  • Extended the result to WTA convnets -- here they enforce both spatial and temporal (mini-batch) sparsity.
    • Spatial sparsity involves selecting the single largest hidden unit activity within each feature map. The other activities and derivatives are set to zero.
    • At test time, this sparsity constraint is released, and instead they use a 4 x 4 max-pooling layer & use that for classification or deconvolution.
  • To apply both spatial and temporal sparsity, select the highest spatial response (e.g. one unit in a 2d plane of convolutions; all have the same weights) for each feature map. Do this for every image in a mini-batch, and then apply the temporal sparsity: each feature map gets to be active exactly once, and in that time only one hidden unit (or really, one location of the input and common weights (depending on stride)) undergoes SGD.
    • Seems like it might train very slowly. Authors didn't note how many epochs were required.
  • This, too can be stacked.
  • To train on larger image sets, they first extract 48 x 48 patches & again stack...
  • Test on MNIST, SVHN, CIFAR-10 -- works ok, and well even with few labeled examples (which is consistent with their goals)
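
A sketch of the lifetime-sparsity operation as I read it (shapes and the 5% target are placeholders): for each hidden unit, keep only its top-k% activations across the mini-batch and zero the rest, so each unit competes with its own responses to other samples rather than with other units.

```python
import numpy as np

def lifetime_sparsity(A, k_frac=0.05):
    """A: (batch, n_hidden) ReLU activations.
    For each hidden unit (column), keep the largest k_frac fraction of
    activations across the batch; zero out the remainder."""
    batch = A.shape[0]
    k = max(1, int(round(k_frac * batch)))
    # Per-column threshold = the k-th largest activation in that column.
    thresh = np.partition(A, batch - k, axis=0)[batch - k]
    return np.where(A >= thresh, A, 0.0)

A = np.maximum(np.random.default_rng(4).normal(size=(128, 6)), 0.0)
A_sparse = lifetime_sparsity(A, k_frac=0.05)
print((A_sparse > 0).sum(axis=0))   # ~6 surviving activations per unit
```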

{1428}
ref: -0 tags: VARNUM GEVI genetically encoded voltage indicators FRET Ace date: 03-18-2020 17:12 gmt revision:5 [4] [3] [2] [1] [0] [head]

PMID-30420685 Fast in-vivo voltage imaging using a red fluorescent indicator

  • Kannan M, Vasan G, Huang C, Haziza S, Li JZ, Inan H, Schnitzer MJ, Pieribone VA.
  • Other genetically encoded voltage indicators (GEVI):
    • PMID-22958819 ArcLight (Pieribone also last author); sign of $\Delta F / F$ negative, but large, 35%! Slow tho? improvement in speed
    • ASAP3: $\Delta F / F$ large, $\tau = 3$ ms.
    • PMID-26586188 Ace-mNeon FRET based, Acetabularia opsin, fast kinetics + brightness of mNeonGreen.
    • Archon1 -- fast and sensitive, found (like VARNAM) using a robotic directed evolution or direct search strategy.
  • VARNAM is based on Acetabularia (Ace) + mRuby3, also FRET based, found via high-throughput voltage screen.
  • Archaerhodopsins require 1-12 W/mm^2 of illumination, vs. 50 mW/mm^2 for GFP-based probes. Lots of light!
  • Systematic optimization of voltage sensor function: both the linker region (288 mutants), which affects FRET efficiency, as well as the opsin fluorophore region (768 mutants), which affects the wavelength of absorption / emission.
  • Some intracellular clumping (which will negatively affect sensitivity), but mostly localized to the membrane.
  • Sensitivity is still imperfect -- 4% in in-vivo cortical neurons, though it’s fast enough to resolve 100 Hz spiking.
  • Can resolve post-synaptic EPSCs, but < 1% $\Delta F/F$.
  • Tested all-optical ephys using VARNAM + the blueshifted channelrhodopsin CheRiff, both sparsely and in a PV-targeted transgenic model. Both work, but this is a technique paper; no real results.
  • Tested TEMPO fiber-optic recording in freely behaving mice (ish) -- induced ketamine waves, 0.5-4Hz.
  • And odor-induced activity in flies, using split-Gal4 expression tools. So many experiments.

{1492}
ref: -2016 tags: spiking neural network self supervised learning date: 12-10-2019 03:41 gmt revision:2 [1] [0] [head]

PMID: Spiking neurons can discover predictive features by aggregate-label learning

  • This is a meandering, somewhat long-winded, and complicated paper, even for the journal Science. It's not been cited a great many times, but none-the-less is of interest.
  • The goal of the derived network is to detect fixed-pattern presynaptic sequences, and fire a prespecified number of spikes to each occurrence.
  • One key innovation is the use of a spike-threshold-surface for a 'tempotron' [12], the derivative of which is used to update the weights of synapses after trials. As the author says, spikes are hard to differentiate; the STS makes this more possible. This is hence standard gradient descent: if the neuron missed a spike then the weight is increased based on aggregate STS (for the whole trial -- hence the neuron / SGD has to perform temporal and spatial credit assignment).
    • As common, the SGD is appended with a momentum term.
  • Since STS differentiation is biologically implausible -- where would the memory lie? -- he also implements a correlational synaptic eligibility trace. The correlation is between the postsynaptic voltage and the EPSC, which seems kinda circular.
    • Unsurprisingly, it does not work as well as the SGD approximation. But does work...
  • Second innovation is the incorporation of self-supervised learning: a 'supervisory' neuron integrates the activity of a number (50) of feature detector neurons, and reinforces them to basically all fire at the same event, WTA style. This effects an unsupervised feature detection.
  • This system can be used with sort-of lateral inhibition to reinforce multiple features. Not so dramatic -- continuous feature maps.

Editorializing a bit: I said this was interesting, but why? The first part of the paper is another form of SGD, albeit in a spiking neural network, where the gradient is harder to compute, hence is done numerically.

It's the aggregate part that is new -- pulling in repeated patterns through synaptic learning rules. Of course, to do this, the full trace of pre and post synaptic activity must be recorded (??) for estimating the STS (i think). An eligibility trace moves in the right direction as a biologically plausible approximation, but as always nothing matches the precision of SGD. Can the eligibility trace be amended with e.g. neuromodulators to push the performance near that of SGD?

The next step of adding self supervised singular and multiple features is perhaps toward the way the brain organizes itself -- small local feedback loops. These features annotate repeated occurrences of stimuli, or tile a continuous feature space.

Still, the fact that I haven't seen any follow-up work is suggestive...


Editorializing further, there is a limited quantity of work that a single human can do. In this paper, it's a great deal of work, no doubt, and the author offers some good intuitions for the design decisions. Yet still, the total complexity that even a very determined individual can amass is limited, and likely far below the structural complexity of a mammalian brain.

This implies that inference either must be distributed and compositional (the normal path of science), or the process of evaluating & constraining models must be significantly accelerated. This latter option is appealing, as current progress in neuroscience seems highly technology limited -- old results become less meaningful when the next wave of measurement tools comes around, irrespective of how much work went into it. (Though: the impetus for measuring a particular thing in biology is only discovered through these 'less meaningful' studies...).

A third option, perhaps one which many theoretical neuroscientists believe in, is that there are some broader, physics-level organizing principles to the brain. Karl Friston's free energy principle is a good example of this. Perhaps at a meta level some organizing theory can be found, or likely a set of theories; but IMHO, you'll need at least one theory per brain area, just as each area is morphologically, cytoarchitecturally, and topologically distinct. (There may be only a few theories of the cortex, despite all the areas, which is why so many are eager to investigate it!)

So what constitutes a theory? Well, you have to meaningfully describe what a brain region does. (Why is almost as important; how more important to the path there.) From a sensory standpoint: what information is stored? What processing gain is enacted? How does the stored information impress itself on behavior? From a motor standpoint: how are goals selected? How are the behavioral segments to attain them sequenced? Is the goal / behavior even a reasonable way of factoring the problem?

Our dual problem, building the bridge from the other direction, is perhaps easier. Or it could be a lot more money has gone into it. Either way, much progress has been made in AI. One arm is deep function approximation / database compression for fast and organized indexing, aka deep learning. Many people are thinking about that; no need to add to the pile; anyway, as OpenAI has proven, the common solution to many problems is to simply throw more compute at it. A second is deep reinforcement learning, which is hideously sample and path inefficient, hence ripe for improvement. One side is motor: rather than indexing raw motor variables (LRUD in a video game, or joint torques with a robot..) you can index motor primitives, perhaps hierarchically built; likewise, for the sensory input, the model needs to infer structure about the world. This inference should decompose overwhelming sensory experience into navigable causes ...

But how can we do this decomposition? The cortex is more than adept at it, but now we're at the original problem, one that the paper above purports to make a stab at.

{1485}
ref: -2015 tags: PaRAC1 photoactivatable Rac1 synapse memory optogenetics 2p imaging mouse motor skill learning date: 10-30-2019 20:35 gmt revision:1 [0] [head]

PMID-26352471 Labelling and optical erasure of synaptic memory traces in the motor cortex

  • Idea: use Rac1, which has been shown to induce spine shrinkage, coupled to a light-activated domain to allow for optogenetic manipulation of active synapses.
  • PaRac1 was coupled to a deletion mutant of PSD95, PSD delta 1.2, which concentrates at the postsynaptic site, but cannot bind to postsynaptic proteins, thus minimizing the undesirable effects of PSD-95 overexpression.
    • PSD-95 is rapidly degraded by proteosomes
    • This gives spatial selectivity.
  • They then exploited the dendritic targeting element (DTE) of Arc mRNA, which is selectively targeted and translated in activated dendritic segments in response to synaptic activation in an NMDA receptor-dependent manner.
    • Thereby giving temporal selectivity.
  • Construct is then PSD-PaRac1-DTE; this was tested on hippocampal slice cultures.
  • Improved sparsity and labelling further by driving it with the Arc promoter.
  • Motor learning is impaired in Arc KO mice; hence inferred that the induction of AS-PaRac1 by the Arc promoter would enhance labeling during learning-induced potentiation.
  • Delivered construct via in-utero electroporation.
  • Observed rotarod-induced learning; the PaRac1 signal decayed after two days, but the spine volume persisted in spines that showed Arc/DTE-driven, hence PA-labeled, activity.
  • Now, since they had a good label, performed rotarod training followed by (at variable delay) light pulses to activate Rac, thereby suppressing recently-active synapses.
    • Observed a depression of behavioral performance.
    • Controlled with a second task; could selectively impair performance on one of the tasks based on ordering/timing of light activation.
  • The localized probe also allowed them to image the synapse populations active for each task, which were largely non-overlapping.

{1475}
ref: -2017 tags: two photon holographic imaging Arch optogenetics GCaMP6 date: 09-12-2019 19:24 gmt revision:1 [0] [head]

PMID-28053310 Simultaneous high-speed imaging and optogenetic inhibition in the intact mouse brain.

  • Bovetti S, Moretti C, Zucca S, Dal Maschio M, Bonifazi P, Fellin T.
  • Image GCaMP6 either in scanned mode (high resolution, slow) or holographically (SLM, RedShirt 80x80 NeuroCCD); activate the opsin Arch, and simultaneously record juxtasomal action potentials.

{1418}
ref: -0 tags: nanophotonics interferometry neural network mach zehnder interferometer optics date: 06-13-2019 21:55 gmt revision:3 [2] [1] [0] [head]

Deep Learning with Coherent Nanophotonic Circuits

  • Used a series of Mach-Zehnder interferometers with thermoelectric phase-shift elements to realize the unitary component of individual layer weight-matrix computation.
    • The weight matrix was decomposed via SVD into $U \Sigma V^*$: the unitary parts (4x4, special unitary group SU(4)) were realized by the MZI mesh, and the diagonal $\Sigma$ matrix via amplitude modulators. See the figure in the original paper. (A sketch of the decomposition follows this list.)
    • Note that interferometric matrix multiplication can (theoretically) be zero energy with an optical system (modulo loss).
      • In practice, you need to run the phase-modulator heaters.
  • Nonlinearity was implemented electronically after the photodetector (e.g. they had only one photonic circuit; to get multiple layers, fed activations repeatedly through it. This was a demonstration!)
  • Fed FFT'd / banded recordings of consonants through the network to get vowel recognition close to that of the simulated network.
    • Claim that noise was from imperfect phase setting in the MZI + lower resolution photodiode read-out.
  • They note that the network can more easily (??) be trained via the finite difference algorithm (e.g. test out an incremental change per weight / parameter) since running the network forward is so (relatively) low-energy and fast.
    • Well, that's not totally true -- you need to update multiple weights at once in a large / deep network to descend any high-dimensional valleys.
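
The decomposition itself is plain linear algebra; a numpy sketch (sizes invented) of splitting one layer's weight matrix into the two unitary stages (MZI meshes) plus the diagonal amplitude stage:

```python
import numpy as np

rng = np.random.default_rng(5)
W = rng.normal(size=(4, 4))                 # one layer's 4x4 weight matrix

# SVD: W = U @ diag(s) @ Vh. U and Vh are unitary -> programmable as lossless
# MZI meshes (phase shifters); diag(s) -> amplitude modulators / attenuators.
U, s, Vh = np.linalg.svd(W)

x = rng.normal(size=4)                      # an input 'optical' mode vector
y_photonic = U @ (s * (Vh @ x))             # mesh -> attenuators -> mesh
y_direct = W @ x

print(np.allclose(y_photonic, y_direct))    # True
print("unitary check:", np.allclose(U @ U.conj().T, np.eye(4)))
```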

{1463}
ref: -2019 tags: optical neural networks spiking phase change material learning date: 06-01-2019 19:00 gmt revision:4 [3] [2] [1] [0] [head]

All-optical spiking neurosynaptic networks with self-learning capabilities

  • J. Feldmann, N. Youngblood, C. D. Wright, H. Bhaskaran & W. H. P. Pernice
  • Idea: use phase-change material to either block or pass the light in waveguides.
    • In this case, they used GST -- germanium-antimony-tellurium. This material is less reflective in the amorphous phase, which can be reached by heating to ~150C and rapidly quenching. It is more reflective in the crystalline phase, which occurs on annealing.
  • This is used for both plastic synapses (phase change driven by the intensity of the light) and the nonlinear output of optical neurons (via a ring resonator).
  • Uses optical resonators with very high Q factors to couple different wavelengths of light into the 'dendrite'.
  • Ring resonator on the output: to match the polarity of the phase-change material. Is this for reset? Storing light until trigger?
  • Were able to get correlative-like or hebbian learning (which I suppose is not dissimilar from really slow photographic film, just re-branded, and most importantly with nonlinear feedback.)
  • Issue: every weight needs a different source wavelength! Hence they have not demonstrated a multi-layer network.
  • Previous paper: All-optical nonlinear activation function for photonic neural networks
    • Only 3 dB and 7 dB extinction ratios for induced transparency and inverse saturation

{1460}
ref: -0 tags: phosphorescence fluorescence magnetic imaging slicing adam cohen date: 05-29-2019 19:41 gmt revision:8 [7] [6] [5] [4] [3] [2] [head]

A friend postulated using the triplet state phosphorescence as a magnetically-modulatable dye. E.g. magnetically slice a scattering biological sample, rather than slicing optically (light sheet, 2p) or mechanically. After a little digging:

I'd imagine that it should be possible to design a molecule -- a protein cage, perhaps a (fully unsaturated) terpene -- which isolates the excited state from oxygen quenching.

Adam Cohen at Harvard has been working a bit on this very idea, albeit with fluorescence not phosphorescence --

  • Optical imaging through scattering media via magnetically modulated fluorescence (2010)
    • The two species, pyrene and dimethylaniline are in solution.
    • Dimethylaniline absorbs photons and transfers an electron to pyrene to produce a singlet radical pair.
    • The magnetic field represses conversion of this singlet into a triplet; when two singlet electrons combine, they produce exciplex fluorescence.
  • Addition of an aliphatic-ether 12-O-2 linker improves things significantly --
  • Mapping Nanomagnetic Fields Using a Radical Pair Reaction (2011)
  • Which can be used with a 2p microscope:
  • Two-photon imaging of a magneto-fluorescent indicator for 3D optical magnetometry (2015)
    • Notably, use decay kinetics of the excited state to yield measurements that are insensitive to photobleaching, indicator concentration, or local variations in optical excitation or collection efficiency. (As opposed to $\Delta F / F$.)
    • Used phenanthrene (3 aromatic rings, not 4 in pyrene) as the excited electron acceptor, dimethylaniline again as the photo-electron generator.
    • Clear description:
      • A molecule with a singlet ground state absorbs a photon.
      • The photon drives electron transfer from a donor moiety to an acceptor moiety (either inter or intra molecular).
      • The electrons [ground state and excited state, donor] become sufficiently separated so that their spins do not interact, yet initially they preserve the spin coherence arising from their starting singlet state.
      • Each electron experiences a distinct set of hyperfine couplings to its surrounding protons (?), leading to a gradual loss of coherence and intersystem crossing (ISC) into a triplet state.
      • An external magnetic field can lock the precession of both electrons to the field axis, partially preserving coherence and suppressing ISC.
      • In some chemical systems, the triplet state is non-fluorescent, whereas the singlet pair can recombine and emit a photon.
      • Magnetochemical effects are remarkable because they arise at a magnetic field strengths comparable to hyperfine energy (typically 1-10mT).
        • Compare this to the Zeeman effect, where overt splitting is at 0.1T.
    • Phenanthrene-dimethylaniline was dissolved in dimethylformamide (DMF). The solution was carefully degassed in nitrogen to prevent molecular oxygen quenching.

Yet! Magnetic field effects do exist in solution:

{1456}
ref: -2011 tags: ttianium micromachining chlorine argon plasma etch oxide nitride penetrating probes Kevin Otto date: 03-18-2019 22:57 gmt revision:1 [0] [head]

PMID-21360044 Robust penetrating microelectrodes for neural interfaces realized by titanium micromachining

  • Patrick T. McCarthy, Kevin J. Otto, Masaru P. Rao
  • Used Cl / Ar plasma to deep etch titanium film, 0.001" (~25 um) thick. Fine Metals Corp, Ashland VA.
  • Discuss various insulation (oxide / nitride) failure modes and lithography issues.

{1446}
ref: -2017 tags: vicarious dileep george captcha message passing inference heuristic network date: 03-06-2019 04:31 gmt revision:2 [1] [0] [head]

PMID-29074582 A generative vision model that trains with high data efficiency and breaks text-based CAPTCHAs

  • Vicarious supplementary materials on their RCN (recursive cortical network).
  • Factor scene into shape and appearance, which CNN or DCNN do not do -- they conflate (ish? what about the style networks?)
    • They call this the coloring book approach -- extract shape then attach appearance.
  • Hierarchy of feature layers $F_{frc}$ (binary) and pooling layer $H_{frc}$ (multinomial), where f is feature, r is row, c is column (e.g. over image space).
  • Each layer is exclusively conditional on the layer above it, and all features in a layer are conditionally independent given the layer above.
  • The pool variable $H_{frc}$ is multinomial, and each value is associated with a feature, plus one 'off' feature.
    • These features form a ‘pool’, which can/does have translation invariance.
  • If any of the pool variables are set to enable $F$, then that feature is set (or-operation). Many pools can contain a given feature.
  • One can think of members of a pool as different alternatives of similar features.
  • Pools can be connected laterally, so each is dependent on the activity of its neighbors. This can be used to enforce edge continuity.
  • Each bottom-level feature corresponds to an edge, which defines ‘in’ and ‘out’ to define shape, $Y$.
  • These variables $Y$ are also interconnected, and form a conditional random field, a ‘Potts model’. $Y$ is generated by Gibbs sampling given the F-H hierarchy above it.
  • Below Y, the per-pixel model X specifies texture with some conditional radial dependence.
  • The model amounts to a probabilistic model for which exact inference is impossible -- hence you must do approximate inference, where a bottom-up pass estimates the category (with lateral connections turned off), and a top-down pass estimates the object mask. Multiple passes can be done for multiple objects.
  • The model has a hard time moving from RGB pixels to edge ‘in’ and ‘out’; they use an edge-detection pre-processing stage, e.g. a Gabor filter.
  • Training follows a very intuitive, hierarchical feature building heuristic, where if some object or collection of lower level features is not present, it’s added to the feature-pool tree.
    • This includes some winner-take-all heuristic for sparsification.
    • Also greedily learn some sort of feature 'dictionary' from individual unlabeled images.
  • Lateral connections are learned similarly, with a quasi-hebbian heuristic.
  • Neuroscience inspiration: see refs 9, 98 for message-passing based Bayesian inference.

  • Overall, a very heuristic, detail-centric, iteratively generated model and set of algorithms. You get the sense that this was really the work of Dileep George or only a few people; that it was generated by successively patching and improving the model/algo to make up for observed failures and problems.
    • As such, it offers little long-term vision for what is possible, or how perception and cognition occurs.
    • Instead, proof is shown that, well, engineering works, and the space of possible solutions -- including relatively simple elements like dictionaries and WTA -- is large and fecund.
      • Unclear how this will scale to even more complex real-world problems, where one would desire a solution that does not have to have each level carefully engineered.
      • Modern DCNN, at least, do not seem to have this property -- the structure is learned from the (alas, labeled) data.
  • This extends to the fact that yes, their purpose-built system achieves state-of-the-art performance on the designated CAPTCHA tasks.
  • Check: B. M. Lake, R. Salakhutdinov, J. B. Tenenbaum, Human-level concept learning through probabilistic program induction. Science 350, 1332–1338 (2015). doi:10.1126/science.aab3050 Medline

{1439}
ref: -2006 tags: hinton contrastive divergence deep belief nets date: 02-20-2019 02:38 gmt revision:0 [head]

PMID-16764513 A fast learning algorithm for deep belief nets.

  • Hinton GE1, Osindero S, Teh YW.
  • Very highly cited contrastive divergence paper.
  • Back in 2006, it yielded state-of-the-art MNIST performance.
  • And, being CD, can be used in an unsupervised mode.

{1434}
ref: -0 tags: convolutional neural networks audio feature extraction vocals keras tensor flow fourier date: 02-18-2019 21:40 gmt revision:3 [2] [1] [0] [head]

Audio AI: isolating vocals from stereo music using Convolutional Neural Networks

  • Ale Koretzky
  • Fairly standard CNN, but use a binary STFT mask to isolate vocals from instruments.
    • Get Fourier-type time-domain artifacts as a result; but it sounds reasonable.
    • Didn't realize it until this paper / blog post: stacked conv layers combine channels.
    • E.g. input size $513 \times 25 \times 16$ (512 freq channels + DC, 25 time slices, 16 filter channels) into a 3x3 Conv2D -> $3 \times 3 \times 16 + 16 = 160$ total parameters (filter weights and bias).
    • If this is followed by a second Conv2D layer of the same parameters, the layer acts as a 'normal' fully connected network in the channel dimension.
    • This means there are $(3 \times 3 \times 16) \times 16 + 16 = 2320$ parameters (a quick arithmetic check follows this list).
      • Each input channel from the previous conv layer has independent weights -- they are not shared -- whereas the spatial weights are shared.
      • Hence, same number of input channels and output channels (in this case; doesn't have to be).
      • This, naturally, falls out of spatial weight sharing, which might be obvious in retrospect; of course it doesn't make sense to share non-spatial weights.
      • See also: https://datascience.stackexchange.com/questions/17064/number-of-parameters-for-convolution-layers
  • Synthesized a large training set via a cappella YouTube videos plus instrument tabs .. that looked like a lot of work!
    • Need a karaoke database here.
  • Authors wrapped this into a realtime extraction toolkit.
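
A quick arithmetic check of the parameter counts quoted above (assuming, as I read it, that the first Conv2D takes a single-channel spectrogram and produces 16 feature maps, and the second takes those 16 maps back to 16):

```python
def conv2d_params(k_h, k_w, c_in, c_out):
    """Weights are shared across space but not across channels:
    one (k_h x k_w x c_in) kernel per output channel, plus one bias each."""
    return (k_h * k_w * c_in) * c_out + c_out

print(conv2d_params(3, 3, 1, 16))    # 160  : 1-channel input -> 16 feature maps
print(conv2d_params(3, 3, 16, 16))   # 2320 : 16-channel input -> 16 feature maps
```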

{1426}
ref: -2019 tags: Arild Nokland local error signals backprop neural networks mnist cifar VGG date: 02-15-2019 03:15 gmt revision:6 [5] [4] [3] [2] [1] [0] [head]

Training neural networks with local error signals

  • Arild Nokland and Lars H Eidnes
  • Idea is to use one+ supplementary neural networks to measure within-batch matching loss between transformed hidden-layer output and one-hot label data to produce layer-local learning signals (gradients) for improving local representation.
  • Hence, no backprop. Error signals are all local, and inter-layer dependencies are not explicitly accounted for (! I think).
  • $L_{sim}$: given a mini-batch of hidden layer activations $H = (h_1, ..., h_n)$ and a one-hot encoded label matrix $Y = (y_1, ..., y_n)$,
    • $L_{sim} = || S(NeuralNet(H)) - S(Y)||^2_F$ (F is presumably the Frobenius norm; a sketch follows this list).
    • $NeuralNet()$ is a convolutional neural net (trained how?), 3*3, stride 1, reduces output to 2.
    • $S()$ is the cosine similarity matrix, or correlation matrix, of a mini-batch.
  • $L_{pred} = CrossEntropy(Y, W^T H)$ where $W$ is a weight matrix, dim hidden_size * n_classes.
    • Cross-entropy is $H(Y, W^T H) = -\Sigma_{i,j} [ Y_{i,j} log((W^T H)_{i,j}) + (1-Y_{i,j}) log(1-(W^T H)_{i,j}) ]$
  • Sim-bio loss: replace $NeuralNet()$ with an average-pooling and standard-deviation op. Plus the one-hot target is replaced with a random transformation of the same target vector.
  • Overall loss: 99% $L_{sim}$, 1% $L_{pred}$
    • Despite the unequal weighting, both seem to improve test prediction on all examples.
  • VGG like network, with dropout and cutout (blacking out square regions of input space), batch size 128.
  • Tested on all the relevant datasets: MNIST, Fashion-MNIST, Kuzushiji-MNIST, CIFAR-10, CIFAR-100, STL-10, SVHN.
  • Pretty decent review of similarity matching measures at the beginning of the paper; not extensive but puts everything in context.
    • See for example non-negative matrix factorization using Hebbian and anti-Hebbian learning in Pehlevan & Chklovskii 2014.
  • Emphasis put on biologically realistic learning, including the use of feedback alignment {1423}
    • Yet: this was entirely supervised learning, as the labels were propagated back to each layer.
    • More likely that biology is set up to maximize available labels (not a new concept).
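
A sketch (mine; shapes are placeholders and the paper's small auxiliary conv net is omitted, so the raw hidden activations stand in for its output) of the similarity-matching piece: compare the cosine-similarity matrix of the hidden activations against that of the one-hot labels.

```python
import numpy as np

def cosine_similarity_matrix(Z):
    """Row-wise cosine similarity: S[i, j] = z_i . z_j / (|z_i| |z_j|)."""
    Zn = Z / (np.linalg.norm(Z, axis=1, keepdims=True) + 1e-8)
    return Zn @ Zn.T

def sim_loss(H, Y):
    """L_sim = || S(H') - S(Y) ||_F^2; here H stands in for the transformed
    hidden activations H'."""
    D = cosine_similarity_matrix(H) - cosine_similarity_matrix(Y)
    return np.sum(D ** 2)                 # squared Frobenius norm

rng = np.random.default_rng(6)
H = rng.normal(size=(128, 64))            # mini-batch of hidden activations
labels = rng.integers(0, 10, size=128)
Y = np.eye(10)[labels]                    # one-hot label matrix
print(sim_loss(H, Y))
```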

{1419}
ref: -0 tags: diffraction terahertz 3d print ucla deep learning optical neural networks date: 02-13-2019 23:16 gmt revision:1 [0] [head]

All-optical machine learning using diffractive deep neural networks

  • Pretty clever: use 3D printed plastic as diffractive media in a 0.4 THz all-optical all-interference (some attenuation) linear convolutional multi-layer 'neural network'.
  • In the arXiv publication there are few details on how they calculated or optimized the given diffractive layers.
  • Absence of nonlinearity will limit things greatly.
  • Actual observed performance (where they had to print out the handwritten digits) was rather poor, ~ 60%.

{1174}
ref: -0 tags: Hinton google tech talk dropout deep neural networks Boltzmann date: 02-12-2019 08:03 gmt revision:2 [1] [0] [head]

Brains, sex, and machine learning -- Hinton google tech talk.

  • Hinton believes in the power of crowds -- he thinks that the brain fits many, many different models to the data, then selects afterward.
    • Random forests, as used in Predator, are an example of this: they average many simple-to-fit and simple-to-run decision trees. (This is apparently what Kinect does.)
  • The talk focuses on dropout, a clever new form of model averaging where only half of the units in the hidden layers are trained for a given example (a sketch follows this list).
    • He is inspired by biological evolution, where sexual reproduction often spontaneously adds or removes genes, hence individual genes or small linked genes must be self-sufficient. This equates to a 'rugged individualism' of units.
    • Likewise, dropout forces neurons to be robust to the loss of co-workers.
    • This is also great for parallelization: each unit or sub-network can be trained independently, on its own core, with little need for communication! Later, the units can be combined via genetic algorithms then re-trained.
  • Hinton then observes that sending a real value p (output of logistic function) with probability 0.5 is the same as sending 0.5 with probability p. Hence, it makes sense to try pure binary neurons, like biological neurons in the brain.
    • Indeed, if you replace the backpropagation with single bit propagation, the resulting neural network is trained more slowly and needs to be bigger, but it generalizes better.
    • Neurons (allegedly) do something very similar to this by poisson spiking. Hinton claims this is the right thing to do (rather than sending real numbers via precise spike timing) if you want to robustly fit models to data.
      • Sending stochastic spikes is a very good way to average over the large number of models fit to incoming data.
      • Yes but this really explains little in neuroscience...
  • Paper referred to in intro: Livnat, Papadimitriou and Feldman, PMID-19073912 and later by the same authors PMID-20080594
    • A mixability theory for the role of sex in evolution. -- "We define a measure that represents the ability of alleles to perform well across different combinations and, using numerical iterations within a classical population-genetic framework, show that selection in the presence of sex favors this ability in a highly robust manner"
    • Plus David MacKay's concise illustration of why you need sex, pg 269, __Information theory, inference, and learning algorithms__
      • With rather simple assumptions, asexual reproduction yields 1 bit per generation,
      • Whereas sexual reproduction yields $\sqrt{G}$, where G is the genome size.
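
A minimal forward-pass sketch of dropout as described in the talk (mine; layer sizes are placeholders): at training time a random half of the hidden units is silenced per example; at test time all units are used, with activations scaled by the keep probability.

```python
import numpy as np

rng = np.random.default_rng(7)
relu = lambda z: np.maximum(z, 0.0)

W1, b1 = 0.01 * rng.normal(size=(784, 256)), np.zeros(256)
W2, b2 = 0.01 * rng.normal(size=(256, 10)), np.zeros(10)

def forward(x, train=True, p_keep=0.5):
    h = relu(x @ W1 + b1)
    if train:
        mask = rng.random(h.shape) < p_keep   # a fresh random half of the units
        h = h * mask                          # is silenced for each example
    else:
        h = h * p_keep                        # test: keep all units, scale down
    return h @ W2 + b2

x = rng.normal(size=(32, 784))                # a mini-batch of inputs
print(forward(x, train=True).shape, forward(x, train=False).shape)
```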

{1425}
ref: -0 tags: Kato fear conditioning GABA auditory cortex mice optogenetics SOM PV date: 02-04-2019 19:09 gmt revision:0 [head]

PMID-29375323 Fear learning regulates cortical sensory representation by suppressing habituation

  • Trained mice on a CS+ / CS- --> lick task.
    • CS+ = auditory tone followed by tailshock
    • CS- = auditory tone (both FM modulated, separated by 0.5 - 1.0 octave).
    • US = licking.
  • VGAT2-ChR2 or PV-ChR2
  • GABA-ergic silencing of auditory cortex through blue light illumination abolished behavior difference following CS+ and CS-.
  • Used intrinsic imaging to locate A1 cortex, then AAV-GCaMP6 imaging to locate pyramidal cells.
  • In contrast to reports of enhanced tone responses following simple fear conditioning (Quirk et al., 1997; Weinberger, 2004, 2015), discriminative learning under our conditions caused no change in the average fraction of pyramidal cells responsive to the CS+ tone.
    • Seemed to be an increase in suppression, and reduced cortical responses, which is consistent with habituation.
  • Whereas -- and this is by no means surprising -- cortical responses to CS+ were sustained at end of tone following fear conditioning.
  • ----
  • Then examined this effect relative to the two populations of interneurons, using PV-cre and SOM-cre mice.
    • In PV cells, fear conditioning resulted in a decreased fraction of cells responsive, and a decreased magnitude of responses.
    • In SOM cells, CS- responses were enhanced, while CS+ were less enhanced (the main text seems like an exaggeration c.f. figure 6E)
  • This is possibly the more interesting result of the paper, but even then the result is not super strong.

{842}
hide / / print
ref: work-0 tags: distilling free-form natural laws from experimental data Schmidt Cornell automatic programming genetic algorithms date: 09-14-2018 01:34 gmt revision:5 [4] [3] [2] [1] [0] [head]

Distilling free-form natural laws from experimental data

  • Their critical step was to use partial derivatives to evaluate the search for invariants (a rough sketch of this criterion follows this list). Even so, with a 4D data set the search for natural laws took ~30 hours.
    • Then again, how long did it take humans to figure out these invariants? (Went about it in a decidedly different way..)
    • Further, how long did it take for biology to discover similar invariants?
      • They claim elsewhere that the same algorithm has been applied to biological data - a metabolic pathway - with some success.
      • Of course evolution had to explore a much larger space - proteins and regulatory pathways, not simpler mathematical expressions / linkages.
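A rough numerical sketch of the partial-derivative criterion, as I read it (not the authors' code): for a candidate conserved quantity f(x, y), df = 0 along the data implies dx/dy = -(df/dy)/(df/dx), and the score compares that prediction against dx/dy estimated from the time series. The toy oscillator data, the finite-difference partials, and the scoring form are my own choices:

import numpy as np

def numeric_partials(f, x, y, eps=1e-5):
    # central-difference partials of the candidate f with respect to x and y
    fx = (f(x + eps, y) - f(x - eps, y)) / (2 * eps)
    fy = (f(x, y + eps) - f(x, y - eps)) / (2 * eps)
    return fx, fy

def invariance_score(f, x, y):
    # higher (closer to 0) is better: mean log-error between the derivative
    # ratio implied by the candidate and the ratio measured in the data
    dx, dy = np.gradient(x), np.gradient(y)
    fx, fy = numeric_partials(f, x, y)
    ok = (np.abs(dy) > 0.1 * np.abs(dy).max()) & (np.abs(fx) > 1e-6)  # avoid divide-by-zero
    measured = dx[ok] / dy[ok]
    predicted = -(fy[ok] / fx[ok])
    return -np.mean(np.log(1.0 + np.abs(measured - predicted)))

# toy data: a harmonic oscillator; x = position, y = velocity
t = np.linspace(0, 10, 2000)
x, y = np.cos(t), -np.sin(t)

energy_like   = lambda x, y: x**2 + y**2   # a true invariant
not_invariant = lambda x, y: x + y         # not conserved
print(invariance_score(energy_like, x, y))    # close to 0
print(invariance_score(not_invariant, x, y))  # clearly worse (more negative)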

{1409}
hide / / print
ref: -0 tags: coevolution fitness prediction schmidt genetic algorithm date: 09-14-2018 01:34 gmt revision:8 [7] [6] [5] [4] [3] [2] [head]

Coevolution of Fitness Predictors

  • Michael D. Schmidt and Hod Lipson, Member, IEEE
  • Fitness prediction is a technique to replace fitness evaluation in evolutionary algorithms with a light-weight approximation that adapts with the solution population.
    • Cannot approximate the full landscape, but shift focus during evolution.
    • Aka local caching.
    • Or adversarial techniques.
  • Instead use coevolution, with three populations:
    • 1) solutions to the original problem, evaluated using only fitness predictors;
    • 2) fitness predictors of the problem; and
    • 3) fitness trainers, whose exact fitness is used to train predictors.
      • Trainers are selected as the solutions with the highest variance in predicted fitness across the predictors, and predictors are trained on this subset.
  • Lightweight fitness predictors evolve faster than the solution population, so the computational effort spent on them is capped at 5% of the overall effort.
    • These fitness predictors are basically an array of integers which index the full training set -- very simple and linear. Maybe boring, but the simplest solution that works (see the sketch after this list) ...
    • They only sample 8 training examples for even complex 30-node solution functions (!!).
    • I guess, because the information introduced into the solution set is relatively small per generation, it makes little sense to over-sample or over-specify this; all that matters is that, on average, it's directionally correct and unbiased.
  • Used deterministic crowding selection as the evolutionary algorithm.
    • Similar individuals have to compete in tournaments for space.
  • Showed that the coevolution algorithm is capable of inferring even highly complex many-term functions
    • And, it uses function evaluations more efficiently than the 'exact' (each solution evaluated exactly) algorithm.
  • Coevolution algorithm seems to induce less 'bloat' in the complexity of the solutions.
  • See also {842}
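A toy sketch of the three-population scheme (solutions, fitness predictors, trainers) described above -- my own simplification on a polynomial-fitting problem, not the authors' symbolic-regression code; all population sizes, mutation rates, and the hidden target function are made up:

import numpy as np

rng = np.random.default_rng(1)

# full training data (exact fitness uses all of it)
X = np.linspace(-1, 1, 200)
Y = 3 * X**3 - 2 * X + 0.5                  # hidden target function

N_SOL, N_PRED, N_TRAIN, K = 40, 8, 6, 8     # population sizes; K = samples per predictor
DEGREE = 5

def exact_fitness(sol):                     # full evaluation (expensive in general)
    return -np.mean((np.polyval(sol, X) - Y) ** 2)

def predicted_fitness(sol, pred):           # cheap: only the K indexed samples
    return -np.mean((np.polyval(sol, X[pred]) - Y[pred]) ** 2)

def predictor_fitness(pred, trainers):      # good if it matches exact fitness on the trainers
    return -np.mean([abs(predicted_fitness(t, pred) - exact_fitness(t))
                     for t in trainers])

solutions  = [rng.normal(scale=0.5, size=DEGREE + 1) for _ in range(N_SOL)]
predictors = [rng.integers(0, len(X), size=K) for _ in range(N_PRED)]
trainers   = [s.copy() for s in solutions[:N_TRAIN]]

for gen in range(200):
    # evolve solutions, scored only by the current best predictor
    best_pred = max(predictors, key=lambda p: predictor_fitness(p, trainers))
    ranked = sorted(solutions, key=lambda s: predicted_fitness(s, best_pred),
                    reverse=True)
    parents = ranked[:N_SOL // 2]
    solutions = parents + [p + rng.normal(scale=0.1, size=p.shape) for p in parents]

    # trainers = the solutions the predictors disagree about most (high variance)
    variance = [np.var([predicted_fitness(s, p) for p in predictors])
                for s in solutions]
    trainers = [solutions[i] for i in np.argsort(variance)[-N_TRAIN:]]

    # evolve predictors against the trainers (exact fitness is only used here)
    ranked_p = sorted(predictors, key=lambda p: predictor_fitness(p, trainers),
                      reverse=True)
    keep = ranked_p[:N_PRED // 2]
    predictors = keep + [np.where(rng.random(K) < 0.25,
                                  rng.integers(0, len(X), size=K), p)
                         for p in keep]

best = max(solutions, key=exact_fitness)
print("best exact fitness:", exact_fitness(best))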

{1408}
hide / / print
ref: -2018 tags: machine learning manifold deep neural net geometry regularization date: 08-29-2018 14:30 gmt revision:0 [head]

LDMNet: Low dimensional manifold regularized neural nets.

  • Synopsis of the math:
    • Fit a manifold formed from the concatenated input ‘’and’’ output variables, and use this to set the loss of (hence, to train) a deep convolutional neural network.
      • Manifold is fit via point integral method.
      • This requires both SGD and variational steps -- alternate between fitting the parameters, and fitting the manifold.
      • Uses a standard deep neural network.
    • Measure the dimensionality of this manifold to regularize the network, using an 'elegant trick', whatever that means.
  • Still, the results, in terms of error, do not seem significantly better than previous work (compared to weight decay, which is weak sauce, and dropout).
    • That said, the results in terms of feature projection, figures 1 and 2, ‘’do’’ look clearly better.
    • Of course, they apply the regularizer to same image recognition / classification problems (MNIST), and this might well be better adapted to something else.
  • Not completely thorough analysis, perhaps due to space and deadlines.

{1384}
hide / / print
ref: -0 tags: NET probes SU-8 microfabrication sewing machine carbon fiber electrode insertion mice histology 2p date: 12-29-2017 04:38 gmt revision:1 [0] [head]

PMID-28246640 Ultraflexible nanoelectronic probes form reliable, glial scar–free neural integration

  • SU-8 asymptotic H2O absorption is 3.3% in PBS -- quite a bit higher than I expected, and higher than PI.
  • Faced yield problems with contact litho at 2-3um trace/space.
  • Good recordings out to 4 months!
  • 3 minutes / probe insertion.
  • Fab:
    • Ni release layer, Su-8 2000.5. "excellent tensile strength" --
      • Tensile strength 60 MPa
      • Youngs modulus 2.0 GPa
      • Elongation at break 6.5%
      • Water absorption, per spec sheet, 0.65% (but not PBS)
    • 500nm dielectric; < 1% crosstalk; see figure S12.
    • Pt or Au rec sites, 10um x 20um or 30 x 30um.
    • FFC connector, with Si substrate remaining.
  • Used transgenic mice, YFP expressed in neurons.
  • CA glue used before metabond, followed by Kwik-sil silicone.
  • Neuron yield not so great -- they need to plate the electrodes down to acceptable impedance. (figure S5)
    • Measured impedance ~ 1M at 1khz.
  • Unclear if 50um x 1um is really that much worse than 10um x 1.5um.
  • Histology looks really great (figure S10).
  • Manuscript did not mention (though they did at the poster) problems with electrode pull-out; they deal with it in the same way, application of ACSF.

{1236}
hide / / print
ref: -0 tags: optogenetics micro LED flexible electrodes PET rogers date: 12-28-2017 03:24 gmt revision:9 [8] [7] [6] [5] [4] [3] [head]

PMID-23580530 Injectable, cellular-scale optoelectronics with applications for wireless optogenetics.

  • Supplementary materials
  • 21 authors, University Illinois at Urbana-Champaign, Tufts, China, Northwestern, Miami ..
  • GaN blue and green LEDs fabricated on a flexible substrate with stiff inserter.
    • Inserter is released in 15 min with dissolving silk fibroin.
    • made of 250um thick SU-8 epoxy, reverse photocured on a glass slide.
  • GaN LEDs fabricated on a sapphire substrate & transfer printed via a modified Karl Suss mask aligner.
    • See supplemental materials for the intricate steps.
    • LEDs are 50um x 50um x 6.75um
  • Have integrated:
    • Temperature sensor (Pt serpentine resistor) / heater.
    • inorganic photodetector (IPD)
      • ultrathin silicon photodiode 1.25um thick, 200 x 200um^2, made on a SOI wafer
    • Pt extracellular recording electrode.
        • This is insulated via a further 2um of SU-8.
  • Layers are precisely aligned and assembled via 500nm layer of epoxy.
    • Layers made of 6um or 2.5um thick mylar (polyethylene terephthalate (PET))
    • Layers joined with SU-8.
    • Wiring patterned via lift-off.
  • Powered via RF scavenging at 910 MHz.
    • appeared to be simple, power in = light out; no data connection.
  • Tested vs control and fiber optic stimulation, staining for:
    • Tyrosine hydroxylase (makes l-DOPA)
    • c-fos, a neural activity marker
    • u-LEDs show significant activation.
  • Also tested for GFAP (astrocytes) and Iba1 (activated microglia); flexible & smaller devices had lower gliosis.
  • Next tested for behavior using a self-stimulation protocol; mice learned to self-stimulate to release DA.
  • Devices are somewhat reliable to 250 days!

{1391}
hide / / print
ref: -0 tags: computational biology evolution metabolic networks andreas wagner genotype phenotype network date: 06-12-2017 19:35 gmt revision:1 [0] [head]

Evolutionary Plasticity and Innovations in Complex Metabolic Reaction Networks

  • ‘’João F. Matias Rodrigues, Andreas Wagner ‘’
  • Our observations suggest that the robustness of the Escherichia coli metabolic network to mutations is typical of networks with the same phenotype.
  • We demonstrate that networks with the same phenotype form large sets that can be traversed through single mutations, and that single mutations of different genotypes with the same phenotype can yield very different novel phenotypes
  • Entirely computational study.
    • Examines what is possible given known metabolic building-blocks.
  • Methodology: collated a list of all metabolic reactions in E. Coli (726 reactions, excluding 205 transport reactions) out of 5870 possible reactions.
    • Then ran random-walk mutation experiments to see where the genotype + phenotype could move. Each point in the genotype had to be viable on either a rich (many carbon source) or minimal (glucose) growth medium.
    • Viability was determined by Flux-balance analysis (FBA).
      • In our work we use a set of biochemical precursors from E. coli 47-49 as the set of required compounds a network needs to synthesize, ‘’’by using linear programming to optimize the flux through a specific objective function’’’, in this case the reaction representing the production of biomass precursors we are able to know if a specific metabolic network is able to synthesize the precursors or not.
      • Used Coin-OR and Ilog to optimize the metabolic concentrations (I think?) per given network.
    • This included the ability to synthesize all required precursor biomolecules; see supplementary information.
    • ‘’’“Viable” is highly permissive -- non-zero biomolecule concentration using FBA and linear programming. ‘’’
    • Genotypes are binary vectors over the reaction set (1 = enzyme / reaction present, 0 = mutated off); genomic distance = normalized Hamming distance between these vectors (0 = identical genotype, 1 = completely different genotype). A tiny sketch of this follows the list.
  • Between pairs of viable genetic-metabolic networks, only a minority (30 - 40%) of reactions are essential,
    • Which naturally increases with increasing carbon source diversity:
    • When they go back and examine networks that can sustain life on any of (up to) 60 carbon sources, and again measure the distance from the original E. coli genome, they find this added robustness does not significantly constrain network architecture.
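To make the genotype representation and the random-walk idea concrete, a tiny sketch (the viability test is a made-up stand-in, not real flux-balance analysis; the ~700 starting reactions only roughly echo the counts above):

import numpy as np

rng = np.random.default_rng(0)
N_REACTIONS = 5870                    # size of the global reaction "universe"

def genotype_distance(a, b):
    # normalized Hamming distance between binary reaction vectors:
    # 0 = identical genotype, 1 = completely different
    return np.mean(a != b)

def is_viable(genotype):
    # placeholder for flux-balance analysis; in the paper, viability means the
    # network can synthesize all biomass precursors on the test medium
    return genotype.sum() > 600       # made-up stand-in criterion

genotype = rng.random(N_REACTIONS) < 0.12     # start with ~700 reactions present
start = genotype.copy()

for step in range(10000):
    trial = genotype.copy()
    i = rng.integers(N_REACTIONS)
    trial[i] = not trial[i]                   # add or delete a single reaction
    if is_viable(trial):                      # only viable mutants are accepted
        genotype = trial

print("distance walked from the starting genotype:", genotype_distance(start, genotype))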

Summary thoughts: This is a highly interesting study, insofar that the authors show substantial support for their hypotheses that phenotypes can be explored through random-walk non-lethal mutations of the genotype, and this is somewhat invariant to the source of carbon for known biochemical reactions. What gives me pause is the use of linear programming / optimization when setting the relative concentrations of biomolecules, and the permissive criteria for accepting these networks; real life (I would imagine) is far more constrained. Relative and absolute concentrations matter.

Still, the study does reflect some robustness. I suggest that a good control would be to ‘fuzz’ the list of available reactions based on statistical criteria, and see if the results still hold. Then, go back and make the reactions un-biological or less networked, and see if this destroys the measured degrees of robustness.

{1354}
hide / / print
ref: -0 tags: David Kleinfeld penetrating arterioles perfusion cortex vasculature date: 10-17-2016 23:24 gmt revision:1 [0] [head]

PMID-17190804 Penetrating arterioles are a bottleneck in the perfusion of neocortex.

  • Focal photothrombosis was used to occlude single penetrating arterioles in rat parietal cortex, and the resultant changes in flow of red blood cells were measured with two-photon laser-scanning microscopy in individual subsurface microvessels that surround the occlusion.
  • We observed that the average flow of red blood cells nearly stalls adjacent to the occlusion and remains within 30% of its baseline value in vessels as far as 10 branch points downstream from the occlusion.
  • Preservation of average flow emerges 350 um away; this length scale is consistent with the spatial distribution of penetrating arterioles
  • Rose bengal photosensitizer.
  • 2p laser scanning microscopy.
  • Downstream and connected arterioles show a dramatic reduction in blood flow, even 1-4 branches in; there is little redundancy (figure 2)
  • Measured a good number of vessels (and look at their density!); results are satisfactorily quantitative.
  • Vessel leakiness extends up to 1.1mm away (!) (figure 5).

{1348}
hide / / print
ref: -0 tags: David Kleinfeld cortical vasculature laser surgery network occlusion flow date: 09-23-2016 06:35 gmt revision:1 [0] [head]

Heller Lecture - Prof. David Kleinfeld

  • Also mentions the use of LIBS + a q-switched laser for precisely drilling holes in the skull. Seems to work!
    • Use 20ns delay .. seems like there is still spectral broadening.
    • "Turn neuroscience into an industrial process, not an art form" After doing many surgeries, agreed!
  • Vasodilation & vasoconstriction are very highly regulated; there is not enough blood to go around.
    • Vessels distant from an energetic / stimulated site will (net) constrict.
  • Vascular network is almost entirely closed-loop, and not tree-like at all -- you can occlude one artery, or one capillary, and the network will route around the occlusion.
    • The density of the angio-architecture in the brain is unique in this.
  • Tested micro-occlusions by injecting rose bengal, which releases free radicals on light exposure (532nm, 0.5mw), causing coagulation.
  • "Blood flow on the surface arteriole network is insensitive to single occlusions"
  • Penetrating arterioles and venules are largely stubs -- single unbranching vessels, which again renders some immunity to blockage.
  • However! Occlusion of a penetrating arteriole retards flow within a 400 - 600um cylinder (larger than a cortical column!)
  • Occlusion of many penetrating vessels, unsurprisingly, leads to large swaths of dead cortex, "UBOs" in MRI parlance (unidentified bright objects).
  • Death and depolarizing depression can be effectively prevented by excitotoxicity inhibitors -- MK801 in the slides (NMDA blocker, systemically)

{711}
hide / / print
ref: Gradinaru-2009.04 tags: Deisseroth DBS STN optical stimulation 6-OHDA optogenetics date: 05-10-2016 23:48 gmt revision:8 [7] [6] [5] [4] [3] [2] [head]

PMID-19299587[0] Optical Deconstruction of Parkinsonian Neural Circuitry.

  • Viviana Gradinaru, Murtaza Mogri, Kimberly R. Thompson, Jaimie M. Henderson, Karl Deisseroth
  • DA depletion of the SN leads to abnormal activity in the BG ; HFS (>90Hz) of the STN has been found to be therapeutic, but the mechanism is imperfectly understood.
    • lesions of the BG can also be therapeutic.
  • Used channelrhodopsin (a light-activated cation channel (+)), expressed via cell-type-specific promoters (transgenic animals). Also used halorhodopsins, which are light-activated chloride pumps (inhibition).
    • optogenetics allows simultaneous optical stimulation and electrical recording without artifact.
  • Made PD rats by 6-hydroxydopamine unilaterally into the medial forebrain bundle of rats.
  • Then they injected eNpHR (inhibitory) opsin vector targeting excitatory neurons (under control of the CaMKIIa promoter) to the STN as identified stereotaxically & by firing pattern.
    • Electrical stimulation of this area alleviated rotational behavior (the rats were hemiparkinsonian), but optical inhibition of the STN did not.
  • Alternately, the glia in STN may be secreting molecules that modulate local circuit activity; it has been shown that glial-derived factor adenosine accumulates during DBS & seems to help with attenuation of tremor.
    • Tested this by activating glia with ChR2, which can pass small Ca+2 currents.
    • This worked: blue light halted firing in the STN; but, again, no behavioral trace of the silencing was found.
  • PD is characterized by pathological levels of beta oscillations in the BG, and synchronizing STN with the BG at gamma frequencies may ameliorate PD symptoms; while sync. at beta will worsen -- see [1][2]
  • Therefore, they tried excitatory optical stimulation of excitatory STN neurons at the high frequencies used in DBS (90-130Hz).
    • HFS to STN failed, again, to produce any therapeutic effect!
  • Next expressed channelrhodopsin only in projection neurons (Thy1::ChR2; not excitatory cells in the STN), and again did optotrode (optical stim, electrical record) recordings.
    • HFS of afferent fibers to STN shut down most of the local circuitry there, with some residual low-amplitude high frequency burstiness.
    • Observed marked effects with this treatment! Afferent HFS alleviated Parkinsonian symptoms, profoundly, with immediate reversal once the laser was turned off.
    • LFS worsened PD symptoms, in accord with electrical stimulation.
    • The Thy1::ChR2 only affected excitatory projections; GABAergic projections from GPe were absent. Dopamine projections from SNr were not affected by the virus either. However, M1 layer V projection neurons were strongly labeled by the retrovirus.
      • M1 layer V neurons could be antidromically recruited by optical stimulation in the STN.
  • Selective M1 layer V HFS also alleviated PD symptoms ; LFS had no effect; M2 (Pmd/Pmv?) LFS causes motor behavior.
  • Remind us that DBS can treat tremor, rigidity, and bradykinesia, but is ineffective at treating speech impairment, depression, and dementia.
  • Suggest that axon tract modulation could be a common theme in DBS (all the different types..), as activity in white matter represents the activity of larger regions compactly.
  • The result that the excitatory fibers of projections, mainly from the motor cortex, matter most in producing therapeutic effects of DBS is counterintuitive but important.
    • What do these neurons do normally, anyway? give a 'copy' of an action plan to the STN? What is their role in M1 / the BG? They should test with normal mice.

____References____

[0] Gradinaru V, Mogri M, Thompson KR, Henderson JM, Deisseroth K, Optical Deconstruction of Parkinsonian Neural Circuitry.Science no Volume no Issue no Pages (2009 Mar 19)
[1] Eusebio A, Brown P, Synchronisation in the beta frequency-band - The bad boy of parkinsonism or an innocent bystander?Exp Neurol no Volume no Issue no Pages (2009 Feb 20)
[2] Wingeier B, Tcheng T, Koop MM, Hill BC, Heit G, Bronte-Stewart HM, Intra-operative STN DBS attenuates the prominent beta rhythm in the STN in Parkinson's disease.Exp Neurol 197:1, 244-51 (2006 Jan)

{1334}
hide / / print
ref: -0 tags: micro LEDS Buzaki silicon neural probes optogenetics date: 04-18-2016 18:00 gmt revision:0 [head]

PMID-26627311 Monolithically Integrated μLEDs on Silicon Neural Probes for High-Resolution Optogenetic Studies in Behaving Animals.

  • 12 uLEDs and 32 rec sites integrated into one probe.
  • InGaN monolithically integrated LEDs.
    • Si has ~ 5x higher thermal conductivity than sapphire, allowing better heat dissipation.
    • Use quantum-well epitaxial layers, 460nm emission, 5nm Ni / 5nm Au current injection w/ 75% transmittance @ design wavelength.
      • Think the n/p GaN epitaxy is done by an outside company, NOVAGAN.
    • Efficiency near 80% -- small LEDs have fewer defects!
    • SiO2 + ALD Al2O3 passivation.
    • 70um wide, 30um thick shanks.

{1287}
hide / / print
ref: -0 tags: maleimide azobenzine glutamate photoswitch optogenetics date: 06-16-2014 21:19 gmt revision:0 [head]

PMID-16408092 Allosteric control of an ionotropic glutamate receptor with an optical switch

  • 2006
  • Use an azobenzene (two benzene rings linked by an N=N double bond) as a photo-switchable allosteric activator that reversibly presents glutamate to an ion channel.
  • PMID-17521567 Remote control of neuronal activity with a light-gated glutamate receptor.
    • The molecule, in use.
  • Likely the molecule is harder to produce than channelrhodopsin or halorhodopsin, hence less used. Still, it's quite a technology.

{1283}
hide / / print
ref: -0 tags: optogenetics glutamate azobenzine date: 05-07-2014 19:51 gmt revision:0 [head]

PMID-17521567 Remote control of neuronal activity with a light-gated glutamate receptor.

  • Neuron 2007.
  • azobenzenes undergo a cis-to-trans conformational change via illumination with one wavelength, and trans-to-cis via another. (neat!!)
  • This was used to create photo-controlled (on and off) glutamate channels.

{1269}
hide / / print
ref: -0 tags: hinton convolutional deep networks image recognition 2012 date: 01-11-2014 20:14 gmt revision:0 [head]

ImageNet Classification with Deep Convolutional Networks

{1257}
hide / / print
ref: -0 tags: Anna Roe optogenetics artificial dura monkeys intrinisic imaging date: 09-30-2013 19:08 gmt revision:3 [2] [1] [0] [head]

PMID-23761700 Optogenetics through windows on the brain in nonhuman primates

  • technique paper.
  • placed over the visual cortex.
  • Injected virus through the artificial dura -- micropipette, not CVD.
  • Strong expression (see figures in the original).
  • See also: PMID-19409264 (Boyden, 2009)

{1255}
hide / / print
ref: -0 tags: Disseroth Kreitzer parkinsons optogenetics D1 D2 6OHDA date: 09-30-2013 18:15 gmt revision:0 [head]

PMID-20613723 Regulation of parkinsonian motor behaviors by optogenetic control of basal ganglia circuitry

  • Kravitz AV, Freeze BS, Parker PR, Kay K, Thwin MT, Deisseroth K, Kreitzer AC.
  • Generated mouse lines with channelrhodopsin2, with Cre recombinase under control of regulatory elements for the dopamine D1 (direct) or D2 (indirect) receptor.
  • optogenetic excitation of the indirect pathway elicited a parkinsonian state: increased freezing, bradykinesia and decreased locomotor initiations;
  • Activation of the direct pathway decreased freezing and increased locomotion.
  • Then: 6OHDA depletion of striatal dopamine neurons.
  • Optogenetic activation of direct pathway (D1 Cre/loxp) neurons restored behavior to pre-lesion levels.
    • Hence, this seems like a good target for therapy.

{1177}
hide / / print
ref: -0 tags: magnetic flexible insertion japan neural recording electrodes date: 01-28-2013 03:54 gmt revision:2 [1] [0] [head]

IEEE-1196780 (pdf) 3D flexible multichannel neural probe array

  • Shoji Takeuchi, Takafumi Suzuki, Kunihiko Mabuchi and Hiroyuki Fujita
  • wild -- they use a magnetic field to make the electrodes stand up!
  • Electrodes released with DRIE, as with Michigan probes.
  • As with many other electrodes, pretty high electrical impedance - 1.5M @ 1kHz.
    • 20x20um recording sites on 10um parylene.
  • Could push these into a rat and record extracellular APs, but nothing quantitative, no histology either.
  • Used a PEG coating to make them stiff enough to insert into the ctx (phantom in IEEE conference proceedings.)

{1214}
hide / / print
ref: -0 tags: brain micromotion magnetic resonance imaging date: 01-28-2013 01:38 gmt revision:0 [head]

PMID-7972766 Brain and cerebrospinal fluid motion: real-time quantification with M-mode MR imaging.

  • Measured brain motion via a clever MR protocol. (beyond my present understanding...)
  • ventricles move at up to 1mm/sec
  • In the Valsalva maneuver the brainstem can move 2-3mm.
  • Coughing causes upswing of the CSF.

{54}
hide / / print
ref: bookmark-0 tags: intrinsic evolution FPGA GPU optimization algorithm genetic date: 01-27-2013 22:27 gmt revision:1 [0] [head]


  • http://evolutioninmaterio.com/ - using FPGAs in intrinsic evolution, e.g. the device is actually programmed and tested.
  • Adrian Thompson's homepage -- there are many PDFs of his work there.
  • Parallel genetic algorithms on programmable graphics hardware
    • basically deals with optimizing mutation and fitness evaluation using the parallel architecture of a GPU: larger populations can be evaluated at one time.
    • does not concern the intrinsic evolution of algorithms to the GPU, as in Adrian's work.
    • uses a linear congruential generator to produce random numbers.
    • used a really simple problem: the Colville minimization problem, which only needs to search through a four-dimensional space.
  • Cellular genetic algorithms and local search for 3-SAT problem on Graphic Hardware
    • concerning SAT (satisfiability): "many practical problems, such as graph coloring, job-shop scheduling, and real-world scheduling can be represented as a SAT problem."
    • 3-SAT refers to clauses of length 3; length-3 clauses apparently make the problem very hard.
    • they use a combination of greedy search (flip the bit that increases the fitness by the largest amount) and random-walk via point mutations to keep the algorithm away from local minima (a sketch of this follows at the end of this entry).
    • they also use a cellular genetic algorithm (which works better on a GPU): selection uses the optimal neighbor, not the global best individual.
    • only used a GeForce 6200 GPU, but it was still 5x faster than a P4 at 2.4GHz.
  • Evolution of a robot controller using cartesian genetic programming
    • cartesian programming has many advantages over traditional tree-based methods - e.g. bloat-free evolution & faster evolution through neutral search.
    • cartesian programming is characterized by its encoding of a graph as a string of integers that represent the functions and connections between graph nodes, and program inputs and outputs.
      • this encoding was developed in the course of evolving electronic circuits, e.g. above ?
      • can encode a non-connected graph. the genetic material that is not utilized is analogous to biological junk DNA.
    • even in converged populations, small mutations can produce large changes in phenotypic behavior.
    • in this work he only uses directed graphs - there are no cycles & an organized flow of information.
    • mentions automatically defined functions - what is this??
    • used diffusion to define the fitness values of particular locations in the map. the fewer particles there eventually were in a grid location, the higher the fitness value of the robot that managed to get there.
  • Hardware evolution: on the nature of artificially evolved circuits - doctoral dissertation.
    • because evolved circuits exploit the parasitic properties of devices, they have little tolerance for variation in component values. Reverse engineering the circuits evolved to improve tolerance is not easy.
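For the 3-SAT paper's greedy-flip + random-walk idea, a plain WalkSAT-style sketch (this is only the local-search component, not their cellular GA, and certainly not a GPU implementation; the instance size and walk probability are made up):

import random

random.seed(0)
N_VARS, N_CLAUSES, P_RANDOM, MAX_FLIPS = 20, 85, 0.5, 20000

# a clause is a tuple of 3 literals; +v means variable v true, -v means false
clauses = [tuple(random.choice([-1, 1]) * v
                 for v in random.sample(range(1, N_VARS + 1), 3))
           for _ in range(N_CLAUSES)]

def satisfied(lit, assign):
    return assign[abs(lit)] if lit > 0 else not assign[abs(lit)]

def num_satisfied(assign):
    return sum(any(satisfied(l, assign) for l in c) for c in clauses)

assign = {v: random.random() < 0.5 for v in range(1, N_VARS + 1)}

for flip_count in range(MAX_FLIPS):
    unsat = [c for c in clauses if not any(satisfied(l, assign) for l in c)]
    if not unsat:
        print("satisfied after", flip_count, "flips")
        break
    clause = random.choice(unsat)               # focus on one unsatisfied clause
    if random.random() < P_RANDOM:
        var = abs(random.choice(clause))        # random-walk move (escape local minima)
    else:                                       # greedy move: flip the best variable
        def gain(v):
            assign[v] = not assign[v]
            s = num_satisfied(assign)
            assign[v] = not assign[v]
            return s
        var = max((abs(l) for l in clause), key=gain)
    assign[var] = not assign[var]
else:
    print("gave up;", len(unsat), "clauses still unsatisfied")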

{913}
hide / / print
ref: Ganguly-2011.05 tags: Carmena 2011 reversible cortical networks learning indirect BMI date: 01-23-2013 18:54 gmt revision:6 [5] [4] [3] [2] [1] [0] [head]

PMID-21499255[0] Reversible large-scale modification of cortical networks during neuroprosthetic control.

  • Split the group of recorded motor neurons into direct (decoded and controls the BMI) and indirect (passive) neurons.
  • Both groups showed changes in neuronal tuning / PD.
    • More PD. Is there no better metric?
  • Monkeys performed manual control before (MC1) and after (MC2) BMI training.
    • The majority of neurons reverted back to original tuning after BC; c.f. [1]
  • Monkeys were trained to rapidly switch between manual and brain control; still showed substantial changes in PD.
  • 'Near' (on same electrode as direct neurons) and 'far' neurons (different electrode) showed similar changes in PD.
    • Modulation Depth in indirect neurons was less in BC than manual control.
  • Prove (pretty well) that motor cortex neuronal spiking can be dissociated from movement.
  • Indirect neurons showed decreased modulation depth (MD) -> perhaps this is to decrease interference with direct neurons.
  • Quote "Studies of operant conditioning of single neurons found that conconditioned adjacent neurons were largely correlated with the conditioned neurons".
    • Well, also: Fetz and Baker showed that you can condition neurons recorded on the same electrode to covary or inversely vary.
  • Contrast with studies of motor learning in different force fields, where there is a dramatic memory trace.
    • Possibly this is from proprioception activating the cerebellum?

Other notes:

  • Scale bars on the waveforms are incorrect for figure 1.
  • Same monkeys as [2]

____References____

[0] Ganguly K, Dimitrov DF, Wallis JD, Carmena JM, Reversible large-scale modification of cortical networks during neuroprosthetic control.Nat Neurosci 14:5, 662-7 (2011 May)
[1] Gandolfo F, Li C, Benda BJ, Schioppa CP, Bizzi E, Cortical correlates of learning in monkeys adapting to a new dynamical environment.Proc Natl Acad Sci U S A 97:5, 2259-63 (2000 Feb 29)
[2] Ganguly K, Carmena JM, Emergence of a stable cortical map for neuroprosthetic control.PLoS Biol 7:7, e1000153 (2009 Jul)

{1058}
hide / / print
ref: -0 tags: Purdue magnetic bullet electrode implantation date: 01-04-2013 00:51 gmt revision:3 [2] [1] [0] [head]

PMID-19596378 Magnetic insertion system for flexible electrode implantation.

  • Probes constructed from a sharp magnetic tip attached to a flexible tether.
  • Cite Polikov et al 2005. {781}.
  • Re micromotion: (Gilletti and Muthuswamy, 2006 {1102}; Lee et al., 2004; Subbaroyan et al., 2005 {1103}).
  • 0.6 mm (600 um!) diameter steel bullet, 4mm long, on the end of 38 gauge magnet wire. Mass 7.2 +- 0.4 mg.
  • Peak current 520 A from an 800V, 900uF capacitor, which produces a maximum force of 10 N on the electrode, driving it at 126.25 m/s.
  • Did manage to get neural data.
  • Experimental evidence suggests that macrophages have difficulty adhering to and spreading on polymer fibers ranging between 2.1 and 5.9 um in diameter. PMID-8902241 Bernatchez et al. 1996 and {746}.
  • Shot through the dura.
  • Also reference magnetic stereotaxis for use in manipulating magnetic 'seeds' through cancers for hyperthermic destruction.
  • See also their 2011 AES abstract

{1183}
hide / / print
ref: -0 tags: optical imaging neural recording diamond magnetic date: 01-02-2013 03:44 gmt revision:0 [head]

PMID-22574249 High spatial and temporal resolution wide-field imaging of neuron activity using quantum NV-diamond.

  • yikes: In this work we consider a fundamentally new form of wide-field imaging for neuronal networks based on the nanoscale magnetic field sensing properties of optically active spins in a diamond substrate.
  • Cultured neurons.
  • NV = nitrogen-vacancy defect centers.
    • "The NV centre is a remarkable optical defect in diamond which allows discrimination of its magnetic sublevels through its fluorescence under illumination. "
    • We show that the NV detection system is able to non-invasively capture the transmembrane potential activity in a series of near real-time images, with spatial resolution at the level of the individual neural compartments.
  • Did not actually perform neural measurements -- used a 10um microwire with mA of current running through it.
    • I would imagine that actual neurons have far less current!

{1125}
hide / / print
ref: -0 tags: active filter design Netherlands Gerrit Groenewold date: 02-17-2012 20:27 gmt revision:0 [head]

IEEE-04268406 (pdf) Noise and Group Delay in Actvie Filters

  • relevant conclusion: the output noise spectrum is exactly proportinoal to the group delay.
  • Poschenrieder established a relationship between group delay and energy stored in a passive filter.
  • Fettweis proved from this that the noise generation of an active filter which is based on a passive filter is appoximately proportional to the group delay. (!!!)

{425}
hide / / print
ref: bookmark-2007.08 tags: donoghue cyberkinetics BMI braingate date: 01-06-2012 03:09 gmt revision:3 [2] [1] [0] [head]

images/425_1.pdf August 2007

  • provides more extensive details on the braingate system.
  • including their automatic impedance tester (5 mV, 10 pA)
  • and the automatic spike sorter.
  • the different tests that were required, such as accelerated aging in 50-70 deg C saline baths
  • the long path to market - $30 - $40 million more (of course, they have since abandoned the product).

{1007}
hide / / print
ref: Dethier-2011.28 tags: BMI decoder spiking neural network Kalman date: 01-06-2012 00:20 gmt revision:1 [0] [head]

IEEE-5910570 (pdf) Spiking neural network decoder for brain-machine interfaces

  • Gold standard: Kalman filter (a minimal sketch of such a decoder follows this list).
  • Spiking neural network got within 1% of this standard.
  • The 'neuromorphic' approach.
  • Used Nengo, a freely available neural simulator.
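For reference, a minimal sketch of the kind of Kalman-filter decoder used as the gold standard -- my own generic version on synthetic data, not the paper's implementation. State = 2-D kinematics, observations = binned firing rates, parameters fit by least squares; real decoders add velocity/position structure, bias terms, and causal binning.

import numpy as np

rng = np.random.default_rng(0)

# synthetic training data: 2-D kinematic state, 30 neurons
T, n_neurons = 2000, 30
x = np.cumsum(rng.normal(size=(T, 2)), axis=0)          # latent kinematics (random walk)
H_true = rng.normal(size=(n_neurons, 2))
y = x @ H_true.T + rng.normal(scale=2.0, size=(T, n_neurons))   # "firing rates"

# fit the linear-Gaussian model by least squares
X0, X1 = x[:-1], x[1:]
A = np.linalg.lstsq(X0, X1, rcond=None)[0].T            # state transition
W = np.cov((X1 - X0 @ A.T).T)                           # state noise covariance
H = np.linalg.lstsq(x, y, rcond=None)[0].T              # observation (tuning) model
Q = np.cov((y - x @ H.T).T)                             # observation noise covariance

def kalman_decode(y_seq):
    # standard predict / update recursion
    x_hat, P, out = np.zeros(2), np.eye(2), []
    for y_t in y_seq:
        x_hat = A @ x_hat                               # predict
        P = A @ P @ A.T + W
        S = H @ P @ H.T + Q                             # update
        K = P @ H.T @ np.linalg.inv(S)
        x_hat = x_hat + K @ (y_t - H @ x_hat)
        P = (np.eye(2) - K @ H) @ P
        out.append(x_hat.copy())
    return np.array(out)

decoded = kalman_decode(y)
print("decode RMSE:", np.sqrt(np.mean((decoded - x) ** 2)))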

____References____

Dethier, J. and Gilja, V. and Nuyujukian, P. and Elassaad, S.A. and Shenoy, K.V. and Boahen, K. Neural Engineering (NER), 2011 5th International IEEE/EMBS Conference on 396 -399 (2011)

{998}
hide / / print
ref: -0 tags: bookmark Cory Doctorow EFF SOPA internet freedom date: 01-01-2012 21:51 gmt revision:0 [head]

The Coming War on General Computation "M.P.s and Congressmen and so on are elected to represent districts and people, not disciplines and issues. We don't have a Member of Parliament for biochemistry, and we don't have a Senator from the great state of urban planning, and we don't have an M.E.P. from child welfare. "

{993}
hide / / print
ref: Sanchez-2005.06 tags: BMI Sanchez Nicolelis Wessberg recurrent neural network date: 01-01-2012 18:28 gmt revision:2 [1] [0] [head]

IEEE-1439548 (pdf) Interpreting spatial and temporal neural activity through a recurrent neural network brain-machine interface

  • Putting it here for the record.
  • Note they did a sensitivity analysis (via chain rule) of the recurrent neural network used for BMI predictions.
  • Used data (X,Y,Z) from 2 monkeys feeding.
  • Figure 6 is strange, data could be represented better.
  • Also see: IEEE-1300786 (pdf) Ascertaining the importance of neurons to develop better brain-machine interfaces Also by Justin Sanchez.

____References____

Sanchez, J.C. and Erdogmus, D. and Nicolelis, M.A.L. and Wessberg, J. and Principe, J.C. Interpreting spatial and temporal neural activity through a recurrent neural network brain-machine interface Neural Systems and Rehabilitation Engineering, IEEE Transactions on 13 2 213 -219 (2005)

{968}
hide / / print
ref: Bassett-2009.07 tags: Weinberger congnitive efficiency beta band neuroimagaing EEG task performance optimization network size effort date: 12-28-2011 20:39 gmt revision:1 [0] [head]

PMID-19564605[0] Cognitive fitness of cost-efficient brain functional networks.

  • Idea: smaller, tighter networks are correlated with better task performance
    • working memory task in normal subjects and schizophrenics.
  • Larger networks operate with higher beta frequencies (more effort?) and show less efficient task performance.
  • Not sure about the noisy data, but v. interesting theory!

____References____

[0] Bassett DS, Bullmore ET, Meyer-Lindenberg A, Apud JA, Weinberger DR, Coppola R, Cognitive fitness of cost-efficient brain functional networks.Proc Natl Acad Sci U S A 106:28, 11747-52 (2009 Jul 14)

{323}
hide / / print
ref: Loewenstein-2006.1 tags: reinforcement learning operant conditioning neural networks theory date: 12-07-2011 03:36 gmt revision:4 [3] [2] [1] [0] [head]

PMID-17008410[0] Operant matching is a generic outcome of synaptic plasticity based on the covariance between reward and neural activity

  • The probability of choosing an alternative in a long sequence of repeated choices is proportional to the total reward derived from that alternative, a phenomenon known as Herrnstein's matching law.
  • We hypothesize that there are forms of synaptic plasticity driven by the covariance between reward and neural activity, and prove mathematically that matching (of choice allocation to reward) is a generic outcome of such plasticity (formulas below).
    • models for learning that are based on the covariance between reward and choice are common in economics and are used phenomenologically to explain human behavior.
  • this model can be tested experimentally by making reward contingent not on the choices, but rather directly on neural activity.
  • Maximization is shown to be a generic outcome of synaptic plasticity driven by the sum of the covariances between reward and all past neural activities.
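My shorthand for the two claims, in LaTeX (notation mine, not the paper's):

% covariance-based plasticity: the change in synaptic weight W_i tracks the
% covariance between the (scalar) reward R and the neural activity N_i at that synapse
\Delta W_i \;\propto\; \operatorname{Cov}(R, N_i) \;=\; \langle R N_i \rangle - \langle R \rangle \langle N_i \rangle

% Herrnstein's matching law: choices C_a allocated to alternative a are
% proportional to the total income I_a earned from that alternative
\frac{C_a}{\sum_b C_b} \;=\; \frac{I_a}{\sum_b I_b}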

____References____

{883}
hide / / print
ref: -0 tags: Lehrer internet culture community collapse groupthink date: 06-01-2011 02:22 gmt revision:1 [0] [head]

Response to Jonah Lehrer's The Web and the Wisdom of Crowds:

Lehrer is right on one thing: culture. We're all consuming similar things (e.g. Rebecca Black) via the strong positive feedback of sharing things that you like, liking things that you share, and becoming more like the things that are shared with you. Will this lead to a cultural convergence, or a stable n-ary system? Too early to tell, but probably not: likely this is nothing new. Would you expect music to collapse to a single genre? No way. Sure, there will be pop culture via the mechanisms Lehrer suggests, but meanwhile there is too much to explore, and we like novelty too much.

Regarding decision making through stochastic averaging as implemented in democracy, I have to agree with John Hawk here. The growing availability of knowledge, news, and other opinions should be a good thing. This ought to be more than enough to counteract the problem of everyone reading say the NYTimes instead of many varied local newspapers; there should be no impoverishment of opinion. Furthermore, we read blogs (like Lehrer's) which have to compete increasingly honestly in the attention economy. The cost of redirecting our attention has gone from that of a subscription to free. Plus, this attention economy ties communication to reality at more points - each reader, as opposed to each publisher, is partially responsible for information amplification and dissemination. (I mean I just published this damn thing and almost zero cost - is that not a great thing?)

{862}
hide / / print
ref: -0 tags: backpropagation cascade correlation neural networks date: 12-20-2010 06:28 gmt revision:1 [0] [head]

The Cascade-Correlation Learning Architecture

  • Much better - much more sensible, computationally cheaper, than backprop.
  • Units are added one by one; each is trained to be maximally correlated to the error of the existing, frozen neural network (a sketch of this candidate-training step follows this list).
  • Uses quickprop to speed up gradient ascent learning.
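A sketch of the candidate-training step as I understand it -- plain gradient ascent on the covariance instead of quickprop, a single candidate unit, and made-up toy data:

import numpy as np

rng = np.random.default_rng(0)

# toy problem: the existing "network" is a frozen linear readout that underfits
X = rng.normal(size=(200, 3))                     # inputs (200 patterns, 3 features)
targets = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2
w_out = np.linalg.lstsq(X, targets, rcond=None)[0]
E = (X @ w_out - targets)[:, None]                # residual errors (one output unit)

w_cand = rng.normal(scale=0.1, size=3)            # candidate unit's input weights
lr = 0.05
for it in range(500):
    V = np.tanh(X @ w_cand)                       # candidate unit's value
    Vbar, Ebar = V.mean(), E.mean(axis=0)
    cov = ((V - Vbar)[:, None] * (E - Ebar)).sum(axis=0)   # one term per output
    # gradient ascent on S = sum_o |cov_o| (treating Vbar as constant)
    sigma = np.sign(cov)
    grad = (((E - Ebar) @ sigma) * (1 - V ** 2))[:, None] * X
    w_cand += lr * grad.mean(axis=0)

print("final |covariance| with the residual error:", np.abs(cov).sum())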

{795}
hide / / print
ref: work-0 tags: machine learning reinforcement genetic algorithms date: 10-26-2009 04:49 gmt revision:1 [0] [head]

I just had dinner with Jesse, and we had a good/productive discussion/brainstorm about algorithms, learning, and neurobio. Two things worth repeating, one simpler than the other:

1. Gradient descent / Newton-Rhapson like techniques should be tried with genetic algorithms. As of my current understanding, genetic algorithms perform an semi-directed search, randomly exploring the space of solutions with natural selection exerting a pressure to improve. What if you took the partial derivative of each of the organism's genes, and used that to direct mutation, rather than random selection of the mutated element? What if you looked before mating and crossover? Seems like this would speed up the algorithm greatly (though it might get it stuck in local minima, too). Not sure if this has been done before - if it has, edit this to indicate where!

2. Most supervised machine learning algorithms seem to rely on one single, externally applied objective function which they then attempt to optimize. (Rather this is what convex programming is. Unsupervised learning of course exists, like PCA, ICA, and other means of learning correlative structure) There are a great many ways to do optimization, but all are exactly that - optimization, search through a space for some set of weights / set of rules / decision tree that maximizes or minimizes an objective function. What Jesse and I have arrived at is that there is no real utility function in the world, (Corollary #1: life is not an optimization problem (**)) -- we generate these utility functions, just as we generate our own behavior. What would happen if an algorithm iteratively estimated, checked, cross-validated its utility function based on the small rewards actually found in the world / its synthetic environment? Would we get generative behavior greater than the complexity of the inputs? (Jesse and I also had an in-depth talk about information generation / destruction in non-linear systems.)

Put another way, perhaps part of learning is to structure internal valuation / utility functions to set up reinforcement learning problems where the reinforcement signal comes according to satisfaction of sub-goals (= local utility functions). Or, the gradient signal comes by evaluating partial derivatives of actions wrt Creating these goals is natural but not always easy, which is why one reason (of very many!) sports are so great - the utility function is clean, external, and immutable. The recursive, introspective creation of valuation / utility functions is what drives a lot of my internal monologues, mixed with a hefty dose of taking partial derivatives (see {780}) based on models of the world. (Stated this way, they seem so similar that perhaps they are the same thing?)

To my limited knowledge, there has been some work as of recent in the creation of sub-goals in reinforcement learning. One paper I read used a system to look for states that had a high ratio of ultimately rewarded paths to unrewarded paths, and selected these as subgoals (e.g. rewarded the agent when this state was reached.) I'm not talking about these sorts of sub-goals. In these systems, there is an ultimate goal that the researcher wants the agent to achieve, and it is the algorithm's (or s') task to make a policy for generating/selecting behavior. Rather, I'm interested in even more unstructured tasks - make a utility function, and a behavioral policy, based on small continuous (possibly irrelevant?) rewards in the environment.

Why would I want to do this? The pet project I have in mind is a 'cognitive' PCB part placement / layout / routing algorithm to add to my pet project, kicadocaml, to finally get some people to use it (the attention economy :-) In the course of thinking about how to do this, I've realized that a substantial problem is simply determining what board layouts are good, and what are not. I have a rough aesthetic idea + some heuristics that I learned from my dad + some heuristics I've learned through practice of what is good layout and what is not - but, how to code these up? And what if these aren't the best rules, anyway? If i just code up the rules I've internalized as utility functions, then the board layout will be pretty much as I do it - boring!

Well, I've stated my sub-goal in the form of a problem statement and some criteria to meet. Now, to go and search for a decent solution to it. (Have to keep this blog m8ta!) (Or, realistically, to go back and see if the problem statement is sensible).

(**) Corollary #2 - There is no god. nod, Dawkins.

{789}
hide / / print
ref: work-0 tags: emergent leabra QT neural networks GUI interface date: 10-21-2009 19:02 gmt revision:4 [3] [2] [1] [0] [head]

I've been reading Computational Explorations in Cognitive Neuroscience, and decided to try the code that comes with / is associated with the book. This used to be called "PDP+", but was re-written, and is now called Emergent. It's a rather large program - links to Qt, GSL, Coin3D, Quarter, Open Dynamics Library, and others. The GUI itself seems obtuse and too heavy; it's not clear why they need to make this so customized / panneled / tabbed. Also, it depends on relatively recent versions of each of these libraries - which made the install on my Debian Lenny system a bit of a chore (kinda like windows).

A really strange thing is that programs are stored in tree lists - woah - a natural folding editor built in! I've never seen a programming language that doesn't rely on simple text files. Not a bad idea, but still foreign to me. (But I guess programs are inherently hierarchal anyway.)

Below, a screenshot of the whole program - note they use a Coin3D window to graph things / interact with the model. The colored boxes in each network layer indicate local activations, and they update as the network is trained. I don't mind this interface, but again it seems a bit too 'heavy' for things that are inherently 2D (like 2D network activations and the output plot). It's good for seeing hierarchies, though, like the network model.

All in all looks like something that could be more easily accomplished with some python (or ocaml), where the language itself is used for customization, and not a GUI. With this approach, you spend more time learning about how networks work, and less time programming GUIs. On the other hand, if you use this program for teaching, the gui is essential for debugging your neural networks, or other people use it a lot, maybe then it is worth it ...

In any case, the book is very good. I've learned about GeneRec, which uses different activation phases to compute local errors for the purposes of error-minimization, as well as the virtues of using both Hebbian and error-based learning (like GeneRec). Specifically, the authors show that error-based learning can be rather 'lazy', purely moving down the error gradient, whereas Hebbian learning can internalize some of the correlational structure of the input space. You can look at this internalization as 'weight constraint' which limits the space that error-based learning has to search. Cool idea! Inhibition also is a constraint - one which constrains the network to be sparse.

To use his/their own words:

... given the explanation above about the network's poor generalization, it should be clear why both Hebbian learning and kWTA (k winner take all) inhibitory competition can improve generalization performance. At the most general level, they constitute additional biases that place important constraints on the learning and the development of representations. More specifically, Hebbian learning constrains the weights to represent the correlational structure of the inputs to a given unit, producing systematic weight patterns (e.g. cleanly separated clusters of strong correlations).

Inhibitory competition helps in two ways. First, it encourages individual units to specialize in representing a subset of items, thus parcelling up the task in a much cleaner and more systematic way than would occur in an otherwise unconstrained network. Second, inhibition greatly restricts the settling dynamics of the network, greatly constraining the number of states the network can settle into, and thus eliminating a large proportion of the attractors that can hijack generalization.."

{787}
hide / / print
ref: life-0 tags: IQ intelligence Flynn effect genetics facebook social utopia data machine learning date: 10-02-2009 14:19 gmt revision:1 [0] [head]

src

My theory on the Flynn effect - human intelligence IS increasing, and this is NOT stopping. Look at it from a ML perspective: there is more free time to get data, the data (and world) has almost unlimited complexity, the data is much higher quality and much easier to get (the vast internet & world!(travel)), there is (hopefully) more fuel to process that data (food!). Therefore, we are getting more complex, sophisticated, and intelligent. Also, the idea that less-intelligent people having more kids will somehow 'dilute' our genetic IQ is bullshit - intelligence is mostly a product of environment and education, and is tailored to the tasks we need to do; it is not (or only very weakly, except at the extremes) tied to the wetware. Besides, things are changing far too fast for genetics to follow.

Regarding this social media, like facebook and others, you could posit that social intelligence is increasing, along similar arguments to above: social data is seemingly more prevalent, more available, and people spend more time examining it. Yet this feels to be a weaker argument, as people have always been socializing, talking, etc., and I'm not sure if any of these social media have really increased it. Irregardless, people enjoy it - that's the important part.

My utopia for today :-)

{690}
hide / / print
ref: Chapin-1999.07 tags: chapin Nicolelis BMI neural net original SUNY rat date: 09-02-2009 23:11 gmt revision:2 [1] [0] [head]

PMID-10404201 Real-time control of a robot arm using simultaneously recorded neurons in the motor cortex.

  • Abstract: To determine whether simultaneously recorded motor cortex neurons can be used for real-time device control, rats were trained to position a robot arm to obtain water by pressing a lever. Mathematical transformations, including neural networks, converted multineuron signals into 'neuronal population functions' that accurately predicted lever trajectory. Next, these functions were electronically converted into real-time signals for robot arm control. After switching to this 'neurorobotic' mode, 4 of 6 animals (those with > 25 task-related neurons) routinely used these brain-derived signals to position the robot arm and obtain water. With continued training in neurorobotic mode, the animals' lever movement diminished or stopped. These results suggest a possible means for movement restoration in paralysis patients.
The basic idea of the experiment: the rat first positioned the robot arm (to obtain water) with a forelimb lever, and later learned to control the robot arm directly from cortical activity. They used an artificial neural network to decode the intended movement.

{776}
hide / / print
ref: work-0 tags: neural networks course date: 09-01-2009 04:24 gmt revision:0 [head]

http://www.willamette.edu/~gorr/classes/cs449/intro.html -- decent resource, good explanation of the equations associated with artificial neural networks.

{756}
hide / / print
ref: life-0 tags: education wikinomics internet age college university pedagogy date: 06-11-2009 12:52 gmt revision:0 [head]

Will universities stay relevant? and the rest of the wikinomics blog

  • Idea: for universities to remain relevant, they will have to change their teaching styles to match the impatient and interactive internet-raised generation.
  • Notable quotes:
    • [College students today] want to learn, but they want to learn only from what they have to learn, and they want to learn it in a style that is best for them.
    • In the old model, teachers taught and students were expected to absorb vast quantities of content. Education was about absorbing content and being able to recall it on exams. You graduated and you were set for life - just “keeping” up in your chosen field. Today when you graduate you’re set for say, 15 minutes. (heheh)
  • What matters now is a student's capacity for learning. Hence colleges should teach meta-learning: learning how to learn.
  • My opinion: Universities will not die, they are too useful given the collaborative nature of human learning: they bring many different people together for the purpose of learning (and perhaps doing research). This is essential, not just for professional learning, but for life-learning (learning from other's experience so you don't have to experience it). Sure, people can learn by consulting google or wikipedia, but it's not nearly as good as face-to-face lectures (where you can ask questions!) or office hours, because the teacher there has some idea what is going on in the student's mind as he/she learns, and can anticipate questions and give relevant guidance based on experience. Google and Wikipedia, for now, cannot do this as well as a good, thoughtful teacher or friend.

{724}
hide / / print
ref: Oskoei-2008.08 tags: EMG pattern analysis classification neural network date: 04-07-2009 21:10 gmt revision:2 [1] [0] [head]

  • EMG pattern analysis and classification by Neural Network
    • 1989!
    • short, simple paper. showed that 20 patterns can accurately be decoded with a backprop-trained neural network.
  • PMID-18632358 Support vector machine-based classification scheme for myoelectric control applied to upper limb.
    • myoelectric discrimination with SVM running on features in both the time and frequency domain.
    • a surface MES (myoelectric signal) is formed via the superposition of individual action potentials generated by irregular discharges of active motor units in a muscle fiber. Its amplitude, variance, energy, and frequency vary depending on contraction level.
    • Time domain features (a sketch computing these follows this list):
      • Mean absolute value (MAV)
      • root mean square (RMS)
      • waveform length (WL)
      • variance
      • zero crossings (ZC)
      • slope sign changes (SSC)
      • Willison amplitude.
    • Frequency domain features:
      • power spectrum
      • autoregressive coefficients order 2 and 6
      • mean signal frequency
      • median signal frequency
      • good performance with just RMS + AR2 for 50 or 100ms segments. Used a SVM with a RBF kernel.
      • looks like you can just get away with time-domain metrics!!
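A sketch of the listed time-domain features over one analysis window (definitions follow the usual EMG-literature conventions; the thresholds are made-up parameters, not the paper's values):

import numpy as np

def emg_time_features(x, zc_thresh=0.01, ssc_thresh=0.01, wamp_thresh=0.05):
    # x: 1-D array, one analysis window of a single EMG channel
    d = np.diff(x)
    mav = np.mean(np.abs(x))                       # mean absolute value
    rms = np.sqrt(np.mean(x ** 2))                 # root mean square
    wl = np.sum(np.abs(d))                         # waveform length
    var = np.var(x)                                # variance
    # zero crossings: sign change with an amplitude step above a small threshold
    zc = np.sum((x[:-1] * x[1:] < 0) & (np.abs(d) > zc_thresh))
    # slope sign changes: the derivative changes sign and is not negligible
    ssc = np.sum((d[:-1] * d[1:] < 0) &
                 ((np.abs(d[:-1]) > ssc_thresh) | (np.abs(d[1:]) > ssc_thresh)))
    # Willison amplitude: count of successive differences above a threshold
    wamp = np.sum(np.abs(d) > wamp_thresh)
    return dict(MAV=mav, RMS=rms, WL=wl, VAR=var, ZC=zc, SSC=ssc, WAMP=wamp)

# e.g. one 100 ms window at 1 kHz sampling (synthetic noise stands in for EMG)
window = np.random.default_rng(0).normal(scale=0.1, size=100)
print(emg_time_features(window))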

{695}
hide / / print
ref: -0 tags: alopex machine learning artificial neural networks date: 03-09-2009 22:12 gmt revision:0 [head]

Alopex: A Correlation-Based Learning Algorithm for Feed-Forward and Recurrent Neural Networks (1994)

  • read the abstract! rather than using the gradient error estimate as in backpropagation, it uses the correlation between changes in network weights and changes in the error + gaussian noise.
    • backpropagation requires calculation of the derivatives of the transfer function from one neuron to the output. This is very non-local information.
    • one alternative is somewhat empirical: compute the derivatives wrt the weights through perturbations.
    • all these algorithms are solutions to the optimization problem: minimize an error measure, E, wrt the network weights.
  • all network weights are updated synchronously.
  • can be used to train both feedforward and recurrent networks.
  • algorithm apparently has a long history, especially in visual research.
  • the algorithm is quite simple! easy to understand (minimal sketch at the end of this entry).
    • use stochastic weight changes with an annealing schedule.
  • this is pre-pub: tables and figures at the end.
  • looks like it has comparable or faster convergence than backpropagation.
  • not sure how it will scale to problems with hundreds of neurons; though, they looked at an encoding task with 32 outputs.
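A minimal sketch of the update rule as I read it (not the authors' code): each weight moves by ±delta, the direction is biased by the correlation between its previous move and the previous change in error, and the temperature anneals toward the running average |correlation|. The toy quadratic loss and all constants are made up:

import numpy as np

rng = np.random.default_rng(0)

def loss(w):                                 # stand-in for the network error E
    return np.sum((w - np.arange(len(w))) ** 2)

n, delta, T = 5, 0.01, 1.0
w = rng.normal(size=n)
prev_dw = rng.choice([-delta, delta], size=n)
prev_E = loss(w)
corr_hist = []

for step in range(20000):
    E = loss(w)
    dE = E - prev_E
    C = prev_dw * dE                                     # per-weight correlation term
    p = 1.0 / (1.0 + np.exp(np.clip(C / T, -50, 50)))    # P(next move is +delta)
    dw = np.where(rng.random(n) < p, delta, -delta)
    w = w + dw
    prev_dw, prev_E = dw, E
    # anneal: periodically set the temperature to the running average |C|
    corr_hist.append(np.abs(C).mean())
    if step % 100 == 99:
        T = max(np.mean(corr_hist[-100:]), 1e-6)

print("final loss:", loss(w))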

{669}
hide / / print
ref: Pearlmutter-2009.06 tags: sleep network stability learning memory date: 02-05-2009 19:21 gmt revision:1 [0] [head]

PMID-19191602 A New Hypothesis for Sleep: Tuning for Criticality.

  • Their hypothesis: in the course of learning, the brain's networks move closer to instability, since learning and information storage require operating near this critical point.
    • That is, a perfectly stable network stores no information: output is the same independent of input; a highly unstable network can potentially store a lot of information, or be a very selective or critical system: output is highly sensitive to input.
  • Sleep serves to restore the stability of the network by exposing it to a variety of inputs, checking for runaway activity, and adjusting accordingly. (inhibition / glia? how?)
  • Say that when sleep is not possible, an emergency mechanism must come into play, namely tiredness, to prevent runaway behavior.
  • (From wikipedia:) a potentially serious side-effect of many antipsychotics is that they tend to lower an individual's seizure threshold. Recall that removal of all dopamine can inhibit REM sleep; it's all somehow consistent, but unclear how maintaining network stability and being able to move are related.

{503}
hide / / print
ref: bookmark-0 tags: internet communication tax broadband election? date: 11-21-2007 22:18 gmt revision:6 [5] [4] [3] [2] [1] [0] [head]

quote:

Consumers also pay high taxes for telecommunication services, averaging about 13 percent on some telecom services, similar to the tax rate on tobacco and alcohol, Mehlman said. One tax on telecom service has remained in place since the 1898 Spanish-American War, when few U.S. residents had telephones, he noted.

"We think it's a mistake to treat telecom like a luxury and tax it like a sin," he said.

from: The internet could run out of capacity in two years

comments:

  • I bet this will turn into a great excuse for your next president not to invest on health, but rather on internet. --ana
  • Humm.. I think it is meant to be more of a wake-up call to the backhaul and ISP companies, which own most of the networking capacity (not the government). I imagine there will be some problems, people complain, it gets fixed.. hopefully soon. What is really amazing is the total amount of data the internet is expected to produce - 161 exabytes!! -- tlh
  • They won't upgrade their capacity. After all, the telcos spent a lot of money doing just that in the dot-bomb days. No, instead they will spend their money on technologies and laws that allow them to charge more for certain types of packets or for delivering some packets faster than others. You think it's a coincidence that Google is buying up dark fiber? --jeo

{497}
hide / / print
ref: bookmark-0 tags: open source cellphone public network date: 11-13-2007 21:28 gmt revision:2 [1] [0] [head]

http://dotpublic.istumbler.net/

  • kinda high-level, rather amorphous, but generally in the right direction. The drive is there, the time is coming, but we are not quite there yet..
  • they have some designs for wireless repeaters based on 802.11g mini-PCI cards in an SBC; 3 repeaters cost a total of about $1000
  • also interesting: http://www.opencellphone.org/index.php?title=Main_Page

{479}
hide / / print
ref: bookmark-0 tags: cybernetics introduction 1957 Ross Ashby feedback date: 10-26-2007 00:50 gmt revision:3 [2] [1] [0] [head]

http://pespmc1.vub.ac.be/books/IntroCyb.pdf -- dated, but still interesting, useful, a book in and of itself!

  • cybernetics = "the study of systems that are open to energy but closed to information and control"
    • cybernetics also = the study of systems whose complexity cannot be reduced away, or rather whose complexity is integral to its function, e.g. the human brain, the world economy. here simple examples have little explanatory power.
  • book, for the most part, avoids calculus, and deals instead with discrete time and sums (i think?)
  • with exercises!! for example, page 60 - cybernetics of a haunted house:)
  • random thought: a lot of this stuff seems dependent on the mathematics of statistical physics...

{467}
hide / / print
ref: bookmark-0 tags: Saab water injection neuralnet 900 turbo date: 10-15-2007 16:09 gmt revision:2 [1] [0] [head]

Self-learning fuzzy neural network with optimal on-line learning for water injection control of a turbocharged automobile.

  • for a 1994 - 1998 Saab 900 SE (like mine).
  • also has details on the trionic 5 ECU, including how saab detects knock through pre-ignition ionization measurement, and how it subsequently modifies ignition timing & boost pressure.
  • images/467_1.pdf

{465}
hide / / print
ref: notes-0 tags: CRC32 ethernet blackfin date: 10-10-2007 03:57 gmt revision:1 [0] [head]

good explanation of 32-bit CRC (from the blackfin BF537 hardware ref):
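
For reference, the Ethernet CRC-32 itself is easy to sketch in software: reflected polynomial 0xEDB88320, initial value 0xFFFFFFFF, final inversion. A minimal bit-serial (table-free, hence slow) version, which matches zlib.crc32:

    def crc32(data: bytes) -> int:
        # bit-serial, LSB-first CRC-32 as used by Ethernet / zlib
        crc = 0xFFFFFFFF
        for byte in data:
            crc ^= byte
            for _ in range(8):
                crc = (crc >> 1) ^ 0xEDB88320 if crc & 1 else crc >> 1
        return crc ^ 0xFFFFFFFF

    import zlib
    assert crc32(b"123456789") == zlib.crc32(b"123456789") == 0xCBF43926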

{401}
hide / / print
ref: bookmark-0 tags: RF penetration tissue 1978 date: 07-24-2007 04:15 gmt revision:2 [1] [0] [head]

http://hardm.ath.cx:88/pdf/RFpenetrationInTissue.pdf

  • from the perspective of NMR imaging.
  • gives the penetration depths & phase shifts for RF waves from 1 - 100 MHz. I can only assume that it is much worse at 400 MHz and 2.4 GHz.
    • that said, Zarlink's MICS transceiver works from the GI tract at 400 MHz with low power, suggesting that the attenuation can't be too great.
  • includes equations used to derive these figures.
  • document describing how various antenna types are affected by biological tissue, e.g. a human head.

even more interesting: wireless brain machine interface

{384}
hide / / print
ref: bookmark-0 tags: magstripe magnetic stripe reader writer encoder date: 05-31-2007 02:49 gmt revision:1 [0] [head]

notes on reading magstripe cards:

{277}
hide / / print
ref: Sergio-2005.1 tags: isometric motor control kinematics kinetics Kalaska date: 04-09-2007 22:33 gmt revision:6 [5] [4] [3] [2] [1] [0] [head]

PMID-15888522[0] Motor cortex neural correlates of output kinematics and kinetics during isometric-force and arm-reaching tasks.

  • see [1]
  • recorded 132 units from the caudal M1
  • two tasks: isometric and movement of a heavy mass, both to 8 peripheral targets.
    • target location was digitized using a 'sonic digitizer'. trajectories look really good - the monkey was well trained.
  • idea: part of M1 functions near the output (of course)
    • evidence supporting this: M1 rasters during movement of the heavy mass show a triphasic profile: an initial burst to accelerate the mass, a second to decelerate it, and a third to hold it steady on target. see [2,3,4,5,6,7,8,9,10]

____References____

{230}
hide / / print
ref: engineering notes-0 tags: homopolar generator motor superconducting magnet date: 03-09-2007 14:39 gmt revision:0 [head]

http://hardm.ath.cx:88/pdf/homopolar.pdf

  • the magnets are energized in opposite directions, forcing the field lines to go normal to the rotor.
  • still need brushes - perhaps there is no way to avoid them in a homopolar generator.

{223}
hide / / print
ref: physics notes-0 tags: plasma physics electromagnet tesla coil copper capillary tubing calculations date: 02-23-2007 16:01 gmt revision:0 [head]

calculations for a strong DC loop magnet using 1/8" copper capillary tubing (a short numeric sketch reproducing steps 4-8 follows the list):

  1. OD .125" = 3.175 mm; ID 0.065" -> copper area = 23.2 mm^2 ~= AWG 4
  2. AWG 4 = 0.8 ohms/km
  3. length of tubing: 30' ~= 40 turns @ 9" each (windings packed into a torus of major radius 1.5"; minor radius 0.5")
  4. water flow rate through copper capillary tubing: 1 liter/min; assuming we can heat it up from 30C -> 100C, this is 70 kcal/min = 292 kJ/min = 4881 W total. (better pipe it into our hot water heater!)
  5. 4.8kw / 9m of tubing = 540 W/m
  6. 540 W/m / 8e-4 ohm/m -> I = sqrt(540 / 8e-4) = 821 A; V = 821 * 9 * 8e-4 = 5.9 V (!!! where the hell am i going to get that kind of power?)
  7. 821A * 40 turns = 32.8KA in a loop major radius 1.5" = 3.8cm
  8. magnetic field of a current loop -> B = 0.54T
  9. Larmor radius: 5eV electrons @B = 0.54T : 15um; proton: 2.7cm; electrons @1KeV ~= 2.66e8 (this is close to the speed of light?) r = 3mm.
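
A short Python sketch reproducing steps 4-8 above from the same assumptions (1 l/min coolant, 70 C temperature rise, 8e-4 ohm/m, 40 turns, 1.5" loop radius) -- just the list's arithmetic, not a field simulation:

    import math

    P_total = 1.0 * 70.0 * 4184 / 60.0            # 1 l/min water, dT = 70 C -> ~4.9 kW removable heat
    length_m = 30 * 0.3048                         # 30 ft of tubing
    P_per_m = P_total / length_m                   # ~530 W/m allowable dissipation
    R_per_m = 8e-4                                 # ohm/m, the ~AWG 4 figure used above
    I = math.sqrt(P_per_m / R_per_m)               # P = I^2 R  ->  ~820 A
    V = I * length_m * R_per_m                     # ~6 V drive voltage
    turns, r_loop = 40, 1.5 * 0.0254               # 1.5" major radius in meters
    B = 4e-7 * math.pi * turns * I / (2 * r_loop)  # field at the loop center, ~0.54 T
    print(f"P={P_total:.0f} W  I={I:.0f} A  V={V:.1f} V  B={B:.2f} T")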

{7}
hide / / print
ref: bookmark-0 tags: book information_theory machine_learning bayes probability neural_networks mackay date: 0-0-2007 0:0 revision:0 [head]

http://www.inference.phy.cam.ac.uk/mackay/itila/book.html -- free! (but i liked the book, so I bought it :)

{20}
hide / / print
ref: bookmark-0 tags: neural_networks machine_learning matlab toolbox supervised_learning PCA perceptron SOM EM date: 0-0-2006 0:0 revision:0 [head]

http://www.ncrg.aston.ac.uk/netlab/index.php n.b. kinda old. (or does that just mean well established?)

{39}
hide / / print
ref: bookmark-0 tags: Numenta Bayesian_networks date: 0-0-2006 0:0 revision:0 [head]

http://www.numenta.com/Numenta_HTM_Concepts.pdf

  • shared, hierarchical representations reduce memory requirements and training time, and mirror the structure of the world.
  • belief propagation techniques force the network into a set of mutually consistent beliefs (a toy sum-product example follows this list).
  • a belief is a form of spatio-temporal quantization: ignore the unusual.
  • a cause is a persistent or recurring structure in the world - the root of a spatiotemporal pattern. This is a simple but important concept.
    • HTMs marginalize along space and time - they assume separate time patterns and space patterns, not both at once; temporal parameterization follows spatial parameterization.
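
A toy sum-product example (mine, not Numenta's HTM algorithm) of what "mutually consistent beliefs" means: two binary variables A and B share a pairwise compatibility factor and each has its own local evidence; after exchanging messages, each node's belief reflects all of the evidence:

    import numpy as np

    phi_A = np.array([0.9, 0.1])             # local evidence for A
    phi_B = np.array([0.4, 0.6])             # local evidence for B
    psi = np.array([[0.8, 0.2],              # pairwise compatibility psi(A, B)
                    [0.2, 0.8]])

    m_A_to_B = psi.T @ phi_A                  # message A -> B: sum_a psi(a,b) * phi_A(a)
    m_B_to_A = psi @ phi_B                    # message B -> A: sum_b psi(a,b) * phi_B(b)

    belief_A = phi_A * m_B_to_A               # combine local evidence with incoming message
    belief_B = phi_B * m_A_to_B
    belief_A /= belief_A.sum()                # normalize to get marginals P(A), P(B)
    belief_B /= belief_B.sum()
    print(belief_A, belief_B)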

{40}
hide / / print
ref: bookmark-0 tags: Bayes Baysian_networks probability probabalistic_networks Kalman ICA PCA HMM Dynamic_programming inference learning date: 0-0-2006 0:0 revision:0 [head]

http://www.cs.ubc.ca/~murphyk/Bayes/bnintro.html very, very good! many references, well explained too.

{92}
hide / / print
ref: bookmark-0 tags: training neural_networks with kalman filters date: 0-0-2006 0:0 revision:0 [head]

with the extended kalman filter, from '92: http://ftp.ccs.neu.edu/pub/people/rjw/kalman-ijcnn-92.ps

with the unscented kalman filter: http://hardm.ath.cx/pdf/NNTrainingwithUnscentedKalmanFilter.pdf
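
A minimal sketch of the general idea (my own, not the exact formulation in either paper): treat the network weights as the state of an extended Kalman filter with an identity state transition, and treat each training pair (x, y) as a noisy measurement of the network output, linearized through the Jacobian of the output with respect to the weights:

    import numpy as np

    def net(w, x, n_in=2, n_hid=3):
        # tiny tanh network used as the "measurement function" h(w; x)
        W1 = w[:n_in * n_hid].reshape(n_hid, n_in)
        W2 = w[n_in * n_hid:].reshape(1, n_hid)
        return float(W2 @ np.tanh(W1 @ x))

    def jacobian(w, x, eps=1e-6):
        # numerical d(output)/d(weights); an analytic Jacobian would be used in practice
        base = net(w, x)
        return np.array([(net(w + eps * np.eye(w.size)[i], x) - base) / eps
                         for i in range(w.size)])

    rng = np.random.default_rng(0)
    n_w = 2 * 3 + 3
    w = 0.1 * rng.normal(size=n_w)      # weight estimate = filter state
    P = np.eye(n_w)                      # weight covariance
    Q = 1e-5 * np.eye(n_w)               # process noise: lets the weights keep adapting
    R = 0.01                             # assumed target-noise variance

    def ekf_step(w, P, x, y):
        H = jacobian(w, x)[None, :]      # 1 x n_w linearization of the output
        P_pred = P + Q                   # identity state transition
        S = float(H @ P_pred @ H.T) + R  # innovation variance
        K = (P_pred @ H.T) / S           # Kalman gain
        w = w + K[:, 0] * (y - net(w, x))
        P = P_pred - K @ H @ P_pred
        return w, P

    for _ in range(500):                 # fit a toy target y = sin(x0) * x1
        x = rng.uniform(-1, 1, size=2)
        y = np.sin(x[0]) * x[1]
        w, P = ekf_step(w, P, x, y)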